Index: user/ngie/more-tests/UPDATING =================================================================== --- user/ngie/more-tests/UPDATING (revision 281584) +++ user/ngie/more-tests/UPDATING (revision 281585) @@ -1,1173 +1,1177 @@ Updating Information for FreeBSD current users. This file is maintained and copyrighted by M. Warner Losh . See end of file for further details. For commonly done items, please see the COMMON ITEMS: section later in the file. These instructions assume that you basically know what you are doing. If not, then please consult the FreeBSD handbook: http://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/makeworld.html Items affecting the ports and packages system can be found in /usr/ports/UPDATING. Please read that file before running portupgrade. NOTE: FreeBSD has switched from gcc to clang. If you have trouble bootstrapping from older versions of FreeBSD, try WITHOUT_CLANG and WITH_GCC to bootstrap to the tip of head, and then rebuild without this option. The bootstrap process from older version of current across the gcc/clang cutover is a bit fragile. NOTE TO PEOPLE WHO THINK THAT FreeBSD 11.x IS SLOW: FreeBSD 11.x has many debugging features turned on, in both the kernel and userland. These features attempt to detect incorrect use of system primitives, and encourage loud failure through extra sanity checking and fail stop semantics. They also substantially impact system performance. If you want to do performance measurement, benchmarking, and optimization, you'll want to turn them off. This includes various WITNESS- related kernel options, INVARIANTS, malloc debugging flags in userland, and various verbose features in the kernel. Many developers choose to disable these features on build machines to maximize performance. (To completely disable malloc debugging, define MALLOC_PRODUCTION in /etc/make.conf, or to merely disable the most expensive debugging functionality run "ln -s 'abort:false,junk:false' /etc/malloc.conf".) +20150415: + The const qualifier has been removed from iconv(3) to comply with + POSIX. The ports tree is aware of this from r384038 onwards. + 20150324: From legacy ata(4) driver was removed support for SATA controllers supported by more functional drivers ahci(4), siis(4) and mvs(4). Kernel modules ataahci and ataadaptec were removed completely, replaced by ahci and mvs modules respectively. 20150315: Clang, llvm and lldb have been upgraded to 3.6.0 release. Please see the 20141231 entry below for information about prerequisites and upgrading, if you are not already using 3.5.0 or higher. 20150307: The 32-bit PowerPC kernel has been changed to a position-independent executable. This can only be booted with a version of loader(8) newer than January 31, 2015, so make sure to update both world and kernel before rebooting. 20150217: If you are running a -CURRENT kernel since r273872 (Oct 30th, 2014), but before r278950, the RNG was not seeded properly. Immediately upgrade the kernel to r278950 or later and regenerate any keys (e.g. ssh keys or openssl keys) that were generated w/ a kernel from that range. This does not affect programs that directly used /dev/random or /dev/urandom. All userland uses of arc4random(3) are affected. 20150210: The autofs(4) ABI was changed in order to restore binary compatibility with 10.1-RELEASE. The automountd(8) daemon needs to be rebuilt to work with the new kernel. 20150131: The powerpc64 kernel has been changed to a position-independent executable. This can only be booted with a new version of loader(8), so make sure to update both world and kernel before rebooting. 20150118: Clang and llvm have been upgraded to 3.5.1 release. This is a bugfix only release, no new features have been added. Please see the 20141231 entry below for information about prerequisites and upgrading, if you are not already using 3.5.0. 20150107: ELF tools addr2line, elfcopy (strip), nm, size, and strings are now taken from the ELF Tool Chain project rather than GNU binutils. They should be drop-in replacements, with the addition of arm64 support. The WITHOUT_ELFTOOLCHAIN_TOOLS= knob may be used to obtain the binutils tools, if necessary. 20150105: The default Unbound configuration now enables remote control using a local socket. Users who have already enabled the local_unbound service should regenerate their configuration by running "service local_unbound setup" as root. 20150102: The GNU texinfo and GNU info pages have been removed. To be able to view GNU info pages please install texinfo from ports. 20141231: Clang, llvm and lldb have been upgraded to 3.5.0 release. As of this release, a prerequisite for building clang, llvm and lldb is a C++11 capable compiler and C++11 standard library. This means that to be able to successfully build the cross-tools stage of buildworld, with clang as the bootstrap compiler, your system compiler or cross compiler should either be clang 3.3 or later, or gcc 4.8 or later, and your system C++ library should be libc++, or libdstdc++ from gcc 4.8 or later. On any standard FreeBSD 10.x or 11.x installation, where clang and libc++ are on by default (that is, on x86 or arm), this should work out of the box. On 9.x installations where clang is enabled by default, e.g. on x86 and powerpc, libc++ will not be enabled by default, so libc++ should be built (with clang) and installed first. If both clang and libc++ are missing, build clang first, then use it to build libc++. On 8.x and earlier installations, upgrade to 9.x first, and then follow the instructions for 9.x above. Sparc64 and mips users are unaffected, as they still use gcc 4.2.1 by default, and do not build clang. Many embedded systems are resource constrained, and will not be able to build clang in a reasonable time, or in some cases at all. In those cases, cross building bootable systems on amd64 is a workaround. This new version of clang introduces a number of new warnings, of which the following are most likely to appear: -Wabsolute-value This warns in two cases, for both C and C++: * When the code is trying to take the absolute value of an unsigned quantity, which is effectively a no-op, and almost never what was intended. The code should be fixed, if at all possible. If you are sure that the unsigned quantity can be safely cast to signed, without loss of information or undefined behavior, you can add an explicit cast, or disable the warning. * When the code is trying to take an absolute value, but the called abs() variant is for the wrong type, which can lead to truncation. If you want to disable the warning instead of fixing the code, please make sure that truncation will not occur, or it might lead to unwanted side-effects. -Wtautological-undefined-compare and -Wundefined-bool-conversion These warn when C++ code is trying to compare 'this' against NULL, while 'this' should never be NULL in well-defined C++ code. However, there is some legacy (pre C++11) code out there, which actively abuses this feature, which was less strictly defined in previous C++ versions. Squid and openjdk do this, for example. The warning can be turned off for C++98 and earlier, but compiling the code in C++11 mode might result in unexpected behavior; for example, the parts of the program that are unreachable could be optimized away. 20141222: The old NFS client and server (kernel options NFSCLIENT, NFSSERVER) kernel sources have been removed. The .h files remain, since some utilities include them. This will need to be fixed later. If "mount -t oldnfs ..." is attempted, it will fail. If the "-o" option on mountd(8), nfsd(8) or nfsstat(1) is used, the utilities will report errors. 20141121: The handling of LOCAL_LIB_DIRS has been altered to skip addition of directories to top level SUBDIR variable when their parent directory is included in LOCAL_DIRS. Users with build systems with such hierarchies and without SUBDIR entries in the parent directory Makefiles should add them or add the directories to LOCAL_DIRS. 20141109: faith(4) and faithd(8) have been removed from the base system. Faith has been obsolete for a very long time. 20141104: vt(4), the new console driver, is enabled by default. It brings support for Unicode and double-width characters, as well as support for UEFI and integration with the KMS kernel video drivers. You may need to update your console settings in /etc/rc.conf, most probably the keymap. During boot, /etc/rc.d/syscons will indicate what you need to do. vt(4) still has issues and lacks some features compared to syscons(4). See the wiki for up-to-date information: https://wiki.freebsd.org/Newcons If you want to keep using syscons(4), you can do so by adding the following line to /boot/loader.conf: kern.vty=sc 20141102: pjdfstest has been integrated into kyua as an opt-in test suite. Please see share/doc/pjdfstest/README for more details on how to execute it. 20141009: gperf has been removed from the base system for architectures that use clang. Ports that require gperf will obtain it from the devel/gperf port. 20140923: pjdfstest has been moved from tools/regression/pjdfstest to contrib/pjdfstest . 20140922: At svn r271982, The default linux compat kernel ABI has been adjusted to 2.6.18 in support of the linux-c6 compat ports infrastructure update. If you wish to continue using the linux-f10 compat ports, add compat.linux.osrelease=2.6.16 to your local sysctl.conf. Users are encouraged to update their linux-compat packages to linux-c6 during their next update cycle. 20140729: The ofwfb driver, used to provide a graphics console on PowerPC when using vt(4), no longer allows mmap() of all physical memory. This will prevent Xorg on PowerPC with some ATI graphics cards from initializing properly unless x11-servers/xorg-server is updated to 1.12.4_8 or newer. 20140723: The xdev targets have been converted to using TARGET and TARGET_ARCH instead of XDEV and XDEV_ARCH. 20140719: The default unbound configuration has been modified to address issues with reverse lookups on networks that use private address ranges. If you use the local_unbound service, run "service local_unbound setup" as root to regenerate your configuration, then "service local_unbound reload" to load the new configuration. 20140709: The GNU texinfo and GNU info pages are not built and installed anymore, WITH_INFO knob has been added to allow to built and install them again. UPDATE: see 20150102 entry on texinfo's removal 20140708: The GNU readline library is now an INTERNALLIB - that is, it is statically linked into consumers (GDB and variants) in the base system, and the shared library is no longer installed. The devel/readline port is available for third party software that requires readline. 20140702: The Itanium architecture (ia64) has been removed from the list of known architectures. This is the first step in the removal of the architecture. 20140701: Commit r268115 has added NFSv4.1 server support, merged from projects/nfsv4.1-server. Since this includes changes to the internal interfaces between the NFS related modules, a full build of the kernel and modules will be necessary. __FreeBSD_version has been bumped. 20140629: The WITHOUT_VT_SUPPORT kernel config knob has been renamed WITHOUT_VT. (The other _SUPPORT knobs have a consistent meaning which differs from the behaviour controlled by this knob.) 20140619: Maximal length of the serial number in CTL was increased from 16 to 64 chars, that breaks ABI. All CTL-related tools, such as ctladm and ctld, need to be rebuilt to work with a new kernel. 20140606: The libatf-c and libatf-c++ major versions were downgraded to 0 and 1 respectively to match the upstream numbers. They were out of sync because, when they were originally added to FreeBSD, the upstream versions were not respected. These libraries are private and not yet built by default, so renumbering them should be a non-issue. However, unclean source trees will yield broken test programs once the operator executes "make delete-old-libs" after a "make installworld". Additionally, the atf-sh binary was made private by moving it into /usr/libexec/. Already-built shell test programs will keep the path to the old binary so they will break after "make delete-old" is run. If you are using WITH_TESTS=yes (not the default), wipe the object tree and rebuild from scratch to prevent spurious test failures. This is only needed once: the misnumbered libraries and misplaced binaries have been added to OptionalObsoleteFiles.inc so they will be removed during a clean upgrade. 20140512: Clang and llvm have been upgraded to 3.4.1 release. 20140508: We bogusly installed src.opts.mk in /usr/share/mk. This file should be removed to avoid issues in the future (and has been added to ObsoleteFiles.inc). 20140505: /etc/src.conf now affects only builds of the FreeBSD src tree. In the past, it affected all builds that used the bsd.*.mk files. The old behavior was a bug, but people may have relied upon it. To get this behavior back, you can .include /etc/src.conf from /etc/make.conf (which is still global and isn't changed). This also changes the behavior of incremental builds inside the tree of individual directories. Set MAKESYSPATH to ".../share/mk" to do that. Although this has survived make universe and some upgrade scenarios, other upgrade scenarios may have broken. At least one form of temporary breakage was fixed with MAKESYSPATH settings for buildworld as well... In cases where MAKESYSPATH isn't working with this setting, you'll need to set it to the full path to your tree. One side effect of all this cleaning up is that bsd.compiler.mk is no longer implicitly included by bsd.own.mk. If you wish to use COMPILER_TYPE, you must now explicitly include bsd.compiler.mk as well. 20140430: The lindev device has been removed since /dev/full has been made a standard device. __FreeBSD_version has been bumped. 20140424: The knob WITHOUT_VI was added to the base system, which controls building ex(1), vi(1), etc. Older releases of FreeBSD required ex(1) in order to reorder files share/termcap and didn't build ex(1) as a build tool, so building/installing with WITH_VI is highly advised for build hosts for older releases. This issue has been fixed in stable/9 and stable/10 in r277022 and r276991, respectively. 20140418: The YES_HESIOD knob has been removed. It has been obsolete for a decade. Please move to using WITH_HESIOD instead or your builds will silently lack HESIOD. 20140405: The uart(4) driver has been changed with respect to its handling of the low-level console. Previously the uart(4) driver prevented any process from changing the baudrate or the CLOCAL and HUPCL control flags. By removing the restrictions, operators can make changes to the serial console port without having to reboot. However, when getty(8) is started on the serial device that is associated with the low-level console, a misconfigured terminal line in /etc/ttys will now have a real impact. Before upgrading the kernel, make sure that /etc/ttys has the serial console device configured as 3wire without baudrate to preserve the previous behaviour. E.g: ttyu0 "/usr/libexec/getty 3wire" vt100 on secure 20140306: Support for libwrap (TCP wrappers) in rpcbind was disabled by default to improve performance. To re-enable it, if needed, run rpcbind with command line option -W. 20140226: Switched back to the GPL dtc compiler due to updates in the upstream dts files not being supported by the BSDL dtc compiler. You will need to rebuild your kernel toolchain to pick up the new compiler. Core dumps may result while building dtb files during a kernel build if you fail to do so. Set WITHOUT_GPL_DTC if you require the BSDL compiler. 20140216: Clang and llvm have been upgraded to 3.4 release. 20140216: The nve(4) driver has been removed. Please use the nfe(4) driver for NVIDIA nForce MCP Ethernet adapters instead. 20140212: An ABI incompatibility crept into the libc++ 3.4 import in r261283. This could cause certain C++ applications using shared libraries built against the previous version of libc++ to crash. The incompatibility has now been fixed, but any C++ applications or shared libraries built between r261283 and r261801 should be recompiled. 20140204: OpenSSH will now ignore errors caused by kernel lacking of Capsicum capability mode support. Please note that enabling the feature in kernel is still highly recommended. 20140131: OpenSSH is now built with sandbox support, and will use sandbox as the default privilege separation method. This requires Capsicum capability mode support in kernel. 20140128: The libelf and libdwarf libraries have been updated to newer versions from upstream. Shared library version numbers for these two libraries were bumped. Any ports or binaries requiring these two libraries should be recompiled. __FreeBSD_version is bumped to 1100006. 20140110: If a Makefile in a tests/ directory was auto-generating a Kyuafile instead of providing an explicit one, this would prevent such Makefile from providing its own Kyuafile in the future during NO_CLEAN builds. This has been fixed in the Makefiles but manual intervention is needed to clean an objdir if you use NO_CLEAN: # find /usr/obj -name Kyuafile | xargs rm -f 20131213: The behavior of gss_pseudo_random() for the krb5 mechanism has changed, for applications requesting a longer random string than produced by the underlying enctype's pseudo-random() function. In particular, the random string produced from a session key of enctype aes256-cts-hmac-sha1-96 or aes256-cts-hmac-sha1-96 will be different at the 17th octet and later, after this change. The counter used in the PRF+ construction is now encoded as a big-endian integer in accordance with RFC 4402. __FreeBSD_version is bumped to 1100004. 20131108: The WITHOUT_ATF build knob has been removed and its functionality has been subsumed into the more generic WITHOUT_TESTS. If you were using the former to disable the build of the ATF libraries, you should change your settings to use the latter. 20131025: The default version of mtree is nmtree which is obtained from NetBSD. The output is generally the same, but may vary slightly. If you found you need identical output adding "-F freebsd9" to the command line should do the trick. For the time being, the old mtree is available as fmtree. 20131014: libbsdyml has been renamed to libyaml and moved to /usr/lib/private. This will break ports-mgmt/pkg. Rebuild the port, or upgrade to pkg 1.1.4_8 and verify bsdyml not linked in, before running "make delete-old-libs": # make -C /usr/ports/ports-mgmt/pkg build deinstall install clean or # pkg install pkg; ldd /usr/local/sbin/pkg | grep bsdyml 20131010: The rc.d/jail script has been updated to support jail(8) configuration file. The "jail__*" rc.conf(5) variables for per-jail configuration are automatically converted to /var/run/jail..conf before the jail(8) utility is invoked. This is transparently backward compatible. See below about some incompatibilities and rc.conf(5) manual page for more details. These variables are now deprecated in favor of jail(8) configuration file. One can use "rc.d/jail config " command to generate a jail(8) configuration file in /var/run/jail..conf without running the jail(8) utility. The default pathname of the configuration file is /etc/jail.conf and can be specified by using $jail_conf or $jail__conf variables. Please note that jail_devfs_ruleset accepts an integer at this moment. Please consider to rewrite the ruleset name with an integer. 20130930: BIND has been removed from the base system. If all you need is a local resolver, simply enable and start the local_unbound service instead. Otherwise, several versions of BIND are available in the ports tree. The dns/bind99 port is one example. With this change, nslookup(1) and dig(1) are no longer in the base system. Users should instead use host(1) and drill(1) which are in the base system. Alternatively, nslookup and dig can be obtained by installing the dns/bind-tools port. 20130916: With the addition of unbound(8), a new unbound user is now required during installworld. "mergemaster -p" can be used to add the user prior to installworld, as documented in the handbook. 20130911: OpenSSH is now built with DNSSEC support, and will by default silently trust signed SSHFP records. This can be controlled with the VerifyHostKeyDNS client configuration setting. DNSSEC support can be disabled entirely with the WITHOUT_LDNS option in src.conf. 20130906: The GNU Compiler Collection and C++ standard library (libstdc++) are no longer built by default on platforms where clang is the system compiler. You can enable them with the WITH_GCC and WITH_GNUCXX options in src.conf. 20130905: The PROCDESC kernel option is now part of the GENERIC kernel configuration and is required for the rwhod(8) to work. If you are using custom kernel configuration, you should include 'options PROCDESC'. 20130905: The API and ABI related to the Capsicum framework was modified in backward incompatible way. The userland libraries and programs have to be recompiled to work with the new kernel. This includes the following libraries and programs, but the whole buildworld is advised: libc, libprocstat, dhclient, tcpdump, hastd, hastctl, kdump, procstat, rwho, rwhod, uniq. 20130903: AES-NI intrinsic support has been added to gcc. The AES-NI module has been updated to use this support. A new gcc is required to build the aesni module on both i386 and amd64. 20130821: The PADLOCK_RNG and RDRAND_RNG kernel options are now devices. Thus "device padlock_rng" and "device rdrand_rng" should be used instead of "options PADLOCK_RNG" & "options RDRAND_RNG". 20130813: WITH_ICONV has been split into two feature sets. WITH_ICONV now enables just the iconv* functionality and is now on by default. WITH_LIBICONV_COMPAT enables the libiconv api and link time compatability. Set WITHOUT_ICONV to build the old way. If you have been using WITH_ICONV before, you will very likely need to turn on WITH_LIBICONV_COMPAT. 20130806: INVARIANTS option now enables DEBUG for code with OpenSolaris and Illumos origin, including ZFS. If you have INVARIANTS in your kernel configuration, then there is no need to set DEBUG or ZFS_DEBUG explicitly. DEBUG used to enable witness(9) tracking of OpenSolaris (mostly ZFS) locks if WITNESS option was set. Because that generated a lot of witness(9) reports and all of them were believed to be false positives, this is no longer done. New option OPENSOLARIS_WITNESS can be used to achieve the previous behavior. 20130806: Timer values in IPv6 data structures now use time_uptime instead of time_second. Although this is not a user-visible functional change, userland utilities which directly use them---ndp(8), rtadvd(8), and rtsold(8) in the base system---need to be updated to r253970 or later. 20130802: find -delete can now delete the pathnames given as arguments, instead of only files found below them or if the pathname did not contain any slashes. Formerly, the following error message would result: find: -delete: : relative path potentially not safe Deleting the pathnames given as arguments can be prevented without error messages using -mindepth 1 or by changing directory and passing "." as argument to find. This works in the old as well as the new version of find. 20130726: Behavior of devfs rules path matching has been changed. Pattern is now always matched against fully qualified devfs path and slash characters must be explicitly matched by slashes in pattern (FNM_PATHNAME). Rulesets involving devfs subdirectories must be reviewed. 20130716: The default ARM ABI has changed to the ARM EABI. The old ABI is incompatible with the ARM EABI and all programs and modules will need to be rebuilt to work with a new kernel. To keep using the old ABI ensure the WITHOUT_ARM_EABI knob is set. NOTE: Support for the old ABI will be removed in the future and users are advised to upgrade. 20130709: pkg_install has been disconnected from the build if you really need it you should add WITH_PKGTOOLS in your src.conf(5). 20130709: Most of network statistics structures were changed to be able keep 64-bits counters. Thus all tools, that work with networking statistics, must be rebuilt (netstat(1), bsnmpd(1), etc.) 20130629: Fix targets that run multiple make's to use && rather than ; so that subsequent steps depend on success of previous. NOTE: if building 'universe' with -j* on stable/8 or stable/9 it would be better to start the build using bmake, to avoid overloading the machine. 20130618: Fix a bug that allowed a tracing process (e.g. gdb) to write to a memory-mapped file in the traced process's address space even if neither the traced process nor the tracing process had write access to that file. 20130615: CVS has been removed from the base system. An exact copy of the code is available from the devel/cvs port. 20130613: Some people report the following error after the switch to bmake: make: illegal option -- J usage: make [-BPSXeiknpqrstv] [-C directory] [-D variable] ... *** [buildworld] Error code 2 this likely due to an old instance of make in ${MAKEPATH} (${MAKEOBJDIRPREFIX}${.CURDIR}/make.${MACHINE}) which src/Makefile will use that blindly, if it exists, so if you see the above error: rm -rf `make -V MAKEPATH` should resolve it. 20130516: Use bmake by default. Whereas before one could choose to build with bmake via -DWITH_BMAKE one must now use -DWITHOUT_BMAKE to use the old make. The goal is to remove these knobs for 10-RELEASE. It is worth noting that bmake (like gmake) treats the command line as the unit of failure, rather than statements within the command line. Thus '(cd some/where && dosomething)' is safer than 'cd some/where; dosomething'. The '()' allows consistent behavior in parallel build. 20130429: Fix a bug that allows NFS clients to issue READDIR on files. 20130426: The WITHOUT_IDEA option has been removed because the IDEA patent expired. 20130426: The sysctl which controls TRIM support under ZFS has been renamed from vfs.zfs.trim_disable -> vfs.zfs.trim.enabled and has been enabled by default. 20130425: The mergemaster command now uses the default MAKEOBJDIRPREFIX rather than creating it's own in the temporary directory in order allow access to bootstrapped versions of tools such as install and mtree. When upgrading from version of FreeBSD where the install command does not support -l, you will need to install a new mergemaster command if mergemaster -p is required. This can be accomplished with the command (cd src/usr.sbin/mergemaster && make install). 20130404: Legacy ATA stack, disabled and replaced by new CAM-based one since FreeBSD 9.0, completely removed from the sources. Kernel modules atadisk and atapi*, user-level tools atacontrol and burncd are removed. Kernel option `options ATA_CAM` is now permanently enabled and removed. 20130319: SOCK_CLOEXEC and SOCK_NONBLOCK flags have been added to socket(2) and socketpair(2). Software, in particular Kerberos, may automatically detect and use these during building. The resulting binaries will not work on older kernels. 20130308: CTL_DISABLE has also been added to the sparc64 GENERIC (for further information, see the respective 20130304 entry). 20130304: Recent commits to callout(9) changed the size of struct callout, so the KBI is probably heavily disturbed. Also, some functions in callout(9)/sleep(9)/sleepqueue(9)/condvar(9) KPIs were replaced by macros. Every kernel module using it won't load, so rebuild is requested. The ctl device has been re-enabled in GENERIC for i386 and amd64, but does not initialize by default (because of the new CTL_DISABLE option) to save memory. To re-enable it, remove the CTL_DISABLE option from the kernel config file or set kern.cam.ctl.disable=0 in /boot/loader.conf. 20130301: The ctl device has been disabled in GENERIC for i386 and amd64. This was done due to the extra memory being allocated at system initialisation time by the ctl driver which was only used if a CAM target device was created. This makes a FreeBSD system unusable on 128MB or less of RAM. 20130208: A new compression method (lz4) has been merged to -HEAD. Please refer to zpool-features(7) for more information. Please refer to the "ZFS notes" section of this file for information on upgrading boot ZFS pools. 20130129: A BSD-licensed patch(1) variant has been added and is installed as bsdpatch, being the GNU version the default patch. To inverse the logic and use the BSD-licensed one as default, while having the GNU version installed as gnupatch, rebuild and install world with the WITH_BSD_PATCH knob set. 20130121: Due to the use of the new -l option to install(1) during build and install, you must take care not to directly set the INSTALL make variable in your /etc/make.conf, /etc/src.conf, or on the command line. If you wish to use the -C flag for all installs you may be able to add INSTALL+=-C to /etc/make.conf or /etc/src.conf. 20130118: The install(1) option -M has changed meaning and now takes an argument that is a file or path to append logs to. In the unlikely event that -M was the last option on the command line and the command line contained at least two files and a target directory the first file will have logs appended to it. The -M option served little practical purpose in the last decade so its use is expected to be extremely rare. 20121223: After switching to Clang as the default compiler some users of ZFS on i386 systems started to experience stack overflow kernel panics. Please consider using 'options KSTACK_PAGES=4' in such configurations. 20121222: GEOM_LABEL now mangles label names read from file system metadata. Mangling affect labels containing spaces, non-printable characters, '%' or '"'. Device names in /etc/fstab and other places may need to be updated. 20121217: By default, only the 10 most recent kernel dumps will be saved. To restore the previous behaviour (no limit on the number of kernel dumps stored in the dump directory) add the following line to /etc/rc.conf: savecore_flags="" 20121201: With the addition of auditdistd(8), a new auditdistd user is now required during installworld. "mergemaster -p" can be used to add the user prior to installworld, as documented in the handbook. 20121117: The sin6_scope_id member variable in struct sockaddr_in6 is now filled by the kernel before passing the structure to the userland via sysctl or routing socket. This means the KAME-specific embedded scope id in sin6_addr.s6_addr[2] is always cleared in userland application. This behavior can be controlled by net.inet6.ip6.deembed_scopeid. __FreeBSD_version is bumped to 1000025. 20121105: On i386 and amd64 systems WITH_CLANG_IS_CC is now the default. This means that the world and kernel will be compiled with clang and that clang will be installed as /usr/bin/cc, /usr/bin/c++, and /usr/bin/cpp. To disable this behavior and revert to building with gcc, compile with WITHOUT_CLANG_IS_CC. Really old versions of current may need to bootstrap WITHOUT_CLANG first if the clang build fails (its compatibility window doesn't extend to the 9 stable branch point). 20121102: The IPFIREWALL_FORWARD kernel option has been removed. Its functionality now turned on by default. 20121023: The ZERO_COPY_SOCKET kernel option has been removed and split into SOCKET_SEND_COW and SOCKET_RECV_PFLIP. NB: SOCKET_SEND_COW uses the VM page based copy-on-write mechanism which is not safe and may result in kernel crashes. NB: The SOCKET_RECV_PFLIP mechanism is useless as no current driver supports disposeable external page sized mbuf storage. Proper replacements for both zero-copy mechanisms are under consideration and will eventually lead to complete removal of the two kernel options. 20121023: The IPv4 network stack has been converted to network byte order. The following modules need to be recompiled together with kernel: carp(4), divert(4), gif(4), siftr(4), gre(4), pf(4), ipfw(4), ng_ipfw(4), stf(4). 20121022: Support for non-MPSAFE filesystems was removed from VFS. The VFS_VERSION was bumped, all filesystem modules shall be recompiled. 20121018: All the non-MPSAFE filesystems have been disconnected from the build. The full list includes: codafs, hpfs, ntfs, nwfs, portalfs, smbfs, xfs. 20121016: The interface cloning API and ABI has changed. The following modules need to be recompiled together with kernel: ipfw(4), pfsync(4), pflog(4), usb(4), wlan(4), stf(4), vlan(4), disc(4), edsc(4), if_bridge(4), gif(4), tap(4), faith(4), epair(4), enc(4), tun(4), if_lagg(4), gre(4). 20121015: The sdhci driver was split in two parts: sdhci (generic SD Host Controller logic) and sdhci_pci (actual hardware driver). No kernel config modifications are required, but if you load sdhc as a module you must switch to sdhci_pci instead. 20121014: Import the FUSE kernel and userland support into base system. 20121013: The GNU sort(1) program has been removed since the BSD-licensed sort(1) has been the default for quite some time and no serious problems have been reported. The corresponding WITH_GNU_SORT knob has also gone. 20121006: The pfil(9) API/ABI for AF_INET family has been changed. Packet filtering modules: pf(4), ipfw(4), ipfilter(4) need to be recompiled with new kernel. 20121001: The net80211(4) ABI has been changed to allow for improved driver PS-POLL and power-save support. All wireless drivers need to be recompiled to work with the new kernel. 20120913: The random(4) support for the VIA hardware random number generator (`PADLOCK') is no longer enabled unconditionally. Add the padlock_rng device in the custom kernel config if needed. The GENERIC kernels on i386 and amd64 do include the device, so the change only affects the custom kernel configurations. 20120908: The pf(4) packet filter ABI has been changed. pfctl(8) and snmp_pf module need to be recompiled to work with new kernel. 20120828: A new ZFS feature flag "com.delphix:empty_bpobj" has been merged to -HEAD. Pools that have empty_bpobj in active state can not be imported read-write with ZFS implementations that do not support this feature. For more information read the zpool-features(5) manual page. 20120727: The sparc64 ZFS loader has been changed to no longer try to auto- detect ZFS providers based on diskN aliases but now requires these to be explicitly listed in the OFW boot-device environment variable. 20120712: The OpenSSL has been upgraded to 1.0.1c. Any binaries requiring libcrypto.so.6 or libssl.so.6 must be recompiled. Also, there are configuration changes. Make sure to merge /etc/ssl/openssl.cnf. 20120712: The following sysctls and tunables have been renamed for consistency with other variables: kern.cam.da.da_send_ordered -> kern.cam.da.send_ordered kern.cam.ada.ada_send_ordered -> kern.cam.ada.send_ordered 20120628: The sort utility has been replaced with BSD sort. For now, GNU sort is also available as "gnusort" or the default can be set back to GNU sort by setting WITH_GNU_SORT. In this case, BSD sort will be installed as "bsdsort". 20120611: A new version of ZFS (pool version 5000) has been merged to -HEAD. Starting with this version the old system of ZFS pool versioning is superseded by "feature flags". This concept enables forward compatibility against certain future changes in functionality of ZFS pools. The first read-only compatible "feature flag" for ZFS pools is named "com.delphix:async_destroy". For more information read the new zpool-features(5) manual page. Please refer to the "ZFS notes" section of this file for information on upgrading boot ZFS pools. 20120417: The malloc(3) implementation embedded in libc now uses sources imported as contrib/jemalloc. The most disruptive API change is to /etc/malloc.conf. If your system has an old-style /etc/malloc.conf, delete it prior to installworld, and optionally re-create it using the new format after rebooting. See malloc.conf(5) for details (specifically the TUNING section and the "opt.*" entries in the MALLCTL NAMESPACE section). 20120328: Big-endian MIPS TARGET_ARCH values no longer end in "eb". mips64eb is now spelled mips64. mipsn32eb is now spelled mipsn32. mipseb is now spelled mips. This is to aid compatibility with third-party software that expects this naming scheme in uname(3). Little-endian settings are unchanged. If you are updating a big-endian mips64 machine from before this change, you may need to set MACHINE_ARCH=mips64 in your environment before the new build system will recognize your machine. 20120306: Disable by default the option VFS_ALLOW_NONMPSAFE for all supported platforms. 20120229: Now unix domain sockets behave "as expected" on nullfs(5). Previously nullfs(5) did not pass through all behaviours to the underlying layer, as a result if we bound to a socket on the lower layer we could connect only to the lower path; if we bound to the upper layer we could connect only to the upper path. The new behavior is one can connect to both the lower and the upper paths regardless what layer path one binds to. 20120211: The getifaddrs upgrade path broken with 20111215 has been restored. If you have upgraded in between 20111215 and 20120209 you need to recompile libc again with your kernel. You still need to recompile world to be able to configure CARP but this restriction already comes from 20111215. 20120114: The set_rcvar() function has been removed from /etc/rc.subr. All base and ports rc.d scripts have been updated, so if you have a port installed with a script in /usr/local/etc/rc.d you can either hand-edit the rcvar= line, or reinstall the port. An easy way to handle the mass-update of /etc/rc.d: rm /etc/rc.d/* && mergemaster -i 20120109: panic(9) now stops other CPUs in the SMP systems, disables interrupts on the current CPU and prevents other threads from running. This behavior can be reverted using the kern.stop_scheduler_on_panic tunable/sysctl. The new behavior can be incompatible with kern.sync_on_panic. 20111215: The carp(4) facility has been changed significantly. Configuration of the CARP protocol via ifconfig(8) has changed, as well as format of CARP events submitted to devd(8) has changed. See manual pages for more information. The arpbalance feature of carp(4) is currently not supported anymore. Size of struct in_aliasreq, struct in6_aliasreq has changed. User utilities using SIOCAIFADDR, SIOCAIFADDR_IN6, e.g. ifconfig(8), need to be recompiled. 20111122: The acpi_wmi(4) status device /dev/wmistat has been renamed to /dev/wmistat0. 20111108: The option VFS_ALLOW_NONMPSAFE option has been added in order to explicitely support non-MPSAFE filesystems. It is on by default for all supported platform at this present time. 20111101: The broken amd(4) driver has been replaced with esp(4) in the amd64, i386 and pc98 GENERIC kernel configuration files. 20110930: sysinstall has been removed 20110923: The stable/9 branch created in subversion. This corresponds to the RELENG_9 branch in CVS. COMMON ITEMS: General Notes ------------- Avoid using make -j when upgrading. While generally safe, there are sometimes problems using -j to upgrade. If your upgrade fails with -j, please try again without -j. From time to time in the past there have been problems using -j with buildworld and/or installworld. This is especially true when upgrading between "distant" versions (eg one that cross a major release boundary or several minor releases, or when several months have passed on the -current branch). Sometimes, obscure build problems are the result of environment poisoning. This can happen because the make utility reads its environment when searching for values for global variables. To run your build attempts in an "environmental clean room", prefix all make commands with 'env -i '. See the env(1) manual page for more details. When upgrading from one major version to another it is generally best to upgrade to the latest code in the currently installed branch first, then do an upgrade to the new branch. This is the best-tested upgrade path, and has the highest probability of being successful. Please try this approach before reporting problems with a major version upgrade. When upgrading a live system, having a root shell around before installing anything can help undo problems. Not having a root shell around can lead to problems if pam has changed too much from your starting point to allow continued authentication after the upgrade. ZFS notes --------- When upgrading the boot ZFS pool to a new version, always follow these two steps: 1.) recompile and reinstall the ZFS boot loader and boot block (this is part of "make buildworld" and "make installworld") 2.) update the ZFS boot block on your boot drive The following example updates the ZFS boot block on the first partition (freebsd-boot) of a GPT partitioned drive ada0: "gpart bootcode -p /boot/gptzfsboot -i 1 ada0" Non-boot pools do not need these updates. To build a kernel ----------------- If you are updating from a prior version of FreeBSD (even one just a few days old), you should follow this procedure. It is the most failsafe as it uses a /usr/obj tree with a fresh mini-buildworld, make kernel-toolchain make -DALWAYS_CHECK_MAKE buildkernel KERNCONF=YOUR_KERNEL_HERE make -DALWAYS_CHECK_MAKE installkernel KERNCONF=YOUR_KERNEL_HERE To test a kernel once --------------------- If you just want to boot a kernel once (because you are not sure if it works, or if you want to boot a known bad kernel to provide debugging information) run make installkernel KERNCONF=YOUR_KERNEL_HERE KODIR=/boot/testkernel nextboot -k testkernel To just build a kernel when you know that it won't mess you up -------------------------------------------------------------- This assumes you are already running a CURRENT system. Replace ${arch} with the architecture of your machine (e.g. "i386", "arm", "amd64", "ia64", "pc98", "sparc64", "powerpc", "mips", etc). cd src/sys/${arch}/conf config KERNEL_NAME_HERE cd ../compile/KERNEL_NAME_HERE make depend make make install If this fails, go to the "To build a kernel" section. To rebuild everything and install it on the current system. ----------------------------------------------------------- # Note: sometimes if you are running current you gotta do more than # is listed here if you are upgrading from a really old current. make buildworld make kernel KERNCONF=YOUR_KERNEL_HERE [1] [3] mergemaster -Fp [5] make installworld mergemaster -Fi [4] make delete-old [6] To cross-install current onto a separate partition -------------------------------------------------- # In this approach we use a separate partition to hold # current's root, 'usr', and 'var' directories. A partition # holding "/", "/usr" and "/var" should be about 2GB in # size. make buildworld make buildkernel KERNCONF=YOUR_KERNEL_HERE make installworld DESTDIR=${CURRENT_ROOT} -DDB_FROM_SRC make distribution DESTDIR=${CURRENT_ROOT} # if newfs'd make installkernel KERNCONF=YOUR_KERNEL_HERE DESTDIR=${CURRENT_ROOT} cp /etc/fstab ${CURRENT_ROOT}/etc/fstab # if newfs'd To upgrade in-place from stable to current ---------------------------------------------- make buildworld [9] make kernel KERNCONF=YOUR_KERNEL_HERE [8] [1] [3] mergemaster -Fp [5] make installworld mergemaster -Fi [4] make delete-old [6] Make sure that you've read the UPDATING file to understand the tweaks to various things you need. At this point in the life cycle of current, things change often and you are on your own to cope. The defaults can also change, so please read ALL of the UPDATING entries. Also, if you are tracking -current, you must be subscribed to freebsd-current@freebsd.org. Make sure that before you update your sources that you have read and understood all the recent messages there. If in doubt, please track -stable which has much fewer pitfalls. [1] If you have third party modules, such as vmware, you should disable them at this point so they don't crash your system on reboot. [3] From the bootblocks, boot -s, and then do fsck -p mount -u / mount -a cd src adjkerntz -i # if CMOS is wall time Also, when doing a major release upgrade, it is required that you boot into single user mode to do the installworld. [4] Note: This step is non-optional. Failure to do this step can result in a significant reduction in the functionality of the system. Attempting to do it by hand is not recommended and those that pursue this avenue should read this file carefully, as well as the archives of freebsd-current and freebsd-hackers mailing lists for potential gotchas. The -U option is also useful to consider. See mergemaster(8) for more information. [5] Usually this step is a noop. However, from time to time you may need to do this if you get unknown user in the following step. It never hurts to do it all the time. You may need to install a new mergemaster (cd src/usr.sbin/mergemaster && make install) after the buildworld before this step if you last updated from current before 20130425 or from -stable before 20130430. [6] This only deletes old files and directories. Old libraries can be deleted by "make delete-old-libs", but you have to make sure that no program is using those libraries anymore. [8] In order to have a kernel that can run the 4.x binaries needed to do an installworld, you must include the COMPAT_FREEBSD4 option in your kernel. Failure to do so may leave you with a system that is hard to boot to recover. A similar kernel option COMPAT_FREEBSD5 is required to run the 5.x binaries on more recent kernels. And so on for COMPAT_FREEBSD6 and COMPAT_FREEBSD7. Make sure that you merge any new devices from GENERIC since the last time you updated your kernel config file. [9] When checking out sources, you must include the -P flag to have cvs prune empty directories. If CPUTYPE is defined in your /etc/make.conf, make sure to use the "?=" instead of the "=" assignment operator, so that buildworld can override the CPUTYPE if it needs to. MAKEOBJDIRPREFIX must be defined in an environment variable, and not on the command line, or in /etc/make.conf. buildworld will warn if it is improperly defined. FORMAT: This file contains a list, in reverse chronological order, of major breakages in tracking -current. It is not guaranteed to be a complete list of such breakages, and only contains entries since October 10, 2007. If you need to see UPDATING entries from before that date, you will need to fetch an UPDATING file from an older FreeBSD release. Copyright information: Copyright 1998-2009 M. Warner Losh. All Rights Reserved. Redistribution, publication, translation and use, with or without modification, in full or in part, in any form or format of this document are permitted without further permission from the author. THIS DOCUMENT IS PROVIDED BY WARNER LOSH ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL WARNER LOSH BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. Contact Warner Losh if you have any questions about your use of this document. $FreeBSD$ Index: user/ngie/more-tests/bin/csh/config.h =================================================================== --- user/ngie/more-tests/bin/csh/config.h (revision 281584) +++ user/ngie/more-tests/bin/csh/config.h (revision 281585) @@ -1,277 +1,277 @@ /* $FreeBSD$ */ /* config.h. Generated from config.h.in by configure. */ /* config.h.in. Generated from configure.in by autoheader. */ /* Define to the type of elements in the array set by `getgroups'. Usually this is either `int' or `gid_t'. */ #define GETGROUPS_T gid_t /* Define to 1 if the `getpgrp' function requires zero arguments. */ #define GETPGRP_VOID 1 /* Define to 1 if you have the header file. */ /* #undef HAVE_AUTH_H */ /* Define to 1 if you have the header file. */ /* #undef HAVE_CRYPT_H */ /* Define to 1 if you have the declaration of `crypt', and to 0 if you don't. */ #define HAVE_DECL_CRYPT 1 /* Define to 1 if you have the declaration of `environ', and to 0 if you don't. */ #define HAVE_DECL_ENVIRON 0 /* Define to 1 if you have the declaration of `gethostname', and to 0 if you don't. */ #define HAVE_DECL_GETHOSTNAME 1 /* Define to 1 if you have the declaration of `getpgrp', and to 0 if you don't. */ #define HAVE_DECL_GETPGRP 1 /* Define to 1 if you have the header file, and it defines `DIR'. */ #define HAVE_DIRENT_H 1 /* Define to 1 if you have the `dup2' function. */ #define HAVE_DUP2 1 /* Define to 1 if you have the header file. */ /* #undef HAVE_FEATURES_H */ /* Define to 1 if you have the `getauthid' function. */ /* #undef HAVE_GETAUTHID */ /* Define to 1 if you have the `getcwd' function. */ #define HAVE_GETCWD 1 /* Define to 1 if you have the `gethostname' function. */ #define HAVE_GETHOSTNAME 1 /* Define to 1 if you have the `getpwent' function. */ #define HAVE_GETPWENT 1 /* Define to 1 if you have the `getutent' function. */ /* #undef HAVE_GETUTENT */ /* Define to 1 if you have the `getutxent' function. */ #define HAVE_GETUTXENT 1 /* Define if you have the iconv() function and it works. */ /* #undef HAVE_ICONV */ /* Define to 1 if you have the header file. */ #define HAVE_INTTYPES_H 1 /* Define to 1 if the system has the type `long long'. */ #define HAVE_LONG_LONG 1 /* Define to 1 if you have the `mallinfo' function. */ /* #undef HAVE_MALLINFO */ /* Define to 1 if mbrtowc and mbstate_t are properly declared. */ #define HAVE_MBRTOWC 1 /* Define to 1 if you have the `memmove' function. */ #define HAVE_MEMMOVE 1 /* Define to 1 if you have the header file. */ #define HAVE_MEMORY_H 1 /* Define to 1 if you have the `memset' function. */ #define HAVE_MEMSET 1 /* Define to 1 if you have the `mkstemp' function. */ #define HAVE_MKSTEMP 1 /* Define to 1 if you have the header file, and it defines `DIR'. */ /* #undef HAVE_NDIR_H */ /* Define to 1 if you have the `nice' function. */ #define HAVE_NICE 1 /* Define to 1 if you have the `nl_langinfo' function. */ #define HAVE_NL_LANGINFO 1 /* Define to 1 if you have the header file. */ #define HAVE_PATHS_H 1 /* Define to 1 if you have the `sbrk' function. */ #define HAVE_SBRK 1 /* Define to 1 if you have the `setpgid' function. */ #define HAVE_SETPGID 1 /* Define to 1 if you have the `setpriority' function. */ #define HAVE_SETPRIORITY 1 /* Define to 1 if you have the header file. */ /* #undef HAVE_SHADOW_H */ /* Define to 1 if you have the header file. */ #define HAVE_STDINT_H 1 /* Define to 1 if you have the header file. */ #define HAVE_STDLIB_H 1 /* Define to 1 if you have the `strcoll' function and it is properly defined. */ #define HAVE_STRCOLL 1 /* Define to 1 if you have the `strerror' function. */ #define HAVE_STRERROR 1 /* Define to 1 if you have the header file. */ #define HAVE_STRINGS_H 1 /* Define to 1 if you have the header file. */ #define HAVE_STRING_H 1 /* Define to 1 if you have the `strstr' function. */ #define HAVE_STRSTR 1 /* Define to 1 if `d_ino' is a member of `struct dirent'. */ #define HAVE_STRUCT_DIRENT_D_INO 1 /* Define to 1 if `ss_family' is a member of `struct sockaddr_storage'. */ #define HAVE_STRUCT_SOCKADDR_STORAGE_SS_FAMILY 1 /* Define to 1 if `ut_host' is a member of `struct utmpx'. */ #define HAVE_STRUCT_UTMPX_UT_HOST 1 /* Define to 1 if `ut_tv' is a member of `struct utmpx'. */ #define HAVE_STRUCT_UTMPX_UT_TV 1 /* Define to 1 if `ut_user' is a member of `struct utmpx'. */ #define HAVE_STRUCT_UTMPX_UT_USER 1 /* Define to 1 if `ut_xtime' is a member of `struct utmpx'. */ /* #undef HAVE_STRUCT_UTMPX_UT_XTIME */ /* Define to 1 if `ut_host' is a member of `struct utmp'. */ #define HAVE_STRUCT_UTMP_UT_HOST 1 /* Define to 1 if `ut_tv' is a member of `struct utmp'. */ #define HAVE_STRUCT_UTMP_UT_TV 1 /* Define to 1 if `ut_user' is a member of `struct utmp'. */ #define HAVE_STRUCT_UTMP_UT_USER 1 /* Define to 1 if `ut_xtime' is a member of `struct utmp'. */ /* #undef HAVE_STRUCT_UTMP_UT_XTIME */ /* Define to 1 if you have the `sysconf' function. */ #define HAVE_SYSCONF 1 /* Define to 1 if you have the header file, and it defines `DIR'. */ /* #undef HAVE_SYS_DIR_H */ /* Define to 1 if you have the header file, and it defines `DIR'. */ /* #undef HAVE_SYS_NDIR_H */ /* Define to 1 if you have the header file. */ #define HAVE_SYS_STAT_H 1 /* Define to 1 if you have the header file. */ #define HAVE_SYS_TYPES_H 1 /* Define to 1 if you have the header file. */ #define HAVE_UNISTD_H 1 /* Define to 1 if you have the header file. */ #define HAVE_UTMPX_H 1 /* Define to 1 if you have the header file. */ /* #undef HAVE_UTMP_H */ /* Define to 1 if you have the header file. */ #define HAVE_WCHAR_H 1 /* Define to 1 if you have the header file. */ #define HAVE_WCTYPE_H 1 /* Define to 1 if you have the `wcwidth' function. */ #define HAVE_WCWIDTH 1 /* Define as const if the declaration of iconv() needs const. */ -#define ICONV_CONST const +#define ICONV_CONST /* Support NLS. */ #define NLS 1 /* Support NLS catalogs. */ #define NLS_CATALOGS 1 /* Define to the address where bug reports for this package should be sent. */ #define PACKAGE_BUGREPORT "http://bugs.gw.com/" /* Define to the full name of this package. */ #define PACKAGE_NAME "tcsh" /* Define to the full name and version of this package. */ #define PACKAGE_STRING "tcsh 6.18.01" /* Define to the one symbol short name of this package. */ #define PACKAGE_TARNAME "tcsh" /* Define to the home page for this package. */ #define PACKAGE_URL "" /* Define to the version of this package. */ #define PACKAGE_VERSION "6.18.01" /* Define to 1 if the `setpgrp' function takes no argument. */ /* #undef SETPGRP_VOID */ /* The size of `wchar_t', as computed by sizeof. */ #define SIZEOF_WCHAR_T 4 /* Define to 1 if the `S_IS*' macros in do not work properly. */ /* #undef STAT_MACROS_BROKEN */ /* Define to 1 if you have the ANSI C header files. */ #define STDC_HEADERS 1 /* Define for Solaris 2.5.1 so the uint32_t typedef from , , or is not used. If the typedef were allowed, the #define below would cause a syntax error. */ /* #undef _UINT32_T */ /* Define to empty if `const' does not conform to ANSI C. */ /* #undef const */ /* Define to `int' if doesn't define. */ /* #undef gid_t */ /* Define to `int' if does not define. */ /* #undef mode_t */ /* Define to `unsigned int' if does not define. */ /* #undef size_t */ /* Define to `int' if neither nor define. */ /* #undef socklen_t */ /* Define to `int' not defined in . */ /* #undef ssize_t */ /* Define to `int' if doesn't define. */ /* #undef uid_t */ /* Define to the type of an unsigned integer type of width exactly 32 bits if such a type exists and the standard includes do not define it. */ /* #undef uint32_t */ /* Define to empty if the keyword `volatile' does not work. Warning: valid code using `volatile' can become incorrect without. Disable with care. */ /* #undef volatile */ #include "config_p.h" #include "config_f.h" /* Work around a vendor issue where config_f.h is #undef'ing this setting */ #define SYSMALLOC Index: user/ngie/more-tests/contrib/pjdfstest/tests/open/20.t =================================================================== --- user/ngie/more-tests/contrib/pjdfstest/tests/open/20.t (revision 281584) +++ user/ngie/more-tests/contrib/pjdfstest/tests/open/20.t (revision 281585) @@ -1,20 +1,24 @@ #!/bin/sh # $FreeBSD: head/tools/regression/pjdfstest/tests/open/20.t 211352 2010-08-15 21:24:17Z pjd $ desc="open returns ETXTBSY when the file is a pure procedure (shared text) file that is being executed and the open() system call requests write access" dir=`dirname $0` . ${dir}/../misc.sh [ "${os}:${fs}" = "FreeBSD:UFS" ] || quick_exit echo "1..4" n0=`namegen` cp -pf `which sleep` ${n0} ./${n0} 3 & +while ! pkill -0 -f ./${n0}; do + sleep 0.1 +done expect ETXTBSY open ${n0} O_WRONLY expect ETXTBSY open ${n0} O_RDWR expect ETXTBSY open ${n0} O_RDONLY,O_TRUNC +pkill -9 -f ./${n0} expect 0 unlink ${n0} Index: user/ngie/more-tests/contrib/pjdfstest/tests/truncate/11.t =================================================================== --- user/ngie/more-tests/contrib/pjdfstest/tests/truncate/11.t (revision 281584) +++ user/ngie/more-tests/contrib/pjdfstest/tests/truncate/11.t (revision 281585) @@ -1,18 +1,22 @@ #!/bin/sh # $FreeBSD: head/tools/regression/pjdfstest/tests/truncate/11.t 211352 2010-08-15 21:24:17Z pjd $ desc="truncate returns ETXTBSY the file is a pure procedure (shared text) file that is being executed" dir=`dirname $0` . ${dir}/../misc.sh [ "${os}" = "FreeBSD" ] || quick_exit echo "1..2" n0=`namegen` cp -pf `which sleep` ${n0} ./${n0} 3 & +while ! pkill -0 -f ./${n0}; do + sleep 0.1 +done expect ETXTBSY truncate ${n0} 123 +pkill -9 -f ./${n0} expect 0 unlink ${n0} Index: user/ngie/more-tests/contrib/smbfs/include/netsmb/smb_lib.h =================================================================== --- user/ngie/more-tests/contrib/smbfs/include/netsmb/smb_lib.h (revision 281584) +++ user/ngie/more-tests/contrib/smbfs/include/netsmb/smb_lib.h (revision 281585) @@ -1,258 +1,258 @@ /* * Copyright (c) 2000-2001 Boris Popov * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. All advertising materials mentioning features or use of this software * must display the following acknowledgement: * This product includes software developed by Boris Popov. * 4. Neither the name of the author nor the names of any co-contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * $Id: smb_lib.h,v 1.24 2001/12/20 15:19:43 bp Exp $ * $FreeBSD$ */ #ifndef _NETSMB_SMB_LIB_H_ #define _NETSMB_SMB_LIB_H_ #include #include #ifndef SMB_CFG_FILE #define SMB_CFG_FILE "/usr/local/etc/nsmb.conf" #endif #define STDPARAM_ARGS 'A':case 'B':case 'C':case 'E':case 'I': \ case 'L':case 'M': \ case 'N':case 'U':case 'R':case 'S':case 'T': \ case 'W':case 'O':case 'P' #define STDPARAM_OPT "A:BCE:I:L:M:NO:P:U:R:S:T:W:" /* * bits to indicate the source of error */ #define SMB_ERRTYPE_MASK 0xf0000 #define SMB_SYS_ERROR 0x00000 #define SMB_RAP_ERROR 0x10000 #define SMB_NB_ERROR 0x20000 #ifndef min #define min(a,b) (((a)<(b)) ? (a) : (b)) #endif #define getb(buf,ofs) (((const u_int8_t *)(buf))[ofs]) #define setb(buf,ofs,val) (((u_int8_t*)(buf))[ofs])=val #define getbw(buf,ofs) ((u_int16_t)(getb(buf,ofs))) #define getw(buf,ofs) (*((u_int16_t*)(&((u_int8_t*)(buf))[ofs]))) #define getdw(buf,ofs) (*((u_int32_t*)(&((u_int8_t*)(buf))[ofs]))) #if (BYTE_ORDER == LITTLE_ENDIAN) #define getwle(buf,ofs) (*((u_int16_t*)(&((u_int8_t*)(buf))[ofs]))) #define getdle(buf,ofs) (*((u_int32_t*)(&((u_int8_t*)(buf))[ofs]))) #define getwbe(buf,ofs) (ntohs(getwle(buf,ofs))) #define getdbe(buf,ofs) (ntohl(getdle(buf,ofs))) #define setwle(buf,ofs,val) getwle(buf,ofs)=val #define setwbe(buf,ofs,val) getwle(buf,ofs)=htons(val) #define setdle(buf,ofs,val) getdle(buf,ofs)=val #define setdbe(buf,ofs,val) getdle(buf,ofs)=htonl(val) #else /* (BYTE_ORDER == LITTLE_ENDIAN) */ #define getwbe(buf,ofs) (*((u_int16_t*)(&((u_int8_t*)(buf))[ofs]))) #define getdbe(buf,ofs) (*((u_int32_t*)(&((u_int8_t*)(buf))[ofs]))) #define getwle(buf,ofs) (bswap16(getwbe(buf,ofs))) #define getdle(buf,ofs) (bswap32(getdbe(buf,ofs))) #define setwbe(buf,ofs,val) getwbe(buf,ofs)=val #define setwle(buf,ofs,val) getwbe(buf,ofs)=bswap16(val) #define setdbe(buf,ofs,val) getdbe(buf,ofs)=val #define setdle(buf,ofs,val) getdbe(buf,ofs)=bswap32(val) #endif /* (BYTE_ORDER == LITTLE_ENDIAN) */ /* * SMB work context. Used to store all values which is necessary * to establish connection to an SMB server. */ struct smb_ctx { int ct_flags; /* SMBCF_ */ int ct_fd; /* handle of connection */ int ct_parsedlevel; int ct_minlevel; int ct_maxlevel; char * ct_srvaddr; /* hostname or IP address of server */ char ct_locname[SMB_MAXUSERNAMELEN + 1]; const char * ct_uncnext; struct nb_ctx * ct_nb; struct smbioc_ossn ct_ssn; struct smbioc_oshare ct_sh; long ct_smbtcpport; }; #define SMBCF_NOPWD 0x0001 /* don't ask for a password */ #define SMBCF_SRIGHTS 0x0002 /* share access rights was supplied */ #define SMBCF_LOCALE 0x0004 /* use current locale */ #define SMBCF_RESOLVED 0x8000 /* structure has been verified */ /* * request handling structures */ struct mbuf { int m_len; int m_maxlen; char * m_data; struct mbuf * m_next; }; struct mbdata { struct mbuf * mb_top; struct mbuf * mb_cur; char * mb_pos; int mb_count; }; #define M_ALIGNFACTOR (sizeof(long)) #define M_ALIGN(len) (((len) + M_ALIGNFACTOR - 1) & ~(M_ALIGNFACTOR - 1)) #define M_BASESIZE (sizeof(struct mbuf)) #define M_MINSIZE (256 - M_BASESIZE) #define M_TOP(m) ((char*)(m) + M_BASESIZE) #define mtod(m,t) ((t)(m)->m_data) #define M_TRAILINGSPACE(m) ((m)->m_maxlen - (m)->m_len) struct smb_rq { u_char rq_cmd; struct mbdata rq_rq; struct mbdata rq_rp; struct smb_ctx *rq_ctx; int rq_wcount; int rq_bcount; }; struct smb_bitname { u_int bn_bit; char *bn_name; }; extern struct rcfile *smb_rc; __BEGIN_DECLS struct sockaddr; int smb_lib_init(void); int smb_open_rcfile(void); void smb_error(const char *, int,...); char *smb_printb(char *, int, const struct smb_bitname *); void *smb_dumptree(void); /* * Context management */ int smb_ctx_init(struct smb_ctx *, int, char *[], int, int, int); void smb_ctx_done(struct smb_ctx *); int smb_ctx_parseunc(struct smb_ctx *, const char *, int, const char **); int smb_ctx_setcharset(struct smb_ctx *, const char *); int smb_ctx_setserver(struct smb_ctx *, const char *); int smb_ctx_setnbport(struct smb_ctx *, int); int smb_ctx_setsmbport(struct smb_ctx *, int); int smb_ctx_setuser(struct smb_ctx *, const char *); int smb_ctx_setshare(struct smb_ctx *, const char *, int); int smb_ctx_setscope(struct smb_ctx *, const char *); int smb_ctx_setworkgroup(struct smb_ctx *, const char *); int smb_ctx_setpassword(struct smb_ctx *, const char *); int smb_ctx_setsrvaddr(struct smb_ctx *, const char *); int smb_ctx_opt(struct smb_ctx *, int, const char *); int smb_ctx_lookup(struct smb_ctx *, int, int); int smb_ctx_login(struct smb_ctx *); int smb_ctx_readrc(struct smb_ctx *); int smb_ctx_resolve(struct smb_ctx *); int smb_ctx_setflags(struct smb_ctx *, int, int, int); -int smb_smb_open_print_file(struct smb_ctx *, int, int, const char *, smbfh*); +int smb_smb_open_print_file(struct smb_ctx *, int, int, char *, smbfh*); int smb_smb_close_print_file(struct smb_ctx *, smbfh); int smb_read(struct smb_ctx *, smbfh, off_t, size_t, char *); int smb_write(struct smb_ctx *, smbfh, off_t, size_t, const char *); #define smb_rq_getrequest(rqp) (&(rqp)->rq_rq) #define smb_rq_getreply(rqp) (&(rqp)->rq_rp) int smb_rq_init(struct smb_ctx *, u_char, size_t, struct smb_rq **); void smb_rq_done(struct smb_rq *); void smb_rq_wend(struct smb_rq *); int smb_rq_simple(struct smb_rq *); -int smb_rq_dmem(struct mbdata *, const char *, size_t); -int smb_rq_dstring(struct mbdata *, const char *); +int smb_rq_dmem(struct mbdata *, char *, size_t); +int smb_rq_dstring(struct mbdata *, char *); int smb_t2_request(struct smb_ctx *, int, int, const char *, int, void *, int, void *, int *, void *, int *, void *); char* smb_simplecrypt(char *dst, const char *src); int smb_simpledecrypt(char *dst, const char *src); int m_getm(struct mbuf *, size_t, struct mbuf **); int m_lineup(struct mbuf *, struct mbuf **); int mb_init(struct mbdata *, size_t); int mb_initm(struct mbdata *, struct mbuf *); int mb_done(struct mbdata *); int mb_fit(struct mbdata *mbp, size_t size, char **pp); int mb_put_uint8(struct mbdata *, u_int8_t); int mb_put_uint16be(struct mbdata *, u_int16_t); int mb_put_uint16le(struct mbdata *, u_int16_t); int mb_put_uint32be(struct mbdata *, u_int32_t); int mb_put_uint32le(struct mbdata *, u_int32_t); int mb_put_int64be(struct mbdata *, int64_t); int mb_put_int64le(struct mbdata *, int64_t); int mb_put_mem(struct mbdata *, const char *, size_t); int mb_put_pstring(struct mbdata *mbp, const char *s); int mb_put_mbuf(struct mbdata *, struct mbuf *); int mb_get_uint8(struct mbdata *, u_int8_t *); int mb_get_uint16(struct mbdata *, u_int16_t *); int mb_get_uint16le(struct mbdata *, u_int16_t *); int mb_get_uint16be(struct mbdata *, u_int16_t *); int mb_get_uint32(struct mbdata *, u_int32_t *); int mb_get_uint32be(struct mbdata *, u_int32_t *); int mb_get_uint32le(struct mbdata *, u_int32_t *); int mb_get_int64(struct mbdata *, int64_t *); int mb_get_int64be(struct mbdata *, int64_t *); int mb_get_int64le(struct mbdata *, int64_t *); int mb_get_mem(struct mbdata *, char *, size_t); extern u_char nls_lower[256], nls_upper[256]; int nls_setrecode(const char *, const char *); int nls_setlocale(const char *); -char* nls_str_toext(char *, const char *); -char* nls_str_toloc(char *, const char *); -void* nls_mem_toext(void *, const void *, int); -void* nls_mem_toloc(void *, const void *, int); +char* nls_str_toext(char *, char *); +char* nls_str_toloc(char *, char *); +void* nls_mem_toext(void *, void *, int); +void* nls_mem_toloc(void *, void *, int); char* nls_str_upper(char *, const char *); char* nls_str_lower(char *, const char *); __END_DECLS #endif /* _NETSMB_SMB_LIB_H_ */ Index: user/ngie/more-tests/contrib/smbfs/lib/smb/nls.c =================================================================== --- user/ngie/more-tests/contrib/smbfs/lib/smb/nls.c (revision 281584) +++ user/ngie/more-tests/contrib/smbfs/lib/smb/nls.c (revision 281585) @@ -1,223 +1,223 @@ /* * Copyright (c) 2000-2001, Boris Popov * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. All advertising materials mentioning features or use of this software * must display the following acknowledgement: * This product includes software developed by Boris Popov. * 4. Neither the name of the author nor the names of any co-contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * $Id: nls.c,v 1.10 2002/07/22 08:33:59 bp Exp $ */ #include __FBSDID("$FreeBSD$"); #include #include #include #include #include #include #include #include #include #include #ifdef HAVE_ICONV #include #endif u_char nls_lower[256]; u_char nls_upper[256]; #ifdef HAVE_ICONV static iconv_t nls_toext, nls_toloc; #endif int nls_setlocale(const char *name) { int i; if (setlocale(LC_CTYPE, name) == NULL) { warnx("can't set locale '%s'\n", name); return EINVAL; } for (i = 0; i < 256; i++) { nls_lower[i] = tolower(i); nls_upper[i] = toupper(i); } return 0; } int nls_setrecode(const char *local, const char *external) { #ifdef HAVE_ICONV iconv_t icd; if (nls_toext) iconv_close(nls_toext); if (nls_toloc) iconv_close(nls_toloc); nls_toext = nls_toloc = (iconv_t)0; icd = iconv_open(external, local); if (icd == (iconv_t)-1) return errno; nls_toext = icd; icd = iconv_open(local, external); if (icd == (iconv_t)-1) { iconv_close(nls_toext); nls_toext = (iconv_t)0; return errno; } nls_toloc = icd; return 0; #else return ENOENT; #endif } char * -nls_str_toloc(char *dst, const char *src) +nls_str_toloc(char *dst, char *src) { #ifdef HAVE_ICONV char *p = dst; size_t inlen, outlen; if (nls_toloc == (iconv_t)0) return strcpy(dst, src); inlen = outlen = strlen(src); iconv(nls_toloc, NULL, NULL, &p, &outlen); while (iconv(nls_toloc, &src, &inlen, &p, &outlen) == -1) { *p++ = *src++; inlen--; outlen--; } *p = 0; return dst; #else return strcpy(dst, src); #endif } char * -nls_str_toext(char *dst, const char *src) +nls_str_toext(char *dst, char *src) { #ifdef HAVE_ICONV char *p = dst; size_t inlen, outlen; if (nls_toext == (iconv_t)0) return strcpy(dst, src); inlen = outlen = strlen(src); iconv(nls_toext, NULL, NULL, &p, &outlen); while (iconv(nls_toext, &src, &inlen, &p, &outlen) == -1) { *p++ = *src++; inlen--; outlen--; } *p = 0; return dst; #else return strcpy(dst, src); #endif } void * -nls_mem_toloc(void *dst, const void *src, int size) +nls_mem_toloc(void *dst, void *src, int size) { #ifdef HAVE_ICONV char *p = dst; - const char *s = src; + char *s = src; size_t inlen, outlen; if (size == 0) return NULL; if (nls_toloc == (iconv_t)0) return memcpy(dst, src, size); inlen = outlen = size; iconv(nls_toloc, NULL, NULL, &p, &outlen); while (iconv(nls_toloc, &s, &inlen, &p, &outlen) == -1) { *p++ = *s++; inlen--; outlen--; } return dst; #else return memcpy(dst, src, size); #endif } void * -nls_mem_toext(void *dst, const void *src, int size) +nls_mem_toext(void *dst, void *src, int size) { #ifdef HAVE_ICONV char *p = dst; - const char *s = src; + char *s = src; size_t inlen, outlen; if (size == 0) return NULL; if (nls_toext == (iconv_t)0) return memcpy(dst, src, size); inlen = outlen = size; iconv(nls_toext, NULL, NULL, &p, &outlen); while (iconv(nls_toext, &s, &inlen, &p, &outlen) == -1) { *p++ = *s++; inlen--; outlen--; } return dst; #else return memcpy(dst, src, size); #endif } char * nls_str_upper(char *dst, const char *src) { char *p = dst; while (*src) *dst++ = toupper(*src++); *dst = 0; return p; } char * nls_str_lower(char *dst, const char *src) { char *p = dst; while (*src) *dst++ = tolower(*src++); *dst = 0; return p; } Index: user/ngie/more-tests/contrib/smbfs/lib/smb/print.c =================================================================== --- user/ngie/more-tests/contrib/smbfs/lib/smb/print.c (revision 281584) +++ user/ngie/more-tests/contrib/smbfs/lib/smb/print.c (revision 281585) @@ -1,97 +1,97 @@ /* * Copyright (c) 2000, Boris Popov * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. All advertising materials mentioning features or use of this software * must display the following acknowledgement: * This product includes software developed by Boris Popov. * 4. Neither the name of the author nor the names of any co-contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * $Id: print.c,v 1.4 2001/04/16 04:33:01 bp Exp $ */ #include #include #include #include #include #include #include #include #include #include #include #include #include #include /*#include */ #include #include #include int smb_smb_open_print_file(struct smb_ctx *ctx, int setuplen, int mode, - const char *ident, smbfh *fhp) + char *ident, smbfh *fhp) { struct smb_rq *rqp; struct mbdata *mbp; int error; error = smb_rq_init(ctx, SMB_COM_OPEN_PRINT_FILE, 2, &rqp); if (error) return error; mbp = smb_rq_getrequest(rqp); mb_put_uint16le(mbp, setuplen); mb_put_uint16le(mbp, mode); smb_rq_wend(rqp); mb_put_uint8(mbp, SMB_DT_ASCII); smb_rq_dstring(mbp, ident); error = smb_rq_simple(rqp); if (!error) { mbp = smb_rq_getreply(rqp); mb_get_uint16(mbp, fhp); } smb_rq_done(rqp); return error; } int smb_smb_close_print_file(struct smb_ctx *ctx, smbfh fh) { struct smb_rq *rqp; struct mbdata *mbp; int error; error = smb_rq_init(ctx, SMB_COM_CLOSE_PRINT_FILE, 0, &rqp); if (error) return error; mbp = smb_rq_getrequest(rqp); mb_put_mem(mbp, (char*)&fh, 2); smb_rq_wend(rqp); error = smb_rq_simple(rqp); smb_rq_done(rqp); return error; } Index: user/ngie/more-tests/contrib/smbfs/lib/smb/rq.c =================================================================== --- user/ngie/more-tests/contrib/smbfs/lib/smb/rq.c (revision 281584) +++ user/ngie/more-tests/contrib/smbfs/lib/smb/rq.c (revision 281585) @@ -1,180 +1,180 @@ /* * Copyright (c) 2000, Boris Popov * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. All advertising materials mentioning features or use of this software * must display the following acknowledgement: * This product includes software developed by Boris Popov. * 4. Neither the name of the author nor the names of any co-contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * $Id: rq.c,v 1.7 2001/04/16 04:33:01 bp Exp $ * $FreeBSD$ */ #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include int smb_rq_init(struct smb_ctx *ctx, u_char cmd, size_t rpbufsz, struct smb_rq **rqpp) { struct smb_rq *rqp; rqp = malloc(sizeof(*rqp)); if (rqp == NULL) return ENOMEM; bzero(rqp, sizeof(*rqp)); rqp->rq_cmd = cmd; rqp->rq_ctx = ctx; mb_init(&rqp->rq_rq, M_MINSIZE); mb_init(&rqp->rq_rp, rpbufsz); *rqpp = rqp; return 0; } void smb_rq_done(struct smb_rq *rqp) { mb_done(&rqp->rq_rp); mb_done(&rqp->rq_rq); free(rqp); } void smb_rq_wend(struct smb_rq *rqp) { if (rqp->rq_rq.mb_count & 1) smb_error("smbrq_wend: odd word count\n", 0); rqp->rq_wcount = rqp->rq_rq.mb_count / 2; rqp->rq_rq.mb_count = 0; } int -smb_rq_dmem(struct mbdata *mbp, const char *src, size_t size) +smb_rq_dmem(struct mbdata *mbp, char *src, size_t size) { struct mbuf *m; char * dst; int cplen, error; if (size == 0) return 0; m = mbp->mb_cur; if ((error = m_getm(m, size, &m)) != 0) return error; while (size > 0) { cplen = M_TRAILINGSPACE(m); if (cplen == 0) { m = m->m_next; continue; } if (cplen > (int)size) cplen = size; dst = mtod(m, char *) + m->m_len; nls_mem_toext(dst, src, cplen); size -= cplen; src += cplen; m->m_len += cplen; mbp->mb_count += cplen; } mbp->mb_pos = mtod(m, char *) + m->m_len; mbp->mb_cur = m; return 0; } int -smb_rq_dstring(struct mbdata *mbp, const char *s) +smb_rq_dstring(struct mbdata *mbp, char *s) { return smb_rq_dmem(mbp, s, strlen(s) + 1); } int smb_rq_simple(struct smb_rq *rqp) { struct smbioc_rq krq; struct mbdata *mbp; char *data; mbp = smb_rq_getrequest(rqp); m_lineup(mbp->mb_top, &mbp->mb_top); data = mtod(mbp->mb_top, char*); bzero(&krq, sizeof(krq)); krq.ioc_cmd = rqp->rq_cmd; krq.ioc_twc = rqp->rq_wcount; krq.ioc_twords = data; krq.ioc_tbc = mbp->mb_count; krq.ioc_tbytes = data + rqp->rq_wcount * 2; mbp = smb_rq_getreply(rqp); krq.ioc_rpbufsz = mbp->mb_top->m_maxlen; krq.ioc_rpbuf = mtod(mbp->mb_top, char *); if (ioctl(rqp->rq_ctx->ct_fd, SMBIOC_REQUEST, &krq) == -1) return errno; mbp->mb_top->m_len = krq.ioc_rwc * 2 + krq.ioc_rbc; rqp->rq_wcount = krq.ioc_rwc; rqp->rq_bcount = krq.ioc_rbc; return 0; } int smb_t2_request(struct smb_ctx *ctx, int setup, int setupcount, const char *name, int tparamcnt, void *tparam, int tdatacnt, void *tdata, int *rparamcnt, void *rparam, int *rdatacnt, void *rdata) { struct smbioc_t2rq krq; bzero(&krq, sizeof(krq)); krq.ioc_setup[0] = setup; krq.ioc_setupcnt = setupcount; krq.ioc_name = (char *)name; krq.ioc_tparamcnt = tparamcnt; krq.ioc_tparam = tparam; krq.ioc_tdatacnt = tdatacnt; krq.ioc_tdata = tdata; krq.ioc_rparamcnt = *rparamcnt; krq.ioc_rparam = rparam; krq.ioc_rdatacnt = *rdatacnt; krq.ioc_rdata = rdata; if (ioctl(ctx->ct_fd, SMBIOC_T2RQ, &krq) == -1) return errno; *rparamcnt = krq.ioc_rparamcnt; *rdatacnt = krq.ioc_rdatacnt; return 0; } Index: user/ngie/more-tests/etc/login.conf =================================================================== --- user/ngie/more-tests/etc/login.conf (revision 281584) +++ user/ngie/more-tests/etc/login.conf (revision 281585) @@ -1,321 +1,321 @@ # login.conf - login class capabilities database. # # Remember to rebuild the database after each change to this file: # # cap_mkdb /etc/login.conf # # This file controls resource limits, accounting limits and # default user environment settings. # # $FreeBSD$ # # Default settings effectively disable resource limits, see the # examples below for a starting point to enable them. # defaults # These settings are used by login(1) by default for classless users # Note that entries like "cputime" set both "cputime-cur" and "cputime-max" # # Note that since a colon ':' is used to separate capability entries, # a \c escape sequence must be used to embed a literal colon in the # value or name of a capability (see the ``CGETNUM AND CGETSTR SYNTAX # AND SEMANTICS'' section of getcap(3) for more escape sequences). default:\ :passwd_format=sha512:\ :copyright=/etc/COPYRIGHT:\ :welcome=/etc/motd:\ - :setenv=MAIL=/var/mail/$,BLOCKSIZE=K:LC_COLLATE=C:\ + :setenv=MAIL=/var/mail/$,BLOCKSIZE=K,LC_COLLATE=C:\ :path=/sbin /bin /usr/sbin /usr/bin /usr/local/sbin /usr/local/bin ~/bin:\ :nologin=/var/run/nologin:\ :cputime=unlimited:\ :datasize=unlimited:\ :stacksize=unlimited:\ :memorylocked=64K:\ :memoryuse=unlimited:\ :filesize=unlimited:\ :coredumpsize=unlimited:\ :openfiles=unlimited:\ :maxproc=unlimited:\ :sbsize=unlimited:\ :vmemoryuse=unlimited:\ :swapuse=unlimited:\ :pseudoterminals=unlimited:\ :kqueues=unlimited:\ :priority=0:\ :ignoretime@:\ :umask=022: # # A collection of common class names - forward them all to 'default' # (login would normally do this anyway, but having a class name # here suppresses the diagnostic) # standard:\ :tc=default: xuser:\ :tc=default: staff:\ :tc=default: daemon:\ :memorylocked=128M:\ :tc=default: news:\ :tc=default: dialer:\ :tc=default: # # Root can always login # # N.B. login_getpwclass(3) will use this entry for the root account, # in preference to 'default'. root:\ :ignorenologin:\ :memorylocked=unlimited:\ :tc=default: # # Russian Users Accounts. Setup proper environment variables. # russian|Russian Users Accounts:\ :charset=UTF-8:\ :lang=ru_RU.UTF-8:\ :tc=default: ###################################################################### ###################################################################### ## ## Example entries ## ###################################################################### ###################################################################### ## Example defaults ## These settings are used by login(1) by default for classless users ## Note that entries like "cputime" set both "cputime-cur" and "cputime-max" # #default:\ # :cputime=infinity:\ # :datasize-cur=22M:\ # :stacksize-cur=8M:\ # :memorylocked-cur=10M:\ # :memoryuse-cur=30M:\ # :filesize=infinity:\ # :coredumpsize=infinity:\ # :maxproc-cur=64:\ # :openfiles-cur=64:\ # :priority=0:\ # :requirehome@:\ # :umask=022:\ # :tc=auth-defaults: # # ## ## standard - standard user defaults ## #standard:\ # :copyright=/etc/COPYRIGHT:\ # :welcome=/etc/motd:\ # :setenv=MAIL=/var/mail/$,BLOCKSIZE=K:\ # :path=~/bin /bin /usr/bin /usr/local/bin:\ # :manpath=/usr/share/man /usr/local/man:\ # :nologin=/var/run/nologin:\ # :cputime=1h30m:\ # :datasize=8M:\ # :vmemoryuse=100M:\ # :stacksize=2M:\ # :memorylocked=4M:\ # :memoryuse=8M:\ # :filesize=8M:\ # :coredumpsize=8M:\ # :openfiles=24:\ # :maxproc=32:\ # :priority=0:\ # :requirehome:\ # :passwordtime=90d:\ # :umask=002:\ # :ignoretime@:\ # :tc=default: # # ## ## users of X (needs more resources!) ## #xuser:\ # :manpath=/usr/share/man /usr/local/man:\ # :cputime=4h:\ # :datasize=12M:\ # :vmemoryuse=infinity:\ # :stacksize=4M:\ # :filesize=8M:\ # :memoryuse=16M:\ # :openfiles=32:\ # :maxproc=48:\ # :tc=standard: # # ## ## Staff users - few restrictions and allow login anytime ## #staff:\ # :ignorenologin:\ # :ignoretime:\ # :requirehome@:\ # :accounted@:\ # :path=~/bin /bin /sbin /usr/bin /usr/sbin /usr/local/bin /usr/local/sbin:\ # :umask=022:\ # :tc=standard: # # ## ## root - fallback for root logins ## #root:\ # :path=~/bin /bin /sbin /usr/bin /usr/sbin /usr/local/bin /usr/local/sbin:\ # :cputime=infinity:\ # :datasize=infinity:\ # :stacksize=infinity:\ # :memorylocked=infinity:\ # :memoryuse=infinity:\ # :filesize=infinity:\ # :coredumpsize=infinity:\ # :openfiles=infinity:\ # :maxproc=infinity:\ # :memoryuse-cur=32M:\ # :maxproc-cur=64:\ # :openfiles-cur=1024:\ # :priority=0:\ # :requirehome@:\ # :umask=022:\ # :tc=auth-root-defaults: # # ## ## Settings used by /etc/rc ## #daemon:\ # :coredumpsize@:\ # :coredumpsize-cur=0:\ # :datasize=infinity:\ # :datasize-cur@:\ # :maxproc=512:\ # :maxproc-cur@:\ # :memoryuse-cur=64M:\ # :memorylocked-cur=64M:\ # :openfiles=1024:\ # :openfiles-cur@:\ # :stacksize=16M:\ # :stacksize-cur@:\ # :tc=default: # # ## ## Settings used by news subsystem ## #news:\ # :path=/usr/local/news/bin /bin /sbin /usr/bin /usr/sbin /usr/local/bin /usr/local/sbin:\ # :cputime=infinity:\ # :filesize=128M:\ # :datasize-cur=64M:\ # :stacksize-cur=32M:\ # :coredumpsize-cur=0:\ # :maxmemorysize-cur=128M:\ # :memorylocked=32M:\ # :maxproc=128:\ # :openfiles=256:\ # :tc=default: # # ## ## The dialer class should be used for a dialup PPP account ## Welcome messages/news suppressed ## #dialer:\ # :hushlogin:\ # :requirehome@:\ # :cputime=unlimited:\ # :filesize=2M:\ # :datasize=2M:\ # :stacksize=4M:\ # :coredumpsize=0:\ # :memoryuse=4M:\ # :memorylocked=1M:\ # :maxproc=16:\ # :openfiles=32:\ # :tc=standard: # # ## ## Site full-time 24/7 PPP connection ## - no time accounting, restricted to access via dialin lines ## #site:\ # :ignoretime:\ # :passwordtime@:\ # :refreshtime@:\ # :refreshperiod@:\ # :sessionlimit@:\ # :autodelete@:\ # :expireperiod@:\ # :graceexpire@:\ # :gracetime@:\ # :warnexpire@:\ # :warnpassword@:\ # :idletime@:\ # :sessiontime@:\ # :daytime@:\ # :weektime@:\ # :monthtime@:\ # :warntime@:\ # :accounted@:\ # :tc=dialer:\ # :tc=staff: # # ## ## Example standard accounting entries for subscriber levels ## # #subscriber|Subscribers:\ # :accounted:\ # :refreshtime=180d:\ # :refreshperiod@:\ # :sessionlimit@:\ # :autodelete=30d:\ # :expireperiod=180d:\ # :graceexpire=7d:\ # :gracetime=10m:\ # :warnexpire=7d:\ # :warnpassword=7d:\ # :idletime=30m:\ # :sessiontime=4h:\ # :daytime=6h:\ # :weektime=40h:\ # :monthtime=120h:\ # :warntime=4h:\ # :tc=standard: # # ## ## Subscriber accounts. These accounts have their login times ## accounted and have access limits applied. ## #subppp|PPP Subscriber Accounts:\ # :tc=dialer:\ # :tc=subscriber: # # #subshell|Shell Subscriber Accounts:\ # :tc=subscriber: # ## ## If you want some of the accounts to use traditional UNIX DES based ## password hashes. ## #des_users:\ # :passwd_format=des:\ # :tc=default: Index: user/ngie/more-tests/etc/rc.d/hostid_save =================================================================== --- user/ngie/more-tests/etc/rc.d/hostid_save (revision 281584) +++ user/ngie/more-tests/etc/rc.d/hostid_save (revision 281585) @@ -1,28 +1,35 @@ #!/bin/sh # # $FreeBSD$ # # PROVIDE: hostid_save # REQUIRE: root # KEYWORD: nojail . /etc/rc.subr name="hostid_save" start_cmd="hostid_save" stop_cmd=":" rcvar="hostid_enable" hostid_save() { - if [ ! -r ${hostid_file} ]; then - $SYSCTL_N kern.hostuuid > ${hostid_file} - if [ $? -ne 0 ]; then - warn "could not store hostuuid in ${hostid_file}." + current_hostid=`$SYSCTL_N kern.hostuuid` + + if [ -r ${hostid_file} ]; then + read saved_hostid < ${hostid_file} + if [ ${saved_hostid} = ${current_hostid} ]; then + exit 0 fi + fi + + echo ${current_hostid} > ${hostid_file} + if [ $? -ne 0 ]; then + warn "could not store hostuuid in ${hostid_file}." fi } load_rc_config $name run_rc_command "$1" Index: user/ngie/more-tests/etc =================================================================== --- user/ngie/more-tests/etc (revision 281584) +++ user/ngie/more-tests/etc (revision 281585) Property changes on: user/ngie/more-tests/etc ___________________________________________________________________ Modified: svn:mergeinfo ## -0,0 +0,1 ## Merged /head/etc:r281414-281584 Index: user/ngie/more-tests/include/iconv.h =================================================================== --- user/ngie/more-tests/include/iconv.h (revision 281584) +++ user/ngie/more-tests/include/iconv.h (revision 281585) @@ -1,133 +1,133 @@ /* $FreeBSD$ */ /* $NetBSD: iconv.h,v 1.6 2005/02/03 04:39:32 perry Exp $ */ /*- * Copyright (c) 2003 Citrus Project, * Copyright (c) 2009, 2010 Gabor Kovesdan * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * */ #ifndef _ICONV_H_ #define _ICONV_H_ #include #include #include #include #include #ifdef __cplusplus typedef bool __iconv_bool; #elif __STDC_VERSION__ >= 199901L typedef _Bool __iconv_bool; #else typedef int __iconv_bool; #endif struct __tag_iconv_t; typedef struct __tag_iconv_t *iconv_t; __BEGIN_DECLS iconv_t iconv_open(const char *, const char *); -size_t iconv(iconv_t, const char ** __restrict, +size_t iconv(iconv_t, char ** __restrict, size_t * __restrict, char ** __restrict, size_t * __restrict); int iconv_close(iconv_t); /* * non-portable interfaces for iconv */ int __iconv_get_list(char ***, size_t *, __iconv_bool); void __iconv_free_list(char **, size_t); -size_t __iconv(iconv_t, const char **, size_t *, char **, +size_t __iconv(iconv_t, char **, size_t *, char **, size_t *, __uint32_t, size_t *); #define __ICONV_F_HIDE_INVALID 0x0001 /* * GNU interfaces for iconv */ typedef struct { void *spaceholder[64]; } iconv_allocation_t; int iconv_open_into(const char *, const char *, iconv_allocation_t *); void iconv_set_relocation_prefix(const char *, const char *); /* * iconvctl() request macros */ #define ICONV_TRIVIALP 0 #define ICONV_GET_TRANSLITERATE 1 #define ICONV_SET_TRANSLITERATE 2 #define ICONV_GET_DISCARD_ILSEQ 3 #define ICONV_SET_DISCARD_ILSEQ 4 #define ICONV_SET_HOOKS 5 #define ICONV_SET_FALLBACKS 6 #define ICONV_GET_ILSEQ_INVALID 128 #define ICONV_SET_ILSEQ_INVALID 129 typedef void (*iconv_unicode_char_hook) (unsigned int mbr, void *data); typedef void (*iconv_wide_char_hook) (wchar_t wc, void *data); struct iconv_hooks { iconv_unicode_char_hook uc_hook; iconv_wide_char_hook wc_hook; void *data; }; /* * Fallbacks aren't supported but type definitions are provided for * source compatibility. */ typedef void (*iconv_unicode_mb_to_uc_fallback) (const char*, size_t, void (*write_replacement) (const unsigned int *, size_t, void*), void*, void*); typedef void (*iconv_unicode_uc_to_mb_fallback) (unsigned int, void (*write_replacement) (const char *, size_t, void*), void*, void*); typedef void (*iconv_wchar_mb_to_wc_fallback) (const char*, size_t, void (*write_replacement) (const wchar_t *, size_t, void*), void*, void*); typedef void (*iconv_wchar_wc_to_mb_fallback) (wchar_t, void (*write_replacement) (const char *, size_t, void*), void*, void*); struct iconv_fallbacks { iconv_unicode_mb_to_uc_fallback mb_to_uc_fallback; iconv_unicode_uc_to_mb_fallback uc_to_mb_fallback; iconv_wchar_mb_to_wc_fallback mb_to_wc_fallback; iconv_wchar_wc_to_mb_fallback wc_to_mb_fallback; void *data; }; void iconvlist(int (*do_one) (unsigned int, const char * const *, void *), void *); const char *iconv_canonicalize(const char *); int iconvctl(iconv_t, int, void *); __END_DECLS #endif /* !_ICONV_H_ */ Index: user/ngie/more-tests/include =================================================================== --- user/ngie/more-tests/include (revision 281584) +++ user/ngie/more-tests/include (revision 281585) Property changes on: user/ngie/more-tests/include ___________________________________________________________________ Modified: svn:mergeinfo ## -0,0 +0,1 ## Merged /head/include:r281414-281584 Index: user/ngie/more-tests/lib/libarchive/Makefile =================================================================== --- user/ngie/more-tests/lib/libarchive/Makefile (revision 281584) +++ user/ngie/more-tests/lib/libarchive/Makefile (revision 281585) @@ -1,414 +1,414 @@ # $FreeBSD$ .include LIBARCHIVEDIR= ${.CURDIR}/../../contrib/libarchive LIB= archive LIBADD= z bz2 lzma bsdxml CFLAGS+= -DHAVE_BZLIB_H=1 -DHAVE_LIBLZMA=1 -DHAVE_LZMA_H=1 # FreeBSD SHLIB_MAJOR value is managed as part of the FreeBSD system. # It has no real relation to the libarchive version number. SHLIB_MAJOR= 6 CFLAGS+= -DPLATFORM_CONFIG_H=\"${.CURDIR}/config_freebsd.h\" CFLAGS+= -I${.OBJDIR} .if ${MK_OPENSSL} != "no" CFLAGS+= -DWITH_OPENSSL LIBADD+= crypto .else LIBADD+= md .endif .if ${MK_ICONV} != "no" # TODO: This can be changed back to CFLAGS once iconv works correctly # with statically linked binaries. -SHARED_CFLAGS+= -DHAVE_ICONV=1 -DHAVE_ICONV_H=1 -DICONV_CONST=const +SHARED_CFLAGS+= -DHAVE_ICONV=1 -DHAVE_ICONV_H=1 -DICONV_CONST= .endif .if ${MACHINE_ARCH:Marm*} != "" || ${MACHINE_ARCH:Mmips*} != "" || \ ${MACHINE_ARCH:Msparc64*} != "" NO_WCAST_ALIGN= yes .if ${MACHINE_ARCH:M*64*} == "" CFLAGS+= -DPPMD_32BIT .endif .endif NO_WCAST_ALIGN.clang= .ifndef COMPAT_32BIT beforeinstall: ${INSTALL} -C -o ${LIBOWN} -g ${LIBGRP} -m ${LIBMODE} \ ${.CURDIR}/libarchive.pc ${DESTDIR}${LIBDATADIR}/pkgconfig .endif .PATH: ${LIBARCHIVEDIR}/libarchive # Headers to be installed in /usr/include INCS= archive.h archive_entry.h # Sources to be compiled. SRCS= archive_acl.c \ archive_check_magic.c \ archive_cmdline.c \ archive_crypto.c \ archive_entry.c \ archive_entry_copy_stat.c \ archive_entry_link_resolver.c \ archive_entry_sparse.c \ archive_entry_stat.c \ archive_entry_strmode.c \ archive_entry_xattr.c \ archive_getdate.c \ archive_match.c \ archive_options.c \ archive_pathmatch.c \ archive_ppmd7.c \ archive_rb.c \ archive_read.c \ archive_read_append_filter.c \ archive_read_data_into_fd.c \ archive_read_disk_entry_from_file.c \ archive_read_disk_posix.c \ archive_read_disk_set_standard_lookup.c \ archive_read_extract.c \ archive_read_open_fd.c \ archive_read_open_file.c \ archive_read_open_filename.c \ archive_read_open_memory.c \ archive_read_set_format.c \ archive_read_set_options.c \ archive_read_support_filter_all.c \ archive_read_support_filter_bzip2.c \ archive_read_support_filter_compress.c \ archive_read_support_filter_gzip.c \ archive_read_support_filter_grzip.c \ archive_read_support_filter_lrzip.c \ archive_read_support_filter_lzop.c \ archive_read_support_filter_none.c \ archive_read_support_filter_program.c \ archive_read_support_filter_rpm.c \ archive_read_support_filter_uu.c \ archive_read_support_filter_xz.c \ archive_read_support_format_7zip.c \ archive_read_support_format_all.c \ archive_read_support_format_ar.c \ archive_read_support_format_by_code.c \ archive_read_support_format_cab.c \ archive_read_support_format_cpio.c \ archive_read_support_format_empty.c \ archive_read_support_format_iso9660.c \ archive_read_support_format_lha.c \ archive_read_support_format_mtree.c \ archive_read_support_format_rar.c \ archive_read_support_format_raw.c \ archive_read_support_format_tar.c \ archive_read_support_format_xar.c \ archive_read_support_format_zip.c \ archive_string.c \ archive_string_sprintf.c \ archive_util.c \ archive_virtual.c \ archive_write.c \ archive_write_add_filter.c \ archive_write_disk_acl.c \ archive_write_disk_set_standard_lookup.c \ archive_write_disk_posix.c \ archive_write_open_fd.c \ archive_write_open_file.c \ archive_write_open_filename.c \ archive_write_open_memory.c \ archive_write_add_filter_b64encode.c \ archive_write_add_filter_by_name.c \ archive_write_add_filter_bzip2.c \ archive_write_add_filter_compress.c \ archive_write_add_filter_grzip.c \ archive_write_add_filter_gzip.c \ archive_write_add_filter_lrzip.c \ archive_write_add_filter_lzop.c \ archive_write_add_filter_none.c \ archive_write_add_filter_program.c \ archive_write_add_filter_uuencode.c \ archive_write_add_filter_xz.c \ archive_write_set_format.c \ archive_write_set_format_7zip.c \ archive_write_set_format_ar.c \ archive_write_set_format_by_name.c \ archive_write_set_format_cpio.c \ archive_write_set_format_cpio_newc.c \ archive_write_set_format_gnutar.c \ archive_write_set_format_iso9660.c \ archive_write_set_format_mtree.c \ archive_write_set_format_pax.c \ archive_write_set_format_shar.c \ archive_write_set_format_ustar.c \ archive_write_set_format_v7tar.c \ archive_write_set_format_xar.c \ archive_write_set_format_zip.c \ archive_write_set_options.c \ filter_fork_posix.c # Man pages to be installed. MAN= archive_entry.3 \ archive_entry_acl.3 \ archive_entry_linkify.3 \ archive_entry_paths.3 \ archive_entry_perms.3 \ archive_entry_stat.3 \ archive_entry_time.3 \ archive_read.3 \ archive_read_data.3 \ archive_read_disk.3 \ archive_read_extract.3 \ archive_read_filter.3 \ archive_read_format.3 \ archive_read_free.3 \ archive_read_header.3 \ archive_read_new.3 \ archive_read_open.3 \ archive_read_set_options.3 \ archive_util.3 \ archive_write.3 \ archive_write_blocksize.3 \ archive_write_data.3 \ archive_write_disk.3 \ archive_write_filter.3 \ archive_write_finish_entry.3 \ archive_write_format.3 \ archive_write_free.3 \ archive_write_header.3 \ archive_write_new.3 \ archive_write_open.3 \ archive_write_set_options.3 \ cpio.5 \ libarchive.3 \ libarchive_changes.3 \ libarchive_internals.3 \ libarchive-formats.5 \ tar.5 # Symlink the man pages under each function name. MLINKS+= archive_entry.3 archive_entry_clear.3 MLINKS+= archive_entry.3 archive_entry_clone.3 MLINKS+= archive_entry.3 archive_entry_free.3 MLINKS+= archive_entry.3 archive_entry_new.3 MLINKS+= archive_entry_acl.3 archive_entry_acl_add_entry.3 MLINKS+= archive_entry_acl.3 archive_entry_acl_add_entry_w.3 MLINKS+= archive_entry_acl.3 archive_entry_acl_clear.3 MLINKS+= archive_entry_acl.3 archive_entry_acl_count.3 MLINKS+= archive_entry_acl.3 archive_entry_acl_next.3 MLINKS+= archive_entry_acl.3 archive_entry_acl_next_w.3 MLINKS+= archive_entry_acl.3 archive_entry_acl_reset.3 MLINKS+= archive_entry_acl.3 archive_entry_acl_text_w.3 MLINKS+= archive_entry_linkify.3 archive_entry_linkresolver.3 MLINKS+= archive_entry_linkify.3 archive_entry_linkresolver_new.3 MLINKS+= archive_entry_linkify.3 archive_entry_linkresolver_set_strategy.3 MLINKS+= archive_entry_linkify.3 archive_entry_linkresolver_free.3 MLINKS+= archive_entry_paths.3 archive_entry_copy_hardlink.3 MLINKS+= archive_entry_paths.3 archive_entry_copy_hardlink_w.3 MLINKS+= archive_entry_paths.3 archive_entry_copy_link.3 MLINKS+= archive_entry_paths.3 archive_entry_copy_link_w.3 MLINKS+= archive_entry_paths.3 archive_entry_copy_pathname.3 MLINKS+= archive_entry_paths.3 archive_entry_copy_pathname_w.3 MLINKS+= archive_entry_paths.3 archive_entry_copy_sourcepath.3 MLINKS+= archive_entry_paths.3 archive_entry_copy_symlink.3 MLINKS+= archive_entry_paths.3 archive_entry_copy_symlink_w.3 MLINKS+= archive_entry_paths.3 archive_entry_hardlink.3 MLINKS+= archive_entry_paths.3 archive_entry_hardlink_w.3 MLINKS+= archive_entry_paths.3 archive_entry_pathname.3 MLINKS+= archive_entry_paths.3 archive_entry_pathname_w.3 MLINKS+= archive_entry_paths.3 archive_entry_set_hardlink.3 MLINKS+= archive_entry_paths.3 archive_entry_set_link.3 MLINKS+= archive_entry_paths.3 archive_entry_set_pathname.3 MLINKS+= archive_entry_paths.3 archive_entry_set_symlink.3 MLINKS+= archive_entry_paths.3 archive_entry_symlink.3 MLINKS+= archive_entry_paths.3 archive_entry_symlink_w.3 MLINKS+= archive_entry_paths.3 archive_entry_update_symlink_utf8.3 MLINKS+= archive_entry_paths.3 archive_entry_update_hardlink_utf8.3 MLINKS+= archive_entry_perms.3 archive_entry_copy_fflags_text.3 MLINKS+= archive_entry_perms.3 archive_entry_copy_fflags_text_w.3 MLINKS+= archive_entry_perms.3 archive_entry_copy_gname.3 MLINKS+= archive_entry_perms.3 archive_entry_copy_gname_w.3 MLINKS+= archive_entry_perms.3 archive_entry_copy_uname.3 MLINKS+= archive_entry_perms.3 archive_entry_copy_uname_w.3 MLINKS+= archive_entry_perms.3 archive_entry_fflags.3 MLINKS+= archive_entry_perms.3 archive_entry_fflags_text.3 MLINKS+= archive_entry_perms.3 archive_entry_gid.3 MLINKS+= archive_entry_perms.3 archive_entry_gname.3 MLINKS+= archive_entry_perms.3 archive_entry_gname_w.3 MLINKS+= archive_entry_perms.3 archive_entry_set_fflags.3 MLINKS+= archive_entry_perms.3 archive_entry_set_gid.3 MLINKS+= archive_entry_perms.3 archive_entry_set_gname.3 MLINKS+= archive_entry_perms.3 archive_entry_perm.3 MLINKS+= archive_entry_perms.3 archive_entry_set_perm.3 MLINKS+= archive_entry_perms.3 archive_entry_set_uid.3 MLINKS+= archive_entry_perms.3 archive_entry_set_uname.3 MLINKS+= archive_entry_perms.3 archive_entry_strmode.3 MLINKS+= archive_entry_perms.3 archive_entry_uid.3 MLINKS+= archive_entry_perms.3 archive_entry_uname.3 MLINKS+= archive_entry_perms.3 archive_entry_uname_w.3 MLINKS+= archive_entry_perms.3 archive_entry_update_gname_utf8.3 MLINKS+= archive_entry_perms.3 archive_entry_update_uname_utf8.3 MLINKS+= archive_entry_stat.3 archive_entry_copy_stat.3 MLINKS+= archive_entry_stat.3 archive_entry_dev.3 MLINKS+= archive_entry_stat.3 archive_entry_dev_is_set.3 MLINKS+= archive_entry_stat.3 archive_entry_devmajor.3 MLINKS+= archive_entry_stat.3 archive_entry_devminor.3 MLINKS+= archive_entry_stat.3 archive_entry_filetype.3 MLINKS+= archive_entry_stat.3 archive_entry_ino.3 MLINKS+= archive_entry_stat.3 archive_entry_ino64.3 MLINKS+= archive_entry_stat.3 archive_entry_ino_is_set.3 MLINKS+= archive_entry_stat.3 archive_entry_mode.3 MLINKS+= archive_entry_stat.3 archive_entry_nlink.3 MLINKS+= archive_entry_stat.3 archive_entry_rdev.3 MLINKS+= archive_entry_stat.3 archive_entry_rdevmajor.3 MLINKS+= archive_entry_stat.3 archive_entry_rdevminor.3 MLINKS+= archive_entry_stat.3 archive_entry_set_dev.3 MLINKS+= archive_entry_stat.3 archive_entry_set_devmajor.3 MLINKS+= archive_entry_stat.3 archive_entry_set_devminor.3 MLINKS+= archive_entry_stat.3 archive_entry_set_filetype.3 MLINKS+= archive_entry_stat.3 archive_entry_set_ino.3 MLINKS+= archive_entry_stat.3 archive_entry_set_ino64.3 MLINKS+= archive_entry_stat.3 archive_entry_set_mode.3 MLINKS+= archive_entry_stat.3 archive_entry_set_nlink.3 MLINKS+= archive_entry_stat.3 archive_entry_set_rdev.3 MLINKS+= archive_entry_stat.3 archive_entry_set_rdevmajor.3 MLINKS+= archive_entry_stat.3 archive_entry_set_rdevminor.3 MLINKS+= archive_entry_stat.3 archive_entry_set_size.3 MLINKS+= archive_entry_stat.3 archive_entry_size.3 MLINKS+= archive_entry_stat.3 archive_entry_size_is_set.3 MLINKS+= archive_entry_stat.3 archive_entry_unset_size.3 MLINKS+= archive_entry_time.3 archive_entry_atime.3 MLINKS+= archive_entry_time.3 archive_entry_atime_is_set.3 MLINKS+= archive_entry_time.3 archive_entry_atime_nsec.3 MLINKS+= archive_entry_time.3 archive_entry_birthtime.3 MLINKS+= archive_entry_time.3 archive_entry_birthtime_is_set.3 MLINKS+= archive_entry_time.3 archive_entry_birthtime_nsec.3 MLINKS+= archive_entry_time.3 archive_entry_ctime.3 MLINKS+= archive_entry_time.3 archive_entry_ctime_is_set.3 MLINKS+= archive_entry_time.3 archive_entry_ctime_nsec.3 MLINKS+= archive_entry_time.3 archive_entry_mtime.3 MLINKS+= archive_entry_time.3 archive_entry_mtime_is_set.3 MLINKS+= archive_entry_time.3 archive_entry_mtime_nsec.3 MLINKS+= archive_entry_time.3 archive_entry_set_atime.3 MLINKS+= archive_entry_time.3 archive_entry_set_birthtime.3 MLINKS+= archive_entry_time.3 archive_entry_set_ctime.3 MLINKS+= archive_entry_time.3 archive_entry_set_mtime.3 MLINKS+= archive_entry_time.3 archive_entry_unset_atime.3 MLINKS+= archive_entry_time.3 archive_entry_unset_birthtime.3 MLINKS+= archive_entry_time.3 archive_entry_unset_ctime.3 MLINKS+= archive_entry_time.3 archive_entry_unset_mtime.3 MLINKS+= archive_read_data.3 archive_read_data_block.3 MLINKS+= archive_read_data.3 archive_read_data_into_fd.3 MLINKS+= archive_read_data.3 archive_read_data_skip.3 MLINKS+= archive_read_header.3 archive_read_next_header.3 MLINKS+= archive_read_header.3 archive_read_next_header2.3 MLINKS+= archive_read_extract.3 archive_read_extract2.3 MLINKS+= archive_read_extract.3 archive_read_extract_set_progress_callback.3 MLINKS+= archive_read_extract.3 archive_read_extract_set_skip_file.3 MLINKS+= archive_read_open.3 archive_read_open2.3 MLINKS+= archive_read_open.3 archive_read_open_FILE.3 MLINKS+= archive_read_open.3 archive_read_open_fd.3 MLINKS+= archive_read_open.3 archive_read_open_file.3 MLINKS+= archive_read_open.3 archive_read_open_filename.3 MLINKS+= archive_read_open.3 archive_read_open_memory.3 MLINKS+= archive_read_free.3 archive_read_close.3 MLINKS+= archive_read_free.3 archive_read_finish.3 MLINKS+= archive_read_filter.3 archive_read_support_filter_all.3 MLINKS+= archive_read_filter.3 archive_read_support_filter_bzip2.3 MLINKS+= archive_read_filter.3 archive_read_support_filter_compress.3 MLINKS+= archive_read_filter.3 archive_read_support_filter_gzip.3 MLINKS+= archive_read_filter.3 archive_read_support_filter_lzma.3 MLINKS+= archive_read_filter.3 archive_read_support_filter_none.3 MLINKS+= archive_read_filter.3 archive_read_support_filter_xz.3 MLINKS+= archive_read_filter.3 archive_read_support_filter_program.3 MLINKS+= archive_read_filter.3 archive_read_support_filter_program_signature.3 MLINKS+= archive_read_format.3 archive_read_support_format_7zip.3 MLINKS+= archive_read_format.3 archive_read_support_format_all.3 MLINKS+= archive_read_format.3 archive_read_support_format_ar.3 MLINKS+= archive_read_format.3 archive_read_support_format_by_code.3 MLINKS+= archive_read_format.3 archive_read_support_format_cab.3 MLINKS+= archive_read_format.3 archive_read_support_format_cpio.3 MLINKS+= archive_read_format.3 archive_read_support_format_empty.3 MLINKS+= archive_read_format.3 archive_read_support_format_iso9660.3 MLINKS+= archive_read_format.3 archive_read_support_format_lha.3 MLINKS+= archive_read_format.3 archive_read_support_format_mtree.3 MLINKS+= archive_read_format.3 archive_read_support_format_rar.3 MLINKS+= archive_read_format.3 archive_read_support_format_raw.3 MLINKS+= archive_read_format.3 archive_read_support_format_tar.3 MLINKS+= archive_read_format.3 archive_read_support_format_xar.3 MLINKS+= archive_read_format.3 archive_read_support_format_zip.3 MLINKS+= archive_read_disk.3 archive_read_disk_entry_from_file.3 MLINKS+= archive_read_disk.3 archive_read_disk_gname.3 MLINKS+= archive_read_disk.3 archive_read_disk_new.3 MLINKS+= archive_read_disk.3 archive_read_disk_set_gname_lookup.3 MLINKS+= archive_read_disk.3 archive_read_disk_set_standard_lookup.3 MLINKS+= archive_read_disk.3 archive_read_disk_set_symlink_hybrid.3 MLINKS+= archive_read_disk.3 archive_read_disk_set_symlink_logical.3 MLINKS+= archive_read_disk.3 archive_read_disk_set_symlink_physical.3 MLINKS+= archive_read_disk.3 archive_read_disk_set_uname_lookup.3 MLINKS+= archive_read_disk.3 archive_read_disk_uname.3 MLINKS+= archive_read_set_options.3 archive_read_set_filter_option.3 MLINKS+= archive_read_set_options.3 archive_read_set_format_option.3 MLINKS+= archive_read_set_options.3 archive_read_set_option.3 MLINKS+= archive_util.3 archive_clear_error.3 MLINKS+= archive_util.3 archive_compression.3 MLINKS+= archive_util.3 archive_compression_name.3 MLINKS+= archive_util.3 archive_copy_error.3 MLINKS+= archive_util.3 archive_errno.3 MLINKS+= archive_util.3 archive_error_string.3 MLINKS+= archive_util.3 archive_file_count.3 MLINKS+= archive_util.3 archive_filter_code.3 MLINKS+= archive_util.3 archive_filter_count.3 MLINKS+= archive_util.3 archive_filter_name.3 MLINKS+= archive_util.3 archive_format.3 MLINKS+= archive_util.3 archive_format_name.3 MLINKS+= archive_util.3 archive_position.3 MLINKS+= archive_util.3 archive_set_error.3 MLINKS+= archive_write_blocksize.3 archive_write_get_bytes_in_last_block.3 MLINKS+= archive_write_blocksize.3 archive_write_get_bytes_per_block.3 MLINKS+= archive_write_blocksize.3 archive_write_set_bytes_in_last_block.3 MLINKS+= archive_write_blocksize.3 archive_write_set_bytes_per_block.3 MLINKS+= archive_write_disk.3 archive_write_data_block.3 MLINKS+= archive_write_disk.3 archive_write_disk_new.3 MLINKS+= archive_write_disk.3 archive_write_disk_set_group_lookup.3 MLINKS+= archive_write_disk.3 archive_write_disk_set_options.3 MLINKS+= archive_write_disk.3 archive_write_disk_set_skip_file.3 MLINKS+= archive_write_disk.3 archive_write_disk_set_standard_lookup.3 MLINKS+= archive_write_disk.3 archive_write_disk_set_user_lookup.3 MLINKS+= archive_write_filter.3 archive_write_add_filter_bzip2.3 MLINKS+= archive_write_filter.3 archive_write_add_filter_compress.3 MLINKS+= archive_write_filter.3 archive_write_add_filter_gzip.3 MLINKS+= archive_write_filter.3 archive_write_add_filter_lzip.3 MLINKS+= archive_write_filter.3 archive_write_add_filter_lzma.3 MLINKS+= archive_write_filter.3 archive_write_add_filter_none.3 MLINKS+= archive_write_filter.3 archive_write_add_filter_program.3 MLINKS+= archive_write_filter.3 archive_write_add_filter_xz.3 MLINKS+= archive_write_format.3 archive_write_set_format_cpio.3 MLINKS+= archive_write_format.3 archive_write_set_format_pax.3 MLINKS+= archive_write_format.3 archive_write_set_format_pax_restricted.3 MLINKS+= archive_write_format.3 archive_write_set_format_shar.3 MLINKS+= archive_write_format.3 archive_write_set_format_shar_dump.3 MLINKS+= archive_write_format.3 archive_write_set_format_ustar.3 MLINKS+= archive_write_free.3 archive_write_close.3 MLINKS+= archive_write_free.3 archive_write_fail.3 MLINKS+= archive_write_free.3 archive_write_finish.3 MLINKS+= archive_write_open.3 archive_write_open_FILE.3 MLINKS+= archive_write_open.3 archive_write_open_fd.3 MLINKS+= archive_write_open.3 archive_write_open_file.3 MLINKS+= archive_write_open.3 archive_write_open_filename.3 MLINKS+= archive_write_open.3 archive_write_open_memory.3 MLINKS+= archive_write_set_options.3 archive_write_set_filter_option.3 MLINKS+= archive_write_set_options.3 archive_write_set_format_option.3 MLINKS+= archive_write_set_options.3 archive_write_set_option.3 MLINKS+= libarchive.3 archive.3 .PHONY: check test clean-test check test: cd ${.CURDIR}/test && make obj && make test clean-test: cd ${.CURDIR}/test && make clean .include Index: user/ngie/more-tests/lib/libc/iconv/__iconv.c =================================================================== --- user/ngie/more-tests/lib/libc/iconv/__iconv.c (revision 281584) +++ user/ngie/more-tests/lib/libc/iconv/__iconv.c (revision 281585) @@ -1,38 +1,38 @@ /*- * Copyright (c) 2013 Peter Wemm * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * $FreeBSD$ */ #include #include #include "iconv-internal.h" size_t -__iconv(iconv_t a, const char **b, size_t *c, char **d, +__iconv(iconv_t a, char **b, size_t *c, char **d, size_t *e, __uint32_t f, size_t *g) { return __bsd___iconv(a, b, c, d, e, f, g); } Index: user/ngie/more-tests/lib/libc/iconv/bsd_iconv.c =================================================================== --- user/ngie/more-tests/lib/libc/iconv/bsd_iconv.c (revision 281584) +++ user/ngie/more-tests/lib/libc/iconv/bsd_iconv.c (revision 281585) @@ -1,319 +1,319 @@ /* $FreeBSD$ */ /* $NetBSD: iconv.c,v 1.11 2009/03/03 16:22:33 explorer Exp $ */ /*- * Copyright (c) 2003 Citrus Project, * Copyright (c) 2009, 2010 Gabor Kovesdan , * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ #include #include #include #include #include #include #include #include #include #include #include #include "citrus_types.h" #include "citrus_module.h" #include "citrus_esdb.h" #include "citrus_hash.h" #include "citrus_iconv.h" #include "iconv-internal.h" #define ISBADF(_h_) (!(_h_) || (_h_) == (iconv_t)-1) static iconv_t __bsd___iconv_open(const char *out, const char *in, struct _citrus_iconv *handle) { const char *out_slashes; char *out_noslashes; int ret; /* * Remove anything following a //, as these are options (like * //ignore, //translate, etc) and we just don't handle them. * This is for compatibility with software that uses these * blindly. */ out_slashes = strstr(out, "//"); if (out_slashes != NULL) { out_noslashes = strndup(out, out_slashes - out); if (out_noslashes == NULL) { errno = ENOMEM; return ((iconv_t)-1); } ret = _citrus_iconv_open(&handle, in, out_noslashes); free(out_noslashes); } else { ret = _citrus_iconv_open(&handle, in, out); } if (ret) { errno = ret == ENOENT ? EINVAL : ret; return ((iconv_t)-1); } handle->cv_shared->ci_discard_ilseq = strcasestr(out, "//IGNORE"); handle->cv_shared->ci_ilseq_invalid = false; handle->cv_shared->ci_hooks = NULL; return ((iconv_t)(void *)handle); } iconv_t __bsd_iconv_open(const char *out, const char *in) { return (__bsd___iconv_open(out, in, NULL)); } int __bsd_iconv_open_into(const char *out, const char *in, iconv_allocation_t *ptr) { struct _citrus_iconv *handle; handle = (struct _citrus_iconv *)ptr; return ((__bsd___iconv_open(out, in, handle) == (iconv_t)-1) ? -1 : 0); } int __bsd_iconv_close(iconv_t handle) { if (ISBADF(handle)) { errno = EBADF; return (-1); } _citrus_iconv_close((struct _citrus_iconv *)(void *)handle); return (0); } size_t -__bsd_iconv(iconv_t handle, const char **in, size_t *szin, char **out, size_t *szout) +__bsd_iconv(iconv_t handle, char **in, size_t *szin, char **out, size_t *szout) { size_t ret; int err; if (ISBADF(handle)) { errno = EBADF; return ((size_t)-1); } err = _citrus_iconv_convert((struct _citrus_iconv *)(void *)handle, in, szin, out, szout, 0, &ret); if (err) { errno = err; ret = (size_t)-1; } return (ret); } size_t -__bsd___iconv(iconv_t handle, const char **in, size_t *szin, char **out, +__bsd___iconv(iconv_t handle, char **in, size_t *szin, char **out, size_t *szout, uint32_t flags, size_t *invalids) { size_t ret; int err; if (ISBADF(handle)) { errno = EBADF; return ((size_t)-1); } err = _citrus_iconv_convert((struct _citrus_iconv *)(void *)handle, in, szin, out, szout, flags, &ret); if (invalids) *invalids = ret; if (err) { errno = err; ret = (size_t)-1; } return (ret); } int __bsd___iconv_get_list(char ***rlist, size_t *rsz, bool sorted) { int ret; ret = _citrus_esdb_get_list(rlist, rsz, sorted); if (ret) { errno = ret; return (-1); } return (0); } void __bsd___iconv_free_list(char **list, size_t sz) { _citrus_esdb_free_list(list, sz); } /* * GNU-compatibile non-standard interfaces. */ static int qsort_helper(const void *first, const void *second) { const char * const *s1; const char * const *s2; s1 = first; s2 = second; return (strcmp(*s1, *s2)); } void __bsd_iconvlist(int (*do_one) (unsigned int, const char * const *, void *), void *data) { char **list, **names; const char * const *np; char *curitem, *curkey, *slashpos; size_t sz; unsigned int i, j; i = 0; if (__bsd___iconv_get_list(&list, &sz, true)) list = NULL; qsort((void *)list, sz, sizeof(char *), qsort_helper); while (i < sz) { j = 0; slashpos = strchr(list[i], '/'); curkey = (char *)malloc(slashpos - list[i] + 2); names = (char **)malloc(sz * sizeof(char *)); if ((curkey == NULL) || (names == NULL)) { __bsd___iconv_free_list(list, sz); return; } strlcpy(curkey, list[i], slashpos - list[i] + 1); names[j++] = curkey; for (; (i < sz) && (memcmp(curkey, list[i], strlen(curkey)) == 0); i++) { slashpos = strchr(list[i], '/'); curitem = (char *)malloc(strlen(slashpos) + 1); if (curitem == NULL) { __bsd___iconv_free_list(list, sz); return; } strlcpy(curitem, &slashpos[1], strlen(slashpos) + 1); if (strcmp(curkey, curitem) == 0) { continue; } names[j++] = curitem; } np = (const char * const *)names; do_one(j, np, data); free(names); } __bsd___iconv_free_list(list, sz); } __inline const char * __bsd_iconv_canonicalize(const char *name) { return (_citrus_iconv_canonicalize(name)); } int __bsd_iconvctl(iconv_t cd, int request, void *argument) { struct _citrus_iconv *cv; struct iconv_hooks *hooks; const char *convname; char src[PATH_MAX], *dst; int *i; cv = (struct _citrus_iconv *)(void *)cd; hooks = (struct iconv_hooks *)argument; i = (int *)argument; if (ISBADF(cd)) { errno = EBADF; return (-1); } switch (request) { case ICONV_TRIVIALP: convname = cv->cv_shared->ci_convname; dst = strchr(convname, '/'); strlcpy(src, convname, dst - convname + 1); dst++; if ((convname == NULL) || (src == NULL) || (dst == NULL)) return (-1); *i = strcmp(src, dst) == 0 ? 1 : 0; return (0); case ICONV_GET_TRANSLITERATE: *i = 1; return (0); case ICONV_SET_TRANSLITERATE: return ((*i == 1) ? 0 : -1); case ICONV_GET_DISCARD_ILSEQ: *i = cv->cv_shared->ci_discard_ilseq ? 1 : 0; return (0); case ICONV_SET_DISCARD_ILSEQ: cv->cv_shared->ci_discard_ilseq = *i; return (0); case ICONV_SET_HOOKS: cv->cv_shared->ci_hooks = hooks; return (0); case ICONV_SET_FALLBACKS: errno = EOPNOTSUPP; return (-1); case ICONV_GET_ILSEQ_INVALID: *i = cv->cv_shared->ci_ilseq_invalid ? 1 : 0; return (0); case ICONV_SET_ILSEQ_INVALID: cv->cv_shared->ci_ilseq_invalid = *i; return (0); default: errno = EINVAL; return (-1); } } void __bsd_iconv_set_relocation_prefix(const char *orig_prefix __unused, const char *curr_prefix __unused) { } Index: user/ngie/more-tests/lib/libc/iconv/citrus_iconv.h =================================================================== --- user/ngie/more-tests/lib/libc/iconv/citrus_iconv.h (revision 281584) +++ user/ngie/more-tests/lib/libc/iconv/citrus_iconv.h (revision 281585) @@ -1,64 +1,64 @@ /* $FreeBSD$ */ /* $NetBSD: citrus_iconv.h,v 1.5 2008/02/09 14:56:20 junyoung Exp $ */ /*- * Copyright (c)2003 Citrus Project, * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ #ifndef _CITRUS_ICONV_H_ #define _CITRUS_ICONV_H_ struct _citrus_iconv_shared; struct _citrus_iconv_ops; struct _citrus_iconv; __BEGIN_DECLS int _citrus_iconv_open(struct _citrus_iconv * __restrict * __restrict, const char * __restrict, const char * __restrict); void _citrus_iconv_close(struct _citrus_iconv *); const char *_citrus_iconv_canonicalize(const char *); __END_DECLS #include "citrus_iconv_local.h" #define _CITRUS_ICONV_F_HIDE_INVALID 0x0001 /* * _citrus_iconv_convert: * convert a string. */ static __inline int _citrus_iconv_convert(struct _citrus_iconv * __restrict cv, - const char * __restrict * __restrict in, size_t * __restrict inbytes, + char * __restrict * __restrict in, size_t * __restrict inbytes, char * __restrict * __restrict out, size_t * __restrict outbytes, uint32_t flags, size_t * __restrict nresults) { return (*cv->cv_shared->ci_ops->io_convert)(cv, in, inbytes, out, outbytes, flags, nresults); } #endif Index: user/ngie/more-tests/lib/libc/iconv/citrus_iconv_local.h =================================================================== --- user/ngie/more-tests/lib/libc/iconv/citrus_iconv_local.h (revision 281584) +++ user/ngie/more-tests/lib/libc/iconv/citrus_iconv_local.h (revision 281585) @@ -1,110 +1,110 @@ /* $FreeBSD$ */ /* $NetBSD: citrus_iconv_local.h,v 1.3 2008/02/09 14:56:20 junyoung Exp $ */ /*- * Copyright (c)2003 Citrus Project, * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ #ifndef _CITRUS_ICONV_LOCAL_H_ #define _CITRUS_ICONV_LOCAL_H_ #include #include #define _CITRUS_ICONV_GETOPS_FUNC_BASE(_n_) \ int _n_(struct _citrus_iconv_ops *) #define _CITRUS_ICONV_GETOPS_FUNC(_n_) \ _CITRUS_ICONV_GETOPS_FUNC_BASE(_citrus_##_n_##_iconv_getops) #define _CITRUS_ICONV_DECLS(_m_) \ static int _citrus_##_m_##_iconv_init_shared \ (struct _citrus_iconv_shared * __restrict, \ const char * __restrict, const char * __restrict); \ static void _citrus_##_m_##_iconv_uninit_shared \ (struct _citrus_iconv_shared *); \ static int _citrus_##_m_##_iconv_convert \ (struct _citrus_iconv * __restrict, \ - const char * __restrict * __restrict, \ + char * __restrict * __restrict, \ size_t * __restrict, \ char * __restrict * __restrict, \ size_t * __restrict outbytes, \ uint32_t, size_t * __restrict); \ static int _citrus_##_m_##_iconv_init_context \ (struct _citrus_iconv *); \ static void _citrus_##_m_##_iconv_uninit_context \ (struct _citrus_iconv *) #define _CITRUS_ICONV_DEF_OPS(_m_) \ extern struct _citrus_iconv_ops _citrus_##_m_##_iconv_ops; \ struct _citrus_iconv_ops _citrus_##_m_##_iconv_ops = { \ /* io_init_shared */ &_citrus_##_m_##_iconv_init_shared, \ /* io_uninit_shared */ &_citrus_##_m_##_iconv_uninit_shared, \ /* io_init_context */ &_citrus_##_m_##_iconv_init_context, \ /* io_uninit_context */ &_citrus_##_m_##_iconv_uninit_context, \ /* io_convert */ &_citrus_##_m_##_iconv_convert \ } typedef _CITRUS_ICONV_GETOPS_FUNC_BASE((*_citrus_iconv_getops_t)); typedef int (*_citrus_iconv_init_shared_t) (struct _citrus_iconv_shared * __restrict, const char * __restrict, const char * __restrict); typedef void (*_citrus_iconv_uninit_shared_t) (struct _citrus_iconv_shared *); typedef int (*_citrus_iconv_convert_t) (struct _citrus_iconv * __restrict, - const char *__restrict* __restrict, size_t * __restrict, + char *__restrict* __restrict, size_t * __restrict, char * __restrict * __restrict, size_t * __restrict, uint32_t, size_t * __restrict); typedef int (*_citrus_iconv_init_context_t)(struct _citrus_iconv *); typedef void (*_citrus_iconv_uninit_context_t)(struct _citrus_iconv *); struct _citrus_iconv_ops { _citrus_iconv_init_shared_t io_init_shared; _citrus_iconv_uninit_shared_t io_uninit_shared; _citrus_iconv_init_context_t io_init_context; _citrus_iconv_uninit_context_t io_uninit_context; _citrus_iconv_convert_t io_convert; }; struct _citrus_iconv_shared { struct _citrus_iconv_ops *ci_ops; void *ci_closure; _CITRUS_HASH_ENTRY(_citrus_iconv_shared) ci_hash_entry; TAILQ_ENTRY(_citrus_iconv_shared) ci_tailq_entry; _citrus_module_t ci_module; unsigned int ci_used_count; char *ci_convname; bool ci_discard_ilseq; struct iconv_hooks *ci_hooks; bool ci_ilseq_invalid; }; struct _citrus_iconv { struct _citrus_iconv_shared *cv_shared; void *cv_closure; }; #endif Index: user/ngie/more-tests/lib/libc/iconv/citrus_none.c =================================================================== --- user/ngie/more-tests/lib/libc/iconv/citrus_none.c (revision 281584) +++ user/ngie/more-tests/lib/libc/iconv/citrus_none.c (revision 281585) @@ -1,237 +1,237 @@ /* $FreeBSD$ */ /* $NetBSD: citrus_none.c,v 1.18 2008/06/14 16:01:07 tnozaki Exp $ */ /*- * Copyright (c) 2002 Citrus Project, * Copyright (c) 2010 Gabor Kovesdan , * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ #include #include #include #include #include #include #include #include #include #include #include "citrus_namespace.h" #include "citrus_types.h" #include "citrus_module.h" #include "citrus_none.h" #include "citrus_stdenc.h" _CITRUS_STDENC_DECLS(NONE); _CITRUS_STDENC_DEF_OPS(NONE); struct _citrus_stdenc_traits _citrus_NONE_stdenc_traits = { 0, /* et_state_size */ 1, /* mb_cur_max */ }; static int _citrus_NONE_stdenc_init(struct _citrus_stdenc * __restrict ce, const void *var __unused, size_t lenvar __unused, struct _citrus_stdenc_traits * __restrict et) { et->et_state_size = 0; et->et_mb_cur_max = 1; ce->ce_closure = NULL; return (0); } static void _citrus_NONE_stdenc_uninit(struct _citrus_stdenc *ce __unused) { } static int _citrus_NONE_stdenc_init_state(struct _citrus_stdenc * __restrict ce __unused, void * __restrict ps __unused) { return (0); } static int _citrus_NONE_stdenc_mbtocs(struct _citrus_stdenc * __restrict ce __unused, - _csid_t *csid, _index_t *idx, const char **s, size_t n, + _csid_t *csid, _index_t *idx, char **s, size_t n, void *ps __unused, size_t *nresult, struct iconv_hooks *hooks) { if (n < 1) { *nresult = (size_t)-2; return (0); } *csid = 0; *idx = (_index_t)(unsigned char)*(*s)++; *nresult = *idx == 0 ? 0 : 1; if ((hooks != NULL) && (hooks->uc_hook != NULL)) hooks->uc_hook((unsigned int)*idx, hooks->data); return (0); } static int _citrus_NONE_stdenc_cstomb(struct _citrus_stdenc * __restrict ce __unused, char *s, size_t n, _csid_t csid, _index_t idx, void *ps __unused, size_t *nresult, struct iconv_hooks *hooks __unused) { if (csid == _CITRUS_CSID_INVALID) { *nresult = 0; return (0); } if (csid != 0) return (EILSEQ); if ((idx & 0x000000FF) == idx) { if (n < 1) { *nresult = (size_t)-1; return (E2BIG); } *s = (char)idx; *nresult = 1; } else if ((idx & 0x0000FFFF) == idx) { if (n < 2) { *nresult = (size_t)-1; return (E2BIG); } s[0] = (char)idx; /* XXX: might be endian dependent */ s[1] = (char)(idx >> 8); *nresult = 2; } else if ((idx & 0x00FFFFFF) == idx) { if (n < 3) { *nresult = (size_t)-1; return (E2BIG); } s[0] = (char)idx; /* XXX: might be endian dependent */ s[1] = (char)(idx >> 8); s[2] = (char)(idx >> 16); *nresult = 3; } else { if (n < 3) { *nresult = (size_t)-1; return (E2BIG); } s[0] = (char)idx; /* XXX: might be endian dependent */ s[1] = (char)(idx >> 8); s[2] = (char)(idx >> 16); s[3] = (char)(idx >> 24); *nresult = 4; } return (0); } static int _citrus_NONE_stdenc_mbtowc(struct _citrus_stdenc * __restrict ce __unused, - _wc_t * __restrict pwc, const char ** __restrict s, size_t n, + _wc_t * __restrict pwc, char ** __restrict s, size_t n, void * __restrict pspriv __unused, size_t * __restrict nresult, struct iconv_hooks *hooks) { if (s == NULL) { *nresult = 0; return (0); } if (n == 0) { *nresult = (size_t)-2; return (0); } if (pwc != NULL) *pwc = (_wc_t)(unsigned char) **s; *nresult = *s == '\0' ? 0 : 1; if ((hooks != NULL) && (hooks->wc_hook != NULL)) hooks->wc_hook(*pwc, hooks->data); return (0); } static int _citrus_NONE_stdenc_wctomb(struct _citrus_stdenc * __restrict ce __unused, char * __restrict s, size_t n, _wc_t wc, void * __restrict pspriv __unused, size_t * __restrict nresult, struct iconv_hooks *hooks __unused) { if ((wc & ~0xFFU) != 0) { *nresult = (size_t)-1; return (EILSEQ); } if (n == 0) { *nresult = (size_t)-1; return (E2BIG); } *nresult = 1; if (s != NULL && n > 0) *s = (char)wc; return (0); } static int _citrus_NONE_stdenc_put_state_reset(struct _citrus_stdenc * __restrict ce __unused, char * __restrict s __unused, size_t n __unused, void * __restrict pspriv __unused, size_t * __restrict nresult) { *nresult = 0; return (0); } static int _citrus_NONE_stdenc_get_state_desc(struct _stdenc * __restrict ce __unused, void * __restrict ps __unused, int id, struct _stdenc_state_desc * __restrict d) { int ret = 0; switch (id) { case _STDENC_SDID_GENERIC: d->u.generic.state = _STDENC_SDGEN_INITIAL; break; default: ret = EOPNOTSUPP; } return (ret); } Index: user/ngie/more-tests/lib/libc/iconv/citrus_stdenc.h =================================================================== --- user/ngie/more-tests/lib/libc/iconv/citrus_stdenc.h (revision 281584) +++ user/ngie/more-tests/lib/libc/iconv/citrus_stdenc.h (revision 281585) @@ -1,124 +1,124 @@ /* $FreeBSD$ */ /* $NetBSD: citrus_stdenc.h,v 1.4 2005/10/29 18:02:04 tshiozak Exp $ */ /*- * Copyright (c)2003 Citrus Project, * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * */ #ifndef _CITRUS_STDENC_H_ #define _CITRUS_STDENC_H_ struct _citrus_stdenc; struct _citrus_stdenc_ops; struct _citrus_stdenc_traits; #define _CITRUS_STDENC_SDID_GENERIC 0 struct _citrus_stdenc_state_desc { union { struct { int state; #define _CITRUS_STDENC_SDGEN_UNKNOWN 0 #define _CITRUS_STDENC_SDGEN_INITIAL 1 #define _CITRUS_STDENC_SDGEN_STABLE 2 #define _CITRUS_STDENC_SDGEN_INCOMPLETE_CHAR 3 #define _CITRUS_STDENC_SDGEN_INCOMPLETE_SHIFT 4 } generic; } u; }; #include "citrus_stdenc_local.h" __BEGIN_DECLS int _citrus_stdenc_open(struct _citrus_stdenc * __restrict * __restrict, char const * __restrict, const void * __restrict, size_t); void _citrus_stdenc_close(struct _citrus_stdenc *); __END_DECLS static __inline int _citrus_stdenc_init_state(struct _citrus_stdenc * __restrict ce, void * __restrict ps) { return ((*ce->ce_ops->eo_init_state)(ce, ps)); } static __inline int _citrus_stdenc_mbtocs(struct _citrus_stdenc * __restrict ce, _citrus_csid_t * __restrict csid, _citrus_index_t * __restrict idx, - const char ** __restrict s, size_t n, void * __restrict ps, + char ** __restrict s, size_t n, void * __restrict ps, size_t * __restrict nresult, struct iconv_hooks *hooks) { return ((*ce->ce_ops->eo_mbtocs)(ce, csid, idx, s, n, ps, nresult, hooks)); } static __inline int _citrus_stdenc_cstomb(struct _citrus_stdenc * __restrict ce, char * __restrict s, size_t n, _citrus_csid_t csid, _citrus_index_t idx, void * __restrict ps, size_t * __restrict nresult, struct iconv_hooks *hooks) { return ((*ce->ce_ops->eo_cstomb)(ce, s, n, csid, idx, ps, nresult, hooks)); } static __inline int _citrus_stdenc_wctomb(struct _citrus_stdenc * __restrict ce, char * __restrict s, size_t n, _citrus_wc_t wc, void * __restrict ps, size_t * __restrict nresult, struct iconv_hooks *hooks) { return ((*ce->ce_ops->eo_wctomb)(ce, s, n, wc, ps, nresult, hooks)); } static __inline int _citrus_stdenc_put_state_reset(struct _citrus_stdenc * __restrict ce, char * __restrict s, size_t n, void * __restrict ps, size_t * __restrict nresult) { return ((*ce->ce_ops->eo_put_state_reset)(ce, s, n, ps, nresult)); } static __inline size_t _citrus_stdenc_get_state_size(struct _citrus_stdenc *ce) { return (ce->ce_traits->et_state_size); } static __inline int _citrus_stdenc_get_state_desc(struct _citrus_stdenc * __restrict ce, void * __restrict ps, int id, struct _citrus_stdenc_state_desc * __restrict d) { return ((*ce->ce_ops->eo_get_state_desc)(ce, ps, id, d)); } #endif Index: user/ngie/more-tests/lib/libc/iconv/citrus_stdenc_local.h =================================================================== --- user/ngie/more-tests/lib/libc/iconv/citrus_stdenc_local.h (revision 281584) +++ user/ngie/more-tests/lib/libc/iconv/citrus_stdenc_local.h (revision 281585) @@ -1,162 +1,162 @@ /* $FreeBSD$ */ /* $NetBSD: citrus_stdenc_local.h,v 1.4 2008/02/09 14:56:20 junyoung Exp $ */ /*- * Copyright (c)2003 Citrus Project, * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * */ #ifndef _CITRUS_STDENC_LOCAL_H_ #define _CITRUS_STDENC_LOCAL_H_ #include #include "citrus_module.h" #define _CITRUS_STDENC_GETOPS_FUNC_BASE(n) \ int n(struct _citrus_stdenc_ops *, size_t) #define _CITRUS_STDENC_GETOPS_FUNC(_e_) \ _CITRUS_STDENC_GETOPS_FUNC_BASE(_citrus_##_e_##_stdenc_getops) typedef _CITRUS_STDENC_GETOPS_FUNC_BASE((*_citrus_stdenc_getops_t)); #define _CITRUS_STDENC_DECLS(_e_) \ static int _citrus_##_e_##_stdenc_init \ (struct _citrus_stdenc * __restrict, \ const void * __restrict, size_t, \ struct _citrus_stdenc_traits * __restrict); \ static void _citrus_##_e_##_stdenc_uninit(struct _citrus_stdenc *);\ static int _citrus_##_e_##_stdenc_init_state \ (struct _citrus_stdenc * __restrict, \ void * __restrict); \ static int _citrus_##_e_##_stdenc_mbtocs \ (struct _citrus_stdenc * __restrict, \ _citrus_csid_t * __restrict, \ _citrus_index_t * __restrict, \ - const char ** __restrict, size_t, \ + char ** __restrict, size_t, \ void * __restrict, size_t * __restrict, \ struct iconv_hooks *); \ static int _citrus_##_e_##_stdenc_cstomb \ (struct _citrus_stdenc * __restrict, \ char * __restrict, size_t, _citrus_csid_t, \ _citrus_index_t, void * __restrict, \ size_t * __restrict, struct iconv_hooks *); \ static int _citrus_##_e_##_stdenc_mbtowc \ (struct _citrus_stdenc * __restrict, \ _citrus_wc_t * __restrict, \ - const char ** __restrict, size_t, \ + char ** __restrict, size_t, \ void * __restrict, size_t * __restrict, \ struct iconv_hooks *); \ static int _citrus_##_e_##_stdenc_wctomb \ (struct _citrus_stdenc * __restrict, \ char * __restrict, size_t, _citrus_wc_t, \ void * __restrict, size_t * __restrict, \ struct iconv_hooks *); \ static int _citrus_##_e_##_stdenc_put_state_reset \ (struct _citrus_stdenc * __restrict, \ char * __restrict, size_t, void * __restrict, \ size_t * __restrict); \ static int _citrus_##_e_##_stdenc_get_state_desc \ (struct _citrus_stdenc * __restrict, \ void * __restrict, int, \ struct _citrus_stdenc_state_desc * __restrict) #define _CITRUS_STDENC_DEF_OPS(_e_) \ extern struct _citrus_stdenc_ops _citrus_##_e_##_stdenc_ops; \ struct _citrus_stdenc_ops _citrus_##_e_##_stdenc_ops = { \ /* eo_init */ &_citrus_##_e_##_stdenc_init, \ /* eo_uninit */ &_citrus_##_e_##_stdenc_uninit, \ /* eo_init_state */ &_citrus_##_e_##_stdenc_init_state, \ /* eo_mbtocs */ &_citrus_##_e_##_stdenc_mbtocs, \ /* eo_cstomb */ &_citrus_##_e_##_stdenc_cstomb, \ /* eo_mbtowc */ &_citrus_##_e_##_stdenc_mbtowc, \ /* eo_wctomb */ &_citrus_##_e_##_stdenc_wctomb, \ /* eo_put_state_reset */&_citrus_##_e_##_stdenc_put_state_reset,\ /* eo_get_state_desc */ &_citrus_##_e_##_stdenc_get_state_desc \ } typedef int (*_citrus_stdenc_init_t) (struct _citrus_stdenc * __reatrict, const void * __restrict , size_t, struct _citrus_stdenc_traits * __restrict); typedef void (*_citrus_stdenc_uninit_t)(struct _citrus_stdenc * __restrict); typedef int (*_citrus_stdenc_init_state_t) (struct _citrus_stdenc * __restrict, void * __restrict); typedef int (*_citrus_stdenc_mbtocs_t) (struct _citrus_stdenc * __restrict, _citrus_csid_t * __restrict, _citrus_index_t * __restrict, - const char ** __restrict, size_t, + char ** __restrict, size_t, void * __restrict, size_t * __restrict, struct iconv_hooks *); typedef int (*_citrus_stdenc_cstomb_t) (struct _citrus_stdenc *__restrict, char * __restrict, size_t, _citrus_csid_t, _citrus_index_t, void * __restrict, size_t * __restrict, struct iconv_hooks *); typedef int (*_citrus_stdenc_mbtowc_t) (struct _citrus_stdenc * __restrict, _citrus_wc_t * __restrict, - const char ** __restrict, size_t, + char ** __restrict, size_t, void * __restrict, size_t * __restrict, struct iconv_hooks *); typedef int (*_citrus_stdenc_wctomb_t) (struct _citrus_stdenc *__restrict, char * __restrict, size_t, _citrus_wc_t, void * __restrict, size_t * __restrict, struct iconv_hooks *); typedef int (*_citrus_stdenc_put_state_reset_t) (struct _citrus_stdenc *__restrict, char * __restrict, size_t, void * __restrict, size_t * __restrict); typedef int (*_citrus_stdenc_get_state_desc_t) (struct _citrus_stdenc * __restrict, void * __restrict, int, struct _citrus_stdenc_state_desc * __restrict); struct _citrus_stdenc_ops { _citrus_stdenc_init_t eo_init; _citrus_stdenc_uninit_t eo_uninit; _citrus_stdenc_init_state_t eo_init_state; _citrus_stdenc_mbtocs_t eo_mbtocs; _citrus_stdenc_cstomb_t eo_cstomb; _citrus_stdenc_mbtowc_t eo_mbtowc; _citrus_stdenc_wctomb_t eo_wctomb; _citrus_stdenc_put_state_reset_t eo_put_state_reset; /* version 0x00000002 */ _citrus_stdenc_get_state_desc_t eo_get_state_desc; }; struct _citrus_stdenc_traits { /* version 0x00000001 */ size_t et_state_size; size_t et_mb_cur_max; }; struct _citrus_stdenc { /* version 0x00000001 */ struct _citrus_stdenc_ops *ce_ops; void *ce_closure; _citrus_module_t ce_module; struct _citrus_stdenc_traits *ce_traits; }; #define _CITRUS_DEFAULT_STDENC_NAME "NONE" #endif Index: user/ngie/more-tests/lib/libc/iconv/citrus_stdenc_template.h =================================================================== --- user/ngie/more-tests/lib/libc/iconv/citrus_stdenc_template.h (revision 281584) +++ user/ngie/more-tests/lib/libc/iconv/citrus_stdenc_template.h (revision 281585) @@ -1,211 +1,211 @@ /* $FreeBSD$ */ /* $NetBSD: citrus_stdenc_template.h,v 1.4 2008/02/09 14:56:20 junyoung Exp $ */ /*- * Copyright (c)2003 Citrus Project, * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ #include /* * CAUTION: THIS IS NOT STANDALONE FILE * * function templates of iconv standard encoding handler for each encodings. * */ /* * macros */ #undef _TO_EI #undef _CE_TO_EI #undef _TO_STATE #define _TO_EI(_cl_) ((_ENCODING_INFO*)(_cl_)) #define _CE_TO_EI(_ce_) (_TO_EI((_ce_)->ce_closure)) #define _TO_STATE(_ps_) ((_ENCODING_STATE*)(_ps_)) /* ---------------------------------------------------------------------- * templates for public functions */ int _FUNCNAME(stdenc_getops)(struct _citrus_stdenc_ops *ops, size_t lenops __unused) { memcpy(ops, &_FUNCNAME(stdenc_ops), sizeof(_FUNCNAME(stdenc_ops))); return (0); } static int _FUNCNAME(stdenc_init)(struct _citrus_stdenc * __restrict ce, const void * __restrict var, size_t lenvar, struct _citrus_stdenc_traits * __restrict et) { _ENCODING_INFO *ei; int ret; ei = NULL; if (sizeof(_ENCODING_INFO) > 0) { ei = calloc(1, sizeof(_ENCODING_INFO)); if (ei == NULL) return (errno); } ret = _FUNCNAME(encoding_module_init)(ei, var, lenvar); if (ret) { free((void *)ei); return (ret); } ce->ce_closure = ei; et->et_state_size = sizeof(_ENCODING_STATE); et->et_mb_cur_max = _ENCODING_MB_CUR_MAX(_CE_TO_EI(ce)); return (0); } static void _FUNCNAME(stdenc_uninit)(struct _citrus_stdenc * __restrict ce) { if (ce) { _FUNCNAME(encoding_module_uninit)(_CE_TO_EI(ce)); free(ce->ce_closure); } } static int _FUNCNAME(stdenc_init_state)(struct _citrus_stdenc * __restrict ce, void * __restrict ps) { _FUNCNAME(init_state)(_CE_TO_EI(ce), _TO_STATE(ps)); return (0); } static int _FUNCNAME(stdenc_mbtocs)(struct _citrus_stdenc * __restrict ce, _citrus_csid_t * __restrict csid, _citrus_index_t * __restrict idx, - const char ** __restrict s, size_t n, void * __restrict ps, + char ** __restrict s, size_t n, void * __restrict ps, size_t * __restrict nresult, struct iconv_hooks *hooks) { wchar_t wc; int ret; ret = _FUNCNAME(mbrtowc_priv)(_CE_TO_EI(ce), &wc, s, n, _TO_STATE(ps), nresult); if ((ret == 0) && *nresult != (size_t)-2) ret = _FUNCNAME(stdenc_wctocs)(_CE_TO_EI(ce), csid, idx, wc); if ((ret == 0) && (hooks != NULL) && (hooks->uc_hook != NULL)) hooks->uc_hook((unsigned int)*idx, hooks->data); return (ret); } static int _FUNCNAME(stdenc_cstomb)(struct _citrus_stdenc * __restrict ce, char * __restrict s, size_t n, _citrus_csid_t csid, _citrus_index_t idx, void * __restrict ps, size_t * __restrict nresult, struct iconv_hooks *hooks __unused) { wchar_t wc; int ret; wc = ret = 0; if (csid != _CITRUS_CSID_INVALID) ret = _FUNCNAME(stdenc_cstowc)(_CE_TO_EI(ce), &wc, csid, idx); if (ret == 0) ret = _FUNCNAME(wcrtomb_priv)(_CE_TO_EI(ce), s, n, wc, _TO_STATE(ps), nresult); return (ret); } static int _FUNCNAME(stdenc_mbtowc)(struct _citrus_stdenc * __restrict ce, - _citrus_wc_t * __restrict wc, const char ** __restrict s, size_t n, + _citrus_wc_t * __restrict wc, char ** __restrict s, size_t n, void * __restrict ps, size_t * __restrict nresult, struct iconv_hooks *hooks) { int ret; ret = _FUNCNAME(mbrtowc_priv)(_CE_TO_EI(ce), wc, s, n, _TO_STATE(ps), nresult); if ((ret == 0) && (hooks != NULL) && (hooks->wc_hook != NULL)) hooks->wc_hook(*wc, hooks->data); return (ret); } static int _FUNCNAME(stdenc_wctomb)(struct _citrus_stdenc * __restrict ce, char * __restrict s, size_t n, _citrus_wc_t wc, void * __restrict ps, size_t * __restrict nresult, struct iconv_hooks *hooks __unused) { int ret; ret = _FUNCNAME(wcrtomb_priv)(_CE_TO_EI(ce), s, n, wc, _TO_STATE(ps), nresult); return (ret); } static int _FUNCNAME(stdenc_put_state_reset)(struct _citrus_stdenc * __restrict ce __unused, char * __restrict s __unused, size_t n __unused, void * __restrict ps __unused, size_t * __restrict nresult) { #if _ENCODING_IS_STATE_DEPENDENT return ((_FUNCNAME(put_state_reset)(_CE_TO_EI(ce), s, n, _TO_STATE(ps), nresult))); #else *nresult = 0; return (0); #endif } static int _FUNCNAME(stdenc_get_state_desc)(struct _citrus_stdenc * __restrict ce, void * __restrict ps, int id, struct _citrus_stdenc_state_desc * __restrict d) { int ret; switch (id) { case _STDENC_SDID_GENERIC: ret = _FUNCNAME(stdenc_get_state_desc_generic)( _CE_TO_EI(ce), _TO_STATE(ps), &d->u.generic.state); break; default: ret = EOPNOTSUPP; } return (ret); } Index: user/ngie/more-tests/lib/libc/iconv/iconv-internal.h =================================================================== --- user/ngie/more-tests/lib/libc/iconv/iconv-internal.h (revision 281584) +++ user/ngie/more-tests/lib/libc/iconv/iconv-internal.h (revision 281585) @@ -1,45 +1,45 @@ /*- * Copyright (c) 2013 Peter Wemm * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * $FreeBSD$ */ /* * Interal prototypes for our back-end functions. */ -size_t __bsd___iconv(iconv_t, const char **, size_t *, char **, +size_t __bsd___iconv(iconv_t, char **, size_t *, char **, size_t *, __uint32_t, size_t *); void __bsd___iconv_free_list(char **, size_t); int __bsd___iconv_get_list(char ***, size_t *, __iconv_bool); -size_t __bsd_iconv(iconv_t, const char ** __restrict, +size_t __bsd_iconv(iconv_t, char ** __restrict, size_t * __restrict, char ** __restrict, size_t * __restrict); const char *__bsd_iconv_canonicalize(const char *); int __bsd_iconv_close(iconv_t); iconv_t __bsd_iconv_open(const char *, const char *); int __bsd_iconv_open_into(const char *, const char *, iconv_allocation_t *); void __bsd_iconv_set_relocation_prefix(const char *, const char *); int __bsd_iconvctl(iconv_t, int, void *); void __bsd_iconvlist(int (*) (unsigned int, const char * const *, void *), void *); Index: user/ngie/more-tests/lib/libc/iconv/iconv.3 =================================================================== --- user/ngie/more-tests/lib/libc/iconv/iconv.3 (revision 281584) +++ user/ngie/more-tests/lib/libc/iconv/iconv.3 (revision 281585) @@ -1,309 +1,309 @@ .\" $FreeBSD$ .\" $NetBSD: iconv.3,v 1.12 2004/08/02 13:38:21 tshiozak Exp $ .\" .\" Copyright (c) 2003 Citrus Project, .\" Copyright (c) 2009, 2010 Gabor Kovesdan , .\" All rights reserved. .\" .\" Redistribution and use in source and binary forms, with or without .\" modification, are permitted provided that the following conditions .\" are met: .\" 1. Redistributions of source code must retain the above copyright .\" notice, this list of conditions and the following disclaimer. .\" 2. Redistributions in binary form must reproduce the above copyright .\" notice, this list of conditions and the following disclaimer in the .\" documentation and/or other materials provided with the distribution. .\" .\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND .\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE .\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE .\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE .\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL .\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS .\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) .\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF .\" SUCH DAMAGE. .\" .Dd August 4, 2014 .Dt ICONV 3 .Os .Sh NAME .Nm iconv_open , .Nm iconv_open_into , .Nm iconv_close , .Nm iconv .Nd codeset conversion functions .Sh LIBRARY .Lb libc .Sh SYNOPSIS .In iconv.h .Ft iconv_t .Fn iconv_open "const char *dstname" "const char *srcname" .Ft int .Fn iconv_open_into "const char *dstname" "const char *srcname" "iconv_allocation_t *ptr" .Ft int .Fn iconv_close "iconv_t cd" .Ft size_t .Fn iconv "iconv_t cd" "char ** restrict src" "size_t * restrict srcleft" "char ** restrict dst" "size_t * restrict dstleft" .Ft size_t -.Fn __iconv "iconv_t cd" "const char ** restrict src" "size_t * restrict srcleft" "char ** restrict dst" "size_t * restrict dstleft" "uint32_t flags" "size_t * invalids" +.Fn __iconv "iconv_t cd" "char ** restrict src" "size_t * restrict srcleft" "char ** restrict dst" "size_t * restrict dstleft" "uint32_t flags" "size_t * invalids" .Sh DESCRIPTION The .Fn iconv_open function opens a converter from the codeset .Fa srcname to the codeset .Fa dstname and returns its descriptor. The arguments .Fa srcname and .Fa dstname accept "" and "char", which refer to the current locale encoding. .Pp The .Fn iconv_open_into creates a conversion descriptor on a preallocated space. The .Ft iconv_allocation_t is used as a spaceholder type when allocating such space. The .Fa dstname and .Fa srcname arguments are the same as in the case of .Fn iconv_open . The .Fa ptr argument is a pointer of .Ft iconv_allocation_t to the preallocated space. .Pp The .Fn iconv_close function closes the specified converter .Fa cd . .Pp The .Fn iconv function converts the string in the buffer .Fa *src of length .Fa *srcleft bytes and stores the converted string in the buffer .Fa *dst of size .Fa *dstleft bytes. After calling .Fn iconv , the values pointed to by .Fa src , .Fa srcleft , .Fa dst , and .Fa dstleft are updated as follows: .Bl -tag -width 01234567 .It *src Pointer to the byte just after the last character fetched. .It *srcleft Number of remaining bytes in the source buffer. .It *dst Pointer to the byte just after the last character stored. .It *dstleft Number of remainder bytes in the destination buffer. .El .Pp If the string pointed to by .Fa *src contains a byte sequence which is not a valid character in the source codeset, the conversion stops just after the last successful conversion. If the output buffer is too small to store the converted character, the conversion also stops in the same way. In these cases, the values pointed to by .Fa src , .Fa srcleft , .Fa dst , and .Fa dstleft are updated to the state just after the last successful conversion. .Pp If the string pointed to by .Fa *src contains a character which is valid under the source codeset but can not be converted to the destination codeset, the character is replaced by an .Dq invalid character which depends on the destination codeset, e.g., .Sq \&? , and the conversion is continued. .Fn iconv returns the number of such .Dq invalid conversions . .Pp There are two special cases of .Fn iconv : .Bl -tag -width 0123 .It "src == NULL || *src == NULL" If the source and/or destination codesets are stateful, .Fn iconv places these into their initial state. .Pp If both .Fa dst and .Fa *dst are .No non- Ns Dv NULL , .Fn iconv stores the shift sequence for the destination switching to the initial state in the buffer pointed to by .Fa *dst . The buffer size is specified by the value pointed to by .Fa dstleft as above. .Fn iconv will fail if the buffer is too small to store the shift sequence. .Pp On the other hand, .Fa dst or .Fa *dst may be .Dv NULL . In this case, the shift sequence for the destination switching to the initial state is discarded. .El .Pp The .Fn __iconv function works just like .Fn iconv but if .Fn iconv fails, the invalid character count is lost there. This is a not bug rather a limitation of .St -p1003.1-2008 , so .Fn __iconv is provided as an alternative but non-standard interface. It also has a flags argument, where currently the following flags can be passed: .Bl -tag -width 0123 .It __ICONV_F_HIDE_INVALID Skip invalid characters, instead of returning with an error. .El .Sh RETURN VALUES Upon successful completion of .Fn iconv_open , it returns a conversion descriptor. Otherwise, .Fn iconv_open returns (iconv_t)\-1 and sets errno to indicate the error. .Pp Upon successful completion of .Fn iconv_open_into , it returns 0. Otherwise, .Fn iconv_open_into returns \-1, and sets errno to indicate the error. .Pp Upon successful completion of .Fn iconv_close , it returns 0. Otherwise, .Fn iconv_close returns \-1 and sets errno to indicate the error. .Pp Upon successful completion of .Fn iconv , it returns the number of .Dq invalid conversions. Otherwise, .Fn iconv returns (size_t)\-1 and sets errno to indicate the error. .Sh ERRORS The .Fn iconv_open function may cause an error in the following cases: .Bl -tag -width Er .It Bq Er ENOMEM Memory is exhausted. .It Bq Er EINVAL There is no converter specified by .Fa srcname and .Fa dstname . .El The .Fn iconv_open_into function may cause an error in the following cases: .Bl -tag -width Er .It Bq Er EINVAL There is no converter specified by .Fa srcname and .Fa dstname . .El .Pp The .Fn iconv_close function may cause an error in the following case: .Bl -tag -width Er .It Bq Er EBADF The conversion descriptor specified by .Fa cd is invalid. .El .Pp The .Fn iconv function may cause an error in the following cases: .Bl -tag -width Er .It Bq Er EBADF The conversion descriptor specified by .Fa cd is invalid. .It Bq Er EILSEQ The string pointed to by .Fa *src contains a byte sequence which does not describe a valid character of the source codeset. .It Bq Er E2BIG The output buffer pointed to by .Fa *dst is too small to store the result string. .It Bq Er EINVAL The string pointed to by .Fa *src terminates with an incomplete character or shift sequence. .El .Sh SEE ALSO .Xr iconv 1 , .Xr mkcsmapper 1 , .Xr mkesdb 1 , .Xr __iconv_get_list 3 , .Xr iconv_canonicalize 3 , .Xr iconvctl 3 , .Xr iconvlist 3 .Sh STANDARDS The .Fn iconv_open , .Fn iconv_close , and .Fn iconv functions conform to .St -p1003.1-2008 . .Pp The .Fn iconv_open_into function is a GNU-specific extension and it is not part of any standard, thus its use may break portability. The .Fn __iconv function is an own extension and it is not part of any standard, thus its use may break portability. Index: user/ngie/more-tests/lib/libc/iconv/iconv.c =================================================================== --- user/ngie/more-tests/lib/libc/iconv/iconv.c (revision 281584) +++ user/ngie/more-tests/lib/libc/iconv/iconv.c (revision 281585) @@ -1,39 +1,39 @@ /*- * Copyright (c) 2013 Peter Wemm * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * $FreeBSD$ */ #include #include #include "iconv-internal.h" size_t -iconv(iconv_t a, const char ** __restrict b, +iconv(iconv_t a, char ** __restrict b, size_t * __restrict c, char ** __restrict d, size_t * __restrict e) { return __bsd_iconv(a, b, c, d, e); } Index: user/ngie/more-tests/lib/libc/iconv/iconv_compat.c =================================================================== --- user/ngie/more-tests/lib/libc/iconv/iconv_compat.c (revision 281584) +++ user/ngie/more-tests/lib/libc/iconv/iconv_compat.c (revision 281585) @@ -1,121 +1,121 @@ /*- * Copyright (c) 2013 Peter Wemm * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * $FreeBSD$ */ /* * These are ABI implementations for when the raw iconv_* symbol * space was exposed via libc.so.7 in its early life. This is * a transition aide, these wrappers will not normally ever be * executed except via __sym_compat() references. */ #include #include #include "iconv-internal.h" size_t -__iconv_compat(iconv_t a, const char ** b, size_t * c, char ** d, +__iconv_compat(iconv_t a, char ** b, size_t * c, char ** d, size_t * e, __uint32_t f, size_t *g) { return __bsd___iconv(a, b, c, d, e, f, g); } void __iconv_free_list_compat(char ** a, size_t b) { __bsd___iconv_free_list(a, b); } int __iconv_get_list_compat(char ***a, size_t *b, __iconv_bool c) { return __bsd___iconv_get_list(a, b, c); } size_t -iconv_compat(iconv_t a, const char ** __restrict b, +iconv_compat(iconv_t a, char ** __restrict b, size_t * __restrict c, char ** __restrict d, size_t * __restrict e) { return __bsd_iconv(a, b, c, d, e); } const char * iconv_canonicalize_compat(const char *a) { return __bsd_iconv_canonicalize(a); } int iconv_close_compat(iconv_t a) { return __bsd_iconv_close(a); } iconv_t iconv_open_compat(const char *a, const char *b) { return __bsd_iconv_open(a, b); } int iconv_open_into_compat(const char *a, const char *b, iconv_allocation_t *c) { return __bsd_iconv_open_into(a, b, c); } void iconv_set_relocation_prefix_compat(const char *a, const char *b) { return __bsd_iconv_set_relocation_prefix(a, b); } int iconvctl_compat(iconv_t a, int b, void *c) { return __bsd_iconvctl(a, b, c); } void iconvlist_compat(int (*a) (unsigned int, const char * const *, void *), void *b) { return __bsd_iconvlist(a, b); } int _iconv_version_compat = 0x0108; /* Magic - not used */ __sym_compat(__iconv, __iconv_compat, FBSD_1.2); __sym_compat(__iconv_free_list, __iconv_free_list_compat, FBSD_1.2); __sym_compat(__iconv_get_list, __iconv_get_list_compat, FBSD_1.2); __sym_compat(_iconv_version, _iconv_version_compat, FBSD_1.3); __sym_compat(iconv, iconv_compat, FBSD_1.3); __sym_compat(iconv_canonicalize, iconv_canonicalize_compat, FBSD_1.2); __sym_compat(iconv_close, iconv_close_compat, FBSD_1.3); __sym_compat(iconv_open, iconv_open_compat, FBSD_1.3); __sym_compat(iconv_open_into, iconv_open_into_compat, FBSD_1.3); __sym_compat(iconv_set_relocation_prefix, iconv_set_relocation_prefix_compat, FBSD_1.3); __sym_compat(iconvctl, iconvctl_compat, FBSD_1.3); __sym_compat(iconvlist, iconvlist_compat, FBSD_1.3); Index: user/ngie/more-tests/lib/libc/locale/cXXrtomb_iconv.h =================================================================== --- user/ngie/more-tests/lib/libc/locale/cXXrtomb_iconv.h (revision 281584) +++ user/ngie/more-tests/lib/libc/locale/cXXrtomb_iconv.h (revision 281585) @@ -1,116 +1,115 @@ /*- * Copyright (c) 2013 Ed Schouten * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ #include __FBSDID("$FreeBSD$"); #include #include #include #include #include #include "../iconv/citrus_hash.h" #include "../iconv/citrus_module.h" #include "../iconv/citrus_iconv.h" #include "xlocale_private.h" typedef struct { bool initialized; struct _citrus_iconv iconv; union { charXX_t widechar[SRCBUF_LEN]; char bytes[sizeof(charXX_t) * SRCBUF_LEN]; } srcbuf; size_t srcbuf_len; } _ConversionState; _Static_assert(sizeof(_ConversionState) <= sizeof(mbstate_t), "Size of _ConversionState must not exceed mbstate_t's size."); size_t cXXrtomb_l(char * __restrict s, charXX_t c, mbstate_t * __restrict ps, locale_t locale) { _ConversionState *cs; struct _citrus_iconv *handle; - const char *src; - char *dst; + char *src, *dst; size_t srcleft, dstleft, invlen; int err; FIX_LOCALE(locale); if (ps == NULL) ps = &locale->cXXrtomb; cs = (_ConversionState *)ps; handle = &cs->iconv; /* Reinitialize mbstate_t. */ if (s == NULL || !cs->initialized) { if (_citrus_iconv_open(&handle, UTF_XX_INTERNAL, nl_langinfo_l(CODESET, locale)) != 0) { cs->initialized = false; errno = EINVAL; return (-1); } handle->cv_shared->ci_discard_ilseq = true; handle->cv_shared->ci_hooks = NULL; cs->srcbuf_len = 0; cs->initialized = true; if (s == NULL) return (1); } assert(cs->srcbuf_len < sizeof(cs->srcbuf.widechar) / sizeof(charXX_t)); cs->srcbuf.widechar[cs->srcbuf_len++] = c; /* Perform conversion. */ src = cs->srcbuf.bytes; srcleft = cs->srcbuf_len * sizeof(charXX_t); dst = s; dstleft = MB_CUR_MAX_L(locale); err = _citrus_iconv_convert(handle, &src, &srcleft, &dst, &dstleft, 0, &invlen); /* Character is part of a surrogate pair. We need more input. */ if (err == EINVAL) return (0); cs->srcbuf_len = 0; /* Illegal sequence. */ if (dst == s) { errno = EILSEQ; return ((size_t)-1); } return (dst - s); } size_t cXXrtomb(char * __restrict s, charXX_t c, mbstate_t * __restrict ps) { return (cXXrtomb_l(s, c, ps, __get_locale())); } Index: user/ngie/more-tests/lib/libc/locale/mbrtocXX_iconv.h =================================================================== --- user/ngie/more-tests/lib/libc/locale/mbrtocXX_iconv.h (revision 281584) +++ user/ngie/more-tests/lib/libc/locale/mbrtocXX_iconv.h (revision 281585) @@ -1,159 +1,158 @@ /*- * Copyright (c) 2013 Ed Schouten * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ #include __FBSDID("$FreeBSD$"); #include #include #include #include #include #include #include #include "../iconv/citrus_hash.h" #include "../iconv/citrus_module.h" #include "../iconv/citrus_iconv.h" #include "xlocale_private.h" typedef struct { bool initialized; struct _citrus_iconv iconv; char srcbuf[MB_LEN_MAX]; size_t srcbuf_len; union { charXX_t widechar[DSTBUF_LEN]; char bytes[sizeof(charXX_t) * DSTBUF_LEN]; } dstbuf; size_t dstbuf_len; } _ConversionState; _Static_assert(sizeof(_ConversionState) <= sizeof(mbstate_t), "Size of _ConversionState must not exceed mbstate_t's size."); size_t mbrtocXX_l(charXX_t * __restrict pc, const char * __restrict s, size_t n, mbstate_t * __restrict ps, locale_t locale) { _ConversionState *cs; struct _citrus_iconv *handle; size_t i, retval; charXX_t retchar; FIX_LOCALE(locale); if (ps == NULL) ps = &locale->mbrtocXX; cs = (_ConversionState *)ps; handle = &cs->iconv; /* Reinitialize mbstate_t. */ if (s == NULL || !cs->initialized) { if (_citrus_iconv_open(&handle, nl_langinfo_l(CODESET, locale), UTF_XX_INTERNAL) != 0) { cs->initialized = false; errno = EINVAL; return (-1); } handle->cv_shared->ci_discard_ilseq = true; handle->cv_shared->ci_hooks = NULL; cs->srcbuf_len = cs->dstbuf_len = 0; cs->initialized = true; if (s == NULL) return (0); } /* See if we still have characters left from the previous invocation. */ if (cs->dstbuf_len > 0) { retval = (size_t)-3; goto return_char; } /* Fill up the read buffer as far as possible. */ if (n > sizeof(cs->srcbuf) - cs->srcbuf_len) n = sizeof(cs->srcbuf) - cs->srcbuf_len; memcpy(cs->srcbuf + cs->srcbuf_len, s, n); /* Convert as few characters to the dst buffer as possible. */ for (i = 0; ; i++) { - const char *src; - char *dst; + char *src, *dst; size_t srcleft, dstleft, invlen; int err; src = cs->srcbuf; srcleft = cs->srcbuf_len + n; dst = cs->dstbuf.bytes; dstleft = i * sizeof(charXX_t); assert(srcleft <= sizeof(cs->srcbuf) && dstleft <= sizeof(cs->dstbuf.bytes)); err = _citrus_iconv_convert(handle, &src, &srcleft, &dst, &dstleft, 0, &invlen); cs->dstbuf_len = (dst - cs->dstbuf.bytes) / sizeof(charXX_t); /* Got new character(s). Return the first. */ if (cs->dstbuf_len > 0) { assert(src - cs->srcbuf > cs->srcbuf_len); retval = src - cs->srcbuf - cs->srcbuf_len; cs->srcbuf_len = 0; goto return_char; } /* Increase dst buffer size, to obtain the surrogate pair. */ if (err == E2BIG) continue; /* Illegal sequence. */ if (invlen > 0) { cs->srcbuf_len = 0; errno = EILSEQ; return ((size_t)-1); } /* Save unprocessed remainder for the next invocation. */ memmove(cs->srcbuf, src, srcleft); cs->srcbuf_len = srcleft; return ((size_t)-2); } return_char: retchar = cs->dstbuf.widechar[0]; memmove(&cs->dstbuf.widechar[0], &cs->dstbuf.widechar[1], --cs->dstbuf_len * sizeof(charXX_t)); if (pc != NULL) *pc = retchar; if (retchar == 0) return (0); return (retval); } size_t mbrtocXX(charXX_t * __restrict pc, const char * __restrict s, size_t n, mbstate_t * __restrict ps) { return (mbrtocXX_l(pc, s, n, ps, __get_locale())); } Index: user/ngie/more-tests/lib/libc =================================================================== --- user/ngie/more-tests/lib/libc (revision 281584) +++ user/ngie/more-tests/lib/libc (revision 281585) Property changes on: user/ngie/more-tests/lib/libc ___________________________________________________________________ Modified: svn:mergeinfo ## -0,0 +0,1 ## Merged /head/lib/libc:r281477-281584 Index: user/ngie/more-tests/lib/libiconv_modules/BIG5/citrus_big5.c =================================================================== --- user/ngie/more-tests/lib/libiconv_modules/BIG5/citrus_big5.c (revision 281584) +++ user/ngie/more-tests/lib/libiconv_modules/BIG5/citrus_big5.c (revision 281585) @@ -1,458 +1,458 @@ /* $FreeBSD$ */ /* $NetBSD: citrus_big5.c,v 1.13 2011/05/23 14:53:46 joerg Exp $ */ /*- * Copyright (c)2002, 2006 Citrus Project, * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ /*- * Copyright (c) 1993 * The Regents of the University of California. All rights reserved. * * This code is derived from software contributed to Berkeley by * Paul Borman at Krystal Technologies. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ #include #include #include #include #include #include #include #include #include #include #include #include #include "citrus_namespace.h" #include "citrus_prop.h" #include "citrus_types.h" #include "citrus_bcs.h" #include "citrus_module.h" #include "citrus_stdenc.h" #include "citrus_big5.h" /* ---------------------------------------------------------------------- * private stuffs used by templates */ typedef struct { int chlen; char ch[2]; } _BIG5State; typedef struct _BIG5Exclude { TAILQ_ENTRY(_BIG5Exclude) entry; wint_t start; wint_t end; } _BIG5Exclude; typedef TAILQ_HEAD(_BIG5ExcludeList, _BIG5Exclude) _BIG5ExcludeList; typedef struct { _BIG5ExcludeList excludes; int cell[0x100]; } _BIG5EncodingInfo; #define _CEI_TO_EI(_cei_) (&(_cei_)->ei) #define _CEI_TO_STATE(_cei_, _func_) (_cei_)->states.s_##_func_ #define _FUNCNAME(m) _citrus_BIG5_##m #define _ENCODING_INFO _BIG5EncodingInfo #define _ENCODING_STATE _BIG5State #define _ENCODING_MB_CUR_MAX(_ei_) 2 #define _ENCODING_IS_STATE_DEPENDENT 0 #define _STATE_NEEDS_EXPLICIT_INIT(_ps_) 0 static __inline void /*ARGSUSED*/ _citrus_BIG5_init_state(_BIG5EncodingInfo * __restrict ei __unused, _BIG5State * __restrict s) { memset(s, 0, sizeof(*s)); } #if 0 static __inline void /*ARGSUSED*/ _citrus_BIG5_pack_state(_BIG5EncodingInfo * __restrict ei __unused, void * __restrict pspriv, const _BIG5State * __restrict s) { memcpy(pspriv, (const void *)s, sizeof(*s)); } static __inline void /*ARGSUSED*/ _citrus_BIG5_unpack_state(_BIG5EncodingInfo * __restrict ei __unused, _BIG5State * __restrict s, const void * __restrict pspriv) { memcpy((void *)s, pspriv, sizeof(*s)); } #endif static __inline int _citrus_BIG5_check(_BIG5EncodingInfo *ei, unsigned int c) { return ((ei->cell[c & 0xFF] & 0x1) ? 2 : 1); } static __inline int _citrus_BIG5_check2(_BIG5EncodingInfo *ei, unsigned int c) { return ((ei->cell[c & 0xFF] & 0x2) ? 1 : 0); } static __inline int _citrus_BIG5_check_excludes(_BIG5EncodingInfo *ei, wint_t c) { _BIG5Exclude *exclude; TAILQ_FOREACH(exclude, &ei->excludes, entry) { if (c >= exclude->start && c <= exclude->end) return (EILSEQ); } return (0); } static int _citrus_BIG5_fill_rowcol(void * __restrict ctx, const char * __restrict s, uint64_t start, uint64_t end) { _BIG5EncodingInfo *ei; uint64_t n; int i; if (start > 0xFF || end > 0xFF) return (EINVAL); ei = (_BIG5EncodingInfo *)ctx; i = strcmp("row", s) ? 1 : 0; i = 1 << i; for (n = start; n <= end; ++n) ei->cell[n & 0xFF] |= i; return (0); } static int /*ARGSUSED*/ _citrus_BIG5_fill_excludes(void * __restrict ctx, const char * __restrict s __unused, uint64_t start, uint64_t end) { _BIG5EncodingInfo *ei; _BIG5Exclude *exclude; if (start > 0xFFFF || end > 0xFFFF) return (EINVAL); ei = (_BIG5EncodingInfo *)ctx; exclude = TAILQ_LAST(&ei->excludes, _BIG5ExcludeList); if (exclude != NULL && (wint_t)start <= exclude->end) return (EINVAL); exclude = (void *)malloc(sizeof(*exclude)); if (exclude == NULL) return (ENOMEM); exclude->start = (wint_t)start; exclude->end = (wint_t)end; TAILQ_INSERT_TAIL(&ei->excludes, exclude, entry); return (0); } static const _citrus_prop_hint_t root_hints[] = { _CITRUS_PROP_HINT_NUM("row", &_citrus_BIG5_fill_rowcol), _CITRUS_PROP_HINT_NUM("col", &_citrus_BIG5_fill_rowcol), _CITRUS_PROP_HINT_NUM("excludes", &_citrus_BIG5_fill_excludes), _CITRUS_PROP_HINT_END }; static void /*ARGSUSED*/ _citrus_BIG5_encoding_module_uninit(_BIG5EncodingInfo *ei) { _BIG5Exclude *exclude; while ((exclude = TAILQ_FIRST(&ei->excludes)) != NULL) { TAILQ_REMOVE(&ei->excludes, exclude, entry); free(exclude); } } static int /*ARGSUSED*/ _citrus_BIG5_encoding_module_init(_BIG5EncodingInfo * __restrict ei, const void * __restrict var, size_t lenvar) { const char *s; int err; memset((void *)ei, 0, sizeof(*ei)); TAILQ_INIT(&ei->excludes); if (lenvar > 0 && var != NULL) { s = _bcs_skip_ws_len((const char *)var, &lenvar); if (lenvar > 0 && *s != '\0') { err = _citrus_prop_parse_variable( root_hints, (void *)ei, s, lenvar); if (err == 0) return (0); _citrus_BIG5_encoding_module_uninit(ei); memset((void *)ei, 0, sizeof(*ei)); TAILQ_INIT(&ei->excludes); } } /* fallback Big5-1984, for backward compatibility. */ _citrus_BIG5_fill_rowcol(ei, "row", 0xA1, 0xFE); _citrus_BIG5_fill_rowcol(ei, "col", 0x40, 0x7E); _citrus_BIG5_fill_rowcol(ei, "col", 0xA1, 0xFE); return (0); } static int /*ARGSUSED*/ _citrus_BIG5_mbrtowc_priv(_BIG5EncodingInfo * __restrict ei, wchar_t * __restrict pwc, - const char ** __restrict s, size_t n, + char ** __restrict s, size_t n, _BIG5State * __restrict psenc, size_t * __restrict nresult) { wchar_t wchar; - const char *s0; + char *s0; int c, chlenbak; s0 = *s; if (s0 == NULL) { _citrus_BIG5_init_state(ei, psenc); *nresult = 0; return (0); } chlenbak = psenc->chlen; /* make sure we have the first byte in the buffer */ switch (psenc->chlen) { case 0: if (n < 1) goto restart; psenc->ch[0] = *s0++; psenc->chlen = 1; n--; break; case 1: break; default: /* illegal state */ goto ilseq; } c = _citrus_BIG5_check(ei, psenc->ch[0] & 0xff); if (c == 0) goto ilseq; while (psenc->chlen < c) { if (n < 1) { goto restart; } psenc->ch[psenc->chlen] = *s0++; psenc->chlen++; n--; } switch (c) { case 1: wchar = psenc->ch[0] & 0xff; break; case 2: if (!_citrus_BIG5_check2(ei, psenc->ch[1] & 0xff)) goto ilseq; wchar = ((psenc->ch[0] & 0xff) << 8) | (psenc->ch[1] & 0xff); break; default: /* illegal state */ goto ilseq; } if (_citrus_BIG5_check_excludes(ei, (wint_t)wchar) != 0) goto ilseq; *s = s0; psenc->chlen = 0; if (pwc) *pwc = wchar; *nresult = wchar ? c - chlenbak : 0; return (0); ilseq: psenc->chlen = 0; *nresult = (size_t)-1; return (EILSEQ); restart: *s = s0; *nresult = (size_t)-2; return (0); } static int /*ARGSUSED*/ _citrus_BIG5_wcrtomb_priv(_BIG5EncodingInfo * __restrict ei, char * __restrict s, size_t n, wchar_t wc, _BIG5State * __restrict psenc __unused, size_t * __restrict nresult) { size_t l; int ret; /* check invalid sequence */ if (wc & ~0xffff || _citrus_BIG5_check_excludes(ei, (wint_t)wc) != 0) { ret = EILSEQ; goto err; } if (wc & 0x8000) { if (_citrus_BIG5_check(ei, (wc >> 8) & 0xff) != 2 || !_citrus_BIG5_check2(ei, wc & 0xff)) { ret = EILSEQ; goto err; } l = 2; } else { if (wc & ~0xff || !_citrus_BIG5_check(ei, wc & 0xff)) { ret = EILSEQ; goto err; } l = 1; } if (n < l) { /* bound check failure */ ret = E2BIG; goto err; } if (l == 2) { s[0] = (wc >> 8) & 0xff; s[1] = wc & 0xff; } else s[0] = wc & 0xff; *nresult = l; return (0); err: *nresult = (size_t)-1; return (ret); } static __inline int /*ARGSUSED*/ _citrus_BIG5_stdenc_wctocs(_BIG5EncodingInfo * __restrict ei __unused, _csid_t * __restrict csid, _index_t * __restrict idx, wchar_t wc) { *csid = (wc < 0x100) ? 0 : 1; *idx = (_index_t)wc; return (0); } static __inline int /*ARGSUSED*/ _citrus_BIG5_stdenc_cstowc(_BIG5EncodingInfo * __restrict ei __unused, wchar_t * __restrict wc, _csid_t csid, _index_t idx) { switch (csid) { case 0: case 1: *wc = (wchar_t)idx; break; default: return (EILSEQ); } return (0); } static __inline int /*ARGSUSED*/ _citrus_BIG5_stdenc_get_state_desc_generic(_BIG5EncodingInfo * __restrict ei __unused, _BIG5State * __restrict psenc, int * __restrict rstate) { *rstate = (psenc->chlen == 0) ? _STDENC_SDGEN_INITIAL : _STDENC_SDGEN_INCOMPLETE_CHAR; return (0); } /* ---------------------------------------------------------------------- * public interface for stdenc */ _CITRUS_STDENC_DECLS(BIG5); _CITRUS_STDENC_DEF_OPS(BIG5); #include "citrus_stdenc_template.h" Index: user/ngie/more-tests/lib/libiconv_modules/DECHanyu/citrus_dechanyu.c =================================================================== --- user/ngie/more-tests/lib/libiconv_modules/DECHanyu/citrus_dechanyu.c (revision 281584) +++ user/ngie/more-tests/lib/libiconv_modules/DECHanyu/citrus_dechanyu.c (revision 281585) @@ -1,394 +1,394 @@ /* $FreeBSD$ */ /* $NetBSD: citrus_dechanyu.c,v 1.4 2011/11/19 18:20:13 tnozaki Exp $ */ /*- * Copyright (c)2007 Citrus Project, * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ #include #include #include #include #include #include #include #include #include #include #include #include "citrus_namespace.h" #include "citrus_types.h" #include "citrus_bcs.h" #include "citrus_module.h" #include "citrus_stdenc.h" #include "citrus_dechanyu.h" /* ---------------------------------------------------------------------- * private stuffs used by templates */ typedef struct { size_t chlen; char ch[4]; } _DECHanyuState; typedef struct { int dummy; } _DECHanyuEncodingInfo; #define _CEI_TO_EI(_cei_) (&(_cei_)->ei) #define _CEI_TO_STATE(_cei_, _func_) (_cei_)->states.__CONCAT(s_,_func_) #define _FUNCNAME(m) __CONCAT(_citrus_DECHanyu_,m) #define _ENCODING_INFO _DECHanyuEncodingInfo #define _ENCODING_STATE _DECHanyuState #define _ENCODING_MB_CUR_MAX(_ei_) 4 #define _ENCODING_IS_STATE_DEPENDENT 0 #define _STATE_NEEDS_EXPLICIT_INIT(_ps_) 0 static __inline void /*ARGSUSED*/ _citrus_DECHanyu_init_state(_DECHanyuEncodingInfo * __restrict ei __unused, _DECHanyuState * __restrict psenc) { psenc->chlen = 0; } #if 0 static __inline void /*ARGSUSED*/ _citrus_DECHanyu_pack_state(_DECHanyuEncodingInfo * __restrict ei __unused, void * __restrict pspriv, const _DECHanyuState * __restrict psenc) { memcpy(pspriv, (const void *)psenc, sizeof(*psenc)); } static __inline void /*ARGSUSED*/ _citrus_DECHanyu_unpack_state(_DECHanyuEncodingInfo * __restrict ei __unused, _DECHanyuState * __restrict psenc, const void * __restrict pspriv) { memcpy((void *)psenc, pspriv, sizeof(*psenc)); } #endif static void /*ARGSUSED*/ _citrus_DECHanyu_encoding_module_uninit(_DECHanyuEncodingInfo *ei __unused) { /* ei may be null */ } static int /*ARGSUSED*/ _citrus_DECHanyu_encoding_module_init(_DECHanyuEncodingInfo * __restrict ei __unused, const void * __restrict var __unused, size_t lenvar __unused) { /* ei may be null */ return (0); } static __inline bool is_singlebyte(int c) { return (c <= 0x7F); } static __inline bool is_leadbyte(int c) { return (c >= 0xA1 && c <= 0xFE); } static __inline bool is_trailbyte(int c) { c &= ~0x80; return (c >= 0x21 && c <= 0x7E); } static __inline bool is_hanyu1(int c) { return (c == 0xC2); } static __inline bool is_hanyu2(int c) { return (c == 0xCB); } #define HANYUBIT 0xC2CB0000 static __inline bool is_94charset(int c) { return (c >= 0x21 && c <= 0x7E); } static int /*ARGSUSED*/ _citrus_DECHanyu_mbrtowc_priv(_DECHanyuEncodingInfo * __restrict ei, - wchar_t * __restrict pwc, const char ** __restrict s, size_t n, + wchar_t * __restrict pwc, char ** __restrict s, size_t n, _DECHanyuState * __restrict psenc, size_t * __restrict nresult) { - const char *s0; + char *s0; wchar_t wc; int ch; if (*s == NULL) { _citrus_DECHanyu_init_state(ei, psenc); *nresult = _ENCODING_IS_STATE_DEPENDENT; return (0); } s0 = *s; wc = (wchar_t)0; switch (psenc->chlen) { case 0: if (n-- < 1) goto restart; ch = *s0++ & 0xFF; if (is_singlebyte(ch)) { if (pwc != NULL) *pwc = (wchar_t)ch; *nresult = (size_t)((ch == 0) ? 0 : 1); *s = s0; return (0); } if (!is_leadbyte(ch)) goto ilseq; psenc->ch[psenc->chlen++] = ch; break; case 1: ch = psenc->ch[0] & 0xFF; if (!is_leadbyte(ch)) return (EINVAL); break; case 2: case 3: ch = psenc->ch[0] & 0xFF; if (is_hanyu1(ch)) { ch = psenc->ch[1] & 0xFF; if (is_hanyu2(ch)) { wc |= (wchar_t)HANYUBIT; break; } } /*FALLTHROUGH*/ default: return (EINVAL); } switch (psenc->chlen) { case 1: if (is_hanyu1(ch)) { if (n-- < 1) goto restart; ch = *s0++ & 0xFF; if (!is_hanyu2(ch)) goto ilseq; psenc->ch[psenc->chlen++] = ch; wc |= (wchar_t)HANYUBIT; if (n-- < 1) goto restart; ch = *s0++ & 0xFF; if (!is_leadbyte(ch)) goto ilseq; psenc->ch[psenc->chlen++] = ch; } break; case 2: if (n-- < 1) goto restart; ch = *s0++ & 0xFF; if (!is_leadbyte(ch)) goto ilseq; psenc->ch[psenc->chlen++] = ch; break; case 3: ch = psenc->ch[2] & 0xFF; if (!is_leadbyte(ch)) return (EINVAL); } if (n-- < 1) goto restart; wc |= (wchar_t)(ch << 8); ch = *s0++ & 0xFF; if (!is_trailbyte(ch)) goto ilseq; wc |= (wchar_t)ch; if (pwc != NULL) *pwc = wc; *nresult = (size_t)(s0 - *s); *s = s0; psenc->chlen = 0; return (0); restart: *nresult = (size_t)-2; *s = s0; return (0); ilseq: *nresult = (size_t)-1; return (EILSEQ); } static int /*ARGSUSED*/ _citrus_DECHanyu_wcrtomb_priv(_DECHanyuEncodingInfo * __restrict ei __unused, char * __restrict s, size_t n, wchar_t wc, _DECHanyuState * __restrict psenc, size_t * __restrict nresult) { int ch; if (psenc->chlen != 0) return (EINVAL); /* XXX: assume wchar_t as int */ if ((uint32_t)wc <= 0x7F) { ch = wc & 0xFF; } else { if ((uint32_t)wc > 0xFFFF) { if ((wc & ~0xFFFF) != (wchar_t)HANYUBIT) goto ilseq; psenc->ch[psenc->chlen++] = (wc >> 24) & 0xFF; psenc->ch[psenc->chlen++] = (wc >> 16) & 0xFF; wc &= 0xFFFF; } ch = (wc >> 8) & 0xFF; if (!is_leadbyte(ch)) goto ilseq; psenc->ch[psenc->chlen++] = ch; ch = wc & 0xFF; if (!is_trailbyte(ch)) goto ilseq; } psenc->ch[psenc->chlen++] = ch; if (n < psenc->chlen) { *nresult = (size_t)-1; return (E2BIG); } memcpy(s, psenc->ch, psenc->chlen); *nresult = psenc->chlen; psenc->chlen = 0; return (0); ilseq: *nresult = (size_t)-1; return (EILSEQ); } static __inline int /*ARGSUSED*/ _citrus_DECHanyu_stdenc_wctocs(_DECHanyuEncodingInfo * __restrict ei __unused, _csid_t * __restrict csid, _index_t * __restrict idx, wchar_t wc) { wchar_t mask; int plane; plane = 0; mask = 0x7F; /* XXX: assume wchar_t as int */ if ((uint32_t)wc > 0x7F) { if ((uint32_t)wc > 0xFFFF) { if ((wc & ~0xFFFF) != (wchar_t)HANYUBIT) return (EILSEQ); plane += 2; } if (!is_leadbyte((wc >> 8) & 0xFF) || !is_trailbyte(wc & 0xFF)) return (EILSEQ); plane += (wc & 0x80) ? 1 : 2; mask |= 0x7F00; } *csid = plane; *idx = (_index_t)(wc & mask); return (0); } static __inline int /*ARGSUSED*/ _citrus_DECHanyu_stdenc_cstowc(_DECHanyuEncodingInfo * __restrict ei __unused, wchar_t * __restrict wc, _csid_t csid, _index_t idx) { if (csid == 0) { if (idx > 0x7F) return (EILSEQ); } else if (csid <= 4) { if (!is_94charset(idx >> 8)) return (EILSEQ); if (!is_94charset(idx & 0xFF)) return (EILSEQ); if (csid % 2) idx |= 0x80; idx |= 0x8000; if (csid > 2) idx |= HANYUBIT; } else return (EILSEQ); *wc = (wchar_t)idx; return (0); } static __inline int /*ARGSUSED*/ _citrus_DECHanyu_stdenc_get_state_desc_generic( _DECHanyuEncodingInfo * __restrict ei __unused, _DECHanyuState * __restrict psenc, int * __restrict rstate) { *rstate = (psenc->chlen == 0) ? _STDENC_SDGEN_INITIAL : _STDENC_SDGEN_INCOMPLETE_CHAR; return (0); } /* ---------------------------------------------------------------------- * public interface for stdenc */ _CITRUS_STDENC_DECLS(DECHanyu); _CITRUS_STDENC_DEF_OPS(DECHanyu); #include "citrus_stdenc_template.h" Index: user/ngie/more-tests/lib/libiconv_modules/EUC/citrus_euc.c =================================================================== --- user/ngie/more-tests/lib/libiconv_modules/EUC/citrus_euc.c (revision 281584) +++ user/ngie/more-tests/lib/libiconv_modules/EUC/citrus_euc.c (revision 281585) @@ -1,387 +1,387 @@ /* $FreeBSD$ */ /* $NetBSD: citrus_euc.c,v 1.14 2009/01/11 02:46:24 christos Exp $ */ /*- * Copyright (c)2002 Citrus Project, * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ /*- * Copyright (c) 1993 * The Regents of the University of California. All rights reserved. * * This code is derived from software contributed to Berkeley by * Paul Borman at Krystal Technologies. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ #include #include #include #include #include #include #include #include #include #include #include "citrus_namespace.h" #include "citrus_bcs.h" #include "citrus_types.h" #include "citrus_module.h" #include "citrus_stdenc.h" #include "citrus_euc.h" /* ---------------------------------------------------------------------- * private stuffs used by templates */ typedef struct { int chlen; char ch[3]; } _EUCState; typedef struct { wchar_t bits[4]; wchar_t mask; unsigned count[4]; unsigned mb_cur_max; } _EUCEncodingInfo; #define _SS2 0x008e #define _SS3 0x008f #define _CEI_TO_EI(_cei_) (&(_cei_)->ei) #define _CEI_TO_STATE(_cei_, _func_) (_cei_)->states.s_##_func_ #define _FUNCNAME(m) _citrus_EUC_##m #define _ENCODING_INFO _EUCEncodingInfo #define _ENCODING_STATE _EUCState #define _ENCODING_MB_CUR_MAX(_ei_) (_ei_)->mb_cur_max #define _ENCODING_IS_STATE_DEPENDENT 0 #define _STATE_NEEDS_EXPLICIT_INIT(_ps_) 0 static __inline int _citrus_EUC_cs(unsigned int c) { c &= 0xff; return ((c & 0x80) ? c == _SS3 ? 3 : c == _SS2 ? 2 : 1 : 0); } static __inline int _citrus_EUC_parse_variable(_EUCEncodingInfo *ei, const void *var, size_t lenvar __unused) { char *e; const char *v; int x; /* parse variable string */ if (!var) return (EFTYPE); v = (const char *)var; while (*v == ' ' || *v == '\t') ++v; ei->mb_cur_max = 1; for (x = 0; x < 4; ++x) { ei->count[x] = (int)_bcs_strtol(v, (char **)&e, 0); if (v == e || !(v = e) || ei->count[x] < 1 || ei->count[x] > 4) { return (EFTYPE); } if (ei->mb_cur_max < ei->count[x]) ei->mb_cur_max = ei->count[x]; while (*v == ' ' || *v == '\t') ++v; ei->bits[x] = (int)_bcs_strtol(v, (char **)&e, 0); if (v == e || !(v = e)) { return (EFTYPE); } while (*v == ' ' || *v == '\t') ++v; } ei->mask = (int)_bcs_strtol(v, (char **)&e, 0); if (v == e || !(v = e)) { return (EFTYPE); } return (0); } static __inline void /*ARGSUSED*/ _citrus_EUC_init_state(_EUCEncodingInfo *ei __unused, _EUCState *s) { memset(s, 0, sizeof(*s)); } #if 0 static __inline void /*ARGSUSED*/ _citrus_EUC_pack_state(_EUCEncodingInfo *ei __unused, void *pspriv, const _EUCState *s) { memcpy(pspriv, (const void *)s, sizeof(*s)); } static __inline void /*ARGSUSED*/ _citrus_EUC_unpack_state(_EUCEncodingInfo *ei __unused, _EUCState *s, const void *pspriv) { memcpy((void *)s, pspriv, sizeof(*s)); } #endif static int -_citrus_EUC_mbrtowc_priv(_EUCEncodingInfo *ei, wchar_t *pwc, const char **s, +_citrus_EUC_mbrtowc_priv(_EUCEncodingInfo *ei, wchar_t *pwc, char **s, size_t n, _EUCState *psenc, size_t *nresult) { wchar_t wchar; int c, chlenbak, cs, len; - const char *s0, *s1 = NULL; + char *s0, *s1 = NULL; s0 = *s; if (s0 == NULL) { _citrus_EUC_init_state(ei, psenc); *nresult = 0; /* state independent */ return (0); } chlenbak = psenc->chlen; /* make sure we have the first byte in the buffer */ switch (psenc->chlen) { case 0: if (n < 1) goto restart; psenc->ch[0] = *s0++; psenc->chlen = 1; n--; break; case 1: case 2: break; default: /* illgeal state */ goto encoding_error; } c = ei->count[cs = _citrus_EUC_cs(psenc->ch[0] & 0xff)]; if (c == 0) goto encoding_error; while (psenc->chlen < c) { if (n < 1) goto restart; psenc->ch[psenc->chlen] = *s0++; psenc->chlen++; n--; } *s = s0; switch (cs) { case 3: case 2: /* skip SS2/SS3 */ len = c - 1; s1 = &psenc->ch[1]; break; case 1: case 0: len = c; s1 = &psenc->ch[0]; break; default: goto encoding_error; } wchar = 0; while (len-- > 0) wchar = (wchar << 8) | (*s1++ & 0xff); wchar = (wchar & ~ei->mask) | ei->bits[cs]; psenc->chlen = 0; if (pwc) *pwc = wchar; *nresult = wchar ? (size_t)(c - chlenbak) : 0; return (0); encoding_error: psenc->chlen = 0; *nresult = (size_t)-1; return (EILSEQ); restart: *nresult = (size_t)-2; *s = s0; return (0); } static int _citrus_EUC_wcrtomb_priv(_EUCEncodingInfo *ei, char *s, size_t n, wchar_t wc, _EUCState *psenc __unused, size_t *nresult) { wchar_t m, nm; unsigned int cs; int ret; short i; m = wc & ei->mask; nm = wc & ~m; for (cs = 0; cs < sizeof(ei->count) / sizeof(ei->count[0]); cs++) if (m == ei->bits[cs]) break; /* fallback case - not sure if it is necessary */ if (cs == sizeof(ei->count) / sizeof(ei->count[0])) cs = 1; i = ei->count[cs]; if (n < (unsigned)i) { ret = E2BIG; goto err; } m = (cs) ? 0x80 : 0x00; switch (cs) { case 2: *s++ = _SS2; i--; break; case 3: *s++ = _SS3; i--; break; } while (i-- > 0) *s++ = ((nm >> (i << 3)) & 0xff) | m; *nresult = (size_t)ei->count[cs]; return (0); err: *nresult = (size_t)-1; return (ret); } static __inline int /*ARGSUSED*/ _citrus_EUC_stdenc_wctocs(_EUCEncodingInfo * __restrict ei, _csid_t * __restrict csid, _index_t * __restrict idx, wchar_t wc) { wchar_t m, nm; m = wc & ei->mask; nm = wc & ~m; *csid = (_citrus_csid_t)m; *idx = (_citrus_index_t)nm; return (0); } static __inline int /*ARGSUSED*/ _citrus_EUC_stdenc_cstowc(_EUCEncodingInfo * __restrict ei, wchar_t * __restrict wc, _csid_t csid, _index_t idx) { if ((csid & ~ei->mask) != 0 || (idx & ei->mask) != 0) return (EINVAL); *wc = (wchar_t)csid | (wchar_t)idx; return (0); } static __inline int /*ARGSUSED*/ _citrus_EUC_stdenc_get_state_desc_generic(_EUCEncodingInfo * __restrict ei __unused, _EUCState * __restrict psenc, int * __restrict rstate) { *rstate = (psenc->chlen == 0) ? _STDENC_SDGEN_INITIAL : _STDENC_SDGEN_INCOMPLETE_CHAR; return (0); } static int /*ARGSUSED*/ _citrus_EUC_encoding_module_init(_EUCEncodingInfo * __restrict ei, const void * __restrict var, size_t lenvar) { return (_citrus_EUC_parse_variable(ei, var, lenvar)); } static void /*ARGSUSED*/ _citrus_EUC_encoding_module_uninit(_EUCEncodingInfo * __restrict ei __unused) { } /* ---------------------------------------------------------------------- * public interface for stdenc */ _CITRUS_STDENC_DECLS(EUC); _CITRUS_STDENC_DEF_OPS(EUC); #include "citrus_stdenc_template.h" Index: user/ngie/more-tests/lib/libiconv_modules/EUCTW/citrus_euctw.c =================================================================== --- user/ngie/more-tests/lib/libiconv_modules/EUCTW/citrus_euctw.c (revision 281584) +++ user/ngie/more-tests/lib/libiconv_modules/EUCTW/citrus_euctw.c (revision 281585) @@ -1,379 +1,379 @@ /* $FreeBSD$ */ /* $NetBSD: citrus_euctw.c,v 1.11 2008/06/14 16:01:07 tnozaki Exp $ */ /*- * Copyright (c)2002 Citrus Project, * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ /*- * Copyright (c)1999 Citrus Project, * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * $Citrus: xpg4dl/FreeBSD/lib/libc/locale/euctw.c,v 1.13 2001/06/21 01:51:44 yamt Exp $ */ #include #include #include #include #include #include #include #include #include #include #include "citrus_namespace.h" #include "citrus_types.h" #include "citrus_module.h" #include "citrus_stdenc.h" #include "citrus_euctw.h" /* ---------------------------------------------------------------------- * private stuffs used by templates */ typedef struct { int chlen; char ch[4]; } _EUCTWState; typedef struct { int dummy; } _EUCTWEncodingInfo; #define _SS2 0x008e #define _SS3 0x008f #define _CEI_TO_EI(_cei_) (&(_cei_)->ei) #define _CEI_TO_STATE(_cei_, _func_) (_cei_)->states.s_##_func_ #define _FUNCNAME(m) _citrus_EUCTW_##m #define _ENCODING_INFO _EUCTWEncodingInfo #define _ENCODING_STATE _EUCTWState #define _ENCODING_MB_CUR_MAX(_ei_) 4 #define _ENCODING_IS_STATE_DEPENDENT 0 #define _STATE_NEEDS_EXPLICIT_INIT(_ps_) 0 static __inline int _citrus_EUCTW_cs(unsigned int c) { c &= 0xff; return ((c & 0x80) ? (c == _SS2 ? 2 : 1) : 0); } static __inline int _citrus_EUCTW_count(int cs) { switch (cs) { case 0: /*FALLTHROUGH*/ case 1: /*FALLTHROUGH*/ case 2: return (1 << cs); case 3: abort(); /*NOTREACHED*/ } return (0); } static __inline void /*ARGSUSED*/ _citrus_EUCTW_init_state(_EUCTWEncodingInfo * __restrict ei __unused, _EUCTWState * __restrict s) { memset(s, 0, sizeof(*s)); } #if 0 static __inline void /*ARGSUSED*/ _citrus_EUCTW_pack_state(_EUCTWEncodingInfo * __restrict ei __unused, void * __restrict pspriv, const _EUCTWState * __restrict s) { memcpy(pspriv, (const void *)s, sizeof(*s)); } static __inline void /*ARGSUSED*/ _citrus_EUCTW_unpack_state(_EUCTWEncodingInfo * __restrict ei __unused, _EUCTWState * __restrict s, const void * __restrict pspriv) { memcpy((void *)s, pspriv, sizeof(*s)); } #endif static int /*ARGSUSED*/ _citrus_EUCTW_encoding_module_init(_EUCTWEncodingInfo * __restrict ei, const void * __restrict var __unused, size_t lenvar __unused) { memset((void *)ei, 0, sizeof(*ei)); return (0); } static void /*ARGSUSED*/ _citrus_EUCTW_encoding_module_uninit(_EUCTWEncodingInfo *ei __unused) { } static int _citrus_EUCTW_mbrtowc_priv(_EUCTWEncodingInfo * __restrict ei, - wchar_t * __restrict pwc, const char ** __restrict s, + wchar_t * __restrict pwc, char ** __restrict s, size_t n, _EUCTWState * __restrict psenc, size_t * __restrict nresult) { - const char *s0; + char *s0; wchar_t wchar; int c, chlenbak, cs; s0 = *s; if (s0 == NULL) { _citrus_EUCTW_init_state(ei, psenc); *nresult = 0; /* state independent */ return (0); } chlenbak = psenc->chlen; /* make sure we have the first byte in the buffer */ switch (psenc->chlen) { case 0: if (n < 1) goto restart; psenc->ch[0] = *s0++; psenc->chlen = 1; n--; break; case 1: case 2: break; default: /* illgeal state */ goto ilseq; } c = _citrus_EUCTW_count(cs = _citrus_EUCTW_cs(psenc->ch[0] & 0xff)); if (c == 0) goto ilseq; while (psenc->chlen < c) { if (n < 1) goto ilseq; psenc->ch[psenc->chlen] = *s0++; psenc->chlen++; n--; } wchar = 0; switch (cs) { case 0: if (psenc->ch[0] & 0x80) goto ilseq; wchar = psenc->ch[0] & 0xff; break; case 1: if (!(psenc->ch[0] & 0x80) || !(psenc->ch[1] & 0x80)) goto ilseq; wchar = ((psenc->ch[0] & 0xff) << 8) | (psenc->ch[1] & 0xff); wchar |= 'G' << 24; break; case 2: if ((unsigned char)psenc->ch[1] < 0xa1 || 0xa7 < (unsigned char)psenc->ch[1]) goto ilseq; if (!(psenc->ch[2] & 0x80) || !(psenc->ch[3] & 0x80)) goto ilseq; wchar = ((psenc->ch[2] & 0xff) << 8) | (psenc->ch[3] & 0xff); wchar |= ('G' + psenc->ch[1] - 0xa1) << 24; break; default: goto ilseq; } *s = s0; psenc->chlen = 0; if (pwc) *pwc = wchar; *nresult = wchar ? c - chlenbak : 0; return (0); ilseq: psenc->chlen = 0; *nresult = (size_t)-1; return (EILSEQ); restart: *s = s0; *nresult = (size_t)-1; return (0); } static int _citrus_EUCTW_wcrtomb_priv(_EUCTWEncodingInfo * __restrict ei __unused, char * __restrict s, size_t n, wchar_t wc, _EUCTWState * __restrict psenc __unused, size_t * __restrict nresult) { wchar_t cs, v; int clen, i, ret; size_t len; cs = wc & 0x7f000080; clen = 1; if (wc & 0x00007f00) clen = 2; if ((wc & 0x007f0000) && !(wc & 0x00800000)) clen = 3; if (clen == 1 && cs == 0x00000000) { /* ASCII */ len = 1; if (n < len) { ret = E2BIG; goto err; } v = wc & 0x0000007f; } else if (clen == 2 && cs == ('G' << 24)) { /* CNS-11643-1 */ len = 2; if (n < len) { ret = E2BIG; goto err; } v = wc & 0x00007f7f; v |= 0x00008080; } else if (clen == 2 && 'H' <= (cs >> 24) && (cs >> 24) <= 'M') { /* CNS-11643-[2-7] */ len = 4; if (n < len) { ret = E2BIG; goto err; } *s++ = _SS2; *s++ = (cs >> 24) - 'H' + 0xa2; v = wc & 0x00007f7f; v |= 0x00008080; } else { ret = EILSEQ; goto err; } i = clen; while (i-- > 0) *s++ = (v >> (i << 3)) & 0xff; *nresult = len; return (0); err: *nresult = (size_t)-1; return (ret); } static __inline int /*ARGSUSED*/ _citrus_EUCTW_stdenc_wctocs(_EUCTWEncodingInfo * __restrict ei __unused, _csid_t * __restrict csid, _index_t * __restrict idx, wchar_t wc) { *csid = (_csid_t)(wc >> 24) & 0xFF; *idx = (_index_t)(wc & 0x7F7F); return (0); } static __inline int /*ARGSUSED*/ _citrus_EUCTW_stdenc_cstowc(_EUCTWEncodingInfo * __restrict ei __unused, wchar_t * __restrict wc, _csid_t csid, _index_t idx) { if (csid == 0) { if ((idx & ~0x7F) != 0) return (EINVAL); *wc = (wchar_t)idx; } else { if (csid < 'G' || csid > 'M' || (idx & ~0x7F7F) != 0) return (EINVAL); *wc = (wchar_t)idx | ((wchar_t)csid << 24); } return (0); } static __inline int /*ARGSUSED*/ _citrus_EUCTW_stdenc_get_state_desc_generic(_EUCTWEncodingInfo * __restrict ei __unused, _EUCTWState * __restrict psenc, int * __restrict rstate) { *rstate = (psenc->chlen == 0) ? _STDENC_SDGEN_INITIAL : _STDENC_SDGEN_INCOMPLETE_CHAR; return (0); } /* ---------------------------------------------------------------------- * public interface for stdenc */ _CITRUS_STDENC_DECLS(EUCTW); _CITRUS_STDENC_DEF_OPS(EUCTW); #include "citrus_stdenc_template.h" Index: user/ngie/more-tests/lib/libiconv_modules/GBK2K/citrus_gbk2k.c =================================================================== --- user/ngie/more-tests/lib/libiconv_modules/GBK2K/citrus_gbk2k.c (revision 281584) +++ user/ngie/more-tests/lib/libiconv_modules/GBK2K/citrus_gbk2k.c (revision 281585) @@ -1,419 +1,419 @@ /* $FreeBSD$ */ /* $NetBSD: citrus_gbk2k.c,v 1.7 2008/06/14 16:01:07 tnozaki Exp $ */ /*- * Copyright (c)2003 Citrus Project, * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ #include #include #include #include #include #include #include #include #include #include #include #include "citrus_namespace.h" #include "citrus_types.h" #include "citrus_bcs.h" #include "citrus_module.h" #include "citrus_stdenc.h" #include "citrus_gbk2k.h" /* ---------------------------------------------------------------------- * private stuffs used by templates */ typedef struct _GBK2KState { int chlen; char ch[4]; } _GBK2KState; typedef struct { int mb_cur_max; } _GBK2KEncodingInfo; #define _CEI_TO_EI(_cei_) (&(_cei_)->ei) #define _CEI_TO_STATE(_cei_, _func_) (_cei_)->states.s_##_func_ #define _FUNCNAME(m) _citrus_GBK2K_##m #define _ENCODING_INFO _GBK2KEncodingInfo #define _ENCODING_STATE _GBK2KState #define _ENCODING_MB_CUR_MAX(_ei_) (_ei_)->mb_cur_max #define _ENCODING_IS_STATE_DEPENDENT 0 #define _STATE_NEEDS_EXPLICIT_INIT(_ps_) 0 static __inline void /*ARGSUSED*/ _citrus_GBK2K_init_state(_GBK2KEncodingInfo * __restrict ei __unused, _GBK2KState * __restrict s) { memset(s, 0, sizeof(*s)); } #if 0 static __inline void /*ARGSUSED*/ _citrus_GBK2K_pack_state(_GBK2KEncodingInfo * __restrict ei __unused, void * __restrict pspriv, const _GBK2KState * __restrict s) { memcpy(pspriv, (const void *)s, sizeof(*s)); } static __inline void /*ARGSUSED*/ _citrus_GBK2K_unpack_state(_GBK2KEncodingInfo * __restrict ei __unused, _GBK2KState * __restrict s, const void * __restrict pspriv) { memcpy((void *)s, pspriv, sizeof(*s)); } #endif static __inline bool _mb_singlebyte(int c) { return ((c & 0xff) <= 0x7f); } static __inline bool _mb_leadbyte(int c) { c &= 0xff; return (0x81 <= c && c <= 0xfe); } static __inline bool _mb_trailbyte(int c) { c &= 0xff; return ((0x40 <= c && c <= 0x7e) || (0x80 <= c && c <= 0xfe)); } static __inline bool _mb_surrogate(int c) { c &= 0xff; return (0x30 <= c && c <= 0x39); } static __inline int _mb_count(wchar_t v) { uint32_t c; c = (uint32_t)v; /* XXX */ if (!(c & 0xffffff00)) return (1); if (!(c & 0xffff0000)) return (2); return (4); } #define _PSENC (psenc->ch[psenc->chlen - 1]) #define _PUSH_PSENC(c) (psenc->ch[psenc->chlen++] = (c)) static int _citrus_GBK2K_mbrtowc_priv(_GBK2KEncodingInfo * __restrict ei, - wchar_t * __restrict pwc, const char ** __restrict s, size_t n, + wchar_t * __restrict pwc, char ** __restrict s, size_t n, _GBK2KState * __restrict psenc, size_t * __restrict nresult) { - const char *s0, *s1; + char *s0, *s1; wchar_t wc; int chlenbak, len; s0 = *s; if (s0 == NULL) { /* _citrus_GBK2K_init_state(ei, psenc); */ psenc->chlen = 0; *nresult = 0; return (0); } chlenbak = psenc->chlen; switch (psenc->chlen) { case 3: if (!_mb_leadbyte (_PSENC)) goto invalid; /* FALLTHROUGH */ case 2: if (!_mb_surrogate(_PSENC) || _mb_trailbyte(_PSENC)) goto invalid; /* FALLTHROUGH */ case 1: if (!_mb_leadbyte (_PSENC)) goto invalid; /* FALLTHOROUGH */ case 0: break; default: goto invalid; } for (;;) { if (n-- < 1) goto restart; _PUSH_PSENC(*s0++); switch (psenc->chlen) { case 1: if (_mb_singlebyte(_PSENC)) goto convert; if (_mb_leadbyte (_PSENC)) continue; goto ilseq; case 2: if (_mb_trailbyte (_PSENC)) goto convert; if (ei->mb_cur_max == 4 && _mb_surrogate (_PSENC)) continue; goto ilseq; case 3: if (_mb_leadbyte (_PSENC)) continue; goto ilseq; case 4: if (_mb_surrogate (_PSENC)) goto convert; goto ilseq; } } convert: len = psenc->chlen; s1 = &psenc->ch[0]; wc = 0; while (len-- > 0) wc = (wc << 8) | (*s1++ & 0xff); if (pwc != NULL) *pwc = wc; *s = s0; *nresult = (wc == 0) ? 0 : psenc->chlen - chlenbak; /* _citrus_GBK2K_init_state(ei, psenc); */ psenc->chlen = 0; return (0); restart: *s = s0; *nresult = (size_t)-2; return (0); invalid: return (EINVAL); ilseq: *nresult = (size_t)-1; return (EILSEQ); } static int _citrus_GBK2K_wcrtomb_priv(_GBK2KEncodingInfo * __restrict ei, char * __restrict s, size_t n, wchar_t wc, _GBK2KState * __restrict psenc, size_t * __restrict nresult) { size_t len; int ret; if (psenc->chlen != 0) { ret = EINVAL; goto err; } len = _mb_count(wc); if (n < len) { ret = E2BIG; goto err; } switch (len) { case 1: if (!_mb_singlebyte(_PUSH_PSENC(wc ))) { ret = EILSEQ; goto err; } break; case 2: if (!_mb_leadbyte (_PUSH_PSENC(wc >> 8)) || !_mb_trailbyte (_PUSH_PSENC(wc))) { ret = EILSEQ; goto err; } break; case 4: if (ei->mb_cur_max != 4 || !_mb_leadbyte (_PUSH_PSENC(wc >> 24)) || !_mb_surrogate (_PUSH_PSENC(wc >> 16)) || !_mb_leadbyte (_PUSH_PSENC(wc >> 8)) || !_mb_surrogate (_PUSH_PSENC(wc))) { ret = EILSEQ; goto err; } break; } memcpy(s, psenc->ch, psenc->chlen); *nresult = psenc->chlen; /* _citrus_GBK2K_init_state(ei, psenc); */ psenc->chlen = 0; return (0); err: *nresult = (size_t)-1; return (ret); } static __inline int /*ARGSUSED*/ _citrus_GBK2K_stdenc_wctocs(_GBK2KEncodingInfo * __restrict ei __unused, _csid_t * __restrict csid, _index_t * __restrict idx, wchar_t wc) { uint8_t ch, cl; if ((uint32_t)wc < 0x80) { /* ISO646 */ *csid = 0; *idx = (_index_t)wc; } else if ((uint32_t)wc >= 0x10000) { /* GBKUCS : XXX */ *csid = 3; *idx = (_index_t)wc; } else { ch = (uint8_t)(wc >> 8); cl = (uint8_t)wc; if (ch >= 0xA1 && cl >= 0xA1) { /* EUC G1 */ *csid = 1; *idx = (_index_t)wc & 0x7F7FU; } else { /* extended area (0x8140-) */ *csid = 2; *idx = (_index_t)wc; } } return (0); } static __inline int /*ARGSUSED*/ _citrus_GBK2K_stdenc_cstowc(_GBK2KEncodingInfo * __restrict ei, wchar_t * __restrict wc, _csid_t csid, _index_t idx) { switch (csid) { case 0: /* ISO646 */ *wc = (wchar_t)idx; break; case 1: /* EUC G1 */ *wc = (wchar_t)idx | 0x8080U; break; case 2: /* extended area */ *wc = (wchar_t)idx; break; case 3: /* GBKUCS : XXX */ if (ei->mb_cur_max != 4) return (EINVAL); *wc = (wchar_t)idx; break; default: return (EILSEQ); } return (0); } static __inline int /*ARGSUSED*/ _citrus_GBK2K_stdenc_get_state_desc_generic(_GBK2KEncodingInfo * __restrict ei __unused, _GBK2KState * __restrict psenc, int * __restrict rstate) { *rstate = (psenc->chlen == 0) ? _STDENC_SDGEN_INITIAL : _STDENC_SDGEN_INCOMPLETE_CHAR; return (0); } static int /*ARGSUSED*/ _citrus_GBK2K_encoding_module_init(_GBK2KEncodingInfo * __restrict ei, const void * __restrict var, size_t lenvar) { const char *p; p = var; memset((void *)ei, 0, sizeof(*ei)); ei->mb_cur_max = 4; while (lenvar > 0) { switch (_bcs_tolower(*p)) { case '2': MATCH("2byte", ei->mb_cur_max = 2); break; } p++; lenvar--; } return (0); } static void /*ARGSUSED*/ _citrus_GBK2K_encoding_module_uninit(_GBK2KEncodingInfo *ei __unused) { } /* ---------------------------------------------------------------------- * public interface for stdenc */ _CITRUS_STDENC_DECLS(GBK2K); _CITRUS_STDENC_DEF_OPS(GBK2K); #include "citrus_stdenc_template.h" Index: user/ngie/more-tests/lib/libiconv_modules/HZ/citrus_hz.c =================================================================== --- user/ngie/more-tests/lib/libiconv_modules/HZ/citrus_hz.c (revision 281584) +++ user/ngie/more-tests/lib/libiconv_modules/HZ/citrus_hz.c (revision 281585) @@ -1,648 +1,648 @@ /* $FreeBSD$ */ /* $NetBSD: citrus_hz.c,v 1.2 2008/06/14 16:01:07 tnozaki Exp $ */ /*- * Copyright (c)2004, 2006 Citrus Project, * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * */ #include #include #include #include #include #include #include #include #include #include #include #include "citrus_namespace.h" #include "citrus_types.h" #include "citrus_bcs.h" #include "citrus_module.h" #include "citrus_stdenc.h" #include "citrus_hz.h" #include "citrus_prop.h" /* * wchar_t mapping: * * CTRL/ASCII 00000000 00000000 00000000 gxxxxxxx * GB2312 00000000 00000000 0xxxxxxx gxxxxxxx * 94/96*n (~M) 0mmmmmmm 0xxxxxxx 0xxxxxxx gxxxxxxx */ #define ESCAPE_CHAR '~' typedef enum { CTRL = 0, ASCII = 1, GB2312 = 2, CS94 = 3, CS96 = 4 } charset_t; typedef struct { int start; int end; int width; } range_t; static const range_t ranges[] = { #define RANGE(start, end) { start, end, (end - start) + 1 } /* CTRL */ RANGE(0x00, 0x1F), /* ASCII */ RANGE(0x20, 0x7F), /* GB2312 */ RANGE(0x21, 0x7E), /* CS94 */ RANGE(0x21, 0x7E), /* CS96 */ RANGE(0x20, 0x7F), #undef RANGE }; typedef struct escape_t escape_t; typedef struct { charset_t charset; escape_t *escape; ssize_t length; #define ROWCOL_MAX 3 } graphic_t; typedef TAILQ_HEAD(escape_list, escape_t) escape_list; struct escape_t { TAILQ_ENTRY(escape_t) entry; escape_list *set; graphic_t *left; graphic_t *right; int ch; }; #define GL(escape) ((escape)->left) #define GR(escape) ((escape)->right) #define SET(escape) ((escape)->set) #define ESC(escape) ((escape)->ch) #define INIT(escape) (TAILQ_FIRST(SET(escape))) static __inline escape_t * find_escape(escape_list *set, int ch) { escape_t *escape; TAILQ_FOREACH(escape, set, entry) { if (ESC(escape) == ch) break; } return (escape); } typedef struct { escape_list e0; escape_list e1; graphic_t *ascii; graphic_t *gb2312; } _HZEncodingInfo; #define E0SET(ei) (&(ei)->e0) #define E1SET(ei) (&(ei)->e1) #define INIT0(ei) (TAILQ_FIRST(E0SET(ei))) #define INIT1(ei) (TAILQ_FIRST(E1SET(ei))) typedef struct { escape_t *inuse; int chlen; char ch[ROWCOL_MAX]; } _HZState; #define _CEI_TO_EI(_cei_) (&(_cei_)->ei) #define _CEI_TO_STATE(_cei_, _func_) (_cei_)->states.s_##_func_ #define _FUNCNAME(m) _citrus_HZ_##m #define _ENCODING_INFO _HZEncodingInfo #define _ENCODING_STATE _HZState #define _ENCODING_MB_CUR_MAX(_ei_) MB_LEN_MAX #define _ENCODING_IS_STATE_DEPENDENT 1 #define _STATE_NEEDS_EXPLICIT_INIT(_ps_) ((_ps_)->inuse == NULL) static __inline void _citrus_HZ_init_state(_HZEncodingInfo * __restrict ei, _HZState * __restrict psenc) { psenc->chlen = 0; psenc->inuse = INIT0(ei); } #if 0 static __inline void /*ARGSUSED*/ _citrus_HZ_pack_state(_HZEncodingInfo * __restrict ei __unused, void *__restrict pspriv, const _HZState * __restrict psenc) { memcpy(pspriv, (const void *)psenc, sizeof(*psenc)); } static __inline void /*ARGSUSED*/ _citrus_HZ_unpack_state(_HZEncodingInfo * __restrict ei __unused, _HZState * __restrict psenc, const void * __restrict pspriv) { memcpy((void *)psenc, pspriv, sizeof(*psenc)); } #endif static int _citrus_HZ_mbrtowc_priv(_HZEncodingInfo * __restrict ei, - wchar_t * __restrict pwc, const char ** __restrict s, size_t n, + wchar_t * __restrict pwc, char ** __restrict s, size_t n, _HZState * __restrict psenc, size_t * __restrict nresult) { escape_t *candidate, *init; graphic_t *graphic; const range_t *range; - const char *s0; + char *s0; wchar_t wc; int bit, ch, head, len, tail; if (*s == NULL) { _citrus_HZ_init_state(ei, psenc); *nresult = 1; return (0); } s0 = *s; if (psenc->chlen < 0 || psenc->inuse == NULL) return (EINVAL); wc = (wchar_t)0; bit = head = tail = 0; graphic = NULL; for (len = 0; len <= MB_LEN_MAX;) { if (psenc->chlen == tail) { if (n-- < 1) { *s = s0; *nresult = (size_t)-2; return (0); } psenc->ch[psenc->chlen++] = *s0++; ++len; } ch = (unsigned char)psenc->ch[tail++]; if (tail == 1) { if ((ch & ~0x80) <= 0x1F) { if (psenc->inuse != INIT0(ei)) break; wc = (wchar_t)ch; goto done; } if (ch & 0x80) { graphic = GR(psenc->inuse); bit = 0x80; ch &= ~0x80; } else { graphic = GL(psenc->inuse); if (ch == ESCAPE_CHAR) continue; bit = 0x0; } if (graphic == NULL) break; } else if (tail == 2 && psenc->ch[0] == ESCAPE_CHAR) { if (tail < psenc->chlen) return (EINVAL); if (ch == ESCAPE_CHAR) { ++head; } else if (ch == '\n') { if (psenc->inuse != INIT0(ei)) break; tail = psenc->chlen = 0; continue; } else { candidate = NULL; init = INIT0(ei); if (psenc->inuse == init) { init = INIT1(ei); } else if (INIT(psenc->inuse) == init) { if (ESC(init) != ch) break; candidate = init; } if (candidate == NULL) { candidate = find_escape( SET(psenc->inuse), ch); if (candidate == NULL) { if (init == NULL || ESC(init) != ch) break; candidate = init; } } psenc->inuse = candidate; tail = psenc->chlen = 0; continue; } } else if (ch & 0x80) { if (graphic != GR(psenc->inuse)) break; ch &= ~0x80; } else { if (graphic != GL(psenc->inuse)) break; } range = &ranges[(size_t)graphic->charset]; if (range->start > ch || range->end < ch) break; wc <<= 8; wc |= ch; if (graphic->length == (tail - head)) { if (graphic->charset > GB2312) bit |= ESC(psenc->inuse) << 24; wc |= bit; goto done; } } *nresult = (size_t)-1; return (EILSEQ); done: if (tail < psenc->chlen) return (EINVAL); *s = s0; if (pwc != NULL) *pwc = wc; psenc->chlen = 0; *nresult = (wc == 0) ? 0 : len; return (0); } static int _citrus_HZ_wcrtomb_priv(_HZEncodingInfo * __restrict ei, char * __restrict s, size_t n, wchar_t wc, _HZState * __restrict psenc, size_t * __restrict nresult) { escape_t *candidate, *init; graphic_t *graphic; const range_t *range; size_t len; int bit, ch; if (psenc->chlen != 0 || psenc->inuse == NULL) return (EINVAL); if (wc & 0x80) { bit = 0x80; wc &= ~0x80; } else { bit = 0x0; } if ((uint32_t)wc <= 0x1F) { candidate = INIT0(ei); graphic = (bit == 0) ? candidate->left : candidate->right; if (graphic == NULL) goto ilseq; range = &ranges[(size_t)CTRL]; len = 1; } else if ((uint32_t)wc <= 0x7F) { graphic = ei->ascii; if (graphic == NULL) goto ilseq; candidate = graphic->escape; range = &ranges[(size_t)graphic->charset]; len = graphic->length; } else if ((uint32_t)wc <= 0x7F7F) { graphic = ei->gb2312; if (graphic == NULL) goto ilseq; candidate = graphic->escape; range = &ranges[(size_t)graphic->charset]; len = graphic->length; } else { ch = (wc >> 24) & 0xFF; candidate = find_escape(E0SET(ei), ch); if (candidate == NULL) { candidate = find_escape(E1SET(ei), ch); if (candidate == NULL) goto ilseq; } wc &= ~0xFF000000; graphic = (bit == 0) ? candidate->left : candidate->right; if (graphic == NULL) goto ilseq; range = &ranges[(size_t)graphic->charset]; len = graphic->length; } if (psenc->inuse != candidate) { init = INIT0(ei); if (SET(psenc->inuse) == SET(candidate)) { if (INIT(psenc->inuse) != init || psenc->inuse == init || candidate == init) init = NULL; } else if (candidate == (init = INIT(candidate))) { init = NULL; } if (init != NULL) { if (n < 2) return (E2BIG); n -= 2; psenc->ch[psenc->chlen++] = ESCAPE_CHAR; psenc->ch[psenc->chlen++] = ESC(init); } if (n < 2) return (E2BIG); n -= 2; psenc->ch[psenc->chlen++] = ESCAPE_CHAR; psenc->ch[psenc->chlen++] = ESC(candidate); psenc->inuse = candidate; } if (n < len) return (E2BIG); while (len-- > 0) { ch = (wc >> (len * 8)) & 0xFF; if (range->start > ch || range->end < ch) goto ilseq; psenc->ch[psenc->chlen++] = ch | bit; } memcpy(s, psenc->ch, psenc->chlen); *nresult = psenc->chlen; psenc->chlen = 0; return (0); ilseq: *nresult = (size_t)-1; return (EILSEQ); } static __inline int _citrus_HZ_put_state_reset(_HZEncodingInfo * __restrict ei, char * __restrict s, size_t n, _HZState * __restrict psenc, size_t * __restrict nresult) { escape_t *candidate; if (psenc->chlen != 0 || psenc->inuse == NULL) return (EINVAL); candidate = INIT0(ei); if (psenc->inuse != candidate) { if (n < 2) return (E2BIG); n -= 2; psenc->ch[psenc->chlen++] = ESCAPE_CHAR; psenc->ch[psenc->chlen++] = ESC(candidate); } if (n < 1) return (E2BIG); if (psenc->chlen > 0) memcpy(s, psenc->ch, psenc->chlen); *nresult = psenc->chlen; _citrus_HZ_init_state(ei, psenc); return (0); } static __inline int _citrus_HZ_stdenc_get_state_desc_generic(_HZEncodingInfo * __restrict ei, _HZState * __restrict psenc, int * __restrict rstate) { if (psenc->chlen < 0 || psenc->inuse == NULL) return (EINVAL); *rstate = (psenc->chlen == 0) ? ((psenc->inuse == INIT0(ei)) ? _STDENC_SDGEN_INITIAL : _STDENC_SDGEN_STABLE) : ((psenc->ch[0] == ESCAPE_CHAR) ? _STDENC_SDGEN_INCOMPLETE_SHIFT : _STDENC_SDGEN_INCOMPLETE_CHAR); return (0); } static __inline int /*ARGSUSED*/ _citrus_HZ_stdenc_wctocs(_HZEncodingInfo * __restrict ei __unused, _csid_t * __restrict csid, _index_t * __restrict idx, wchar_t wc) { int bit; if (wc & 0x80) { bit = 0x80; wc &= ~0x80; } else bit = 0x0; if ((uint32_t)wc <= 0x7F) { *csid = (_csid_t)bit; *idx = (_index_t)wc; } else if ((uint32_t)wc <= 0x7F7F) { *csid = (_csid_t)(bit | 0x8000); *idx = (_index_t)wc; } else { *csid = (_index_t)(wc & ~0x00FFFF7F); *idx = (_csid_t)(wc & 0x00FFFF7F); } return (0); } static __inline int /*ARGSUSED*/ _citrus_HZ_stdenc_cstowc(_HZEncodingInfo * __restrict ei __unused, wchar_t * __restrict wc, _csid_t csid, _index_t idx) { *wc = (wchar_t)idx; switch (csid) { case 0x80: case 0x8080: *wc |= (wchar_t)0x80; /*FALLTHROUGH*/ case 0x0: case 0x8000: break; default: *wc |= (wchar_t)csid; } return (0); } static void _citrus_HZ_encoding_module_uninit(_HZEncodingInfo *ei) { escape_t *escape; while ((escape = TAILQ_FIRST(E0SET(ei))) != NULL) { TAILQ_REMOVE(E0SET(ei), escape, entry); free(GL(escape)); free(GR(escape)); free(escape); } while ((escape = TAILQ_FIRST(E1SET(ei))) != NULL) { TAILQ_REMOVE(E1SET(ei), escape, entry); free(GL(escape)); free(GR(escape)); free(escape); } } static int _citrus_HZ_parse_char(void *context, const char *name __unused, const char *s) { escape_t *escape; void **p; p = (void **)context; escape = (escape_t *)p[0]; if (escape->ch != '\0') return (EINVAL); escape->ch = *s++; if (escape->ch == ESCAPE_CHAR || *s != '\0') return (EINVAL); return (0); } static int _citrus_HZ_parse_graphic(void *context, const char *name, const char *s) { _HZEncodingInfo *ei; escape_t *escape; graphic_t *graphic; void **p; p = (void **)context; escape = (escape_t *)p[0]; ei = (_HZEncodingInfo *)p[1]; graphic = calloc(1, sizeof(*graphic)); if (graphic == NULL) return (ENOMEM); if (strcmp("GL", name) == 0) { if (GL(escape) != NULL) goto release; GL(escape) = graphic; } else if (strcmp("GR", name) == 0) { if (GR(escape) != NULL) goto release; GR(escape) = graphic; } else { release: free(graphic); return (EINVAL); } graphic->escape = escape; if (_bcs_strncasecmp("ASCII", s, 5) == 0) { if (s[5] != '\0') return (EINVAL); graphic->charset = ASCII; graphic->length = 1; ei->ascii = graphic; return (0); } else if (_bcs_strncasecmp("GB2312", s, 6) == 0) { if (s[6] != '\0') return (EINVAL); graphic->charset = GB2312; graphic->length = 2; ei->gb2312 = graphic; return (0); } else if (strncmp("94*", s, 3) == 0) graphic->charset = CS94; else if (strncmp("96*", s, 3) == 0) graphic->charset = CS96; else return (EINVAL); s += 3; switch(*s) { case '1': case '2': case '3': graphic->length = (size_t)(*s - '0'); if (*++s == '\0') break; /*FALLTHROUGH*/ default: return (EINVAL); } return (0); } static const _citrus_prop_hint_t escape_hints[] = { _CITRUS_PROP_HINT_STR("CH", &_citrus_HZ_parse_char), _CITRUS_PROP_HINT_STR("GL", &_citrus_HZ_parse_graphic), _CITRUS_PROP_HINT_STR("GR", &_citrus_HZ_parse_graphic), _CITRUS_PROP_HINT_END }; static int _citrus_HZ_parse_escape(void *context, const char *name, const char *s) { _HZEncodingInfo *ei; escape_t *escape; void *p[2]; ei = (_HZEncodingInfo *)context; escape = calloc(1, sizeof(*escape)); if (escape == NULL) return (EINVAL); if (strcmp("0", name) == 0) { escape->set = E0SET(ei); TAILQ_INSERT_TAIL(E0SET(ei), escape, entry); } else if (strcmp("1", name) == 0) { escape->set = E1SET(ei); TAILQ_INSERT_TAIL(E1SET(ei), escape, entry); } else { free(escape); return (EINVAL); } p[0] = (void *)escape; p[1] = (void *)ei; return (_citrus_prop_parse_variable( escape_hints, (void *)&p[0], s, strlen(s))); } static const _citrus_prop_hint_t root_hints[] = { _CITRUS_PROP_HINT_STR("0", &_citrus_HZ_parse_escape), _CITRUS_PROP_HINT_STR("1", &_citrus_HZ_parse_escape), _CITRUS_PROP_HINT_END }; static int _citrus_HZ_encoding_module_init(_HZEncodingInfo * __restrict ei, const void * __restrict var, size_t lenvar) { int errnum; memset(ei, 0, sizeof(*ei)); TAILQ_INIT(E0SET(ei)); TAILQ_INIT(E1SET(ei)); errnum = _citrus_prop_parse_variable( root_hints, (void *)ei, var, lenvar); if (errnum != 0) _citrus_HZ_encoding_module_uninit(ei); return (errnum); } /* ---------------------------------------------------------------------- * public interface for stdenc */ _CITRUS_STDENC_DECLS(HZ); _CITRUS_STDENC_DEF_OPS(HZ); #include "citrus_stdenc_template.h" Index: user/ngie/more-tests/lib/libiconv_modules/ISO2022/citrus_iso2022.c =================================================================== --- user/ngie/more-tests/lib/libiconv_modules/ISO2022/citrus_iso2022.c (revision 281584) +++ user/ngie/more-tests/lib/libiconv_modules/ISO2022/citrus_iso2022.c (revision 281585) @@ -1,1291 +1,1291 @@ /* $FreeBSD$ */ /* $NetBSD: citrus_iso2022.c,v 1.20 2010/12/07 22:01:45 joerg Exp $ */ /*- * Copyright (c)1999, 2002 Citrus Project, * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * $Citrus: xpg4dl/FreeBSD/lib/libc/locale/iso2022.c,v 1.23 2001/06/21 01:51:44 yamt Exp $ */ #include #include #include #include #include #include #include #include #include #include #include #include "citrus_namespace.h" #include "citrus_types.h" #include "citrus_module.h" #include "citrus_stdenc.h" #include "citrus_iso2022.h" /* ---------------------------------------------------------------------- * private stuffs used by templates */ /* * wchar_t mappings: * ASCII (ESC ( B) 00000000 00000000 00000000 0xxxxxxx * iso-8859-1 (ESC , A) 00000000 00000000 00000000 1xxxxxxx * 94 charset (ESC ( F) 0fffffff 00000000 00000000 0xxxxxxx * 94 charset (ESC ( M F) 0fffffff 1mmmmmmm 00000000 0xxxxxxx * 96 charset (ESC , F) 0fffffff 00000000 00000000 1xxxxxxx * 96 charset (ESC , M F) 0fffffff 1mmmmmmm 00000000 1xxxxxxx * 94x94 charset (ESC $ ( F) 0fffffff 00000000 0xxxxxxx 0xxxxxxx * 96x96 charset (ESC $ , F) 0fffffff 00000000 0xxxxxxx 1xxxxxxx * 94x94 charset (ESC & V ESC $ ( F) * 0fffffff 1vvvvvvv 0xxxxxxx 0xxxxxxx * 94x94x94 charset (ESC $ ( F) 0fffffff 0xxxxxxx 0xxxxxxx 0xxxxxxx * 96x96x96 charset (ESC $ , F) 0fffffff 0xxxxxxx 0xxxxxxx 1xxxxxxx * reserved for UCS4 co-existence (UCS4 is 31bit encoding thanks to mohta bit) * 1xxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx */ #define CS94 (0U) #define CS96 (1U) #define CS94MULTI (2U) #define CS96MULTI (3U) typedef struct { unsigned char type; unsigned char final; unsigned char interm; unsigned char vers; } _ISO2022Charset; static const _ISO2022Charset ascii = { CS94, 'B', '\0', '\0' }; static const _ISO2022Charset iso88591 = { CS96, 'A', '\0', '\0' }; typedef struct { _ISO2022Charset g[4]; /* need 3 bits to hold -1, 0, ..., 3 */ int gl:3, gr:3, singlegl:3, singlegr:3; char ch[7]; /* longest escape sequence (ESC & V ESC $ ( F) */ size_t chlen; int flags; #define _ISO2022STATE_FLAG_INITIALIZED 1 } _ISO2022State; typedef struct { _ISO2022Charset *recommend[4]; size_t recommendsize[4]; _ISO2022Charset initg[4]; int maxcharset; int flags; #define F_8BIT 0x0001 #define F_NOOLD 0x0002 #define F_SI 0x0010 /*0F*/ #define F_SO 0x0020 /*0E*/ #define F_LS0 0x0010 /*0F*/ #define F_LS1 0x0020 /*0E*/ #define F_LS2 0x0040 /*ESC n*/ #define F_LS3 0x0080 /*ESC o*/ #define F_LS1R 0x0100 /*ESC ~*/ #define F_LS2R 0x0200 /*ESC }*/ #define F_LS3R 0x0400 /*ESC |*/ #define F_SS2 0x0800 /*ESC N*/ #define F_SS3 0x1000 /*ESC O*/ #define F_SS2R 0x2000 /*8E*/ #define F_SS3R 0x4000 /*8F*/ } _ISO2022EncodingInfo; #define _CEI_TO_EI(_cei_) (&(_cei_)->ei) #define _CEI_TO_STATE(_cei_, _func_) (_cei_)->states.s_##_func_ #define _FUNCNAME(m) _citrus_ISO2022_##m #define _ENCODING_INFO _ISO2022EncodingInfo #define _ENCODING_STATE _ISO2022State #define _ENCODING_MB_CUR_MAX(_ei_) MB_LEN_MAX #define _ENCODING_IS_STATE_DEPENDENT 1 #define _STATE_NEEDS_EXPLICIT_INIT(_ps_) \ (!((_ps_)->flags & _ISO2022STATE_FLAG_INITIALIZED)) #define _ISO2022INVALID (wchar_t)-1 static __inline bool isc0(__uint8_t x) { return ((x & 0x1f) == x); } static __inline bool isc1(__uint8_t x) { return (0x80 <= x && x <= 0x9f); } static __inline bool iscntl(__uint8_t x) { return (isc0(x) || isc1(x) || x == 0x7f); } static __inline bool is94(__uint8_t x) { return (0x21 <= x && x <= 0x7e); } static __inline bool is96(__uint8_t x) { return (0x20 <= x && x <= 0x7f); } static __inline bool isecma(__uint8_t x) { return (0x30 <= x && x <= 0x7f); } static __inline bool isinterm(__uint8_t x) { return (0x20 <= x && x <= 0x2f); } static __inline bool isthree(__uint8_t x) { return (0x60 <= x && x <= 0x6f); } static __inline int getcs(const char * __restrict p, _ISO2022Charset * __restrict cs) { if (!strncmp(p, "94$", 3) && p[3] && !p[4]) { cs->final = (unsigned char)(p[3] & 0xff); cs->interm = '\0'; cs->vers = '\0'; cs->type = CS94MULTI; } else if (!strncmp(p, "96$", 3) && p[3] && !p[4]) { cs->final = (unsigned char)(p[3] & 0xff); cs->interm = '\0'; cs->vers = '\0'; cs->type = CS96MULTI; } else if (!strncmp(p, "94", 2) && p[2] && !p[3]) { cs->final = (unsigned char)(p[2] & 0xff); cs->interm = '\0'; cs->vers = '\0'; cs->type = CS94; } else if (!strncmp(p, "96", 2) && p[2] && !p[3]) { cs->final = (unsigned char )(p[2] & 0xff); cs->interm = '\0'; cs->vers = '\0'; cs->type = CS96; } else return (1); return (0); } #define _NOTMATCH 0 #define _MATCH 1 #define _PARSEFAIL 2 static __inline int get_recommend(_ISO2022EncodingInfo * __restrict ei, const char * __restrict token) { _ISO2022Charset cs, *p; int i; if (!strchr("0123", token[0]) || token[1] != '=') return (_NOTMATCH); if (getcs(&token[2], &cs) == 0) ; else if (!strcmp(&token[2], "94")) { cs.final = (unsigned char)(token[4]); cs.interm = '\0'; cs.vers = '\0'; cs.type = CS94; } else if (!strcmp(&token[2], "96")) { cs.final = (unsigned char)(token[4]); cs.interm = '\0'; cs.vers = '\0'; cs.type = CS96; } else if (!strcmp(&token[2], "94$")) { cs.final = (unsigned char)(token[5]); cs.interm = '\0'; cs.vers = '\0'; cs.type = CS94MULTI; } else if (!strcmp(&token[2], "96$")) { cs.final = (unsigned char)(token[5]); cs.interm = '\0'; cs.vers = '\0'; cs.type = CS96MULTI; } else return (_PARSEFAIL); i = token[0] - '0'; if (!ei->recommend[i]) ei->recommend[i] = malloc(sizeof(_ISO2022Charset)); else { p = realloc(ei->recommend[i], sizeof(_ISO2022Charset) * (ei->recommendsize[i] + 1)); if (!p) return (_PARSEFAIL); ei->recommend[i] = p; } if (!ei->recommend[i]) return (_PARSEFAIL); ei->recommendsize[i]++; (ei->recommend[i] + (ei->recommendsize[i] - 1))->final = cs.final; (ei->recommend[i] + (ei->recommendsize[i] - 1))->interm = cs.interm; (ei->recommend[i] + (ei->recommendsize[i] - 1))->vers = cs.vers; (ei->recommend[i] + (ei->recommendsize[i] - 1))->type = cs.type; return (_MATCH); } static __inline int get_initg(_ISO2022EncodingInfo * __restrict ei, const char * __restrict token) { _ISO2022Charset cs; if (strncmp("INIT", &token[0], 4) || !strchr("0123", token[4]) || token[5] != '=') return (_NOTMATCH); if (getcs(&token[6], &cs) != 0) return (_PARSEFAIL); ei->initg[token[4] - '0'].type = cs.type; ei->initg[token[4] - '0'].final = cs.final; ei->initg[token[4] - '0'].interm = cs.interm; ei->initg[token[4] - '0'].vers = cs.vers; return (_MATCH); } static __inline int get_max(_ISO2022EncodingInfo * __restrict ei, const char * __restrict token) { if (!strcmp(token, "MAX1")) ei->maxcharset = 1; else if (!strcmp(token, "MAX2")) ei->maxcharset = 2; else if (!strcmp(token, "MAX3")) ei->maxcharset = 3; else return (_NOTMATCH); return (_MATCH); } static __inline int get_flags(_ISO2022EncodingInfo * __restrict ei, const char * __restrict token) { static struct { const char *tag; int flag; } const tags[] = { { "DUMMY", 0 }, { "8BIT", F_8BIT }, { "NOOLD", F_NOOLD }, { "SI", F_SI }, { "SO", F_SO }, { "LS0", F_LS0 }, { "LS1", F_LS1 }, { "LS2", F_LS2 }, { "LS3", F_LS3 }, { "LS1R", F_LS1R }, { "LS2R", F_LS2R }, { "LS3R", F_LS3R }, { "SS2", F_SS2 }, { "SS3", F_SS3 }, { "SS2R", F_SS2R }, { "SS3R", F_SS3R }, { NULL, 0 } }; int i; for (i = 0; tags[i].tag; i++) if (!strcmp(token, tags[i].tag)) { ei->flags |= tags[i].flag; return (_MATCH); } return (_NOTMATCH); } static __inline int _citrus_ISO2022_parse_variable(_ISO2022EncodingInfo * __restrict ei, const void * __restrict var, size_t lenvar __unused) { char const *e, *v; char buf[20]; size_t len; int i, ret; /* * parse VARIABLE section. */ if (!var) return (EFTYPE); v = (const char *) var; /* initialize structure */ ei->maxcharset = 0; for (i = 0; i < 4; i++) { ei->recommend[i] = NULL; ei->recommendsize[i] = 0; } ei->flags = 0; while (*v) { while (*v == ' ' || *v == '\t') ++v; /* find the token */ e = v; while (*e && *e != ' ' && *e != '\t') ++e; len = e - v; if (len == 0) break; if (len >= sizeof(buf)) goto parsefail; snprintf(buf, sizeof(buf), "%.*s", (int)len, v); if ((ret = get_recommend(ei, buf)) != _NOTMATCH) ; else if ((ret = get_initg(ei, buf)) != _NOTMATCH) ; else if ((ret = get_max(ei, buf)) != _NOTMATCH) ; else if ((ret = get_flags(ei, buf)) != _NOTMATCH) ; else ret = _PARSEFAIL; if (ret == _PARSEFAIL) goto parsefail; v = e; } return (0); parsefail: free(ei->recommend[0]); free(ei->recommend[1]); free(ei->recommend[2]); free(ei->recommend[3]); return (EFTYPE); } static __inline void /*ARGSUSED*/ _citrus_ISO2022_init_state(_ISO2022EncodingInfo * __restrict ei, _ISO2022State * __restrict s) { int i; memset(s, 0, sizeof(*s)); s->gl = 0; s->gr = (ei->flags & F_8BIT) ? 1 : -1; for (i = 0; i < 4; i++) if (ei->initg[i].final) { s->g[i].type = ei->initg[i].type; s->g[i].final = ei->initg[i].final; s->g[i].interm = ei->initg[i].interm; } s->singlegl = s->singlegr = -1; s->flags |= _ISO2022STATE_FLAG_INITIALIZED; } #if 0 static __inline void /*ARGSUSED*/ _citrus_ISO2022_pack_state(_ISO2022EncodingInfo * __restrict ei __unused, void * __restrict pspriv, const _ISO2022State * __restrict s) { memcpy(pspriv, (const void *)s, sizeof(*s)); } static __inline void /*ARGSUSED*/ _citrus_ISO2022_unpack_state(_ISO2022EncodingInfo * __restrict ei __unused, _ISO2022State * __restrict s, const void * __restrict pspriv) { memcpy((void *)s, pspriv, sizeof(*s)); } #endif static int /*ARGSUSED*/ _citrus_ISO2022_encoding_module_init(_ISO2022EncodingInfo * __restrict ei, const void * __restrict var, size_t lenvar) { return (_citrus_ISO2022_parse_variable(ei, var, lenvar)); } static void /*ARGSUSED*/ _citrus_ISO2022_encoding_module_uninit(_ISO2022EncodingInfo *ei __unused) { } #define ESC '\033' #define ECMA -1 #define INTERM -2 #define OECMA -3 static const struct seqtable { int type; int csoff; int finaloff; int intermoff; int versoff; int len; int chars[10]; } seqtable[] = { /* G0 94MULTI special */ { CS94MULTI, -1, 2, -1, -1, 3, { ESC, '$', OECMA }, }, /* G0 94MULTI special with version identification */ { CS94MULTI, -1, 5, -1, 2, 6, { ESC, '&', ECMA, ESC, '$', OECMA }, }, /* G? 94 */ { CS94, 1, 2, -1, -1, 3, { ESC, CS94, ECMA, }, }, /* G? 94 with 2nd intermediate char */ { CS94, 1, 3, 2, -1, 4, { ESC, CS94, INTERM, ECMA, }, }, /* G? 96 */ { CS96, 1, 2, -1, -1, 3, { ESC, CS96, ECMA, }, }, /* G? 96 with 2nd intermediate char */ { CS96, 1, 3, 2, -1, 4, { ESC, CS96, INTERM, ECMA, }, }, /* G? 94MULTI */ { CS94MULTI, 2, 3, -1, -1, 4, { ESC, '$', CS94, ECMA, }, }, /* G? 96MULTI */ { CS96MULTI, 2, 3, -1, -1, 4, { ESC, '$', CS96, ECMA, }, }, /* G? 94MULTI with version specification */ { CS94MULTI, 5, 6, -1, 2, 7, { ESC, '&', ECMA, ESC, '$', CS94, ECMA, }, }, /* LS2/3 */ { -1, -1, -1, -1, -1, 2, { ESC, 'n', }, }, { -1, -1, -1, -1, -1, 2, { ESC, 'o', }, }, /* LS1/2/3R */ { -1, -1, -1, -1, -1, 2, { ESC, '~', }, }, { -1, -1, -1, -1, -1, 2, { ESC, /*{*/ '}', }, }, { -1, -1, -1, -1, -1, 2, { ESC, '|', }, }, /* SS2/3 */ { -1, -1, -1, -1, -1, 2, { ESC, 'N', }, }, { -1, -1, -1, -1, -1, 2, { ESC, 'O', }, }, /* end of records */ // { 0, } { 0, 0, 0, 0, 0, 0, { ESC, 0, }, } }; static int seqmatch(const char * __restrict s, size_t n, const struct seqtable * __restrict sp) { const int *p; p = sp->chars; while ((size_t)(p - sp->chars) < n && p - sp->chars < sp->len) { switch (*p) { case ECMA: if (!isecma(*s)) goto terminate; break; case OECMA: if (*s && strchr("@AB", *s)) break; else goto terminate; case INTERM: if (!isinterm(*s)) goto terminate; break; case CS94: if (*s && strchr("()*+", *s)) break; else goto terminate; case CS96: if (*s && strchr(",-./", *s)) break; else goto terminate; default: if (*s != *p) goto terminate; break; } p++; s++; } terminate: return (p - sp->chars); } static wchar_t _ISO2022_sgetwchar(_ISO2022EncodingInfo * __restrict ei __unused, - const char * __restrict string, size_t n, const char ** __restrict result, + char * __restrict string, size_t n, char ** __restrict result, _ISO2022State * __restrict psenc) { const struct seqtable *sp; wchar_t wchar = 0; int i, cur, nmatch; while (1) { /* SI/SO */ if (1 <= n && string[0] == '\017') { psenc->gl = 0; string++; n--; continue; } if (1 <= n && string[0] == '\016') { psenc->gl = 1; string++; n--; continue; } /* SS2/3R */ if (1 <= n && string[0] && strchr("\217\216", string[0])) { psenc->singlegl = psenc->singlegr = (string[0] - '\216') + 2; string++; n--; continue; } /* eat the letter if this is not ESC */ if (1 <= n && string[0] != '\033') break; /* look for a perfect match from escape sequences */ for (sp = &seqtable[0]; sp->len; sp++) { nmatch = seqmatch(string, n, sp); if (sp->len == nmatch && n >= (size_t)(sp->len)) break; } if (!sp->len) goto notseq; if (sp->type != -1) { if (sp->csoff == -1) i = 0; else { switch (sp->type) { case CS94: case CS94MULTI: i = string[sp->csoff] - '('; break; case CS96: case CS96MULTI: i = string[sp->csoff] - ','; break; default: return (_ISO2022INVALID); } } psenc->g[i].type = sp->type; psenc->g[i].final = '\0'; psenc->g[i].interm = '\0'; psenc->g[i].vers = '\0'; /* sp->finaloff must not be -1 */ if (sp->finaloff != -1) psenc->g[i].final = string[sp->finaloff]; if (sp->intermoff != -1) psenc->g[i].interm = string[sp->intermoff]; if (sp->versoff != -1) psenc->g[i].vers = string[sp->versoff]; string += sp->len; n -= sp->len; continue; } /* LS2/3 */ if (2 <= n && string[0] == '\033' && string[1] && strchr("no", string[1])) { psenc->gl = string[1] - 'n' + 2; string += 2; n -= 2; continue; } /* LS1/2/3R */ /* XXX: { for vi showmatch */ if (2 <= n && string[0] == '\033' && string[1] && strchr("~}|", string[1])) { psenc->gr = 3 - (string[1] - '|'); string += 2; n -= 2; continue; } /* SS2/3 */ if (2 <= n && string[0] == '\033' && string[1] && strchr("NO", string[1])) { psenc->singlegl = (string[1] - 'N') + 2; string += 2; n -= 2; continue; } notseq: /* * if we've got an unknown escape sequence, eat the ESC at the * head. otherwise, wait till full escape sequence comes. */ for (sp = &seqtable[0]; sp->len; sp++) { nmatch = seqmatch(string, n, sp); if (!nmatch) continue; /* * if we are in the middle of escape sequence, * we still need to wait for more characters to come */ if (n < (size_t)(sp->len)) { if ((size_t)(nmatch) == n) { if (result) *result = string; return (_ISO2022INVALID); } } else { if (nmatch == sp->len) { /* this case should not happen */ goto eat; } } } break; } eat: /* no letter to eat */ if (n < 1) { if (result) *result = string; return (_ISO2022INVALID); } /* normal chars. always eat C0/C1 as is. */ if (iscntl(*string & 0xff)) cur = -1; else if (*string & 0x80) cur = (psenc->singlegr == -1) ? psenc->gr : psenc->singlegr; else cur = (psenc->singlegl == -1) ? psenc->gl : psenc->singlegl; if (cur == -1) { asis: wchar = *string++ & 0xff; if (result) *result = string; /* reset single shift state */ psenc->singlegr = psenc->singlegl = -1; return (wchar); } /* length error check */ switch (psenc->g[cur].type) { case CS94MULTI: case CS96MULTI: if (!isthree(psenc->g[cur].final)) { if (2 <= n && (string[0] & 0x80) == (string[1] & 0x80)) break; } else { if (3 <= n && (string[0] & 0x80) == (string[1] & 0x80) && (string[0] & 0x80) == (string[2] & 0x80)) break; } /* we still need to wait for more characters to come */ if (result) *result = string; return (_ISO2022INVALID); case CS94: case CS96: if (1 <= n) break; /* we still need to wait for more characters to come */ if (result) *result = string; return (_ISO2022INVALID); } /* range check */ switch (psenc->g[cur].type) { case CS94: if (!(is94(string[0] & 0x7f))) goto asis; case CS96: if (!(is96(string[0] & 0x7f))) goto asis; break; case CS94MULTI: if (!(is94(string[0] & 0x7f) && is94(string[1] & 0x7f))) goto asis; break; case CS96MULTI: if (!(is96(string[0] & 0x7f) && is96(string[1] & 0x7f))) goto asis; break; } /* extract the character. */ switch (psenc->g[cur].type) { case CS94: /* special case for ASCII. */ if (psenc->g[cur].final == 'B' && !psenc->g[cur].interm) { wchar = *string++; wchar &= 0x7f; break; } wchar = psenc->g[cur].final; wchar = (wchar << 8); wchar |= (psenc->g[cur].interm ? (0x80 | psenc->g[cur].interm) : 0); wchar = (wchar << 8); wchar = (wchar << 8) | (*string++ & 0x7f); break; case CS96: /* special case for ISO-8859-1. */ if (psenc->g[cur].final == 'A' && !psenc->g[cur].interm) { wchar = *string++; wchar &= 0x7f; wchar |= 0x80; break; } wchar = psenc->g[cur].final; wchar = (wchar << 8); wchar |= (psenc->g[cur].interm ? (0x80 | psenc->g[cur].interm) : 0); wchar = (wchar << 8); wchar = (wchar << 8) | (*string++ & 0x7f); wchar |= 0x80; break; case CS94MULTI: case CS96MULTI: wchar = psenc->g[cur].final; wchar = (wchar << 8); if (isthree(psenc->g[cur].final)) wchar |= (*string++ & 0x7f); wchar = (wchar << 8) | (*string++ & 0x7f); wchar = (wchar << 8) | (*string++ & 0x7f); if (psenc->g[cur].type == CS96MULTI) wchar |= 0x80; break; } if (result) *result = string; /* reset single shift state */ psenc->singlegr = psenc->singlegl = -1; return (wchar); } static int _citrus_ISO2022_mbrtowc_priv(_ISO2022EncodingInfo * __restrict ei, - wchar_t * __restrict pwc, const char ** __restrict s, + wchar_t * __restrict pwc, char ** __restrict s, size_t n, _ISO2022State * __restrict psenc, size_t * __restrict nresult) { - const char *p, *result, *s0; + char *p, *result, *s0; wchar_t wchar; int c, chlenbak; if (*s == NULL) { _citrus_ISO2022_init_state(ei, psenc); *nresult = _ENCODING_IS_STATE_DEPENDENT; return (0); } s0 = *s; c = 0; chlenbak = psenc->chlen; /* * if we have something in buffer, use that. * otherwise, skip here */ if (psenc->chlen > sizeof(psenc->ch)) { /* illgeal state */ _citrus_ISO2022_init_state(ei, psenc); goto encoding_error; } if (psenc->chlen == 0) goto emptybuf; /* buffer is not empty */ p = psenc->ch; while (psenc->chlen < sizeof(psenc->ch)) { if (n > 0) { psenc->ch[psenc->chlen++] = *s0++; n--; } wchar = _ISO2022_sgetwchar(ei, p, psenc->chlen - (p-psenc->ch), &result, psenc); c += result - p; if (wchar != _ISO2022INVALID) { if (psenc->chlen > (size_t)c) memmove(psenc->ch, result, psenc->chlen - c); if (psenc->chlen < (size_t)c) psenc->chlen = 0; else psenc->chlen -= c; goto output; } if (n == 0) { if ((size_t)(result - p) == psenc->chlen) /* complete shift sequence. */ psenc->chlen = 0; goto restart; } p = result; } /* escape sequence too long? */ goto encoding_error; emptybuf: wchar = _ISO2022_sgetwchar(ei, s0, n, &result, psenc); if (wchar != _ISO2022INVALID) { c += result - s0; psenc->chlen = 0; s0 = result; goto output; } if (result > s0) { c += (result - s0); n -= (result - s0); s0 = result; if (n > 0) goto emptybuf; /* complete shift sequence. */ goto restart; } n += c; if (n < sizeof(psenc->ch)) { memcpy(psenc->ch, s0 - c, n); psenc->chlen = n; s0 = result; goto restart; } /* escape sequence too long? */ encoding_error: psenc->chlen = 0; *nresult = (size_t)-1; return (EILSEQ); output: *s = s0; if (pwc) *pwc = wchar; *nresult = wchar ? c - chlenbak : 0; return (0); restart: *s = s0; *nresult = (size_t)-2; return (0); } static int recommendation(_ISO2022EncodingInfo * __restrict ei, _ISO2022Charset * __restrict cs) { _ISO2022Charset *recommend; size_t j; int i; /* first, try a exact match. */ for (i = 0; i < 4; i++) { recommend = ei->recommend[i]; for (j = 0; j < ei->recommendsize[i]; j++) { if (cs->type != recommend[j].type) continue; if (cs->final != recommend[j].final) continue; if (cs->interm != recommend[j].interm) continue; return (i); } } /* then, try a wildcard match over final char. */ for (i = 0; i < 4; i++) { recommend = ei->recommend[i]; for (j = 0; j < ei->recommendsize[i]; j++) { if (cs->type != recommend[j].type) continue; if (cs->final && (cs->final != recommend[j].final)) continue; if (cs->interm && (cs->interm != recommend[j].interm)) continue; return (i); } } /* there's no recommendation. make a guess. */ if (ei->maxcharset == 0) { return (0); } else { switch (cs->type) { case CS94: case CS94MULTI: return (0); case CS96: case CS96MULTI: return (1); } } return (0); } static int _ISO2022_sputwchar(_ISO2022EncodingInfo * __restrict ei, wchar_t wc, char * __restrict string, size_t n, char ** __restrict result, _ISO2022State * __restrict psenc, size_t * __restrict nresult) { _ISO2022Charset cs; char *p; char tmp[MB_LEN_MAX]; size_t len; int bit8, i = 0, target; unsigned char mask; if (isc0(wc & 0xff)) { /* go back to INIT0 or ASCII on control chars */ cs = ei->initg[0].final ? ei->initg[0] : ascii; } else if (isc1(wc & 0xff)) { /* go back to INIT1 or ISO-8859-1 on control chars */ cs = ei->initg[1].final ? ei->initg[1] : iso88591; } else if (!(wc & ~0xff)) { if (wc & 0x80) { /* special treatment for ISO-8859-1 */ cs = iso88591; } else { /* special treatment for ASCII */ cs = ascii; } } else { cs.final = (wc >> 24) & 0x7f; if ((wc >> 16) & 0x80) cs.interm = (wc >> 16) & 0x7f; else cs.interm = '\0'; if (wc & 0x80) cs.type = (wc & 0x00007f00) ? CS96MULTI : CS96; else cs.type = (wc & 0x00007f00) ? CS94MULTI : CS94; } target = recommendation(ei, &cs); p = tmp; bit8 = ei->flags & F_8BIT; /* designate the charset onto the target plane(G0/1/2/3). */ if (psenc->g[target].type == cs.type && psenc->g[target].final == cs.final && psenc->g[target].interm == cs.interm) goto planeok; *p++ = '\033'; if (cs.type == CS94MULTI || cs.type == CS96MULTI) *p++ = '$'; if (target == 0 && cs.type == CS94MULTI && strchr("@AB", cs.final) && !cs.interm && !(ei->flags & F_NOOLD)) ; else if (cs.type == CS94 || cs.type == CS94MULTI) *p++ = "()*+"[target]; else *p++ = ",-./"[target]; if (cs.interm) *p++ = cs.interm; *p++ = cs.final; psenc->g[target].type = cs.type; psenc->g[target].final = cs.final; psenc->g[target].interm = cs.interm; planeok: /* invoke the plane onto GL or GR. */ if (psenc->gl == target) goto sideok; if (bit8 && psenc->gr == target) goto sideok; if (target == 0 && (ei->flags & F_LS0)) { *p++ = '\017'; psenc->gl = 0; } else if (target == 1 && (ei->flags & F_LS1)) { *p++ = '\016'; psenc->gl = 1; } else if (target == 2 && (ei->flags & F_LS2)) { *p++ = '\033'; *p++ = 'n'; psenc->gl = 2; } else if (target == 3 && (ei->flags & F_LS3)) { *p++ = '\033'; *p++ = 'o'; psenc->gl = 3; } else if (bit8 && target == 1 && (ei->flags & F_LS1R)) { *p++ = '\033'; *p++ = '~'; psenc->gr = 1; } else if (bit8 && target == 2 && (ei->flags & F_LS2R)) { *p++ = '\033'; /*{*/ *p++ = '}'; psenc->gr = 2; } else if (bit8 && target == 3 && (ei->flags & F_LS3R)) { *p++ = '\033'; *p++ = '|'; psenc->gr = 3; } else if (target == 2 && (ei->flags & F_SS2)) { *p++ = '\033'; *p++ = 'N'; psenc->singlegl = 2; } else if (target == 3 && (ei->flags & F_SS3)) { *p++ = '\033'; *p++ = 'O'; psenc->singlegl = 3; } else if (bit8 && target == 2 && (ei->flags & F_SS2R)) { *p++ = '\216'; *p++ = 'N'; psenc->singlegl = psenc->singlegr = 2; } else if (bit8 && target == 3 && (ei->flags & F_SS3R)) { *p++ = '\217'; *p++ = 'O'; psenc->singlegl = psenc->singlegr = 3; } else goto ilseq; sideok: if (psenc->singlegl == target) mask = 0x00; else if (psenc->singlegr == target) mask = 0x80; else if (psenc->gl == target) mask = 0x00; else if ((ei->flags & F_8BIT) && psenc->gr == target) mask = 0x80; else goto ilseq; switch (cs.type) { case CS94: case CS96: i = 1; break; case CS94MULTI: case CS96MULTI: i = !iscntl(wc & 0xff) ? (isthree(cs.final) ? 3 : 2) : 1; break; } while (i-- > 0) *p++ = ((wc >> (i << 3)) & 0x7f) | mask; /* reset single shift state */ psenc->singlegl = psenc->singlegr = -1; len = (size_t)(p - tmp); if (n < len) { if (result) *result = (char *)0; *nresult = (size_t)-1; return (E2BIG); } if (result) *result = string + len; memcpy(string, tmp, len); *nresult = len; return (0); ilseq: *nresult = (size_t)-1; return (EILSEQ); } static int _citrus_ISO2022_put_state_reset(_ISO2022EncodingInfo * __restrict ei, char * __restrict s, size_t n, _ISO2022State * __restrict psenc, size_t * __restrict nresult) { char *result; char buf[MB_LEN_MAX]; size_t len; int ret; /* XXX state will be modified after this operation... */ ret = _ISO2022_sputwchar(ei, L'\0', buf, sizeof(buf), &result, psenc, &len); if (ret) { *nresult = len; return (ret); } if (sizeof(buf) < len || n < len-1) { /* XXX should recover state? */ *nresult = (size_t)-1; return (E2BIG); } memcpy(s, buf, len - 1); *nresult = len - 1; return (0); } static int _citrus_ISO2022_wcrtomb_priv(_ISO2022EncodingInfo * __restrict ei, char * __restrict s, size_t n, wchar_t wc, _ISO2022State * __restrict psenc, size_t * __restrict nresult) { char *result; char buf[MB_LEN_MAX]; size_t len; int ret; /* XXX state will be modified after this operation... */ ret = _ISO2022_sputwchar(ei, wc, buf, sizeof(buf), &result, psenc, &len); if (ret) { *nresult = len; return (ret); } if (sizeof(buf) < len || n < len) { /* XXX should recover state? */ *nresult = (size_t)-1; return (E2BIG); } memcpy(s, buf, len); *nresult = len; return (0); } static __inline int /*ARGSUSED*/ _citrus_ISO2022_stdenc_wctocs(_ISO2022EncodingInfo * __restrict ei __unused, _csid_t * __restrict csid, _index_t * __restrict idx, wchar_t wc) { wchar_t m, nm; m = wc & 0x7FFF8080; nm = wc & 0x007F7F7F; if (m & 0x00800000) nm &= 0x00007F7F; else m &= 0x7F008080; if (nm & 0x007F0000) { /* ^3 mark */ m |= 0x007F0000; } else if (nm & 0x00007F00) { /* ^2 mark */ m |= 0x00007F00; } *csid = (_csid_t)m; *idx = (_index_t)nm; return (0); } static __inline int /*ARGSUSED*/ _citrus_ISO2022_stdenc_cstowc(_ISO2022EncodingInfo * __restrict ei __unused, wchar_t * __restrict wc, _csid_t csid, _index_t idx) { *wc = (wchar_t)(csid & 0x7F808080) | (wchar_t)idx; return (0); } static __inline int /*ARGSUSED*/ _citrus_ISO2022_stdenc_get_state_desc_generic(_ISO2022EncodingInfo * __restrict ei __unused, _ISO2022State * __restrict psenc, int * __restrict rstate) { if (psenc->chlen == 0) { /* XXX: it should distinguish initial and stable. */ *rstate = _STDENC_SDGEN_STABLE; } else *rstate = (psenc->ch[0] == '\033') ? _STDENC_SDGEN_INCOMPLETE_SHIFT : _STDENC_SDGEN_INCOMPLETE_CHAR; return (0); } /* ---------------------------------------------------------------------- * public interface for stdenc */ _CITRUS_STDENC_DECLS(ISO2022); _CITRUS_STDENC_DEF_OPS(ISO2022); #include "citrus_stdenc_template.h" Index: user/ngie/more-tests/lib/libiconv_modules/JOHAB/citrus_johab.c =================================================================== --- user/ngie/more-tests/lib/libiconv_modules/JOHAB/citrus_johab.c (revision 281584) +++ user/ngie/more-tests/lib/libiconv_modules/JOHAB/citrus_johab.c (revision 281585) @@ -1,335 +1,335 @@ /* $FreeBSD$ */ /* $NetBSD: citrus_johab.c,v 1.4 2008/06/14 16:01:07 tnozaki Exp $ */ /*- * Copyright (c)2006 Citrus Project, * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ #include #include #include #include #include #include #include #include #include #include #include #include #include "citrus_namespace.h" #include "citrus_types.h" #include "citrus_bcs.h" #include "citrus_module.h" #include "citrus_stdenc.h" #include "citrus_johab.h" /* ---------------------------------------------------------------------- * private stuffs used by templates */ typedef struct { int chlen; char ch[2]; } _JOHABState; typedef struct { int dummy; } _JOHABEncodingInfo; #define _CEI_TO_EI(_cei_) (&(_cei_)->ei) #define _CEI_TO_STATE(_cei_, _func_) (_cei_)->states.s_##_func_ #define _FUNCNAME(m) _citrus_JOHAB_##m #define _ENCODING_INFO _JOHABEncodingInfo #define _ENCODING_STATE _JOHABState #define _ENCODING_MB_CUR_MAX(_ei_) 2 #define _ENCODING_IS_STATE_DEPENDENT 0 #define _STATE_NEEDS_EXPLICIT_INIT(_ps_) 0 static __inline void /*ARGSUSED*/ _citrus_JOHAB_init_state(_JOHABEncodingInfo * __restrict ei __unused, _JOHABState * __restrict psenc) { psenc->chlen = 0; } #if 0 static __inline void /*ARGSUSED*/ _citrus_JOHAB_pack_state(_JOHABEncodingInfo * __restrict ei __unused, void * __restrict pspriv, const _JOHABState * __restrict psenc) { memcpy(pspriv, (const void *)psenc, sizeof(*psenc)); } static __inline void /*ARGSUSED*/ _citrus_JOHAB_unpack_state(_JOHABEncodingInfo * __restrict ei __unused, _JOHABState * __restrict psenc, const void * __restrict pspriv) { memcpy((void *)psenc, pspriv, sizeof(*psenc)); } #endif static void /*ARGSUSED*/ _citrus_JOHAB_encoding_module_uninit(_JOHABEncodingInfo *ei __unused) { /* ei may be null */ } static int /*ARGSUSED*/ _citrus_JOHAB_encoding_module_init(_JOHABEncodingInfo * __restrict ei __unused, const void * __restrict var __unused, size_t lenvar __unused) { /* ei may be null */ return (0); } static __inline bool ishangul(int l, int t) { return ((l >= 0x84 && l <= 0xD3) && ((t >= 0x41 && t <= 0x7E) || (t >= 0x81 && t <= 0xFE))); } static __inline bool isuda(int l, int t) { return ((l == 0xD8) && ((t >= 0x31 && t <= 0x7E) || (t >= 0x91 && t <= 0xFE))); } static __inline bool ishanja(int l, int t) { return (((l >= 0xD9 && l <= 0xDE) || (l >= 0xE0 && l <= 0xF9)) && ((t >= 0x31 && t <= 0x7E) || (t >= 0x91 && t <= 0xFE))); } static int /*ARGSUSED*/ _citrus_JOHAB_mbrtowc_priv(_JOHABEncodingInfo * __restrict ei, - wchar_t * __restrict pwc, const char ** __restrict s, size_t n, + wchar_t * __restrict pwc, char ** __restrict s, size_t n, _JOHABState * __restrict psenc, size_t * __restrict nresult) { - const char *s0; + char *s0; int l, t; if (*s == NULL) { _citrus_JOHAB_init_state(ei, psenc); *nresult = _ENCODING_IS_STATE_DEPENDENT; return (0); } s0 = *s; switch (psenc->chlen) { case 0: if (n-- < 1) goto restart; l = *s0++ & 0xFF; if (l <= 0x7F) { if (pwc != NULL) *pwc = (wchar_t)l; *nresult = (l == 0) ? 0 : 1; *s = s0; return (0); } psenc->ch[psenc->chlen++] = l; break; case 1: l = psenc->ch[0] & 0xFF; break; default: return (EINVAL); } if (n-- < 1) { restart: *nresult = (size_t)-2; *s = s0; return (0); } t = *s0++ & 0xFF; if (!ishangul(l, t) && !isuda(l, t) && !ishanja(l, t)) { *nresult = (size_t)-1; return (EILSEQ); } if (pwc != NULL) *pwc = (wchar_t)(l << 8 | t); *nresult = s0 - *s; *s = s0; psenc->chlen = 0; return (0); } static int /*ARGSUSED*/ _citrus_JOHAB_wcrtomb_priv(_JOHABEncodingInfo * __restrict ei __unused, char * __restrict s, size_t n, wchar_t wc, _JOHABState * __restrict psenc, size_t * __restrict nresult) { int l, t; if (psenc->chlen != 0) return (EINVAL); /* XXX assume wchar_t as int */ if ((uint32_t)wc <= 0x7F) { if (n < 1) goto e2big; *s = wc & 0xFF; *nresult = 1; } else if ((uint32_t)wc <= 0xFFFF) { if (n < 2) { e2big: *nresult = (size_t)-1; return (E2BIG); } l = (wc >> 8) & 0xFF; t = wc & 0xFF; if (!ishangul(l, t) && !isuda(l, t) && !ishanja(l, t)) goto ilseq; *s++ = l; *s = t; *nresult = 2; } else { ilseq: *nresult = (size_t)-1; return (EILSEQ); } return (0); } static __inline int /*ARGSUSED*/ _citrus_JOHAB_stdenc_wctocs(_JOHABEncodingInfo * __restrict ei __unused, _csid_t * __restrict csid, _index_t * __restrict idx, wchar_t wc) { int m, l, linear, t; /* XXX assume wchar_t as int */ if ((uint32_t)wc <= 0x7F) { *idx = (_index_t)wc; *csid = 0; } else if ((uint32_t)wc <= 0xFFFF) { l = (wc >> 8) & 0xFF; t = wc & 0xFF; if (ishangul(l, t) || isuda(l, t)) { *idx = (_index_t)wc; *csid = 1; } else { if (l >= 0xD9 && l <= 0xDE) { linear = l - 0xD9; m = 0x21; } else if (l >= 0xE0 && l <= 0xF9) { linear = l - 0xE0; m = 0x4A; } else return (EILSEQ); linear *= 188; if (t >= 0x31 && t <= 0x7E) linear += t - 0x31; else if (t >= 0x91 && t <= 0xFE) linear += t - 0x43; else return (EILSEQ); l = (linear / 94) + m; t = (linear % 94) + 0x21; *idx = (_index_t)((l << 8) | t); *csid = 2; } } else return (EILSEQ); return (0); } static __inline int /*ARGSUSED*/ _citrus_JOHAB_stdenc_cstowc(_JOHABEncodingInfo * __restrict ei __unused, wchar_t * __restrict wc, _csid_t csid, _index_t idx) { int m, n, l, linear, t; switch (csid) { case 0: case 1: *wc = (wchar_t)idx; break; case 2: if (idx >= 0x2121 && idx <= 0x2C71) { m = 0xD9; n = 0x21; } else if (idx >= 0x4A21 && idx <= 0x7D7E) { m = 0xE0; n = 0x4A; } else return (EILSEQ); l = ((idx >> 8) & 0xFF) - n; t = (idx & 0xFF) - 0x21; linear = (l * 94) + t; l = (linear / 188) + m; t = linear % 188; t += (t <= 0x4D) ? 0x31 : 0x43; break; default: return (EILSEQ); } return (0); } static __inline int /*ARGSUSED*/ _citrus_JOHAB_stdenc_get_state_desc_generic(_JOHABEncodingInfo * __restrict ei __unused, _JOHABState * __restrict psenc, int * __restrict rstate) { *rstate = (psenc->chlen == 0) ? _STDENC_SDGEN_INITIAL : _STDENC_SDGEN_INCOMPLETE_CHAR; return (0); } /* ---------------------------------------------------------------------- * public interface for stdenc */ _CITRUS_STDENC_DECLS(JOHAB); _CITRUS_STDENC_DEF_OPS(JOHAB); #include "citrus_stdenc_template.h" Index: user/ngie/more-tests/lib/libiconv_modules/MSKanji/citrus_mskanji.c =================================================================== --- user/ngie/more-tests/lib/libiconv_modules/MSKanji/citrus_mskanji.c (revision 281584) +++ user/ngie/more-tests/lib/libiconv_modules/MSKanji/citrus_mskanji.c (revision 281585) @@ -1,475 +1,475 @@ /* $FreeBSD$ */ /* $NetBSD: citrus_mskanji.c,v 1.13 2008/06/14 16:01:08 tnozaki Exp $ */ /*- * Copyright (c)2002 Citrus Project, * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ /* * ja_JP.SJIS locale table for BSD4.4/rune * version 1.0 * (C) Sin'ichiro MIYATANI / Phase One, Inc * May 12, 1995 * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. All advertising materials mentioning features or use of this software * must display the following acknowledgement: * This product includes software developed by Phase One, Inc. * 4. The name of Phase One, Inc. may be used to endorse or promote products * derived from this software without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ #include #include #include #include #include #include #include #include #include #include #include #include "citrus_namespace.h" #include "citrus_types.h" #include "citrus_bcs.h" #include "citrus_module.h" #include "citrus_stdenc.h" #include "citrus_mskanji.h" /* ---------------------------------------------------------------------- * private stuffs used by templates */ typedef struct _MSKanjiState { int chlen; char ch[2]; } _MSKanjiState; typedef struct { int mode; #define MODE_JIS2004 1 } _MSKanjiEncodingInfo; #define _CEI_TO_EI(_cei_) (&(_cei_)->ei) #define _CEI_TO_STATE(_cei_, _func_) (_cei_)->states.s_##_func_ #define _FUNCNAME(m) _citrus_MSKanji_##m #define _ENCODING_INFO _MSKanjiEncodingInfo #define _ENCODING_STATE _MSKanjiState #define _ENCODING_MB_CUR_MAX(_ei_) 2 #define _ENCODING_IS_STATE_DEPENDENT 0 #define _STATE_NEEDS_EXPLICIT_INIT(_ps_) 0 static bool _mskanji1(int c) { return ((c >= 0x81 && c <= 0x9f) || (c >= 0xe0 && c <= 0xfc)); } static bool _mskanji2(int c) { return ((c >= 0x40 && c <= 0x7e) || (c >= 0x80 && c <= 0xfc)); } static __inline void /*ARGSUSED*/ _citrus_MSKanji_init_state(_MSKanjiEncodingInfo * __restrict ei __unused, _MSKanjiState * __restrict s) { s->chlen = 0; } #if 0 static __inline void /*ARGSUSED*/ _citrus_MSKanji_pack_state(_MSKanjiEncodingInfo * __restrict ei __unused, void * __restrict pspriv, const _MSKanjiState * __restrict s) { memcpy(pspriv, (const void *)s, sizeof(*s)); } static __inline void /*ARGSUSED*/ _citrus_MSKanji_unpack_state(_MSKanjiEncodingInfo * __restrict ei __unused, _MSKanjiState * __restrict s, const void * __restrict pspriv) { memcpy((void *)s, pspriv, sizeof(*s)); } #endif static int /*ARGSUSED*/ _citrus_MSKanji_mbrtowc_priv(_MSKanjiEncodingInfo * __restrict ei, - wchar_t * __restrict pwc, const char ** __restrict s, size_t n, + wchar_t * __restrict pwc, char ** __restrict s, size_t n, _MSKanjiState * __restrict psenc, size_t * __restrict nresult) { - const char *s0; + char *s0; wchar_t wchar; int chlenbak, len; s0 = *s; if (s0 == NULL) { _citrus_MSKanji_init_state(ei, psenc); *nresult = 0; /* state independent */ return (0); } chlenbak = psenc->chlen; /* make sure we have the first byte in the buffer */ switch (psenc->chlen) { case 0: if (n < 1) goto restart; psenc->ch[0] = *s0++; psenc->chlen = 1; n--; break; case 1: break; default: /* illegal state */ goto encoding_error; } len = _mskanji1(psenc->ch[0] & 0xff) ? 2 : 1; while (psenc->chlen < len) { if (n < 1) goto restart; psenc->ch[psenc->chlen] = *s0++; psenc->chlen++; n--; } *s = s0; switch (len) { case 1: wchar = psenc->ch[0] & 0xff; break; case 2: if (!_mskanji2(psenc->ch[1] & 0xff)) goto encoding_error; wchar = ((psenc->ch[0] & 0xff) << 8) | (psenc->ch[1] & 0xff); break; default: /* illegal state */ goto encoding_error; } psenc->chlen = 0; if (pwc) *pwc = wchar; *nresult = wchar ? len - chlenbak : 0; return (0); encoding_error: psenc->chlen = 0; *nresult = (size_t)-1; return (EILSEQ); restart: *nresult = (size_t)-2; *s = s0; return (0); } static int _citrus_MSKanji_wcrtomb_priv(_MSKanjiEncodingInfo * __restrict ei __unused, char * __restrict s, size_t n, wchar_t wc, _MSKanjiState * __restrict psenc __unused, size_t * __restrict nresult) { int ret; /* check invalid sequence */ if (wc & ~0xffff) { ret = EILSEQ; goto err; } if (wc & 0xff00) { if (n < 2) { ret = E2BIG; goto err; } s[0] = (wc >> 8) & 0xff; s[1] = wc & 0xff; if (!_mskanji1(s[0] & 0xff) || !_mskanji2(s[1] & 0xff)) { ret = EILSEQ; goto err; } *nresult = 2; return (0); } else { if (n < 1) { ret = E2BIG; goto err; } s[0] = wc & 0xff; if (_mskanji1(s[0] & 0xff)) { ret = EILSEQ; goto err; } *nresult = 1; return (0); } err: *nresult = (size_t)-1; return (ret); } static __inline int /*ARGSUSED*/ _citrus_MSKanji_stdenc_wctocs(_MSKanjiEncodingInfo * __restrict ei, _csid_t * __restrict csid, _index_t * __restrict idx, wchar_t wc) { _index_t col, row; int offset; if ((_wc_t)wc < 0x80) { /* ISO-646 */ *csid = 0; *idx = (_index_t)wc; } else if ((_wc_t)wc < 0x100) { /* KANA */ *csid = 1; *idx = (_index_t)wc & 0x7F; } else { /* Kanji (containing Gaiji zone) */ /* * 94^2 zone (contains a part of Gaiji (0xED40 - 0xEEFC)): * 0x8140 - 0x817E -> 0x2121 - 0x215F * 0x8180 - 0x819E -> 0x2160 - 0x217E * 0x819F - 0x81FC -> 0x2221 - 0x227E * * 0x8240 - 0x827E -> 0x2321 - 0x235F * ... * 0x9F9F - 0x9FFc -> 0x5E21 - 0x5E7E * * 0xE040 - 0xE07E -> 0x5F21 - 0x5F5F * ... * 0xEF9F - 0xEFFC -> 0x7E21 - 0x7E7E * * extended Gaiji zone: * 0xF040 - 0xFCFC * * JIS X0213-plane2: * 0xF040 - 0xF09E -> 0x2121 - 0x217E * 0xF140 - 0xF19E -> 0x2321 - 0x237E * ... * 0xF240 - 0xF29E -> 0x2521 - 0x257E * * 0xF09F - 0xF0FC -> 0x2821 - 0x287E * 0xF29F - 0xF2FC -> 0x2C21 - 0x2C7E * ... * 0xF44F - 0xF49E -> 0x2F21 - 0x2F7E * * 0xF49F - 0xF4FC -> 0x6E21 - 0x6E7E * ... * 0xFC9F - 0xFCFC -> 0x7E21 - 0x7E7E */ row = ((_wc_t)wc >> 8) & 0xFF; col = (_wc_t)wc & 0xFF; if (!_mskanji1(row) || !_mskanji2(col)) return (EILSEQ); if ((ei->mode & MODE_JIS2004) == 0 || row < 0xF0) { *csid = 2; offset = 0x81; } else { *csid = 3; if ((_wc_t)wc <= 0xF49E) { offset = (_wc_t)wc >= 0xF29F || ((_wc_t)wc >= 0xF09F && (_wc_t)wc <= 0xF0FC) ? 0xED : 0xF0; } else offset = 0xCE; } row -= offset; if (row >= 0x5F) row -= 0x40; row = row * 2 + 0x21; col -= 0x1F; if (col >= 0x61) col -= 1; if (col > 0x7E) { row += 1; col -= 0x5E; } *idx = ((_index_t)row << 8) | col; } return (0); } static __inline int /*ARGSUSED*/ _citrus_MSKanji_stdenc_cstowc(_MSKanjiEncodingInfo * __restrict ei, wchar_t * __restrict wc, _csid_t csid, _index_t idx) { uint32_t col, row; int offset; switch (csid) { case 0: /* ISO-646 */ if (idx >= 0x80) return (EILSEQ); *wc = (wchar_t)idx; break; case 1: /* kana */ if (idx >= 0x80) return (EILSEQ); *wc = (wchar_t)idx + 0x80; break; case 3: if ((ei->mode & MODE_JIS2004) == 0) return (EILSEQ); /*FALLTHROUGH*/ case 2: /* kanji */ row = (idx >> 8); if (row < 0x21) return (EILSEQ); if (csid == 3) { if (row <= 0x2F) offset = (row == 0x22 || row >= 0x26) ? 0xED : 0xF0; else if (row >= 0x4D && row <= 0x7E) offset = 0xCE; else return (EILSEQ); } else { if (row > 0x97) return (EILSEQ); offset = (row < 0x5F) ? 0x81 : 0xC1; } col = idx & 0xFF; if (col < 0x21 || col > 0x7E) return (EILSEQ); row -= 0x21; col -= 0x21; if ((row & 1) == 0) { col += 0x40; if (col >= 0x7F) col += 1; } else col += 0x9F; row = row / 2 + offset; *wc = ((wchar_t)row << 8) | col; break; default: return (EILSEQ); } return (0); } static __inline int /*ARGSUSED*/ _citrus_MSKanji_stdenc_get_state_desc_generic(_MSKanjiEncodingInfo * __restrict ei __unused, _MSKanjiState * __restrict psenc, int * __restrict rstate) { *rstate = (psenc->chlen == 0) ? _STDENC_SDGEN_INITIAL : _STDENC_SDGEN_INCOMPLETE_CHAR; return (0); } static int /*ARGSUSED*/ _citrus_MSKanji_encoding_module_init(_MSKanjiEncodingInfo * __restrict ei, const void * __restrict var, size_t lenvar) { const char *p; p = var; memset((void *)ei, 0, sizeof(*ei)); while (lenvar > 0) { switch (_bcs_toupper(*p)) { case 'J': MATCH(JIS2004, ei->mode |= MODE_JIS2004); break; } ++p; --lenvar; } return (0); } static void _citrus_MSKanji_encoding_module_uninit(_MSKanjiEncodingInfo *ei __unused) { } /* ---------------------------------------------------------------------- * public interface for stdenc */ _CITRUS_STDENC_DECLS(MSKanji); _CITRUS_STDENC_DEF_OPS(MSKanji); #include "citrus_stdenc_template.h" Index: user/ngie/more-tests/lib/libiconv_modules/UES/citrus_ues.c =================================================================== --- user/ngie/more-tests/lib/libiconv_modules/UES/citrus_ues.c (revision 281584) +++ user/ngie/more-tests/lib/libiconv_modules/UES/citrus_ues.c (revision 281585) @@ -1,414 +1,414 @@ /* $FreeBSD$ */ /* $NetBSD: citrus_ues.c,v 1.3 2012/02/12 13:51:29 wiz Exp $ */ /*- * Copyright (c)2006 Citrus Project, * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ #include #include #include #include #include #include #include #include #include #include "citrus_namespace.h" #include "citrus_types.h" #include "citrus_bcs.h" #include "citrus_module.h" #include "citrus_stdenc.h" #include "citrus_ues.h" typedef struct { size_t mb_cur_max; int mode; #define MODE_C99 1 } _UESEncodingInfo; typedef struct { int chlen; char ch[12]; } _UESState; #define _CEI_TO_EI(_cei_) (&(_cei_)->ei) #define _CEI_TO_STATE(_cei_, _func_) (_cei_)->states.s_##_func_ #define _FUNCNAME(m) _citrus_UES_##m #define _ENCODING_INFO _UESEncodingInfo #define _ENCODING_STATE _UESState #define _ENCODING_MB_CUR_MAX(_ei_) (_ei_)->mb_cur_max #define _ENCODING_IS_STATE_DEPENDENT 0 #define _STATE_NEEDS_EXPLICIT_INIT(_ps_) 0 static __inline void /*ARGSUSED*/ _citrus_UES_init_state(_UESEncodingInfo * __restrict ei __unused, _UESState * __restrict psenc) { psenc->chlen = 0; } #if 0 static __inline void /*ARGSUSED*/ _citrus_UES_pack_state(_UESEncodingInfo * __restrict ei __unused, void *__restrict pspriv, const _UESState * __restrict psenc) { memcpy(pspriv, (const void *)psenc, sizeof(*psenc)); } static __inline void /*ARGSUSED*/ _citrus_UES_unpack_state(_UESEncodingInfo * __restrict ei __unused, _UESState * __restrict psenc, const void * __restrict pspriv) { memcpy((void *)psenc, pspriv, sizeof(*psenc)); } #endif static __inline int to_int(int ch) { if (ch >= '0' && ch <= '9') return (ch - '0'); else if (ch >= 'A' && ch <= 'F') return ((ch - 'A') + 10); else if (ch >= 'a' && ch <= 'f') return ((ch - 'a') + 10); return (-1); } #define ESCAPE '\\' #define UCS2_ESC 'u' #define UCS4_ESC 'U' #define UCS2_BIT 16 #define UCS4_BIT 32 #define BMP_MAX UINT32_C(0xFFFF) #define UCS2_MAX UINT32_C(0x10FFFF) #define UCS4_MAX UINT32_C(0x7FFFFFFF) static const char *xdig = "0123456789abcdef"; static __inline int to_str(char *s, wchar_t wc, int bit) { char *p; p = s; *p++ = ESCAPE; switch (bit) { case UCS2_BIT: *p++ = UCS2_ESC; break; case UCS4_BIT: *p++ = UCS4_ESC; break; default: abort(); } do { *p++ = xdig[(wc >> (bit -= 4)) & 0xF]; } while (bit > 0); return (p - s); } static __inline bool is_hi_surrogate(wchar_t wc) { return (wc >= 0xD800 && wc <= 0xDBFF); } static __inline bool is_lo_surrogate(wchar_t wc) { return (wc >= 0xDC00 && wc <= 0xDFFF); } static __inline wchar_t surrogate_to_ucs(wchar_t hi, wchar_t lo) { hi -= 0xD800; lo -= 0xDC00; return ((hi << 10 | lo) + 0x10000); } static __inline void ucs_to_surrogate(wchar_t wc, wchar_t * __restrict hi, wchar_t * __restrict lo) { wc -= 0x10000; *hi = (wc >> 10) + 0xD800; *lo = (wc & 0x3FF) + 0xDC00; } static __inline bool is_basic(wchar_t wc) { return ((uint32_t)wc <= 0x9F && wc != 0x24 && wc != 0x40 && wc != 0x60); } static int _citrus_UES_mbrtowc_priv(_UESEncodingInfo * __restrict ei, - wchar_t * __restrict pwc, const char ** __restrict s, size_t n, + wchar_t * __restrict pwc, char ** __restrict s, size_t n, _UESState * __restrict psenc, size_t * __restrict nresult) { - const char *s0; + char *s0; int ch, head, num, tail; wchar_t hi, wc; if (*s == NULL) { _citrus_UES_init_state(ei, psenc); *nresult = 0; return (0); } s0 = *s; hi = (wchar_t)0; tail = 0; surrogate: wc = (wchar_t)0; head = tail; if (psenc->chlen == head) { if (n-- < 1) goto restart; psenc->ch[psenc->chlen++] = *s0++; } ch = (unsigned char)psenc->ch[head++]; if (ch == ESCAPE) { if (psenc->chlen == head) { if (n-- < 1) goto restart; psenc->ch[psenc->chlen++] = *s0++; } switch (psenc->ch[head]) { case UCS2_ESC: tail += 6; break; case UCS4_ESC: if (ei->mode & MODE_C99) { tail = 10; break; } /*FALLTHROUGH*/ default: tail = 0; } ++head; } for (; head < tail; ++head) { if (psenc->chlen == head) { if (n-- < 1) { restart: *s = s0; *nresult = (size_t)-2; return (0); } psenc->ch[psenc->chlen++] = *s0++; } num = to_int((int)(unsigned char)psenc->ch[head]); if (num < 0) { tail = 0; break; } wc = (wc << 4) | num; } head = 0; switch (tail) { case 0: break; case 6: if (hi != (wchar_t)0) break; if ((ei->mode & MODE_C99) == 0) { if (is_hi_surrogate(wc) != 0) { hi = wc; goto surrogate; } if ((uint32_t)wc <= 0x7F /* XXX */ || is_lo_surrogate(wc) != 0) break; goto done; } /*FALLTHROUGH*/ case 10: if (is_basic(wc) == 0 && (uint32_t)wc <= UCS4_MAX && is_hi_surrogate(wc) == 0 && is_lo_surrogate(wc) == 0) goto done; *nresult = (size_t)-1; return (EILSEQ); case 12: if (is_lo_surrogate(wc) == 0) break; wc = surrogate_to_ucs(hi, wc); goto done; } ch = (unsigned char)psenc->ch[0]; head = psenc->chlen; if (--head > 0) memmove(&psenc->ch[0], &psenc->ch[1], head); wc = (wchar_t)ch; done: psenc->chlen = head; if (pwc != NULL) *pwc = wc; *nresult = (size_t)((wc == 0) ? 0 : (s0 - *s)); *s = s0; return (0); } static int _citrus_UES_wcrtomb_priv(_UESEncodingInfo * __restrict ei, char * __restrict s, size_t n, wchar_t wc, _UESState * __restrict psenc, size_t * __restrict nresult) { wchar_t hi, lo; if (psenc->chlen != 0) return (EINVAL); if ((ei->mode & MODE_C99) ? is_basic(wc) : (uint32_t)wc <= 0x7F) { if (n-- < 1) goto e2big; psenc->ch[psenc->chlen++] = (char)wc; } else if ((uint32_t)wc <= BMP_MAX) { if (n < 6) goto e2big; psenc->chlen = to_str(&psenc->ch[0], wc, UCS2_BIT); } else if ((ei->mode & MODE_C99) == 0 && (uint32_t)wc <= UCS2_MAX) { if (n < 12) goto e2big; ucs_to_surrogate(wc, &hi, &lo); psenc->chlen += to_str(&psenc->ch[0], hi, UCS2_BIT); psenc->chlen += to_str(&psenc->ch[6], lo, UCS2_BIT); } else if ((ei->mode & MODE_C99) && (uint32_t)wc <= UCS4_MAX) { if (n < 10) goto e2big; psenc->chlen = to_str(&psenc->ch[0], wc, UCS4_BIT); } else { *nresult = (size_t)-1; return (EILSEQ); } memcpy(s, psenc->ch, psenc->chlen); *nresult = psenc->chlen; psenc->chlen = 0; return (0); e2big: *nresult = (size_t)-1; return (E2BIG); } /*ARGSUSED*/ static int _citrus_UES_stdenc_wctocs(_UESEncodingInfo * __restrict ei __unused, _csid_t * __restrict csid, _index_t * __restrict idx, wchar_t wc) { *csid = 0; *idx = (_index_t)wc; return (0); } static __inline int /*ARGSUSED*/ _citrus_UES_stdenc_cstowc(_UESEncodingInfo * __restrict ei __unused, wchar_t * __restrict wc, _csid_t csid, _index_t idx) { if (csid != 0) return (EILSEQ); *wc = (wchar_t)idx; return (0); } static __inline int /*ARGSUSED*/ _citrus_UES_stdenc_get_state_desc_generic(_UESEncodingInfo * __restrict ei __unused, _UESState * __restrict psenc, int * __restrict rstate) { *rstate = (psenc->chlen == 0) ? _STDENC_SDGEN_INITIAL : _STDENC_SDGEN_INCOMPLETE_CHAR; return (0); } static void /*ARGSUSED*/ _citrus_UES_encoding_module_uninit(_UESEncodingInfo *ei __unused) { /* ei seems to be unused */ } static int /*ARGSUSED*/ _citrus_UES_encoding_module_init(_UESEncodingInfo * __restrict ei, const void * __restrict var, size_t lenvar) { const char *p; p = var; memset((void *)ei, 0, sizeof(*ei)); while (lenvar > 0) { switch (_bcs_toupper(*p)) { case 'C': MATCH(C99, ei->mode |= MODE_C99); break; } ++p; --lenvar; } ei->mb_cur_max = (ei->mode & MODE_C99) ? 10 : 12; return (0); } /* ---------------------------------------------------------------------- * public interface for stdenc */ _CITRUS_STDENC_DECLS(UES); _CITRUS_STDENC_DEF_OPS(UES); #include "citrus_stdenc_template.h" Index: user/ngie/more-tests/lib/libiconv_modules/UTF1632/citrus_utf1632.c =================================================================== --- user/ngie/more-tests/lib/libiconv_modules/UTF1632/citrus_utf1632.c (revision 281584) +++ user/ngie/more-tests/lib/libiconv_modules/UTF1632/citrus_utf1632.c (revision 281585) @@ -1,453 +1,453 @@ /* $FreeBSD$ */ /* $NetBSD: citrus_utf1632.c,v 1.9 2008/06/14 16:01:08 tnozaki Exp $ */ /*- * Copyright (c)2003 Citrus Project, * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ #include #include #include #include #include #include #include #include #include #include #include #include "citrus_namespace.h" #include "citrus_types.h" #include "citrus_module.h" #include "citrus_stdenc.h" #include "citrus_bcs.h" #include "citrus_utf1632.h" /* ---------------------------------------------------------------------- * private stuffs used by templates */ typedef struct { int chlen; int current_endian; uint8_t ch[4]; } _UTF1632State; #define _ENDIAN_UNKNOWN 0 #define _ENDIAN_BIG 1 #define _ENDIAN_LITTLE 2 #if BYTE_ORDER == BIG_ENDIAN #define _ENDIAN_INTERNAL _ENDIAN_BIG #define _ENDIAN_SWAPPED _ENDIAN_LITTLE #else #define _ENDIAN_INTERNAL _ENDIAN_LITTLE #define _ENDIAN_SWAPPED _ENDIAN_BIG #endif #define _MODE_UTF32 0x00000001U #define _MODE_FORCE_ENDIAN 0x00000002U typedef struct { int preffered_endian; unsigned int cur_max; uint32_t mode; } _UTF1632EncodingInfo; #define _FUNCNAME(m) _citrus_UTF1632_##m #define _ENCODING_INFO _UTF1632EncodingInfo #define _ENCODING_STATE _UTF1632State #define _ENCODING_MB_CUR_MAX(_ei_) ((_ei_)->cur_max) #define _ENCODING_IS_STATE_DEPENDENT 0 #define _STATE_NEEDS_EXPLICIT_INIT(_ps_) 0 static __inline void /*ARGSUSED*/ _citrus_UTF1632_init_state(_UTF1632EncodingInfo *ei __unused, _UTF1632State *s) { memset(s, 0, sizeof(*s)); } static int _citrus_UTF1632_mbrtowc_priv(_UTF1632EncodingInfo *ei, wchar_t *pwc, - const char **s, size_t n, _UTF1632State *psenc, size_t *nresult) + char **s, size_t n, _UTF1632State *psenc, size_t *nresult) { - const char *s0; + char *s0; size_t result; wchar_t wc = L'\0'; int chlenbak, endian, needlen; s0 = *s; if (s0 == NULL) { _citrus_UTF1632_init_state(ei, psenc); *nresult = 0; /* state independent */ return (0); } result = 0; chlenbak = psenc->chlen; refetch: needlen = ((ei->mode & _MODE_UTF32) != 0 || chlenbak >= 2) ? 4 : 2; while (chlenbak < needlen) { if (n == 0) goto restart; psenc->ch[chlenbak++] = *s0++; n--; result++; } /* judge endian marker */ if ((ei->mode & _MODE_UTF32) == 0) { /* UTF16 */ if (psenc->ch[0] == 0xFE && psenc->ch[1] == 0xFF) { psenc->current_endian = _ENDIAN_BIG; chlenbak = 0; goto refetch; } else if (psenc->ch[0] == 0xFF && psenc->ch[1] == 0xFE) { psenc->current_endian = _ENDIAN_LITTLE; chlenbak = 0; goto refetch; } } else { /* UTF32 */ if (psenc->ch[0] == 0x00 && psenc->ch[1] == 0x00 && psenc->ch[2] == 0xFE && psenc->ch[3] == 0xFF) { psenc->current_endian = _ENDIAN_BIG; chlenbak = 0; goto refetch; } else if (psenc->ch[0] == 0xFF && psenc->ch[1] == 0xFE && psenc->ch[2] == 0x00 && psenc->ch[3] == 0x00) { psenc->current_endian = _ENDIAN_LITTLE; chlenbak = 0; goto refetch; } } endian = ((ei->mode & _MODE_FORCE_ENDIAN) != 0 || psenc->current_endian == _ENDIAN_UNKNOWN) ? ei->preffered_endian : psenc->current_endian; /* get wc */ if ((ei->mode & _MODE_UTF32) == 0) { /* UTF16 */ if (needlen == 2) { switch (endian) { case _ENDIAN_LITTLE: wc = (psenc->ch[0] | ((wchar_t)psenc->ch[1] << 8)); break; case _ENDIAN_BIG: wc = (psenc->ch[1] | ((wchar_t)psenc->ch[0] << 8)); break; default: goto ilseq; } if (wc >= 0xD800 && wc <= 0xDBFF) { /* surrogate high */ needlen = 4; goto refetch; } } else { /* surrogate low */ wc -= 0xD800; /* wc : surrogate high (see above) */ wc <<= 10; switch (endian) { case _ENDIAN_LITTLE: if (psenc->ch[3] < 0xDC || psenc->ch[3] > 0xDF) goto ilseq; wc |= psenc->ch[2]; wc |= (wchar_t)(psenc->ch[3] & 3) << 8; break; case _ENDIAN_BIG: if (psenc->ch[2]<0xDC || psenc->ch[2]>0xDF) goto ilseq; wc |= psenc->ch[3]; wc |= (wchar_t)(psenc->ch[2] & 3) << 8; break; default: goto ilseq; } wc += 0x10000; } } else { /* UTF32 */ switch (endian) { case _ENDIAN_LITTLE: wc = (psenc->ch[0] | ((wchar_t)psenc->ch[1] << 8) | ((wchar_t)psenc->ch[2] << 16) | ((wchar_t)psenc->ch[3] << 24)); break; case _ENDIAN_BIG: wc = (psenc->ch[3] | ((wchar_t)psenc->ch[2] << 8) | ((wchar_t)psenc->ch[1] << 16) | ((wchar_t)psenc->ch[0] << 24)); break; default: goto ilseq; } if (wc >= 0xD800 && wc <= 0xDFFF) goto ilseq; } *pwc = wc; psenc->chlen = 0; *nresult = result; *s = s0; return (0); ilseq: *nresult = (size_t)-1; psenc->chlen = 0; return (EILSEQ); restart: *nresult = (size_t)-2; psenc->chlen = chlenbak; *s = s0; return (0); } static int _citrus_UTF1632_wcrtomb_priv(_UTF1632EncodingInfo *ei, char *s, size_t n, wchar_t wc, _UTF1632State *psenc, size_t *nresult) { wchar_t wc2; static const char _bom[4] = { 0x00, 0x00, 0xFE, 0xFF, }; const char *bom = &_bom[0]; size_t cnt; cnt = (size_t)0; if (psenc->current_endian == _ENDIAN_UNKNOWN) { if ((ei->mode & _MODE_FORCE_ENDIAN) == 0) { if (ei->mode & _MODE_UTF32) cnt = 4; else { cnt = 2; bom += 2; } if (n < cnt) goto e2big; memcpy(s, bom, cnt); s += cnt, n -= cnt; } psenc->current_endian = ei->preffered_endian; } wc2 = 0; if ((ei->mode & _MODE_UTF32)==0) { /* UTF16 */ if (wc > 0xFFFF) { /* surrogate */ if (wc > 0x10FFFF) goto ilseq; if (n < 4) goto e2big; cnt += 4; wc -= 0x10000; wc2 = (wc & 0x3FF) | 0xDC00; wc = (wc>>10) | 0xD800; } else { if (n < 2) goto e2big; cnt += 2; } surrogate: switch (psenc->current_endian) { case _ENDIAN_BIG: s[1] = wc; s[0] = (wc >>= 8); break; case _ENDIAN_LITTLE: s[0] = wc; s[1] = (wc >>= 8); break; } if (wc2 != 0) { wc = wc2; wc2 = 0; s += 2; goto surrogate; } } else { /* UTF32 */ if (wc >= 0xD800 && wc <= 0xDFFF) goto ilseq; if (n < 4) goto e2big; cnt += 4; switch (psenc->current_endian) { case _ENDIAN_BIG: s[3] = wc; s[2] = (wc >>= 8); s[1] = (wc >>= 8); s[0] = (wc >>= 8); break; case _ENDIAN_LITTLE: s[0] = wc; s[1] = (wc >>= 8); s[2] = (wc >>= 8); s[3] = (wc >>= 8); break; } } *nresult = cnt; return (0); ilseq: *nresult = (size_t)-1; return (EILSEQ); e2big: *nresult = (size_t)-1; return (E2BIG); } static void parse_variable(_UTF1632EncodingInfo * __restrict ei, const void * __restrict var, size_t lenvar) { const char *p; p = var; while (lenvar > 0) { switch (*p) { case 'B': case 'b': MATCH(big, ei->preffered_endian = _ENDIAN_BIG); break; case 'L': case 'l': MATCH(little, ei->preffered_endian = _ENDIAN_LITTLE); break; case 'i': case 'I': MATCH(internal, ei->preffered_endian = _ENDIAN_INTERNAL); break; case 's': case 'S': MATCH(swapped, ei->preffered_endian = _ENDIAN_SWAPPED); break; case 'F': case 'f': MATCH(force, ei->mode |= _MODE_FORCE_ENDIAN); break; case 'U': case 'u': MATCH(utf32, ei->mode |= _MODE_UTF32); break; } p++; lenvar--; } } static int /*ARGSUSED*/ _citrus_UTF1632_encoding_module_init(_UTF1632EncodingInfo * __restrict ei, const void * __restrict var, size_t lenvar) { memset((void *)ei, 0, sizeof(*ei)); parse_variable(ei, var, lenvar); ei->cur_max = ((ei->mode&_MODE_UTF32) == 0) ? 6 : 8; /* 6: endian + surrogate */ /* 8: endian + normal */ if (ei->preffered_endian == _ENDIAN_UNKNOWN) { ei->preffered_endian = _ENDIAN_BIG; } return (0); } static void /*ARGSUSED*/ _citrus_UTF1632_encoding_module_uninit(_UTF1632EncodingInfo *ei __unused) { } static __inline int /*ARGSUSED*/ _citrus_UTF1632_stdenc_wctocs(_UTF1632EncodingInfo * __restrict ei __unused, _csid_t * __restrict csid, _index_t * __restrict idx, _wc_t wc) { *csid = 0; *idx = (_index_t)wc; return (0); } static __inline int /*ARGSUSED*/ _citrus_UTF1632_stdenc_cstowc(_UTF1632EncodingInfo * __restrict ei __unused, _wc_t * __restrict wc, _csid_t csid, _index_t idx) { if (csid != 0) return (EILSEQ); *wc = (_wc_t)idx; return (0); } static __inline int /*ARGSUSED*/ _citrus_UTF1632_stdenc_get_state_desc_generic(_UTF1632EncodingInfo * __restrict ei __unused, _UTF1632State * __restrict psenc, int * __restrict rstate) { *rstate = (psenc->chlen == 0) ? _STDENC_SDGEN_INITIAL : _STDENC_SDGEN_INCOMPLETE_CHAR; return (0); } /* ---------------------------------------------------------------------- * public interface for stdenc */ _CITRUS_STDENC_DECLS(UTF1632); _CITRUS_STDENC_DEF_OPS(UTF1632); #include "citrus_stdenc_template.h" Index: user/ngie/more-tests/lib/libiconv_modules/UTF7/citrus_utf7.c =================================================================== --- user/ngie/more-tests/lib/libiconv_modules/UTF7/citrus_utf7.c (revision 281584) +++ user/ngie/more-tests/lib/libiconv_modules/UTF7/citrus_utf7.c (revision 281585) @@ -1,502 +1,502 @@ /* $FreeBSD$ */ /* $NetBSD: citrus_utf7.c,v 1.5 2006/08/23 12:57:24 tnozaki Exp $ */ /*- * Copyright (c)2004, 2005 Citrus Project, * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * */ #include #include #include #include #include #include #include #include #include #include "citrus_namespace.h" #include "citrus_types.h" #include "citrus_module.h" #include "citrus_stdenc.h" #include "citrus_utf7.h" /* ---------------------------------------------------------------------- * private stuffs used by templates */ #define EI_MASK UINT16_C(0xff) #define EI_DIRECT UINT16_C(0x100) #define EI_OPTION UINT16_C(0x200) #define EI_SPACE UINT16_C(0x400) typedef struct { uint16_t cell[0x80]; } _UTF7EncodingInfo; typedef struct { unsigned int mode: 1, /* whether base64 mode */ bits: 4, /* need to hold 0 - 15 */ cache: 22, /* 22 = BASE64_BIT + UTF16_BIT */ surrogate: 1; /* whether surrogate pair or not */ int chlen; char ch[4]; /* BASE64_IN, 3 * 6 = 18, most closed to UTF16_BIT */ } _UTF7State; #define _CEI_TO_EI(_cei_) (&(_cei_)->ei) #define _CEI_TO_STATE(_cei_, _func_) (_cei_)->states.s_##_func_ #define _FUNCNAME(m) _citrus_UTF7_##m #define _ENCODING_INFO _UTF7EncodingInfo #define _ENCODING_STATE _UTF7State #define _ENCODING_MB_CUR_MAX(_ei_) 4 #define _ENCODING_IS_STATE_DEPENDENT 1 #define _STATE_NEEDS_EXPLICIT_INIT(_ps_) 0 static __inline void /*ARGSUSED*/ _citrus_UTF7_init_state(_UTF7EncodingInfo * __restrict ei __unused, _UTF7State * __restrict s) { memset((void *)s, 0, sizeof(*s)); } #if 0 static __inline void /*ARGSUSED*/ _citrus_UTF7_pack_state(_UTF7EncodingInfo * __restrict ei __unused, void *__restrict pspriv, const _UTF7State * __restrict s) { memcpy(pspriv, (const void *)s, sizeof(*s)); } static __inline void /*ARGSUSED*/ _citrus_UTF7_unpack_state(_UTF7EncodingInfo * __restrict ei __unused, _UTF7State * __restrict s, const void * __restrict pspriv) { memcpy((void *)s, pspriv, sizeof(*s)); } #endif static const char base64[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZ" "abcdefghijklmnopqrstuvwxyz" "0123456789+/"; static const char direct[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZ" "abcdefghijklmnopqrstuvwxyz" "0123456789'(),-./:?"; static const char option[] = "!\"#$%&*;<=>@[]^_`{|}"; static const char spaces[] = " \t\r\n"; #define BASE64_BIT 6 #define UTF16_BIT 16 #define BASE64_MAX 0x3f #define UTF16_MAX UINT16_C(0xffff) #define UTF32_MAX UINT32_C(0x10ffff) #define BASE64_IN '+' #define BASE64_OUT '-' #define SHIFT7BIT(c) ((c) >> 7) #define ISSPECIAL(c) ((c) == '\0' || (c) == BASE64_IN) #define FINDLEN(ei, c) \ (SHIFT7BIT((c)) ? -1 : (((ei)->cell[(c)] & EI_MASK) - 1)) #define ISDIRECT(ei, c) (!SHIFT7BIT((c)) && (ISSPECIAL((c)) || \ ei->cell[(c)] & (EI_DIRECT | EI_OPTION | EI_SPACE))) #define ISSAFE(ei, c) (!SHIFT7BIT((c)) && (ISSPECIAL((c)) || \ (c < 0x80 && ei->cell[(c)] & (EI_DIRECT | EI_SPACE)))) /* surrogate pair */ #define SRG_BASE UINT32_C(0x10000) #define HISRG_MIN UINT16_C(0xd800) #define HISRG_MAX UINT16_C(0xdbff) #define LOSRG_MIN UINT16_C(0xdc00) #define LOSRG_MAX UINT16_C(0xdfff) static int _citrus_UTF7_mbtoutf16(_UTF7EncodingInfo * __restrict ei, - uint16_t * __restrict u16, const char ** __restrict s, size_t n, + uint16_t * __restrict u16, char ** __restrict s, size_t n, _UTF7State * __restrict psenc, size_t * __restrict nresult) { _UTF7State sv; - const char *s0; + char *s0; int done, i, len; s0 = *s; sv = *psenc; for (i = 0, done = 0; done == 0; i++) { if (i == psenc->chlen) { if (n-- < 1) { *nresult = (size_t)-2; *s = s0; sv.chlen = psenc->chlen; memcpy(sv.ch, psenc->ch, sizeof(sv.ch)); *psenc = sv; return (0); } psenc->ch[psenc->chlen++] = *s0++; } if (SHIFT7BIT((int)psenc->ch[i])) goto ilseq; if (!psenc->mode) { if (psenc->bits > 0 || psenc->cache > 0) return (EINVAL); if (psenc->ch[i] == BASE64_IN) psenc->mode = 1; else { if (!ISDIRECT(ei, (int)psenc->ch[i])) goto ilseq; *u16 = (uint16_t)psenc->ch[i]; done = 1; continue; } } else { if (psenc->ch[i] == BASE64_OUT && psenc->cache == 0) { psenc->mode = 0; *u16 = (uint16_t)BASE64_IN; done = 1; continue; } len = FINDLEN(ei, (int)psenc->ch[i]); if (len < 0) { if (psenc->bits >= BASE64_BIT) return (EINVAL); psenc->mode = 0; psenc->bits = psenc->cache = 0; if (psenc->ch[i] != BASE64_OUT) { if (!ISDIRECT(ei, (int)psenc->ch[i])) goto ilseq; *u16 = (uint16_t)psenc->ch[i]; done = 1; } else { psenc->chlen--; i--; } } else { psenc->cache = (psenc->cache << BASE64_BIT) | len; switch (psenc->bits) { case 0: case 2: case 4: case 6: case 8: psenc->bits += BASE64_BIT; break; case 10: case 12: case 14: psenc->bits -= (UTF16_BIT - BASE64_BIT); *u16 = (psenc->cache >> psenc->bits) & UTF16_MAX; done = 1; break; default: return (EINVAL); } } } } if (psenc->chlen > i) return (EINVAL); psenc->chlen = 0; *nresult = (size_t)((*u16 == 0) ? 0 : s0 - *s); *s = s0; return (0); ilseq: *nresult = (size_t)-1; return (EILSEQ); } static int _citrus_UTF7_mbrtowc_priv(_UTF7EncodingInfo * __restrict ei, - wchar_t * __restrict pwc, const char ** __restrict s, size_t n, + wchar_t * __restrict pwc, char ** __restrict s, size_t n, _UTF7State * __restrict psenc, size_t * __restrict nresult) { uint32_t u32; uint16_t hi, lo; size_t nr, siz; int err; if (*s == NULL) { _citrus_UTF7_init_state(ei, psenc); *nresult = (size_t)_ENCODING_IS_STATE_DEPENDENT; return (0); } if (psenc->surrogate) { hi = (psenc->cache >> psenc->bits) & UTF16_MAX; if (hi < HISRG_MIN || hi > HISRG_MAX) return (EINVAL); siz = 0; } else { err = _citrus_UTF7_mbtoutf16(ei, &hi, s, n, psenc, &nr); if (nr == (size_t)-1 || nr == (size_t)-2) { *nresult = nr; return (err); } if (err != 0) return (err); n -= nr; siz = nr; if (hi < HISRG_MIN || hi > HISRG_MAX) { u32 = (uint32_t)hi; goto done; } psenc->surrogate = 1; } err = _citrus_UTF7_mbtoutf16(ei, &lo, s, n, psenc, &nr); if (nr == (size_t)-1 || nr == (size_t)-2) { *nresult = nr; return (err); } if (err != 0) return (err); hi -= HISRG_MIN; lo -= LOSRG_MIN; u32 = (hi << 10 | lo) + SRG_BASE; siz += nr; done: if (pwc != NULL) *pwc = (wchar_t)u32; if (u32 == (uint32_t)0) { *nresult = (size_t)0; _citrus_UTF7_init_state(ei, psenc); } else { *nresult = siz; psenc->surrogate = 0; } return (err); } static int _citrus_UTF7_utf16tomb(_UTF7EncodingInfo * __restrict ei, char * __restrict s, size_t n __unused, uint16_t u16, _UTF7State * __restrict psenc, size_t * __restrict nresult) { int bits, i; if (psenc->chlen != 0 || psenc->bits > BASE64_BIT) return (EINVAL); if (ISSAFE(ei, u16)) { if (psenc->mode) { if (psenc->bits > 0) { bits = BASE64_BIT - psenc->bits; i = (psenc->cache << bits) & BASE64_MAX; psenc->ch[psenc->chlen++] = base64[i]; psenc->bits = psenc->cache = 0; } if (u16 == BASE64_OUT || FINDLEN(ei, u16) >= 0) psenc->ch[psenc->chlen++] = BASE64_OUT; psenc->mode = 0; } if (psenc->bits != 0) return (EINVAL); psenc->ch[psenc->chlen++] = (char)u16; if (u16 == BASE64_IN) psenc->ch[psenc->chlen++] = BASE64_OUT; } else { if (!psenc->mode) { if (psenc->bits > 0) return (EINVAL); psenc->ch[psenc->chlen++] = BASE64_IN; psenc->mode = 1; } psenc->cache = (psenc->cache << UTF16_BIT) | u16; bits = UTF16_BIT + psenc->bits; psenc->bits = bits % BASE64_BIT; while ((bits -= BASE64_BIT) >= 0) { i = (psenc->cache >> bits) & BASE64_MAX; psenc->ch[psenc->chlen++] = base64[i]; } } memcpy(s, psenc->ch, psenc->chlen); *nresult = psenc->chlen; psenc->chlen = 0; return (0); } static int _citrus_UTF7_wcrtomb_priv(_UTF7EncodingInfo * __restrict ei, char * __restrict s, size_t n, wchar_t wchar, _UTF7State * __restrict psenc, size_t * __restrict nresult) { uint32_t u32; uint16_t u16[2]; int err, i, len; size_t nr, siz; u32 = (uint32_t)wchar; if (u32 <= UTF16_MAX) { u16[0] = (uint16_t)u32; len = 1; } else if (u32 <= UTF32_MAX) { u32 -= SRG_BASE; u16[0] = (u32 >> 10) + HISRG_MIN; u16[1] = ((uint16_t)(u32 & UINT32_C(0x3ff))) + LOSRG_MIN; len = 2; } else { *nresult = (size_t)-1; return (EILSEQ); } siz = 0; for (i = 0; i < len; ++i) { err = _citrus_UTF7_utf16tomb(ei, s, n, u16[i], psenc, &nr); if (err != 0) return (err); /* XXX: state has been modified */ s += nr; n -= nr; siz += nr; } *nresult = siz; return (0); } static int /* ARGSUSED */ _citrus_UTF7_put_state_reset(_UTF7EncodingInfo * __restrict ei __unused, char * __restrict s, size_t n, _UTF7State * __restrict psenc, size_t * __restrict nresult) { int bits, pos; if (psenc->chlen != 0 || psenc->bits > BASE64_BIT || psenc->surrogate) return (EINVAL); if (psenc->mode) { if (psenc->bits > 0) { if (n-- < 1) return (E2BIG); bits = BASE64_BIT - psenc->bits; pos = (psenc->cache << bits) & BASE64_MAX; psenc->ch[psenc->chlen++] = base64[pos]; psenc->ch[psenc->chlen++] = BASE64_OUT; psenc->bits = psenc->cache = 0; } psenc->mode = 0; } if (psenc->bits != 0) return (EINVAL); if (n-- < 1) return (E2BIG); *nresult = (size_t)psenc->chlen; if (psenc->chlen > 0) { memcpy(s, psenc->ch, psenc->chlen); psenc->chlen = 0; } return (0); } static __inline int /*ARGSUSED*/ _citrus_UTF7_stdenc_wctocs(_UTF7EncodingInfo * __restrict ei __unused, _csid_t * __restrict csid, _index_t * __restrict idx, wchar_t wc) { *csid = 0; *idx = (_index_t)wc; return (0); } static __inline int /*ARGSUSED*/ _citrus_UTF7_stdenc_cstowc(_UTF7EncodingInfo * __restrict ei __unused, wchar_t * __restrict wc, _csid_t csid, _index_t idx) { if (csid != 0) return (EILSEQ); *wc = (wchar_t)idx; return (0); } static __inline int /*ARGSUSED*/ _citrus_UTF7_stdenc_get_state_desc_generic(_UTF7EncodingInfo * __restrict ei __unused, _UTF7State * __restrict psenc, int * __restrict rstate) { *rstate = (psenc->chlen == 0) ? _STDENC_SDGEN_INITIAL : _STDENC_SDGEN_INCOMPLETE_CHAR; return (0); } static void /*ARGSUSED*/ _citrus_UTF7_encoding_module_uninit(_UTF7EncodingInfo *ei __unused) { /* ei seems to be unused */ } static int /*ARGSUSED*/ _citrus_UTF7_encoding_module_init(_UTF7EncodingInfo * __restrict ei, const void * __restrict var __unused, size_t lenvar __unused) { const char *s; memset(ei, 0, sizeof(*ei)); #define FILL(str, flag) \ do { \ for (s = str; *s != '\0'; s++) \ ei->cell[*s & 0x7f] |= flag; \ } while (/*CONSTCOND*/0) FILL(base64, (s - base64) + 1); FILL(direct, EI_DIRECT); FILL(option, EI_OPTION); FILL(spaces, EI_SPACE); return (0); } /* ---------------------------------------------------------------------- * public interface for stdenc */ _CITRUS_STDENC_DECLS(UTF7); _CITRUS_STDENC_DEF_OPS(UTF7); #include "citrus_stdenc_template.h" Index: user/ngie/more-tests/lib/libiconv_modules/UTF8/citrus_utf8.c =================================================================== --- user/ngie/more-tests/lib/libiconv_modules/UTF8/citrus_utf8.c (revision 281584) +++ user/ngie/more-tests/lib/libiconv_modules/UTF8/citrus_utf8.c (revision 281585) @@ -1,351 +1,351 @@ /* $FreeBSD$ */ /* $NetBSD: citrus_utf8.c,v 1.17 2008/06/14 16:01:08 tnozaki Exp $ */ /*- * Copyright (c)2002 Citrus Project, * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ /*- * Copyright (c) 1993 * The Regents of the University of California. All rights reserved. * * This code is derived from software contributed to Berkeley by * Paul Borman at Krystal Technologies. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ #include #include #include #include #include #include #include #include #include #include #include #include "citrus_namespace.h" #include "citrus_types.h" #include "citrus_module.h" #include "citrus_stdenc.h" #include "citrus_utf8.h" /* ---------------------------------------------------------------------- * private stuffs used by templates */ static uint8_t _UTF8_count_array[256] = { 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, /* 00 - 0F */ 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, /* 10 - 1F */ 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, /* 20 - 2F */ 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, /* 30 - 3F */ 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, /* 40 - 4F */ 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, /* 50 - 5F */ 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, /* 60 - 6F */ 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, /* 70 - 7F */ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, /* 80 - 8F */ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, /* 90 - 9F */ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, /* A0 - AF */ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, /* B0 - BF */ 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, /* C0 - CF */ 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, /* D0 - DF */ 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, /* E0 - EF */ 4, 4, 4, 4, 4, 4, 4, 4, 5, 5, 5, 5, 6, 6, 0, 0 /* F0 - FF */ }; static uint8_t const *_UTF8_count = _UTF8_count_array; static const uint32_t _UTF8_range[] = { 0, /*dummy*/ 0x00000000, 0x00000080, 0x00000800, 0x00010000, 0x00200000, 0x04000000, 0x80000000, }; typedef struct { int chlen; char ch[6]; } _UTF8State; typedef void *_UTF8EncodingInfo; #define _CEI_TO_EI(_cei_) (&(_cei_)->ei) #define _CEI_TO_STATE(_ei_, _func_) (_ei_)->states.s_##_func_ #define _FUNCNAME(m) _citrus_UTF8_##m #define _ENCODING_INFO _UTF8EncodingInfo #define _ENCODING_STATE _UTF8State #define _ENCODING_MB_CUR_MAX(_ei_) 6 #define _ENCODING_IS_STATE_DEPENDENT 0 #define _STATE_NEEDS_EXPLICIT_INIT(_ps_) 0 static size_t _UTF8_findlen(wchar_t v) { size_t i; uint32_t c; c = (uint32_t)v; /*XXX*/ for (i = 1; i < sizeof(_UTF8_range) / sizeof(_UTF8_range[0]) - 1; i++) if (c >= _UTF8_range[i] && c < _UTF8_range[i + 1]) return (i); return (-1); /*out of range*/ } static __inline bool _UTF8_surrogate(wchar_t wc) { return (wc >= 0xd800 && wc <= 0xdfff); } static __inline void /*ARGSUSED*/ _citrus_UTF8_init_state(_UTF8EncodingInfo *ei __unused, _UTF8State *s) { s->chlen = 0; } #if 0 static __inline void /*ARGSUSED*/ _citrus_UTF8_pack_state(_UTF8EncodingInfo *ei __unused, void *pspriv, const _UTF8State *s) { memcpy(pspriv, (const void *)s, sizeof(*s)); } static __inline void /*ARGSUSED*/ _citrus_UTF8_unpack_state(_UTF8EncodingInfo *ei __unused, _UTF8State *s, const void *pspriv) { memcpy((void *)s, pspriv, sizeof(*s)); } #endif static int -_citrus_UTF8_mbrtowc_priv(_UTF8EncodingInfo *ei, wchar_t *pwc, const char **s, +_citrus_UTF8_mbrtowc_priv(_UTF8EncodingInfo *ei, wchar_t *pwc, char **s, size_t n, _UTF8State *psenc, size_t *nresult) { - const char *s0; + char *s0; wchar_t wchar; int i; uint8_t c; s0 = *s; if (s0 == NULL) { _citrus_UTF8_init_state(ei, psenc); *nresult = 0; /* state independent */ return (0); } /* make sure we have the first byte in the buffer */ if (psenc->chlen == 0) { if (n-- < 1) goto restart; psenc->ch[psenc->chlen++] = *s0++; } c = _UTF8_count[psenc->ch[0] & 0xff]; if (c < 1 || c < psenc->chlen) goto ilseq; if (c == 1) wchar = psenc->ch[0] & 0xff; else { while (psenc->chlen < c) { if (n-- < 1) goto restart; psenc->ch[psenc->chlen++] = *s0++; } wchar = psenc->ch[0] & (0x7f >> c); for (i = 1; i < c; i++) { if ((psenc->ch[i] & 0xc0) != 0x80) goto ilseq; wchar <<= 6; wchar |= (psenc->ch[i] & 0x3f); } if (_UTF8_surrogate(wchar) || _UTF8_findlen(wchar) != c) goto ilseq; } if (pwc != NULL) *pwc = wchar; *nresult = (wchar == 0) ? 0 : s0 - *s; *s = s0; psenc->chlen = 0; return (0); ilseq: *nresult = (size_t)-1; return (EILSEQ); restart: *s = s0; *nresult = (size_t)-2; return (0); } static int _citrus_UTF8_wcrtomb_priv(_UTF8EncodingInfo *ei __unused, char *s, size_t n, wchar_t wc, _UTF8State *psenc __unused, size_t *nresult) { wchar_t c; size_t cnt; int i, ret; if (_UTF8_surrogate(wc)) { ret = EILSEQ; goto err; } cnt = _UTF8_findlen(wc); if (cnt <= 0 || cnt > 6) { /* invalid UCS4 value */ ret = EILSEQ; goto err; } if (n < cnt) { /* bound check failure */ ret = E2BIG; goto err; } c = wc; if (s) { for (i = cnt - 1; i > 0; i--) { s[i] = 0x80 | (c & 0x3f); c >>= 6; } s[0] = c; if (cnt == 1) s[0] &= 0x7f; else { s[0] &= (0x7f >> cnt); s[0] |= ((0xff00 >> cnt) & 0xff); } } *nresult = (size_t)cnt; return (0); err: *nresult = (size_t)-1; return (ret); } static __inline int /*ARGSUSED*/ _citrus_UTF8_stdenc_wctocs(_UTF8EncodingInfo * __restrict ei __unused, _csid_t * __restrict csid, _index_t * __restrict idx, wchar_t wc) { *csid = 0; *idx = (_citrus_index_t)wc; return (0); } static __inline int /*ARGSUSED*/ _citrus_UTF8_stdenc_cstowc(_UTF8EncodingInfo * __restrict ei __unused, wchar_t * __restrict wc, _csid_t csid, _index_t idx) { if (csid != 0) return (EILSEQ); *wc = (wchar_t)idx; return (0); } static __inline int /*ARGSUSED*/ _citrus_UTF8_stdenc_get_state_desc_generic(_UTF8EncodingInfo * __restrict ei __unused, _UTF8State * __restrict psenc, int * __restrict rstate) { *rstate = (psenc->chlen == 0) ? _STDENC_SDGEN_INITIAL : _STDENC_SDGEN_INCOMPLETE_CHAR; return (0); } static int /*ARGSUSED*/ _citrus_UTF8_encoding_module_init(_UTF8EncodingInfo * __restrict ei __unused, const void * __restrict var __unused, size_t lenvar __unused) { return (0); } static void /*ARGSUSED*/ _citrus_UTF8_encoding_module_uninit(_UTF8EncodingInfo *ei __unused) { } /* ---------------------------------------------------------------------- * public interface for stdenc */ _CITRUS_STDENC_DECLS(UTF8); _CITRUS_STDENC_DEF_OPS(UTF8); #include "citrus_stdenc_template.h" Index: user/ngie/more-tests/lib/libiconv_modules/VIQR/citrus_viqr.c =================================================================== --- user/ngie/more-tests/lib/libiconv_modules/VIQR/citrus_viqr.c (revision 281584) +++ user/ngie/more-tests/lib/libiconv_modules/VIQR/citrus_viqr.c (revision 281585) @@ -1,498 +1,498 @@ /* $FreeBSD$ */ /* $NetBSD: citrus_viqr.c,v 1.5 2011/11/19 18:20:13 tnozaki Exp $ */ /*- * Copyright (c)2006 Citrus Project, * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * */ #include #include #include #include #include #include #include #include #include #include #include #include "citrus_namespace.h" #include "citrus_types.h" #include "citrus_bcs.h" #include "citrus_module.h" #include "citrus_stdenc.h" #include "citrus_viqr.h" #define ESCAPE '\\' /* * this table generated from RFC 1456. */ static const char *mnemonic_rfc1456[0x100] = { NULL , NULL , "A(?", NULL , NULL , "A(~", "A^~", NULL , NULL , NULL , NULL , NULL , NULL , NULL , NULL , NULL , NULL , NULL , NULL , NULL , "Y?" , NULL , NULL , NULL , NULL , "Y~" , NULL , NULL , NULL , NULL , "Y." , NULL , NULL , NULL , NULL , NULL , NULL , NULL , NULL , NULL , NULL , NULL , NULL , NULL , NULL , NULL , NULL , NULL , NULL , NULL , NULL , NULL , NULL , NULL , NULL , NULL , NULL , NULL , NULL , NULL , NULL , NULL , NULL , NULL , NULL , NULL , NULL , NULL , NULL , NULL , NULL , NULL , NULL , NULL , NULL , NULL , NULL , NULL , NULL , NULL , NULL , NULL , NULL , NULL , NULL , NULL , NULL , NULL , NULL , NULL , NULL , NULL , NULL , NULL , NULL , NULL , NULL , NULL , NULL , NULL , NULL , NULL , NULL , NULL , NULL , NULL , NULL , NULL , NULL , NULL , NULL , NULL , NULL , NULL , NULL , NULL , NULL , NULL , NULL , NULL , NULL , NULL , NULL , NULL , NULL , NULL , NULL , NULL , "A." , "A('", "A(`", "A(.", "A^'", "A^`", "A^?", "A^.", "E~" , "E." , "E^'", "E^`", "E^?", "E^~", "E^.", "O^'", "O^`", "O^?", "O^~", "O^.", "O+.", "O+'", "O+`", "O+?", "I." , "O?" , "O." , "I?" , "U?" , "U~" , "U." , "Y`" , "O~" , "a('", "a(`", "a(.", "a^'", "a^`", "a^?", "a^.", "e~" , "e." , "e^'", "e^`", "e^?", "e^~", "e^.", "o^'", "o^`", "o^?", "o^~", "O+~", "O+" , "o^.", "o+`", "o+?", "i." , "U+.", "U+'", "U+`", "U+?", "o+" , "o+'", "U+" , "A`" , "A'" , "A^" , "A~" , "A?" , "A(" , "a(?", "a(~", "E`" , "E'" , "E^" , "E?" , "I`" , "I'" , "I~" , "y`" , "DD" , "u+'", "O`" , "O'" , "O^" , "a." , "y?" , "u+`", "u+?", "U`" , "U'" , "y~" , "y." , "Y'" , "o+~", "u+" , "a`" , "a'" , "a^" , "a~" , "a?" , "a(" , "u+~", "a^~", "e`" , "e'" , "e^" , "e?" , "i`" , "i'" , "i~" , "i?" , "dd" , "u+.", "o`" , "o'" , "o^" , "o~" , "o?" , "o." , "u." , "u`" , "u'" , "u~" , "u?" , "y'" , "o+.", "U+~", }; typedef struct { const char *name; wchar_t value; } mnemonic_def_t; static const mnemonic_def_t mnemonic_ext[] = { /* add extra mnemonic here (should be sorted by wchar_t order). */ }; static const size_t mnemonic_ext_size = sizeof(mnemonic_ext) / sizeof(mnemonic_def_t); static const char * mnemonic_ext_find(wchar_t wc, const mnemonic_def_t *head, size_t n) { const mnemonic_def_t *mid; for (; n > 0; n >>= 1) { mid = head + (n >> 1); if (mid->value == wc) return (mid->name); else if (mid->value < wc) { head = mid + 1; --n; } } return (NULL); } struct mnemonic_t; typedef TAILQ_HEAD(mnemonic_list_t, mnemonic_t) mnemonic_list_t; typedef struct mnemonic_t { TAILQ_ENTRY(mnemonic_t) entry; struct mnemonic_t *parent; mnemonic_list_t child; wchar_t value; int ascii; } mnemonic_t; static mnemonic_t * mnemonic_list_find(mnemonic_list_t *ml, int ch) { mnemonic_t *m; TAILQ_FOREACH(m, ml, entry) { if (m->ascii == ch) return (m); } return (NULL); } static mnemonic_t * mnemonic_create(mnemonic_t *parent, int ascii, wchar_t value) { mnemonic_t *m; m = malloc(sizeof(*m)); if (m != NULL) { m->parent = parent; m->ascii = ascii; m->value = value; TAILQ_INIT(&m->child); } return (m); } static int mnemonic_append_child(mnemonic_t *m, const char *s, wchar_t value, wchar_t invalid) { mnemonic_t *m0; int ch; ch = (unsigned char)*s++; if (ch == '\0') return (EINVAL); m0 = mnemonic_list_find(&m->child, ch); if (m0 == NULL) { m0 = mnemonic_create(m, ch, (wchar_t)ch); if (m0 == NULL) return (ENOMEM); TAILQ_INSERT_TAIL(&m->child, m0, entry); } m = m0; for (m0 = NULL; (ch = (unsigned char)*s) != '\0'; ++s) { m0 = mnemonic_list_find(&m->child, ch); if (m0 == NULL) { m0 = mnemonic_create(m, ch, invalid); if (m0 == NULL) return (ENOMEM); TAILQ_INSERT_TAIL(&m->child, m0, entry); } m = m0; } if (m0 == NULL) return (EINVAL); m0->value = value; return (0); } static void mnemonic_destroy(mnemonic_t *m) { mnemonic_t *m0; TAILQ_FOREACH(m0, &m->child, entry) mnemonic_destroy(m0); free(m); } typedef struct { mnemonic_t *mroot; wchar_t invalid; size_t mb_cur_max; } _VIQREncodingInfo; typedef struct { int chlen; char ch[MB_LEN_MAX]; } _VIQRState; #define _CEI_TO_EI(_cei_) (&(_cei_)->ei) #define _CEI_TO_STATE(_cei_, _func_) (_cei_)->states.s_##_func_ #define _FUNCNAME(m) _citrus_VIQR_##m #define _ENCODING_INFO _VIQREncodingInfo #define _ENCODING_STATE _VIQRState #define _ENCODING_MB_CUR_MAX(_ei_) (_ei_)->mb_cur_max #define _ENCODING_IS_STATE_DEPENDENT 1 #define _STATE_NEEDS_EXPLICIT_INIT(_ps_) 0 static __inline void /*ARGSUSED*/ _citrus_VIQR_init_state(_VIQREncodingInfo * __restrict ei __unused, _VIQRState * __restrict psenc) { psenc->chlen = 0; } #if 0 static __inline void /*ARGSUSED*/ _citrus_VIQR_pack_state(_VIQREncodingInfo * __restrict ei __unused, void *__restrict pspriv, const _VIQRState * __restrict psenc) { memcpy(pspriv, (const void *)psenc, sizeof(*psenc)); } static __inline void /*ARGSUSED*/ _citrus_VIQR_unpack_state(_VIQREncodingInfo * __restrict ei __unused, _VIQRState * __restrict psenc, const void * __restrict pspriv) { memcpy((void *)psenc, pspriv, sizeof(*psenc)); } #endif static int _citrus_VIQR_mbrtowc_priv(_VIQREncodingInfo * __restrict ei, - wchar_t * __restrict pwc, const char ** __restrict s, size_t n, + wchar_t * __restrict pwc, char ** __restrict s, size_t n, _VIQRState * __restrict psenc, size_t * __restrict nresult) { mnemonic_t *m, *m0; - const char *s0; + char *s0; wchar_t wc; ssize_t i; int ch, escape; if (*s == NULL) { _citrus_VIQR_init_state(ei, psenc); *nresult = (size_t)_ENCODING_IS_STATE_DEPENDENT; return (0); } s0 = *s; i = 0; m = ei->mroot; for (escape = 0;;) { if (psenc->chlen == i) { if (n-- < 1) { *s = s0; *nresult = (size_t)-2; return (0); } psenc->ch[psenc->chlen++] = *s0++; } ch = (unsigned char)psenc->ch[i++]; if (ch == ESCAPE) { if (m != ei->mroot) break; escape = 1; continue; } if (escape != 0) break; m0 = mnemonic_list_find(&m->child, ch); if (m0 == NULL) break; m = m0; } while (m != ei->mroot) { --i; if (m->value != ei->invalid) break; m = m->parent; } if (ch == ESCAPE && m != ei->mroot) ++i; psenc->chlen -= i; memmove(&psenc->ch[0], &psenc->ch[i], psenc->chlen); wc = (m == ei->mroot) ? (wchar_t)ch : m->value; if (pwc != NULL) *pwc = wc; *nresult = (size_t)(wc == 0 ? 0 : s0 - *s); *s = s0; return (0); } static int _citrus_VIQR_wcrtomb_priv(_VIQREncodingInfo * __restrict ei, char * __restrict s, size_t n, wchar_t wc, _VIQRState * __restrict psenc, size_t * __restrict nresult) { mnemonic_t *m; const char *p; int ch = 0; switch (psenc->chlen) { case 0: case 1: break; default: return (EINVAL); } m = NULL; if ((uint32_t)wc <= 0xFF) { p = mnemonic_rfc1456[wc & 0xFF]; if (p != NULL) goto mnemonic_found; if (n-- < 1) goto e2big; ch = (unsigned int)wc; m = ei->mroot; if (psenc->chlen > 0) { m = mnemonic_list_find(&m->child, psenc->ch[0]); if (m == NULL) return (EINVAL); psenc->ch[0] = ESCAPE; } if (mnemonic_list_find(&m->child, ch) == NULL) { psenc->chlen = 0; m = NULL; } psenc->ch[psenc->chlen++] = ch; } else { p = mnemonic_ext_find(wc, &mnemonic_ext[0], mnemonic_ext_size); if (p == NULL) { *nresult = (size_t)-1; return (EILSEQ); } else { mnemonic_found: psenc->chlen = 0; while (*p != '\0') { if (n-- < 1) goto e2big; psenc->ch[psenc->chlen++] = *p++; } } } memcpy(s, psenc->ch, psenc->chlen); *nresult = psenc->chlen; if (m == ei->mroot) { psenc->ch[0] = ch; psenc->chlen = 1; } else psenc->chlen = 0; return (0); e2big: *nresult = (size_t)-1; return (E2BIG); } static int /* ARGSUSED */ _citrus_VIQR_put_state_reset(_VIQREncodingInfo * __restrict ei __unused, char * __restrict s __unused, size_t n __unused, _VIQRState * __restrict psenc, size_t * __restrict nresult) { switch (psenc->chlen) { case 0: case 1: break; default: return (EINVAL); } *nresult = 0; psenc->chlen = 0; return (0); } static __inline int /*ARGSUSED*/ _citrus_VIQR_stdenc_wctocs(_VIQREncodingInfo * __restrict ei __unused, _csid_t * __restrict csid, _index_t * __restrict idx, wchar_t wc) { *csid = 0; *idx = (_index_t)wc; return (0); } static __inline int /*ARGSUSED*/ _citrus_VIQR_stdenc_cstowc(_VIQREncodingInfo * __restrict ei __unused, wchar_t * __restrict pwc, _csid_t csid, _index_t idx) { if (csid != 0) return (EILSEQ); *pwc = (wchar_t)idx; return (0); } static void _citrus_VIQR_encoding_module_uninit(_VIQREncodingInfo *ei) { mnemonic_destroy(ei->mroot); } static int /*ARGSUSED*/ _citrus_VIQR_encoding_module_init(_VIQREncodingInfo * __restrict ei, const void * __restrict var __unused, size_t lenvar __unused) { const char *s; size_t i, n; int errnum; ei->mb_cur_max = 1; ei->invalid = (wchar_t)-1; ei->mroot = mnemonic_create(NULL, '\0', ei->invalid); if (ei->mroot == NULL) return (ENOMEM); for (i = 0; i < sizeof(mnemonic_rfc1456) / sizeof(const char *); ++i) { s = mnemonic_rfc1456[i]; if (s == NULL) continue; n = strlen(s); if (ei->mb_cur_max < n) ei->mb_cur_max = n; errnum = mnemonic_append_child(ei->mroot, s, (wchar_t)i, ei->invalid); if (errnum != 0) { _citrus_VIQR_encoding_module_uninit(ei); return (errnum); } } /* a + 1 < b + 1 here to silence gcc warning about unsigned < 0. */ for (i = 0; i + 1 < mnemonic_ext_size + 1; ++i) { const mnemonic_def_t *p; p = &mnemonic_ext[i]; n = strlen(p->name); if (ei->mb_cur_max < n) ei->mb_cur_max = n; errnum = mnemonic_append_child(ei->mroot, p->name, p->value, ei->invalid); if (errnum != 0) { _citrus_VIQR_encoding_module_uninit(ei); return (errnum); } } return (0); } static __inline int /*ARGSUSED*/ _citrus_VIQR_stdenc_get_state_desc_generic(_VIQREncodingInfo * __restrict ei __unused, _VIQRState * __restrict psenc, int * __restrict rstate) { *rstate = (psenc->chlen == 0) ? _STDENC_SDGEN_INITIAL : _STDENC_SDGEN_INCOMPLETE_CHAR; return (0); } /* ---------------------------------------------------------------------- * public interface for stdenc */ _CITRUS_STDENC_DECLS(VIQR); _CITRUS_STDENC_DEF_OPS(VIQR); #include "citrus_stdenc_template.h" Index: user/ngie/more-tests/lib/libiconv_modules/ZW/citrus_zw.c =================================================================== --- user/ngie/more-tests/lib/libiconv_modules/ZW/citrus_zw.c (revision 281584) +++ user/ngie/more-tests/lib/libiconv_modules/ZW/citrus_zw.c (revision 281585) @@ -1,457 +1,457 @@ /* $FreeBSD$ */ /* $NetBSD: citrus_zw.c,v 1.4 2008/06/14 16:01:08 tnozaki Exp $ */ /*- * Copyright (c)2004, 2006 Citrus Project, * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * */ #include #include #include #include #include #include #include #include #include #include #include #include "citrus_namespace.h" #include "citrus_types.h" #include "citrus_module.h" #include "citrus_stdenc.h" #include "citrus_zw.h" /* ---------------------------------------------------------------------- * private stuffs used by templates */ typedef struct { int dummy; } _ZWEncodingInfo; typedef enum { NONE, AMBIGIOUS, ASCII, GB2312 } _ZWCharset; typedef struct { _ZWCharset charset; int chlen; char ch[4]; } _ZWState; #define _CEI_TO_EI(_cei_) (&(_cei_)->ei) #define _CEI_TO_STATE(_cei_, _func_) (_cei_)->states.s_##_func_ #define _FUNCNAME(m) _citrus_ZW_##m #define _ENCODING_INFO _ZWEncodingInfo #define _ENCODING_STATE _ZWState #define _ENCODING_MB_CUR_MAX(_ei_) MB_LEN_MAX #define _ENCODING_IS_STATE_DEPENDENT 1 #define _STATE_NEEDS_EXPLICIT_INIT(_ps_) ((_ps_)->charset != NONE) static __inline void /*ARGSUSED*/ _citrus_ZW_init_state(_ZWEncodingInfo * __restrict ei __unused, _ZWState * __restrict psenc) { psenc->chlen = 0; psenc->charset = NONE; } #if 0 static __inline void /*ARGSUSED*/ _citrus_ZW_pack_state(_ZWEncodingInfo * __restrict ei __unused, void *__restrict pspriv, const _ZWState * __restrict psenc) { memcpy(pspriv, (const void *)psenc, sizeof(*psenc)); } static __inline void /*ARGSUSED*/ _citrus_ZW_unpack_state(_ZWEncodingInfo * __restrict ei __unused, _ZWState * __restrict psenc, const void * __restrict pspriv) { memcpy((void *)psenc, pspriv, sizeof(*psenc)); } #endif static int _citrus_ZW_mbrtowc_priv(_ZWEncodingInfo * __restrict ei, - wchar_t * __restrict pwc, const char **__restrict s, size_t n, + wchar_t * __restrict pwc, char **__restrict s, size_t n, _ZWState * __restrict psenc, size_t * __restrict nresult) { - const char *s0; + char *s0; wchar_t wc; int ch, len; if (*s == NULL) { _citrus_ZW_init_state(ei, psenc); *nresult = (size_t)_ENCODING_IS_STATE_DEPENDENT; return (0); } s0 = *s; len = 0; #define STORE \ do { \ if (n-- < 1) { \ *nresult = (size_t)-2; \ *s = s0; \ return (0); \ } \ ch = (unsigned char)*s0++; \ if (len++ > MB_LEN_MAX || ch > 0x7F)\ goto ilseq; \ psenc->ch[psenc->chlen++] = ch; \ } while (/*CONSTCOND*/0) loop: switch (psenc->charset) { case ASCII: switch (psenc->chlen) { case 0: STORE; switch (psenc->ch[0]) { case '\0': case '\n': psenc->charset = NONE; } /*FALLTHROUGH*/ case 1: break; default: return (EINVAL); } ch = (unsigned char)psenc->ch[0]; if (ch > 0x7F) goto ilseq; wc = (wchar_t)ch; psenc->chlen = 0; break; case NONE: if (psenc->chlen != 0) return (EINVAL); STORE; ch = (unsigned char)psenc->ch[0]; if (ch != 'z') { if (ch != '\n' && ch != '\0') psenc->charset = ASCII; wc = (wchar_t)ch; psenc->chlen = 0; break; } psenc->charset = AMBIGIOUS; psenc->chlen = 0; /* FALLTHROUGH */ case AMBIGIOUS: if (psenc->chlen != 0) return (EINVAL); STORE; if (psenc->ch[0] != 'W') { psenc->charset = ASCII; wc = L'z'; break; } psenc->charset = GB2312; psenc->chlen = 0; /* FALLTHROUGH */ case GB2312: switch (psenc->chlen) { case 0: STORE; ch = (unsigned char)psenc->ch[0]; if (ch == '\0') { psenc->charset = NONE; wc = (wchar_t)ch; psenc->chlen = 0; break; } else if (ch == '\n') { psenc->charset = NONE; psenc->chlen = 0; goto loop; } /*FALLTHROUGH*/ case 1: STORE; if (psenc->ch[0] == ' ') { ch = (unsigned char)psenc->ch[1]; wc = (wchar_t)ch; psenc->chlen = 0; break; } else if (psenc->ch[0] == '#') { ch = (unsigned char)psenc->ch[1]; if (ch == '\n') { psenc->charset = NONE; wc = (wchar_t)ch; psenc->chlen = 0; break; } else if (ch == ' ') { wc = (wchar_t)ch; psenc->chlen = 0; break; } } ch = (unsigned char)psenc->ch[0]; if (ch < 0x21 || ch > 0x7E) goto ilseq; wc = (wchar_t)(ch << 8); ch = (unsigned char)psenc->ch[1]; if (ch < 0x21 || ch > 0x7E) { ilseq: *nresult = (size_t)-1; return (EILSEQ); } wc |= (wchar_t)ch; psenc->chlen = 0; break; default: return (EINVAL); } break; default: return (EINVAL); } if (pwc != NULL) *pwc = wc; *nresult = (size_t)(wc == 0 ? 0 : len); *s = s0; return (0); } static int /*ARGSUSED*/ _citrus_ZW_wcrtomb_priv(_ZWEncodingInfo * __restrict ei __unused, char *__restrict s, size_t n, wchar_t wc, _ZWState * __restrict psenc, size_t * __restrict nresult) { int ch; if (psenc->chlen != 0) return (EINVAL); if ((uint32_t)wc <= 0x7F) { ch = (unsigned char)wc; switch (psenc->charset) { case NONE: if (ch == '\0' || ch == '\n') psenc->ch[psenc->chlen++] = ch; else { if (n < 4) return (E2BIG); n -= 4; psenc->ch[psenc->chlen++] = 'z'; psenc->ch[psenc->chlen++] = 'W'; psenc->ch[psenc->chlen++] = ' '; psenc->ch[psenc->chlen++] = ch; psenc->charset = GB2312; } break; case GB2312: if (n < 2) return (E2BIG); n -= 2; if (ch == '\0') { psenc->ch[psenc->chlen++] = '\n'; psenc->ch[psenc->chlen++] = '\0'; psenc->charset = NONE; } else if (ch == '\n') { psenc->ch[psenc->chlen++] = '#'; psenc->ch[psenc->chlen++] = '\n'; psenc->charset = NONE; } else { psenc->ch[psenc->chlen++] = ' '; psenc->ch[psenc->chlen++] = ch; } break; default: return (EINVAL); } } else if ((uint32_t)wc <= 0x7E7E) { switch (psenc->charset) { case NONE: if (n < 2) return (E2BIG); n -= 2; psenc->ch[psenc->chlen++] = 'z'; psenc->ch[psenc->chlen++] = 'W'; psenc->charset = GB2312; /* FALLTHROUGH*/ case GB2312: if (n < 2) return (E2BIG); n -= 2; ch = (wc >> 8) & 0xFF; if (ch < 0x21 || ch > 0x7E) goto ilseq; psenc->ch[psenc->chlen++] = ch; ch = wc & 0xFF; if (ch < 0x21 || ch > 0x7E) goto ilseq; psenc->ch[psenc->chlen++] = ch; break; default: return (EINVAL); } } else { ilseq: *nresult = (size_t)-1; return (EILSEQ); } memcpy(s, psenc->ch, psenc->chlen); *nresult = psenc->chlen; psenc->chlen = 0; return (0); } static int /*ARGSUSED*/ _citrus_ZW_put_state_reset(_ZWEncodingInfo * __restrict ei __unused, char * __restrict s, size_t n, _ZWState * __restrict psenc, size_t * __restrict nresult) { if (psenc->chlen != 0) return (EINVAL); switch (psenc->charset) { case GB2312: if (n-- < 1) return (E2BIG); psenc->ch[psenc->chlen++] = '\n'; psenc->charset = NONE; /*FALLTHROUGH*/ case NONE: *nresult = psenc->chlen; if (psenc->chlen > 0) { memcpy(s, psenc->ch, psenc->chlen); psenc->chlen = 0; } break; default: return (EINVAL); } return (0); } static __inline int /*ARGSUSED*/ _citrus_ZW_stdenc_get_state_desc_generic(_ZWEncodingInfo * __restrict ei __unused, _ZWState * __restrict psenc, int * __restrict rstate) { switch (psenc->charset) { case NONE: if (psenc->chlen != 0) return (EINVAL); *rstate = _STDENC_SDGEN_INITIAL; break; case AMBIGIOUS: if (psenc->chlen != 0) return (EINVAL); *rstate = _STDENC_SDGEN_INCOMPLETE_SHIFT; break; case ASCII: case GB2312: switch (psenc->chlen) { case 0: *rstate = _STDENC_SDGEN_STABLE; break; case 1: *rstate = (psenc->ch[0] == '#') ? _STDENC_SDGEN_INCOMPLETE_SHIFT : _STDENC_SDGEN_INCOMPLETE_CHAR; break; default: return (EINVAL); } break; default: return (EINVAL); } return (0); } static __inline int /*ARGSUSED*/ _citrus_ZW_stdenc_wctocs(_ZWEncodingInfo * __restrict ei __unused, _csid_t * __restrict csid, _index_t * __restrict idx, wchar_t wc) { *csid = (_csid_t)(wc <= (wchar_t)0x7FU) ? 0 : 1; *idx = (_index_t)wc; return (0); } static __inline int /*ARGSUSED*/ _citrus_ZW_stdenc_cstowc(_ZWEncodingInfo * __restrict ei __unused, wchar_t * __restrict wc, _csid_t csid, _index_t idx) { switch (csid) { case 0: case 1: break; default: return (EINVAL); } *wc = (wchar_t)idx; return (0); } static void /*ARGSUSED*/ _citrus_ZW_encoding_module_uninit(_ZWEncodingInfo *ei __unused) { } static int /*ARGSUSED*/ _citrus_ZW_encoding_module_init(_ZWEncodingInfo * __restrict ei __unused, const void *__restrict var __unused, size_t lenvar __unused) { return (0); } /* ---------------------------------------------------------------------- * public interface for stdenc */ _CITRUS_STDENC_DECLS(ZW); _CITRUS_STDENC_DEF_OPS(ZW); #include "citrus_stdenc_template.h" Index: user/ngie/more-tests/lib/libiconv_modules/iconv_none/citrus_iconv_none.c =================================================================== --- user/ngie/more-tests/lib/libiconv_modules/iconv_none/citrus_iconv_none.c (revision 281584) +++ user/ngie/more-tests/lib/libiconv_modules/iconv_none/citrus_iconv_none.c (revision 281585) @@ -1,127 +1,127 @@ /* $FreeBSD$ */ /* $NetBSD: citrus_iconv_none.c,v 1.3 2011/05/23 14:45:44 joerg Exp $ */ /*- * Copyright (c)2003 Citrus Project, * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ #include #include #include #include #include #include #include #include "citrus_types.h" #include "citrus_module.h" #include "citrus_hash.h" #include "citrus_iconv.h" #include "citrus_iconv_none.h" /* ---------------------------------------------------------------------- */ _CITRUS_ICONV_DECLS(iconv_none); _CITRUS_ICONV_DEF_OPS(iconv_none); /* ---------------------------------------------------------------------- */ int _citrus_iconv_none_iconv_getops(struct _citrus_iconv_ops *ops) { memcpy(ops, &_citrus_iconv_none_iconv_ops, sizeof(_citrus_iconv_none_iconv_ops)); return (0); } static int /*ARGSUSED*/ _citrus_iconv_none_iconv_init_shared( struct _citrus_iconv_shared * __restrict ci, const char * __restrict in __unused, const char * __restrict out __unused) { ci->ci_closure = NULL; return (0); } static void /*ARGSUSED*/ _citrus_iconv_none_iconv_uninit_shared(struct _citrus_iconv_shared *ci __unused) { } static int /*ARGSUSED*/ _citrus_iconv_none_iconv_init_context(struct _citrus_iconv *cv) { cv->cv_closure = NULL; return (0); } static void /*ARGSUSED*/ _citrus_iconv_none_iconv_uninit_context(struct _citrus_iconv *cv __unused) { } static int /*ARGSUSED*/ _citrus_iconv_none_iconv_convert(struct _citrus_iconv * __restrict ci __unused, - const char * __restrict * __restrict in, size_t * __restrict inbytes, + char * __restrict * __restrict in, size_t * __restrict inbytes, char * __restrict * __restrict out, size_t * __restrict outbytes, uint32_t flags __unused, size_t * __restrict invalids) { size_t len; int e2big; if ((in == NULL) || (out == NULL) || (inbytes == NULL)) return (0); if ((*in == NULL) || (*out == NULL) || (*inbytes == 0) || (*outbytes == 0)) return (0); len = *inbytes; e2big = 0; if (*outbytes #include #include #include #include #include #include #include #include #include #include "citrus_namespace.h" #include "citrus_types.h" #include "citrus_module.h" #include "citrus_region.h" #include "citrus_mmap.h" #include "citrus_hash.h" #include "citrus_iconv.h" #include "citrus_stdenc.h" #include "citrus_mapper.h" #include "citrus_csmapper.h" #include "citrus_memstream.h" #include "citrus_iconv_std.h" #include "citrus_esdb.h" /* ---------------------------------------------------------------------- */ _CITRUS_ICONV_DECLS(iconv_std); _CITRUS_ICONV_DEF_OPS(iconv_std); /* ---------------------------------------------------------------------- */ int _citrus_iconv_std_iconv_getops(struct _citrus_iconv_ops *ops) { memcpy(ops, &_citrus_iconv_std_iconv_ops, sizeof(_citrus_iconv_std_iconv_ops)); return (0); } /* ---------------------------------------------------------------------- */ /* * convenience routines for stdenc. */ static __inline void save_encoding_state(struct _citrus_iconv_std_encoding *se) { if (se->se_ps) memcpy(se->se_pssaved, se->se_ps, _stdenc_get_state_size(se->se_handle)); } static __inline void restore_encoding_state(struct _citrus_iconv_std_encoding *se) { if (se->se_ps) memcpy(se->se_ps, se->se_pssaved, _stdenc_get_state_size(se->se_handle)); } static __inline void init_encoding_state(struct _citrus_iconv_std_encoding *se) { if (se->se_ps) _stdenc_init_state(se->se_handle, se->se_ps); } static __inline int mbtocsx(struct _citrus_iconv_std_encoding *se, - _csid_t *csid, _index_t *idx, const char **s, size_t n, size_t *nresult, + _csid_t *csid, _index_t *idx, char **s, size_t n, size_t *nresult, struct iconv_hooks *hooks) { return (_stdenc_mbtocs(se->se_handle, csid, idx, s, n, se->se_ps, nresult, hooks)); } static __inline int cstombx(struct _citrus_iconv_std_encoding *se, char *s, size_t n, _csid_t csid, _index_t idx, size_t *nresult, struct iconv_hooks *hooks) { return (_stdenc_cstomb(se->se_handle, s, n, csid, idx, se->se_ps, nresult, hooks)); } static __inline int wctombx(struct _citrus_iconv_std_encoding *se, char *s, size_t n, _wc_t wc, size_t *nresult, struct iconv_hooks *hooks) { return (_stdenc_wctomb(se->se_handle, s, n, wc, se->se_ps, nresult, hooks)); } static __inline int put_state_resetx(struct _citrus_iconv_std_encoding *se, char *s, size_t n, size_t *nresult) { return (_stdenc_put_state_reset(se->se_handle, s, n, se->se_ps, nresult)); } static __inline int get_state_desc_gen(struct _citrus_iconv_std_encoding *se, int *rstate) { struct _stdenc_state_desc ssd; int ret; ret = _stdenc_get_state_desc(se->se_handle, se->se_ps, _STDENC_SDID_GENERIC, &ssd); if (!ret) *rstate = ssd.u.generic.state; return (ret); } /* * init encoding context */ static int init_encoding(struct _citrus_iconv_std_encoding *se, struct _stdenc *cs, void *ps1, void *ps2) { int ret = -1; se->se_handle = cs; se->se_ps = ps1; se->se_pssaved = ps2; if (se->se_ps) ret = _stdenc_init_state(cs, se->se_ps); if (!ret && se->se_pssaved) ret = _stdenc_init_state(cs, se->se_pssaved); return (ret); } static int open_csmapper(struct _csmapper **rcm, const char *src, const char *dst, unsigned long *rnorm) { struct _csmapper *cm; int ret; ret = _csmapper_open(&cm, src, dst, 0, rnorm); if (ret) return (ret); if (_csmapper_get_src_max(cm) != 1 || _csmapper_get_dst_max(cm) != 1 || _csmapper_get_state_size(cm) != 0) { _csmapper_close(cm); return (EINVAL); } *rcm = cm; return (0); } static void close_dsts(struct _citrus_iconv_std_dst_list *dl) { struct _citrus_iconv_std_dst *sd; while ((sd = TAILQ_FIRST(dl)) != NULL) { TAILQ_REMOVE(dl, sd, sd_entry); _csmapper_close(sd->sd_mapper); free(sd); } } static int open_dsts(struct _citrus_iconv_std_dst_list *dl, const struct _esdb_charset *ec, const struct _esdb *dbdst) { struct _citrus_iconv_std_dst *sd, *sdtmp; unsigned long norm; int i, ret; sd = malloc(sizeof(*sd)); if (sd == NULL) return (errno); for (i = 0; i < dbdst->db_num_charsets; i++) { ret = open_csmapper(&sd->sd_mapper, ec->ec_csname, dbdst->db_charsets[i].ec_csname, &norm); if (ret == 0) { sd->sd_csid = dbdst->db_charsets[i].ec_csid; sd->sd_norm = norm; /* insert this mapper by sorted order. */ TAILQ_FOREACH(sdtmp, dl, sd_entry) { if (sdtmp->sd_norm > norm) { TAILQ_INSERT_BEFORE(sdtmp, sd, sd_entry); sd = NULL; break; } } if (sd) TAILQ_INSERT_TAIL(dl, sd, sd_entry); sd = malloc(sizeof(*sd)); if (sd == NULL) { ret = errno; close_dsts(dl); return (ret); } } else if (ret != ENOENT) { close_dsts(dl); free(sd); return (ret); } } free(sd); return (0); } static void close_srcs(struct _citrus_iconv_std_src_list *sl) { struct _citrus_iconv_std_src *ss; while ((ss = TAILQ_FIRST(sl)) != NULL) { TAILQ_REMOVE(sl, ss, ss_entry); close_dsts(&ss->ss_dsts); free(ss); } } static int open_srcs(struct _citrus_iconv_std_src_list *sl, const struct _esdb *dbsrc, const struct _esdb *dbdst) { struct _citrus_iconv_std_src *ss; int count = 0, i, ret; ss = malloc(sizeof(*ss)); if (ss == NULL) return (errno); TAILQ_INIT(&ss->ss_dsts); for (i = 0; i < dbsrc->db_num_charsets; i++) { ret = open_dsts(&ss->ss_dsts, &dbsrc->db_charsets[i], dbdst); if (ret) goto err; if (!TAILQ_EMPTY(&ss->ss_dsts)) { ss->ss_csid = dbsrc->db_charsets[i].ec_csid; TAILQ_INSERT_TAIL(sl, ss, ss_entry); ss = malloc(sizeof(*ss)); if (ss == NULL) { ret = errno; goto err; } count++; TAILQ_INIT(&ss->ss_dsts); } } free(ss); return (count ? 0 : ENOENT); err: free(ss); close_srcs(sl); return (ret); } /* do convert a character */ #define E_NO_CORRESPONDING_CHAR ENOENT /* XXX */ static int /*ARGSUSED*/ do_conv(const struct _citrus_iconv_std_shared *is, _csid_t *csid, _index_t *idx) { struct _citrus_iconv_std_dst *sd; struct _citrus_iconv_std_src *ss; _index_t tmpidx; int ret; TAILQ_FOREACH(ss, &is->is_srcs, ss_entry) { if (ss->ss_csid == *csid) { TAILQ_FOREACH(sd, &ss->ss_dsts, sd_entry) { ret = _csmapper_convert(sd->sd_mapper, &tmpidx, *idx, NULL); switch (ret) { case _MAPPER_CONVERT_SUCCESS: *csid = sd->sd_csid; *idx = tmpidx; return (0); case _MAPPER_CONVERT_NONIDENTICAL: break; case _MAPPER_CONVERT_SRC_MORE: /*FALLTHROUGH*/ case _MAPPER_CONVERT_DST_MORE: /*FALLTHROUGH*/ case _MAPPER_CONVERT_ILSEQ: return (EILSEQ); case _MAPPER_CONVERT_FATAL: return (EINVAL); } } break; } } return (E_NO_CORRESPONDING_CHAR); } /* ---------------------------------------------------------------------- */ static int /*ARGSUSED*/ _citrus_iconv_std_iconv_init_shared(struct _citrus_iconv_shared *ci, const char * __restrict src, const char * __restrict dst) { struct _citrus_esdb esdbdst, esdbsrc; struct _citrus_iconv_std_shared *is; int ret; is = malloc(sizeof(*is)); if (is == NULL) { ret = errno; goto err0; } ret = _citrus_esdb_open(&esdbsrc, src); if (ret) goto err1; ret = _citrus_esdb_open(&esdbdst, dst); if (ret) goto err2; ret = _stdenc_open(&is->is_src_encoding, esdbsrc.db_encname, esdbsrc.db_variable, esdbsrc.db_len_variable); if (ret) goto err3; ret = _stdenc_open(&is->is_dst_encoding, esdbdst.db_encname, esdbdst.db_variable, esdbdst.db_len_variable); if (ret) goto err4; is->is_use_invalid = esdbdst.db_use_invalid; is->is_invalid = esdbdst.db_invalid; TAILQ_INIT(&is->is_srcs); ret = open_srcs(&is->is_srcs, &esdbsrc, &esdbdst); if (ret) goto err5; _esdb_close(&esdbsrc); _esdb_close(&esdbdst); ci->ci_closure = is; return (0); err5: _stdenc_close(is->is_dst_encoding); err4: _stdenc_close(is->is_src_encoding); err3: _esdb_close(&esdbdst); err2: _esdb_close(&esdbsrc); err1: free(is); err0: return (ret); } static void _citrus_iconv_std_iconv_uninit_shared(struct _citrus_iconv_shared *ci) { struct _citrus_iconv_std_shared *is = ci->ci_closure; if (is == NULL) return; _stdenc_close(is->is_src_encoding); _stdenc_close(is->is_dst_encoding); close_srcs(&is->is_srcs); free(is); } static int _citrus_iconv_std_iconv_init_context(struct _citrus_iconv *cv) { const struct _citrus_iconv_std_shared *is = cv->cv_shared->ci_closure; struct _citrus_iconv_std_context *sc; char *ptr; size_t sz, szpsdst, szpssrc; szpssrc = _stdenc_get_state_size(is->is_src_encoding); szpsdst = _stdenc_get_state_size(is->is_dst_encoding); sz = (szpssrc + szpsdst)*2 + sizeof(struct _citrus_iconv_std_context); sc = malloc(sz); if (sc == NULL) return (errno); ptr = (char *)&sc[1]; if (szpssrc > 0) init_encoding(&sc->sc_src_encoding, is->is_src_encoding, ptr, ptr+szpssrc); else init_encoding(&sc->sc_src_encoding, is->is_src_encoding, NULL, NULL); ptr += szpssrc*2; if (szpsdst > 0) init_encoding(&sc->sc_dst_encoding, is->is_dst_encoding, ptr, ptr+szpsdst); else init_encoding(&sc->sc_dst_encoding, is->is_dst_encoding, NULL, NULL); cv->cv_closure = (void *)sc; return (0); } static void _citrus_iconv_std_iconv_uninit_context(struct _citrus_iconv *cv) { free(cv->cv_closure); } static int _citrus_iconv_std_iconv_convert(struct _citrus_iconv * __restrict cv, - const char * __restrict * __restrict in, size_t * __restrict inbytes, + char * __restrict * __restrict in, size_t * __restrict inbytes, char * __restrict * __restrict out, size_t * __restrict outbytes, uint32_t flags, size_t * __restrict invalids) { const struct _citrus_iconv_std_shared *is = cv->cv_shared->ci_closure; struct _citrus_iconv_std_context *sc = cv->cv_closure; _csid_t csid; _index_t idx; - const char *tmpin; + char *tmpin; size_t inval, szrin, szrout; int ret, state = 0; inval = 0; if (in == NULL || *in == NULL) { /* special cases */ if (out != NULL && *out != NULL) { /* init output state and store the shift sequence */ save_encoding_state(&sc->sc_src_encoding); save_encoding_state(&sc->sc_dst_encoding); szrout = 0; ret = put_state_resetx(&sc->sc_dst_encoding, *out, *outbytes, &szrout); if (ret) goto err; if (szrout == (size_t)-2) { /* too small to store the character */ ret = EINVAL; goto err; } *out += szrout; *outbytes -= szrout; } else /* otherwise, discard the shift sequence */ init_encoding_state(&sc->sc_dst_encoding); init_encoding_state(&sc->sc_src_encoding); *invalids = 0; return (0); } /* normal case */ for (;;) { if (*inbytes == 0) { ret = get_state_desc_gen(&sc->sc_src_encoding, &state); if (state == _STDENC_SDGEN_INITIAL || state == _STDENC_SDGEN_STABLE) break; } /* save the encoding states for the error recovery */ save_encoding_state(&sc->sc_src_encoding); save_encoding_state(&sc->sc_dst_encoding); /* mb -> csid/index */ tmpin = *in; szrin = szrout = 0; ret = mbtocsx(&sc->sc_src_encoding, &csid, &idx, &tmpin, *inbytes, &szrin, cv->cv_shared->ci_hooks); if (ret) goto err; if (szrin == (size_t)-2) { /* incompleted character */ ret = get_state_desc_gen(&sc->sc_src_encoding, &state); if (ret) { ret = EINVAL; goto err; } switch (state) { case _STDENC_SDGEN_INITIAL: case _STDENC_SDGEN_STABLE: /* fetch shift sequences only. */ goto next; } ret = EINVAL; goto err; } /* convert the character */ ret = do_conv(is, &csid, &idx); if (ret) { if (ret == E_NO_CORRESPONDING_CHAR) { /* * GNU iconv returns EILSEQ when no * corresponding character in the output. * Some software depends on this behavior * though this is against POSIX specification. */ if (cv->cv_shared->ci_ilseq_invalid != 0) { ret = EILSEQ; goto err; } inval++; szrout = 0; if ((((flags & _CITRUS_ICONV_F_HIDE_INVALID) == 0) && !cv->cv_shared->ci_discard_ilseq) && is->is_use_invalid) { ret = wctombx(&sc->sc_dst_encoding, *out, *outbytes, is->is_invalid, &szrout, cv->cv_shared->ci_hooks); if (ret) goto err; } goto next; } else goto err; } /* csid/index -> mb */ ret = cstombx(&sc->sc_dst_encoding, *out, *outbytes, csid, idx, &szrout, cv->cv_shared->ci_hooks); if (ret) goto err; next: *inbytes -= tmpin-*in; /* szrin is insufficient on \0. */ *in = tmpin; *outbytes -= szrout; *out += szrout; } *invalids = inval; return (0); err: restore_encoding_state(&sc->sc_src_encoding); restore_encoding_state(&sc->sc_dst_encoding); *invalids = inval; return (ret); } Index: user/ngie/more-tests/lib/libkiconv/xlat16_iconv.c =================================================================== --- user/ngie/more-tests/lib/libkiconv/xlat16_iconv.c (revision 281584) +++ user/ngie/more-tests/lib/libkiconv/xlat16_iconv.c (revision 281585) @@ -1,465 +1,464 @@ /*- * Copyright (c) 2003, 2005 Ryuichiro Imura * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * $FreeBSD$ */ /* * kiconv(3) requires shared linked, and reduce module size * when statically linked. */ #ifdef PIC #include #include #include #include #include #include #include #include #include #include #include #include #include "quirks.h" struct xlat16_table { uint32_t * idx[0x200]; void * data; size_t size; }; static struct xlat16_table kiconv_xlat16_open(const char *, const char *, int); static int chklocale(int, const char *); #ifdef ICONV_DLOPEN typedef void *iconv_t; static int my_iconv_init(void); static iconv_t (*my_iconv_open)(const char *, const char *); -static size_t (*my_iconv)(iconv_t, const char **, size_t *, char **, size_t *); +static size_t (*my_iconv)(iconv_t, char **, size_t *, char **, size_t *); static int (*my_iconv_close)(iconv_t); #else #include #define my_iconv_init() 0 #define my_iconv_open iconv_open #define my_iconv iconv #define my_iconv_close iconv_close #endif -static size_t my_iconv_char(iconv_t, const u_char **, size_t *, u_char **, size_t *); +static size_t my_iconv_char(iconv_t, u_char **, size_t *, u_char **, size_t *); int kiconv_add_xlat16_cspair(const char *tocode, const char *fromcode, int flag) { int error; size_t idxsize; struct xlat16_table xt; void *data; char *p; const char unicode[] = ENCODING_UNICODE; if ((flag & KICONV_WCTYPE) == 0 && strcmp(unicode, tocode) != 0 && strcmp(unicode, fromcode) != 0 && kiconv_lookupconv(unicode) == 0) { error = kiconv_add_xlat16_cspair(unicode, fromcode, flag); if (error) return (-1); error = kiconv_add_xlat16_cspair(tocode, unicode, flag); return (error); } if (kiconv_lookupcs(tocode, fromcode) == 0) return (0); if (flag & KICONV_WCTYPE) xt = kiconv_xlat16_open(fromcode, fromcode, flag); else xt = kiconv_xlat16_open(tocode, fromcode, flag); if (xt.size == 0) return (-1); idxsize = sizeof(xt.idx); if ((idxsize + xt.size) > ICONV_CSMAXDATALEN) { errno = E2BIG; return (-1); } if ((data = malloc(idxsize + xt.size)) != NULL) { p = data; memcpy(p, xt.idx, idxsize); p += idxsize; memcpy(p, xt.data, xt.size); error = kiconv_add_xlat16_table(tocode, fromcode, data, (int)(idxsize + xt.size)); return (error); } return (-1); } int kiconv_add_xlat16_cspairs(const char *foreigncode, const char *localcode) { int error, locale; error = kiconv_add_xlat16_cspair(foreigncode, localcode, KICONV_FROM_LOWER | KICONV_FROM_UPPER); if (error) return (error); error = kiconv_add_xlat16_cspair(localcode, foreigncode, KICONV_LOWER | KICONV_UPPER); if (error) return (error); locale = chklocale(LC_CTYPE, localcode); if (locale == 0) { error = kiconv_add_xlat16_cspair(KICONV_WCTYPE_NAME, localcode, KICONV_WCTYPE); if (error) return (error); } return (0); } static struct xlat16_table kiconv_xlat16_open(const char *tocode, const char *fromcode, int lcase) { u_char src[3], dst[4], *srcp, *dstp, ud, ld; int us, ls, ret; uint16_t c; uint32_t table[0x80]; size_t inbytesleft, outbytesleft, pre_q_size, post_q_size; struct xlat16_table xt; struct quirk_replace_list *pre_q_list, *post_q_list; iconv_t cd; char *p; xt.data = NULL; xt.size = 0; src[2] = '\0'; dst[3] = '\0'; ret = my_iconv_init(); if (ret) return (xt); cd = my_iconv_open(search_quirk(tocode, fromcode, &pre_q_list, &pre_q_size), search_quirk(fromcode, tocode, &post_q_list, &post_q_size)); if (cd == (iconv_t) (-1)) return (xt); if ((xt.data = malloc(0x200 * 0x80 * sizeof(uint32_t))) == NULL) return (xt); p = xt.data; for (ls = 0 ; ls < 0x200 ; ls++) { xt.idx[ls] = NULL; for (us = 0 ; us < 0x80 ; us++) { srcp = src; dstp = dst; inbytesleft = 2; outbytesleft = 3; bzero(dst, outbytesleft); c = ((ls & 0x100 ? us | 0x80 : us) << 8) | (u_char)ls; if (lcase & KICONV_WCTYPE) { if ((c & 0xff) == 0) c >>= 8; if (iswupper(c)) { c = towlower(c); if ((c & 0xff00) == 0) c <<= 8; table[us] = c | XLAT16_HAS_LOWER_CASE; } else if (iswlower(c)) { c = towupper(c); if ((c & 0xff00) == 0) c <<= 8; table[us] = c | XLAT16_HAS_UPPER_CASE; } else table[us] = 0; /* * store not NULL */ if (table[us]) xt.idx[ls] = table; continue; } c = quirk_vendor2unix(c, pre_q_list, pre_q_size); src[0] = (u_char)(c >> 8); src[1] = (u_char)c; - ret = my_iconv_char(cd, (const u_char **)&srcp, - &inbytesleft, &dstp, &outbytesleft); + ret = my_iconv_char(cd, &srcp, &inbytesleft, + &dstp, &outbytesleft); if (ret == -1) { table[us] = 0; continue; } ud = (u_char)dst[0]; ld = (u_char)dst[1]; switch(outbytesleft) { case 0: #ifdef XLAT16_ACCEPT_3BYTE_CHR table[us] = (ud << 8) | ld; table[us] |= (u_char)dst[2] << 16; table[us] |= XLAT16_IS_3BYTE_CHR; #else table[us] = 0; continue; #endif break; case 1: table[us] = quirk_unix2vendor((ud << 8) | ld, post_q_list, post_q_size); if ((table[us] >> 8) == 0) table[us] |= XLAT16_ACCEPT_NULL_OUT; break; case 2: table[us] = ud; if (lcase & KICONV_LOWER && ud != tolower(ud)) { table[us] |= (u_char)tolower(ud) << 16; table[us] |= XLAT16_HAS_LOWER_CASE; } if (lcase & KICONV_UPPER && ud != toupper(ud)) { table[us] |= (u_char)toupper(ud) << 16; table[us] |= XLAT16_HAS_UPPER_CASE; } break; } switch(inbytesleft) { case 0: if ((ls & 0xff) == 0) table[us] |= XLAT16_ACCEPT_NULL_IN; break; case 1: c = ls > 0xff ? us | 0x80 : us; if (lcase & KICONV_FROM_LOWER && c != tolower(c)) { table[us] |= (u_char)tolower(c) << 16; table[us] |= XLAT16_HAS_FROM_LOWER_CASE; } if (lcase & KICONV_FROM_UPPER && c != toupper(c)) { table[us] |= (u_char)toupper(c) << 16; table[us] |= XLAT16_HAS_FROM_UPPER_CASE; } break; } if (table[us] == 0) continue; /* * store not NULL */ xt.idx[ls] = table; } if (xt.idx[ls]) { memcpy(p, table, sizeof(table)); p += sizeof(table); } } my_iconv_close(cd); xt.size = p - (char *)xt.data; xt.data = realloc(xt.data, xt.size); return (xt); } static int chklocale(int category, const char *code) { char *p; int error = -1; p = strchr(setlocale(category, NULL), '.'); if (p++) { error = strcasecmp(code, p); if (error) { /* XXX - can't avoid calling quirk here... */ error = strcasecmp(code, kiconv_quirkcs(p, KICONV_VENDOR_MICSFT)); } } return (error); } #ifdef ICONV_DLOPEN static int my_iconv_init(void) { void *iconv_lib; iconv_lib = dlopen("libiconv.so", RTLD_LAZY | RTLD_GLOBAL); if (iconv_lib == NULL) { warn("Unable to load iconv library: %s\n", dlerror()); errno = ENOENT; return (-1); } my_iconv_open = dlsym(iconv_lib, "iconv_open"); my_iconv = dlsym(iconv_lib, "iconv"); my_iconv_close = dlsym(iconv_lib, "iconv_close"); return (0); } #endif static size_t -my_iconv_char(iconv_t cd, const u_char **ibuf, size_t * ilen, u_char **obuf, +my_iconv_char(iconv_t cd, u_char **ibuf, size_t * ilen, u_char **obuf, size_t * olen) { - const u_char *sp; - u_char *dp, ilocal[3], olocal[3]; + u_char *sp, *dp, ilocal[3], olocal[3]; u_char c1, c2; int ret; size_t ir, or; sp = *ibuf; dp = *obuf; ir = *ilen; bzero(*obuf, *olen); - ret = my_iconv(cd, (const char **)&sp, ilen, (char **)&dp, olen); + ret = my_iconv(cd, (char **)&sp, ilen, (char **)&dp, olen); c1 = (*obuf)[0]; c2 = (*obuf)[1]; if (ret == -1) { if (*ilen == ir - 1 && (*ibuf)[1] == '\0' && (c1 || c2)) return (0); else return (-1); } /* * We must judge if inbuf is a single byte char or double byte char. * Here, to judge, try first byte(*sp) conversion and compare. */ ir = 1; or = 3; bzero(olocal, or); memcpy(ilocal, *ibuf, sizeof(ilocal)); sp = ilocal; dp = olocal; - if ((my_iconv(cd,(const char **)&sp, &ir, (char **)&dp, &or)) != -1) { + if ((my_iconv(cd,(char **)&sp, &ir, (char **)&dp, &or)) != -1) { if (olocal[0] != c1) return (ret); if (olocal[1] == c2 && (*ibuf)[1] == '\0') { /* * inbuf is a single byte char */ *ilen = 1; *olen = or; return (ret); } switch(or) { case 0: case 1: if (olocal[1] == c2) { /* * inbuf is a single byte char, * so return false here. */ return (-1); } else { /* * inbuf is a double byte char */ return (ret); } break; case 2: /* * should compare second byte of inbuf */ break; } } else { /* * inbuf clould not be splitted, so inbuf is * a double byte char. */ return (ret); } /* * try second byte(*(sp+1)) conversion, and compare */ ir = 1; or = 3; bzero(olocal, or); sp = ilocal + 1; dp = olocal; - if ((my_iconv(cd,(const char **)&sp, &ir, (char **)&dp, &or)) != -1) { + if ((my_iconv(cd,(char **)&sp, &ir, (char **)&dp, &or)) != -1) { if (olocal[0] == c2) /* * inbuf is a single byte char */ return (-1); } return (ret); } #else /* statically linked */ #include #include #include int kiconv_add_xlat16_cspair(const char *tocode __unused, const char *fromcode __unused, int flag __unused) { errno = EINVAL; return (-1); } int kiconv_add_xlat16_cspairs(const char *tocode __unused, const char *fromcode __unused) { errno = EINVAL; return (-1); } #endif /* PIC */ Index: user/ngie/more-tests/libexec/rtld-elf/aarch64/reloc.c =================================================================== --- user/ngie/more-tests/libexec/rtld-elf/aarch64/reloc.c (revision 281584) +++ user/ngie/more-tests/libexec/rtld-elf/aarch64/reloc.c (revision 281585) @@ -1,414 +1,414 @@ /*- * Copyright (c) 2014-2015 The FreeBSD Foundation * All rights reserved. * * Portions of this software were developed by Andrew Turner * under sponsorship from the FreeBSD Foundation. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ #include __FBSDID("$FreeBSD$"); #include #include #include "debug.h" #include "rtld.h" #include "rtld_printf.h" /* * It is possible for the compiler to emit relocations for unaligned data. * We handle this situation with these inlines. */ #define RELOC_ALIGNED_P(x) \ (((uintptr_t)(x) & (sizeof(void *) - 1)) == 0) /* * This is not the correct prototype, but we only need it for * a function pointer to a simple asm function. */ void *_rtld_tlsdesc(void *); void *_rtld_tlsdesc_dynamic(void *); void _exit(int); void init_pltgot(Obj_Entry *obj) { if (obj->pltgot != NULL) { obj->pltgot[1] = (Elf_Addr) obj; obj->pltgot[2] = (Elf_Addr) &_rtld_bind_start; } } int do_copy_relocations(Obj_Entry *dstobj) { const Obj_Entry *srcobj, *defobj; const Elf_Rela *relalim; const Elf_Rela *rela; const Elf_Sym *srcsym; const Elf_Sym *dstsym; const void *srcaddr; const char *name; void *dstaddr; SymLook req; size_t size; int res; /* * COPY relocs are invalid outside of the main program */ assert(dstobj->mainprog); relalim = (const Elf_Rela *)((char *)dstobj->rela + dstobj->relasize); for (rela = dstobj->rela; rela < relalim; rela++) { if (ELF_R_TYPE(rela->r_info) != R_AARCH64_COPY) continue; dstaddr = (void *)(dstobj->relocbase + rela->r_offset); dstsym = dstobj->symtab + ELF_R_SYM(rela->r_info); name = dstobj->strtab + dstsym->st_name; size = dstsym->st_size; symlook_init(&req, name); req.ventry = fetch_ventry(dstobj, ELF_R_SYM(rela->r_info)); req.flags = SYMLOOK_EARLY; for (srcobj = dstobj->next; srcobj != NULL; srcobj = srcobj->next) { res = symlook_obj(&req, srcobj); if (res == 0) { srcsym = req.sym_out; defobj = req.defobj_out; break; } } if (srcobj == NULL) { _rtld_error( "Undefined symbol \"%s\" referenced from COPY relocation in %s", name, dstobj->path); return (-1); } srcaddr = (const void *)(defobj->relocbase + srcsym->st_value); memcpy(dstaddr, srcaddr, size); } return (0); } struct tls_data { int64_t index; Obj_Entry *obj; const Elf_Rela *rela; }; static struct tls_data * reloc_tlsdesc_alloc(Obj_Entry *obj, const Elf_Rela *rela) { struct tls_data *tlsdesc; tlsdesc = xmalloc(sizeof(struct tls_data)); tlsdesc->index = -1; tlsdesc->obj = obj; tlsdesc->rela = rela; return (tlsdesc); } /* * Look up the symbol to find its tls index */ static int64_t rtld_tlsdesc_handle_locked(struct tls_data *tlsdesc, int flags, RtldLockState *lockstate) { const Elf_Rela *rela; const Elf_Sym *def; const Obj_Entry *defobj; Obj_Entry *obj; rela = tlsdesc->rela; obj = tlsdesc->obj; def = find_symdef(ELF_R_SYM(rela->r_info), obj, &defobj, flags, NULL, lockstate); if (def == NULL) rtld_die(); - tlsdesc->index = defobj->tlsindex + def->st_value + rela->r_addend; + tlsdesc->index = defobj->tlsoffset + def->st_value + rela->r_addend; return (tlsdesc->index); } int64_t rtld_tlsdesc_handle(struct tls_data *tlsdesc, int flags) { RtldLockState lockstate; /* We have already found the index, return it */ if (tlsdesc->index >= 0) return (tlsdesc->index); wlock_acquire(rtld_bind_lock, &lockstate); /* tlsdesc->index may have been set by another thread */ if (tlsdesc->index == -1) rtld_tlsdesc_handle_locked(tlsdesc, flags, &lockstate); lock_release(rtld_bind_lock, &lockstate); return (tlsdesc->index); } /* * Process the PLT relocations. */ int reloc_plt(Obj_Entry *obj) { const Elf_Rela *relalim; const Elf_Rela *rela; relalim = (const Elf_Rela *)((char *)obj->pltrela + obj->pltrelasize); for (rela = obj->pltrela; rela < relalim; rela++) { Elf_Addr *where; where = (Elf_Addr *)(obj->relocbase + rela->r_offset); switch(ELF_R_TYPE(rela->r_info)) { case R_AARCH64_JUMP_SLOT: *where += (Elf_Addr)obj->relocbase; break; case R_AARCH64_TLSDESC: if (ELF_R_SYM(rela->r_info) == 0) { where[0] = (Elf_Addr)_rtld_tlsdesc; - where[1] = obj->tlsindex + rela->r_addend; + where[1] = obj->tlsoffset + rela->r_addend; } else { where[0] = (Elf_Addr)_rtld_tlsdesc_dynamic; where[1] = (Elf_Addr)reloc_tlsdesc_alloc(obj, rela); } break; default: _rtld_error("Unknown relocation type %u in PLT", (unsigned int)ELF_R_TYPE(rela->r_info)); return (-1); } } return (0); } /* * LD_BIND_NOW was set - force relocation for all jump slots */ int reloc_jmpslots(Obj_Entry *obj, int flags, RtldLockState *lockstate) { const Obj_Entry *defobj; const Elf_Rela *relalim; const Elf_Rela *rela; const Elf_Sym *def; struct tls_data *tlsdesc; relalim = (const Elf_Rela *)((char *)obj->pltrela + obj->pltrelasize); for (rela = obj->pltrela; rela < relalim; rela++) { Elf_Addr *where; where = (Elf_Addr *)(obj->relocbase + rela->r_offset); switch(ELF_R_TYPE(rela->r_info)) { case R_AARCH64_JUMP_SLOT: def = find_symdef(ELF_R_SYM(rela->r_info), obj, &defobj, SYMLOOK_IN_PLT | flags, NULL, lockstate); if (def == NULL) { dbg("reloc_jmpslots: sym not found"); return (-1); } *where = (Elf_Addr)(defobj->relocbase + def->st_value); break; case R_AARCH64_TLSDESC: if (ELF_R_SYM(rela->r_info) != 0) { tlsdesc = (struct tls_data *)where[1]; if (tlsdesc->index == -1) rtld_tlsdesc_handle_locked(tlsdesc, SYMLOOK_IN_PLT | flags, lockstate); } break; default: _rtld_error("Unknown relocation type %x in jmpslot", (unsigned int)ELF_R_TYPE(rela->r_info)); return (-1); } } return (0); } int reloc_iresolve(Obj_Entry *obj, struct Struct_RtldLockState *lockstate) { /* XXX not implemented */ return (0); } int reloc_gnu_ifunc(Obj_Entry *obj, int flags, struct Struct_RtldLockState *lockstate) { /* XXX not implemented */ return (0); } Elf_Addr reloc_jmpslot(Elf_Addr *where, Elf_Addr target, const Obj_Entry *defobj, const Obj_Entry *obj, const Elf_Rel *rel) { assert(ELF_R_TYPE(rel->r_info) == R_AARCH64_JUMP_SLOT); if (*where != target) *where = target; return target; } /* * Process non-PLT relocations */ int reloc_non_plt(Obj_Entry *obj, Obj_Entry *obj_rtld, int flags, RtldLockState *lockstate) { const Obj_Entry *defobj; const Elf_Rela *relalim; const Elf_Rela *rela; const Elf_Sym *def; SymCache *cache; Elf_Addr *where; unsigned long symnum; if ((flags & SYMLOOK_IFUNC) != 0) /* XXX not implemented */ return (0); /* * The dynamic loader may be called from a thread, we have * limited amounts of stack available so we cannot use alloca(). */ if (obj == obj_rtld) cache = NULL; else cache = calloc(obj->dynsymcount, sizeof(SymCache)); /* No need to check for NULL here */ relalim = (const Elf_Rela *)((caddr_t)obj->rela + obj->relasize); for (rela = obj->rela; rela < relalim; rela++) { where = (Elf_Addr *)(obj->relocbase + rela->r_offset); symnum = ELF_R_SYM(rela->r_info); switch (ELF_R_TYPE(rela->r_info)) { case R_AARCH64_ABS64: case R_AARCH64_GLOB_DAT: def = find_symdef(symnum, obj, &defobj, flags, cache, lockstate); if (def == NULL) return (-1); *where = (Elf_Addr)defobj->relocbase + def->st_value; break; case R_AARCH64_COPY: /* * These are deferred until all other relocations have * been done. All we do here is make sure that the * COPY relocation is not in a shared library. They * are allowed only in executable files. */ if (!obj->mainprog) { _rtld_error("%s: Unexpected R_AARCH64_COPY " "relocation in shared library", obj->path); return (-1); } break; case R_AARCH64_TLS_TPREL64: def = find_symdef(symnum, obj, &defobj, flags, cache, lockstate); if (def == NULL) return (-1); /* * We lazily allocate offsets for static TLS as we * see the first relocation that references the * TLS block. This allows us to support (small * amounts of) static TLS in dynamically loaded * modules. If we run out of space, we generate an * error. */ if (!defobj->tls_done) { if (!allocate_tls_offset((Obj_Entry*) defobj)) { _rtld_error( "%s: No space available for static " "Thread Local Storage", obj->path); return (-1); } } *where = def->st_value + rela->r_addend + defobj->tlsoffset - TLS_TCB_SIZE; break; case R_AARCH64_RELATIVE: *where = (Elf_Addr)(obj->relocbase + rela->r_addend); break; default: rtld_printf("%s: Unhandled relocation %lu\n", obj->path, ELF_R_TYPE(rela->r_info)); return (-1); } } return (0); } void allocate_initial_tls(Obj_Entry *objs) { Elf_Addr **tp; /* * Fix the size of the static TLS block by using the maximum * offset allocated so far and adding a bit for dynamic modules to * use. */ tls_static_space = tls_last_offset + tls_last_size + RTLD_STATIC_TLS_EXTRA; tp = (Elf_Addr **) allocate_tls(objs, NULL, TLS_TCB_SIZE, 16); asm volatile("msr tpidr_el0, %0" : : "r"(tp)); } Index: user/ngie/more-tests/libexec/rtld-elf/rtld.c =================================================================== --- user/ngie/more-tests/libexec/rtld-elf/rtld.c (revision 281584) +++ user/ngie/more-tests/libexec/rtld-elf/rtld.c (revision 281585) @@ -1,5070 +1,5083 @@ /*- * Copyright 1996, 1997, 1998, 1999, 2000 John D. Polstra. * Copyright 2003 Alexander Kabaev . * Copyright 2009-2012 Konstantin Belousov . * Copyright 2012 John Marino . * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. * * $FreeBSD$ */ /* * Dynamic linker for ELF. * * John Polstra . */ #ifndef __GNUC__ #error "GCC is needed to compile this file" #endif #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include "debug.h" #include "rtld.h" #include "libmap.h" #include "rtld_tls.h" #include "rtld_printf.h" #include "notes.h" #ifndef COMPAT_32BIT #define PATH_RTLD "/libexec/ld-elf.so.1" #else #define PATH_RTLD "/libexec/ld-elf32.so.1" #endif /* Types. */ typedef void (*func_ptr_type)(); typedef void * (*path_enum_proc) (const char *path, size_t len, void *arg); /* * Function declarations. */ static const char *basename(const char *); static void digest_dynamic1(Obj_Entry *, int, const Elf_Dyn **, const Elf_Dyn **, const Elf_Dyn **); static void digest_dynamic2(Obj_Entry *, const Elf_Dyn *, const Elf_Dyn *, const Elf_Dyn *); static void digest_dynamic(Obj_Entry *, int); static Obj_Entry *digest_phdr(const Elf_Phdr *, int, caddr_t, const char *); static Obj_Entry *dlcheck(void *); static Obj_Entry *dlopen_object(const char *name, int fd, Obj_Entry *refobj, int lo_flags, int mode, RtldLockState *lockstate); static Obj_Entry *do_load_object(int, const char *, char *, struct stat *, int); static int do_search_info(const Obj_Entry *obj, int, struct dl_serinfo *); static bool donelist_check(DoneList *, const Obj_Entry *); static void errmsg_restore(char *); static char *errmsg_save(void); static void *fill_search_info(const char *, size_t, void *); static char *find_library(const char *, const Obj_Entry *, int *); static const char *gethints(bool); static void init_dag(Obj_Entry *); static void init_pagesizes(Elf_Auxinfo **aux_info); static void init_rtld(caddr_t, Elf_Auxinfo **); static void initlist_add_neededs(Needed_Entry *, Objlist *); static void initlist_add_objects(Obj_Entry *, Obj_Entry **, Objlist *); static void linkmap_add(Obj_Entry *); static void linkmap_delete(Obj_Entry *); static void load_filtees(Obj_Entry *, int flags, RtldLockState *); static void unload_filtees(Obj_Entry *); static int load_needed_objects(Obj_Entry *, int); static int load_preload_objects(void); static Obj_Entry *load_object(const char *, int fd, const Obj_Entry *, int); static void map_stacks_exec(RtldLockState *); static Obj_Entry *obj_from_addr(const void *); static void objlist_call_fini(Objlist *, Obj_Entry *, RtldLockState *); static void objlist_call_init(Objlist *, RtldLockState *); static void objlist_clear(Objlist *); static Objlist_Entry *objlist_find(Objlist *, const Obj_Entry *); static void objlist_init(Objlist *); static void objlist_push_head(Objlist *, Obj_Entry *); static void objlist_push_tail(Objlist *, Obj_Entry *); static void objlist_put_after(Objlist *, Obj_Entry *, Obj_Entry *); static void objlist_remove(Objlist *, Obj_Entry *); static int parse_libdir(const char *); static void *path_enumerate(const char *, path_enum_proc, void *); static int relocate_object_dag(Obj_Entry *root, bool bind_now, Obj_Entry *rtldobj, int flags, RtldLockState *lockstate); static int relocate_object(Obj_Entry *obj, bool bind_now, Obj_Entry *rtldobj, int flags, RtldLockState *lockstate); static int relocate_objects(Obj_Entry *, bool, Obj_Entry *, int, RtldLockState *); static int resolve_objects_ifunc(Obj_Entry *first, bool bind_now, int flags, RtldLockState *lockstate); static int rtld_dirname(const char *, char *); static int rtld_dirname_abs(const char *, char *); static void *rtld_dlopen(const char *name, int fd, int mode); static void rtld_exit(void); static char *search_library_path(const char *, const char *); static char *search_library_pathfds(const char *, const char *, int *); static const void **get_program_var_addr(const char *, RtldLockState *); static void set_program_var(const char *, const void *); static int symlook_default(SymLook *, const Obj_Entry *refobj); static int symlook_global(SymLook *, DoneList *); static void symlook_init_from_req(SymLook *, const SymLook *); static int symlook_list(SymLook *, const Objlist *, DoneList *); static int symlook_needed(SymLook *, const Needed_Entry *, DoneList *); static int symlook_obj1_sysv(SymLook *, const Obj_Entry *); static int symlook_obj1_gnu(SymLook *, const Obj_Entry *); static void trace_loaded_objects(Obj_Entry *); static void unlink_object(Obj_Entry *); static void unload_object(Obj_Entry *); static void unref_dag(Obj_Entry *); static void ref_dag(Obj_Entry *); static char *origin_subst_one(char *, const char *, const char *, bool); static char *origin_subst(char *, const char *); static void preinit_main(void); static int rtld_verify_versions(const Objlist *); static int rtld_verify_object_versions(Obj_Entry *); static void object_add_name(Obj_Entry *, const char *); static int object_match_name(const Obj_Entry *, const char *); static void ld_utrace_log(int, void *, void *, size_t, int, const char *); static void rtld_fill_dl_phdr_info(const Obj_Entry *obj, struct dl_phdr_info *phdr_info); static uint32_t gnu_hash(const char *); static bool matched_symbol(SymLook *, const Obj_Entry *, Sym_Match_Result *, const unsigned long); void r_debug_state(struct r_debug *, struct link_map *) __noinline __exported; void _r_debug_postinit(struct link_map *) __noinline __exported; int __sys_openat(int, const char *, int, ...); /* * Data declarations. */ static char *error_message; /* Message for dlerror(), or NULL */ struct r_debug r_debug __exported; /* for GDB; */ static bool libmap_disable; /* Disable libmap */ static bool ld_loadfltr; /* Immediate filters processing */ static char *libmap_override; /* Maps to use in addition to libmap.conf */ static bool trust; /* False for setuid and setgid programs */ static bool dangerous_ld_env; /* True if environment variables have been used to affect the libraries loaded */ static char *ld_bind_now; /* Environment variable for immediate binding */ static char *ld_debug; /* Environment variable for debugging */ static char *ld_library_path; /* Environment variable for search path */ static char *ld_library_dirs; /* Environment variable for library descriptors */ static char *ld_preload; /* Environment variable for libraries to load first */ static char *ld_elf_hints_path; /* Environment variable for alternative hints path */ static char *ld_tracing; /* Called from ldd to print libs */ static char *ld_utrace; /* Use utrace() to log events. */ static Obj_Entry *obj_list; /* Head of linked list of shared objects */ static Obj_Entry **obj_tail; /* Link field of last object in list */ static Obj_Entry *obj_main; /* The main program shared object */ static Obj_Entry obj_rtld; /* The dynamic linker shared object */ static unsigned int obj_count; /* Number of objects in obj_list */ static unsigned int obj_loads; /* Number of objects in obj_list */ static Objlist list_global = /* Objects dlopened with RTLD_GLOBAL */ STAILQ_HEAD_INITIALIZER(list_global); static Objlist list_main = /* Objects loaded at program startup */ STAILQ_HEAD_INITIALIZER(list_main); static Objlist list_fini = /* Objects needing fini() calls */ STAILQ_HEAD_INITIALIZER(list_fini); Elf_Sym sym_zero; /* For resolving undefined weak refs. */ #define GDB_STATE(s,m) r_debug.r_state = s; r_debug_state(&r_debug,m); extern Elf_Dyn _DYNAMIC; #pragma weak _DYNAMIC #ifndef RTLD_IS_DYNAMIC #define RTLD_IS_DYNAMIC() (&_DYNAMIC != NULL) #endif int dlclose(void *) __exported; char *dlerror(void) __exported; void *dlopen(const char *, int) __exported; void *fdlopen(int, int) __exported; void *dlsym(void *, const char *) __exported; dlfunc_t dlfunc(void *, const char *) __exported; void *dlvsym(void *, const char *, const char *) __exported; int dladdr(const void *, Dl_info *) __exported; void dllockinit(void *, void *(*)(void *), void (*)(void *), void (*)(void *), void (*)(void *), void (*)(void *), void (*)(void *)) __exported; int dlinfo(void *, int , void *) __exported; int dl_iterate_phdr(__dl_iterate_hdr_callback, void *) __exported; int _rtld_addr_phdr(const void *, struct dl_phdr_info *) __exported; int _rtld_get_stack_prot(void) __exported; int _rtld_is_dlopened(void *) __exported; void _rtld_error(const char *, ...) __exported; int npagesizes, osreldate; size_t *pagesizes; long __stack_chk_guard[8] = {0, 0, 0, 0, 0, 0, 0, 0}; static int stack_prot = PROT_READ | PROT_WRITE | RTLD_DEFAULT_STACK_EXEC; static int max_stack_flags; /* * Global declarations normally provided by crt1. The dynamic linker is * not built with crt1, so we have to provide them ourselves. */ char *__progname; char **environ; /* * Used to pass argc, argv to init functions. */ int main_argc; char **main_argv; /* * Globals to control TLS allocation. */ size_t tls_last_offset; /* Static TLS offset of last module */ size_t tls_last_size; /* Static TLS size of last module */ size_t tls_static_space; /* Static TLS space allocated */ size_t tls_static_max_align; int tls_dtv_generation = 1; /* Used to detect when dtv size changes */ int tls_max_index = 1; /* Largest module index allocated */ bool ld_library_path_rpath = false; /* * Fill in a DoneList with an allocation large enough to hold all of * the currently-loaded objects. Keep this as a macro since it calls * alloca and we want that to occur within the scope of the caller. */ #define donelist_init(dlp) \ ((dlp)->objs = alloca(obj_count * sizeof (dlp)->objs[0]), \ assert((dlp)->objs != NULL), \ (dlp)->num_alloc = obj_count, \ (dlp)->num_used = 0) #define UTRACE_DLOPEN_START 1 #define UTRACE_DLOPEN_STOP 2 #define UTRACE_DLCLOSE_START 3 #define UTRACE_DLCLOSE_STOP 4 #define UTRACE_LOAD_OBJECT 5 #define UTRACE_UNLOAD_OBJECT 6 #define UTRACE_ADD_RUNDEP 7 #define UTRACE_PRELOAD_FINISHED 8 #define UTRACE_INIT_CALL 9 #define UTRACE_FINI_CALL 10 #define UTRACE_DLSYM_START 11 #define UTRACE_DLSYM_STOP 12 struct utrace_rtld { char sig[4]; /* 'RTLD' */ int event; void *handle; void *mapbase; /* Used for 'parent' and 'init/fini' */ size_t mapsize; int refcnt; /* Used for 'mode' */ char name[MAXPATHLEN]; }; #define LD_UTRACE(e, h, mb, ms, r, n) do { \ if (ld_utrace != NULL) \ ld_utrace_log(e, h, mb, ms, r, n); \ } while (0) static void ld_utrace_log(int event, void *handle, void *mapbase, size_t mapsize, int refcnt, const char *name) { struct utrace_rtld ut; ut.sig[0] = 'R'; ut.sig[1] = 'T'; ut.sig[2] = 'L'; ut.sig[3] = 'D'; ut.event = event; ut.handle = handle; ut.mapbase = mapbase; ut.mapsize = mapsize; ut.refcnt = refcnt; bzero(ut.name, sizeof(ut.name)); if (name) strlcpy(ut.name, name, sizeof(ut.name)); utrace(&ut, sizeof(ut)); } /* * Main entry point for dynamic linking. The first argument is the * stack pointer. The stack is expected to be laid out as described * in the SVR4 ABI specification, Intel 386 Processor Supplement. * Specifically, the stack pointer points to a word containing * ARGC. Following that in the stack is a null-terminated sequence * of pointers to argument strings. Then comes a null-terminated * sequence of pointers to environment strings. Finally, there is a * sequence of "auxiliary vector" entries. * * The second argument points to a place to store the dynamic linker's * exit procedure pointer and the third to a place to store the main * program's object. * * The return value is the main program's entry point. */ func_ptr_type _rtld(Elf_Addr *sp, func_ptr_type *exit_proc, Obj_Entry **objp) { Elf_Auxinfo *aux_info[AT_COUNT]; int i; int argc; char **argv; char **env; Elf_Auxinfo *aux; Elf_Auxinfo *auxp; const char *argv0; Objlist_Entry *entry; Obj_Entry *obj; Obj_Entry **preload_tail; Obj_Entry *last_interposer; Objlist initlist; RtldLockState lockstate; char *library_path_rpath; int mib[2]; size_t len; /* * On entry, the dynamic linker itself has not been relocated yet. * Be very careful not to reference any global data until after * init_rtld has returned. It is OK to reference file-scope statics * and string constants, and to call static and global functions. */ /* Find the auxiliary vector on the stack. */ argc = *sp++; argv = (char **) sp; sp += argc + 1; /* Skip over arguments and NULL terminator */ env = (char **) sp; while (*sp++ != 0) /* Skip over environment, and NULL terminator */ ; aux = (Elf_Auxinfo *) sp; /* Digest the auxiliary vector. */ for (i = 0; i < AT_COUNT; i++) aux_info[i] = NULL; for (auxp = aux; auxp->a_type != AT_NULL; auxp++) { if (auxp->a_type < AT_COUNT) aux_info[auxp->a_type] = auxp; } /* Initialize and relocate ourselves. */ assert(aux_info[AT_BASE] != NULL); init_rtld((caddr_t) aux_info[AT_BASE]->a_un.a_ptr, aux_info); __progname = obj_rtld.path; argv0 = argv[0] != NULL ? argv[0] : "(null)"; environ = env; main_argc = argc; main_argv = argv; if (aux_info[AT_CANARY] != NULL && aux_info[AT_CANARY]->a_un.a_ptr != NULL) { i = aux_info[AT_CANARYLEN]->a_un.a_val; if (i > sizeof(__stack_chk_guard)) i = sizeof(__stack_chk_guard); memcpy(__stack_chk_guard, aux_info[AT_CANARY]->a_un.a_ptr, i); } else { mib[0] = CTL_KERN; mib[1] = KERN_ARND; len = sizeof(__stack_chk_guard); if (sysctl(mib, 2, __stack_chk_guard, &len, NULL, 0) == -1 || len != sizeof(__stack_chk_guard)) { /* If sysctl was unsuccessful, use the "terminator canary". */ ((unsigned char *)(void *)__stack_chk_guard)[0] = 0; ((unsigned char *)(void *)__stack_chk_guard)[1] = 0; ((unsigned char *)(void *)__stack_chk_guard)[2] = '\n'; ((unsigned char *)(void *)__stack_chk_guard)[3] = 255; } } trust = !issetugid(); ld_bind_now = getenv(LD_ "BIND_NOW"); /* * If the process is tainted, then we un-set the dangerous environment * variables. The process will be marked as tainted until setuid(2) * is called. If any child process calls setuid(2) we do not want any * future processes to honor the potentially un-safe variables. */ if (!trust) { if (unsetenv(LD_ "PRELOAD") || unsetenv(LD_ "LIBMAP") || unsetenv(LD_ "LIBRARY_PATH") || unsetenv(LD_ "LIBRARY_PATH_FDS") || unsetenv(LD_ "LIBMAP_DISABLE") || unsetenv(LD_ "DEBUG") || unsetenv(LD_ "ELF_HINTS_PATH") || unsetenv(LD_ "LOADFLTR") || unsetenv(LD_ "LIBRARY_PATH_RPATH")) { _rtld_error("environment corrupt; aborting"); rtld_die(); } } ld_debug = getenv(LD_ "DEBUG"); libmap_disable = getenv(LD_ "LIBMAP_DISABLE") != NULL; libmap_override = getenv(LD_ "LIBMAP"); ld_library_path = getenv(LD_ "LIBRARY_PATH"); ld_library_dirs = getenv(LD_ "LIBRARY_PATH_FDS"); ld_preload = getenv(LD_ "PRELOAD"); ld_elf_hints_path = getenv(LD_ "ELF_HINTS_PATH"); ld_loadfltr = getenv(LD_ "LOADFLTR") != NULL; library_path_rpath = getenv(LD_ "LIBRARY_PATH_RPATH"); if (library_path_rpath != NULL) { if (library_path_rpath[0] == 'y' || library_path_rpath[0] == 'Y' || library_path_rpath[0] == '1') ld_library_path_rpath = true; else ld_library_path_rpath = false; } dangerous_ld_env = libmap_disable || (libmap_override != NULL) || (ld_library_path != NULL) || (ld_preload != NULL) || (ld_elf_hints_path != NULL) || ld_loadfltr; ld_tracing = getenv(LD_ "TRACE_LOADED_OBJECTS"); ld_utrace = getenv(LD_ "UTRACE"); if ((ld_elf_hints_path == NULL) || strlen(ld_elf_hints_path) == 0) ld_elf_hints_path = _PATH_ELF_HINTS; if (ld_debug != NULL && *ld_debug != '\0') debug = 1; dbg("%s is initialized, base address = %p", __progname, (caddr_t) aux_info[AT_BASE]->a_un.a_ptr); dbg("RTLD dynamic = %p", obj_rtld.dynamic); dbg("RTLD pltgot = %p", obj_rtld.pltgot); dbg("initializing thread locks"); lockdflt_init(); /* * Load the main program, or process its program header if it is * already loaded. */ if (aux_info[AT_EXECFD] != NULL) { /* Load the main program. */ int fd = aux_info[AT_EXECFD]->a_un.a_val; dbg("loading main program"); obj_main = map_object(fd, argv0, NULL); close(fd); if (obj_main == NULL) rtld_die(); max_stack_flags = obj->stack_flags; } else { /* Main program already loaded. */ const Elf_Phdr *phdr; int phnum; caddr_t entry; dbg("processing main program's program header"); assert(aux_info[AT_PHDR] != NULL); phdr = (const Elf_Phdr *) aux_info[AT_PHDR]->a_un.a_ptr; assert(aux_info[AT_PHNUM] != NULL); phnum = aux_info[AT_PHNUM]->a_un.a_val; assert(aux_info[AT_PHENT] != NULL); assert(aux_info[AT_PHENT]->a_un.a_val == sizeof(Elf_Phdr)); assert(aux_info[AT_ENTRY] != NULL); entry = (caddr_t) aux_info[AT_ENTRY]->a_un.a_ptr; if ((obj_main = digest_phdr(phdr, phnum, entry, argv0)) == NULL) rtld_die(); } if (aux_info[AT_EXECPATH] != 0) { char *kexecpath; char buf[MAXPATHLEN]; kexecpath = aux_info[AT_EXECPATH]->a_un.a_ptr; dbg("AT_EXECPATH %p %s", kexecpath, kexecpath); if (kexecpath[0] == '/') obj_main->path = kexecpath; else if (getcwd(buf, sizeof(buf)) == NULL || strlcat(buf, "/", sizeof(buf)) >= sizeof(buf) || strlcat(buf, kexecpath, sizeof(buf)) >= sizeof(buf)) obj_main->path = xstrdup(argv0); else obj_main->path = xstrdup(buf); } else { dbg("No AT_EXECPATH"); obj_main->path = xstrdup(argv0); } dbg("obj_main path %s", obj_main->path); obj_main->mainprog = true; if (aux_info[AT_STACKPROT] != NULL && aux_info[AT_STACKPROT]->a_un.a_val != 0) stack_prot = aux_info[AT_STACKPROT]->a_un.a_val; #ifndef COMPAT_32BIT /* * Get the actual dynamic linker pathname from the executable if * possible. (It should always be possible.) That ensures that * gdb will find the right dynamic linker even if a non-standard * one is being used. */ if (obj_main->interp != NULL && strcmp(obj_main->interp, obj_rtld.path) != 0) { free(obj_rtld.path); obj_rtld.path = xstrdup(obj_main->interp); __progname = obj_rtld.path; } #endif digest_dynamic(obj_main, 0); dbg("%s valid_hash_sysv %d valid_hash_gnu %d dynsymcount %d", obj_main->path, obj_main->valid_hash_sysv, obj_main->valid_hash_gnu, obj_main->dynsymcount); linkmap_add(obj_main); linkmap_add(&obj_rtld); /* Link the main program into the list of objects. */ *obj_tail = obj_main; obj_tail = &obj_main->next; obj_count++; obj_loads++; /* Initialize a fake symbol for resolving undefined weak references. */ sym_zero.st_info = ELF_ST_INFO(STB_GLOBAL, STT_NOTYPE); sym_zero.st_shndx = SHN_UNDEF; sym_zero.st_value = -(uintptr_t)obj_main->relocbase; if (!libmap_disable) libmap_disable = (bool)lm_init(libmap_override); dbg("loading LD_PRELOAD libraries"); if (load_preload_objects() == -1) rtld_die(); preload_tail = obj_tail; dbg("loading needed objects"); if (load_needed_objects(obj_main, 0) == -1) rtld_die(); /* Make a list of all objects loaded at startup. */ last_interposer = obj_main; for (obj = obj_list; obj != NULL; obj = obj->next) { if (obj->z_interpose && obj != obj_main) { objlist_put_after(&list_main, last_interposer, obj); last_interposer = obj; } else { objlist_push_tail(&list_main, obj); } obj->refcount++; } dbg("checking for required versions"); if (rtld_verify_versions(&list_main) == -1 && !ld_tracing) rtld_die(); if (ld_tracing) { /* We're done */ trace_loaded_objects(obj_main); exit(0); } if (getenv(LD_ "DUMP_REL_PRE") != NULL) { dump_relocations(obj_main); exit (0); } /* * Processing tls relocations requires having the tls offsets * initialized. Prepare offsets before starting initial * relocation processing. */ dbg("initializing initial thread local storage offsets"); STAILQ_FOREACH(entry, &list_main, link) { /* * Allocate all the initial objects out of the static TLS * block even if they didn't ask for it. */ allocate_tls_offset(entry->obj); } if (relocate_objects(obj_main, ld_bind_now != NULL && *ld_bind_now != '\0', &obj_rtld, SYMLOOK_EARLY, NULL) == -1) rtld_die(); dbg("doing copy relocations"); if (do_copy_relocations(obj_main) == -1) rtld_die(); if (getenv(LD_ "DUMP_REL_POST") != NULL) { dump_relocations(obj_main); exit (0); } /* * Setup TLS for main thread. This must be done after the * relocations are processed, since tls initialization section * might be the subject for relocations. */ dbg("initializing initial thread local storage"); allocate_initial_tls(obj_list); dbg("initializing key program variables"); set_program_var("__progname", argv[0] != NULL ? basename(argv[0]) : ""); set_program_var("environ", env); set_program_var("__elf_aux_vector", aux); /* Make a list of init functions to call. */ objlist_init(&initlist); initlist_add_objects(obj_list, preload_tail, &initlist); r_debug_state(NULL, &obj_main->linkmap); /* say hello to gdb! */ map_stacks_exec(NULL); dbg("resolving ifuncs"); if (resolve_objects_ifunc(obj_main, ld_bind_now != NULL && *ld_bind_now != '\0', SYMLOOK_EARLY, NULL) == -1) rtld_die(); if (!obj_main->crt_no_init) { /* * Make sure we don't call the main program's init and fini * functions for binaries linked with old crt1 which calls * _init itself. */ obj_main->init = obj_main->fini = (Elf_Addr)NULL; obj_main->preinit_array = obj_main->init_array = obj_main->fini_array = (Elf_Addr)NULL; } wlock_acquire(rtld_bind_lock, &lockstate); if (obj_main->crt_no_init) preinit_main(); objlist_call_init(&initlist, &lockstate); _r_debug_postinit(&obj_main->linkmap); objlist_clear(&initlist); dbg("loading filtees"); for (obj = obj_list->next; obj != NULL; obj = obj->next) { if (ld_loadfltr || obj->z_loadfltr) load_filtees(obj, 0, &lockstate); } lock_release(rtld_bind_lock, &lockstate); dbg("transferring control to program entry point = %p", obj_main->entry); /* Return the exit procedure and the program entry point. */ *exit_proc = rtld_exit; *objp = obj_main; return (func_ptr_type) obj_main->entry; } void * rtld_resolve_ifunc(const Obj_Entry *obj, const Elf_Sym *def) { void *ptr; Elf_Addr target; ptr = (void *)make_function_pointer(def, obj); target = ((Elf_Addr (*)(void))ptr)(); return ((void *)target); } Elf_Addr _rtld_bind(Obj_Entry *obj, Elf_Size reloff) { const Elf_Rel *rel; const Elf_Sym *def; const Obj_Entry *defobj; Elf_Addr *where; Elf_Addr target; RtldLockState lockstate; rlock_acquire(rtld_bind_lock, &lockstate); if (sigsetjmp(lockstate.env, 0) != 0) lock_upgrade(rtld_bind_lock, &lockstate); if (obj->pltrel) rel = (const Elf_Rel *) ((caddr_t) obj->pltrel + reloff); else rel = (const Elf_Rel *) ((caddr_t) obj->pltrela + reloff); where = (Elf_Addr *) (obj->relocbase + rel->r_offset); def = find_symdef(ELF_R_SYM(rel->r_info), obj, &defobj, true, NULL, &lockstate); if (def == NULL) rtld_die(); if (ELF_ST_TYPE(def->st_info) == STT_GNU_IFUNC) target = (Elf_Addr)rtld_resolve_ifunc(defobj, def); else target = (Elf_Addr)(defobj->relocbase + def->st_value); dbg("\"%s\" in \"%s\" ==> %p in \"%s\"", defobj->strtab + def->st_name, basename(obj->path), (void *)target, basename(defobj->path)); /* * Write the new contents for the jmpslot. Note that depending on * architecture, the value which we need to return back to the * lazy binding trampoline may or may not be the target * address. The value returned from reloc_jmpslot() is the value * that the trampoline needs. */ target = reloc_jmpslot(where, target, defobj, obj, rel); lock_release(rtld_bind_lock, &lockstate); return target; } /* * Error reporting function. Use it like printf. If formats the message * into a buffer, and sets things up so that the next call to dlerror() * will return the message. */ void _rtld_error(const char *fmt, ...) { static char buf[512]; va_list ap; va_start(ap, fmt); rtld_vsnprintf(buf, sizeof buf, fmt, ap); error_message = buf; va_end(ap); } /* * Return a dynamically-allocated copy of the current error message, if any. */ static char * errmsg_save(void) { return error_message == NULL ? NULL : xstrdup(error_message); } /* * Restore the current error message from a copy which was previously saved * by errmsg_save(). The copy is freed. */ static void errmsg_restore(char *saved_msg) { if (saved_msg == NULL) error_message = NULL; else { _rtld_error("%s", saved_msg); free(saved_msg); } } static const char * basename(const char *name) { const char *p = strrchr(name, '/'); return p != NULL ? p + 1 : name; } static struct utsname uts; static char * origin_subst_one(char *real, const char *kw, const char *subst, bool may_free) { char *p, *p1, *res, *resp; int subst_len, kw_len, subst_count, old_len, new_len; kw_len = strlen(kw); /* * First, count the number of the keyword occurences, to * preallocate the final string. */ for (p = real, subst_count = 0;; p = p1 + kw_len, subst_count++) { p1 = strstr(p, kw); if (p1 == NULL) break; } /* * If the keyword is not found, just return. */ if (subst_count == 0) return (may_free ? real : xstrdup(real)); /* * There is indeed something to substitute. Calculate the * length of the resulting string, and allocate it. */ subst_len = strlen(subst); old_len = strlen(real); new_len = old_len + (subst_len - kw_len) * subst_count; res = xmalloc(new_len + 1); /* * Now, execute the substitution loop. */ for (p = real, resp = res, *resp = '\0';;) { p1 = strstr(p, kw); if (p1 != NULL) { /* Copy the prefix before keyword. */ memcpy(resp, p, p1 - p); resp += p1 - p; /* Keyword replacement. */ memcpy(resp, subst, subst_len); resp += subst_len; *resp = '\0'; p = p1 + kw_len; } else break; } /* Copy to the end of string and finish. */ strcat(resp, p); if (may_free) free(real); return (res); } static char * origin_subst(char *real, const char *origin_path) { char *res1, *res2, *res3, *res4; if (uts.sysname[0] == '\0') { if (uname(&uts) != 0) { _rtld_error("utsname failed: %d", errno); return (NULL); } } res1 = origin_subst_one(real, "$ORIGIN", origin_path, false); res2 = origin_subst_one(res1, "$OSNAME", uts.sysname, true); res3 = origin_subst_one(res2, "$OSREL", uts.release, true); res4 = origin_subst_one(res3, "$PLATFORM", uts.machine, true); return (res4); } void rtld_die(void) { const char *msg = dlerror(); if (msg == NULL) msg = "Fatal error"; rtld_fdputstr(STDERR_FILENO, msg); rtld_fdputchar(STDERR_FILENO, '\n'); _exit(1); } /* * Process a shared object's DYNAMIC section, and save the important * information in its Obj_Entry structure. */ static void digest_dynamic1(Obj_Entry *obj, int early, const Elf_Dyn **dyn_rpath, const Elf_Dyn **dyn_soname, const Elf_Dyn **dyn_runpath) { const Elf_Dyn *dynp; Needed_Entry **needed_tail = &obj->needed; Needed_Entry **needed_filtees_tail = &obj->needed_filtees; Needed_Entry **needed_aux_filtees_tail = &obj->needed_aux_filtees; const Elf_Hashelt *hashtab; const Elf32_Word *hashval; Elf32_Word bkt, nmaskwords; int bloom_size32; int plttype = DT_REL; *dyn_rpath = NULL; *dyn_soname = NULL; *dyn_runpath = NULL; obj->bind_now = false; for (dynp = obj->dynamic; dynp->d_tag != DT_NULL; dynp++) { switch (dynp->d_tag) { case DT_REL: obj->rel = (const Elf_Rel *) (obj->relocbase + dynp->d_un.d_ptr); break; case DT_RELSZ: obj->relsize = dynp->d_un.d_val; break; case DT_RELENT: assert(dynp->d_un.d_val == sizeof(Elf_Rel)); break; case DT_JMPREL: obj->pltrel = (const Elf_Rel *) (obj->relocbase + dynp->d_un.d_ptr); break; case DT_PLTRELSZ: obj->pltrelsize = dynp->d_un.d_val; break; case DT_RELA: obj->rela = (const Elf_Rela *) (obj->relocbase + dynp->d_un.d_ptr); break; case DT_RELASZ: obj->relasize = dynp->d_un.d_val; break; case DT_RELAENT: assert(dynp->d_un.d_val == sizeof(Elf_Rela)); break; case DT_PLTREL: plttype = dynp->d_un.d_val; assert(dynp->d_un.d_val == DT_REL || plttype == DT_RELA); break; case DT_SYMTAB: obj->symtab = (const Elf_Sym *) (obj->relocbase + dynp->d_un.d_ptr); break; case DT_SYMENT: assert(dynp->d_un.d_val == sizeof(Elf_Sym)); break; case DT_STRTAB: obj->strtab = (const char *) (obj->relocbase + dynp->d_un.d_ptr); break; case DT_STRSZ: obj->strsize = dynp->d_un.d_val; break; case DT_VERNEED: obj->verneed = (const Elf_Verneed *) (obj->relocbase + dynp->d_un.d_val); break; case DT_VERNEEDNUM: obj->verneednum = dynp->d_un.d_val; break; case DT_VERDEF: obj->verdef = (const Elf_Verdef *) (obj->relocbase + dynp->d_un.d_val); break; case DT_VERDEFNUM: obj->verdefnum = dynp->d_un.d_val; break; case DT_VERSYM: obj->versyms = (const Elf_Versym *)(obj->relocbase + dynp->d_un.d_val); break; case DT_HASH: { hashtab = (const Elf_Hashelt *)(obj->relocbase + dynp->d_un.d_ptr); obj->nbuckets = hashtab[0]; obj->nchains = hashtab[1]; obj->buckets = hashtab + 2; obj->chains = obj->buckets + obj->nbuckets; obj->valid_hash_sysv = obj->nbuckets > 0 && obj->nchains > 0 && obj->buckets != NULL; } break; case DT_GNU_HASH: { hashtab = (const Elf_Hashelt *)(obj->relocbase + dynp->d_un.d_ptr); obj->nbuckets_gnu = hashtab[0]; obj->symndx_gnu = hashtab[1]; nmaskwords = hashtab[2]; bloom_size32 = (__ELF_WORD_SIZE / 32) * nmaskwords; obj->maskwords_bm_gnu = nmaskwords - 1; obj->shift2_gnu = hashtab[3]; obj->bloom_gnu = (Elf_Addr *) (hashtab + 4); obj->buckets_gnu = hashtab + 4 + bloom_size32; obj->chain_zero_gnu = obj->buckets_gnu + obj->nbuckets_gnu - obj->symndx_gnu; /* Number of bitmask words is required to be power of 2 */ obj->valid_hash_gnu = powerof2(nmaskwords) && obj->nbuckets_gnu > 0 && obj->buckets_gnu != NULL; } break; case DT_NEEDED: if (!obj->rtld) { Needed_Entry *nep = NEW(Needed_Entry); nep->name = dynp->d_un.d_val; nep->obj = NULL; nep->next = NULL; *needed_tail = nep; needed_tail = &nep->next; } break; case DT_FILTER: if (!obj->rtld) { Needed_Entry *nep = NEW(Needed_Entry); nep->name = dynp->d_un.d_val; nep->obj = NULL; nep->next = NULL; *needed_filtees_tail = nep; needed_filtees_tail = &nep->next; } break; case DT_AUXILIARY: if (!obj->rtld) { Needed_Entry *nep = NEW(Needed_Entry); nep->name = dynp->d_un.d_val; nep->obj = NULL; nep->next = NULL; *needed_aux_filtees_tail = nep; needed_aux_filtees_tail = &nep->next; } break; case DT_PLTGOT: obj->pltgot = (Elf_Addr *) (obj->relocbase + dynp->d_un.d_ptr); break; case DT_TEXTREL: obj->textrel = true; break; case DT_SYMBOLIC: obj->symbolic = true; break; case DT_RPATH: /* * We have to wait until later to process this, because we * might not have gotten the address of the string table yet. */ *dyn_rpath = dynp; break; case DT_SONAME: *dyn_soname = dynp; break; case DT_RUNPATH: *dyn_runpath = dynp; break; case DT_INIT: obj->init = (Elf_Addr) (obj->relocbase + dynp->d_un.d_ptr); break; case DT_PREINIT_ARRAY: obj->preinit_array = (Elf_Addr)(obj->relocbase + dynp->d_un.d_ptr); break; case DT_PREINIT_ARRAYSZ: obj->preinit_array_num = dynp->d_un.d_val / sizeof(Elf_Addr); break; case DT_INIT_ARRAY: obj->init_array = (Elf_Addr)(obj->relocbase + dynp->d_un.d_ptr); break; case DT_INIT_ARRAYSZ: obj->init_array_num = dynp->d_un.d_val / sizeof(Elf_Addr); break; case DT_FINI: obj->fini = (Elf_Addr) (obj->relocbase + dynp->d_un.d_ptr); break; case DT_FINI_ARRAY: obj->fini_array = (Elf_Addr)(obj->relocbase + dynp->d_un.d_ptr); break; case DT_FINI_ARRAYSZ: obj->fini_array_num = dynp->d_un.d_val / sizeof(Elf_Addr); break; /* * Don't process DT_DEBUG on MIPS as the dynamic section * is mapped read-only. DT_MIPS_RLD_MAP is used instead. */ #ifndef __mips__ case DT_DEBUG: /* XXX - not implemented yet */ if (!early) dbg("Filling in DT_DEBUG entry"); ((Elf_Dyn*)dynp)->d_un.d_ptr = (Elf_Addr) &r_debug; break; #endif case DT_FLAGS: if ((dynp->d_un.d_val & DF_ORIGIN) && trust) obj->z_origin = true; if (dynp->d_un.d_val & DF_SYMBOLIC) obj->symbolic = true; if (dynp->d_un.d_val & DF_TEXTREL) obj->textrel = true; if (dynp->d_un.d_val & DF_BIND_NOW) obj->bind_now = true; /*if (dynp->d_un.d_val & DF_STATIC_TLS) ;*/ break; #ifdef __mips__ case DT_MIPS_LOCAL_GOTNO: obj->local_gotno = dynp->d_un.d_val; break; case DT_MIPS_SYMTABNO: obj->symtabno = dynp->d_un.d_val; break; case DT_MIPS_GOTSYM: obj->gotsym = dynp->d_un.d_val; break; case DT_MIPS_RLD_MAP: *((Elf_Addr *)(dynp->d_un.d_ptr)) = (Elf_Addr) &r_debug; break; #endif case DT_FLAGS_1: if (dynp->d_un.d_val & DF_1_NOOPEN) obj->z_noopen = true; if ((dynp->d_un.d_val & DF_1_ORIGIN) && trust) obj->z_origin = true; - /*if (dynp->d_un.d_val & DF_1_GLOBAL) - XXX ;*/ + if (dynp->d_un.d_val & DF_1_GLOBAL) + obj->z_global = true; if (dynp->d_un.d_val & DF_1_BIND_NOW) obj->bind_now = true; if (dynp->d_un.d_val & DF_1_NODELETE) obj->z_nodelete = true; if (dynp->d_un.d_val & DF_1_LOADFLTR) obj->z_loadfltr = true; if (dynp->d_un.d_val & DF_1_INTERPOSE) obj->z_interpose = true; if (dynp->d_un.d_val & DF_1_NODEFLIB) obj->z_nodeflib = true; break; default: if (!early) { dbg("Ignoring d_tag %ld = %#lx", (long)dynp->d_tag, (long)dynp->d_tag); } break; } } obj->traced = false; if (plttype == DT_RELA) { obj->pltrela = (const Elf_Rela *) obj->pltrel; obj->pltrel = NULL; obj->pltrelasize = obj->pltrelsize; obj->pltrelsize = 0; } /* Determine size of dynsym table (equal to nchains of sysv hash) */ if (obj->valid_hash_sysv) obj->dynsymcount = obj->nchains; else if (obj->valid_hash_gnu) { obj->dynsymcount = 0; for (bkt = 0; bkt < obj->nbuckets_gnu; bkt++) { if (obj->buckets_gnu[bkt] == 0) continue; hashval = &obj->chain_zero_gnu[obj->buckets_gnu[bkt]]; do obj->dynsymcount++; while ((*hashval++ & 1u) == 0); } obj->dynsymcount += obj->symndx_gnu; } } static void digest_dynamic2(Obj_Entry *obj, const Elf_Dyn *dyn_rpath, const Elf_Dyn *dyn_soname, const Elf_Dyn *dyn_runpath) { if (obj->z_origin && obj->origin_path == NULL) { obj->origin_path = xmalloc(PATH_MAX); if (rtld_dirname_abs(obj->path, obj->origin_path) == -1) rtld_die(); } if (dyn_runpath != NULL) { obj->runpath = (char *)obj->strtab + dyn_runpath->d_un.d_val; if (obj->z_origin) obj->runpath = origin_subst(obj->runpath, obj->origin_path); } else if (dyn_rpath != NULL) { obj->rpath = (char *)obj->strtab + dyn_rpath->d_un.d_val; if (obj->z_origin) obj->rpath = origin_subst(obj->rpath, obj->origin_path); } if (dyn_soname != NULL) object_add_name(obj, obj->strtab + dyn_soname->d_un.d_val); } static void digest_dynamic(Obj_Entry *obj, int early) { const Elf_Dyn *dyn_rpath; const Elf_Dyn *dyn_soname; const Elf_Dyn *dyn_runpath; digest_dynamic1(obj, early, &dyn_rpath, &dyn_soname, &dyn_runpath); digest_dynamic2(obj, dyn_rpath, dyn_soname, dyn_runpath); } /* * Process a shared object's program header. This is used only for the * main program, when the kernel has already loaded the main program * into memory before calling the dynamic linker. It creates and * returns an Obj_Entry structure. */ static Obj_Entry * digest_phdr(const Elf_Phdr *phdr, int phnum, caddr_t entry, const char *path) { Obj_Entry *obj; const Elf_Phdr *phlimit = phdr + phnum; const Elf_Phdr *ph; Elf_Addr note_start, note_end; int nsegs = 0; obj = obj_new(); for (ph = phdr; ph < phlimit; ph++) { if (ph->p_type != PT_PHDR) continue; obj->phdr = phdr; obj->phsize = ph->p_memsz; obj->relocbase = (caddr_t)phdr - ph->p_vaddr; break; } obj->stack_flags = PF_X | PF_R | PF_W; for (ph = phdr; ph < phlimit; ph++) { switch (ph->p_type) { case PT_INTERP: obj->interp = (const char *)(ph->p_vaddr + obj->relocbase); break; case PT_LOAD: if (nsegs == 0) { /* First load segment */ obj->vaddrbase = trunc_page(ph->p_vaddr); obj->mapbase = obj->vaddrbase + obj->relocbase; obj->textsize = round_page(ph->p_vaddr + ph->p_memsz) - obj->vaddrbase; } else { /* Last load segment */ obj->mapsize = round_page(ph->p_vaddr + ph->p_memsz) - obj->vaddrbase; } nsegs++; break; case PT_DYNAMIC: obj->dynamic = (const Elf_Dyn *)(ph->p_vaddr + obj->relocbase); break; case PT_TLS: obj->tlsindex = 1; obj->tlssize = ph->p_memsz; obj->tlsalign = ph->p_align; obj->tlsinitsize = ph->p_filesz; obj->tlsinit = (void*)(ph->p_vaddr + obj->relocbase); break; case PT_GNU_STACK: obj->stack_flags = ph->p_flags; break; case PT_GNU_RELRO: obj->relro_page = obj->relocbase + trunc_page(ph->p_vaddr); obj->relro_size = round_page(ph->p_memsz); break; case PT_NOTE: note_start = (Elf_Addr)obj->relocbase + ph->p_vaddr; note_end = note_start + ph->p_filesz; digest_notes(obj, note_start, note_end); break; } } if (nsegs < 1) { _rtld_error("%s: too few PT_LOAD segments", path); return NULL; } obj->entry = entry; return obj; } void digest_notes(Obj_Entry *obj, Elf_Addr note_start, Elf_Addr note_end) { const Elf_Note *note; const char *note_name; uintptr_t p; for (note = (const Elf_Note *)note_start; (Elf_Addr)note < note_end; note = (const Elf_Note *)((const char *)(note + 1) + roundup2(note->n_namesz, sizeof(Elf32_Addr)) + roundup2(note->n_descsz, sizeof(Elf32_Addr)))) { if (note->n_namesz != sizeof(NOTE_FREEBSD_VENDOR) || note->n_descsz != sizeof(int32_t)) continue; if (note->n_type != ABI_NOTETYPE && note->n_type != CRT_NOINIT_NOTETYPE) continue; note_name = (const char *)(note + 1); if (strncmp(NOTE_FREEBSD_VENDOR, note_name, sizeof(NOTE_FREEBSD_VENDOR)) != 0) continue; switch (note->n_type) { case ABI_NOTETYPE: /* FreeBSD osrel note */ p = (uintptr_t)(note + 1); p += roundup2(note->n_namesz, sizeof(Elf32_Addr)); obj->osrel = *(const int32_t *)(p); dbg("note osrel %d", obj->osrel); break; case CRT_NOINIT_NOTETYPE: /* FreeBSD 'crt does not call init' note */ obj->crt_no_init = true; dbg("note crt_no_init"); break; } } } static Obj_Entry * dlcheck(void *handle) { Obj_Entry *obj; for (obj = obj_list; obj != NULL; obj = obj->next) if (obj == (Obj_Entry *) handle) break; if (obj == NULL || obj->refcount == 0 || obj->dl_refcount == 0) { _rtld_error("Invalid shared object handle %p", handle); return NULL; } return obj; } /* * If the given object is already in the donelist, return true. Otherwise * add the object to the list and return false. */ static bool donelist_check(DoneList *dlp, const Obj_Entry *obj) { unsigned int i; for (i = 0; i < dlp->num_used; i++) if (dlp->objs[i] == obj) return true; /* * Our donelist allocation should always be sufficient. But if * our threads locking isn't working properly, more shared objects * could have been loaded since we allocated the list. That should * never happen, but we'll handle it properly just in case it does. */ if (dlp->num_used < dlp->num_alloc) dlp->objs[dlp->num_used++] = obj; return false; } /* * Hash function for symbol table lookup. Don't even think about changing * this. It is specified by the System V ABI. */ unsigned long elf_hash(const char *name) { const unsigned char *p = (const unsigned char *) name; unsigned long h = 0; unsigned long g; while (*p != '\0') { h = (h << 4) + *p++; if ((g = h & 0xf0000000) != 0) h ^= g >> 24; h &= ~g; } return h; } /* * The GNU hash function is the Daniel J. Bernstein hash clipped to 32 bits * unsigned in case it's implemented with a wider type. */ static uint32_t gnu_hash(const char *s) { uint32_t h; unsigned char c; h = 5381; for (c = *s; c != '\0'; c = *++s) h = h * 33 + c; return (h & 0xffffffff); } /* * Find the library with the given name, and return its full pathname. * The returned string is dynamically allocated. Generates an error * message and returns NULL if the library cannot be found. * * If the second argument is non-NULL, then it refers to an already- * loaded shared object, whose library search path will be searched. * * If a library is successfully located via LD_LIBRARY_PATH_FDS, its * descriptor (which is close-on-exec) will be passed out via the third * argument. * * The search order is: * DT_RPATH in the referencing file _unless_ DT_RUNPATH is present (1) * DT_RPATH of the main object if DSO without defined DT_RUNPATH (1) * LD_LIBRARY_PATH * DT_RUNPATH in the referencing file * ldconfig hints (if -z nodefaultlib, filter out default library directories * from list) * /lib:/usr/lib _unless_ the referencing file is linked with -z nodefaultlib * * (1) Handled in digest_dynamic2 - rpath left NULL if runpath defined. */ static char * find_library(const char *xname, const Obj_Entry *refobj, int *fdp) { char *pathname; char *name; bool nodeflib, objgiven; objgiven = refobj != NULL; if (strchr(xname, '/') != NULL) { /* Hard coded pathname */ if (xname[0] != '/' && !trust) { _rtld_error("Absolute pathname required for shared object \"%s\"", xname); return NULL; } if (objgiven && refobj->z_origin) { return (origin_subst(__DECONST(char *, xname), refobj->origin_path)); } else { return (xstrdup(xname)); } } if (libmap_disable || !objgiven || (name = lm_find(refobj->path, xname)) == NULL) name = (char *)xname; dbg(" Searching for \"%s\"", name); /* * If refobj->rpath != NULL, then refobj->runpath is NULL. Fall * back to pre-conforming behaviour if user requested so with * LD_LIBRARY_PATH_RPATH environment variable and ignore -z * nodeflib. */ if (objgiven && refobj->rpath != NULL && ld_library_path_rpath) { if ((pathname = search_library_path(name, ld_library_path)) != NULL || (refobj != NULL && (pathname = search_library_path(name, refobj->rpath)) != NULL) || (pathname = search_library_pathfds(name, ld_library_dirs, fdp)) != NULL || (pathname = search_library_path(name, gethints(false))) != NULL || (pathname = search_library_path(name, STANDARD_LIBRARY_PATH)) != NULL) return (pathname); } else { nodeflib = objgiven ? refobj->z_nodeflib : false; if ((objgiven && (pathname = search_library_path(name, refobj->rpath)) != NULL) || (objgiven && refobj->runpath == NULL && refobj != obj_main && (pathname = search_library_path(name, obj_main->rpath)) != NULL) || (pathname = search_library_path(name, ld_library_path)) != NULL || (objgiven && (pathname = search_library_path(name, refobj->runpath)) != NULL) || (pathname = search_library_pathfds(name, ld_library_dirs, fdp)) != NULL || (pathname = search_library_path(name, gethints(nodeflib))) != NULL || (objgiven && !nodeflib && (pathname = search_library_path(name, STANDARD_LIBRARY_PATH)) != NULL)) return (pathname); } if (objgiven && refobj->path != NULL) { _rtld_error("Shared object \"%s\" not found, required by \"%s\"", name, basename(refobj->path)); } else { _rtld_error("Shared object \"%s\" not found", name); } return NULL; } /* * Given a symbol number in a referencing object, find the corresponding * definition of the symbol. Returns a pointer to the symbol, or NULL if * no definition was found. Returns a pointer to the Obj_Entry of the * defining object via the reference parameter DEFOBJ_OUT. */ const Elf_Sym * find_symdef(unsigned long symnum, const Obj_Entry *refobj, const Obj_Entry **defobj_out, int flags, SymCache *cache, RtldLockState *lockstate) { const Elf_Sym *ref; const Elf_Sym *def; const Obj_Entry *defobj; SymLook req; const char *name; int res; /* * If we have already found this symbol, get the information from * the cache. */ if (symnum >= refobj->dynsymcount) return NULL; /* Bad object */ if (cache != NULL && cache[symnum].sym != NULL) { *defobj_out = cache[symnum].obj; return cache[symnum].sym; } ref = refobj->symtab + symnum; name = refobj->strtab + ref->st_name; def = NULL; defobj = NULL; /* * We don't have to do a full scale lookup if the symbol is local. * We know it will bind to the instance in this load module; to * which we already have a pointer (ie ref). By not doing a lookup, * we not only improve performance, but it also avoids unresolvable * symbols when local symbols are not in the hash table. This has * been seen with the ia64 toolchain. */ if (ELF_ST_BIND(ref->st_info) != STB_LOCAL) { if (ELF_ST_TYPE(ref->st_info) == STT_SECTION) { _rtld_error("%s: Bogus symbol table entry %lu", refobj->path, symnum); } symlook_init(&req, name); req.flags = flags; req.ventry = fetch_ventry(refobj, symnum); req.lockstate = lockstate; res = symlook_default(&req, refobj); if (res == 0) { def = req.sym_out; defobj = req.defobj_out; } } else { def = ref; defobj = refobj; } /* * If we found no definition and the reference is weak, treat the * symbol as having the value zero. */ if (def == NULL && ELF_ST_BIND(ref->st_info) == STB_WEAK) { def = &sym_zero; defobj = obj_main; } if (def != NULL) { *defobj_out = defobj; /* Record the information in the cache to avoid subsequent lookups. */ if (cache != NULL) { cache[symnum].sym = def; cache[symnum].obj = defobj; } } else { if (refobj != &obj_rtld) _rtld_error("%s: Undefined symbol \"%s\"", refobj->path, name); } return def; } /* * Return the search path from the ldconfig hints file, reading it if * necessary. If nostdlib is true, then the default search paths are * not added to result. * * Returns NULL if there are problems with the hints file, * or if the search path there is empty. */ static const char * gethints(bool nostdlib) { static char *hints, *filtered_path; struct elfhints_hdr hdr; struct fill_search_info_args sargs, hargs; struct dl_serinfo smeta, hmeta, *SLPinfo, *hintinfo; struct dl_serpath *SLPpath, *hintpath; char *p; unsigned int SLPndx, hintndx, fndx, fcount; int fd; size_t flen; bool skip; /* First call, read the hints file */ if (hints == NULL) { /* Keep from trying again in case the hints file is bad. */ hints = ""; if ((fd = open(ld_elf_hints_path, O_RDONLY | O_CLOEXEC)) == -1) return (NULL); if (read(fd, &hdr, sizeof hdr) != sizeof hdr || hdr.magic != ELFHINTS_MAGIC || hdr.version != 1) { close(fd); return (NULL); } p = xmalloc(hdr.dirlistlen + 1); if (lseek(fd, hdr.strtab + hdr.dirlist, SEEK_SET) == -1 || read(fd, p, hdr.dirlistlen + 1) != (ssize_t)hdr.dirlistlen + 1) { free(p); close(fd); return (NULL); } hints = p; close(fd); } /* * If caller agreed to receive list which includes the default * paths, we are done. Otherwise, if we still did not * calculated filtered result, do it now. */ if (!nostdlib) return (hints[0] != '\0' ? hints : NULL); if (filtered_path != NULL) goto filt_ret; /* * Obtain the list of all configured search paths, and the * list of the default paths. * * First estimate the size of the results. */ smeta.dls_size = __offsetof(struct dl_serinfo, dls_serpath); smeta.dls_cnt = 0; hmeta.dls_size = __offsetof(struct dl_serinfo, dls_serpath); hmeta.dls_cnt = 0; sargs.request = RTLD_DI_SERINFOSIZE; sargs.serinfo = &smeta; hargs.request = RTLD_DI_SERINFOSIZE; hargs.serinfo = &hmeta; path_enumerate(STANDARD_LIBRARY_PATH, fill_search_info, &sargs); path_enumerate(p, fill_search_info, &hargs); SLPinfo = xmalloc(smeta.dls_size); hintinfo = xmalloc(hmeta.dls_size); /* * Next fetch both sets of paths. */ sargs.request = RTLD_DI_SERINFO; sargs.serinfo = SLPinfo; sargs.serpath = &SLPinfo->dls_serpath[0]; sargs.strspace = (char *)&SLPinfo->dls_serpath[smeta.dls_cnt]; hargs.request = RTLD_DI_SERINFO; hargs.serinfo = hintinfo; hargs.serpath = &hintinfo->dls_serpath[0]; hargs.strspace = (char *)&hintinfo->dls_serpath[hmeta.dls_cnt]; path_enumerate(STANDARD_LIBRARY_PATH, fill_search_info, &sargs); path_enumerate(p, fill_search_info, &hargs); /* * Now calculate the difference between two sets, by excluding * standard paths from the full set. */ fndx = 0; fcount = 0; filtered_path = xmalloc(hdr.dirlistlen + 1); hintpath = &hintinfo->dls_serpath[0]; for (hintndx = 0; hintndx < hmeta.dls_cnt; hintndx++, hintpath++) { skip = false; SLPpath = &SLPinfo->dls_serpath[0]; /* * Check each standard path against current. */ for (SLPndx = 0; SLPndx < smeta.dls_cnt; SLPndx++, SLPpath++) { /* matched, skip the path */ if (!strcmp(hintpath->dls_name, SLPpath->dls_name)) { skip = true; break; } } if (skip) continue; /* * Not matched against any standard path, add the path * to result. Separate consequtive paths with ':'. */ if (fcount > 0) { filtered_path[fndx] = ':'; fndx++; } fcount++; flen = strlen(hintpath->dls_name); strncpy((filtered_path + fndx), hintpath->dls_name, flen); fndx += flen; } filtered_path[fndx] = '\0'; free(SLPinfo); free(hintinfo); filt_ret: return (filtered_path[0] != '\0' ? filtered_path : NULL); } static void init_dag(Obj_Entry *root) { const Needed_Entry *needed; const Objlist_Entry *elm; DoneList donelist; if (root->dag_inited) return; donelist_init(&donelist); /* Root object belongs to own DAG. */ objlist_push_tail(&root->dldags, root); objlist_push_tail(&root->dagmembers, root); donelist_check(&donelist, root); /* * Add dependencies of root object to DAG in breadth order * by exploiting the fact that each new object get added * to the tail of the dagmembers list. */ STAILQ_FOREACH(elm, &root->dagmembers, link) { for (needed = elm->obj->needed; needed != NULL; needed = needed->next) { if (needed->obj == NULL || donelist_check(&donelist, needed->obj)) continue; objlist_push_tail(&needed->obj->dldags, root); objlist_push_tail(&root->dagmembers, needed->obj); } } root->dag_inited = true; } static void -process_nodelete(Obj_Entry *root) +process_z(Obj_Entry *root) { const Objlist_Entry *elm; + Obj_Entry *obj; /* - * Walk over object DAG and process every dependent object that - * is marked as DF_1_NODELETE. They need to grow their own DAG, - * which then should have its reference upped separately. + * Walk over object DAG and process every dependent object + * that is marked as DF_1_NODELETE or DF_1_GLOBAL. They need + * to grow their own DAG. + * + * For DF_1_GLOBAL, DAG is required for symbol lookups in + * symlook_global() to work. + * + * For DF_1_NODELETE, the DAG should have its reference upped. */ STAILQ_FOREACH(elm, &root->dagmembers, link) { - if (elm->obj != NULL && elm->obj->z_nodelete && - !elm->obj->ref_nodel) { - dbg("obj %s nodelete", elm->obj->path); - init_dag(elm->obj); - ref_dag(elm->obj); - elm->obj->ref_nodel = true; + obj = elm->obj; + if (obj == NULL) + continue; + if (obj->z_nodelete && !obj->ref_nodel) { + dbg("obj %s -z nodelete", obj->path); + init_dag(obj); + ref_dag(obj); + obj->ref_nodel = true; } + if (obj->z_global && objlist_find(&list_global, obj) == NULL) { + dbg("obj %s -z global", obj->path); + objlist_push_tail(&list_global, obj); + init_dag(obj); + } } } /* * Initialize the dynamic linker. The argument is the address at which * the dynamic linker has been mapped into memory. The primary task of * this function is to relocate the dynamic linker. */ static void init_rtld(caddr_t mapbase, Elf_Auxinfo **aux_info) { Obj_Entry objtmp; /* Temporary rtld object */ const Elf_Dyn *dyn_rpath; const Elf_Dyn *dyn_soname; const Elf_Dyn *dyn_runpath; #ifdef RTLD_INIT_PAGESIZES_EARLY /* The page size is required by the dynamic memory allocator. */ init_pagesizes(aux_info); #endif /* * Conjure up an Obj_Entry structure for the dynamic linker. * * The "path" member can't be initialized yet because string constants * cannot yet be accessed. Below we will set it correctly. */ memset(&objtmp, 0, sizeof(objtmp)); objtmp.path = NULL; objtmp.rtld = true; objtmp.mapbase = mapbase; #ifdef PIC objtmp.relocbase = mapbase; #endif if (RTLD_IS_DYNAMIC()) { objtmp.dynamic = rtld_dynamic(&objtmp); digest_dynamic1(&objtmp, 1, &dyn_rpath, &dyn_soname, &dyn_runpath); assert(objtmp.needed == NULL); #if !defined(__mips__) /* MIPS has a bogus DT_TEXTREL. */ assert(!objtmp.textrel); #endif /* * Temporarily put the dynamic linker entry into the object list, so * that symbols can be found. */ relocate_objects(&objtmp, true, &objtmp, 0, NULL); } /* Initialize the object list. */ obj_tail = &obj_list; /* Now that non-local variables can be accesses, copy out obj_rtld. */ memcpy(&obj_rtld, &objtmp, sizeof(obj_rtld)); #ifndef RTLD_INIT_PAGESIZES_EARLY /* The page size is required by the dynamic memory allocator. */ init_pagesizes(aux_info); #endif if (aux_info[AT_OSRELDATE] != NULL) osreldate = aux_info[AT_OSRELDATE]->a_un.a_val; digest_dynamic2(&obj_rtld, dyn_rpath, dyn_soname, dyn_runpath); /* Replace the path with a dynamically allocated copy. */ obj_rtld.path = xstrdup(PATH_RTLD); r_debug.r_brk = r_debug_state; r_debug.r_state = RT_CONSISTENT; } /* * Retrieve the array of supported page sizes. The kernel provides the page * sizes in increasing order. */ static void init_pagesizes(Elf_Auxinfo **aux_info) { static size_t psa[MAXPAGESIZES]; int mib[2]; size_t len, size; if (aux_info[AT_PAGESIZES] != NULL && aux_info[AT_PAGESIZESLEN] != NULL) { size = aux_info[AT_PAGESIZESLEN]->a_un.a_val; pagesizes = aux_info[AT_PAGESIZES]->a_un.a_ptr; } else { len = 2; if (sysctlnametomib("hw.pagesizes", mib, &len) == 0) size = sizeof(psa); else { /* As a fallback, retrieve the base page size. */ size = sizeof(psa[0]); if (aux_info[AT_PAGESZ] != NULL) { psa[0] = aux_info[AT_PAGESZ]->a_un.a_val; goto psa_filled; } else { mib[0] = CTL_HW; mib[1] = HW_PAGESIZE; len = 2; } } if (sysctl(mib, len, psa, &size, NULL, 0) == -1) { _rtld_error("sysctl for hw.pagesize(s) failed"); rtld_die(); } psa_filled: pagesizes = psa; } npagesizes = size / sizeof(pagesizes[0]); /* Discard any invalid entries at the end of the array. */ while (npagesizes > 0 && pagesizes[npagesizes - 1] == 0) npagesizes--; } /* * Add the init functions from a needed object list (and its recursive * needed objects) to "list". This is not used directly; it is a helper * function for initlist_add_objects(). The write lock must be held * when this function is called. */ static void initlist_add_neededs(Needed_Entry *needed, Objlist *list) { /* Recursively process the successor needed objects. */ if (needed->next != NULL) initlist_add_neededs(needed->next, list); /* Process the current needed object. */ if (needed->obj != NULL) initlist_add_objects(needed->obj, &needed->obj->next, list); } /* * Scan all of the DAGs rooted in the range of objects from "obj" to * "tail" and add their init functions to "list". This recurses over * the DAGs and ensure the proper init ordering such that each object's * needed libraries are initialized before the object itself. At the * same time, this function adds the objects to the global finalization * list "list_fini" in the opposite order. The write lock must be * held when this function is called. */ static void initlist_add_objects(Obj_Entry *obj, Obj_Entry **tail, Objlist *list) { if (obj->init_scanned || obj->init_done) return; obj->init_scanned = true; /* Recursively process the successor objects. */ if (&obj->next != tail) initlist_add_objects(obj->next, tail, list); /* Recursively process the needed objects. */ if (obj->needed != NULL) initlist_add_neededs(obj->needed, list); if (obj->needed_filtees != NULL) initlist_add_neededs(obj->needed_filtees, list); if (obj->needed_aux_filtees != NULL) initlist_add_neededs(obj->needed_aux_filtees, list); /* Add the object to the init list. */ if (obj->preinit_array != (Elf_Addr)NULL || obj->init != (Elf_Addr)NULL || obj->init_array != (Elf_Addr)NULL) objlist_push_tail(list, obj); /* Add the object to the global fini list in the reverse order. */ if ((obj->fini != (Elf_Addr)NULL || obj->fini_array != (Elf_Addr)NULL) && !obj->on_fini_list) { objlist_push_head(&list_fini, obj); obj->on_fini_list = true; } } #ifndef FPTR_TARGET #define FPTR_TARGET(f) ((Elf_Addr) (f)) #endif static void free_needed_filtees(Needed_Entry *n) { Needed_Entry *needed, *needed1; for (needed = n; needed != NULL; needed = needed->next) { if (needed->obj != NULL) { dlclose(needed->obj); needed->obj = NULL; } } for (needed = n; needed != NULL; needed = needed1) { needed1 = needed->next; free(needed); } } static void unload_filtees(Obj_Entry *obj) { free_needed_filtees(obj->needed_filtees); obj->needed_filtees = NULL; free_needed_filtees(obj->needed_aux_filtees); obj->needed_aux_filtees = NULL; obj->filtees_loaded = false; } static void load_filtee1(Obj_Entry *obj, Needed_Entry *needed, int flags, RtldLockState *lockstate) { for (; needed != NULL; needed = needed->next) { needed->obj = dlopen_object(obj->strtab + needed->name, -1, obj, flags, ((ld_loadfltr || obj->z_loadfltr) ? RTLD_NOW : RTLD_LAZY) | RTLD_LOCAL, lockstate); } } static void load_filtees(Obj_Entry *obj, int flags, RtldLockState *lockstate) { lock_restart_for_upgrade(lockstate); if (!obj->filtees_loaded) { load_filtee1(obj, obj->needed_filtees, flags, lockstate); load_filtee1(obj, obj->needed_aux_filtees, flags, lockstate); obj->filtees_loaded = true; } } static int process_needed(Obj_Entry *obj, Needed_Entry *needed, int flags) { Obj_Entry *obj1; for (; needed != NULL; needed = needed->next) { obj1 = needed->obj = load_object(obj->strtab + needed->name, -1, obj, flags & ~RTLD_LO_NOLOAD); if (obj1 == NULL && !ld_tracing && (flags & RTLD_LO_FILTEES) == 0) return (-1); } return (0); } /* * Given a shared object, traverse its list of needed objects, and load * each of them. Returns 0 on success. Generates an error message and * returns -1 on failure. */ static int load_needed_objects(Obj_Entry *first, int flags) { Obj_Entry *obj; for (obj = first; obj != NULL; obj = obj->next) { if (process_needed(obj, obj->needed, flags) == -1) return (-1); } return (0); } static int load_preload_objects(void) { char *p = ld_preload; Obj_Entry *obj; static const char delim[] = " \t:;"; if (p == NULL) return 0; p += strspn(p, delim); while (*p != '\0') { size_t len = strcspn(p, delim); char savech; savech = p[len]; p[len] = '\0'; obj = load_object(p, -1, NULL, 0); if (obj == NULL) return -1; /* XXX - cleanup */ obj->z_interpose = true; p[len] = savech; p += len; p += strspn(p, delim); } LD_UTRACE(UTRACE_PRELOAD_FINISHED, NULL, NULL, 0, 0, NULL); return 0; } static const char * printable_path(const char *path) { return (path == NULL ? "" : path); } /* * Load a shared object into memory, if it is not already loaded. The * object may be specified by name or by user-supplied file descriptor * fd_u. In the later case, the fd_u descriptor is not closed, but its * duplicate is. * * Returns a pointer to the Obj_Entry for the object. Returns NULL * on failure. */ static Obj_Entry * load_object(const char *name, int fd_u, const Obj_Entry *refobj, int flags) { Obj_Entry *obj; int fd; struct stat sb; char *path; fd = -1; if (name != NULL) { for (obj = obj_list->next; obj != NULL; obj = obj->next) { if (object_match_name(obj, name)) return (obj); } path = find_library(name, refobj, &fd); if (path == NULL) return (NULL); } else path = NULL; if (fd >= 0) { /* * search_library_pathfds() opens a fresh file descriptor for the * library, so there is no need to dup(). */ } else if (fd_u == -1) { /* * If we didn't find a match by pathname, or the name is not * supplied, open the file and check again by device and inode. * This avoids false mismatches caused by multiple links or ".." * in pathnames. * * To avoid a race, we open the file and use fstat() rather than * using stat(). */ if ((fd = open(path, O_RDONLY | O_CLOEXEC)) == -1) { _rtld_error("Cannot open \"%s\"", path); free(path); return (NULL); } } else { fd = fcntl(fd_u, F_DUPFD_CLOEXEC, 0); if (fd == -1) { _rtld_error("Cannot dup fd"); free(path); return (NULL); } } if (fstat(fd, &sb) == -1) { _rtld_error("Cannot fstat \"%s\"", printable_path(path)); close(fd); free(path); return NULL; } for (obj = obj_list->next; obj != NULL; obj = obj->next) if (obj->ino == sb.st_ino && obj->dev == sb.st_dev) break; if (obj != NULL && name != NULL) { object_add_name(obj, name); free(path); close(fd); return obj; } if (flags & RTLD_LO_NOLOAD) { free(path); close(fd); return (NULL); } /* First use of this object, so we must map it in */ obj = do_load_object(fd, name, path, &sb, flags); if (obj == NULL) free(path); close(fd); return obj; } static Obj_Entry * do_load_object(int fd, const char *name, char *path, struct stat *sbp, int flags) { Obj_Entry *obj; struct statfs fs; /* * but first, make sure that environment variables haven't been * used to circumvent the noexec flag on a filesystem. */ if (dangerous_ld_env) { if (fstatfs(fd, &fs) != 0) { _rtld_error("Cannot fstatfs \"%s\"", printable_path(path)); return NULL; } if (fs.f_flags & MNT_NOEXEC) { _rtld_error("Cannot execute objects on %s\n", fs.f_mntonname); return NULL; } } dbg("loading \"%s\"", printable_path(path)); obj = map_object(fd, printable_path(path), sbp); if (obj == NULL) return NULL; /* * If DT_SONAME is present in the object, digest_dynamic2 already * added it to the object names. */ if (name != NULL) object_add_name(obj, name); obj->path = path; digest_dynamic(obj, 0); dbg("%s valid_hash_sysv %d valid_hash_gnu %d dynsymcount %d", obj->path, obj->valid_hash_sysv, obj->valid_hash_gnu, obj->dynsymcount); if (obj->z_noopen && (flags & (RTLD_LO_DLOPEN | RTLD_LO_TRACE)) == RTLD_LO_DLOPEN) { dbg("refusing to load non-loadable \"%s\"", obj->path); _rtld_error("Cannot dlopen non-loadable %s", obj->path); munmap(obj->mapbase, obj->mapsize); obj_free(obj); return (NULL); } obj->dlopened = (flags & RTLD_LO_DLOPEN) != 0; *obj_tail = obj; obj_tail = &obj->next; obj_count++; obj_loads++; linkmap_add(obj); /* for GDB & dlinfo() */ max_stack_flags |= obj->stack_flags; dbg(" %p .. %p: %s", obj->mapbase, obj->mapbase + obj->mapsize - 1, obj->path); if (obj->textrel) dbg(" WARNING: %s has impure text", obj->path); LD_UTRACE(UTRACE_LOAD_OBJECT, obj, obj->mapbase, obj->mapsize, 0, obj->path); return obj; } static Obj_Entry * obj_from_addr(const void *addr) { Obj_Entry *obj; for (obj = obj_list; obj != NULL; obj = obj->next) { if (addr < (void *) obj->mapbase) continue; if (addr < (void *) (obj->mapbase + obj->mapsize)) return obj; } return NULL; } static void preinit_main(void) { Elf_Addr *preinit_addr; int index; preinit_addr = (Elf_Addr *)obj_main->preinit_array; if (preinit_addr == NULL) return; for (index = 0; index < obj_main->preinit_array_num; index++) { if (preinit_addr[index] != 0 && preinit_addr[index] != 1) { dbg("calling preinit function for %s at %p", obj_main->path, (void *)preinit_addr[index]); LD_UTRACE(UTRACE_INIT_CALL, obj_main, (void *)preinit_addr[index], 0, 0, obj_main->path); call_init_pointer(obj_main, preinit_addr[index]); } } } /* * Call the finalization functions for each of the objects in "list" * belonging to the DAG of "root" and referenced once. If NULL "root" * is specified, every finalization function will be called regardless * of the reference count and the list elements won't be freed. All of * the objects are expected to have non-NULL fini functions. */ static void objlist_call_fini(Objlist *list, Obj_Entry *root, RtldLockState *lockstate) { Objlist_Entry *elm; char *saved_msg; Elf_Addr *fini_addr; int index; assert(root == NULL || root->refcount == 1); /* * Preserve the current error message since a fini function might * call into the dynamic linker and overwrite it. */ saved_msg = errmsg_save(); do { STAILQ_FOREACH(elm, list, link) { if (root != NULL && (elm->obj->refcount != 1 || objlist_find(&root->dagmembers, elm->obj) == NULL)) continue; /* Remove object from fini list to prevent recursive invocation. */ STAILQ_REMOVE(list, elm, Struct_Objlist_Entry, link); /* * XXX: If a dlopen() call references an object while the * fini function is in progress, we might end up trying to * unload the referenced object in dlclose() or the object * won't be unloaded although its fini function has been * called. */ lock_release(rtld_bind_lock, lockstate); /* * It is legal to have both DT_FINI and DT_FINI_ARRAY defined. * When this happens, DT_FINI_ARRAY is processed first. */ fini_addr = (Elf_Addr *)elm->obj->fini_array; if (fini_addr != NULL && elm->obj->fini_array_num > 0) { for (index = elm->obj->fini_array_num - 1; index >= 0; index--) { if (fini_addr[index] != 0 && fini_addr[index] != 1) { dbg("calling fini function for %s at %p", elm->obj->path, (void *)fini_addr[index]); LD_UTRACE(UTRACE_FINI_CALL, elm->obj, (void *)fini_addr[index], 0, 0, elm->obj->path); call_initfini_pointer(elm->obj, fini_addr[index]); } } } if (elm->obj->fini != (Elf_Addr)NULL) { dbg("calling fini function for %s at %p", elm->obj->path, (void *)elm->obj->fini); LD_UTRACE(UTRACE_FINI_CALL, elm->obj, (void *)elm->obj->fini, 0, 0, elm->obj->path); call_initfini_pointer(elm->obj, elm->obj->fini); } wlock_acquire(rtld_bind_lock, lockstate); /* No need to free anything if process is going down. */ if (root != NULL) free(elm); /* * We must restart the list traversal after every fini call * because a dlclose() call from the fini function or from * another thread might have modified the reference counts. */ break; } } while (elm != NULL); errmsg_restore(saved_msg); } /* * Call the initialization functions for each of the objects in * "list". All of the objects are expected to have non-NULL init * functions. */ static void objlist_call_init(Objlist *list, RtldLockState *lockstate) { Objlist_Entry *elm; Obj_Entry *obj; char *saved_msg; Elf_Addr *init_addr; int index; /* * Clean init_scanned flag so that objects can be rechecked and * possibly initialized earlier if any of vectors called below * cause the change by using dlopen. */ for (obj = obj_list; obj != NULL; obj = obj->next) obj->init_scanned = false; /* * Preserve the current error message since an init function might * call into the dynamic linker and overwrite it. */ saved_msg = errmsg_save(); STAILQ_FOREACH(elm, list, link) { if (elm->obj->init_done) /* Initialized early. */ continue; /* * Race: other thread might try to use this object before current * one completes the initilization. Not much can be done here * without better locking. */ elm->obj->init_done = true; lock_release(rtld_bind_lock, lockstate); /* * It is legal to have both DT_INIT and DT_INIT_ARRAY defined. * When this happens, DT_INIT is processed first. */ if (elm->obj->init != (Elf_Addr)NULL) { dbg("calling init function for %s at %p", elm->obj->path, (void *)elm->obj->init); LD_UTRACE(UTRACE_INIT_CALL, elm->obj, (void *)elm->obj->init, 0, 0, elm->obj->path); call_initfini_pointer(elm->obj, elm->obj->init); } init_addr = (Elf_Addr *)elm->obj->init_array; if (init_addr != NULL) { for (index = 0; index < elm->obj->init_array_num; index++) { if (init_addr[index] != 0 && init_addr[index] != 1) { dbg("calling init function for %s at %p", elm->obj->path, (void *)init_addr[index]); LD_UTRACE(UTRACE_INIT_CALL, elm->obj, (void *)init_addr[index], 0, 0, elm->obj->path); call_init_pointer(elm->obj, init_addr[index]); } } } wlock_acquire(rtld_bind_lock, lockstate); } errmsg_restore(saved_msg); } static void objlist_clear(Objlist *list) { Objlist_Entry *elm; while (!STAILQ_EMPTY(list)) { elm = STAILQ_FIRST(list); STAILQ_REMOVE_HEAD(list, link); free(elm); } } static Objlist_Entry * objlist_find(Objlist *list, const Obj_Entry *obj) { Objlist_Entry *elm; STAILQ_FOREACH(elm, list, link) if (elm->obj == obj) return elm; return NULL; } static void objlist_init(Objlist *list) { STAILQ_INIT(list); } static void objlist_push_head(Objlist *list, Obj_Entry *obj) { Objlist_Entry *elm; elm = NEW(Objlist_Entry); elm->obj = obj; STAILQ_INSERT_HEAD(list, elm, link); } static void objlist_push_tail(Objlist *list, Obj_Entry *obj) { Objlist_Entry *elm; elm = NEW(Objlist_Entry); elm->obj = obj; STAILQ_INSERT_TAIL(list, elm, link); } static void objlist_put_after(Objlist *list, Obj_Entry *listobj, Obj_Entry *obj) { Objlist_Entry *elm, *listelm; STAILQ_FOREACH(listelm, list, link) { if (listelm->obj == listobj) break; } elm = NEW(Objlist_Entry); elm->obj = obj; if (listelm != NULL) STAILQ_INSERT_AFTER(list, listelm, elm, link); else STAILQ_INSERT_TAIL(list, elm, link); } static void objlist_remove(Objlist *list, Obj_Entry *obj) { Objlist_Entry *elm; if ((elm = objlist_find(list, obj)) != NULL) { STAILQ_REMOVE(list, elm, Struct_Objlist_Entry, link); free(elm); } } /* * Relocate dag rooted in the specified object. * Returns 0 on success, or -1 on failure. */ static int relocate_object_dag(Obj_Entry *root, bool bind_now, Obj_Entry *rtldobj, int flags, RtldLockState *lockstate) { Objlist_Entry *elm; int error; error = 0; STAILQ_FOREACH(elm, &root->dagmembers, link) { error = relocate_object(elm->obj, bind_now, rtldobj, flags, lockstate); if (error == -1) break; } return (error); } /* * Relocate single object. * Returns 0 on success, or -1 on failure. */ static int relocate_object(Obj_Entry *obj, bool bind_now, Obj_Entry *rtldobj, int flags, RtldLockState *lockstate) { if (obj->relocated) return (0); obj->relocated = true; if (obj != rtldobj) dbg("relocating \"%s\"", obj->path); if (obj->symtab == NULL || obj->strtab == NULL || !(obj->valid_hash_sysv || obj->valid_hash_gnu)) { _rtld_error("%s: Shared object has no run-time symbol table", obj->path); return (-1); } if (obj->textrel) { /* There are relocations to the write-protected text segment. */ if (mprotect(obj->mapbase, obj->textsize, PROT_READ|PROT_WRITE|PROT_EXEC) == -1) { _rtld_error("%s: Cannot write-enable text segment: %s", obj->path, rtld_strerror(errno)); return (-1); } } /* Process the non-PLT non-IFUNC relocations. */ if (reloc_non_plt(obj, rtldobj, flags, lockstate)) return (-1); if (obj->textrel) { /* Re-protected the text segment. */ if (mprotect(obj->mapbase, obj->textsize, PROT_READ|PROT_EXEC) == -1) { _rtld_error("%s: Cannot write-protect text segment: %s", obj->path, rtld_strerror(errno)); return (-1); } } /* Set the special PLT or GOT entries. */ init_pltgot(obj); /* Process the PLT relocations. */ if (reloc_plt(obj) == -1) return (-1); /* Relocate the jump slots if we are doing immediate binding. */ if (obj->bind_now || bind_now) if (reloc_jmpslots(obj, flags, lockstate) == -1) return (-1); /* * Process the non-PLT IFUNC relocations. The relocations are * processed in two phases, because IFUNC resolvers may * reference other symbols, which must be readily processed * before resolvers are called. */ if (obj->non_plt_gnu_ifunc && reloc_non_plt(obj, rtldobj, flags | SYMLOOK_IFUNC, lockstate)) return (-1); if (obj->relro_size > 0) { if (mprotect(obj->relro_page, obj->relro_size, PROT_READ) == -1) { _rtld_error("%s: Cannot enforce relro protection: %s", obj->path, rtld_strerror(errno)); return (-1); } } /* * Set up the magic number and version in the Obj_Entry. These * were checked in the crt1.o from the original ElfKit, so we * set them for backward compatibility. */ obj->magic = RTLD_MAGIC; obj->version = RTLD_VERSION; return (0); } /* * Relocate newly-loaded shared objects. The argument is a pointer to * the Obj_Entry for the first such object. All objects from the first * to the end of the list of objects are relocated. Returns 0 on success, * or -1 on failure. */ static int relocate_objects(Obj_Entry *first, bool bind_now, Obj_Entry *rtldobj, int flags, RtldLockState *lockstate) { Obj_Entry *obj; int error; for (error = 0, obj = first; obj != NULL; obj = obj->next) { error = relocate_object(obj, bind_now, rtldobj, flags, lockstate); if (error == -1) break; } return (error); } /* * The handling of R_MACHINE_IRELATIVE relocations and jumpslots * referencing STT_GNU_IFUNC symbols is postponed till the other * relocations are done. The indirect functions specified as * ifunc are allowed to call other symbols, so we need to have * objects relocated before asking for resolution from indirects. * * The R_MACHINE_IRELATIVE slots are resolved in greedy fashion, * instead of the usual lazy handling of PLT slots. It is * consistent with how GNU does it. */ static int resolve_object_ifunc(Obj_Entry *obj, bool bind_now, int flags, RtldLockState *lockstate) { if (obj->irelative && reloc_iresolve(obj, lockstate) == -1) return (-1); if ((obj->bind_now || bind_now) && obj->gnu_ifunc && reloc_gnu_ifunc(obj, flags, lockstate) == -1) return (-1); return (0); } static int resolve_objects_ifunc(Obj_Entry *first, bool bind_now, int flags, RtldLockState *lockstate) { Obj_Entry *obj; for (obj = first; obj != NULL; obj = obj->next) { if (resolve_object_ifunc(obj, bind_now, flags, lockstate) == -1) return (-1); } return (0); } static int initlist_objects_ifunc(Objlist *list, bool bind_now, int flags, RtldLockState *lockstate) { Objlist_Entry *elm; STAILQ_FOREACH(elm, list, link) { if (resolve_object_ifunc(elm->obj, bind_now, flags, lockstate) == -1) return (-1); } return (0); } /* * Cleanup procedure. It will be called (by the atexit mechanism) just * before the process exits. */ static void rtld_exit(void) { RtldLockState lockstate; wlock_acquire(rtld_bind_lock, &lockstate); dbg("rtld_exit()"); objlist_call_fini(&list_fini, NULL, &lockstate); /* No need to remove the items from the list, since we are exiting. */ if (!libmap_disable) lm_fini(); lock_release(rtld_bind_lock, &lockstate); } /* * Iterate over a search path, translate each element, and invoke the * callback on the result. */ static void * path_enumerate(const char *path, path_enum_proc callback, void *arg) { const char *trans; if (path == NULL) return (NULL); path += strspn(path, ":;"); while (*path != '\0') { size_t len; char *res; len = strcspn(path, ":;"); trans = lm_findn(NULL, path, len); if (trans) res = callback(trans, strlen(trans), arg); else res = callback(path, len, arg); if (res != NULL) return (res); path += len; path += strspn(path, ":;"); } return (NULL); } struct try_library_args { const char *name; size_t namelen; char *buffer; size_t buflen; }; static void * try_library_path(const char *dir, size_t dirlen, void *param) { struct try_library_args *arg; arg = param; if (*dir == '/' || trust) { char *pathname; if (dirlen + 1 + arg->namelen + 1 > arg->buflen) return (NULL); pathname = arg->buffer; strncpy(pathname, dir, dirlen); pathname[dirlen] = '/'; strcpy(pathname + dirlen + 1, arg->name); dbg(" Trying \"%s\"", pathname); if (access(pathname, F_OK) == 0) { /* We found it */ pathname = xmalloc(dirlen + 1 + arg->namelen + 1); strcpy(pathname, arg->buffer); return (pathname); } } return (NULL); } static char * search_library_path(const char *name, const char *path) { char *p; struct try_library_args arg; if (path == NULL) return NULL; arg.name = name; arg.namelen = strlen(name); arg.buffer = xmalloc(PATH_MAX); arg.buflen = PATH_MAX; p = path_enumerate(path, try_library_path, &arg); free(arg.buffer); return (p); } /* * Finds the library with the given name using the directory descriptors * listed in the LD_LIBRARY_PATH_FDS environment variable. * * Returns a freshly-opened close-on-exec file descriptor for the library, * or -1 if the library cannot be found. */ static char * search_library_pathfds(const char *name, const char *path, int *fdp) { char *envcopy, *fdstr, *found, *last_token; size_t len; int dirfd, fd; dbg("%s('%s', '%s', fdp)", __func__, name, path); /* Don't load from user-specified libdirs into setuid binaries. */ if (!trust) return (NULL); /* We can't do anything if LD_LIBRARY_PATH_FDS isn't set. */ if (path == NULL) return (NULL); /* LD_LIBRARY_PATH_FDS only works with relative paths. */ if (name[0] == '/') { dbg("Absolute path (%s) passed to %s", name, __func__); return (NULL); } /* * Use strtok_r() to walk the FD:FD:FD list. This requires a local * copy of the path, as strtok_r rewrites separator tokens * with '\0'. */ found = NULL; envcopy = xstrdup(path); for (fdstr = strtok_r(envcopy, ":", &last_token); fdstr != NULL; fdstr = strtok_r(NULL, ":", &last_token)) { dirfd = parse_libdir(fdstr); if (dirfd < 0) break; fd = __sys_openat(dirfd, name, O_RDONLY | O_CLOEXEC); if (fd >= 0) { *fdp = fd; len = strlen(fdstr) + strlen(name) + 3; found = xmalloc(len); if (rtld_snprintf(found, len, "#%d/%s", dirfd, name) < 0) { _rtld_error("error generating '%d/%s'", dirfd, name); rtld_die(); } dbg("open('%s') => %d", found, fd); break; } } free(envcopy); return (found); } int dlclose(void *handle) { Obj_Entry *root; RtldLockState lockstate; wlock_acquire(rtld_bind_lock, &lockstate); root = dlcheck(handle); if (root == NULL) { lock_release(rtld_bind_lock, &lockstate); return -1; } LD_UTRACE(UTRACE_DLCLOSE_START, handle, NULL, 0, root->dl_refcount, root->path); /* Unreference the object and its dependencies. */ root->dl_refcount--; if (root->refcount == 1) { /* * The object will be no longer referenced, so we must unload it. * First, call the fini functions. */ objlist_call_fini(&list_fini, root, &lockstate); unref_dag(root); /* Finish cleaning up the newly-unreferenced objects. */ GDB_STATE(RT_DELETE,&root->linkmap); unload_object(root); GDB_STATE(RT_CONSISTENT,NULL); } else unref_dag(root); LD_UTRACE(UTRACE_DLCLOSE_STOP, handle, NULL, 0, 0, NULL); lock_release(rtld_bind_lock, &lockstate); return 0; } char * dlerror(void) { char *msg = error_message; error_message = NULL; return msg; } /* * This function is deprecated and has no effect. */ void dllockinit(void *context, void *(*lock_create)(void *context), void (*rlock_acquire)(void *lock), void (*wlock_acquire)(void *lock), void (*lock_release)(void *lock), void (*lock_destroy)(void *lock), void (*context_destroy)(void *context)) { static void *cur_context; static void (*cur_context_destroy)(void *); /* Just destroy the context from the previous call, if necessary. */ if (cur_context_destroy != NULL) cur_context_destroy(cur_context); cur_context = context; cur_context_destroy = context_destroy; } void * dlopen(const char *name, int mode) { return (rtld_dlopen(name, -1, mode)); } void * fdlopen(int fd, int mode) { return (rtld_dlopen(NULL, fd, mode)); } static void * rtld_dlopen(const char *name, int fd, int mode) { RtldLockState lockstate; int lo_flags; LD_UTRACE(UTRACE_DLOPEN_START, NULL, NULL, 0, mode, name); ld_tracing = (mode & RTLD_TRACE) == 0 ? NULL : "1"; if (ld_tracing != NULL) { rlock_acquire(rtld_bind_lock, &lockstate); if (sigsetjmp(lockstate.env, 0) != 0) lock_upgrade(rtld_bind_lock, &lockstate); environ = (char **)*get_program_var_addr("environ", &lockstate); lock_release(rtld_bind_lock, &lockstate); } lo_flags = RTLD_LO_DLOPEN; if (mode & RTLD_NODELETE) lo_flags |= RTLD_LO_NODELETE; if (mode & RTLD_NOLOAD) lo_flags |= RTLD_LO_NOLOAD; if (ld_tracing != NULL) lo_flags |= RTLD_LO_TRACE; return (dlopen_object(name, fd, obj_main, lo_flags, mode & (RTLD_MODEMASK | RTLD_GLOBAL), NULL)); } static void dlopen_cleanup(Obj_Entry *obj) { obj->dl_refcount--; unref_dag(obj); if (obj->refcount == 0) unload_object(obj); } static Obj_Entry * dlopen_object(const char *name, int fd, Obj_Entry *refobj, int lo_flags, int mode, RtldLockState *lockstate) { Obj_Entry **old_obj_tail; Obj_Entry *obj; Objlist initlist; RtldLockState mlockstate; int result; objlist_init(&initlist); if (lockstate == NULL && !(lo_flags & RTLD_LO_EARLY)) { wlock_acquire(rtld_bind_lock, &mlockstate); lockstate = &mlockstate; } GDB_STATE(RT_ADD,NULL); old_obj_tail = obj_tail; obj = NULL; if (name == NULL && fd == -1) { obj = obj_main; obj->refcount++; } else { obj = load_object(name, fd, refobj, lo_flags); } if (obj) { obj->dl_refcount++; if (mode & RTLD_GLOBAL && objlist_find(&list_global, obj) == NULL) objlist_push_tail(&list_global, obj); if (*old_obj_tail != NULL) { /* We loaded something new. */ assert(*old_obj_tail == obj); result = load_needed_objects(obj, lo_flags & (RTLD_LO_DLOPEN | RTLD_LO_EARLY)); init_dag(obj); ref_dag(obj); if (result != -1) result = rtld_verify_versions(&obj->dagmembers); if (result != -1 && ld_tracing) goto trace; if (result == -1 || relocate_object_dag(obj, (mode & RTLD_MODEMASK) == RTLD_NOW, &obj_rtld, (lo_flags & RTLD_LO_EARLY) ? SYMLOOK_EARLY : 0, lockstate) == -1) { dlopen_cleanup(obj); obj = NULL; } else if (lo_flags & RTLD_LO_EARLY) { /* * Do not call the init functions for early loaded * filtees. The image is still not initialized enough * for them to work. * * Our object is found by the global object list and * will be ordered among all init calls done right * before transferring control to main. */ } else { /* Make list of init functions to call. */ initlist_add_objects(obj, &obj->next, &initlist); } /* - * Process all no_delete objects here, given them own - * DAGs to prevent their dependencies from being unloaded. - * This has to be done after we have loaded all of the - * dependencies, so that we do not miss any. + * Process all no_delete or global objects here, given + * them own DAGs to prevent their dependencies from being + * unloaded. This has to be done after we have loaded all + * of the dependencies, so that we do not miss any. */ if (obj != NULL) - process_nodelete(obj); + process_z(obj); } else { /* * Bump the reference counts for objects on this DAG. If * this is the first dlopen() call for the object that was * already loaded as a dependency, initialize the dag * starting at it. */ init_dag(obj); ref_dag(obj); if ((lo_flags & RTLD_LO_TRACE) != 0) goto trace; } if (obj != NULL && ((lo_flags & RTLD_LO_NODELETE) != 0 || obj->z_nodelete) && !obj->ref_nodel) { dbg("obj %s nodelete", obj->path); ref_dag(obj); obj->z_nodelete = obj->ref_nodel = true; } } LD_UTRACE(UTRACE_DLOPEN_STOP, obj, NULL, 0, obj ? obj->dl_refcount : 0, name); GDB_STATE(RT_CONSISTENT,obj ? &obj->linkmap : NULL); if (!(lo_flags & RTLD_LO_EARLY)) { map_stacks_exec(lockstate); } if (initlist_objects_ifunc(&initlist, (mode & RTLD_MODEMASK) == RTLD_NOW, (lo_flags & RTLD_LO_EARLY) ? SYMLOOK_EARLY : 0, lockstate) == -1) { objlist_clear(&initlist); dlopen_cleanup(obj); if (lockstate == &mlockstate) lock_release(rtld_bind_lock, lockstate); return (NULL); } if (!(lo_flags & RTLD_LO_EARLY)) { /* Call the init functions. */ objlist_call_init(&initlist, lockstate); } objlist_clear(&initlist); if (lockstate == &mlockstate) lock_release(rtld_bind_lock, lockstate); return obj; trace: trace_loaded_objects(obj); if (lockstate == &mlockstate) lock_release(rtld_bind_lock, lockstate); exit(0); } static void * do_dlsym(void *handle, const char *name, void *retaddr, const Ver_Entry *ve, int flags) { DoneList donelist; const Obj_Entry *obj, *defobj; const Elf_Sym *def; SymLook req; RtldLockState lockstate; tls_index ti; void *sym; int res; def = NULL; defobj = NULL; symlook_init(&req, name); req.ventry = ve; req.flags = flags | SYMLOOK_IN_PLT; req.lockstate = &lockstate; LD_UTRACE(UTRACE_DLSYM_START, handle, NULL, 0, 0, name); rlock_acquire(rtld_bind_lock, &lockstate); if (sigsetjmp(lockstate.env, 0) != 0) lock_upgrade(rtld_bind_lock, &lockstate); if (handle == NULL || handle == RTLD_NEXT || handle == RTLD_DEFAULT || handle == RTLD_SELF) { if ((obj = obj_from_addr(retaddr)) == NULL) { _rtld_error("Cannot determine caller's shared object"); lock_release(rtld_bind_lock, &lockstate); LD_UTRACE(UTRACE_DLSYM_STOP, handle, NULL, 0, 0, name); return NULL; } if (handle == NULL) { /* Just the caller's shared object. */ res = symlook_obj(&req, obj); if (res == 0) { def = req.sym_out; defobj = req.defobj_out; } } else if (handle == RTLD_NEXT || /* Objects after caller's */ handle == RTLD_SELF) { /* ... caller included */ if (handle == RTLD_NEXT) obj = obj->next; for (; obj != NULL; obj = obj->next) { res = symlook_obj(&req, obj); if (res == 0) { if (def == NULL || ELF_ST_BIND(req.sym_out->st_info) != STB_WEAK) { def = req.sym_out; defobj = req.defobj_out; if (ELF_ST_BIND(def->st_info) != STB_WEAK) break; } } } /* * Search the dynamic linker itself, and possibly resolve the * symbol from there. This is how the application links to * dynamic linker services such as dlopen. */ if (def == NULL || ELF_ST_BIND(def->st_info) == STB_WEAK) { res = symlook_obj(&req, &obj_rtld); if (res == 0) { def = req.sym_out; defobj = req.defobj_out; } } } else { assert(handle == RTLD_DEFAULT); res = symlook_default(&req, obj); if (res == 0) { defobj = req.defobj_out; def = req.sym_out; } } } else { if ((obj = dlcheck(handle)) == NULL) { lock_release(rtld_bind_lock, &lockstate); LD_UTRACE(UTRACE_DLSYM_STOP, handle, NULL, 0, 0, name); return NULL; } donelist_init(&donelist); if (obj->mainprog) { /* Handle obtained by dlopen(NULL, ...) implies global scope. */ res = symlook_global(&req, &donelist); if (res == 0) { def = req.sym_out; defobj = req.defobj_out; } /* * Search the dynamic linker itself, and possibly resolve the * symbol from there. This is how the application links to * dynamic linker services such as dlopen. */ if (def == NULL || ELF_ST_BIND(def->st_info) == STB_WEAK) { res = symlook_obj(&req, &obj_rtld); if (res == 0) { def = req.sym_out; defobj = req.defobj_out; } } } else { /* Search the whole DAG rooted at the given object. */ res = symlook_list(&req, &obj->dagmembers, &donelist); if (res == 0) { def = req.sym_out; defobj = req.defobj_out; } } } if (def != NULL) { lock_release(rtld_bind_lock, &lockstate); /* * The value required by the caller is derived from the value * of the symbol. this is simply the relocated value of the * symbol. */ if (ELF_ST_TYPE(def->st_info) == STT_FUNC) sym = make_function_pointer(def, defobj); else if (ELF_ST_TYPE(def->st_info) == STT_GNU_IFUNC) sym = rtld_resolve_ifunc(defobj, def); else if (ELF_ST_TYPE(def->st_info) == STT_TLS) { ti.ti_module = defobj->tlsindex; ti.ti_offset = def->st_value; sym = __tls_get_addr(&ti); } else sym = defobj->relocbase + def->st_value; LD_UTRACE(UTRACE_DLSYM_STOP, handle, sym, 0, 0, name); return (sym); } _rtld_error("Undefined symbol \"%s\"", name); lock_release(rtld_bind_lock, &lockstate); LD_UTRACE(UTRACE_DLSYM_STOP, handle, NULL, 0, 0, name); return NULL; } void * dlsym(void *handle, const char *name) { return do_dlsym(handle, name, __builtin_return_address(0), NULL, SYMLOOK_DLSYM); } dlfunc_t dlfunc(void *handle, const char *name) { union { void *d; dlfunc_t f; } rv; rv.d = do_dlsym(handle, name, __builtin_return_address(0), NULL, SYMLOOK_DLSYM); return (rv.f); } void * dlvsym(void *handle, const char *name, const char *version) { Ver_Entry ventry; ventry.name = version; ventry.file = NULL; ventry.hash = elf_hash(version); ventry.flags= 0; return do_dlsym(handle, name, __builtin_return_address(0), &ventry, SYMLOOK_DLSYM); } int _rtld_addr_phdr(const void *addr, struct dl_phdr_info *phdr_info) { const Obj_Entry *obj; RtldLockState lockstate; rlock_acquire(rtld_bind_lock, &lockstate); obj = obj_from_addr(addr); if (obj == NULL) { _rtld_error("No shared object contains address"); lock_release(rtld_bind_lock, &lockstate); return (0); } rtld_fill_dl_phdr_info(obj, phdr_info); lock_release(rtld_bind_lock, &lockstate); return (1); } int dladdr(const void *addr, Dl_info *info) { const Obj_Entry *obj; const Elf_Sym *def; void *symbol_addr; unsigned long symoffset; RtldLockState lockstate; rlock_acquire(rtld_bind_lock, &lockstate); obj = obj_from_addr(addr); if (obj == NULL) { _rtld_error("No shared object contains address"); lock_release(rtld_bind_lock, &lockstate); return 0; } info->dli_fname = obj->path; info->dli_fbase = obj->mapbase; info->dli_saddr = (void *)0; info->dli_sname = NULL; /* * Walk the symbol list looking for the symbol whose address is * closest to the address sent in. */ for (symoffset = 0; symoffset < obj->dynsymcount; symoffset++) { def = obj->symtab + symoffset; /* * For skip the symbol if st_shndx is either SHN_UNDEF or * SHN_COMMON. */ if (def->st_shndx == SHN_UNDEF || def->st_shndx == SHN_COMMON) continue; /* * If the symbol is greater than the specified address, or if it * is further away from addr than the current nearest symbol, * then reject it. */ symbol_addr = obj->relocbase + def->st_value; if (symbol_addr > addr || symbol_addr < info->dli_saddr) continue; /* Update our idea of the nearest symbol. */ info->dli_sname = obj->strtab + def->st_name; info->dli_saddr = symbol_addr; /* Exact match? */ if (info->dli_saddr == addr) break; } lock_release(rtld_bind_lock, &lockstate); return 1; } int dlinfo(void *handle, int request, void *p) { const Obj_Entry *obj; RtldLockState lockstate; int error; rlock_acquire(rtld_bind_lock, &lockstate); if (handle == NULL || handle == RTLD_SELF) { void *retaddr; retaddr = __builtin_return_address(0); /* __GNUC__ only */ if ((obj = obj_from_addr(retaddr)) == NULL) _rtld_error("Cannot determine caller's shared object"); } else obj = dlcheck(handle); if (obj == NULL) { lock_release(rtld_bind_lock, &lockstate); return (-1); } error = 0; switch (request) { case RTLD_DI_LINKMAP: *((struct link_map const **)p) = &obj->linkmap; break; case RTLD_DI_ORIGIN: error = rtld_dirname(obj->path, p); break; case RTLD_DI_SERINFOSIZE: case RTLD_DI_SERINFO: error = do_search_info(obj, request, (struct dl_serinfo *)p); break; default: _rtld_error("Invalid request %d passed to dlinfo()", request); error = -1; } lock_release(rtld_bind_lock, &lockstate); return (error); } static void rtld_fill_dl_phdr_info(const Obj_Entry *obj, struct dl_phdr_info *phdr_info) { phdr_info->dlpi_addr = (Elf_Addr)obj->relocbase; phdr_info->dlpi_name = obj->path; phdr_info->dlpi_phdr = obj->phdr; phdr_info->dlpi_phnum = obj->phsize / sizeof(obj->phdr[0]); phdr_info->dlpi_tls_modid = obj->tlsindex; phdr_info->dlpi_tls_data = obj->tlsinit; phdr_info->dlpi_adds = obj_loads; phdr_info->dlpi_subs = obj_loads - obj_count; } int dl_iterate_phdr(__dl_iterate_hdr_callback callback, void *param) { struct dl_phdr_info phdr_info; const Obj_Entry *obj; RtldLockState bind_lockstate, phdr_lockstate; int error; wlock_acquire(rtld_phdr_lock, &phdr_lockstate); rlock_acquire(rtld_bind_lock, &bind_lockstate); error = 0; for (obj = obj_list; obj != NULL; obj = obj->next) { rtld_fill_dl_phdr_info(obj, &phdr_info); if ((error = callback(&phdr_info, sizeof phdr_info, param)) != 0) break; } if (error == 0) { rtld_fill_dl_phdr_info(&obj_rtld, &phdr_info); error = callback(&phdr_info, sizeof(phdr_info), param); } lock_release(rtld_bind_lock, &bind_lockstate); lock_release(rtld_phdr_lock, &phdr_lockstate); return (error); } static void * fill_search_info(const char *dir, size_t dirlen, void *param) { struct fill_search_info_args *arg; arg = param; if (arg->request == RTLD_DI_SERINFOSIZE) { arg->serinfo->dls_cnt ++; arg->serinfo->dls_size += sizeof(struct dl_serpath) + dirlen + 1; } else { struct dl_serpath *s_entry; s_entry = arg->serpath; s_entry->dls_name = arg->strspace; s_entry->dls_flags = arg->flags; strncpy(arg->strspace, dir, dirlen); arg->strspace[dirlen] = '\0'; arg->strspace += dirlen + 1; arg->serpath++; } return (NULL); } static int do_search_info(const Obj_Entry *obj, int request, struct dl_serinfo *info) { struct dl_serinfo _info; struct fill_search_info_args args; args.request = RTLD_DI_SERINFOSIZE; args.serinfo = &_info; _info.dls_size = __offsetof(struct dl_serinfo, dls_serpath); _info.dls_cnt = 0; path_enumerate(obj->rpath, fill_search_info, &args); path_enumerate(ld_library_path, fill_search_info, &args); path_enumerate(obj->runpath, fill_search_info, &args); path_enumerate(gethints(obj->z_nodeflib), fill_search_info, &args); if (!obj->z_nodeflib) path_enumerate(STANDARD_LIBRARY_PATH, fill_search_info, &args); if (request == RTLD_DI_SERINFOSIZE) { info->dls_size = _info.dls_size; info->dls_cnt = _info.dls_cnt; return (0); } if (info->dls_cnt != _info.dls_cnt || info->dls_size != _info.dls_size) { _rtld_error("Uninitialized Dl_serinfo struct passed to dlinfo()"); return (-1); } args.request = RTLD_DI_SERINFO; args.serinfo = info; args.serpath = &info->dls_serpath[0]; args.strspace = (char *)&info->dls_serpath[_info.dls_cnt]; args.flags = LA_SER_RUNPATH; if (path_enumerate(obj->rpath, fill_search_info, &args) != NULL) return (-1); args.flags = LA_SER_LIBPATH; if (path_enumerate(ld_library_path, fill_search_info, &args) != NULL) return (-1); args.flags = LA_SER_RUNPATH; if (path_enumerate(obj->runpath, fill_search_info, &args) != NULL) return (-1); args.flags = LA_SER_CONFIG; if (path_enumerate(gethints(obj->z_nodeflib), fill_search_info, &args) != NULL) return (-1); args.flags = LA_SER_DEFAULT; if (!obj->z_nodeflib && path_enumerate(STANDARD_LIBRARY_PATH, fill_search_info, &args) != NULL) return (-1); return (0); } static int rtld_dirname(const char *path, char *bname) { const char *endp; /* Empty or NULL string gets treated as "." */ if (path == NULL || *path == '\0') { bname[0] = '.'; bname[1] = '\0'; return (0); } /* Strip trailing slashes */ endp = path + strlen(path) - 1; while (endp > path && *endp == '/') endp--; /* Find the start of the dir */ while (endp > path && *endp != '/') endp--; /* Either the dir is "/" or there are no slashes */ if (endp == path) { bname[0] = *endp == '/' ? '/' : '.'; bname[1] = '\0'; return (0); } else { do { endp--; } while (endp > path && *endp == '/'); } if (endp - path + 2 > PATH_MAX) { _rtld_error("Filename is too long: %s", path); return(-1); } strncpy(bname, path, endp - path + 1); bname[endp - path + 1] = '\0'; return (0); } static int rtld_dirname_abs(const char *path, char *base) { char *last; if (realpath(path, base) == NULL) return (-1); dbg("%s -> %s", path, base); last = strrchr(base, '/'); if (last == NULL) return (-1); if (last != base) *last = '\0'; return (0); } static void linkmap_add(Obj_Entry *obj) { struct link_map *l = &obj->linkmap; struct link_map *prev; obj->linkmap.l_name = obj->path; obj->linkmap.l_addr = obj->mapbase; obj->linkmap.l_ld = obj->dynamic; #ifdef __mips__ /* GDB needs load offset on MIPS to use the symbols */ obj->linkmap.l_offs = obj->relocbase; #endif if (r_debug.r_map == NULL) { r_debug.r_map = l; return; } /* * Scan to the end of the list, but not past the entry for the * dynamic linker, which we want to keep at the very end. */ for (prev = r_debug.r_map; prev->l_next != NULL && prev->l_next != &obj_rtld.linkmap; prev = prev->l_next) ; /* Link in the new entry. */ l->l_prev = prev; l->l_next = prev->l_next; if (l->l_next != NULL) l->l_next->l_prev = l; prev->l_next = l; } static void linkmap_delete(Obj_Entry *obj) { struct link_map *l = &obj->linkmap; if (l->l_prev == NULL) { if ((r_debug.r_map = l->l_next) != NULL) l->l_next->l_prev = NULL; return; } if ((l->l_prev->l_next = l->l_next) != NULL) l->l_next->l_prev = l->l_prev; } /* * Function for the debugger to set a breakpoint on to gain control. * * The two parameters allow the debugger to easily find and determine * what the runtime loader is doing and to whom it is doing it. * * When the loadhook trap is hit (r_debug_state, set at program * initialization), the arguments can be found on the stack: * * +8 struct link_map *m * +4 struct r_debug *rd * +0 RetAddr */ void r_debug_state(struct r_debug* rd, struct link_map *m) { /* * The following is a hack to force the compiler to emit calls to * this function, even when optimizing. If the function is empty, * the compiler is not obliged to emit any code for calls to it, * even when marked __noinline. However, gdb depends on those * calls being made. */ __compiler_membar(); } /* * A function called after init routines have completed. This can be used to * break before a program's entry routine is called, and can be used when * main is not available in the symbol table. */ void _r_debug_postinit(struct link_map *m) { /* See r_debug_state(). */ __compiler_membar(); } /* * Get address of the pointer variable in the main program. * Prefer non-weak symbol over the weak one. */ static const void ** get_program_var_addr(const char *name, RtldLockState *lockstate) { SymLook req; DoneList donelist; symlook_init(&req, name); req.lockstate = lockstate; donelist_init(&donelist); if (symlook_global(&req, &donelist) != 0) return (NULL); if (ELF_ST_TYPE(req.sym_out->st_info) == STT_FUNC) return ((const void **)make_function_pointer(req.sym_out, req.defobj_out)); else if (ELF_ST_TYPE(req.sym_out->st_info) == STT_GNU_IFUNC) return ((const void **)rtld_resolve_ifunc(req.defobj_out, req.sym_out)); else return ((const void **)(req.defobj_out->relocbase + req.sym_out->st_value)); } /* * Set a pointer variable in the main program to the given value. This * is used to set key variables such as "environ" before any of the * init functions are called. */ static void set_program_var(const char *name, const void *value) { const void **addr; if ((addr = get_program_var_addr(name, NULL)) != NULL) { dbg("\"%s\": *%p <-- %p", name, addr, value); *addr = value; } } /* * Search the global objects, including dependencies and main object, * for the given symbol. */ static int symlook_global(SymLook *req, DoneList *donelist) { SymLook req1; const Objlist_Entry *elm; int res; symlook_init_from_req(&req1, req); /* Search all objects loaded at program start up. */ if (req->defobj_out == NULL || ELF_ST_BIND(req->sym_out->st_info) == STB_WEAK) { res = symlook_list(&req1, &list_main, donelist); if (res == 0 && (req->defobj_out == NULL || ELF_ST_BIND(req1.sym_out->st_info) != STB_WEAK)) { req->sym_out = req1.sym_out; req->defobj_out = req1.defobj_out; assert(req->defobj_out != NULL); } } /* Search all DAGs whose roots are RTLD_GLOBAL objects. */ STAILQ_FOREACH(elm, &list_global, link) { if (req->defobj_out != NULL && ELF_ST_BIND(req->sym_out->st_info) != STB_WEAK) break; res = symlook_list(&req1, &elm->obj->dagmembers, donelist); if (res == 0 && (req->defobj_out == NULL || ELF_ST_BIND(req1.sym_out->st_info) != STB_WEAK)) { req->sym_out = req1.sym_out; req->defobj_out = req1.defobj_out; assert(req->defobj_out != NULL); } } return (req->sym_out != NULL ? 0 : ESRCH); } /* * Given a symbol name in a referencing object, find the corresponding * definition of the symbol. Returns a pointer to the symbol, or NULL if * no definition was found. Returns a pointer to the Obj_Entry of the * defining object via the reference parameter DEFOBJ_OUT. */ static int symlook_default(SymLook *req, const Obj_Entry *refobj) { DoneList donelist; const Objlist_Entry *elm; SymLook req1; int res; donelist_init(&donelist); symlook_init_from_req(&req1, req); /* Look first in the referencing object if linked symbolically. */ if (refobj->symbolic && !donelist_check(&donelist, refobj)) { res = symlook_obj(&req1, refobj); if (res == 0) { req->sym_out = req1.sym_out; req->defobj_out = req1.defobj_out; assert(req->defobj_out != NULL); } } symlook_global(req, &donelist); /* Search all dlopened DAGs containing the referencing object. */ STAILQ_FOREACH(elm, &refobj->dldags, link) { if (req->sym_out != NULL && ELF_ST_BIND(req->sym_out->st_info) != STB_WEAK) break; res = symlook_list(&req1, &elm->obj->dagmembers, &donelist); if (res == 0 && (req->sym_out == NULL || ELF_ST_BIND(req1.sym_out->st_info) != STB_WEAK)) { req->sym_out = req1.sym_out; req->defobj_out = req1.defobj_out; assert(req->defobj_out != NULL); } } /* * Search the dynamic linker itself, and possibly resolve the * symbol from there. This is how the application links to * dynamic linker services such as dlopen. */ if (req->sym_out == NULL || ELF_ST_BIND(req->sym_out->st_info) == STB_WEAK) { res = symlook_obj(&req1, &obj_rtld); if (res == 0) { req->sym_out = req1.sym_out; req->defobj_out = req1.defobj_out; assert(req->defobj_out != NULL); } } return (req->sym_out != NULL ? 0 : ESRCH); } static int symlook_list(SymLook *req, const Objlist *objlist, DoneList *dlp) { const Elf_Sym *def; const Obj_Entry *defobj; const Objlist_Entry *elm; SymLook req1; int res; def = NULL; defobj = NULL; STAILQ_FOREACH(elm, objlist, link) { if (donelist_check(dlp, elm->obj)) continue; symlook_init_from_req(&req1, req); if ((res = symlook_obj(&req1, elm->obj)) == 0) { if (def == NULL || ELF_ST_BIND(req1.sym_out->st_info) != STB_WEAK) { def = req1.sym_out; defobj = req1.defobj_out; if (ELF_ST_BIND(def->st_info) != STB_WEAK) break; } } } if (def != NULL) { req->sym_out = def; req->defobj_out = defobj; return (0); } return (ESRCH); } /* * Search the chain of DAGS cointed to by the given Needed_Entry * for a symbol of the given name. Each DAG is scanned completely * before advancing to the next one. Returns a pointer to the symbol, * or NULL if no definition was found. */ static int symlook_needed(SymLook *req, const Needed_Entry *needed, DoneList *dlp) { const Elf_Sym *def; const Needed_Entry *n; const Obj_Entry *defobj; SymLook req1; int res; def = NULL; defobj = NULL; symlook_init_from_req(&req1, req); for (n = needed; n != NULL; n = n->next) { if (n->obj == NULL || (res = symlook_list(&req1, &n->obj->dagmembers, dlp)) != 0) continue; if (def == NULL || ELF_ST_BIND(req1.sym_out->st_info) != STB_WEAK) { def = req1.sym_out; defobj = req1.defobj_out; if (ELF_ST_BIND(def->st_info) != STB_WEAK) break; } } if (def != NULL) { req->sym_out = def; req->defobj_out = defobj; return (0); } return (ESRCH); } /* * Search the symbol table of a single shared object for a symbol of * the given name and version, if requested. Returns a pointer to the * symbol, or NULL if no definition was found. If the object is * filter, return filtered symbol from filtee. * * The symbol's hash value is passed in for efficiency reasons; that * eliminates many recomputations of the hash value. */ int symlook_obj(SymLook *req, const Obj_Entry *obj) { DoneList donelist; SymLook req1; int flags, res, mres; /* * If there is at least one valid hash at this point, we prefer to * use the faster GNU version if available. */ if (obj->valid_hash_gnu) mres = symlook_obj1_gnu(req, obj); else if (obj->valid_hash_sysv) mres = symlook_obj1_sysv(req, obj); else return (EINVAL); if (mres == 0) { if (obj->needed_filtees != NULL) { flags = (req->flags & SYMLOOK_EARLY) ? RTLD_LO_EARLY : 0; load_filtees(__DECONST(Obj_Entry *, obj), flags, req->lockstate); donelist_init(&donelist); symlook_init_from_req(&req1, req); res = symlook_needed(&req1, obj->needed_filtees, &donelist); if (res == 0) { req->sym_out = req1.sym_out; req->defobj_out = req1.defobj_out; } return (res); } if (obj->needed_aux_filtees != NULL) { flags = (req->flags & SYMLOOK_EARLY) ? RTLD_LO_EARLY : 0; load_filtees(__DECONST(Obj_Entry *, obj), flags, req->lockstate); donelist_init(&donelist); symlook_init_from_req(&req1, req); res = symlook_needed(&req1, obj->needed_aux_filtees, &donelist); if (res == 0) { req->sym_out = req1.sym_out; req->defobj_out = req1.defobj_out; return (res); } } } return (mres); } /* Symbol match routine common to both hash functions */ static bool matched_symbol(SymLook *req, const Obj_Entry *obj, Sym_Match_Result *result, const unsigned long symnum) { Elf_Versym verndx; const Elf_Sym *symp; const char *strp; symp = obj->symtab + symnum; strp = obj->strtab + symp->st_name; switch (ELF_ST_TYPE(symp->st_info)) { case STT_FUNC: case STT_NOTYPE: case STT_OBJECT: case STT_COMMON: case STT_GNU_IFUNC: if (symp->st_value == 0) return (false); /* fallthrough */ case STT_TLS: if (symp->st_shndx != SHN_UNDEF) break; #ifndef __mips__ else if (((req->flags & SYMLOOK_IN_PLT) == 0) && (ELF_ST_TYPE(symp->st_info) == STT_FUNC)) break; /* fallthrough */ #endif default: return (false); } if (req->name[0] != strp[0] || strcmp(req->name, strp) != 0) return (false); if (req->ventry == NULL) { if (obj->versyms != NULL) { verndx = VER_NDX(obj->versyms[symnum]); if (verndx > obj->vernum) { _rtld_error( "%s: symbol %s references wrong version %d", obj->path, obj->strtab + symnum, verndx); return (false); } /* * If we are not called from dlsym (i.e. this * is a normal relocation from unversioned * binary), accept the symbol immediately if * it happens to have first version after this * shared object became versioned. Otherwise, * if symbol is versioned and not hidden, * remember it. If it is the only symbol with * this name exported by the shared object, it * will be returned as a match by the calling * function. If symbol is global (verndx < 2) * accept it unconditionally. */ if ((req->flags & SYMLOOK_DLSYM) == 0 && verndx == VER_NDX_GIVEN) { result->sym_out = symp; return (true); } else if (verndx >= VER_NDX_GIVEN) { if ((obj->versyms[symnum] & VER_NDX_HIDDEN) == 0) { if (result->vsymp == NULL) result->vsymp = symp; result->vcount++; } return (false); } } result->sym_out = symp; return (true); } if (obj->versyms == NULL) { if (object_match_name(obj, req->ventry->name)) { _rtld_error("%s: object %s should provide version %s " "for symbol %s", obj_rtld.path, obj->path, req->ventry->name, obj->strtab + symnum); return (false); } } else { verndx = VER_NDX(obj->versyms[symnum]); if (verndx > obj->vernum) { _rtld_error("%s: symbol %s references wrong version %d", obj->path, obj->strtab + symnum, verndx); return (false); } if (obj->vertab[verndx].hash != req->ventry->hash || strcmp(obj->vertab[verndx].name, req->ventry->name)) { /* * Version does not match. Look if this is a * global symbol and if it is not hidden. If * global symbol (verndx < 2) is available, * use it. Do not return symbol if we are * called by dlvsym, because dlvsym looks for * a specific version and default one is not * what dlvsym wants. */ if ((req->flags & SYMLOOK_DLSYM) || (verndx >= VER_NDX_GIVEN) || (obj->versyms[symnum] & VER_NDX_HIDDEN)) return (false); } } result->sym_out = symp; return (true); } /* * Search for symbol using SysV hash function. * obj->buckets is known not to be NULL at this point; the test for this was * performed with the obj->valid_hash_sysv assignment. */ static int symlook_obj1_sysv(SymLook *req, const Obj_Entry *obj) { unsigned long symnum; Sym_Match_Result matchres; matchres.sym_out = NULL; matchres.vsymp = NULL; matchres.vcount = 0; for (symnum = obj->buckets[req->hash % obj->nbuckets]; symnum != STN_UNDEF; symnum = obj->chains[symnum]) { if (symnum >= obj->nchains) return (ESRCH); /* Bad object */ if (matched_symbol(req, obj, &matchres, symnum)) { req->sym_out = matchres.sym_out; req->defobj_out = obj; return (0); } } if (matchres.vcount == 1) { req->sym_out = matchres.vsymp; req->defobj_out = obj; return (0); } return (ESRCH); } /* Search for symbol using GNU hash function */ static int symlook_obj1_gnu(SymLook *req, const Obj_Entry *obj) { Elf_Addr bloom_word; const Elf32_Word *hashval; Elf32_Word bucket; Sym_Match_Result matchres; unsigned int h1, h2; unsigned long symnum; matchres.sym_out = NULL; matchres.vsymp = NULL; matchres.vcount = 0; /* Pick right bitmask word from Bloom filter array */ bloom_word = obj->bloom_gnu[(req->hash_gnu / __ELF_WORD_SIZE) & obj->maskwords_bm_gnu]; /* Calculate modulus word size of gnu hash and its derivative */ h1 = req->hash_gnu & (__ELF_WORD_SIZE - 1); h2 = ((req->hash_gnu >> obj->shift2_gnu) & (__ELF_WORD_SIZE - 1)); /* Filter out the "definitely not in set" queries */ if (((bloom_word >> h1) & (bloom_word >> h2) & 1) == 0) return (ESRCH); /* Locate hash chain and corresponding value element*/ bucket = obj->buckets_gnu[req->hash_gnu % obj->nbuckets_gnu]; if (bucket == 0) return (ESRCH); hashval = &obj->chain_zero_gnu[bucket]; do { if (((*hashval ^ req->hash_gnu) >> 1) == 0) { symnum = hashval - obj->chain_zero_gnu; if (matched_symbol(req, obj, &matchres, symnum)) { req->sym_out = matchres.sym_out; req->defobj_out = obj; return (0); } } } while ((*hashval++ & 1) == 0); if (matchres.vcount == 1) { req->sym_out = matchres.vsymp; req->defobj_out = obj; return (0); } return (ESRCH); } static void trace_loaded_objects(Obj_Entry *obj) { char *fmt1, *fmt2, *fmt, *main_local, *list_containers; int c; if ((main_local = getenv(LD_ "TRACE_LOADED_OBJECTS_PROGNAME")) == NULL) main_local = ""; if ((fmt1 = getenv(LD_ "TRACE_LOADED_OBJECTS_FMT1")) == NULL) fmt1 = "\t%o => %p (%x)\n"; if ((fmt2 = getenv(LD_ "TRACE_LOADED_OBJECTS_FMT2")) == NULL) fmt2 = "\t%o (%x)\n"; list_containers = getenv(LD_ "TRACE_LOADED_OBJECTS_ALL"); for (; obj; obj = obj->next) { Needed_Entry *needed; char *name, *path; bool is_lib; if (list_containers && obj->needed != NULL) rtld_printf("%s:\n", obj->path); for (needed = obj->needed; needed; needed = needed->next) { if (needed->obj != NULL) { if (needed->obj->traced && !list_containers) continue; needed->obj->traced = true; path = needed->obj->path; } else path = "not found"; name = (char *)obj->strtab + needed->name; is_lib = strncmp(name, "lib", 3) == 0; /* XXX - bogus */ fmt = is_lib ? fmt1 : fmt2; while ((c = *fmt++) != '\0') { switch (c) { default: rtld_putchar(c); continue; case '\\': switch (c = *fmt) { case '\0': continue; case 'n': rtld_putchar('\n'); break; case 't': rtld_putchar('\t'); break; } break; case '%': switch (c = *fmt) { case '\0': continue; case '%': default: rtld_putchar(c); break; case 'A': rtld_putstr(main_local); break; case 'a': rtld_putstr(obj_main->path); break; case 'o': rtld_putstr(name); break; #if 0 case 'm': rtld_printf("%d", sodp->sod_major); break; case 'n': rtld_printf("%d", sodp->sod_minor); break; #endif case 'p': rtld_putstr(path); break; case 'x': rtld_printf("%p", needed->obj ? needed->obj->mapbase : 0); break; } break; } ++fmt; } } } } /* * Unload a dlopened object and its dependencies from memory and from * our data structures. It is assumed that the DAG rooted in the * object has already been unreferenced, and that the object has a * reference count of 0. */ static void unload_object(Obj_Entry *root) { Obj_Entry *obj; Obj_Entry **linkp; assert(root->refcount == 0); /* * Pass over the DAG removing unreferenced objects from * appropriate lists. */ unlink_object(root); /* Unmap all objects that are no longer referenced. */ linkp = &obj_list->next; while ((obj = *linkp) != NULL) { if (obj->refcount == 0) { LD_UTRACE(UTRACE_UNLOAD_OBJECT, obj, obj->mapbase, obj->mapsize, 0, obj->path); dbg("unloading \"%s\"", obj->path); unload_filtees(root); munmap(obj->mapbase, obj->mapsize); linkmap_delete(obj); *linkp = obj->next; obj_count--; obj_free(obj); } else linkp = &obj->next; } obj_tail = linkp; } static void unlink_object(Obj_Entry *root) { Objlist_Entry *elm; if (root->refcount == 0) { /* Remove the object from the RTLD_GLOBAL list. */ objlist_remove(&list_global, root); /* Remove the object from all objects' DAG lists. */ STAILQ_FOREACH(elm, &root->dagmembers, link) { objlist_remove(&elm->obj->dldags, root); if (elm->obj != root) unlink_object(elm->obj); } } } static void ref_dag(Obj_Entry *root) { Objlist_Entry *elm; assert(root->dag_inited); STAILQ_FOREACH(elm, &root->dagmembers, link) elm->obj->refcount++; } static void unref_dag(Obj_Entry *root) { Objlist_Entry *elm; assert(root->dag_inited); STAILQ_FOREACH(elm, &root->dagmembers, link) elm->obj->refcount--; } /* * Common code for MD __tls_get_addr(). */ static void *tls_get_addr_slow(Elf_Addr **, int, size_t) __noinline; static void * tls_get_addr_slow(Elf_Addr **dtvp, int index, size_t offset) { Elf_Addr *newdtv, *dtv; RtldLockState lockstate; int to_copy; dtv = *dtvp; /* Check dtv generation in case new modules have arrived */ if (dtv[0] != tls_dtv_generation) { wlock_acquire(rtld_bind_lock, &lockstate); newdtv = xcalloc(tls_max_index + 2, sizeof(Elf_Addr)); to_copy = dtv[1]; if (to_copy > tls_max_index) to_copy = tls_max_index; memcpy(&newdtv[2], &dtv[2], to_copy * sizeof(Elf_Addr)); newdtv[0] = tls_dtv_generation; newdtv[1] = tls_max_index; free(dtv); lock_release(rtld_bind_lock, &lockstate); dtv = *dtvp = newdtv; } /* Dynamically allocate module TLS if necessary */ if (dtv[index + 1] == 0) { /* Signal safe, wlock will block out signals. */ wlock_acquire(rtld_bind_lock, &lockstate); if (!dtv[index + 1]) dtv[index + 1] = (Elf_Addr)allocate_module_tls(index); lock_release(rtld_bind_lock, &lockstate); } return ((void *)(dtv[index + 1] + offset)); } void * tls_get_addr_common(Elf_Addr **dtvp, int index, size_t offset) { Elf_Addr *dtv; dtv = *dtvp; /* Check dtv generation in case new modules have arrived */ if (__predict_true(dtv[0] == tls_dtv_generation && dtv[index + 1] != 0)) return ((void *)(dtv[index + 1] + offset)); return (tls_get_addr_slow(dtvp, index, offset)); } #if defined(__aarch64__) || defined(__arm__) || defined(__mips__) || \ defined(__powerpc__) /* * Allocate Static TLS using the Variant I method. */ void * allocate_tls(Obj_Entry *objs, void *oldtcb, size_t tcbsize, size_t tcbalign) { Obj_Entry *obj; char *tcb; Elf_Addr **tls; Elf_Addr *dtv; Elf_Addr addr; int i; if (oldtcb != NULL && tcbsize == TLS_TCB_SIZE) return (oldtcb); assert(tcbsize >= TLS_TCB_SIZE); tcb = xcalloc(1, tls_static_space - TLS_TCB_SIZE + tcbsize); tls = (Elf_Addr **)(tcb + tcbsize - TLS_TCB_SIZE); if (oldtcb != NULL) { memcpy(tls, oldtcb, tls_static_space); free(oldtcb); /* Adjust the DTV. */ dtv = tls[0]; for (i = 0; i < dtv[1]; i++) { if (dtv[i+2] >= (Elf_Addr)oldtcb && dtv[i+2] < (Elf_Addr)oldtcb + tls_static_space) { dtv[i+2] = dtv[i+2] - (Elf_Addr)oldtcb + (Elf_Addr)tls; } } } else { dtv = xcalloc(tls_max_index + 2, sizeof(Elf_Addr)); tls[0] = dtv; dtv[0] = tls_dtv_generation; dtv[1] = tls_max_index; for (obj = objs; obj; obj = obj->next) { if (obj->tlsoffset > 0) { addr = (Elf_Addr)tls + obj->tlsoffset; if (obj->tlsinitsize > 0) memcpy((void*) addr, obj->tlsinit, obj->tlsinitsize); if (obj->tlssize > obj->tlsinitsize) memset((void*) (addr + obj->tlsinitsize), 0, obj->tlssize - obj->tlsinitsize); dtv[obj->tlsindex + 1] = addr; } } } return (tcb); } void free_tls(void *tcb, size_t tcbsize, size_t tcbalign) { Elf_Addr *dtv; Elf_Addr tlsstart, tlsend; int dtvsize, i; assert(tcbsize >= TLS_TCB_SIZE); tlsstart = (Elf_Addr)tcb + tcbsize - TLS_TCB_SIZE; tlsend = tlsstart + tls_static_space; dtv = *(Elf_Addr **)tlsstart; dtvsize = dtv[1]; for (i = 0; i < dtvsize; i++) { if (dtv[i+2] && (dtv[i+2] < tlsstart || dtv[i+2] >= tlsend)) { free((void*)dtv[i+2]); } } free(dtv); free(tcb); } #endif #if defined(__i386__) || defined(__amd64__) || defined(__sparc64__) /* * Allocate Static TLS using the Variant II method. */ void * allocate_tls(Obj_Entry *objs, void *oldtls, size_t tcbsize, size_t tcbalign) { Obj_Entry *obj; size_t size, ralign; char *tls; Elf_Addr *dtv, *olddtv; Elf_Addr segbase, oldsegbase, addr; int i; ralign = tcbalign; if (tls_static_max_align > ralign) ralign = tls_static_max_align; size = round(tls_static_space, ralign) + round(tcbsize, ralign); assert(tcbsize >= 2*sizeof(Elf_Addr)); tls = malloc_aligned(size, ralign); dtv = xcalloc(tls_max_index + 2, sizeof(Elf_Addr)); segbase = (Elf_Addr)(tls + round(tls_static_space, ralign)); ((Elf_Addr*)segbase)[0] = segbase; ((Elf_Addr*)segbase)[1] = (Elf_Addr) dtv; dtv[0] = tls_dtv_generation; dtv[1] = tls_max_index; if (oldtls) { /* * Copy the static TLS block over whole. */ oldsegbase = (Elf_Addr) oldtls; memcpy((void *)(segbase - tls_static_space), (const void *)(oldsegbase - tls_static_space), tls_static_space); /* * If any dynamic TLS blocks have been created tls_get_addr(), * move them over. */ olddtv = ((Elf_Addr**)oldsegbase)[1]; for (i = 0; i < olddtv[1]; i++) { if (olddtv[i+2] < oldsegbase - size || olddtv[i+2] > oldsegbase) { dtv[i+2] = olddtv[i+2]; olddtv[i+2] = 0; } } /* * We assume that this block was the one we created with * allocate_initial_tls(). */ free_tls(oldtls, 2*sizeof(Elf_Addr), sizeof(Elf_Addr)); } else { for (obj = objs; obj; obj = obj->next) { if (obj->tlsoffset) { addr = segbase - obj->tlsoffset; memset((void*) (addr + obj->tlsinitsize), 0, obj->tlssize - obj->tlsinitsize); if (obj->tlsinit) memcpy((void*) addr, obj->tlsinit, obj->tlsinitsize); dtv[obj->tlsindex + 1] = addr; } } } return (void*) segbase; } void free_tls(void *tls, size_t tcbsize, size_t tcbalign) { Elf_Addr* dtv; size_t size, ralign; int dtvsize, i; Elf_Addr tlsstart, tlsend; /* * Figure out the size of the initial TLS block so that we can * find stuff which ___tls_get_addr() allocated dynamically. */ ralign = tcbalign; if (tls_static_max_align > ralign) ralign = tls_static_max_align; size = round(tls_static_space, ralign); dtv = ((Elf_Addr**)tls)[1]; dtvsize = dtv[1]; tlsend = (Elf_Addr) tls; tlsstart = tlsend - size; for (i = 0; i < dtvsize; i++) { if (dtv[i + 2] != 0 && (dtv[i + 2] < tlsstart || dtv[i + 2] > tlsend)) { free_aligned((void *)dtv[i + 2]); } } free_aligned((void *)tlsstart); free((void*) dtv); } #endif /* * Allocate TLS block for module with given index. */ void * allocate_module_tls(int index) { Obj_Entry* obj; char* p; for (obj = obj_list; obj; obj = obj->next) { if (obj->tlsindex == index) break; } if (!obj) { _rtld_error("Can't find module with TLS index %d", index); rtld_die(); } p = malloc_aligned(obj->tlssize, obj->tlsalign); memcpy(p, obj->tlsinit, obj->tlsinitsize); memset(p + obj->tlsinitsize, 0, obj->tlssize - obj->tlsinitsize); return p; } bool allocate_tls_offset(Obj_Entry *obj) { size_t off; if (obj->tls_done) return true; if (obj->tlssize == 0) { obj->tls_done = true; return true; } if (obj->tlsindex == 1) off = calculate_first_tls_offset(obj->tlssize, obj->tlsalign); else off = calculate_tls_offset(tls_last_offset, tls_last_size, obj->tlssize, obj->tlsalign); /* * If we have already fixed the size of the static TLS block, we * must stay within that size. When allocating the static TLS, we * leave a small amount of space spare to be used for dynamically * loading modules which use static TLS. */ if (tls_static_space != 0) { if (calculate_tls_end(off, obj->tlssize) > tls_static_space) return false; } else if (obj->tlsalign > tls_static_max_align) { tls_static_max_align = obj->tlsalign; } tls_last_offset = obj->tlsoffset = off; tls_last_size = obj->tlssize; obj->tls_done = true; return true; } void free_tls_offset(Obj_Entry *obj) { /* * If we were the last thing to allocate out of the static TLS * block, we give our space back to the 'allocator'. This is a * simplistic workaround to allow libGL.so.1 to be loaded and * unloaded multiple times. */ if (calculate_tls_end(obj->tlsoffset, obj->tlssize) == calculate_tls_end(tls_last_offset, tls_last_size)) { tls_last_offset -= obj->tlssize; tls_last_size = 0; } } void * _rtld_allocate_tls(void *oldtls, size_t tcbsize, size_t tcbalign) { void *ret; RtldLockState lockstate; wlock_acquire(rtld_bind_lock, &lockstate); ret = allocate_tls(obj_list, oldtls, tcbsize, tcbalign); lock_release(rtld_bind_lock, &lockstate); return (ret); } void _rtld_free_tls(void *tcb, size_t tcbsize, size_t tcbalign) { RtldLockState lockstate; wlock_acquire(rtld_bind_lock, &lockstate); free_tls(tcb, tcbsize, tcbalign); lock_release(rtld_bind_lock, &lockstate); } static void object_add_name(Obj_Entry *obj, const char *name) { Name_Entry *entry; size_t len; len = strlen(name); entry = malloc(sizeof(Name_Entry) + len); if (entry != NULL) { strcpy(entry->name, name); STAILQ_INSERT_TAIL(&obj->names, entry, link); } } static int object_match_name(const Obj_Entry *obj, const char *name) { Name_Entry *entry; STAILQ_FOREACH(entry, &obj->names, link) { if (strcmp(name, entry->name) == 0) return (1); } return (0); } static Obj_Entry * locate_dependency(const Obj_Entry *obj, const char *name) { const Objlist_Entry *entry; const Needed_Entry *needed; STAILQ_FOREACH(entry, &list_main, link) { if (object_match_name(entry->obj, name)) return entry->obj; } for (needed = obj->needed; needed != NULL; needed = needed->next) { if (strcmp(obj->strtab + needed->name, name) == 0 || (needed->obj != NULL && object_match_name(needed->obj, name))) { /* * If there is DT_NEEDED for the name we are looking for, * we are all set. Note that object might not be found if * dependency was not loaded yet, so the function can * return NULL here. This is expected and handled * properly by the caller. */ return (needed->obj); } } _rtld_error("%s: Unexpected inconsistency: dependency %s not found", obj->path, name); rtld_die(); } static int check_object_provided_version(Obj_Entry *refobj, const Obj_Entry *depobj, const Elf_Vernaux *vna) { const Elf_Verdef *vd; const char *vername; vername = refobj->strtab + vna->vna_name; vd = depobj->verdef; if (vd == NULL) { _rtld_error("%s: version %s required by %s not defined", depobj->path, vername, refobj->path); return (-1); } for (;;) { if (vd->vd_version != VER_DEF_CURRENT) { _rtld_error("%s: Unsupported version %d of Elf_Verdef entry", depobj->path, vd->vd_version); return (-1); } if (vna->vna_hash == vd->vd_hash) { const Elf_Verdaux *aux = (const Elf_Verdaux *) ((char *)vd + vd->vd_aux); if (strcmp(vername, depobj->strtab + aux->vda_name) == 0) return (0); } if (vd->vd_next == 0) break; vd = (const Elf_Verdef *) ((char *)vd + vd->vd_next); } if (vna->vna_flags & VER_FLG_WEAK) return (0); _rtld_error("%s: version %s required by %s not found", depobj->path, vername, refobj->path); return (-1); } static int rtld_verify_object_versions(Obj_Entry *obj) { const Elf_Verneed *vn; const Elf_Verdef *vd; const Elf_Verdaux *vda; const Elf_Vernaux *vna; const Obj_Entry *depobj; int maxvernum, vernum; if (obj->ver_checked) return (0); obj->ver_checked = true; maxvernum = 0; /* * Walk over defined and required version records and figure out * max index used by any of them. Do very basic sanity checking * while there. */ vn = obj->verneed; while (vn != NULL) { if (vn->vn_version != VER_NEED_CURRENT) { _rtld_error("%s: Unsupported version %d of Elf_Verneed entry", obj->path, vn->vn_version); return (-1); } vna = (const Elf_Vernaux *) ((char *)vn + vn->vn_aux); for (;;) { vernum = VER_NEED_IDX(vna->vna_other); if (vernum > maxvernum) maxvernum = vernum; if (vna->vna_next == 0) break; vna = (const Elf_Vernaux *) ((char *)vna + vna->vna_next); } if (vn->vn_next == 0) break; vn = (const Elf_Verneed *) ((char *)vn + vn->vn_next); } vd = obj->verdef; while (vd != NULL) { if (vd->vd_version != VER_DEF_CURRENT) { _rtld_error("%s: Unsupported version %d of Elf_Verdef entry", obj->path, vd->vd_version); return (-1); } vernum = VER_DEF_IDX(vd->vd_ndx); if (vernum > maxvernum) maxvernum = vernum; if (vd->vd_next == 0) break; vd = (const Elf_Verdef *) ((char *)vd + vd->vd_next); } if (maxvernum == 0) return (0); /* * Store version information in array indexable by version index. * Verify that object version requirements are satisfied along the * way. */ obj->vernum = maxvernum + 1; obj->vertab = xcalloc(obj->vernum, sizeof(Ver_Entry)); vd = obj->verdef; while (vd != NULL) { if ((vd->vd_flags & VER_FLG_BASE) == 0) { vernum = VER_DEF_IDX(vd->vd_ndx); assert(vernum <= maxvernum); vda = (const Elf_Verdaux *)((char *)vd + vd->vd_aux); obj->vertab[vernum].hash = vd->vd_hash; obj->vertab[vernum].name = obj->strtab + vda->vda_name; obj->vertab[vernum].file = NULL; obj->vertab[vernum].flags = 0; } if (vd->vd_next == 0) break; vd = (const Elf_Verdef *) ((char *)vd + vd->vd_next); } vn = obj->verneed; while (vn != NULL) { depobj = locate_dependency(obj, obj->strtab + vn->vn_file); if (depobj == NULL) return (-1); vna = (const Elf_Vernaux *) ((char *)vn + vn->vn_aux); for (;;) { if (check_object_provided_version(obj, depobj, vna)) return (-1); vernum = VER_NEED_IDX(vna->vna_other); assert(vernum <= maxvernum); obj->vertab[vernum].hash = vna->vna_hash; obj->vertab[vernum].name = obj->strtab + vna->vna_name; obj->vertab[vernum].file = obj->strtab + vn->vn_file; obj->vertab[vernum].flags = (vna->vna_other & VER_NEED_HIDDEN) ? VER_INFO_HIDDEN : 0; if (vna->vna_next == 0) break; vna = (const Elf_Vernaux *) ((char *)vna + vna->vna_next); } if (vn->vn_next == 0) break; vn = (const Elf_Verneed *) ((char *)vn + vn->vn_next); } return 0; } static int rtld_verify_versions(const Objlist *objlist) { Objlist_Entry *entry; int rc; rc = 0; STAILQ_FOREACH(entry, objlist, link) { /* * Skip dummy objects or objects that have their version requirements * already checked. */ if (entry->obj->strtab == NULL || entry->obj->vertab != NULL) continue; if (rtld_verify_object_versions(entry->obj) == -1) { rc = -1; if (ld_tracing == NULL) break; } } if (rc == 0 || ld_tracing != NULL) rc = rtld_verify_object_versions(&obj_rtld); return rc; } const Ver_Entry * fetch_ventry(const Obj_Entry *obj, unsigned long symnum) { Elf_Versym vernum; if (obj->vertab) { vernum = VER_NDX(obj->versyms[symnum]); if (vernum >= obj->vernum) { _rtld_error("%s: symbol %s has wrong verneed value %d", obj->path, obj->strtab + symnum, vernum); } else if (obj->vertab[vernum].hash != 0) { return &obj->vertab[vernum]; } } return NULL; } int _rtld_get_stack_prot(void) { return (stack_prot); } int _rtld_is_dlopened(void *arg) { Obj_Entry *obj; RtldLockState lockstate; int res; rlock_acquire(rtld_bind_lock, &lockstate); obj = dlcheck(arg); if (obj == NULL) obj = obj_from_addr(arg); if (obj == NULL) { _rtld_error("No shared object contains address"); lock_release(rtld_bind_lock, &lockstate); return (-1); } res = obj->dlopened ? 1 : 0; lock_release(rtld_bind_lock, &lockstate); return (res); } static void map_stacks_exec(RtldLockState *lockstate) { void (*thr_map_stacks_exec)(void); if ((max_stack_flags & PF_X) == 0 || (stack_prot & PROT_EXEC) != 0) return; thr_map_stacks_exec = (void (*)(void))(uintptr_t) get_program_var_addr("__pthread_map_stacks_exec", lockstate); if (thr_map_stacks_exec != NULL) { stack_prot |= PROT_EXEC; thr_map_stacks_exec(); } } void symlook_init(SymLook *dst, const char *name) { bzero(dst, sizeof(*dst)); dst->name = name; dst->hash = elf_hash(name); dst->hash_gnu = gnu_hash(name); } static void symlook_init_from_req(SymLook *dst, const SymLook *src) { dst->name = src->name; dst->hash = src->hash; dst->hash_gnu = src->hash_gnu; dst->ventry = src->ventry; dst->flags = src->flags; dst->defobj_out = NULL; dst->sym_out = NULL; dst->lockstate = src->lockstate; } /* * Parse a file descriptor number without pulling in more of libc (e.g. atoi). */ static int parse_libdir(const char *str) { static const int RADIX = 10; /* XXXJA: possibly support hex? */ const char *orig; int fd; char c; orig = str; fd = 0; for (c = *str; c != '\0'; c = *++str) { if (c < '0' || c > '9') return (-1); fd *= RADIX; fd += c - '0'; } /* Make sure we actually parsed something. */ if (str == orig) { _rtld_error("failed to parse directory FD from '%s'", str); return (-1); } return (fd); } /* * Overrides for libc_pic-provided functions. */ int __getosreldate(void) { size_t len; int oid[2]; int error, osrel; if (osreldate != 0) return (osreldate); oid[0] = CTL_KERN; oid[1] = KERN_OSRELDATE; osrel = 0; len = sizeof(osrel); error = sysctl(oid, 2, &osrel, &len, NULL, 0); if (error == 0 && osrel > 0 && len == sizeof(osrel)) osreldate = osrel; return (osreldate); } void exit(int status) { _exit(status); } void (*__cleanup)(void); int __isthreaded = 0; int _thread_autoinit_dummy_decl = 1; /* * No unresolved symbols for rtld. */ void __pthread_cxa_finalize(struct dl_phdr_info *a) { } void __stack_chk_fail(void) { _rtld_error("stack overflow detected; terminated"); rtld_die(); } __weak_reference(__stack_chk_fail, __stack_chk_fail_local); void __chk_fail(void) { _rtld_error("buffer overflow detected; terminated"); rtld_die(); } const char * rtld_strerror(int errnum) { if (errnum < 0 || errnum >= sys_nerr) return ("Unknown error"); return (sys_errlist[errnum]); } Index: user/ngie/more-tests/libexec/rtld-elf/rtld.h =================================================================== --- user/ngie/more-tests/libexec/rtld-elf/rtld.h (revision 281584) +++ user/ngie/more-tests/libexec/rtld-elf/rtld.h (revision 281585) @@ -1,408 +1,409 @@ /*- * Copyright 1996, 1997, 1998, 1999, 2000 John D. Polstra. * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. * * $FreeBSD$ */ #ifndef RTLD_H /* { */ #define RTLD_H 1 #include #include #include #include #include #include #include #include #include "rtld_lock.h" #include "rtld_machdep.h" #ifdef COMPAT_32BIT #undef STANDARD_LIBRARY_PATH #undef _PATH_ELF_HINTS #define _PATH_ELF_HINTS "/var/run/ld-elf32.so.hints" /* For running 32 bit binaries */ #define STANDARD_LIBRARY_PATH "/lib32:/usr/lib32" #define LD_ "LD_32_" #endif #ifndef STANDARD_LIBRARY_PATH #define STANDARD_LIBRARY_PATH "/lib:/usr/lib" #endif #ifndef LD_ #define LD_ "LD_" #endif #define NEW(type) ((type *) xmalloc(sizeof(type))) #define CNEW(type) ((type *) xcalloc(1, sizeof(type))) /* We might as well do booleans like C++. */ typedef unsigned char bool; #define false 0 #define true 1 extern size_t tls_last_offset; extern size_t tls_last_size; extern size_t tls_static_space; extern int tls_dtv_generation; extern int tls_max_index; extern int npagesizes; extern size_t *pagesizes; extern int main_argc; extern char **main_argv; extern char **environ; struct stat; struct Struct_Obj_Entry; /* Lists of shared objects */ typedef struct Struct_Objlist_Entry { STAILQ_ENTRY(Struct_Objlist_Entry) link; struct Struct_Obj_Entry *obj; } Objlist_Entry; typedef STAILQ_HEAD(Struct_Objlist, Struct_Objlist_Entry) Objlist; /* Types of init and fini functions */ typedef void (*InitFunc)(void); typedef void (*InitArrFunc)(int, char **, char **); /* Lists of shared object dependencies */ typedef struct Struct_Needed_Entry { struct Struct_Needed_Entry *next; struct Struct_Obj_Entry *obj; unsigned long name; /* Offset of name in string table */ } Needed_Entry; typedef struct Struct_Name_Entry { STAILQ_ENTRY(Struct_Name_Entry) link; char name[1]; } Name_Entry; /* Lock object */ typedef struct Struct_LockInfo { void *context; /* Client context for creating locks */ void *thelock; /* The one big lock */ /* Debugging aids. */ volatile int rcount; /* Number of readers holding lock */ volatile int wcount; /* Number of writers holding lock */ /* Methods */ void *(*lock_create)(void *context); void (*rlock_acquire)(void *lock); void (*wlock_acquire)(void *lock); void (*rlock_release)(void *lock); void (*wlock_release)(void *lock); void (*lock_destroy)(void *lock); void (*context_destroy)(void *context); } LockInfo; typedef struct Struct_Ver_Entry { Elf_Word hash; unsigned int flags; const char *name; const char *file; } Ver_Entry; typedef struct Struct_Sym_Match_Result { const Elf_Sym *sym_out; const Elf_Sym *vsymp; int vcount; } Sym_Match_Result; #define VER_INFO_HIDDEN 0x01 /* * Shared object descriptor. * * Items marked with "(%)" are dynamically allocated, and must be freed * when the structure is destroyed. * * CAUTION: It appears that the JDK port peeks into these structures. * It looks at "next" and "mapbase" at least. Don't add new members * near the front, until this can be straightened out. */ typedef struct Struct_Obj_Entry { /* * These two items have to be set right for compatibility with the * original ElfKit crt1.o. */ Elf_Size magic; /* Magic number (sanity check) */ Elf_Size version; /* Version number of struct format */ struct Struct_Obj_Entry *next; char *path; /* Pathname of underlying file (%) */ char *origin_path; /* Directory path of origin file */ int refcount; int dl_refcount; /* Number of times loaded by dlopen */ /* These items are computed by map_object() or by digest_phdr(). */ caddr_t mapbase; /* Base address of mapped region */ size_t mapsize; /* Size of mapped region in bytes */ size_t textsize; /* Size of text segment in bytes */ Elf_Addr vaddrbase; /* Base address in shared object file */ caddr_t relocbase; /* Relocation constant = mapbase - vaddrbase */ const Elf_Dyn *dynamic; /* Dynamic section */ caddr_t entry; /* Entry point */ const Elf_Phdr *phdr; /* Program header if it is mapped, else NULL */ size_t phsize; /* Size of program header in bytes */ const char *interp; /* Pathname of the interpreter, if any */ Elf_Word stack_flags; /* TLS information */ int tlsindex; /* Index in DTV for this module */ void *tlsinit; /* Base address of TLS init block */ size_t tlsinitsize; /* Size of TLS init block for this module */ size_t tlssize; /* Size of TLS block for this module */ size_t tlsoffset; /* Offset of static TLS block for this module */ size_t tlsalign; /* Alignment of static TLS block */ caddr_t relro_page; size_t relro_size; /* Items from the dynamic section. */ Elf_Addr *pltgot; /* PLT or GOT, depending on architecture */ const Elf_Rel *rel; /* Relocation entries */ unsigned long relsize; /* Size in bytes of relocation info */ const Elf_Rela *rela; /* Relocation entries with addend */ unsigned long relasize; /* Size in bytes of addend relocation info */ const Elf_Rel *pltrel; /* PLT relocation entries */ unsigned long pltrelsize; /* Size in bytes of PLT relocation info */ const Elf_Rela *pltrela; /* PLT relocation entries with addend */ unsigned long pltrelasize; /* Size in bytes of PLT addend reloc info */ const Elf_Sym *symtab; /* Symbol table */ const char *strtab; /* String table */ unsigned long strsize; /* Size in bytes of string table */ #ifdef __mips__ Elf_Word local_gotno; /* Number of local GOT entries */ Elf_Word symtabno; /* Number of dynamic symbols */ Elf_Word gotsym; /* First dynamic symbol in GOT */ #endif const Elf_Verneed *verneed; /* Required versions. */ Elf_Word verneednum; /* Number of entries in verneed table */ const Elf_Verdef *verdef; /* Provided versions. */ Elf_Word verdefnum; /* Number of entries in verdef table */ const Elf_Versym *versyms; /* Symbol versions table */ const Elf_Hashelt *buckets; /* Hash table buckets array */ unsigned long nbuckets; /* Number of buckets */ const Elf_Hashelt *chains; /* Hash table chain array */ unsigned long nchains; /* Number of entries in chain array */ Elf32_Word nbuckets_gnu; /* Number of GNU hash buckets*/ Elf32_Word symndx_gnu; /* 1st accessible symbol on dynsym table */ Elf32_Word maskwords_bm_gnu; /* Bloom filter words - 1 (bitmask) */ Elf32_Word shift2_gnu; /* Bloom filter shift count */ Elf32_Word dynsymcount; /* Total entries in dynsym table */ Elf_Addr *bloom_gnu; /* Bloom filter used by GNU hash func */ const Elf_Hashelt *buckets_gnu; /* GNU hash table bucket array */ const Elf_Hashelt *chain_zero_gnu; /* GNU hash table value array (Zeroed) */ char *rpath; /* Search path specified in object */ char *runpath; /* Search path with different priority */ Needed_Entry *needed; /* Shared objects needed by this one (%) */ Needed_Entry *needed_filtees; Needed_Entry *needed_aux_filtees; STAILQ_HEAD(, Struct_Name_Entry) names; /* List of names for this object we know about. */ Ver_Entry *vertab; /* Versions required /defined by this object */ int vernum; /* Number of entries in vertab */ Elf_Addr init; /* Initialization function to call */ Elf_Addr fini; /* Termination function to call */ Elf_Addr preinit_array; /* Pre-initialization array of functions */ Elf_Addr init_array; /* Initialization array of functions */ Elf_Addr fini_array; /* Termination array of functions */ int preinit_array_num; /* Number of entries in preinit_array */ int init_array_num; /* Number of entries in init_array */ int fini_array_num; /* Number of entries in fini_array */ int32_t osrel; /* OSREL note value */ bool mainprog : 1; /* True if this is the main program */ bool rtld : 1; /* True if this is the dynamic linker */ bool relocated : 1; /* True if processed by relocate_objects() */ bool ver_checked : 1; /* True if processed by rtld_verify_object_versions */ bool textrel : 1; /* True if there are relocations to text seg */ bool symbolic : 1; /* True if generated with "-Bsymbolic" */ bool bind_now : 1; /* True if all relocations should be made first */ bool traced : 1; /* Already printed in ldd trace output */ bool jmpslots_done : 1; /* Already have relocated the jump slots */ bool init_done : 1; /* Already have added object to init list */ bool tls_done : 1; /* Already allocated offset for static TLS */ bool phdr_alloc : 1; /* Phdr is allocated and needs to be freed. */ bool z_origin : 1; /* Process rpath and soname tokens */ bool z_nodelete : 1; /* Do not unload the object and dependencies */ bool z_noopen : 1; /* Do not load on dlopen */ bool z_loadfltr : 1; /* Immediately load filtees */ bool z_interpose : 1; /* Interpose all objects but main */ bool z_nodeflib : 1; /* Don't search default library path */ + bool z_global : 1; /* Make the object global */ bool ref_nodel : 1; /* Refcount increased to prevent dlclose */ bool init_scanned: 1; /* Object is already on init list. */ bool on_fini_list: 1; /* Object is already on fini list. */ bool dag_inited : 1; /* Object has its DAG initialized. */ bool filtees_loaded : 1; /* Filtees loaded */ bool irelative : 1; /* Object has R_MACHDEP_IRELATIVE relocs */ bool gnu_ifunc : 1; /* Object has references to STT_GNU_IFUNC */ bool non_plt_gnu_ifunc : 1; /* Object has non-plt IFUNC references */ bool crt_no_init : 1; /* Object' crt does not call _init/_fini */ bool valid_hash_sysv : 1; /* A valid System V hash hash tag is available */ bool valid_hash_gnu : 1; /* A valid GNU hash tag is available */ bool dlopened : 1; /* dlopen()-ed (vs. load statically) */ struct link_map linkmap; /* For GDB and dlinfo() */ Objlist dldags; /* Object belongs to these dlopened DAGs (%) */ Objlist dagmembers; /* DAG has these members (%) */ dev_t dev; /* Object's filesystem's device */ ino_t ino; /* Object's inode number */ void *priv; /* Platform-dependent */ } Obj_Entry; #define RTLD_MAGIC 0xd550b87a #define RTLD_VERSION 1 #define RTLD_STATIC_TLS_EXTRA 128 /* Flags to be passed into symlook_ family of functions. */ #define SYMLOOK_IN_PLT 0x01 /* Lookup for PLT symbol */ #define SYMLOOK_DLSYM 0x02 /* Return newest versioned symbol. Used by dlsym. */ #define SYMLOOK_EARLY 0x04 /* Symlook is done during initialization. */ #define SYMLOOK_IFUNC 0x08 /* Allow IFUNC processing in reloc_non_plt(). */ /* Flags for load_object(). */ #define RTLD_LO_NOLOAD 0x01 /* dlopen() specified RTLD_NOLOAD. */ #define RTLD_LO_DLOPEN 0x02 /* Load_object() called from dlopen(). */ #define RTLD_LO_TRACE 0x04 /* Only tracing. */ #define RTLD_LO_NODELETE 0x08 /* Loaded object cannot be closed. */ #define RTLD_LO_FILTEES 0x10 /* Loading filtee. */ #define RTLD_LO_EARLY 0x20 /* Do not call ctors, postpone it to the initialization during the image start. */ /* * Symbol cache entry used during relocation to avoid multiple lookups * of the same symbol. */ typedef struct Struct_SymCache { const Elf_Sym *sym; /* Symbol table entry */ const Obj_Entry *obj; /* Shared object which defines it */ } SymCache; /* * This structure provides a reentrant way to keep a list of objects and * check which ones have already been processed in some way. */ typedef struct Struct_DoneList { const Obj_Entry **objs; /* Array of object pointers */ unsigned int num_alloc; /* Allocated size of the array */ unsigned int num_used; /* Number of array slots used */ } DoneList; struct Struct_RtldLockState { int lockstate; sigjmp_buf env; }; struct fill_search_info_args { int request; unsigned int flags; struct dl_serinfo *serinfo; struct dl_serpath *serpath; char *strspace; }; /* * The pack of arguments and results for the symbol lookup functions. */ typedef struct Struct_SymLook { const char *name; unsigned long hash; uint32_t hash_gnu; const Ver_Entry *ventry; int flags; const Obj_Entry *defobj_out; const Elf_Sym *sym_out; struct Struct_RtldLockState *lockstate; } SymLook; void _rtld_error(const char *, ...) __printflike(1, 2) __exported; void rtld_die(void) __dead2; const char *rtld_strerror(int); Obj_Entry *map_object(int, const char *, const struct stat *); void *xcalloc(size_t, size_t); void *xmalloc(size_t); char *xstrdup(const char *); void *malloc_aligned(size_t size, size_t align); void free_aligned(void *ptr); extern Elf_Addr _GLOBAL_OFFSET_TABLE_[]; extern Elf_Sym sym_zero; /* For resolving undefined weak refs. */ void dump_relocations(Obj_Entry *); void dump_obj_relocations(Obj_Entry *); void dump_Elf_Rel(Obj_Entry *, const Elf_Rel *, u_long); void dump_Elf_Rela(Obj_Entry *, const Elf_Rela *, u_long); /* * Function declarations. */ unsigned long elf_hash(const char *); const Elf_Sym *find_symdef(unsigned long, const Obj_Entry *, const Obj_Entry **, int, SymCache *, struct Struct_RtldLockState *); void init_pltgot(Obj_Entry *); void lockdflt_init(void); void digest_notes(Obj_Entry *, Elf_Addr, Elf_Addr); void obj_free(Obj_Entry *); Obj_Entry *obj_new(void); void _rtld_bind_start(void); void *rtld_resolve_ifunc(const Obj_Entry *obj, const Elf_Sym *def); void symlook_init(SymLook *, const char *); int symlook_obj(SymLook *, const Obj_Entry *); void *tls_get_addr_common(Elf_Addr** dtvp, int index, size_t offset); void *allocate_tls(Obj_Entry *, void *, size_t, size_t); void free_tls(void *, size_t, size_t); void *allocate_module_tls(int index); bool allocate_tls_offset(Obj_Entry *obj); void free_tls_offset(Obj_Entry *obj); const Ver_Entry *fetch_ventry(const Obj_Entry *obj, unsigned long); /* * MD function declarations. */ int do_copy_relocations(Obj_Entry *); int reloc_non_plt(Obj_Entry *, Obj_Entry *, int flags, struct Struct_RtldLockState *); int reloc_plt(Obj_Entry *); int reloc_jmpslots(Obj_Entry *, int flags, struct Struct_RtldLockState *); int reloc_iresolve(Obj_Entry *, struct Struct_RtldLockState *); int reloc_gnu_ifunc(Obj_Entry *, int flags, struct Struct_RtldLockState *); void allocate_initial_tls(Obj_Entry *); #endif /* } */ Index: user/ngie/more-tests/sys/arm64/arm64/vfp.c =================================================================== --- user/ngie/more-tests/sys/arm64/arm64/vfp.c (revision 281584) +++ user/ngie/more-tests/sys/arm64/arm64/vfp.c (revision 281585) @@ -1,194 +1,195 @@ /*- * Copyright (c) 2015 The FreeBSD Foundation * All rights reserved. * * This software was developed by Andrew Turner under * sponsorship from the FreeBSD Foundation. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ #include __FBSDID("$FreeBSD$"); #ifdef VFP #include #include #include #include #include #include #include #include /* Sanity check we can store all the VFP registers */ CTASSERT(sizeof(((struct pcb *)0)->pcb_vfp) == 16 * 32); static void vfp_enable(void) { uint32_t cpacr; cpacr = READ_SPECIALREG(cpacr_el1); cpacr = (cpacr & ~CPACR_FPEN_MASK) | CPACR_FPEN_TRAP_NONE; WRITE_SPECIALREG(cpacr_el1, cpacr); isb(); } static void vfp_disable(void) { uint32_t cpacr; cpacr = READ_SPECIALREG(cpacr_el1); cpacr = (cpacr & ~CPACR_FPEN_MASK) | CPACR_FPEN_TRAP_ALL1; WRITE_SPECIALREG(cpacr_el1, cpacr); isb(); } /* * Called when the thread is dying. If the thread was the last to use the * VFP unit mark it as unused to tell the kernel the fp state is unowned. * Ensure the VFP unit is off so we get an exception on the next access. */ void vfp_discard(struct thread *td) { if (PCPU_GET(fpcurthread) == td) PCPU_SET(fpcurthread, NULL); vfp_disable(); } void vfp_save_state(struct thread *td) { __int128_t *vfp_state; uint64_t fpcr, fpsr; uint32_t cpacr; + critical_enter(); /* * Only store the registers if the VFP is enabled, * i.e. return if we are trapping on FP access. */ cpacr = READ_SPECIALREG(cpacr_el1); - if ((cpacr & CPACR_FPEN_MASK) != CPACR_FPEN_TRAP_NONE) - return; + if ((cpacr & CPACR_FPEN_MASK) == CPACR_FPEN_TRAP_NONE) { + vfp_state = td->td_pcb->pcb_vfp; + __asm __volatile( + "mrs %0, fpcr \n" + "mrs %1, fpsr \n" + "stp q0, q1, [%2, #16 * 0]\n" + "stp q2, q3, [%2, #16 * 2]\n" + "stp q4, q5, [%2, #16 * 4]\n" + "stp q6, q7, [%2, #16 * 6]\n" + "stp q8, q9, [%2, #16 * 8]\n" + "stp q10, q11, [%2, #16 * 10]\n" + "stp q12, q13, [%2, #16 * 12]\n" + "stp q14, q15, [%2, #16 * 14]\n" + "stp q16, q17, [%2, #16 * 16]\n" + "stp q18, q19, [%2, #16 * 18]\n" + "stp q20, q21, [%2, #16 * 20]\n" + "stp q22, q23, [%2, #16 * 22]\n" + "stp q24, q25, [%2, #16 * 24]\n" + "stp q26, q27, [%2, #16 * 26]\n" + "stp q28, q29, [%2, #16 * 28]\n" + "stp q30, q31, [%2, #16 * 30]\n" + : "=&r"(fpcr), "=&r"(fpsr) : "r"(vfp_state)); - vfp_state = td->td_pcb->pcb_vfp; - __asm __volatile( - "mrs %0, fpcr \n" - "mrs %1, fpsr \n" - "stp q0, q1, [%2, #16 * 0]\n" - "stp q2, q3, [%2, #16 * 2]\n" - "stp q4, q5, [%2, #16 * 4]\n" - "stp q6, q7, [%2, #16 * 6]\n" - "stp q8, q9, [%2, #16 * 8]\n" - "stp q10, q11, [%2, #16 * 10]\n" - "stp q12, q13, [%2, #16 * 12]\n" - "stp q14, q15, [%2, #16 * 14]\n" - "stp q16, q17, [%2, #16 * 16]\n" - "stp q18, q19, [%2, #16 * 18]\n" - "stp q20, q21, [%2, #16 * 20]\n" - "stp q22, q23, [%2, #16 * 22]\n" - "stp q24, q25, [%2, #16 * 24]\n" - "stp q26, q27, [%2, #16 * 26]\n" - "stp q28, q29, [%2, #16 * 28]\n" - "stp q30, q31, [%2, #16 * 30]\n" - : "=&r"(fpcr), "=&r"(fpsr) : "r"(vfp_state)); + td->td_pcb->pcb_fpcr = fpcr; + td->td_pcb->pcb_fpsr = fpsr; - td->td_pcb->pcb_fpcr = fpcr; - td->td_pcb->pcb_fpsr = fpsr; - - dsb(); - vfp_disable(); + dsb(); + vfp_disable(); + } + critical_exit(); } void vfp_restore_state(void) { __int128_t *vfp_state; uint64_t fpcr, fpsr; struct pcb *curpcb; u_int cpu; critical_enter(); cpu = PCPU_GET(cpuid); curpcb = curthread->td_pcb; curpcb->pcb_fpflags |= PCB_FP_STARTED; vfp_enable(); if (PCPU_GET(fpcurthread) != curthread && cpu != curpcb->pcb_vfpcpu) { vfp_state = curthread->td_pcb->pcb_vfp; fpcr = curthread->td_pcb->pcb_fpcr; fpsr = curthread->td_pcb->pcb_fpsr; __asm __volatile( "ldp q0, q1, [%2, #16 * 0]\n" "ldp q2, q3, [%2, #16 * 2]\n" "ldp q4, q5, [%2, #16 * 4]\n" "ldp q6, q7, [%2, #16 * 6]\n" "ldp q8, q9, [%2, #16 * 8]\n" "ldp q10, q11, [%2, #16 * 10]\n" "ldp q12, q13, [%2, #16 * 12]\n" "ldp q14, q15, [%2, #16 * 14]\n" "ldp q16, q17, [%2, #16 * 16]\n" "ldp q18, q19, [%2, #16 * 18]\n" "ldp q20, q21, [%2, #16 * 20]\n" "ldp q22, q23, [%2, #16 * 22]\n" "ldp q24, q25, [%2, #16 * 24]\n" "ldp q26, q27, [%2, #16 * 26]\n" "ldp q28, q29, [%2, #16 * 28]\n" "ldp q30, q31, [%2, #16 * 30]\n" "msr fpcr, %0 \n" "msr fpsr, %1 \n" : : "r"(fpcr), "r"(fpsr), "r"(vfp_state)); PCPU_SET(fpcurthread, curthread); curpcb->pcb_vfpcpu = cpu; } critical_exit(); } void vfp_init(void) { uint64_t pfr; /* Check if there is a vfp unit present */ pfr = READ_SPECIALREG(id_aa64pfr0_el1); if ((pfr & ID_AA64PFR0_FP_MASK) == ID_AA64PFR0_FP_NONE) return; /* Disable to be enabled when it's used */ vfp_disable(); } SYSINIT(vfp, SI_SUB_CPU, SI_ORDER_ANY, vfp_init, NULL); #endif Index: user/ngie/more-tests/sys/arm64/arm64/vm_machdep.c =================================================================== --- user/ngie/more-tests/sys/arm64/arm64/vm_machdep.c (revision 281584) +++ user/ngie/more-tests/sys/arm64/arm64/vm_machdep.c (revision 281585) @@ -1,248 +1,265 @@ /*- * Copyright (c) 2014 Andrew Turner * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * */ #include __FBSDID("$FreeBSD$"); #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include +#ifdef VFP +#include +#endif + /* * Finish a fork operation, with process p2 nearly set up. * Copy and update the pcb, set up the stack so that the child * ready to run and return to user mode. */ void cpu_fork(struct thread *td1, struct proc *p2, struct thread *td2, int flags) { struct pcb *pcb2; struct trapframe *tf; if ((flags & RFPROC) == 0) return; + + if (td1 == curthread) { + /* + * Save the tpidr_el0 and the vfp state, these normally happen + * in cpu_switch, but if userland changes these then forks + * this may not have happened. + */ + td1->td_pcb->pcb_tpidr_el0 = READ_SPECIALREG(tpidr_el0); +#ifdef VFP + if ((td1->td_pcb->pcb_fpflags & PCB_FP_STARTED) != 0) + vfp_save_state(td1); +#endif + } pcb2 = (struct pcb *)(td2->td_kstack + td2->td_kstack_pages * PAGE_SIZE) - 1; td2->td_pcb = pcb2; bcopy(td1->td_pcb, pcb2, sizeof(*pcb2)); td2->td_pcb->pcb_l1addr = vtophys(vmspace_pmap(td2->td_proc->p_vmspace)->pm_l1); tf = (struct trapframe *)STACKALIGN((struct trapframe *)pcb2 - 1); bcopy(td1->td_frame, tf, sizeof(*tf)); tf->tf_x[0] = 0; tf->tf_x[1] = 0; tf->tf_spsr = 0; td2->td_frame = tf; /* Set the return value registers for fork() */ td2->td_pcb->pcb_x[8] = (uintptr_t)fork_return; td2->td_pcb->pcb_x[9] = (uintptr_t)td2; td2->td_pcb->pcb_x[PCB_LR] = (uintptr_t)fork_trampoline; td2->td_pcb->pcb_sp = (uintptr_t)td2->td_frame; td2->td_pcb->pcb_vfpcpu = UINT_MAX; /* Setup to release spin count in fork_exit(). */ td2->td_md.md_spinlock_count = 1; td2->td_md.md_saved_daif = 0; } void cpu_reset(void) { printf("cpu_reset"); while(1) __asm volatile("wfi" ::: "memory"); } void cpu_thread_swapin(struct thread *td) { } void cpu_thread_swapout(struct thread *td) { } void cpu_set_syscall_retval(struct thread *td, int error) { struct trapframe *frame; frame = td->td_frame; switch (error) { case 0: frame->tf_x[0] = td->td_retval[0]; frame->tf_x[1] = td->td_retval[1]; frame->tf_spsr &= ~PSR_C; /* carry bit */ break; case ERESTART: frame->tf_elr -= 4; break; case EJUSTRETURN: break; default: frame->tf_spsr |= PSR_C; /* carry bit */ frame->tf_x[0] = error; break; } } /* * Initialize machine state (pcb and trap frame) for a new thread about to * upcall. Put enough state in the new thread's PCB to get it to go back * userret(), where we can intercept it again to set the return (upcall) * Address and stack, along with those from upcals that are from other sources * such as those generated in thread_userret() itself. */ void cpu_set_upcall(struct thread *td, struct thread *td0) { bcopy(td0->td_frame, td->td_frame, sizeof(struct trapframe)); bcopy(td0->td_pcb, td->td_pcb, sizeof(struct pcb)); td->td_pcb->pcb_x[8] = (uintptr_t)fork_return; td->td_pcb->pcb_x[9] = (uintptr_t)td; td->td_pcb->pcb_x[PCB_LR] = (uintptr_t)fork_trampoline; td->td_pcb->pcb_sp = (uintptr_t)td->td_frame; td->td_pcb->pcb_vfpcpu = UINT_MAX; /* Setup to release spin count in fork_exit(). */ td->td_md.md_spinlock_count = 1; td->td_md.md_saved_daif = 0; } /* * Set that machine state for performing an upcall that has to * be done in thread_userret() so that those upcalls generated * in thread_userret() itself can be done as well. */ void cpu_set_upcall_kse(struct thread *td, void (*entry)(void *), void *arg, stack_t *stack) { panic("cpu_set_upcall_kse"); } int cpu_set_user_tls(struct thread *td, void *tls_base) { panic("cpu_set_user_tls"); } void cpu_thread_exit(struct thread *td) { } void cpu_thread_alloc(struct thread *td) { td->td_pcb = (struct pcb *)(td->td_kstack + td->td_kstack_pages * PAGE_SIZE) - 1; td->td_frame = (struct trapframe *)STACKALIGN( td->td_pcb - 1); } void cpu_thread_free(struct thread *td) { } void cpu_thread_clean(struct thread *td) { } /* * Intercept the return address from a freshly forked process that has NOT * been scheduled yet. * * This is needed to make kernel threads stay in kernel mode. */ void cpu_set_fork_handler(struct thread *td, void (*func)(void *), void *arg) { td->td_pcb->pcb_x[8] = (uintptr_t)func; td->td_pcb->pcb_x[9] = (uintptr_t)arg; td->td_pcb->pcb_x[PCB_LR] = (uintptr_t)fork_trampoline; td->td_pcb->pcb_sp = (uintptr_t)td->td_frame; td->td_pcb->pcb_vfpcpu = UINT_MAX; } void cpu_exit(struct thread *td) { } void swi_vm(void *v) { /* Nothing to do here - busdma bounce buffers are not implemented. */ } void * uma_small_alloc(uma_zone_t zone, vm_size_t bytes, u_int8_t *flags, int wait) { panic("uma_small_alloc"); } void uma_small_free(void *mem, vm_size_t size, u_int8_t flags) { panic("uma_small_free"); } Index: user/ngie/more-tests/sys/arm64/include/psl.h =================================================================== --- user/ngie/more-tests/sys/arm64/include/psl.h (nonexistent) +++ user/ngie/more-tests/sys/arm64/include/psl.h (revision 281585) @@ -0,0 +1 @@ +/* $FreeBSD$ */ Property changes on: user/ngie/more-tests/sys/arm64/include/psl.h ___________________________________________________________________ Added: svn:eol-style ## -0,0 +1 ## +native \ No newline at end of property Added: svn:keywords ## -0,0 +1 ## +FreeBSD=%H \ No newline at end of property Added: svn:mime-type ## -0,0 +1 ## +text/plain \ No newline at end of property Index: user/ngie/more-tests/sys/boot/Makefile.arm =================================================================== --- user/ngie/more-tests/sys/boot/Makefile.arm (revision 281584) +++ user/ngie/more-tests/sys/boot/Makefile.arm (revision 281585) @@ -1,7 +1,7 @@ # $FreeBSD$ .if ${MK_FDT} != "no" SUBDIR+= fdt .endif -SUBDIR+= uboot +SUBDIR+= efi uboot Index: user/ngie/more-tests/sys/boot/Makefile.arm64 =================================================================== --- user/ngie/more-tests/sys/boot/Makefile.arm64 (nonexistent) +++ user/ngie/more-tests/sys/boot/Makefile.arm64 (revision 281585) @@ -0,0 +1,7 @@ +# $FreeBSD$ + +.if ${MK_FDT} != "no" +SUBDIR+= fdt +.endif + +SUBDIR+= efi Property changes on: user/ngie/more-tests/sys/boot/Makefile.arm64 ___________________________________________________________________ Added: svn:eol-style ## -0,0 +1 ## +native \ No newline at end of property Added: svn:keywords ## -0,0 +1 ## +FreeBSD=%H \ No newline at end of property Added: svn:mime-type ## -0,0 +1 ## +text/plain \ No newline at end of property Index: user/ngie/more-tests/sys/boot/arm64/Makefile =================================================================== --- user/ngie/more-tests/sys/boot/arm64/Makefile (nonexistent) +++ user/ngie/more-tests/sys/boot/arm64/Makefile (revision 281585) @@ -0,0 +1,3 @@ +# $FreeBSD$ + +.include Property changes on: user/ngie/more-tests/sys/boot/arm64/Makefile ___________________________________________________________________ Added: svn:eol-style ## -0,0 +1 ## +native \ No newline at end of property Added: svn:keywords ## -0,0 +1 ## +FreeBSD=%H \ No newline at end of property Added: svn:mime-type ## -0,0 +1 ## +text/plain \ No newline at end of property Index: user/ngie/more-tests/sys/boot/arm64/libarm64/cache.c =================================================================== --- user/ngie/more-tests/sys/boot/arm64/libarm64/cache.c (nonexistent) +++ user/ngie/more-tests/sys/boot/arm64/libarm64/cache.c (revision 281585) @@ -0,0 +1,95 @@ +/*- + * Copyright (c) 2014 The FreeBSD Foundation + * All rights reserved. + * + * This software was developed by Semihalf under + * the sponsorship of the FreeBSD Foundation. + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +#include +__FBSDID("$FreeBSD$"); + +#include + +#include + +#include +#include + +#include "cache.h" + +static unsigned int +get_dcache_line_size(void) +{ + uint64_t ctr; + unsigned int dcl_size; + + /* Accessible from all security levels */ + ctr = READ_SPECIALREG(ctr_el0); + + /* + * Relevant field [19:16] is LOG2 + * of the number of words in DCache line + */ + dcl_size = CTR_DLINE_SIZE(ctr); + + /* Size of word shifted by cache line size */ + return (sizeof(int) << dcl_size); +} + +void +cpu_flush_dcache(const void *ptr, size_t len) +{ + + uint64_t cl_size; + vm_offset_t addr, end; + + cl_size = get_dcache_line_size(); + + /* Calculate end address to clean */ + end = (vm_offset_t)(ptr + len); + /* Align start address to cache line */ + addr = (vm_offset_t)ptr; + addr = rounddown2(addr, cl_size); + + for (; addr < end; addr += cl_size) + __asm __volatile("dc civac, %0" : : "r" (addr) : "memory"); + /* Full system DSB */ + __asm __volatile("dsb sy" : : : "memory"); +} + +void +cpu_inval_icache(const void *ptr, size_t len) +{ + + /* NULL ptr or 0 len means all */ + if (ptr == NULL || len == 0) { + __asm __volatile( + "ic ialluis \n" + "dsb ish \n" + : : : "memory"); + return; + } + + /* TODO: Other cache ranges if necessary */ +} Property changes on: user/ngie/more-tests/sys/boot/arm64/libarm64/cache.c ___________________________________________________________________ Added: svn:eol-style ## -0,0 +1 ## +native \ No newline at end of property Added: svn:keywords ## -0,0 +1 ## +FreeBSD=%H \ No newline at end of property Added: svn:mime-type ## -0,0 +1 ## +text/plain \ No newline at end of property Index: user/ngie/more-tests/sys/boot/arm64/libarm64/cache.h =================================================================== --- user/ngie/more-tests/sys/boot/arm64/libarm64/cache.h (nonexistent) +++ user/ngie/more-tests/sys/boot/arm64/libarm64/cache.h (revision 281585) @@ -0,0 +1,38 @@ +/*- + * Copyright (c) 2014 The FreeBSD Foundation + * All rights reserved. + * + * This software was developed by Semihalf under + * the sponsorship of the FreeBSD Foundation. + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + * + * $FreeBSD$ + */ + +#ifndef _CACHE_H_ +#define _CACHE_H_ + +/* cache.c */ +void cpu_flush_dcache(const void *, size_t); +void cpu_inval_icache(const void *, size_t); + +#endif /* _CACHE_H_ */ Property changes on: user/ngie/more-tests/sys/boot/arm64/libarm64/cache.h ___________________________________________________________________ Added: svn:eol-style ## -0,0 +1 ## +native \ No newline at end of property Added: svn:keywords ## -0,0 +1 ## +FreeBSD=%H \ No newline at end of property Added: svn:mime-type ## -0,0 +1 ## +text/plain \ No newline at end of property Index: user/ngie/more-tests/sys/boot/common/Makefile.inc =================================================================== --- user/ngie/more-tests/sys/boot/common/Makefile.inc (revision 281584) +++ user/ngie/more-tests/sys/boot/common/Makefile.inc (revision 281585) @@ -1,73 +1,75 @@ # $FreeBSD$ SRCS+= boot.c commands.c console.c devopen.c interp.c SRCS+= interp_backslash.c interp_parse.c ls.c misc.c SRCS+= module.c panic.c .if ${MACHINE} == "i386" || ${MACHINE_CPUARCH} == "amd64" SRCS+= load_elf32.c load_elf32_obj.c reloc_elf32.c SRCS+= load_elf64.c load_elf64_obj.c reloc_elf64.c .elif ${MACHINE} == "pc98" SRCS+= load_elf32.c load_elf32_obj.c reloc_elf32.c +.elif ${MACHINE_CPUARCH} == "aarch64" +SRCS+= load_elf64.c reloc_elf64.c .elif ${MACHINE_CPUARCH} == "arm" SRCS+= load_elf32.c reloc_elf32.c .elif ${MACHINE_CPUARCH} == "powerpc" SRCS+= load_elf32.c reloc_elf32.c SRCS+= load_elf64.c reloc_elf64.c .elif ${MACHINE_CPUARCH} == "sparc64" SRCS+= load_elf64.c reloc_elf64.c .elif ${MACHINE_ARCH} == "mips64" || ${MACHINE_ARCH} == "mips64el" SRCS+= load_elf64.c reloc_elf64.c .endif .if defined(LOADER_NET_SUPPORT) SRCS+= dev_net.c .endif .if !defined(LOADER_NO_DISK_SUPPORT) SRCS+= disk.c part.c CFLAGS+= -DLOADER_DISK_SUPPORT .if !defined(LOADER_NO_GPT_SUPPORT) SRCS+= crc32.c CFLAGS+= -DLOADER_GPT_SUPPORT .endif .if !defined(LOADER_NO_MBR_SUPPORT) CFLAGS+= -DLOADER_MBR_SUPPORT .endif .endif .if defined(HAVE_BCACHE) SRCS+= bcache.c .endif .if defined(MD_IMAGE_SIZE) CFLAGS+= -DMD_IMAGE_SIZE=${MD_IMAGE_SIZE} SRCS+= md.c .endif # Machine-independant ISA PnP .if defined(HAVE_ISABUS) SRCS+= isapnp.c .endif .if defined(HAVE_PNP) SRCS+= pnp.c .endif # Forth interpreter .if defined(BOOT_FORTH) SRCS+= interp_forth.c .endif .if defined(BOOT_PROMPT_123) CFLAGS+= -DBOOT_PROMPT_123 .endif .if defined(LOADER_INSTALL_SUPPORT) SRCS+= install.c CFLAGS+=-I${.CURDIR}/../../../../lib/libstand .endif MAN+= loader.8 .if ${MK_ZFS} != "no" MAN+= zfsloader.8 .endif Index: user/ngie/more-tests/sys/boot/efi/Makefile =================================================================== --- user/ngie/more-tests/sys/boot/efi/Makefile (revision 281584) +++ user/ngie/more-tests/sys/boot/efi/Makefile (revision 281585) @@ -1,17 +1,19 @@ # $FreeBSD$ .include SUBDIR= libefi .if ${MACHINE_CPUARCH} == "aarch64" || ${MACHINE_CPUARCH} == "arm" .if ${MK_FDT} != "no" SUBDIR+= fdt .endif .endif -.if ${MACHINE_CPUARCH} == "amd64" || ${MACHINE_CPUARCH} == "arm" +.if ${MACHINE_CPUARCH} == "aarch64" || \ + ${MACHINE_CPUARCH} == "amd64" || \ + ${MACHINE_CPUARCH} == "arm" SUBDIR+= loader boot1 .endif .include Index: user/ngie/more-tests/sys/boot/efi/boot1/Makefile =================================================================== --- user/ngie/more-tests/sys/boot/efi/boot1/Makefile (revision 281584) +++ user/ngie/more-tests/sys/boot/efi/boot1/Makefile (revision 281585) @@ -1,106 +1,106 @@ # $FreeBSD$ MAN= .include # In-tree GCC does not support __attribute__((ms_abi)). .if ${COMPILER_TYPE} != "gcc" MK_SSP= no PROG= loader.sym INTERNALPROG= # architecture-specific loader code SRCS= boot1.c reloc.c start.S CFLAGS+= -I. CFLAGS+= -I${.CURDIR}/../include -CFLAGS+= -I${.CURDIR}/../include/${MACHINE_CPUARCH} +CFLAGS+= -I${.CURDIR}/../include/${MACHINE} CFLAGS+= -I${.CURDIR}/../../../contrib/dev/acpica/include CFLAGS+= -I${.CURDIR}/../../.. # Always add MI sources and REGULAR efi loader bits -.PATH: ${.CURDIR}/../loader/arch/${MACHINE_CPUARCH} +.PATH: ${.CURDIR}/../loader/arch/${MACHINE} .PATH: ${.CURDIR}/../loader .PATH: ${.CURDIR}/../../common CFLAGS+= -I${.CURDIR}/../../common FILES= boot1.efi boot1.efifat FILESMODE_boot1.efi= ${BINMODE} -LDSCRIPT= ${.CURDIR}/../loader/arch/${MACHINE_CPUARCH}/ldscript.${MACHINE_CPUARCH} +LDSCRIPT= ${.CURDIR}/../loader/arch/${MACHINE}/ldscript.${MACHINE} LDFLAGS= -Wl,-T${LDSCRIPT} -Wl,-Bsymbolic -shared .if ${MACHINE_CPUARCH} == "amd64" || ${MACHINE_CPUARCH} == "i386" CFLAGS+= -fPIC LDFLAGS+= -Wl,-znocombreloc .endif .if ${MACHINE_CPUARCH} == "arm" || ${MACHINE_CPUARCH} == "i386" # # Add libstand for the runtime functions used by the compiler - for example # __aeabi_* (arm) or __divdi3 (i386). # DPADD+= ${LIBSTAND} LDADD+= -lstand .endif ${PROG}: ${LDSCRIPT} OBJCOPY?= objcopy OBJDUMP?= objdump .if ${MACHINE_CPUARCH} == "amd64" EFI_TARGET= efi-app-x86_64 .elif ${MACHINE_CPUARCH} == "i386" EFI_TARGET= efi-app-ia32 .else EFI_TARGET= binary .endif boot1.efi: loader.sym if [ `${OBJDUMP} -t ${.ALLSRC} | fgrep '*UND*' | wc -l` != 0 ]; then \ ${OBJDUMP} -t ${.ALLSRC} | fgrep '*UND*'; \ exit 1; \ fi ${OBJCOPY} -j .peheader -j .text -j .sdata -j .data \ -j .dynamic -j .dynsym -j .rel.dyn \ -j .rela.dyn -j .reloc -j .eh_frame -j set_Xcommand_set \ --output-target=${EFI_TARGET} ${.ALLSRC} ${.TARGET} boot1.o: ${.CURDIR}/../../common/ufsread.c # The following inserts out objects into a template FAT file system # created by generate-fat.sh .include "${.CURDIR}/Makefile.fat" boot1.efifat: boot1.efi echo ${.OBJDIR} - uudecode ${.CURDIR}/fat-${MACHINE_CPUARCH}.tmpl.bz2.uu - mv fat-${MACHINE_CPUARCH}.tmpl.bz2 ${.TARGET}.bz2 + uudecode ${.CURDIR}/fat-${MACHINE}.tmpl.bz2.uu + mv fat-${MACHINE}.tmpl.bz2 ${.TARGET}.bz2 bzip2 -f -d ${.TARGET}.bz2 dd if=boot1.efi of=${.TARGET} seek=${BOOT1_OFFSET} conv=notrunc CLEANFILES= boot1.efi boot1.efifat .endif # ${COMPILER_TYPE} != "gcc" .include beforedepend ${OBJS}: machine CLEANFILES+= machine machine: ln -sf ${.CURDIR}/../../../${MACHINE}/include machine .if ${MACHINE_CPUARCH} == "amd64" || ${MACHINE_CPUARCH} == "i386" beforedepend ${OBJS}: x86 CLEANFILES+= x86 x86: ln -sf ${.CURDIR}/../../../x86/include x86 .endif Index: user/ngie/more-tests/sys/boot/efi/boot1/fat-arm64.tmpl.bz2.uu =================================================================== --- user/ngie/more-tests/sys/boot/efi/boot1/fat-arm64.tmpl.bz2.uu (nonexistent) +++ user/ngie/more-tests/sys/boot/efi/boot1/fat-arm64.tmpl.bz2.uu (revision 281585) @@ -0,0 +1,26 @@ +FAT template boot filesystem created by generate-fat.sh +DO NOT EDIT +$FreeBSD$ +begin 644 fat-arm64.tmpl.bz2 +M0EIH.3%!62936:2BH:(`&T#_____ZZKJ[_^N_ZO_J_Z[OJ_NJ^JK^KZNKNNJ +MZNKNZOJ^P`+\```"``:`9,@T&F3$,@!B`,AIHP$#0-`T``,09--&31IH9#)D +M,(#$T&)B!A``-`,F0:#3)B&0`Q`&0TT8"!H&@:``&(,FFC)HTT,ADR&$!B:# +M$Q`P@`&@&3(-!IDQ#(`8@#(::,!`T#0-``#$&331DT::&0R9#"`Q-!B8@8"J +M*0GY$I&GH"-&AZ@T```T`:`!HT```&@&@-,C0`-`#U,(-`&)ZF(PGIJ>IO;U +M^=&QL3`-\E@Q+$(RTHB$7I"(B(-W:73$0@A#;S##$3$`A#FL\LAF,;&8;[CE +M&D=@ON\:9IWHO):QK7LL=LFN;1M6Y:%F>-1G^&O-A*(@0AQ,\#H*KRCJF>9Q +MFF_,RWU4X-,R6K5EJU:M6L"JMB5555555JVJU*U-JU:M6MUB*I555555;XHJ +ME555555YM='L;(N7+ERY_!^S^F[=KA#3YJ.S-)`!>O]K)`#-ZO +MU=9T,X(@!H',-+,1Q'-6'#ZNQORGURS=]_O%.6SF5G,PC`G#X_@7W$RC>2Q) +M9MW3P&G:AJFA?`^AKWXOV?R_QDL9F^`=5H>$UWWT9K&Q/HS.!KXB)U)9$6,) +M*/!EJ7>+W2L65_C\&LP69G$?'M$18.(LL.G:AZ;%>TUKYF.V+9/W9AMF[&^]HFB?J_AM6V;ANF2RG7,61I)?])L352^EX5C12BSAX@$(3`!EWEZ6P2R$9#_Q=R13A0D*2B +"H:(` +` +end Property changes on: user/ngie/more-tests/sys/boot/efi/boot1/fat-arm64.tmpl.bz2.uu ___________________________________________________________________ Added: svn:keywords ## -0,0 +1 ## +FreeBSD=%H \ No newline at end of property Index: user/ngie/more-tests/sys/boot/efi/boot1/generate-fat.sh =================================================================== --- user/ngie/more-tests/sys/boot/efi/boot1/generate-fat.sh (revision 281584) +++ user/ngie/more-tests/sys/boot/efi/boot1/generate-fat.sh (revision 281585) @@ -1,69 +1,69 @@ #!/bin/sh # This script generates the dummy FAT filesystem used for the EFI boot # blocks. It uses newfs_msdos to generate a template filesystem with the # relevant interesting files. These are then found by grep, and the offsets # written to a Makefile snippet. # # Because it requires root, and because it is overkill, we do not # do this as part of the normal build. If makefs(8) grows workable FAT # support, this should be revisited. # $FreeBSD$ FAT_SIZE=1600 #Size in 512-byte blocks of the produced image BOOT1_SIZE=128k # # Known filenames # amd64: BOOTx64.efi -# aarch64: BOOTaa64.efi +# arm64: BOOTaa64.efi # arm: BOOTarm.efi # i386: BOOTia32.efi # if [ -z "$2" ]; then echo "Usage: $0 arch boot-filename" exit 1 fi ARCH=$1 FILENAME=$2 # Generate 800K FAT image OUTPUT_FILE=fat-${ARCH}.tmpl dd if=/dev/zero of=$OUTPUT_FILE bs=512 count=$FAT_SIZE DEVICE=`mdconfig -a -f $OUTPUT_FILE` newfs_msdos -F 12 -L EFI $DEVICE mkdir stub mount -t msdosfs /dev/$DEVICE stub # Create and bless a directory for the boot loader mkdir -p stub/efi/boot # Make a dummy file for boot1 echo 'Boot1 START' | dd of=stub/efi/boot/$FILENAME cbs=$BOOT1_SIZE count=1 conv=block umount stub mdconfig -d -u $DEVICE rmdir stub # Locate the offset of the fake file BOOT1_OFFSET=$(hd $OUTPUT_FILE | grep 'Boot1 START' | cut -f 1 -d ' ') # Convert to number of blocks BOOT1_OFFSET=$(echo 0x$BOOT1_OFFSET | awk '{printf("%x\n",$1/512);}') echo '# This file autogenerated by generate-fat.sh - DO NOT EDIT' > Makefile.fat echo '# $FreeBSD$' >> Makefile.fat echo "BOOT1_OFFSET=0x$BOOT1_OFFSET" >> Makefile.fat bzip2 $OUTPUT_FILE echo 'FAT template boot filesystem created by generate-fat.sh' > $OUTPUT_FILE.bz2.uu echo 'DO NOT EDIT' >> $OUTPUT_FILE.bz2.uu echo '$FreeBSD$' >> $OUTPUT_FILE.bz2.uu uuencode $OUTPUT_FILE.bz2 $OUTPUT_FILE.bz2 >> $OUTPUT_FILE.bz2.uu rm $OUTPUT_FILE.bz2 Index: user/ngie/more-tests/sys/boot/efi/fdt/Makefile =================================================================== --- user/ngie/more-tests/sys/boot/efi/fdt/Makefile (revision 281584) +++ user/ngie/more-tests/sys/boot/efi/fdt/Makefile (revision 281585) @@ -1,36 +1,36 @@ # $FreeBSD$ .include .PATH: ${.CURDIR}/../../common LIB= efi_fdt INTERNALLIB= SRCS= efi_fdt.c CFLAGS+= -ffreestanding -msoft-float -.if ${MACHINE_CPUARCH} == "arm64" +.if ${MACHINE_CPUARCH} == "aarch64" CFLAGS+= -mgeneral-regs-only .endif CFLAGS+= -I${.CURDIR}/../../../../lib/libstand/ # EFI library headers CFLAGS+= -I${.CURDIR}/../include -CFLAGS+= -I${.CURDIR}/../include/${MACHINE_CPUARCH} +CFLAGS+= -I${.CURDIR}/../include/${MACHINE} # libfdt headers CFLAGS+= -I${.CURDIR}/../../fdt # Pick up the bootstrap header for some interface items CFLAGS+= -I${.CURDIR}/../../common -I${.CURDIR}/../../.. -I. machine: - ln -sf ${.CURDIR}/../../../${MACHINE_CPUARCH}/include machine + ln -sf ${.CURDIR}/../../../${MACHINE}/include machine CLEANFILES+= machine .include beforedepend ${OBJS}: machine Index: user/ngie/more-tests/sys/boot/efi/include/arm64/efibind.h =================================================================== --- user/ngie/more-tests/sys/boot/efi/include/arm64/efibind.h (nonexistent) +++ user/ngie/more-tests/sys/boot/efi/include/arm64/efibind.h (revision 281585) @@ -0,0 +1,219 @@ +/* $FreeBSD$ */ +/*++ + +Copyright (c) 1999 - 2003 Intel Corporation. All rights reserved +This software and associated documentation (if any) is furnished +under a license and may only be used or copied in accordance +with the terms of the license. Except as permitted by such +license, no part of this software or documentation may be +reproduced, stored in a retrieval system, or transmitted in any +form or by any means without the express written consent of +Intel Corporation. + +Module Name: + + efefind.h + +Abstract: + + EFI to compile bindings + + + + +Revision History + +--*/ + +#pragma pack() + + +#ifdef __FreeBSD__ +#include +#else +// +// Basic int types of various widths +// + +#if (__STDC_VERSION__ < 199901L ) + + // No ANSI C 1999/2000 stdint.h integer width declarations + + #if _MSC_EXTENSIONS + + // Use Microsoft C compiler integer width declarations + + typedef unsigned __int64 uint64_t; + typedef __int64 int64_t; + typedef unsigned __int32 uint32_t; + typedef __int32 int32_t; + typedef unsigned __int16 uint16_t; + typedef __int16 int16_t; + typedef unsigned __int8 uint8_t; + typedef __int8 int8_t; + #else + #ifdef UNIX_LP64 + + // Use LP64 programming model from C_FLAGS for integer width declarations + + typedef unsigned long uint64_t; + typedef long int64_t; + typedef unsigned int uint32_t; + typedef int int32_t; + typedef unsigned short uint16_t; + typedef short int16_t; + typedef unsigned char uint8_t; + typedef char int8_t; + #else + + // Assume P64 programming model from C_FLAGS for integer width declarations + + typedef unsigned long long uint64_t; + typedef long long int64_t; + typedef unsigned int uint32_t; + typedef int int32_t; + typedef unsigned short uint16_t; + typedef short int16_t; + typedef unsigned char uint8_t; + typedef char int8_t; + #endif + #endif +#endif +#endif /* __FreeBSD__ */ + +// +// Basic EFI types of various widths +// + + +typedef uint64_t UINT64; +typedef int64_t INT64; +typedef uint32_t UINT32; +typedef int32_t INT32; +typedef uint16_t UINT16; +typedef int16_t INT16; +typedef uint8_t UINT8; +typedef int8_t INT8; + + +#undef VOID +#define VOID void + + +typedef int64_t INTN; +typedef uint64_t UINTN; + +//++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ +// BugBug: Code to debug +// +#define BIT63 0x8000000000000000 + +#define PLATFORM_IOBASE_ADDRESS (0xffffc000000 | BIT63) +#define PORT_TO_MEMD(_Port) (PLATFORM_IOBASE_ADDRESS | ( ( ( (_Port) & 0xfffc) << 10 ) | ( (_Port) & 0x0fff) ) ) + +// +// Macro's with casts make this much easier to use and read. +// +#define PORT_TO_MEM8D(_Port) (*(UINT8 *)(PORT_TO_MEMD(_Port))) +#define POST_CODE(_Data) (PORT_TO_MEM8D(0x80) = (_Data)) +// +// BugBug: End Debug Code!!! +//+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ + +#define EFIERR(a) (0x8000000000000000 | a) +#define EFI_ERROR_MASK 0x8000000000000000 +#define EFIERR_OEM(a) (0xc000000000000000 | a) + +#define BAD_POINTER 0xFBFBFBFBFBFBFBFB +#define MAX_ADDRESS 0xFFFFFFFFFFFFFFFF + +#pragma intrinsic (__break) +#define BREAKPOINT() __break(0) + +// +// Pointers must be aligned to these address to function +// you will get an alignment fault if this value is less than 8 +// +#define MIN_ALIGNMENT_SIZE 8 + +#define ALIGN_VARIABLE(Value , Adjustment) \ + (UINTN) Adjustment = 0; \ + if((UINTN)Value % MIN_ALIGNMENT_SIZE) \ + (UINTN)Adjustment = MIN_ALIGNMENT_SIZE - ((UINTN)Value % MIN_ALIGNMENT_SIZE); \ + Value = (UINTN)Value + (UINTN)Adjustment + +// +// Define macros to create data structure signatures. +// + +#define EFI_SIGNATURE_16(A,B) ((A) | (B<<8)) +#define EFI_SIGNATURE_32(A,B,C,D) (EFI_SIGNATURE_16(A,B) | (EFI_SIGNATURE_16(C,D) << 16)) +#define EFI_SIGNATURE_64(A,B,C,D,E,F,G,H) (EFI_SIGNATURE_32(A,B,C,D) | ((UINT64)(EFI_SIGNATURE_32(E,F,G,H)) << 32)) + +// +// EFIAPI - prototype calling convention for EFI function pointers +// BOOTSERVICE - prototype for implementation of a boot service interface +// RUNTIMESERVICE - prototype for implementation of a runtime service interface +// RUNTIMEFUNCTION - prototype for implementation of a runtime function that is not a service +// RUNTIME_CODE - pragma macro for declaring runtime code +// + +#ifndef EFIAPI // Forces EFI calling conventions reguardless of compiler options + #if _MSC_EXTENSIONS + #define EFIAPI __cdecl // Force C calling convention for Microsoft C compiler + #else + #define EFIAPI // Substitute expresion to force C calling convention + #endif +#endif + +#define BOOTSERVICE +#define RUNTIMESERVICE +#define RUNTIMEFUNCTION + +#define RUNTIME_CODE(a) alloc_text("rtcode", a) +#define BEGIN_RUNTIME_DATA() data_seg("rtdata") +#define END_RUNTIME_DATA() data_seg() + +#define VOLATILE volatile + +// +// BugBug: Need to find out if this is portable accross compliers. +// +void __mfa (void); +#pragma intrinsic (__mfa) +#define MEMORY_FENCE() __mfa() + +#ifdef EFI_NO_INTERFACE_DECL + #define EFI_FORWARD_DECLARATION(x) + #define EFI_INTERFACE_DECL(x) +#else + #define EFI_FORWARD_DECLARATION(x) typedef struct _##x x + #define EFI_INTERFACE_DECL(x) typedef struct x +#endif + +// +// When build similiar to FW, then link everything together as +// one big module. +// + +#define EFI_DRIVER_ENTRY_POINT(InitFunction) + +#define LOAD_INTERNAL_DRIVER(_if, type, name, entry) \ + (_if)->LoadInternal(type, name, entry) +// entry(NULL, ST) + +#ifdef __FreeBSD__ +#define INTERFACE_DECL(x) struct x +#else +// +// Some compilers don't support the forward reference construct: +// typedef struct XXXXX +// +// The following macro provide a workaround for such cases. +// +#ifdef NO_INTERFACE_DECL +#define INTERFACE_DECL(x) +#else +#define INTERFACE_DECL(x) typedef struct x +#endif +#endif Property changes on: user/ngie/more-tests/sys/boot/efi/include/arm64/efibind.h ___________________________________________________________________ Added: svn:eol-style ## -0,0 +1 ## +native \ No newline at end of property Added: svn:keywords ## -0,0 +1 ## +FreeBSD=%H \ No newline at end of property Added: svn:mime-type ## -0,0 +1 ## +text/plain \ No newline at end of property Index: user/ngie/more-tests/sys/boot/efi/libefi/Makefile =================================================================== --- user/ngie/more-tests/sys/boot/efi/libefi/Makefile (revision 281584) +++ user/ngie/more-tests/sys/boot/efi/libefi/Makefile (revision 281585) @@ -1,22 +1,22 @@ # $FreeBSD$ LIB= efi INTERNALLIB= SRCS= delay.c efi_console.c efinet.c efipart.c errno.c handles.c \ libefi.c time.c .if ${MACHINE_ARCH} == "amd64" CFLAGS+= -fPIC -mno-red-zone .endif CFLAGS+= -I${.CURDIR}/../include -CFLAGS+= -I${.CURDIR}/../include/${MACHINE_CPUARCH} +CFLAGS+= -I${.CURDIR}/../include/${MACHINE} CFLAGS+= -I${.CURDIR}/../../../../lib/libstand # Pick up the bootstrap header for some interface items CFLAGS+= -I${.CURDIR}/../../common # Handle FreeBSD specific %b and %D printf format specifiers CFLAGS+= ${FORMAT_EXTENSIONS} .include Index: user/ngie/more-tests/sys/boot/efi/loader/Makefile =================================================================== --- user/ngie/more-tests/sys/boot/efi/loader/Makefile (revision 281584) +++ user/ngie/more-tests/sys/boot/efi/loader/Makefile (revision 281585) @@ -1,127 +1,127 @@ # $FreeBSD$ MAN= .include # In-tree GCC does not support __attribute__((ms_abi)). .if ${COMPILER_TYPE} != "gcc" MK_SSP= no PROG= loader.sym INTERNALPROG= .PATH: ${.CURDIR}/../../efi/loader # architecture-specific loader code SRCS= autoload.c \ bootinfo.c \ conf.c \ copy.c \ devicename.c \ main.c \ reloc.c \ smbios.c \ vers.c -.PATH: ${.CURDIR}/arch/${MACHINE_CPUARCH} +.PATH: ${.CURDIR}/arch/${MACHINE} # For smbios.c .PATH: ${.CURDIR}/../../i386/libi386 -.include "${.CURDIR}/arch/${MACHINE_CPUARCH}/Makefile.inc" +.include "${.CURDIR}/arch/${MACHINE}/Makefile.inc" CFLAGS+= -I${.CURDIR} -CFLAGS+= -I${.CURDIR}/arch/${MACHINE_CPUARCH} +CFLAGS+= -I${.CURDIR}/arch/${MACHINE} CFLAGS+= -I${.CURDIR}/../include -CFLAGS+= -I${.CURDIR}/../include/${MACHINE_CPUARCH} +CFLAGS+= -I${.CURDIR}/../include/${MACHINE} CFLAGS+= -I${.CURDIR}/../../../contrib/dev/acpica/include CFLAGS+= -I${.CURDIR}/../../.. CFLAGS+= -I${.CURDIR}/../../i386/libi386 CFLAGS+= -DNO_PCI -DEFI .if ${MK_FORTH} != "no" BOOT_FORTH= yes CFLAGS+= -DBOOT_FORTH CFLAGS+= -I${.CURDIR}/../../ficl CFLAGS+= -I${.CURDIR}/../../ficl/${MACHINE_CPUARCH} LIBFICL= ${.OBJDIR}/../../ficl/libficl.a .endif LOADER_FDT_SUPPORT?= no .if ${MK_FDT} != "no" && ${LOADER_FDT_SUPPORT} != "no" CFLAGS+= -I${.CURDIR}/../../fdt CFLAGS+= -I${.OBJDIR}/../../fdt CFLAGS+= -DLOADER_FDT_SUPPORT LIBEFI_FDT= ${.OBJDIR}/../../efi/fdt/libefi_fdt.a LIBFDT= ${.OBJDIR}/../../fdt/libfdt.a .endif # Include bcache code. HAVE_BCACHE= yes .if defined(EFI_STAGING_SIZE) CFLAGS+= -DEFI_STAGING_SIZE=${EFI_STAGING_SIZE} .endif # Always add MI sources .PATH: ${.CURDIR}/../../common .include "${.CURDIR}/../../common/Makefile.inc" CFLAGS+= -I${.CURDIR}/../../common FILES= loader.efi FILESMODE_loader.efi= ${BINMODE} -LDSCRIPT= ${.CURDIR}/arch/${MACHINE_CPUARCH}/ldscript.${MACHINE_CPUARCH} +LDSCRIPT= ${.CURDIR}/arch/${MACHINE}/ldscript.${MACHINE} LDFLAGS+= -Wl,-T${LDSCRIPT} -Wl,-Bsymbolic -shared CLEANFILES= vers.c loader.efi -NEWVERSWHAT= "EFI loader" ${MACHINE_CPUARCH} +NEWVERSWHAT= "EFI loader" ${MACHINE} vers.c: ${.CURDIR}/../../common/newvers.sh ${.CURDIR}/../../efi/loader/version sh ${.CURDIR}/../../common/newvers.sh ${.CURDIR}/version ${NEWVERSWHAT} OBJCOPY?= objcopy OBJDUMP?= objdump .if ${MACHINE_CPUARCH} == "amd64" EFI_TARGET= efi-app-x86_64 .elif ${MACHINE_CPUARCH} == "i386" EFI_TARGET= efi-app-ia32 .else EFI_TARGET= binary .endif loader.efi: loader.sym if [ `${OBJDUMP} -t ${.ALLSRC} | fgrep '*UND*' | wc -l` != 0 ]; then \ ${OBJDUMP} -t ${.ALLSRC} | fgrep '*UND*'; \ exit 1; \ fi ${OBJCOPY} -j .peheader -j .text -j .sdata -j .data \ -j .dynamic -j .dynsym -j .rel.dyn \ -j .rela.dyn -j .reloc -j .eh_frame -j set_Xcommand_set \ --output-target=${EFI_TARGET} ${.ALLSRC} ${.TARGET} LIBEFI= ${.OBJDIR}/../libefi/libefi.a DPADD= ${LIBFICL} ${LIBEFI} ${LIBFDT} ${LIBEFI_FDT} ${LIBSTAND} \ ${LDSCRIPT} LDADD= ${LIBFICL} ${LIBEFI} ${LIBFDT} ${LIBEFI_FDT} ${LIBSTAND} .endif # ${COMPILER_TYPE} != "gcc" .include beforedepend ${OBJS}: machine CLEANFILES+= machine machine: ln -sf ${.CURDIR}/../../../${MACHINE}/include machine .if ${MACHINE_CPUARCH} == "amd64" || ${MACHINE_CPUARCH} == "i386" beforedepend ${OBJS}: x86 CLEANFILES+= x86 x86: ln -sf ${.CURDIR}/../../../x86/include x86 .endif Index: user/ngie/more-tests/sys/boot/efi/loader/arch/arm/start.S =================================================================== --- user/ngie/more-tests/sys/boot/efi/loader/arch/arm/start.S (revision 281584) +++ user/ngie/more-tests/sys/boot/efi/loader/arch/arm/start.S (revision 281585) @@ -1,190 +1,189 @@ /*- * Copyright (c) 2014, 2015 Andrew Turner * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * $FreeBSD$ */ #include /* * We need to be a PE32 file for EFI. On some architectures we can use * objcopy to create the correct file, however on arm we need to do * it ourselves. */ #define IMAGE_FILE_MACHINE_ARM 0x01c2 #define IMAGE_SCN_CNT_CODE 0x00000020 #define IMAGE_SCN_CNT_INITIALIZED_DATA 0x00000040 #define IMAGE_SCN_MEM_DISCARDABLE 0x02000000 #define IMAGE_SCN_MEM_EXECUTE 0x20000000 #define IMAGE_SCN_MEM_READ 0x40000000 .section .peheader efi_start: /* The MS-DOS Stub, only used to get the offset of the COFF header */ .ascii "MZ" .short 0 .space 0x38 .long pe_sig - efi_start /* The PE32 Signature. Needs to be 8-byte aligned */ .align 3 pe_sig: .ascii "PE" .short 0 coff_head: .short IMAGE_FILE_MACHINE_ARM /* ARM file */ .short 2 /* 2 Sections */ .long 0 /* Timestamp */ .long 0 /* No symbol table */ .long 0 /* No symbols */ .short section_table - optional_header /* Optional header size */ .short 0 /* Characteristics TODO: Fill in */ optional_header: .short 0x010b /* PE32 (32-bit addressing) */ .byte 0 /* Major linker version */ .byte 0 /* Minor linker version */ .long _edata - _end_header /* Code size */ .long 0 /* No initialized data */ .long 0 /* No uninitialized data */ .long _start - efi_start /* Entry point */ .long _end_header - efi_start /* Start of code */ .long 0 /* Start of data */ optional_windows_header: .long 0 /* Image base */ .long 32 /* Section Alignment */ .long 8 /* File alignment */ .short 0 /* Major OS version */ .short 0 /* Minor OS version */ .short 0 /* Major image version */ .short 0 /* Minor image version */ .short 0 /* Major subsystem version */ .short 0 /* Minor subsystem version */ .long 0 /* Win32 version */ .long _edata - efi_start /* Image size */ .long _end_header - efi_start /* Header size */ .long 0 /* Checksum */ .short 0xa /* Subsystem (EFI app) */ .short 0 /* DLL Characteristics */ .long 0 /* Stack reserve */ .long 0 /* Stack commit */ .long 0 /* Heap reserve */ .long 0 /* Heap commit */ .long 0 /* Loader flags */ .long 6 /* Number of RVAs */ /* RVAs: */ .quad 0 .quad 0 .quad 0 .quad 0 .quad 0 .quad 0 section_table: /* We need a .reloc section for EFI */ .ascii ".reloc" .byte 0 .byte 0 /* Pad to 8 bytes */ .long 0 /* Virtual size */ .long 0 /* Virtual address */ .long 0 /* Size of raw data */ .long 0 /* Pointer to raw data */ .long 0 /* Pointer to relocations */ .long 0 /* Pointer to line numbers */ .short 0 /* Number of relocations */ .short 0 /* Number of line numbers */ .long (IMAGE_SCN_CNT_INITIALIZED_DATA | IMAGE_SCN_MEM_READ | \ IMAGE_SCN_MEM_DISCARDABLE) /* Characteristics */ /* The contents of the loader */ .ascii ".text" .byte 0 .byte 0 .byte 0 /* Pad to 8 bytes */ .long _edata - _end_header /* Virtual size */ .long _end_header - efi_start /* Virtual address */ .long _edata - _end_header /* Size of raw data */ .long _end_header - efi_start /* Pointer to raw data */ .long 0 /* Pointer to relocations */ .long 0 /* Pointer to line numbers */ .short 0 /* Number of relocations */ .short 0 /* Number of line numbers */ .long (IMAGE_SCN_CNT_CODE | IMAGE_SCN_MEM_EXECUTE | \ IMAGE_SCN_MEM_READ) /* Characteristics */ _end_header: .text _start: /* Save the boot params to the stack */ push {r0, r1} adr r0, .Lbase ldr r1, [r0] sub r5, r0, r1 ldr r0, .Limagebase add r0, r0, r5 ldr r1, .Ldynamic add r1, r1, r5 bl _C_LABEL(_reloc) /* Zero the BSS, _reloc fixed the values for us */ ldr r0, .Lbss ldr r1, .Lbssend mov r2, #0 1: cmp r0, r1 bgt 2f str r2, [r0], #4 b 1b 2: pop {r0, r1} bl _C_LABEL(efi_main) -1: WFI - b 1b +1: b 1b .Lbase: .word . .Limagebase: .word ImageBase .Ldynamic: .word _DYNAMIC .Lbss: .word __bss_start .Lbssend: .word __bss_end .align 3 stack: .space 512 stack_end: Index: user/ngie/more-tests/sys/boot/efi/loader/arch/arm64/Makefile.inc =================================================================== --- user/ngie/more-tests/sys/boot/efi/loader/arch/arm64/Makefile.inc (nonexistent) +++ user/ngie/more-tests/sys/boot/efi/loader/arch/arm64/Makefile.inc (revision 281585) @@ -0,0 +1,9 @@ +# $FreeBSD$ + +LOADER_FDT_SUPPORT=yes +SRCS+= exec.c \ + start.S + +.PATH: ${.CURDIR}/../../arm64/libarm64 +CFLAGS+=-I${.CURDIR}/../../arm64/libarm64 +SRCS+= cache.c Property changes on: user/ngie/more-tests/sys/boot/efi/loader/arch/arm64/Makefile.inc ___________________________________________________________________ Added: svn:eol-style ## -0,0 +1 ## +native \ No newline at end of property Added: svn:keywords ## -0,0 +1 ## +FreeBSD=%H \ No newline at end of property Added: svn:mime-type ## -0,0 +1 ## +text/plain \ No newline at end of property Index: user/ngie/more-tests/sys/boot/efi/loader/arch/arm64/exec.c =================================================================== --- user/ngie/more-tests/sys/boot/efi/loader/arch/arm64/exec.c (nonexistent) +++ user/ngie/more-tests/sys/boot/efi/loader/arch/arm64/exec.c (revision 281585) @@ -0,0 +1,109 @@ +/*- + * Copyright (c) 2006 Marcel Moolenaar + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. + * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +#include +__FBSDID("$FreeBSD$"); + +#include +#include + +#include +#include +#include + +#include + +#include +#include + +#include "loader_efi.h" +#include "cache.h" + +static int elf64_exec(struct preloaded_file *amp); +static int elf64_obj_exec(struct preloaded_file *amp); + +int bi_load(char *args, vm_offset_t *modulep, vm_offset_t *kernendp); + +static struct file_format arm64_elf = { + elf64_loadfile, + elf64_exec +}; + +struct file_format *file_formats[] = { + &arm64_elf, + NULL +}; + +static int +elf64_exec(struct preloaded_file *fp) +{ + vm_offset_t modulep, kernendp; + vm_offset_t clean_addr; + size_t clean_size; + struct file_metadata *md; + EFI_STATUS status; + EFI_PHYSICAL_ADDRESS addr; + Elf_Ehdr *ehdr; + int err; + void (*entry)(vm_offset_t); + + if ((md = file_findmetadata(fp, MODINFOMD_ELFHDR)) == NULL) + return(EFTYPE); + + ehdr = (Elf_Ehdr *)&(md->md_data); + entry = efi_translate(ehdr->e_entry); + + err = bi_load(fp->f_args, &modulep, &kernendp); + if (err != 0) + return (err); + + status = BS->ExitBootServices(IH, efi_mapkey); + if (EFI_ERROR(status)) { + printf("%s: ExitBootServices() returned 0x%lx\n", __func__, + (long)status); + return (EINVAL); + } + + /* Clean D-cache under kernel area and invalidate whole I-cache */ + clean_addr = efi_translate(fp->f_addr); + clean_size = efi_translate(kernendp) - clean_addr; + + cpu_flush_dcache((void *)clean_addr, clean_size); + cpu_inval_icache(NULL, 0); + + (*entry)(modulep); + panic("exec returned"); +} + +static int +elf64_obj_exec(struct preloaded_file *fp) +{ + + printf("%s called for preloaded file %p (=%s):\n", __func__, fp, + fp->f_name); + return (ENOSYS); +} + Property changes on: user/ngie/more-tests/sys/boot/efi/loader/arch/arm64/exec.c ___________________________________________________________________ Added: svn:eol-style ## -0,0 +1 ## +native \ No newline at end of property Added: svn:keywords ## -0,0 +1 ## +FreeBSD=%H \ No newline at end of property Added: svn:mime-type ## -0,0 +1 ## +text/plain \ No newline at end of property Index: user/ngie/more-tests/sys/boot/efi/loader/arch/arm64/ldscript.arm64 =================================================================== --- user/ngie/more-tests/sys/boot/efi/loader/arch/arm64/ldscript.arm64 (nonexistent) +++ user/ngie/more-tests/sys/boot/efi/loader/arch/arm64/ldscript.arm64 (revision 281585) @@ -0,0 +1,80 @@ +/* $FreeBSD$ */ +/* +OUTPUT_FORMAT("elf64-aarch64-freebsd", "elf64-aarch64-freebsd", "elf64-aarch64-freebsd") +*/ +OUTPUT_ARCH(aarch64) +ENTRY(_start) +SECTIONS +{ + /* Read-only sections, merged into text segment: */ + . = 0; + ImageBase = .; + .text : { + *(.peheader) + *(.text .stub .text.* .gnu.linkonce.t.*) + /* .gnu.warning sections are handled specially by elf32.em. */ + *(.gnu.warning) + *(.plt) + } =0x00300000010070000002000001000400 + . = ALIGN(16); + .data : { + *(.rodata .rodata.* .gnu.linkonce.r.*) + *(.rodata1) + *(.sdata2 .sdata2.* .gnu.linkonce.s2.*) + *(.sbss2 .sbss2.* .gnu.linkonce.sb2.*) + *(.opd) + *(.data .data.* .gnu.linkonce.d.*) + *(.data1) + *(.plabel) + + . = ALIGN(16); + __bss_start = .; + *(.sbss .sbss.* .gnu.linkonce.sb.*) + *(.scommon) + *(.dynbss) + *(.bss *.bss.*) + *(COMMON) + . = ALIGN(16); + __bss_end = .; + } + . = ALIGN(16); + set_Xcommand_set : { + __start_set_Xcommand_set = .; + *(set_Xcommand_set) + __stop_set_Xcommand_set = .; + } + . = ALIGN(16); + __gp = .; + .sdata : { + *(.got.plt .got) + *(.sdata .sdata.* .gnu.linkonce.s.*) + *(dynsbss) + *(.scommon) + } + . = ALIGN(16); + .dynamic : { *(.dynamic) } + . = ALIGN(16); + .rela.dyn : { + *(.rela.text .rela.text.* .rela.gnu.linkonce.t.*) + *(.rela.rodata .rela.rodata.* .rela.gnu.linkonce.r.*) + *(.rela.data .rela.data.* .rela.gnu.linkonce.d.*) + *(.rela.got) + *(.rela.sdata .rela.sdata.* .rela.gnu.linkonce.s.*) + *(.rela.sbss .rela.sbss.* .rela.gnu.linkonce.sb.*) + *(.rela.sdata2 .rela.sdata2.* .rela.gnu.linkonce.s2.*) + *(.rela.sbss2 .rela.sbss2.* .rela.gnu.linkonce.sb2.*) + *(.rela.bss .rela.bss.* .rela.gnu.linkonce.b.*) + *(.rela.plt) + *(.relset_*) + *(.rela.dyn .rela.dyn.*) + } + . = ALIGN(16); + .reloc : { *(.reloc) } + . = ALIGN(16); + .dynsym : { *(.dynsym) } + _edata = .; + + /* Unused sections */ + .dynstr : { *(.dynstr) } + .hash : { *(.hash) } +} Property changes on: user/ngie/more-tests/sys/boot/efi/loader/arch/arm64/ldscript.arm64 ___________________________________________________________________ Added: svn:keywords ## -0,0 +1 ## +FreeBSD=%H \ No newline at end of property Index: user/ngie/more-tests/sys/boot/efi/loader/arch/arm64/start.S =================================================================== --- user/ngie/more-tests/sys/boot/efi/loader/arch/arm64/start.S (nonexistent) +++ user/ngie/more-tests/sys/boot/efi/loader/arch/arm64/start.S (revision 281585) @@ -0,0 +1,165 @@ +/*- + * Copyright (c) 2014 Andrew Turner + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + * + * $FreeBSD$ + */ + +/* + * We need to be a PE32+ file for EFI. On some architectures we can use + * objcopy to create the correct file, however on arm64 we need to do + * it ourselves. + */ + +#define IMAGE_FILE_MACHINE_ARM64 0xaa64 + +#define IMAGE_SCN_CNT_CODE 0x00000020 +#define IMAGE_SCN_CNT_INITIALIZED_DATA 0x00000040 +#define IMAGE_SCN_MEM_DISCARDABLE 0x02000000 +#define IMAGE_SCN_MEM_EXECUTE 0x20000000 +#define IMAGE_SCN_MEM_READ 0x40000000 + + .section .peheader +efi_start: + /* The MS-DOS Stub, only used to get the offset of the COFF header */ + .ascii "MZ" + .short 0 + .space 0x38 + .long pe_sig - efi_start + + /* The PE32 Signature. Needs to be 8-byte aligned */ + .align 3 +pe_sig: + .ascii "PE" + .short 0 +coff_head: + .short IMAGE_FILE_MACHINE_ARM64 /* AArch64 file */ + .short 2 /* 2 Sections */ + .long 0 /* Timestamp */ + .long 0 /* No symbol table */ + .long 0 /* No symbols */ + .short section_table - optional_header /* Optional header size */ + .short 0 /* Characteristics TODO: Fill in */ + +optional_header: + .short 0x020b /* PE32+ (64-bit addressing) */ + .byte 0 /* Major linker version */ + .byte 0 /* Minor linker version */ + .long _edata - _end_header /* Code size */ + .long 0 /* No initialized data */ + .long 0 /* No uninitialized data */ + .long _start - efi_start /* Entry point */ + .long _end_header - efi_start /* Start of code */ + +optional_windows_header: + .quad 0 /* Image base */ + .long 32 /* Section Alignment */ + .long 8 /* File alignment */ + .short 0 /* Major OS version */ + .short 0 /* Minor OS version */ + .short 0 /* Major image version */ + .short 0 /* Minor image version */ + .short 0 /* Major subsystem version */ + .short 0 /* Minor subsystem version */ + .long 0 /* Win32 version */ + .long _edata - efi_start /* Image size */ + .long _end_header - efi_start /* Header size */ + .long 0 /* Checksum */ + .short 0xa /* Subsystem (EFI app) */ + .short 0 /* DLL Characteristics */ + .quad 0 /* Stack reserve */ + .quad 0 /* Stack commit */ + .quad 0 /* Heap reserve */ + .quad 0 /* Heap commit */ + .long 0 /* Loader flags */ + .long 6 /* Number of RVAs */ + + /* RVAs: */ + .quad 0 + .quad 0 + .quad 0 + .quad 0 + .quad 0 + .quad 0 + +section_table: + /* We need a .reloc section for EFI */ + .ascii ".reloc" + .byte 0 + .byte 0 /* Pad to 8 bytes */ + .long 0 /* Virtual size */ + .long 0 /* Virtual address */ + .long 0 /* Size of raw data */ + .long 0 /* Pointer to raw data */ + .long 0 /* Pointer to relocations */ + .long 0 /* Pointer to line numbers */ + .short 0 /* Number of relocations */ + .short 0 /* Number of line numbers */ + .long (IMAGE_SCN_CNT_INITIALIZED_DATA | IMAGE_SCN_MEM_READ | \ + IMAGE_SCN_MEM_DISCARDABLE) /* Characteristics */ + + /* The contents of the loader */ + .ascii ".text" + .byte 0 + .byte 0 + .byte 0 /* Pad to 8 bytes */ + .long _edata - _end_header /* Virtual size */ + .long _end_header - efi_start /* Virtual address */ + .long _edata - _end_header /* Size of raw data */ + .long _end_header - efi_start /* Pointer to raw data */ + .long 0 /* Pointer to relocations */ + .long 0 /* Pointer to line numbers */ + .short 0 /* Number of relocations */ + .short 0 /* Number of line numbers */ + .long (IMAGE_SCN_CNT_CODE | IMAGE_SCN_MEM_EXECUTE | \ + IMAGE_SCN_MEM_READ) /* Characteristics */ +_end_header: + + .text + .globl _start +_start: + /* Save the boot params to the stack */ + stp x0, x1, [sp, #-16]! + + adr x0, __bss_start + adr x1, __bss_end + + b 2f + +1: + stp xzr, xzr, [x0], #16 +2: + cmp x0, x1 + b.lo 1b + + adr x0, ImageBase + adr x1, _DYNAMIC + + bl _reloc + + ldp x0, x1, [sp], #16 + + bl efi_main + +1: b 1b Property changes on: user/ngie/more-tests/sys/boot/efi/loader/arch/arm64/start.S ___________________________________________________________________ Added: svn:eol-style ## -0,0 +1 ## +native \ No newline at end of property Added: svn:keywords ## -0,0 +1 ## +FreeBSD=%H \ No newline at end of property Added: svn:mime-type ## -0,0 +1 ## +text/plain \ No newline at end of property Index: user/ngie/more-tests/sys/boot/efi/loader/copy.c =================================================================== --- user/ngie/more-tests/sys/boot/efi/loader/copy.c (revision 281584) +++ user/ngie/more-tests/sys/boot/efi/loader/copy.c (revision 281585) @@ -1,133 +1,139 @@ /*- * Copyright (c) 2013 The FreeBSD Foundation * All rights reserved. * * This software was developed by Benno Rice under sponsorship from * the FreeBSD Foundation. * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ #include __FBSDID("$FreeBSD$"); #include #include #include #include #include #ifndef EFI_STAGING_SIZE #define EFI_STAGING_SIZE 32 #endif #define STAGE_PAGES ((EFI_STAGING_SIZE) * 1024 * 1024 / 4096) EFI_PHYSICAL_ADDRESS staging, staging_end; int stage_offset_set = 0; ssize_t stage_offset; int efi_copy_init(void) { EFI_STATUS status; status = BS->AllocatePages(AllocateAnyPages, EfiLoaderData, STAGE_PAGES, &staging); if (EFI_ERROR(status)) { printf("failed to allocate staging area: %lu\n", (unsigned long)(status & EFI_ERROR_MASK)); return (status); } staging_end = staging + STAGE_PAGES * 4096; -#ifdef __arm__ - /* Round the kernel load address to a 2MiB value */ +#if defined(__aarch64__) || defined(__arm__) + /* + * Round the kernel load address to a 2MiB value. This is needed + * because the kernel builds a page table based on where it has + * been loaded in physical address space. As the kernel will use + * either a 1MiB or 2MiB page for this we need to make sure it + * is correctly aligned for both cases. + */ staging = roundup2(staging, 2 * 1024 * 1024); #endif return (0); } void * efi_translate(vm_offset_t ptr) { return ((void *)(ptr + stage_offset)); } ssize_t efi_copyin(const void *src, vm_offset_t dest, const size_t len) { if (!stage_offset_set) { stage_offset = (vm_offset_t)staging - dest; stage_offset_set = 1; } /* XXX: Callers do not check for failure. */ if (dest + stage_offset + len > staging_end) { errno = ENOMEM; return (-1); } bcopy(src, (void *)(dest + stage_offset), len); return (len); } ssize_t efi_copyout(const vm_offset_t src, void *dest, const size_t len) { /* XXX: Callers do not check for failure. */ if (src + stage_offset + len > staging_end) { errno = ENOMEM; return (-1); } bcopy((void *)(src + stage_offset), dest, len); return (len); } ssize_t efi_readin(const int fd, vm_offset_t dest, const size_t len) { if (dest + stage_offset + len > staging_end) { errno = ENOMEM; return (-1); } return (read(fd, (void *)(dest + stage_offset), len)); } void efi_copy_finish(void) { uint64_t *src, *dst, *last; src = (uint64_t *)staging; dst = (uint64_t *)(staging - stage_offset); last = (uint64_t *)(staging + STAGE_PAGES * EFI_PAGE_SIZE); while (src < last) *dst++ = *src++; } Index: user/ngie/more-tests/sys/boot =================================================================== --- user/ngie/more-tests/sys/boot (revision 281584) +++ user/ngie/more-tests/sys/boot (revision 281585) Property changes on: user/ngie/more-tests/sys/boot ___________________________________________________________________ Modified: svn:mergeinfo ## -0,0 +0,1 ## Merged /head/sys/boot:r281504-281584 Index: user/ngie/more-tests/sys/cam/cam_xpt.c =================================================================== --- user/ngie/more-tests/sys/cam/cam_xpt.c (revision 281584) +++ user/ngie/more-tests/sys/cam/cam_xpt.c (revision 281585) @@ -1,5316 +1,5318 @@ /*- * Implementation of the Common Access Method Transport (XPT) layer. * * Copyright (c) 1997, 1998, 1999 Justin T. Gibbs. * Copyright (c) 1997, 1998, 1999 Kenneth D. Merry. * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions, and the following disclaimer, * without modification, immediately at the beginning of the file. * 2. The name of the author may not be used to endorse or promote products * derived from this software without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE FOR * ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ #include __FBSDID("$FreeBSD$"); #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include /* geometry translation */ #include /* for xpt_print below */ #include "opt_cam.h" /* * This is the maximum number of high powered commands (e.g. start unit) * that can be outstanding at a particular time. */ #ifndef CAM_MAX_HIGHPOWER #define CAM_MAX_HIGHPOWER 4 #endif /* Datastructures internal to the xpt layer */ MALLOC_DEFINE(M_CAMXPT, "CAM XPT", "CAM XPT buffers"); MALLOC_DEFINE(M_CAMDEV, "CAM DEV", "CAM devices"); MALLOC_DEFINE(M_CAMCCB, "CAM CCB", "CAM CCBs"); MALLOC_DEFINE(M_CAMPATH, "CAM path", "CAM paths"); /* Object for defering XPT actions to a taskqueue */ struct xpt_task { struct task task; void *data1; uintptr_t data2; }; struct xpt_softc { uint32_t xpt_generation; /* number of high powered commands that can go through right now */ struct mtx xpt_highpower_lock; STAILQ_HEAD(highpowerlist, cam_ed) highpowerq; int num_highpower; /* queue for handling async rescan requests. */ TAILQ_HEAD(, ccb_hdr) ccb_scanq; int buses_to_config; int buses_config_done; /* Registered busses */ TAILQ_HEAD(,cam_eb) xpt_busses; u_int bus_generation; struct intr_config_hook *xpt_config_hook; int boot_delay; struct callout boot_callout; struct mtx xpt_topo_lock; struct mtx xpt_lock; struct taskqueue *xpt_taskq; }; typedef enum { DM_RET_COPY = 0x01, DM_RET_FLAG_MASK = 0x0f, DM_RET_NONE = 0x00, DM_RET_STOP = 0x10, DM_RET_DESCEND = 0x20, DM_RET_ERROR = 0x30, DM_RET_ACTION_MASK = 0xf0 } dev_match_ret; typedef enum { XPT_DEPTH_BUS, XPT_DEPTH_TARGET, XPT_DEPTH_DEVICE, XPT_DEPTH_PERIPH } xpt_traverse_depth; struct xpt_traverse_config { xpt_traverse_depth depth; void *tr_func; void *tr_arg; }; typedef int xpt_busfunc_t (struct cam_eb *bus, void *arg); typedef int xpt_targetfunc_t (struct cam_et *target, void *arg); typedef int xpt_devicefunc_t (struct cam_ed *device, void *arg); typedef int xpt_periphfunc_t (struct cam_periph *periph, void *arg); typedef int xpt_pdrvfunc_t (struct periph_driver **pdrv, void *arg); /* Transport layer configuration information */ static struct xpt_softc xsoftc; MTX_SYSINIT(xpt_topo_init, &xsoftc.xpt_topo_lock, "XPT topology lock", MTX_DEF); SYSCTL_INT(_kern_cam, OID_AUTO, boot_delay, CTLFLAG_RDTUN, &xsoftc.boot_delay, 0, "Bus registration wait time"); SYSCTL_UINT(_kern_cam, OID_AUTO, xpt_generation, CTLFLAG_RD, &xsoftc.xpt_generation, 0, "CAM peripheral generation count"); struct cam_doneq { struct mtx_padalign cam_doneq_mtx; STAILQ_HEAD(, ccb_hdr) cam_doneq; int cam_doneq_sleep; }; static struct cam_doneq cam_doneqs[MAXCPU]; static int cam_num_doneqs; static struct proc *cam_proc; SYSCTL_INT(_kern_cam, OID_AUTO, num_doneqs, CTLFLAG_RDTUN, &cam_num_doneqs, 0, "Number of completion queues/threads"); struct cam_periph *xpt_periph; static periph_init_t xpt_periph_init; static struct periph_driver xpt_driver = { xpt_periph_init, "xpt", TAILQ_HEAD_INITIALIZER(xpt_driver.units), /* generation */ 0, CAM_PERIPH_DRV_EARLY }; PERIPHDRIVER_DECLARE(xpt, xpt_driver); static d_open_t xptopen; static d_close_t xptclose; static d_ioctl_t xptioctl; static d_ioctl_t xptdoioctl; static struct cdevsw xpt_cdevsw = { .d_version = D_VERSION, .d_flags = 0, .d_open = xptopen, .d_close = xptclose, .d_ioctl = xptioctl, .d_name = "xpt", }; /* Storage for debugging datastructures */ struct cam_path *cam_dpath; u_int32_t cam_dflags = CAM_DEBUG_FLAGS; SYSCTL_UINT(_kern_cam, OID_AUTO, dflags, CTLFLAG_RWTUN, &cam_dflags, 0, "Enabled debug flags"); u_int32_t cam_debug_delay = CAM_DEBUG_DELAY; SYSCTL_UINT(_kern_cam, OID_AUTO, debug_delay, CTLFLAG_RWTUN, &cam_debug_delay, 0, "Delay in us after each debug message"); /* Our boot-time initialization hook */ static int cam_module_event_handler(module_t, int /*modeventtype_t*/, void *); static moduledata_t cam_moduledata = { "cam", cam_module_event_handler, NULL }; static int xpt_init(void *); DECLARE_MODULE(cam, cam_moduledata, SI_SUB_CONFIGURE, SI_ORDER_SECOND); MODULE_VERSION(cam, 1); static void xpt_async_bcast(struct async_list *async_head, u_int32_t async_code, struct cam_path *path, void *async_arg); static path_id_t xptnextfreepathid(void); static path_id_t xptpathid(const char *sim_name, int sim_unit, int sim_bus); static union ccb *xpt_get_ccb(struct cam_periph *periph); static union ccb *xpt_get_ccb_nowait(struct cam_periph *periph); static void xpt_run_allocq(struct cam_periph *periph, int sleep); static void xpt_run_allocq_task(void *context, int pending); static void xpt_run_devq(struct cam_devq *devq); static timeout_t xpt_release_devq_timeout; static void xpt_release_simq_timeout(void *arg) __unused; static void xpt_acquire_bus(struct cam_eb *bus); static void xpt_release_bus(struct cam_eb *bus); static uint32_t xpt_freeze_devq_device(struct cam_ed *dev, u_int count); static int xpt_release_devq_device(struct cam_ed *dev, u_int count, int run_queue); static struct cam_et* xpt_alloc_target(struct cam_eb *bus, target_id_t target_id); static void xpt_acquire_target(struct cam_et *target); static void xpt_release_target(struct cam_et *target); static struct cam_eb* xpt_find_bus(path_id_t path_id); static struct cam_et* xpt_find_target(struct cam_eb *bus, target_id_t target_id); static struct cam_ed* xpt_find_device(struct cam_et *target, lun_id_t lun_id); static void xpt_config(void *arg); static int xpt_schedule_dev(struct camq *queue, cam_pinfo *dev_pinfo, u_int32_t new_priority); static xpt_devicefunc_t xptpassannouncefunc; static void xptaction(struct cam_sim *sim, union ccb *work_ccb); static void xptpoll(struct cam_sim *sim); static void camisr_runqueue(void); static void xpt_done_process(struct ccb_hdr *ccb_h); static void xpt_done_td(void *); static dev_match_ret xptbusmatch(struct dev_match_pattern *patterns, u_int num_patterns, struct cam_eb *bus); static dev_match_ret xptdevicematch(struct dev_match_pattern *patterns, u_int num_patterns, struct cam_ed *device); static dev_match_ret xptperiphmatch(struct dev_match_pattern *patterns, u_int num_patterns, struct cam_periph *periph); static xpt_busfunc_t xptedtbusfunc; static xpt_targetfunc_t xptedttargetfunc; static xpt_devicefunc_t xptedtdevicefunc; static xpt_periphfunc_t xptedtperiphfunc; static xpt_pdrvfunc_t xptplistpdrvfunc; static xpt_periphfunc_t xptplistperiphfunc; static int xptedtmatch(struct ccb_dev_match *cdm); static int xptperiphlistmatch(struct ccb_dev_match *cdm); static int xptbustraverse(struct cam_eb *start_bus, xpt_busfunc_t *tr_func, void *arg); static int xpttargettraverse(struct cam_eb *bus, struct cam_et *start_target, xpt_targetfunc_t *tr_func, void *arg); static int xptdevicetraverse(struct cam_et *target, struct cam_ed *start_device, xpt_devicefunc_t *tr_func, void *arg); static int xptperiphtraverse(struct cam_ed *device, struct cam_periph *start_periph, xpt_periphfunc_t *tr_func, void *arg); static int xptpdrvtraverse(struct periph_driver **start_pdrv, xpt_pdrvfunc_t *tr_func, void *arg); static int xptpdperiphtraverse(struct periph_driver **pdrv, struct cam_periph *start_periph, xpt_periphfunc_t *tr_func, void *arg); static xpt_busfunc_t xptdefbusfunc; static xpt_targetfunc_t xptdeftargetfunc; static xpt_devicefunc_t xptdefdevicefunc; static xpt_periphfunc_t xptdefperiphfunc; static void xpt_finishconfig_task(void *context, int pending); static void xpt_dev_async_default(u_int32_t async_code, struct cam_eb *bus, struct cam_et *target, struct cam_ed *device, void *async_arg); static struct cam_ed * xpt_alloc_device_default(struct cam_eb *bus, struct cam_et *target, lun_id_t lun_id); static xpt_devicefunc_t xptsetasyncfunc; static xpt_busfunc_t xptsetasyncbusfunc; static cam_status xptregister(struct cam_periph *periph, void *arg); static __inline int device_is_queued(struct cam_ed *device); static __inline int xpt_schedule_devq(struct cam_devq *devq, struct cam_ed *dev) { int retval; mtx_assert(&devq->send_mtx, MA_OWNED); if ((dev->ccbq.queue.entries > 0) && (dev->ccbq.dev_openings > 0) && (dev->ccbq.queue.qfrozen_cnt == 0)) { /* * The priority of a device waiting for controller * resources is that of the highest priority CCB * enqueued. */ retval = xpt_schedule_dev(&devq->send_queue, &dev->devq_entry, CAMQ_GET_PRIO(&dev->ccbq.queue)); } else { retval = 0; } return (retval); } static __inline int device_is_queued(struct cam_ed *device) { return (device->devq_entry.index != CAM_UNQUEUED_INDEX); } static void xpt_periph_init() { make_dev(&xpt_cdevsw, 0, UID_ROOT, GID_OPERATOR, 0600, "xpt0"); } static int xptopen(struct cdev *dev, int flags, int fmt, struct thread *td) { /* * Only allow read-write access. */ if (((flags & FWRITE) == 0) || ((flags & FREAD) == 0)) return(EPERM); /* * We don't allow nonblocking access. */ if ((flags & O_NONBLOCK) != 0) { printf("%s: can't do nonblocking access\n", devtoname(dev)); return(ENODEV); } return(0); } static int xptclose(struct cdev *dev, int flag, int fmt, struct thread *td) { return(0); } /* * Don't automatically grab the xpt softc lock here even though this is going * through the xpt device. The xpt device is really just a back door for * accessing other devices and SIMs, so the right thing to do is to grab * the appropriate SIM lock once the bus/SIM is located. */ static int xptioctl(struct cdev *dev, u_long cmd, caddr_t addr, int flag, struct thread *td) { int error; if ((error = xptdoioctl(dev, cmd, addr, flag, td)) == ENOTTY) { error = cam_compat_ioctl(dev, cmd, addr, flag, td, xptdoioctl); } return (error); } static int xptdoioctl(struct cdev *dev, u_long cmd, caddr_t addr, int flag, struct thread *td) { int error; error = 0; switch(cmd) { /* * For the transport layer CAMIOCOMMAND ioctl, we really only want * to accept CCB types that don't quite make sense to send through a * passthrough driver. XPT_PATH_INQ is an exception to this, as stated * in the CAM spec. */ case CAMIOCOMMAND: { union ccb *ccb; union ccb *inccb; struct cam_eb *bus; inccb = (union ccb *)addr; bus = xpt_find_bus(inccb->ccb_h.path_id); if (bus == NULL) return (EINVAL); switch (inccb->ccb_h.func_code) { case XPT_SCAN_BUS: case XPT_RESET_BUS: if (inccb->ccb_h.target_id != CAM_TARGET_WILDCARD || inccb->ccb_h.target_lun != CAM_LUN_WILDCARD) { xpt_release_bus(bus); return (EINVAL); } break; case XPT_SCAN_TGT: if (inccb->ccb_h.target_id == CAM_TARGET_WILDCARD || inccb->ccb_h.target_lun != CAM_LUN_WILDCARD) { xpt_release_bus(bus); return (EINVAL); } break; default: break; } switch(inccb->ccb_h.func_code) { case XPT_SCAN_BUS: case XPT_RESET_BUS: case XPT_PATH_INQ: case XPT_ENG_INQ: case XPT_SCAN_LUN: case XPT_SCAN_TGT: ccb = xpt_alloc_ccb(); /* * Create a path using the bus, target, and lun the * user passed in. */ if (xpt_create_path(&ccb->ccb_h.path, NULL, inccb->ccb_h.path_id, inccb->ccb_h.target_id, inccb->ccb_h.target_lun) != CAM_REQ_CMP){ error = EINVAL; xpt_free_ccb(ccb); break; } /* Ensure all of our fields are correct */ xpt_setup_ccb(&ccb->ccb_h, ccb->ccb_h.path, inccb->ccb_h.pinfo.priority); xpt_merge_ccb(ccb, inccb); xpt_path_lock(ccb->ccb_h.path); cam_periph_runccb(ccb, NULL, 0, 0, NULL); xpt_path_unlock(ccb->ccb_h.path); bcopy(ccb, inccb, sizeof(union ccb)); xpt_free_path(ccb->ccb_h.path); xpt_free_ccb(ccb); break; case XPT_DEBUG: { union ccb ccb; /* * This is an immediate CCB, so it's okay to * allocate it on the stack. */ /* * Create a path using the bus, target, and lun the * user passed in. */ if (xpt_create_path(&ccb.ccb_h.path, NULL, inccb->ccb_h.path_id, inccb->ccb_h.target_id, inccb->ccb_h.target_lun) != CAM_REQ_CMP){ error = EINVAL; break; } /* Ensure all of our fields are correct */ xpt_setup_ccb(&ccb.ccb_h, ccb.ccb_h.path, inccb->ccb_h.pinfo.priority); xpt_merge_ccb(&ccb, inccb); xpt_action(&ccb); bcopy(&ccb, inccb, sizeof(union ccb)); xpt_free_path(ccb.ccb_h.path); break; } case XPT_DEV_MATCH: { struct cam_periph_map_info mapinfo; struct cam_path *old_path; /* * We can't deal with physical addresses for this * type of transaction. */ if ((inccb->ccb_h.flags & CAM_DATA_MASK) != CAM_DATA_VADDR) { error = EINVAL; break; } /* * Save this in case the caller had it set to * something in particular. */ old_path = inccb->ccb_h.path; /* * We really don't need a path for the matching * code. The path is needed because of the * debugging statements in xpt_action(). They * assume that the CCB has a valid path. */ inccb->ccb_h.path = xpt_periph->path; bzero(&mapinfo, sizeof(mapinfo)); /* * Map the pattern and match buffers into kernel * virtual address space. */ error = cam_periph_mapmem(inccb, &mapinfo); if (error) { inccb->ccb_h.path = old_path; break; } /* * This is an immediate CCB, we can send it on directly. */ xpt_action(inccb); /* * Map the buffers back into user space. */ cam_periph_unmapmem(inccb, &mapinfo); inccb->ccb_h.path = old_path; error = 0; break; } default: error = ENOTSUP; break; } xpt_release_bus(bus); break; } /* * This is the getpassthru ioctl. It takes a XPT_GDEVLIST ccb as input, * with the periphal driver name and unit name filled in. The other * fields don't really matter as input. The passthrough driver name * ("pass"), and unit number are passed back in the ccb. The current * device generation number, and the index into the device peripheral * driver list, and the status are also passed back. Note that * since we do everything in one pass, unlike the XPT_GDEVLIST ccb, * we never return a status of CAM_GDEVLIST_LIST_CHANGED. It is * (or rather should be) impossible for the device peripheral driver * list to change since we look at the whole thing in one pass, and * we do it with lock protection. * */ case CAMGETPASSTHRU: { union ccb *ccb; struct cam_periph *periph; struct periph_driver **p_drv; char *name; u_int unit; int base_periph_found; ccb = (union ccb *)addr; unit = ccb->cgdl.unit_number; name = ccb->cgdl.periph_name; base_periph_found = 0; /* * Sanity check -- make sure we don't get a null peripheral * driver name. */ if (*ccb->cgdl.periph_name == '\0') { error = EINVAL; break; } /* Keep the list from changing while we traverse it */ xpt_lock_buses(); /* first find our driver in the list of drivers */ for (p_drv = periph_drivers; *p_drv != NULL; p_drv++) if (strcmp((*p_drv)->driver_name, name) == 0) break; if (*p_drv == NULL) { xpt_unlock_buses(); ccb->ccb_h.status = CAM_REQ_CMP_ERR; ccb->cgdl.status = CAM_GDEVLIST_ERROR; *ccb->cgdl.periph_name = '\0'; ccb->cgdl.unit_number = 0; error = ENOENT; break; } /* * Run through every peripheral instance of this driver * and check to see whether it matches the unit passed * in by the user. If it does, get out of the loops and * find the passthrough driver associated with that * peripheral driver. */ for (periph = TAILQ_FIRST(&(*p_drv)->units); periph != NULL; periph = TAILQ_NEXT(periph, unit_links)) { if (periph->unit_number == unit) break; } /* * If we found the peripheral driver that the user passed * in, go through all of the peripheral drivers for that * particular device and look for a passthrough driver. */ if (periph != NULL) { struct cam_ed *device; int i; base_periph_found = 1; device = periph->path->device; for (i = 0, periph = SLIST_FIRST(&device->periphs); periph != NULL; periph = SLIST_NEXT(periph, periph_links), i++) { /* * Check to see whether we have a * passthrough device or not. */ if (strcmp(periph->periph_name, "pass") == 0) { /* * Fill in the getdevlist fields. */ strcpy(ccb->cgdl.periph_name, periph->periph_name); ccb->cgdl.unit_number = periph->unit_number; if (SLIST_NEXT(periph, periph_links)) ccb->cgdl.status = CAM_GDEVLIST_MORE_DEVS; else ccb->cgdl.status = CAM_GDEVLIST_LAST_DEVICE; ccb->cgdl.generation = device->generation; ccb->cgdl.index = i; /* * Fill in some CCB header fields * that the user may want. */ ccb->ccb_h.path_id = periph->path->bus->path_id; ccb->ccb_h.target_id = periph->path->target->target_id; ccb->ccb_h.target_lun = periph->path->device->lun_id; ccb->ccb_h.status = CAM_REQ_CMP; break; } } } /* * If the periph is null here, one of two things has * happened. The first possibility is that we couldn't * find the unit number of the particular peripheral driver * that the user is asking about. e.g. the user asks for * the passthrough driver for "da11". We find the list of * "da" peripherals all right, but there is no unit 11. * The other possibility is that we went through the list * of peripheral drivers attached to the device structure, * but didn't find one with the name "pass". Either way, * we return ENOENT, since we couldn't find something. */ if (periph == NULL) { ccb->ccb_h.status = CAM_REQ_CMP_ERR; ccb->cgdl.status = CAM_GDEVLIST_ERROR; *ccb->cgdl.periph_name = '\0'; ccb->cgdl.unit_number = 0; error = ENOENT; /* * It is unfortunate that this is even necessary, * but there are many, many clueless users out there. * If this is true, the user is looking for the * passthrough driver, but doesn't have one in his * kernel. */ if (base_periph_found == 1) { printf("xptioctl: pass driver is not in the " "kernel\n"); printf("xptioctl: put \"device pass\" in " "your kernel config file\n"); } } xpt_unlock_buses(); break; } default: error = ENOTTY; break; } return(error); } static int cam_module_event_handler(module_t mod, int what, void *arg) { int error; switch (what) { case MOD_LOAD: if ((error = xpt_init(NULL)) != 0) return (error); break; case MOD_UNLOAD: return EBUSY; default: return EOPNOTSUPP; } return 0; } static void xpt_rescan_done(struct cam_periph *periph, union ccb *done_ccb) { if (done_ccb->ccb_h.ppriv_ptr1 == NULL) { xpt_free_path(done_ccb->ccb_h.path); xpt_free_ccb(done_ccb); } else { done_ccb->ccb_h.cbfcnp = done_ccb->ccb_h.ppriv_ptr1; (*done_ccb->ccb_h.cbfcnp)(periph, done_ccb); } xpt_release_boot(); } /* thread to handle bus rescans */ static void xpt_scanner_thread(void *dummy) { union ccb *ccb; struct cam_path path; xpt_lock_buses(); for (;;) { if (TAILQ_EMPTY(&xsoftc.ccb_scanq)) msleep(&xsoftc.ccb_scanq, &xsoftc.xpt_topo_lock, PRIBIO, "-", 0); if ((ccb = (union ccb *)TAILQ_FIRST(&xsoftc.ccb_scanq)) != NULL) { TAILQ_REMOVE(&xsoftc.ccb_scanq, &ccb->ccb_h, sim_links.tqe); xpt_unlock_buses(); /* * Since lock can be dropped inside and path freed * by completion callback even before return here, * take our own path copy for reference. */ xpt_copy_path(&path, ccb->ccb_h.path); xpt_path_lock(&path); xpt_action(ccb); xpt_path_unlock(&path); xpt_release_path(&path); xpt_lock_buses(); } } } void xpt_rescan(union ccb *ccb) { struct ccb_hdr *hdr; /* Prepare request */ if (ccb->ccb_h.path->target->target_id == CAM_TARGET_WILDCARD && ccb->ccb_h.path->device->lun_id == CAM_LUN_WILDCARD) ccb->ccb_h.func_code = XPT_SCAN_BUS; else if (ccb->ccb_h.path->target->target_id != CAM_TARGET_WILDCARD && ccb->ccb_h.path->device->lun_id == CAM_LUN_WILDCARD) ccb->ccb_h.func_code = XPT_SCAN_TGT; else if (ccb->ccb_h.path->target->target_id != CAM_TARGET_WILDCARD && ccb->ccb_h.path->device->lun_id != CAM_LUN_WILDCARD) ccb->ccb_h.func_code = XPT_SCAN_LUN; else { xpt_print(ccb->ccb_h.path, "illegal scan path\n"); xpt_free_path(ccb->ccb_h.path); xpt_free_ccb(ccb); return; } ccb->ccb_h.ppriv_ptr1 = ccb->ccb_h.cbfcnp; ccb->ccb_h.cbfcnp = xpt_rescan_done; xpt_setup_ccb(&ccb->ccb_h, ccb->ccb_h.path, CAM_PRIORITY_XPT); /* Don't make duplicate entries for the same paths. */ xpt_lock_buses(); if (ccb->ccb_h.ppriv_ptr1 == NULL) { TAILQ_FOREACH(hdr, &xsoftc.ccb_scanq, sim_links.tqe) { if (xpt_path_comp(hdr->path, ccb->ccb_h.path) == 0) { wakeup(&xsoftc.ccb_scanq); xpt_unlock_buses(); xpt_print(ccb->ccb_h.path, "rescan already queued\n"); xpt_free_path(ccb->ccb_h.path); xpt_free_ccb(ccb); return; } } } TAILQ_INSERT_TAIL(&xsoftc.ccb_scanq, &ccb->ccb_h, sim_links.tqe); xsoftc.buses_to_config++; wakeup(&xsoftc.ccb_scanq); xpt_unlock_buses(); } /* Functions accessed by the peripheral drivers */ static int xpt_init(void *dummy) { struct cam_sim *xpt_sim; struct cam_path *path; struct cam_devq *devq; cam_status status; int error, i; TAILQ_INIT(&xsoftc.xpt_busses); TAILQ_INIT(&xsoftc.ccb_scanq); STAILQ_INIT(&xsoftc.highpowerq); xsoftc.num_highpower = CAM_MAX_HIGHPOWER; mtx_init(&xsoftc.xpt_lock, "XPT lock", NULL, MTX_DEF); mtx_init(&xsoftc.xpt_highpower_lock, "XPT highpower lock", NULL, MTX_DEF); xsoftc.xpt_taskq = taskqueue_create("CAM XPT task", M_WAITOK, taskqueue_thread_enqueue, /*context*/&xsoftc.xpt_taskq); #ifdef CAM_BOOT_DELAY /* * Override this value at compile time to assist our users * who don't use loader to boot a kernel. */ xsoftc.boot_delay = CAM_BOOT_DELAY; #endif /* * The xpt layer is, itself, the equivelent of a SIM. * Allow 16 ccbs in the ccb pool for it. This should * give decent parallelism when we probe busses and * perform other XPT functions. */ devq = cam_simq_alloc(16); xpt_sim = cam_sim_alloc(xptaction, xptpoll, "xpt", /*softc*/NULL, /*unit*/0, /*mtx*/&xsoftc.xpt_lock, /*max_dev_transactions*/0, /*max_tagged_dev_transactions*/0, devq); if (xpt_sim == NULL) return (ENOMEM); mtx_lock(&xsoftc.xpt_lock); if ((status = xpt_bus_register(xpt_sim, NULL, 0)) != CAM_SUCCESS) { mtx_unlock(&xsoftc.xpt_lock); printf("xpt_init: xpt_bus_register failed with status %#x," " failing attach\n", status); return (EINVAL); } mtx_unlock(&xsoftc.xpt_lock); /* * Looking at the XPT from the SIM layer, the XPT is * the equivelent of a peripheral driver. Allocate * a peripheral driver entry for us. */ if ((status = xpt_create_path(&path, NULL, CAM_XPT_PATH_ID, CAM_TARGET_WILDCARD, CAM_LUN_WILDCARD)) != CAM_REQ_CMP) { printf("xpt_init: xpt_create_path failed with status %#x," " failing attach\n", status); return (EINVAL); } xpt_path_lock(path); cam_periph_alloc(xptregister, NULL, NULL, NULL, "xpt", CAM_PERIPH_BIO, path, NULL, 0, xpt_sim); xpt_path_unlock(path); xpt_free_path(path); if (cam_num_doneqs < 1) cam_num_doneqs = 1 + mp_ncpus / 6; else if (cam_num_doneqs > MAXCPU) cam_num_doneqs = MAXCPU; for (i = 0; i < cam_num_doneqs; i++) { mtx_init(&cam_doneqs[i].cam_doneq_mtx, "CAM doneq", NULL, MTX_DEF); STAILQ_INIT(&cam_doneqs[i].cam_doneq); error = kproc_kthread_add(xpt_done_td, &cam_doneqs[i], &cam_proc, NULL, 0, 0, "cam", "doneq%d", i); if (error != 0) { cam_num_doneqs = i; break; } } if (cam_num_doneqs < 1) { printf("xpt_init: Cannot init completion queues " "- failing attach\n"); return (ENOMEM); } /* * Register a callback for when interrupts are enabled. */ xsoftc.xpt_config_hook = (struct intr_config_hook *)malloc(sizeof(struct intr_config_hook), M_CAMXPT, M_NOWAIT | M_ZERO); if (xsoftc.xpt_config_hook == NULL) { printf("xpt_init: Cannot malloc config hook " "- failing attach\n"); return (ENOMEM); } xsoftc.xpt_config_hook->ich_func = xpt_config; if (config_intrhook_establish(xsoftc.xpt_config_hook) != 0) { free (xsoftc.xpt_config_hook, M_CAMXPT); printf("xpt_init: config_intrhook_establish failed " "- failing attach\n"); } return (0); } static cam_status xptregister(struct cam_periph *periph, void *arg) { struct cam_sim *xpt_sim; if (periph == NULL) { printf("xptregister: periph was NULL!!\n"); return(CAM_REQ_CMP_ERR); } xpt_sim = (struct cam_sim *)arg; xpt_sim->softc = periph; xpt_periph = periph; periph->softc = NULL; return(CAM_REQ_CMP); } int32_t xpt_add_periph(struct cam_periph *periph) { struct cam_ed *device; int32_t status; TASK_INIT(&periph->periph_run_task, 0, xpt_run_allocq_task, periph); device = periph->path->device; status = CAM_REQ_CMP; if (device != NULL) { mtx_lock(&device->target->bus->eb_mtx); device->generation++; SLIST_INSERT_HEAD(&device->periphs, periph, periph_links); mtx_unlock(&device->target->bus->eb_mtx); atomic_add_32(&xsoftc.xpt_generation, 1); } return (status); } void xpt_remove_periph(struct cam_periph *periph) { struct cam_ed *device; device = periph->path->device; if (device != NULL) { mtx_lock(&device->target->bus->eb_mtx); device->generation++; SLIST_REMOVE(&device->periphs, periph, cam_periph, periph_links); mtx_unlock(&device->target->bus->eb_mtx); atomic_add_32(&xsoftc.xpt_generation, 1); } } void xpt_announce_periph(struct cam_periph *periph, char *announce_string) { struct cam_path *path = periph->path; cam_periph_assert(periph, MA_OWNED); periph->flags |= CAM_PERIPH_ANNOUNCED; printf("%s%d at %s%d bus %d scbus%d target %d lun %jx\n", periph->periph_name, periph->unit_number, path->bus->sim->sim_name, path->bus->sim->unit_number, path->bus->sim->bus_id, path->bus->path_id, path->target->target_id, (uintmax_t)path->device->lun_id); printf("%s%d: ", periph->periph_name, periph->unit_number); if (path->device->protocol == PROTO_SCSI) scsi_print_inquiry(&path->device->inq_data); else if (path->device->protocol == PROTO_ATA || path->device->protocol == PROTO_SATAPM) ata_print_ident(&path->device->ident_data); else if (path->device->protocol == PROTO_SEMB) semb_print_ident( (struct sep_identify_data *)&path->device->ident_data); else printf("Unknown protocol device\n"); if (path->device->serial_num_len > 0) { /* Don't wrap the screen - print only the first 60 chars */ printf("%s%d: Serial Number %.60s\n", periph->periph_name, periph->unit_number, path->device->serial_num); } /* Announce transport details. */ (*(path->bus->xport->announce))(periph); /* Announce command queueing. */ if (path->device->inq_flags & SID_CmdQue || path->device->flags & CAM_DEV_TAG_AFTER_COUNT) { printf("%s%d: Command Queueing enabled\n", periph->periph_name, periph->unit_number); } /* Announce caller's details if they've passed in. */ if (announce_string != NULL) printf("%s%d: %s\n", periph->periph_name, periph->unit_number, announce_string); } void xpt_announce_quirks(struct cam_periph *periph, int quirks, char *bit_string) { if (quirks != 0) { printf("%s%d: quirks=0x%b\n", periph->periph_name, periph->unit_number, quirks, bit_string); } } void xpt_denounce_periph(struct cam_periph *periph) { struct cam_path *path = periph->path; cam_periph_assert(periph, MA_OWNED); printf("%s%d at %s%d bus %d scbus%d target %d lun %jx\n", periph->periph_name, periph->unit_number, path->bus->sim->sim_name, path->bus->sim->unit_number, path->bus->sim->bus_id, path->bus->path_id, path->target->target_id, (uintmax_t)path->device->lun_id); printf("%s%d: ", periph->periph_name, periph->unit_number); if (path->device->protocol == PROTO_SCSI) scsi_print_inquiry_short(&path->device->inq_data); else if (path->device->protocol == PROTO_ATA || path->device->protocol == PROTO_SATAPM) ata_print_ident_short(&path->device->ident_data); else if (path->device->protocol == PROTO_SEMB) semb_print_ident_short( (struct sep_identify_data *)&path->device->ident_data); else printf("Unknown protocol device"); if (path->device->serial_num_len > 0) printf(" s/n %.60s", path->device->serial_num); printf(" detached\n"); } int xpt_getattr(char *buf, size_t len, const char *attr, struct cam_path *path) { int ret = -1, l; struct ccb_dev_advinfo cdai; struct scsi_vpd_id_descriptor *idd; xpt_path_assert(path, MA_OWNED); memset(&cdai, 0, sizeof(cdai)); xpt_setup_ccb(&cdai.ccb_h, path, CAM_PRIORITY_NORMAL); cdai.ccb_h.func_code = XPT_DEV_ADVINFO; cdai.bufsiz = len; if (!strcmp(attr, "GEOM::ident")) cdai.buftype = CDAI_TYPE_SERIAL_NUM; else if (!strcmp(attr, "GEOM::physpath")) cdai.buftype = CDAI_TYPE_PHYS_PATH; else if (strcmp(attr, "GEOM::lunid") == 0 || strcmp(attr, "GEOM::lunname") == 0) { cdai.buftype = CDAI_TYPE_SCSI_DEVID; cdai.bufsiz = CAM_SCSI_DEVID_MAXLEN; } else goto out; cdai.buf = malloc(cdai.bufsiz, M_CAMXPT, M_NOWAIT|M_ZERO); if (cdai.buf == NULL) { ret = ENOMEM; goto out; } xpt_action((union ccb *)&cdai); /* can only be synchronous */ if ((cdai.ccb_h.status & CAM_DEV_QFRZN) != 0) cam_release_devq(cdai.ccb_h.path, 0, 0, 0, FALSE); if (cdai.provsiz == 0) goto out; if (cdai.buftype == CDAI_TYPE_SCSI_DEVID) { if (strcmp(attr, "GEOM::lunid") == 0) { idd = scsi_get_devid((struct scsi_vpd_device_id *)cdai.buf, cdai.provsiz, scsi_devid_is_lun_naa); if (idd == NULL) idd = scsi_get_devid((struct scsi_vpd_device_id *)cdai.buf, cdai.provsiz, scsi_devid_is_lun_eui64); } else idd = NULL; if (idd == NULL) idd = scsi_get_devid((struct scsi_vpd_device_id *)cdai.buf, cdai.provsiz, scsi_devid_is_lun_t10); if (idd == NULL) idd = scsi_get_devid((struct scsi_vpd_device_id *)cdai.buf, cdai.provsiz, scsi_devid_is_lun_name); if (idd == NULL) goto out; ret = 0; if ((idd->proto_codeset & SVPD_ID_CODESET_MASK) == SVPD_ID_CODESET_ASCII) { if (idd->length < len) { for (l = 0; l < idd->length; l++) buf[l] = idd->identifier[l] ? idd->identifier[l] : ' '; buf[l] = 0; } else ret = EFAULT; } else if ((idd->proto_codeset & SVPD_ID_CODESET_MASK) == SVPD_ID_CODESET_UTF8) { l = strnlen(idd->identifier, idd->length); if (l < len) { bcopy(idd->identifier, buf, l); buf[l] = 0; } else ret = EFAULT; } else { if (idd->length * 2 < len) { for (l = 0; l < idd->length; l++) sprintf(buf + l * 2, "%02x", idd->identifier[l]); } else ret = EFAULT; } } else { ret = 0; if (strlcpy(buf, cdai.buf, len) >= len) ret = EFAULT; } out: if (cdai.buf != NULL) free(cdai.buf, M_CAMXPT); return ret; } static dev_match_ret xptbusmatch(struct dev_match_pattern *patterns, u_int num_patterns, struct cam_eb *bus) { dev_match_ret retval; int i; retval = DM_RET_NONE; /* * If we aren't given something to match against, that's an error. */ if (bus == NULL) return(DM_RET_ERROR); /* * If there are no match entries, then this bus matches no * matter what. */ if ((patterns == NULL) || (num_patterns == 0)) return(DM_RET_DESCEND | DM_RET_COPY); for (i = 0; i < num_patterns; i++) { struct bus_match_pattern *cur_pattern; /* * If the pattern in question isn't for a bus node, we * aren't interested. However, we do indicate to the * calling routine that we should continue descending the * tree, since the user wants to match against lower-level * EDT elements. */ if (patterns[i].type != DEV_MATCH_BUS) { if ((retval & DM_RET_ACTION_MASK) == DM_RET_NONE) retval |= DM_RET_DESCEND; continue; } cur_pattern = &patterns[i].pattern.bus_pattern; /* * If they want to match any bus node, we give them any * device node. */ if (cur_pattern->flags == BUS_MATCH_ANY) { /* set the copy flag */ retval |= DM_RET_COPY; /* * If we've already decided on an action, go ahead * and return. */ if ((retval & DM_RET_ACTION_MASK) != DM_RET_NONE) return(retval); } /* * Not sure why someone would do this... */ if (cur_pattern->flags == BUS_MATCH_NONE) continue; if (((cur_pattern->flags & BUS_MATCH_PATH) != 0) && (cur_pattern->path_id != bus->path_id)) continue; if (((cur_pattern->flags & BUS_MATCH_BUS_ID) != 0) && (cur_pattern->bus_id != bus->sim->bus_id)) continue; if (((cur_pattern->flags & BUS_MATCH_UNIT) != 0) && (cur_pattern->unit_number != bus->sim->unit_number)) continue; if (((cur_pattern->flags & BUS_MATCH_NAME) != 0) && (strncmp(cur_pattern->dev_name, bus->sim->sim_name, DEV_IDLEN) != 0)) continue; /* * If we get to this point, the user definitely wants * information on this bus. So tell the caller to copy the * data out. */ retval |= DM_RET_COPY; /* * If the return action has been set to descend, then we * know that we've already seen a non-bus matching * expression, therefore we need to further descend the tree. * This won't change by continuing around the loop, so we * go ahead and return. If we haven't seen a non-bus * matching expression, we keep going around the loop until * we exhaust the matching expressions. We'll set the stop * flag once we fall out of the loop. */ if ((retval & DM_RET_ACTION_MASK) == DM_RET_DESCEND) return(retval); } /* * If the return action hasn't been set to descend yet, that means * we haven't seen anything other than bus matching patterns. So * tell the caller to stop descending the tree -- the user doesn't * want to match against lower level tree elements. */ if ((retval & DM_RET_ACTION_MASK) == DM_RET_NONE) retval |= DM_RET_STOP; return(retval); } static dev_match_ret xptdevicematch(struct dev_match_pattern *patterns, u_int num_patterns, struct cam_ed *device) { dev_match_ret retval; int i; retval = DM_RET_NONE; /* * If we aren't given something to match against, that's an error. */ if (device == NULL) return(DM_RET_ERROR); /* * If there are no match entries, then this device matches no * matter what. */ if ((patterns == NULL) || (num_patterns == 0)) return(DM_RET_DESCEND | DM_RET_COPY); for (i = 0; i < num_patterns; i++) { struct device_match_pattern *cur_pattern; struct scsi_vpd_device_id *device_id_page; /* * If the pattern in question isn't for a device node, we * aren't interested. */ if (patterns[i].type != DEV_MATCH_DEVICE) { if ((patterns[i].type == DEV_MATCH_PERIPH) && ((retval & DM_RET_ACTION_MASK) == DM_RET_NONE)) retval |= DM_RET_DESCEND; continue; } cur_pattern = &patterns[i].pattern.device_pattern; /* Error out if mutually exclusive options are specified. */ if ((cur_pattern->flags & (DEV_MATCH_INQUIRY|DEV_MATCH_DEVID)) == (DEV_MATCH_INQUIRY|DEV_MATCH_DEVID)) return(DM_RET_ERROR); /* * If they want to match any device node, we give them any * device node. */ if (cur_pattern->flags == DEV_MATCH_ANY) goto copy_dev_node; /* * Not sure why someone would do this... */ if (cur_pattern->flags == DEV_MATCH_NONE) continue; if (((cur_pattern->flags & DEV_MATCH_PATH) != 0) && (cur_pattern->path_id != device->target->bus->path_id)) continue; if (((cur_pattern->flags & DEV_MATCH_TARGET) != 0) && (cur_pattern->target_id != device->target->target_id)) continue; if (((cur_pattern->flags & DEV_MATCH_LUN) != 0) && (cur_pattern->target_lun != device->lun_id)) continue; if (((cur_pattern->flags & DEV_MATCH_INQUIRY) != 0) && (cam_quirkmatch((caddr_t)&device->inq_data, (caddr_t)&cur_pattern->data.inq_pat, 1, sizeof(cur_pattern->data.inq_pat), scsi_static_inquiry_match) == NULL)) continue; device_id_page = (struct scsi_vpd_device_id *)device->device_id; if (((cur_pattern->flags & DEV_MATCH_DEVID) != 0) && (device->device_id_len < SVPD_DEVICE_ID_HDR_LEN || scsi_devid_match((uint8_t *)device_id_page->desc_list, device->device_id_len - SVPD_DEVICE_ID_HDR_LEN, cur_pattern->data.devid_pat.id, cur_pattern->data.devid_pat.id_len) != 0)) continue; copy_dev_node: /* * If we get to this point, the user definitely wants * information on this device. So tell the caller to copy * the data out. */ retval |= DM_RET_COPY; /* * If the return action has been set to descend, then we * know that we've already seen a peripheral matching * expression, therefore we need to further descend the tree. * This won't change by continuing around the loop, so we * go ahead and return. If we haven't seen a peripheral * matching expression, we keep going around the loop until * we exhaust the matching expressions. We'll set the stop * flag once we fall out of the loop. */ if ((retval & DM_RET_ACTION_MASK) == DM_RET_DESCEND) return(retval); } /* * If the return action hasn't been set to descend yet, that means * we haven't seen any peripheral matching patterns. So tell the * caller to stop descending the tree -- the user doesn't want to * match against lower level tree elements. */ if ((retval & DM_RET_ACTION_MASK) == DM_RET_NONE) retval |= DM_RET_STOP; return(retval); } /* * Match a single peripheral against any number of match patterns. */ static dev_match_ret xptperiphmatch(struct dev_match_pattern *patterns, u_int num_patterns, struct cam_periph *periph) { dev_match_ret retval; int i; /* * If we aren't given something to match against, that's an error. */ if (periph == NULL) return(DM_RET_ERROR); /* * If there are no match entries, then this peripheral matches no * matter what. */ if ((patterns == NULL) || (num_patterns == 0)) return(DM_RET_STOP | DM_RET_COPY); /* * There aren't any nodes below a peripheral node, so there's no * reason to descend the tree any further. */ retval = DM_RET_STOP; for (i = 0; i < num_patterns; i++) { struct periph_match_pattern *cur_pattern; /* * If the pattern in question isn't for a peripheral, we * aren't interested. */ if (patterns[i].type != DEV_MATCH_PERIPH) continue; cur_pattern = &patterns[i].pattern.periph_pattern; /* * If they want to match on anything, then we will do so. */ if (cur_pattern->flags == PERIPH_MATCH_ANY) { /* set the copy flag */ retval |= DM_RET_COPY; /* * We've already set the return action to stop, * since there are no nodes below peripherals in * the tree. */ return(retval); } /* * Not sure why someone would do this... */ if (cur_pattern->flags == PERIPH_MATCH_NONE) continue; if (((cur_pattern->flags & PERIPH_MATCH_PATH) != 0) && (cur_pattern->path_id != periph->path->bus->path_id)) continue; /* * For the target and lun id's, we have to make sure the * target and lun pointers aren't NULL. The xpt peripheral * has a wildcard target and device. */ if (((cur_pattern->flags & PERIPH_MATCH_TARGET) != 0) && ((periph->path->target == NULL) ||(cur_pattern->target_id != periph->path->target->target_id))) continue; if (((cur_pattern->flags & PERIPH_MATCH_LUN) != 0) && ((periph->path->device == NULL) || (cur_pattern->target_lun != periph->path->device->lun_id))) continue; if (((cur_pattern->flags & PERIPH_MATCH_UNIT) != 0) && (cur_pattern->unit_number != periph->unit_number)) continue; if (((cur_pattern->flags & PERIPH_MATCH_NAME) != 0) && (strncmp(cur_pattern->periph_name, periph->periph_name, DEV_IDLEN) != 0)) continue; /* * If we get to this point, the user definitely wants * information on this peripheral. So tell the caller to * copy the data out. */ retval |= DM_RET_COPY; /* * The return action has already been set to stop, since * peripherals don't have any nodes below them in the EDT. */ return(retval); } /* * If we get to this point, the peripheral that was passed in * doesn't match any of the patterns. */ return(retval); } static int xptedtbusfunc(struct cam_eb *bus, void *arg) { struct ccb_dev_match *cdm; struct cam_et *target; dev_match_ret retval; cdm = (struct ccb_dev_match *)arg; /* * If our position is for something deeper in the tree, that means * that we've already seen this node. So, we keep going down. */ if ((cdm->pos.position_type & CAM_DEV_POS_BUS) && (cdm->pos.cookie.bus == bus) && (cdm->pos.position_type & CAM_DEV_POS_TARGET) && (cdm->pos.cookie.target != NULL)) retval = DM_RET_DESCEND; else retval = xptbusmatch(cdm->patterns, cdm->num_patterns, bus); /* * If we got an error, bail out of the search. */ if ((retval & DM_RET_ACTION_MASK) == DM_RET_ERROR) { cdm->status = CAM_DEV_MATCH_ERROR; return(0); } /* * If the copy flag is set, copy this bus out. */ if (retval & DM_RET_COPY) { int spaceleft, j; spaceleft = cdm->match_buf_len - (cdm->num_matches * sizeof(struct dev_match_result)); /* * If we don't have enough space to put in another * match result, save our position and tell the * user there are more devices to check. */ if (spaceleft < sizeof(struct dev_match_result)) { bzero(&cdm->pos, sizeof(cdm->pos)); cdm->pos.position_type = CAM_DEV_POS_EDT | CAM_DEV_POS_BUS; cdm->pos.cookie.bus = bus; cdm->pos.generations[CAM_BUS_GENERATION]= xsoftc.bus_generation; cdm->status = CAM_DEV_MATCH_MORE; return(0); } j = cdm->num_matches; cdm->num_matches++; cdm->matches[j].type = DEV_MATCH_BUS; cdm->matches[j].result.bus_result.path_id = bus->path_id; cdm->matches[j].result.bus_result.bus_id = bus->sim->bus_id; cdm->matches[j].result.bus_result.unit_number = bus->sim->unit_number; strncpy(cdm->matches[j].result.bus_result.dev_name, bus->sim->sim_name, DEV_IDLEN); } /* * If the user is only interested in busses, there's no * reason to descend to the next level in the tree. */ if ((retval & DM_RET_ACTION_MASK) == DM_RET_STOP) return(1); /* * If there is a target generation recorded, check it to * make sure the target list hasn't changed. */ mtx_lock(&bus->eb_mtx); if ((cdm->pos.position_type & CAM_DEV_POS_BUS) && (cdm->pos.cookie.bus == bus) && (cdm->pos.position_type & CAM_DEV_POS_TARGET) && (cdm->pos.cookie.target != NULL)) { if ((cdm->pos.generations[CAM_TARGET_GENERATION] != bus->generation)) { mtx_unlock(&bus->eb_mtx); cdm->status = CAM_DEV_MATCH_LIST_CHANGED; return (0); } target = (struct cam_et *)cdm->pos.cookie.target; target->refcount++; } else target = NULL; mtx_unlock(&bus->eb_mtx); return (xpttargettraverse(bus, target, xptedttargetfunc, arg)); } static int xptedttargetfunc(struct cam_et *target, void *arg) { struct ccb_dev_match *cdm; struct cam_eb *bus; struct cam_ed *device; cdm = (struct ccb_dev_match *)arg; bus = target->bus; /* * If there is a device list generation recorded, check it to * make sure the device list hasn't changed. */ mtx_lock(&bus->eb_mtx); if ((cdm->pos.position_type & CAM_DEV_POS_BUS) && (cdm->pos.cookie.bus == bus) && (cdm->pos.position_type & CAM_DEV_POS_TARGET) && (cdm->pos.cookie.target == target) && (cdm->pos.position_type & CAM_DEV_POS_DEVICE) && (cdm->pos.cookie.device != NULL)) { if (cdm->pos.generations[CAM_DEV_GENERATION] != target->generation) { mtx_unlock(&bus->eb_mtx); cdm->status = CAM_DEV_MATCH_LIST_CHANGED; return(0); } device = (struct cam_ed *)cdm->pos.cookie.device; device->refcount++; } else device = NULL; mtx_unlock(&bus->eb_mtx); return (xptdevicetraverse(target, device, xptedtdevicefunc, arg)); } static int xptedtdevicefunc(struct cam_ed *device, void *arg) { struct cam_eb *bus; struct cam_periph *periph; struct ccb_dev_match *cdm; dev_match_ret retval; cdm = (struct ccb_dev_match *)arg; bus = device->target->bus; /* * If our position is for something deeper in the tree, that means * that we've already seen this node. So, we keep going down. */ if ((cdm->pos.position_type & CAM_DEV_POS_DEVICE) && (cdm->pos.cookie.device == device) && (cdm->pos.position_type & CAM_DEV_POS_PERIPH) && (cdm->pos.cookie.periph != NULL)) retval = DM_RET_DESCEND; else retval = xptdevicematch(cdm->patterns, cdm->num_patterns, device); if ((retval & DM_RET_ACTION_MASK) == DM_RET_ERROR) { cdm->status = CAM_DEV_MATCH_ERROR; return(0); } /* * If the copy flag is set, copy this device out. */ if (retval & DM_RET_COPY) { int spaceleft, j; spaceleft = cdm->match_buf_len - (cdm->num_matches * sizeof(struct dev_match_result)); /* * If we don't have enough space to put in another * match result, save our position and tell the * user there are more devices to check. */ if (spaceleft < sizeof(struct dev_match_result)) { bzero(&cdm->pos, sizeof(cdm->pos)); cdm->pos.position_type = CAM_DEV_POS_EDT | CAM_DEV_POS_BUS | CAM_DEV_POS_TARGET | CAM_DEV_POS_DEVICE; cdm->pos.cookie.bus = device->target->bus; cdm->pos.generations[CAM_BUS_GENERATION]= xsoftc.bus_generation; cdm->pos.cookie.target = device->target; cdm->pos.generations[CAM_TARGET_GENERATION] = device->target->bus->generation; cdm->pos.cookie.device = device; cdm->pos.generations[CAM_DEV_GENERATION] = device->target->generation; cdm->status = CAM_DEV_MATCH_MORE; return(0); } j = cdm->num_matches; cdm->num_matches++; cdm->matches[j].type = DEV_MATCH_DEVICE; cdm->matches[j].result.device_result.path_id = device->target->bus->path_id; cdm->matches[j].result.device_result.target_id = device->target->target_id; cdm->matches[j].result.device_result.target_lun = device->lun_id; cdm->matches[j].result.device_result.protocol = device->protocol; bcopy(&device->inq_data, &cdm->matches[j].result.device_result.inq_data, sizeof(struct scsi_inquiry_data)); bcopy(&device->ident_data, &cdm->matches[j].result.device_result.ident_data, sizeof(struct ata_params)); /* Let the user know whether this device is unconfigured */ if (device->flags & CAM_DEV_UNCONFIGURED) cdm->matches[j].result.device_result.flags = DEV_RESULT_UNCONFIGURED; else cdm->matches[j].result.device_result.flags = DEV_RESULT_NOFLAG; } /* * If the user isn't interested in peripherals, don't descend * the tree any further. */ if ((retval & DM_RET_ACTION_MASK) == DM_RET_STOP) return(1); /* * If there is a peripheral list generation recorded, make sure * it hasn't changed. */ xpt_lock_buses(); mtx_lock(&bus->eb_mtx); if ((cdm->pos.position_type & CAM_DEV_POS_BUS) && (cdm->pos.cookie.bus == bus) && (cdm->pos.position_type & CAM_DEV_POS_TARGET) && (cdm->pos.cookie.target == device->target) && (cdm->pos.position_type & CAM_DEV_POS_DEVICE) && (cdm->pos.cookie.device == device) && (cdm->pos.position_type & CAM_DEV_POS_PERIPH) && (cdm->pos.cookie.periph != NULL)) { if (cdm->pos.generations[CAM_PERIPH_GENERATION] != device->generation) { mtx_unlock(&bus->eb_mtx); xpt_unlock_buses(); cdm->status = CAM_DEV_MATCH_LIST_CHANGED; return(0); } periph = (struct cam_periph *)cdm->pos.cookie.periph; periph->refcount++; } else periph = NULL; mtx_unlock(&bus->eb_mtx); xpt_unlock_buses(); return (xptperiphtraverse(device, periph, xptedtperiphfunc, arg)); } static int xptedtperiphfunc(struct cam_periph *periph, void *arg) { struct ccb_dev_match *cdm; dev_match_ret retval; cdm = (struct ccb_dev_match *)arg; retval = xptperiphmatch(cdm->patterns, cdm->num_patterns, periph); if ((retval & DM_RET_ACTION_MASK) == DM_RET_ERROR) { cdm->status = CAM_DEV_MATCH_ERROR; return(0); } /* * If the copy flag is set, copy this peripheral out. */ if (retval & DM_RET_COPY) { int spaceleft, j; spaceleft = cdm->match_buf_len - (cdm->num_matches * sizeof(struct dev_match_result)); /* * If we don't have enough space to put in another * match result, save our position and tell the * user there are more devices to check. */ if (spaceleft < sizeof(struct dev_match_result)) { bzero(&cdm->pos, sizeof(cdm->pos)); cdm->pos.position_type = CAM_DEV_POS_EDT | CAM_DEV_POS_BUS | CAM_DEV_POS_TARGET | CAM_DEV_POS_DEVICE | CAM_DEV_POS_PERIPH; cdm->pos.cookie.bus = periph->path->bus; cdm->pos.generations[CAM_BUS_GENERATION]= xsoftc.bus_generation; cdm->pos.cookie.target = periph->path->target; cdm->pos.generations[CAM_TARGET_GENERATION] = periph->path->bus->generation; cdm->pos.cookie.device = periph->path->device; cdm->pos.generations[CAM_DEV_GENERATION] = periph->path->target->generation; cdm->pos.cookie.periph = periph; cdm->pos.generations[CAM_PERIPH_GENERATION] = periph->path->device->generation; cdm->status = CAM_DEV_MATCH_MORE; return(0); } j = cdm->num_matches; cdm->num_matches++; cdm->matches[j].type = DEV_MATCH_PERIPH; cdm->matches[j].result.periph_result.path_id = periph->path->bus->path_id; cdm->matches[j].result.periph_result.target_id = periph->path->target->target_id; cdm->matches[j].result.periph_result.target_lun = periph->path->device->lun_id; cdm->matches[j].result.periph_result.unit_number = periph->unit_number; strncpy(cdm->matches[j].result.periph_result.periph_name, periph->periph_name, DEV_IDLEN); } return(1); } static int xptedtmatch(struct ccb_dev_match *cdm) { struct cam_eb *bus; int ret; cdm->num_matches = 0; /* * Check the bus list generation. If it has changed, the user * needs to reset everything and start over. */ xpt_lock_buses(); if ((cdm->pos.position_type & CAM_DEV_POS_BUS) && (cdm->pos.cookie.bus != NULL)) { if (cdm->pos.generations[CAM_BUS_GENERATION] != xsoftc.bus_generation) { xpt_unlock_buses(); cdm->status = CAM_DEV_MATCH_LIST_CHANGED; return(0); } bus = (struct cam_eb *)cdm->pos.cookie.bus; bus->refcount++; } else bus = NULL; xpt_unlock_buses(); ret = xptbustraverse(bus, xptedtbusfunc, cdm); /* * If we get back 0, that means that we had to stop before fully * traversing the EDT. It also means that one of the subroutines * has set the status field to the proper value. If we get back 1, * we've fully traversed the EDT and copied out any matching entries. */ if (ret == 1) cdm->status = CAM_DEV_MATCH_LAST; return(ret); } static int xptplistpdrvfunc(struct periph_driver **pdrv, void *arg) { struct cam_periph *periph; struct ccb_dev_match *cdm; cdm = (struct ccb_dev_match *)arg; xpt_lock_buses(); if ((cdm->pos.position_type & CAM_DEV_POS_PDPTR) && (cdm->pos.cookie.pdrv == pdrv) && (cdm->pos.position_type & CAM_DEV_POS_PERIPH) && (cdm->pos.cookie.periph != NULL)) { if (cdm->pos.generations[CAM_PERIPH_GENERATION] != (*pdrv)->generation) { xpt_unlock_buses(); cdm->status = CAM_DEV_MATCH_LIST_CHANGED; return(0); } periph = (struct cam_periph *)cdm->pos.cookie.periph; periph->refcount++; } else periph = NULL; xpt_unlock_buses(); return (xptpdperiphtraverse(pdrv, periph, xptplistperiphfunc, arg)); } static int xptplistperiphfunc(struct cam_periph *periph, void *arg) { struct ccb_dev_match *cdm; dev_match_ret retval; cdm = (struct ccb_dev_match *)arg; retval = xptperiphmatch(cdm->patterns, cdm->num_patterns, periph); if ((retval & DM_RET_ACTION_MASK) == DM_RET_ERROR) { cdm->status = CAM_DEV_MATCH_ERROR; return(0); } /* * If the copy flag is set, copy this peripheral out. */ if (retval & DM_RET_COPY) { int spaceleft, j; spaceleft = cdm->match_buf_len - (cdm->num_matches * sizeof(struct dev_match_result)); /* * If we don't have enough space to put in another * match result, save our position and tell the * user there are more devices to check. */ if (spaceleft < sizeof(struct dev_match_result)) { struct periph_driver **pdrv; pdrv = NULL; bzero(&cdm->pos, sizeof(cdm->pos)); cdm->pos.position_type = CAM_DEV_POS_PDRV | CAM_DEV_POS_PDPTR | CAM_DEV_POS_PERIPH; /* * This may look a bit non-sensical, but it is * actually quite logical. There are very few * peripheral drivers, and bloating every peripheral * structure with a pointer back to its parent * peripheral driver linker set entry would cost * more in the long run than doing this quick lookup. */ for (pdrv = periph_drivers; *pdrv != NULL; pdrv++) { if (strcmp((*pdrv)->driver_name, periph->periph_name) == 0) break; } if (*pdrv == NULL) { cdm->status = CAM_DEV_MATCH_ERROR; return(0); } cdm->pos.cookie.pdrv = pdrv; /* * The periph generation slot does double duty, as * does the periph pointer slot. They are used for * both edt and pdrv lookups and positioning. */ cdm->pos.cookie.periph = periph; cdm->pos.generations[CAM_PERIPH_GENERATION] = (*pdrv)->generation; cdm->status = CAM_DEV_MATCH_MORE; return(0); } j = cdm->num_matches; cdm->num_matches++; cdm->matches[j].type = DEV_MATCH_PERIPH; cdm->matches[j].result.periph_result.path_id = periph->path->bus->path_id; /* * The transport layer peripheral doesn't have a target or * lun. */ if (periph->path->target) cdm->matches[j].result.periph_result.target_id = periph->path->target->target_id; else cdm->matches[j].result.periph_result.target_id = CAM_TARGET_WILDCARD; if (periph->path->device) cdm->matches[j].result.periph_result.target_lun = periph->path->device->lun_id; else cdm->matches[j].result.periph_result.target_lun = CAM_LUN_WILDCARD; cdm->matches[j].result.periph_result.unit_number = periph->unit_number; strncpy(cdm->matches[j].result.periph_result.periph_name, periph->periph_name, DEV_IDLEN); } return(1); } static int xptperiphlistmatch(struct ccb_dev_match *cdm) { int ret; cdm->num_matches = 0; /* * At this point in the edt traversal function, we check the bus * list generation to make sure that no busses have been added or * removed since the user last sent a XPT_DEV_MATCH ccb through. * For the peripheral driver list traversal function, however, we * don't have to worry about new peripheral driver types coming or * going; they're in a linker set, and therefore can't change * without a recompile. */ if ((cdm->pos.position_type & CAM_DEV_POS_PDPTR) && (cdm->pos.cookie.pdrv != NULL)) ret = xptpdrvtraverse( (struct periph_driver **)cdm->pos.cookie.pdrv, xptplistpdrvfunc, cdm); else ret = xptpdrvtraverse(NULL, xptplistpdrvfunc, cdm); /* * If we get back 0, that means that we had to stop before fully * traversing the peripheral driver tree. It also means that one of * the subroutines has set the status field to the proper value. If * we get back 1, we've fully traversed the EDT and copied out any * matching entries. */ if (ret == 1) cdm->status = CAM_DEV_MATCH_LAST; return(ret); } static int xptbustraverse(struct cam_eb *start_bus, xpt_busfunc_t *tr_func, void *arg) { struct cam_eb *bus, *next_bus; int retval; retval = 1; if (start_bus) bus = start_bus; else { xpt_lock_buses(); bus = TAILQ_FIRST(&xsoftc.xpt_busses); if (bus == NULL) { xpt_unlock_buses(); return (retval); } bus->refcount++; xpt_unlock_buses(); } for (; bus != NULL; bus = next_bus) { retval = tr_func(bus, arg); if (retval == 0) { xpt_release_bus(bus); break; } xpt_lock_buses(); next_bus = TAILQ_NEXT(bus, links); if (next_bus) next_bus->refcount++; xpt_unlock_buses(); xpt_release_bus(bus); } return(retval); } static int xpttargettraverse(struct cam_eb *bus, struct cam_et *start_target, xpt_targetfunc_t *tr_func, void *arg) { struct cam_et *target, *next_target; int retval; retval = 1; if (start_target) target = start_target; else { mtx_lock(&bus->eb_mtx); target = TAILQ_FIRST(&bus->et_entries); if (target == NULL) { mtx_unlock(&bus->eb_mtx); return (retval); } target->refcount++; mtx_unlock(&bus->eb_mtx); } for (; target != NULL; target = next_target) { retval = tr_func(target, arg); if (retval == 0) { xpt_release_target(target); break; } mtx_lock(&bus->eb_mtx); next_target = TAILQ_NEXT(target, links); if (next_target) next_target->refcount++; mtx_unlock(&bus->eb_mtx); xpt_release_target(target); } return(retval); } static int xptdevicetraverse(struct cam_et *target, struct cam_ed *start_device, xpt_devicefunc_t *tr_func, void *arg) { struct cam_eb *bus; struct cam_ed *device, *next_device; int retval; retval = 1; bus = target->bus; if (start_device) device = start_device; else { mtx_lock(&bus->eb_mtx); device = TAILQ_FIRST(&target->ed_entries); if (device == NULL) { mtx_unlock(&bus->eb_mtx); return (retval); } device->refcount++; mtx_unlock(&bus->eb_mtx); } for (; device != NULL; device = next_device) { mtx_lock(&device->device_mtx); retval = tr_func(device, arg); mtx_unlock(&device->device_mtx); if (retval == 0) { xpt_release_device(device); break; } mtx_lock(&bus->eb_mtx); next_device = TAILQ_NEXT(device, links); if (next_device) next_device->refcount++; mtx_unlock(&bus->eb_mtx); xpt_release_device(device); } return(retval); } static int xptperiphtraverse(struct cam_ed *device, struct cam_periph *start_periph, xpt_periphfunc_t *tr_func, void *arg) { struct cam_eb *bus; struct cam_periph *periph, *next_periph; int retval; retval = 1; bus = device->target->bus; if (start_periph) periph = start_periph; else { xpt_lock_buses(); mtx_lock(&bus->eb_mtx); periph = SLIST_FIRST(&device->periphs); while (periph != NULL && (periph->flags & CAM_PERIPH_FREE) != 0) periph = SLIST_NEXT(periph, periph_links); if (periph == NULL) { mtx_unlock(&bus->eb_mtx); xpt_unlock_buses(); return (retval); } periph->refcount++; mtx_unlock(&bus->eb_mtx); xpt_unlock_buses(); } for (; periph != NULL; periph = next_periph) { retval = tr_func(periph, arg); if (retval == 0) { cam_periph_release_locked(periph); break; } xpt_lock_buses(); mtx_lock(&bus->eb_mtx); next_periph = SLIST_NEXT(periph, periph_links); while (next_periph != NULL && (next_periph->flags & CAM_PERIPH_FREE) != 0) next_periph = SLIST_NEXT(next_periph, periph_links); if (next_periph) next_periph->refcount++; mtx_unlock(&bus->eb_mtx); xpt_unlock_buses(); cam_periph_release_locked(periph); } return(retval); } static int xptpdrvtraverse(struct periph_driver **start_pdrv, xpt_pdrvfunc_t *tr_func, void *arg) { struct periph_driver **pdrv; int retval; retval = 1; /* * We don't traverse the peripheral driver list like we do the * other lists, because it is a linker set, and therefore cannot be * changed during runtime. If the peripheral driver list is ever * re-done to be something other than a linker set (i.e. it can * change while the system is running), the list traversal should * be modified to work like the other traversal functions. */ for (pdrv = (start_pdrv ? start_pdrv : periph_drivers); *pdrv != NULL; pdrv++) { retval = tr_func(pdrv, arg); if (retval == 0) return(retval); } return(retval); } static int xptpdperiphtraverse(struct periph_driver **pdrv, struct cam_periph *start_periph, xpt_periphfunc_t *tr_func, void *arg) { struct cam_periph *periph, *next_periph; int retval; retval = 1; if (start_periph) periph = start_periph; else { xpt_lock_buses(); periph = TAILQ_FIRST(&(*pdrv)->units); while (periph != NULL && (periph->flags & CAM_PERIPH_FREE) != 0) periph = TAILQ_NEXT(periph, unit_links); if (periph == NULL) { xpt_unlock_buses(); return (retval); } periph->refcount++; xpt_unlock_buses(); } for (; periph != NULL; periph = next_periph) { cam_periph_lock(periph); retval = tr_func(periph, arg); cam_periph_unlock(periph); if (retval == 0) { cam_periph_release(periph); break; } xpt_lock_buses(); next_periph = TAILQ_NEXT(periph, unit_links); while (next_periph != NULL && (next_periph->flags & CAM_PERIPH_FREE) != 0) next_periph = TAILQ_NEXT(next_periph, unit_links); if (next_periph) next_periph->refcount++; xpt_unlock_buses(); cam_periph_release(periph); } return(retval); } static int xptdefbusfunc(struct cam_eb *bus, void *arg) { struct xpt_traverse_config *tr_config; tr_config = (struct xpt_traverse_config *)arg; if (tr_config->depth == XPT_DEPTH_BUS) { xpt_busfunc_t *tr_func; tr_func = (xpt_busfunc_t *)tr_config->tr_func; return(tr_func(bus, tr_config->tr_arg)); } else return(xpttargettraverse(bus, NULL, xptdeftargetfunc, arg)); } static int xptdeftargetfunc(struct cam_et *target, void *arg) { struct xpt_traverse_config *tr_config; tr_config = (struct xpt_traverse_config *)arg; if (tr_config->depth == XPT_DEPTH_TARGET) { xpt_targetfunc_t *tr_func; tr_func = (xpt_targetfunc_t *)tr_config->tr_func; return(tr_func(target, tr_config->tr_arg)); } else return(xptdevicetraverse(target, NULL, xptdefdevicefunc, arg)); } static int xptdefdevicefunc(struct cam_ed *device, void *arg) { struct xpt_traverse_config *tr_config; tr_config = (struct xpt_traverse_config *)arg; if (tr_config->depth == XPT_DEPTH_DEVICE) { xpt_devicefunc_t *tr_func; tr_func = (xpt_devicefunc_t *)tr_config->tr_func; return(tr_func(device, tr_config->tr_arg)); } else return(xptperiphtraverse(device, NULL, xptdefperiphfunc, arg)); } static int xptdefperiphfunc(struct cam_periph *periph, void *arg) { struct xpt_traverse_config *tr_config; xpt_periphfunc_t *tr_func; tr_config = (struct xpt_traverse_config *)arg; tr_func = (xpt_periphfunc_t *)tr_config->tr_func; /* * Unlike the other default functions, we don't check for depth * here. The peripheral driver level is the last level in the EDT, * so if we're here, we should execute the function in question. */ return(tr_func(periph, tr_config->tr_arg)); } /* * Execute the given function for every bus in the EDT. */ static int xpt_for_all_busses(xpt_busfunc_t *tr_func, void *arg) { struct xpt_traverse_config tr_config; tr_config.depth = XPT_DEPTH_BUS; tr_config.tr_func = tr_func; tr_config.tr_arg = arg; return(xptbustraverse(NULL, xptdefbusfunc, &tr_config)); } /* * Execute the given function for every device in the EDT. */ static int xpt_for_all_devices(xpt_devicefunc_t *tr_func, void *arg) { struct xpt_traverse_config tr_config; tr_config.depth = XPT_DEPTH_DEVICE; tr_config.tr_func = tr_func; tr_config.tr_arg = arg; return(xptbustraverse(NULL, xptdefbusfunc, &tr_config)); } static int xptsetasyncfunc(struct cam_ed *device, void *arg) { struct cam_path path; struct ccb_getdev cgd; struct ccb_setasync *csa = (struct ccb_setasync *)arg; /* * Don't report unconfigured devices (Wildcard devs, * devices only for target mode, device instances * that have been invalidated but are waiting for * their last reference count to be released). */ if ((device->flags & CAM_DEV_UNCONFIGURED) != 0) return (1); xpt_compile_path(&path, NULL, device->target->bus->path_id, device->target->target_id, device->lun_id); xpt_setup_ccb(&cgd.ccb_h, &path, CAM_PRIORITY_NORMAL); cgd.ccb_h.func_code = XPT_GDEV_TYPE; xpt_action((union ccb *)&cgd); csa->callback(csa->callback_arg, AC_FOUND_DEVICE, &path, &cgd); xpt_release_path(&path); return(1); } static int xptsetasyncbusfunc(struct cam_eb *bus, void *arg) { struct cam_path path; struct ccb_pathinq cpi; struct ccb_setasync *csa = (struct ccb_setasync *)arg; xpt_compile_path(&path, /*periph*/NULL, bus->path_id, CAM_TARGET_WILDCARD, CAM_LUN_WILDCARD); xpt_path_lock(&path); xpt_setup_ccb(&cpi.ccb_h, &path, CAM_PRIORITY_NORMAL); cpi.ccb_h.func_code = XPT_PATH_INQ; xpt_action((union ccb *)&cpi); csa->callback(csa->callback_arg, AC_PATH_REGISTERED, &path, &cpi); xpt_path_unlock(&path); xpt_release_path(&path); return(1); } void xpt_action(union ccb *start_ccb) { CAM_DEBUG(start_ccb->ccb_h.path, CAM_DEBUG_TRACE, ("xpt_action\n")); start_ccb->ccb_h.status = CAM_REQ_INPROG; (*(start_ccb->ccb_h.path->bus->xport->action))(start_ccb); } void xpt_action_default(union ccb *start_ccb) { struct cam_path *path; struct cam_sim *sim; int lock; path = start_ccb->ccb_h.path; CAM_DEBUG(path, CAM_DEBUG_TRACE, ("xpt_action_default\n")); switch (start_ccb->ccb_h.func_code) { case XPT_SCSI_IO: { struct cam_ed *device; /* * For the sake of compatibility with SCSI-1 * devices that may not understand the identify * message, we include lun information in the * second byte of all commands. SCSI-1 specifies * that luns are a 3 bit value and reserves only 3 * bits for lun information in the CDB. Later * revisions of the SCSI spec allow for more than 8 * luns, but have deprecated lun information in the * CDB. So, if the lun won't fit, we must omit. * * Also be aware that during initial probing for devices, * the inquiry information is unknown but initialized to 0. * This means that this code will be exercised while probing * devices with an ANSI revision greater than 2. */ device = path->device; if (device->protocol_version <= SCSI_REV_2 && start_ccb->ccb_h.target_lun < 8 && (start_ccb->ccb_h.flags & CAM_CDB_POINTER) == 0) { start_ccb->csio.cdb_io.cdb_bytes[1] |= start_ccb->ccb_h.target_lun << 5; } start_ccb->csio.scsi_status = SCSI_STATUS_OK; } /* FALLTHROUGH */ case XPT_TARGET_IO: case XPT_CONT_TARGET_IO: start_ccb->csio.sense_resid = 0; start_ccb->csio.resid = 0; /* FALLTHROUGH */ case XPT_ATA_IO: if (start_ccb->ccb_h.func_code == XPT_ATA_IO) start_ccb->ataio.resid = 0; /* FALLTHROUGH */ case XPT_RESET_DEV: case XPT_ENG_EXEC: case XPT_SMP_IO: { struct cam_devq *devq; devq = path->bus->sim->devq; mtx_lock(&devq->send_mtx); cam_ccbq_insert_ccb(&path->device->ccbq, start_ccb); if (xpt_schedule_devq(devq, path->device) != 0) xpt_run_devq(devq); mtx_unlock(&devq->send_mtx); break; } case XPT_CALC_GEOMETRY: /* Filter out garbage */ if (start_ccb->ccg.block_size == 0 || start_ccb->ccg.volume_size == 0) { start_ccb->ccg.cylinders = 0; start_ccb->ccg.heads = 0; start_ccb->ccg.secs_per_track = 0; start_ccb->ccb_h.status = CAM_REQ_CMP; break; } #if defined(PC98) || defined(__sparc64__) /* * In a PC-98 system, geometry translation depens on * the "real" device geometry obtained from mode page 4. * SCSI geometry translation is performed in the * initialization routine of the SCSI BIOS and the result * stored in host memory. If the translation is available * in host memory, use it. If not, rely on the default * translation the device driver performs. * For sparc64, we may need adjust the geometry of large * disks in order to fit the limitations of the 16-bit * fields of the VTOC8 disk label. */ if (scsi_da_bios_params(&start_ccb->ccg) != 0) { start_ccb->ccb_h.status = CAM_REQ_CMP; break; } #endif goto call_sim; case XPT_ABORT: { union ccb* abort_ccb; abort_ccb = start_ccb->cab.abort_ccb; if (XPT_FC_IS_DEV_QUEUED(abort_ccb)) { if (abort_ccb->ccb_h.pinfo.index >= 0) { struct cam_ccbq *ccbq; struct cam_ed *device; device = abort_ccb->ccb_h.path->device; ccbq = &device->ccbq; cam_ccbq_remove_ccb(ccbq, abort_ccb); abort_ccb->ccb_h.status = CAM_REQ_ABORTED|CAM_DEV_QFRZN; xpt_freeze_devq(abort_ccb->ccb_h.path, 1); xpt_done(abort_ccb); start_ccb->ccb_h.status = CAM_REQ_CMP; break; } if (abort_ccb->ccb_h.pinfo.index == CAM_UNQUEUED_INDEX && (abort_ccb->ccb_h.status & CAM_SIM_QUEUED) == 0) { /* * We've caught this ccb en route to * the SIM. Flag it for abort and the * SIM will do so just before starting * real work on the CCB. */ abort_ccb->ccb_h.status = CAM_REQ_ABORTED|CAM_DEV_QFRZN; xpt_freeze_devq(abort_ccb->ccb_h.path, 1); start_ccb->ccb_h.status = CAM_REQ_CMP; break; } } if (XPT_FC_IS_QUEUED(abort_ccb) && (abort_ccb->ccb_h.pinfo.index == CAM_DONEQ_INDEX)) { /* * It's already completed but waiting * for our SWI to get to it. */ start_ccb->ccb_h.status = CAM_UA_ABORT; break; } /* * If we weren't able to take care of the abort request * in the XPT, pass the request down to the SIM for processing. */ } /* FALLTHROUGH */ case XPT_ACCEPT_TARGET_IO: case XPT_EN_LUN: case XPT_IMMED_NOTIFY: case XPT_NOTIFY_ACK: case XPT_RESET_BUS: case XPT_IMMEDIATE_NOTIFY: case XPT_NOTIFY_ACKNOWLEDGE: case XPT_GET_SIM_KNOB: case XPT_SET_SIM_KNOB: case XPT_GET_TRAN_SETTINGS: case XPT_SET_TRAN_SETTINGS: case XPT_PATH_INQ: call_sim: sim = path->bus->sim; lock = (mtx_owned(sim->mtx) == 0); if (lock) CAM_SIM_LOCK(sim); (*(sim->sim_action))(sim, start_ccb); if (lock) CAM_SIM_UNLOCK(sim); break; case XPT_PATH_STATS: start_ccb->cpis.last_reset = path->bus->last_reset; start_ccb->ccb_h.status = CAM_REQ_CMP; break; case XPT_GDEV_TYPE: { struct cam_ed *dev; dev = path->device; if ((dev->flags & CAM_DEV_UNCONFIGURED) != 0) { start_ccb->ccb_h.status = CAM_DEV_NOT_THERE; } else { struct ccb_getdev *cgd; cgd = &start_ccb->cgd; cgd->protocol = dev->protocol; cgd->inq_data = dev->inq_data; cgd->ident_data = dev->ident_data; cgd->inq_flags = dev->inq_flags; cgd->ccb_h.status = CAM_REQ_CMP; cgd->serial_num_len = dev->serial_num_len; if ((dev->serial_num_len > 0) && (dev->serial_num != NULL)) bcopy(dev->serial_num, cgd->serial_num, dev->serial_num_len); } break; } case XPT_GDEV_STATS: { struct cam_ed *dev; dev = path->device; if ((dev->flags & CAM_DEV_UNCONFIGURED) != 0) { start_ccb->ccb_h.status = CAM_DEV_NOT_THERE; } else { struct ccb_getdevstats *cgds; struct cam_eb *bus; struct cam_et *tar; struct cam_devq *devq; cgds = &start_ccb->cgds; bus = path->bus; tar = path->target; devq = bus->sim->devq; mtx_lock(&devq->send_mtx); cgds->dev_openings = dev->ccbq.dev_openings; cgds->dev_active = dev->ccbq.dev_active; cgds->allocated = dev->ccbq.allocated; cgds->queued = cam_ccbq_pending_ccb_count(&dev->ccbq); cgds->held = cgds->allocated - cgds->dev_active - cgds->queued; cgds->last_reset = tar->last_reset; cgds->maxtags = dev->maxtags; cgds->mintags = dev->mintags; if (timevalcmp(&tar->last_reset, &bus->last_reset, <)) cgds->last_reset = bus->last_reset; mtx_unlock(&devq->send_mtx); cgds->ccb_h.status = CAM_REQ_CMP; } break; } case XPT_GDEVLIST: { struct cam_periph *nperiph; struct periph_list *periph_head; struct ccb_getdevlist *cgdl; u_int i; struct cam_ed *device; int found; found = 0; /* * Don't want anyone mucking with our data. */ device = path->device; periph_head = &device->periphs; cgdl = &start_ccb->cgdl; /* * Check and see if the list has changed since the user * last requested a list member. If so, tell them that the * list has changed, and therefore they need to start over * from the beginning. */ if ((cgdl->index != 0) && (cgdl->generation != device->generation)) { cgdl->status = CAM_GDEVLIST_LIST_CHANGED; break; } /* * Traverse the list of peripherals and attempt to find * the requested peripheral. */ for (nperiph = SLIST_FIRST(periph_head), i = 0; (nperiph != NULL) && (i <= cgdl->index); nperiph = SLIST_NEXT(nperiph, periph_links), i++) { if (i == cgdl->index) { strncpy(cgdl->periph_name, nperiph->periph_name, DEV_IDLEN); cgdl->unit_number = nperiph->unit_number; found = 1; } } if (found == 0) { cgdl->status = CAM_GDEVLIST_ERROR; break; } if (nperiph == NULL) cgdl->status = CAM_GDEVLIST_LAST_DEVICE; else cgdl->status = CAM_GDEVLIST_MORE_DEVS; cgdl->index++; cgdl->generation = device->generation; cgdl->ccb_h.status = CAM_REQ_CMP; break; } case XPT_DEV_MATCH: { dev_pos_type position_type; struct ccb_dev_match *cdm; cdm = &start_ccb->cdm; /* * There are two ways of getting at information in the EDT. * The first way is via the primary EDT tree. It starts * with a list of busses, then a list of targets on a bus, * then devices/luns on a target, and then peripherals on a * device/lun. The "other" way is by the peripheral driver * lists. The peripheral driver lists are organized by * peripheral driver. (obviously) So it makes sense to * use the peripheral driver list if the user is looking * for something like "da1", or all "da" devices. If the * user is looking for something on a particular bus/target * or lun, it's generally better to go through the EDT tree. */ if (cdm->pos.position_type != CAM_DEV_POS_NONE) position_type = cdm->pos.position_type; else { u_int i; position_type = CAM_DEV_POS_NONE; for (i = 0; i < cdm->num_patterns; i++) { if ((cdm->patterns[i].type == DEV_MATCH_BUS) ||(cdm->patterns[i].type == DEV_MATCH_DEVICE)){ position_type = CAM_DEV_POS_EDT; break; } } if (cdm->num_patterns == 0) position_type = CAM_DEV_POS_EDT; else if (position_type == CAM_DEV_POS_NONE) position_type = CAM_DEV_POS_PDRV; } switch(position_type & CAM_DEV_POS_TYPEMASK) { case CAM_DEV_POS_EDT: xptedtmatch(cdm); break; case CAM_DEV_POS_PDRV: xptperiphlistmatch(cdm); break; default: cdm->status = CAM_DEV_MATCH_ERROR; break; } if (cdm->status == CAM_DEV_MATCH_ERROR) start_ccb->ccb_h.status = CAM_REQ_CMP_ERR; else start_ccb->ccb_h.status = CAM_REQ_CMP; break; } case XPT_SASYNC_CB: { struct ccb_setasync *csa; struct async_node *cur_entry; struct async_list *async_head; u_int32_t added; csa = &start_ccb->csa; added = csa->event_enable; async_head = &path->device->asyncs; /* * If there is already an entry for us, simply * update it. */ cur_entry = SLIST_FIRST(async_head); while (cur_entry != NULL) { if ((cur_entry->callback_arg == csa->callback_arg) && (cur_entry->callback == csa->callback)) break; cur_entry = SLIST_NEXT(cur_entry, links); } if (cur_entry != NULL) { /* * If the request has no flags set, * remove the entry. */ added &= ~cur_entry->event_enable; if (csa->event_enable == 0) { SLIST_REMOVE(async_head, cur_entry, async_node, links); xpt_release_device(path->device); free(cur_entry, M_CAMXPT); } else { cur_entry->event_enable = csa->event_enable; } csa->event_enable = added; } else { cur_entry = malloc(sizeof(*cur_entry), M_CAMXPT, M_NOWAIT); if (cur_entry == NULL) { csa->ccb_h.status = CAM_RESRC_UNAVAIL; break; } cur_entry->event_enable = csa->event_enable; cur_entry->event_lock = mtx_owned(path->bus->sim->mtx) ? 1 : 0; cur_entry->callback_arg = csa->callback_arg; cur_entry->callback = csa->callback; SLIST_INSERT_HEAD(async_head, cur_entry, links); xpt_acquire_device(path->device); } start_ccb->ccb_h.status = CAM_REQ_CMP; break; } case XPT_REL_SIMQ: { struct ccb_relsim *crs; struct cam_ed *dev; crs = &start_ccb->crs; dev = path->device; if (dev == NULL) { crs->ccb_h.status = CAM_DEV_NOT_THERE; break; } if ((crs->release_flags & RELSIM_ADJUST_OPENINGS) != 0) { /* Don't ever go below one opening */ if (crs->openings > 0) { xpt_dev_ccbq_resize(path, crs->openings); if (bootverbose) { xpt_print(path, "number of openings is now %d\n", crs->openings); } } } mtx_lock(&dev->sim->devq->send_mtx); if ((crs->release_flags & RELSIM_RELEASE_AFTER_TIMEOUT) != 0) { if ((dev->flags & CAM_DEV_REL_TIMEOUT_PENDING) != 0) { /* * Just extend the old timeout and decrement * the freeze count so that a single timeout * is sufficient for releasing the queue. */ start_ccb->ccb_h.flags &= ~CAM_DEV_QFREEZE; callout_stop(&dev->callout); } else { start_ccb->ccb_h.flags |= CAM_DEV_QFREEZE; } callout_reset_sbt(&dev->callout, SBT_1MS * crs->release_timeout, 0, xpt_release_devq_timeout, dev, 0); dev->flags |= CAM_DEV_REL_TIMEOUT_PENDING; } if ((crs->release_flags & RELSIM_RELEASE_AFTER_CMDCMPLT) != 0) { if ((dev->flags & CAM_DEV_REL_ON_COMPLETE) != 0) { /* * Decrement the freeze count so that a single * completion is still sufficient to unfreeze * the queue. */ start_ccb->ccb_h.flags &= ~CAM_DEV_QFREEZE; } else { dev->flags |= CAM_DEV_REL_ON_COMPLETE; start_ccb->ccb_h.flags |= CAM_DEV_QFREEZE; } } if ((crs->release_flags & RELSIM_RELEASE_AFTER_QEMPTY) != 0) { if ((dev->flags & CAM_DEV_REL_ON_QUEUE_EMPTY) != 0 || (dev->ccbq.dev_active == 0)) { start_ccb->ccb_h.flags &= ~CAM_DEV_QFREEZE; } else { dev->flags |= CAM_DEV_REL_ON_QUEUE_EMPTY; start_ccb->ccb_h.flags |= CAM_DEV_QFREEZE; } } mtx_unlock(&dev->sim->devq->send_mtx); if ((start_ccb->ccb_h.flags & CAM_DEV_QFREEZE) == 0) xpt_release_devq(path, /*count*/1, /*run_queue*/TRUE); start_ccb->crs.qfrozen_cnt = dev->ccbq.queue.qfrozen_cnt; start_ccb->ccb_h.status = CAM_REQ_CMP; break; } case XPT_DEBUG: { struct cam_path *oldpath; /* Check that all request bits are supported. */ if (start_ccb->cdbg.flags & ~(CAM_DEBUG_COMPILE)) { start_ccb->ccb_h.status = CAM_FUNC_NOTAVAIL; break; } cam_dflags = CAM_DEBUG_NONE; if (cam_dpath != NULL) { oldpath = cam_dpath; cam_dpath = NULL; xpt_free_path(oldpath); } if (start_ccb->cdbg.flags != CAM_DEBUG_NONE) { if (xpt_create_path(&cam_dpath, NULL, start_ccb->ccb_h.path_id, start_ccb->ccb_h.target_id, start_ccb->ccb_h.target_lun) != CAM_REQ_CMP) { start_ccb->ccb_h.status = CAM_RESRC_UNAVAIL; } else { cam_dflags = start_ccb->cdbg.flags; start_ccb->ccb_h.status = CAM_REQ_CMP; xpt_print(cam_dpath, "debugging flags now %x\n", cam_dflags); } } else start_ccb->ccb_h.status = CAM_REQ_CMP; break; } case XPT_NOOP: if ((start_ccb->ccb_h.flags & CAM_DEV_QFREEZE) != 0) xpt_freeze_devq(path, 1); start_ccb->ccb_h.status = CAM_REQ_CMP; break; default: case XPT_SDEV_TYPE: case XPT_TERM_IO: case XPT_ENG_INQ: /* XXX Implement */ printf("%s: CCB type %#x not supported\n", __func__, start_ccb->ccb_h.func_code); start_ccb->ccb_h.status = CAM_PROVIDE_FAIL; if (start_ccb->ccb_h.func_code & XPT_FC_DEV_QUEUED) { xpt_done(start_ccb); } break; } } void xpt_polled_action(union ccb *start_ccb) { u_int32_t timeout; struct cam_sim *sim; struct cam_devq *devq; struct cam_ed *dev; timeout = start_ccb->ccb_h.timeout * 10; sim = start_ccb->ccb_h.path->bus->sim; devq = sim->devq; dev = start_ccb->ccb_h.path->device; mtx_unlock(&dev->device_mtx); /* * Steal an opening so that no other queued requests * can get it before us while we simulate interrupts. */ mtx_lock(&devq->send_mtx); dev->ccbq.dev_openings--; while((devq->send_openings <= 0 || dev->ccbq.dev_openings < 0) && (--timeout > 0)) { mtx_unlock(&devq->send_mtx); DELAY(100); CAM_SIM_LOCK(sim); (*(sim->sim_poll))(sim); CAM_SIM_UNLOCK(sim); camisr_runqueue(); mtx_lock(&devq->send_mtx); } dev->ccbq.dev_openings++; mtx_unlock(&devq->send_mtx); if (timeout != 0) { xpt_action(start_ccb); while(--timeout > 0) { CAM_SIM_LOCK(sim); (*(sim->sim_poll))(sim); CAM_SIM_UNLOCK(sim); camisr_runqueue(); if ((start_ccb->ccb_h.status & CAM_STATUS_MASK) != CAM_REQ_INPROG) break; DELAY(100); } if (timeout == 0) { /* * XXX Is it worth adding a sim_timeout entry * point so we can attempt recovery? If * this is only used for dumps, I don't think * it is. */ start_ccb->ccb_h.status = CAM_CMD_TIMEOUT; } } else { start_ccb->ccb_h.status = CAM_RESRC_UNAVAIL; } mtx_lock(&dev->device_mtx); } /* * Schedule a peripheral driver to receive a ccb when its * target device has space for more transactions. */ void xpt_schedule(struct cam_periph *periph, u_int32_t new_priority) { CAM_DEBUG(periph->path, CAM_DEBUG_TRACE, ("xpt_schedule\n")); cam_periph_assert(periph, MA_OWNED); if (new_priority < periph->scheduled_priority) { periph->scheduled_priority = new_priority; xpt_run_allocq(periph, 0); } } /* * Schedule a device to run on a given queue. * If the device was inserted as a new entry on the queue, * return 1 meaning the device queue should be run. If we * were already queued, implying someone else has already * started the queue, return 0 so the caller doesn't attempt * to run the queue. */ static int xpt_schedule_dev(struct camq *queue, cam_pinfo *pinfo, u_int32_t new_priority) { int retval; u_int32_t old_priority; CAM_DEBUG_PRINT(CAM_DEBUG_XPT, ("xpt_schedule_dev\n")); old_priority = pinfo->priority; /* * Are we already queued? */ if (pinfo->index != CAM_UNQUEUED_INDEX) { /* Simply reorder based on new priority */ if (new_priority < old_priority) { camq_change_priority(queue, pinfo->index, new_priority); CAM_DEBUG_PRINT(CAM_DEBUG_XPT, ("changed priority to %d\n", new_priority)); retval = 1; } else retval = 0; } else { /* New entry on the queue */ if (new_priority < old_priority) pinfo->priority = new_priority; CAM_DEBUG_PRINT(CAM_DEBUG_XPT, ("Inserting onto queue\n")); pinfo->generation = ++queue->generation; camq_insert(queue, pinfo); retval = 1; } return (retval); } static void xpt_run_allocq_task(void *context, int pending) { struct cam_periph *periph = context; cam_periph_lock(periph); periph->flags &= ~CAM_PERIPH_RUN_TASK; xpt_run_allocq(periph, 1); cam_periph_unlock(periph); cam_periph_release(periph); } static void xpt_run_allocq(struct cam_periph *periph, int sleep) { struct cam_ed *device; union ccb *ccb; uint32_t prio; cam_periph_assert(periph, MA_OWNED); if (periph->periph_allocating) return; periph->periph_allocating = 1; CAM_DEBUG_PRINT(CAM_DEBUG_XPT, ("xpt_run_allocq(%p)\n", periph)); device = periph->path->device; ccb = NULL; restart: while ((prio = min(periph->scheduled_priority, periph->immediate_priority)) != CAM_PRIORITY_NONE && (periph->periph_allocated - (ccb != NULL ? 1 : 0) < device->ccbq.total_openings || prio <= CAM_PRIORITY_OOB)) { if (ccb == NULL && (ccb = xpt_get_ccb_nowait(periph)) == NULL) { if (sleep) { ccb = xpt_get_ccb(periph); goto restart; } if (periph->flags & CAM_PERIPH_RUN_TASK) break; cam_periph_doacquire(periph); periph->flags |= CAM_PERIPH_RUN_TASK; taskqueue_enqueue(xsoftc.xpt_taskq, &periph->periph_run_task); break; } xpt_setup_ccb(&ccb->ccb_h, periph->path, prio); if (prio == periph->immediate_priority) { periph->immediate_priority = CAM_PRIORITY_NONE; CAM_DEBUG_PRINT(CAM_DEBUG_XPT, ("waking cam_periph_getccb()\n")); SLIST_INSERT_HEAD(&periph->ccb_list, &ccb->ccb_h, periph_links.sle); wakeup(&periph->ccb_list); } else { periph->scheduled_priority = CAM_PRIORITY_NONE; CAM_DEBUG_PRINT(CAM_DEBUG_XPT, ("calling periph_start()\n")); periph->periph_start(periph, ccb); } ccb = NULL; } if (ccb != NULL) xpt_release_ccb(ccb); periph->periph_allocating = 0; } static void xpt_run_devq(struct cam_devq *devq) { char cdb_str[(SCSI_MAX_CDBLEN * 3) + 1]; int lock; CAM_DEBUG_PRINT(CAM_DEBUG_XPT, ("xpt_run_devq\n")); devq->send_queue.qfrozen_cnt++; while ((devq->send_queue.entries > 0) && (devq->send_openings > 0) && (devq->send_queue.qfrozen_cnt <= 1)) { struct cam_ed *device; union ccb *work_ccb; struct cam_sim *sim; device = (struct cam_ed *)camq_remove(&devq->send_queue, CAMQ_HEAD); CAM_DEBUG_PRINT(CAM_DEBUG_XPT, ("running device %p\n", device)); work_ccb = cam_ccbq_peek_ccb(&device->ccbq, CAMQ_HEAD); if (work_ccb == NULL) { printf("device on run queue with no ccbs???\n"); continue; } if ((work_ccb->ccb_h.flags & CAM_HIGH_POWER) != 0) { mtx_lock(&xsoftc.xpt_highpower_lock); if (xsoftc.num_highpower <= 0) { /* * We got a high power command, but we * don't have any available slots. Freeze * the device queue until we have a slot * available. */ xpt_freeze_devq_device(device, 1); STAILQ_INSERT_TAIL(&xsoftc.highpowerq, device, highpowerq_entry); mtx_unlock(&xsoftc.xpt_highpower_lock); continue; } else { /* * Consume a high power slot while * this ccb runs. */ xsoftc.num_highpower--; } mtx_unlock(&xsoftc.xpt_highpower_lock); } cam_ccbq_remove_ccb(&device->ccbq, work_ccb); cam_ccbq_send_ccb(&device->ccbq, work_ccb); devq->send_openings--; devq->send_active++; xpt_schedule_devq(devq, device); mtx_unlock(&devq->send_mtx); if ((work_ccb->ccb_h.flags & CAM_DEV_QFREEZE) != 0) { /* * The client wants to freeze the queue * after this CCB is sent. */ xpt_freeze_devq(work_ccb->ccb_h.path, 1); } /* In Target mode, the peripheral driver knows best... */ if (work_ccb->ccb_h.func_code == XPT_SCSI_IO) { if ((device->inq_flags & SID_CmdQue) != 0 && work_ccb->csio.tag_action != CAM_TAG_ACTION_NONE) work_ccb->ccb_h.flags |= CAM_TAG_ACTION_VALID; else /* * Clear this in case of a retried CCB that * failed due to a rejected tag. */ work_ccb->ccb_h.flags &= ~CAM_TAG_ACTION_VALID; } switch (work_ccb->ccb_h.func_code) { case XPT_SCSI_IO: CAM_DEBUG(work_ccb->ccb_h.path, CAM_DEBUG_CDB,("%s. CDB: %s\n", scsi_op_desc(work_ccb->csio.cdb_io.cdb_bytes[0], &device->inq_data), scsi_cdb_string(work_ccb->csio.cdb_io.cdb_bytes, cdb_str, sizeof(cdb_str)))); break; case XPT_ATA_IO: CAM_DEBUG(work_ccb->ccb_h.path, CAM_DEBUG_CDB,("%s. ACB: %s\n", ata_op_string(&work_ccb->ataio.cmd), ata_cmd_string(&work_ccb->ataio.cmd, cdb_str, sizeof(cdb_str)))); break; default: break; } /* * Device queues can be shared among multiple SIM instances * that reside on different busses. Use the SIM from the * queued device, rather than the one from the calling bus. */ sim = device->sim; lock = (mtx_owned(sim->mtx) == 0); if (lock) CAM_SIM_LOCK(sim); (*(sim->sim_action))(sim, work_ccb); if (lock) CAM_SIM_UNLOCK(sim); mtx_lock(&devq->send_mtx); } devq->send_queue.qfrozen_cnt--; } /* * This function merges stuff from the slave ccb into the master ccb, while * keeping important fields in the master ccb constant. */ void xpt_merge_ccb(union ccb *master_ccb, union ccb *slave_ccb) { /* * Pull fields that are valid for peripheral drivers to set * into the master CCB along with the CCB "payload". */ master_ccb->ccb_h.retry_count = slave_ccb->ccb_h.retry_count; master_ccb->ccb_h.func_code = slave_ccb->ccb_h.func_code; master_ccb->ccb_h.timeout = slave_ccb->ccb_h.timeout; master_ccb->ccb_h.flags = slave_ccb->ccb_h.flags; bcopy(&(&slave_ccb->ccb_h)[1], &(&master_ccb->ccb_h)[1], sizeof(union ccb) - sizeof(struct ccb_hdr)); } void xpt_setup_ccb(struct ccb_hdr *ccb_h, struct cam_path *path, u_int32_t priority) { CAM_DEBUG(path, CAM_DEBUG_TRACE, ("xpt_setup_ccb\n")); ccb_h->pinfo.priority = priority; ccb_h->path = path; ccb_h->path_id = path->bus->path_id; if (path->target) ccb_h->target_id = path->target->target_id; else ccb_h->target_id = CAM_TARGET_WILDCARD; if (path->device) { ccb_h->target_lun = path->device->lun_id; ccb_h->pinfo.generation = ++path->device->ccbq.queue.generation; } else { ccb_h->target_lun = CAM_TARGET_WILDCARD; } ccb_h->pinfo.index = CAM_UNQUEUED_INDEX; ccb_h->flags = 0; ccb_h->xflags = 0; } /* Path manipulation functions */ cam_status xpt_create_path(struct cam_path **new_path_ptr, struct cam_periph *perph, path_id_t path_id, target_id_t target_id, lun_id_t lun_id) { struct cam_path *path; cam_status status; path = (struct cam_path *)malloc(sizeof(*path), M_CAMPATH, M_NOWAIT); if (path == NULL) { status = CAM_RESRC_UNAVAIL; return(status); } status = xpt_compile_path(path, perph, path_id, target_id, lun_id); if (status != CAM_REQ_CMP) { free(path, M_CAMPATH); path = NULL; } *new_path_ptr = path; return (status); } cam_status xpt_create_path_unlocked(struct cam_path **new_path_ptr, struct cam_periph *periph, path_id_t path_id, target_id_t target_id, lun_id_t lun_id) { return (xpt_create_path(new_path_ptr, periph, path_id, target_id, lun_id)); } cam_status xpt_compile_path(struct cam_path *new_path, struct cam_periph *perph, path_id_t path_id, target_id_t target_id, lun_id_t lun_id) { struct cam_eb *bus; struct cam_et *target; struct cam_ed *device; cam_status status; status = CAM_REQ_CMP; /* Completed without error */ target = NULL; /* Wildcarded */ device = NULL; /* Wildcarded */ /* * We will potentially modify the EDT, so block interrupts * that may attempt to create cam paths. */ bus = xpt_find_bus(path_id); if (bus == NULL) { status = CAM_PATH_INVALID; } else { xpt_lock_buses(); mtx_lock(&bus->eb_mtx); target = xpt_find_target(bus, target_id); if (target == NULL) { /* Create one */ struct cam_et *new_target; new_target = xpt_alloc_target(bus, target_id); if (new_target == NULL) { status = CAM_RESRC_UNAVAIL; } else { target = new_target; } } xpt_unlock_buses(); if (target != NULL) { device = xpt_find_device(target, lun_id); if (device == NULL) { /* Create one */ struct cam_ed *new_device; new_device = (*(bus->xport->alloc_device))(bus, target, lun_id); if (new_device == NULL) { status = CAM_RESRC_UNAVAIL; } else { device = new_device; } } } mtx_unlock(&bus->eb_mtx); } /* * Only touch the user's data if we are successful. */ if (status == CAM_REQ_CMP) { new_path->periph = perph; new_path->bus = bus; new_path->target = target; new_path->device = device; CAM_DEBUG(new_path, CAM_DEBUG_TRACE, ("xpt_compile_path\n")); } else { if (device != NULL) xpt_release_device(device); if (target != NULL) xpt_release_target(target); if (bus != NULL) xpt_release_bus(bus); } return (status); } cam_status xpt_clone_path(struct cam_path **new_path_ptr, struct cam_path *path) { struct cam_path *new_path; new_path = (struct cam_path *)malloc(sizeof(*path), M_CAMPATH, M_NOWAIT); if (new_path == NULL) return(CAM_RESRC_UNAVAIL); xpt_copy_path(new_path, path); *new_path_ptr = new_path; return (CAM_REQ_CMP); } void xpt_copy_path(struct cam_path *new_path, struct cam_path *path) { *new_path = *path; if (path->bus != NULL) xpt_acquire_bus(path->bus); if (path->target != NULL) xpt_acquire_target(path->target); if (path->device != NULL) xpt_acquire_device(path->device); } void xpt_release_path(struct cam_path *path) { CAM_DEBUG(path, CAM_DEBUG_TRACE, ("xpt_release_path\n")); if (path->device != NULL) { xpt_release_device(path->device); path->device = NULL; } if (path->target != NULL) { xpt_release_target(path->target); path->target = NULL; } if (path->bus != NULL) { xpt_release_bus(path->bus); path->bus = NULL; } } void xpt_free_path(struct cam_path *path) { CAM_DEBUG(path, CAM_DEBUG_TRACE, ("xpt_free_path\n")); xpt_release_path(path); free(path, M_CAMPATH); } void xpt_path_counts(struct cam_path *path, uint32_t *bus_ref, uint32_t *periph_ref, uint32_t *target_ref, uint32_t *device_ref) { xpt_lock_buses(); if (bus_ref) { if (path->bus) *bus_ref = path->bus->refcount; else *bus_ref = 0; } if (periph_ref) { if (path->periph) *periph_ref = path->periph->refcount; else *periph_ref = 0; } xpt_unlock_buses(); if (target_ref) { if (path->target) *target_ref = path->target->refcount; else *target_ref = 0; } if (device_ref) { if (path->device) *device_ref = path->device->refcount; else *device_ref = 0; } } /* * Return -1 for failure, 0 for exact match, 1 for match with wildcards * in path1, 2 for match with wildcards in path2. */ int xpt_path_comp(struct cam_path *path1, struct cam_path *path2) { int retval = 0; if (path1->bus != path2->bus) { if (path1->bus->path_id == CAM_BUS_WILDCARD) retval = 1; else if (path2->bus->path_id == CAM_BUS_WILDCARD) retval = 2; else return (-1); } if (path1->target != path2->target) { if (path1->target->target_id == CAM_TARGET_WILDCARD) { if (retval == 0) retval = 1; } else if (path2->target->target_id == CAM_TARGET_WILDCARD) retval = 2; else return (-1); } if (path1->device != path2->device) { if (path1->device->lun_id == CAM_LUN_WILDCARD) { if (retval == 0) retval = 1; } else if (path2->device->lun_id == CAM_LUN_WILDCARD) retval = 2; else return (-1); } return (retval); } int xpt_path_comp_dev(struct cam_path *path, struct cam_ed *dev) { int retval = 0; if (path->bus != dev->target->bus) { if (path->bus->path_id == CAM_BUS_WILDCARD) retval = 1; else if (dev->target->bus->path_id == CAM_BUS_WILDCARD) retval = 2; else return (-1); } if (path->target != dev->target) { if (path->target->target_id == CAM_TARGET_WILDCARD) { if (retval == 0) retval = 1; } else if (dev->target->target_id == CAM_TARGET_WILDCARD) retval = 2; else return (-1); } if (path->device != dev) { if (path->device->lun_id == CAM_LUN_WILDCARD) { if (retval == 0) retval = 1; } else if (dev->lun_id == CAM_LUN_WILDCARD) retval = 2; else return (-1); } return (retval); } void xpt_print_path(struct cam_path *path) { if (path == NULL) printf("(nopath): "); else { if (path->periph != NULL) printf("(%s%d:", path->periph->periph_name, path->periph->unit_number); else printf("(noperiph:"); if (path->bus != NULL) printf("%s%d:%d:", path->bus->sim->sim_name, path->bus->sim->unit_number, path->bus->sim->bus_id); else printf("nobus:"); if (path->target != NULL) printf("%d:", path->target->target_id); else printf("X:"); if (path->device != NULL) printf("%jx): ", (uintmax_t)path->device->lun_id); else printf("X): "); } } void xpt_print_device(struct cam_ed *device) { if (device == NULL) printf("(nopath): "); else { printf("(noperiph:%s%d:%d:%d:%jx): ", device->sim->sim_name, device->sim->unit_number, device->sim->bus_id, device->target->target_id, (uintmax_t)device->lun_id); } } void xpt_print(struct cam_path *path, const char *fmt, ...) { va_list ap; xpt_print_path(path); va_start(ap, fmt); vprintf(fmt, ap); va_end(ap); } int xpt_path_string(struct cam_path *path, char *str, size_t str_len) { struct sbuf sb; sbuf_new(&sb, str, str_len, 0); if (path == NULL) sbuf_printf(&sb, "(nopath): "); else { if (path->periph != NULL) sbuf_printf(&sb, "(%s%d:", path->periph->periph_name, path->periph->unit_number); else sbuf_printf(&sb, "(noperiph:"); if (path->bus != NULL) sbuf_printf(&sb, "%s%d:%d:", path->bus->sim->sim_name, path->bus->sim->unit_number, path->bus->sim->bus_id); else sbuf_printf(&sb, "nobus:"); if (path->target != NULL) sbuf_printf(&sb, "%d:", path->target->target_id); else sbuf_printf(&sb, "X:"); if (path->device != NULL) sbuf_printf(&sb, "%jx): ", (uintmax_t)path->device->lun_id); else sbuf_printf(&sb, "X): "); } sbuf_finish(&sb); return(sbuf_len(&sb)); } path_id_t xpt_path_path_id(struct cam_path *path) { return(path->bus->path_id); } target_id_t xpt_path_target_id(struct cam_path *path) { if (path->target != NULL) return (path->target->target_id); else return (CAM_TARGET_WILDCARD); } lun_id_t xpt_path_lun_id(struct cam_path *path) { if (path->device != NULL) return (path->device->lun_id); else return (CAM_LUN_WILDCARD); } struct cam_sim * xpt_path_sim(struct cam_path *path) { return (path->bus->sim); } struct cam_periph* xpt_path_periph(struct cam_path *path) { return (path->periph); } int xpt_path_legacy_ata_id(struct cam_path *path) { struct cam_eb *bus; int bus_id; if ((strcmp(path->bus->sim->sim_name, "ata") != 0) && strcmp(path->bus->sim->sim_name, "ahcich") != 0 && strcmp(path->bus->sim->sim_name, "mvsch") != 0 && strcmp(path->bus->sim->sim_name, "siisch") != 0) return (-1); if (strcmp(path->bus->sim->sim_name, "ata") == 0 && path->bus->sim->unit_number < 2) { bus_id = path->bus->sim->unit_number; } else { bus_id = 2; xpt_lock_buses(); TAILQ_FOREACH(bus, &xsoftc.xpt_busses, links) { if (bus == path->bus) break; if ((strcmp(bus->sim->sim_name, "ata") == 0 && bus->sim->unit_number >= 2) || strcmp(bus->sim->sim_name, "ahcich") == 0 || strcmp(bus->sim->sim_name, "mvsch") == 0 || strcmp(bus->sim->sim_name, "siisch") == 0) bus_id++; } xpt_unlock_buses(); } if (path->target != NULL) { if (path->target->target_id < 2) return (bus_id * 2 + path->target->target_id); else return (-1); } else return (bus_id * 2); } /* * Release a CAM control block for the caller. Remit the cost of the structure * to the device referenced by the path. If the this device had no 'credits' * and peripheral drivers have registered async callbacks for this notification * call them now. */ void xpt_release_ccb(union ccb *free_ccb) { struct cam_ed *device; struct cam_periph *periph; CAM_DEBUG_PRINT(CAM_DEBUG_XPT, ("xpt_release_ccb\n")); xpt_path_assert(free_ccb->ccb_h.path, MA_OWNED); device = free_ccb->ccb_h.path->device; periph = free_ccb->ccb_h.path->periph; xpt_free_ccb(free_ccb); periph->periph_allocated--; cam_ccbq_release_opening(&device->ccbq); xpt_run_allocq(periph, 0); } /* Functions accessed by SIM drivers */ static struct xpt_xport xport_default = { .alloc_device = xpt_alloc_device_default, .action = xpt_action_default, .async = xpt_dev_async_default, }; /* * A sim structure, listing the SIM entry points and instance * identification info is passed to xpt_bus_register to hook the SIM * into the CAM framework. xpt_bus_register creates a cam_eb entry * for this new bus and places it in the array of busses and assigns * it a path_id. The path_id may be influenced by "hard wiring" * information specified by the user. Once interrupt services are * available, the bus will be probed. */ int32_t xpt_bus_register(struct cam_sim *sim, device_t parent, u_int32_t bus) { struct cam_eb *new_bus; struct cam_eb *old_bus; struct ccb_pathinq cpi; struct cam_path *path; cam_status status; mtx_assert(sim->mtx, MA_OWNED); sim->bus_id = bus; new_bus = (struct cam_eb *)malloc(sizeof(*new_bus), M_CAMXPT, M_NOWAIT|M_ZERO); if (new_bus == NULL) { /* Couldn't satisfy request */ return (CAM_RESRC_UNAVAIL); } mtx_init(&new_bus->eb_mtx, "CAM bus lock", NULL, MTX_DEF); TAILQ_INIT(&new_bus->et_entries); cam_sim_hold(sim); new_bus->sim = sim; timevalclear(&new_bus->last_reset); new_bus->flags = 0; new_bus->refcount = 1; /* Held until a bus_deregister event */ new_bus->generation = 0; xpt_lock_buses(); sim->path_id = new_bus->path_id = xptpathid(sim->sim_name, sim->unit_number, sim->bus_id); old_bus = TAILQ_FIRST(&xsoftc.xpt_busses); while (old_bus != NULL && old_bus->path_id < new_bus->path_id) old_bus = TAILQ_NEXT(old_bus, links); if (old_bus != NULL) TAILQ_INSERT_BEFORE(old_bus, new_bus, links); else TAILQ_INSERT_TAIL(&xsoftc.xpt_busses, new_bus, links); xsoftc.bus_generation++; xpt_unlock_buses(); /* * Set a default transport so that a PATH_INQ can be issued to * the SIM. This will then allow for probing and attaching of * a more appropriate transport. */ new_bus->xport = &xport_default; status = xpt_create_path(&path, /*periph*/NULL, sim->path_id, CAM_TARGET_WILDCARD, CAM_LUN_WILDCARD); if (status != CAM_REQ_CMP) { xpt_release_bus(new_bus); free(path, M_CAMXPT); return (CAM_RESRC_UNAVAIL); } xpt_setup_ccb(&cpi.ccb_h, path, CAM_PRIORITY_NORMAL); cpi.ccb_h.func_code = XPT_PATH_INQ; xpt_action((union ccb *)&cpi); if (cpi.ccb_h.status == CAM_REQ_CMP) { switch (cpi.transport) { case XPORT_SPI: case XPORT_SAS: case XPORT_FC: case XPORT_USB: case XPORT_ISCSI: case XPORT_SRP: case XPORT_PPB: new_bus->xport = scsi_get_xport(); break; case XPORT_ATA: case XPORT_SATA: new_bus->xport = ata_get_xport(); break; default: new_bus->xport = &xport_default; break; } } /* Notify interested parties */ if (sim->path_id != CAM_XPT_PATH_ID) { xpt_async(AC_PATH_REGISTERED, path, &cpi); if ((cpi.hba_misc & PIM_NOSCAN) == 0) { union ccb *scan_ccb; /* Initiate bus rescan. */ scan_ccb = xpt_alloc_ccb_nowait(); if (scan_ccb != NULL) { scan_ccb->ccb_h.path = path; scan_ccb->ccb_h.func_code = XPT_SCAN_BUS; scan_ccb->crcn.flags = 0; xpt_rescan(scan_ccb); } else { xpt_print(path, "Can't allocate CCB to scan bus\n"); xpt_free_path(path); } } else xpt_free_path(path); } else xpt_free_path(path); return (CAM_SUCCESS); } int32_t xpt_bus_deregister(path_id_t pathid) { struct cam_path bus_path; cam_status status; status = xpt_compile_path(&bus_path, NULL, pathid, CAM_TARGET_WILDCARD, CAM_LUN_WILDCARD); if (status != CAM_REQ_CMP) return (status); xpt_async(AC_LOST_DEVICE, &bus_path, NULL); xpt_async(AC_PATH_DEREGISTERED, &bus_path, NULL); /* Release the reference count held while registered. */ xpt_release_bus(bus_path.bus); xpt_release_path(&bus_path); return (CAM_REQ_CMP); } static path_id_t xptnextfreepathid(void) { struct cam_eb *bus; path_id_t pathid; const char *strval; mtx_assert(&xsoftc.xpt_topo_lock, MA_OWNED); pathid = 0; bus = TAILQ_FIRST(&xsoftc.xpt_busses); retry: /* Find an unoccupied pathid */ while (bus != NULL && bus->path_id <= pathid) { if (bus->path_id == pathid) pathid++; bus = TAILQ_NEXT(bus, links); } /* * Ensure that this pathid is not reserved for * a bus that may be registered in the future. */ if (resource_string_value("scbus", pathid, "at", &strval) == 0) { ++pathid; /* Start the search over */ goto retry; } return (pathid); } static path_id_t xptpathid(const char *sim_name, int sim_unit, int sim_bus) { path_id_t pathid; int i, dunit, val; char buf[32]; const char *dname; pathid = CAM_XPT_PATH_ID; snprintf(buf, sizeof(buf), "%s%d", sim_name, sim_unit); if (strcmp(buf, "xpt0") == 0 && sim_bus == 0) return (pathid); i = 0; while ((resource_find_match(&i, &dname, &dunit, "at", buf)) == 0) { if (strcmp(dname, "scbus")) { /* Avoid a bit of foot shooting. */ continue; } if (dunit < 0) /* unwired?! */ continue; if (resource_int_value("scbus", dunit, "bus", &val) == 0) { if (sim_bus == val) { pathid = dunit; break; } } else if (sim_bus == 0) { /* Unspecified matches bus 0 */ pathid = dunit; break; } else { printf("Ambiguous scbus configuration for %s%d " "bus %d, cannot wire down. The kernel " "config entry for scbus%d should " "specify a controller bus.\n" "Scbus will be assigned dynamically.\n", sim_name, sim_unit, sim_bus, dunit); break; } } if (pathid == CAM_XPT_PATH_ID) pathid = xptnextfreepathid(); return (pathid); } static const char * xpt_async_string(u_int32_t async_code) { switch (async_code) { case AC_BUS_RESET: return ("AC_BUS_RESET"); case AC_UNSOL_RESEL: return ("AC_UNSOL_RESEL"); case AC_SCSI_AEN: return ("AC_SCSI_AEN"); case AC_SENT_BDR: return ("AC_SENT_BDR"); case AC_PATH_REGISTERED: return ("AC_PATH_REGISTERED"); case AC_PATH_DEREGISTERED: return ("AC_PATH_DEREGISTERED"); case AC_FOUND_DEVICE: return ("AC_FOUND_DEVICE"); case AC_LOST_DEVICE: return ("AC_LOST_DEVICE"); case AC_TRANSFER_NEG: return ("AC_TRANSFER_NEG"); case AC_INQ_CHANGED: return ("AC_INQ_CHANGED"); case AC_GETDEV_CHANGED: return ("AC_GETDEV_CHANGED"); case AC_CONTRACT: return ("AC_CONTRACT"); case AC_ADVINFO_CHANGED: return ("AC_ADVINFO_CHANGED"); case AC_UNIT_ATTENTION: return ("AC_UNIT_ATTENTION"); } return ("AC_UNKNOWN"); } static int xpt_async_size(u_int32_t async_code) { switch (async_code) { case AC_BUS_RESET: return (0); case AC_UNSOL_RESEL: return (0); case AC_SCSI_AEN: return (0); case AC_SENT_BDR: return (0); case AC_PATH_REGISTERED: return (sizeof(struct ccb_pathinq)); case AC_PATH_DEREGISTERED: return (0); case AC_FOUND_DEVICE: return (sizeof(struct ccb_getdev)); case AC_LOST_DEVICE: return (0); case AC_TRANSFER_NEG: return (sizeof(struct ccb_trans_settings)); case AC_INQ_CHANGED: return (0); case AC_GETDEV_CHANGED: return (0); case AC_CONTRACT: return (sizeof(struct ac_contract)); case AC_ADVINFO_CHANGED: return (-1); case AC_UNIT_ATTENTION: return (sizeof(struct ccb_scsiio)); } return (0); } static int xpt_async_process_dev(struct cam_ed *device, void *arg) { union ccb *ccb = arg; struct cam_path *path = ccb->ccb_h.path; void *async_arg = ccb->casync.async_arg_ptr; u_int32_t async_code = ccb->casync.async_code; int relock; if (path->device != device && path->device->lun_id != CAM_LUN_WILDCARD && device->lun_id != CAM_LUN_WILDCARD) return (1); /* * The async callback could free the device. * If it is a broadcast async, it doesn't hold * device reference, so take our own reference. */ xpt_acquire_device(device); /* * If async for specific device is to be delivered to * the wildcard client, take the specific device lock. * XXX: We may need a way for client to specify it. */ if ((device->lun_id == CAM_LUN_WILDCARD && path->device->lun_id != CAM_LUN_WILDCARD) || (device->target->target_id == CAM_TARGET_WILDCARD && path->target->target_id != CAM_TARGET_WILDCARD) || (device->target->bus->path_id == CAM_BUS_WILDCARD && path->target->bus->path_id != CAM_BUS_WILDCARD)) { mtx_unlock(&device->device_mtx); xpt_path_lock(path); relock = 1; } else relock = 0; (*(device->target->bus->xport->async))(async_code, device->target->bus, device->target, device, async_arg); xpt_async_bcast(&device->asyncs, async_code, path, async_arg); if (relock) { xpt_path_unlock(path); mtx_lock(&device->device_mtx); } xpt_release_device(device); return (1); } static int xpt_async_process_tgt(struct cam_et *target, void *arg) { union ccb *ccb = arg; struct cam_path *path = ccb->ccb_h.path; if (path->target != target && path->target->target_id != CAM_TARGET_WILDCARD && target->target_id != CAM_TARGET_WILDCARD) return (1); if (ccb->casync.async_code == AC_SENT_BDR) { /* Update our notion of when the last reset occurred */ microtime(&target->last_reset); } return (xptdevicetraverse(target, NULL, xpt_async_process_dev, ccb)); } static void xpt_async_process(struct cam_periph *periph, union ccb *ccb) { struct cam_eb *bus; struct cam_path *path; void *async_arg; u_int32_t async_code; path = ccb->ccb_h.path; async_code = ccb->casync.async_code; async_arg = ccb->casync.async_arg_ptr; CAM_DEBUG(path, CAM_DEBUG_TRACE | CAM_DEBUG_INFO, ("xpt_async(%s)\n", xpt_async_string(async_code))); bus = path->bus; if (async_code == AC_BUS_RESET) { /* Update our notion of when the last reset occurred */ microtime(&bus->last_reset); } xpttargettraverse(bus, NULL, xpt_async_process_tgt, ccb); /* * If this wasn't a fully wildcarded async, tell all * clients that want all async events. */ if (bus != xpt_periph->path->bus) { xpt_path_lock(xpt_periph->path); xpt_async_process_dev(xpt_periph->path->device, ccb); xpt_path_unlock(xpt_periph->path); } if (path->device != NULL && path->device->lun_id != CAM_LUN_WILDCARD) xpt_release_devq(path, 1, TRUE); else xpt_release_simq(path->bus->sim, TRUE); if (ccb->casync.async_arg_size > 0) free(async_arg, M_CAMXPT); xpt_free_path(path); xpt_free_ccb(ccb); } static void xpt_async_bcast(struct async_list *async_head, u_int32_t async_code, struct cam_path *path, void *async_arg) { struct async_node *cur_entry; int lock; cur_entry = SLIST_FIRST(async_head); while (cur_entry != NULL) { struct async_node *next_entry; /* * Grab the next list entry before we call the current * entry's callback. This is because the callback function * can delete its async callback entry. */ next_entry = SLIST_NEXT(cur_entry, links); if ((cur_entry->event_enable & async_code) != 0) { lock = cur_entry->event_lock; if (lock) CAM_SIM_LOCK(path->device->sim); cur_entry->callback(cur_entry->callback_arg, async_code, path, async_arg); if (lock) CAM_SIM_UNLOCK(path->device->sim); } cur_entry = next_entry; } } void xpt_async(u_int32_t async_code, struct cam_path *path, void *async_arg) { union ccb *ccb; int size; ccb = xpt_alloc_ccb_nowait(); if (ccb == NULL) { xpt_print(path, "Can't allocate CCB to send %s\n", xpt_async_string(async_code)); return; } if (xpt_clone_path(&ccb->ccb_h.path, path) != CAM_REQ_CMP) { xpt_print(path, "Can't allocate path to send %s\n", xpt_async_string(async_code)); xpt_free_ccb(ccb); return; } ccb->ccb_h.path->periph = NULL; ccb->ccb_h.func_code = XPT_ASYNC; ccb->ccb_h.cbfcnp = xpt_async_process; ccb->ccb_h.flags |= CAM_UNLOCKED; ccb->casync.async_code = async_code; ccb->casync.async_arg_size = 0; size = xpt_async_size(async_code); if (size > 0 && async_arg != NULL) { ccb->casync.async_arg_ptr = malloc(size, M_CAMXPT, M_NOWAIT); if (ccb->casync.async_arg_ptr == NULL) { xpt_print(path, "Can't allocate argument to send %s\n", xpt_async_string(async_code)); xpt_free_path(ccb->ccb_h.path); xpt_free_ccb(ccb); return; } memcpy(ccb->casync.async_arg_ptr, async_arg, size); ccb->casync.async_arg_size = size; - } else if (size < 0) + } else if (size < 0) { + ccb->casync.async_arg_ptr = async_arg; ccb->casync.async_arg_size = size; + } if (path->device != NULL && path->device->lun_id != CAM_LUN_WILDCARD) xpt_freeze_devq(path, 1); else xpt_freeze_simq(path->bus->sim, 1); xpt_done(ccb); } static void xpt_dev_async_default(u_int32_t async_code, struct cam_eb *bus, struct cam_et *target, struct cam_ed *device, void *async_arg) { /* * We only need to handle events for real devices. */ if (target->target_id == CAM_TARGET_WILDCARD || device->lun_id == CAM_LUN_WILDCARD) return; printf("%s called\n", __func__); } static uint32_t xpt_freeze_devq_device(struct cam_ed *dev, u_int count) { struct cam_devq *devq; uint32_t freeze; devq = dev->sim->devq; mtx_assert(&devq->send_mtx, MA_OWNED); CAM_DEBUG_DEV(dev, CAM_DEBUG_TRACE, ("xpt_freeze_devq_device(%d) %u->%u\n", count, dev->ccbq.queue.qfrozen_cnt, dev->ccbq.queue.qfrozen_cnt + count)); freeze = (dev->ccbq.queue.qfrozen_cnt += count); /* Remove frozen device from sendq. */ if (device_is_queued(dev)) camq_remove(&devq->send_queue, dev->devq_entry.index); return (freeze); } u_int32_t xpt_freeze_devq(struct cam_path *path, u_int count) { struct cam_ed *dev = path->device; struct cam_devq *devq; uint32_t freeze; devq = dev->sim->devq; mtx_lock(&devq->send_mtx); CAM_DEBUG(path, CAM_DEBUG_TRACE, ("xpt_freeze_devq(%d)\n", count)); freeze = xpt_freeze_devq_device(dev, count); mtx_unlock(&devq->send_mtx); return (freeze); } u_int32_t xpt_freeze_simq(struct cam_sim *sim, u_int count) { struct cam_devq *devq; uint32_t freeze; devq = sim->devq; mtx_lock(&devq->send_mtx); freeze = (devq->send_queue.qfrozen_cnt += count); mtx_unlock(&devq->send_mtx); return (freeze); } static void xpt_release_devq_timeout(void *arg) { struct cam_ed *dev; struct cam_devq *devq; dev = (struct cam_ed *)arg; CAM_DEBUG_DEV(dev, CAM_DEBUG_TRACE, ("xpt_release_devq_timeout\n")); devq = dev->sim->devq; mtx_assert(&devq->send_mtx, MA_OWNED); if (xpt_release_devq_device(dev, /*count*/1, /*run_queue*/TRUE)) xpt_run_devq(devq); } void xpt_release_devq(struct cam_path *path, u_int count, int run_queue) { struct cam_ed *dev; struct cam_devq *devq; CAM_DEBUG(path, CAM_DEBUG_TRACE, ("xpt_release_devq(%d, %d)\n", count, run_queue)); dev = path->device; devq = dev->sim->devq; mtx_lock(&devq->send_mtx); if (xpt_release_devq_device(dev, count, run_queue)) xpt_run_devq(dev->sim->devq); mtx_unlock(&devq->send_mtx); } static int xpt_release_devq_device(struct cam_ed *dev, u_int count, int run_queue) { mtx_assert(&dev->sim->devq->send_mtx, MA_OWNED); CAM_DEBUG_DEV(dev, CAM_DEBUG_TRACE, ("xpt_release_devq_device(%d, %d) %u->%u\n", count, run_queue, dev->ccbq.queue.qfrozen_cnt, dev->ccbq.queue.qfrozen_cnt - count)); if (count > dev->ccbq.queue.qfrozen_cnt) { #ifdef INVARIANTS printf("xpt_release_devq(): requested %u > present %u\n", count, dev->ccbq.queue.qfrozen_cnt); #endif count = dev->ccbq.queue.qfrozen_cnt; } dev->ccbq.queue.qfrozen_cnt -= count; if (dev->ccbq.queue.qfrozen_cnt == 0) { /* * No longer need to wait for a successful * command completion. */ dev->flags &= ~CAM_DEV_REL_ON_COMPLETE; /* * Remove any timeouts that might be scheduled * to release this queue. */ if ((dev->flags & CAM_DEV_REL_TIMEOUT_PENDING) != 0) { callout_stop(&dev->callout); dev->flags &= ~CAM_DEV_REL_TIMEOUT_PENDING; } /* * Now that we are unfrozen schedule the * device so any pending transactions are * run. */ xpt_schedule_devq(dev->sim->devq, dev); } else run_queue = 0; return (run_queue); } void xpt_release_simq(struct cam_sim *sim, int run_queue) { struct cam_devq *devq; devq = sim->devq; mtx_lock(&devq->send_mtx); if (devq->send_queue.qfrozen_cnt <= 0) { #ifdef INVARIANTS printf("xpt_release_simq: requested 1 > present %u\n", devq->send_queue.qfrozen_cnt); #endif } else devq->send_queue.qfrozen_cnt--; if (devq->send_queue.qfrozen_cnt == 0) { /* * If there is a timeout scheduled to release this * sim queue, remove it. The queue frozen count is * already at 0. */ if ((sim->flags & CAM_SIM_REL_TIMEOUT_PENDING) != 0){ callout_stop(&sim->callout); sim->flags &= ~CAM_SIM_REL_TIMEOUT_PENDING; } if (run_queue) { /* * Now that we are unfrozen run the send queue. */ xpt_run_devq(sim->devq); } } mtx_unlock(&devq->send_mtx); } /* * XXX Appears to be unused. */ static void xpt_release_simq_timeout(void *arg) { struct cam_sim *sim; sim = (struct cam_sim *)arg; xpt_release_simq(sim, /* run_queue */ TRUE); } void xpt_done(union ccb *done_ccb) { struct cam_doneq *queue; int run, hash; CAM_DEBUG(done_ccb->ccb_h.path, CAM_DEBUG_TRACE, ("xpt_done\n")); if ((done_ccb->ccb_h.func_code & XPT_FC_QUEUED) == 0) return; hash = (done_ccb->ccb_h.path_id + done_ccb->ccb_h.target_id + done_ccb->ccb_h.target_lun) % cam_num_doneqs; queue = &cam_doneqs[hash]; mtx_lock(&queue->cam_doneq_mtx); run = (queue->cam_doneq_sleep && STAILQ_EMPTY(&queue->cam_doneq)); STAILQ_INSERT_TAIL(&queue->cam_doneq, &done_ccb->ccb_h, sim_links.stqe); done_ccb->ccb_h.pinfo.index = CAM_DONEQ_INDEX; mtx_unlock(&queue->cam_doneq_mtx); if (run) wakeup(&queue->cam_doneq); } void xpt_done_direct(union ccb *done_ccb) { CAM_DEBUG(done_ccb->ccb_h.path, CAM_DEBUG_TRACE, ("xpt_done_direct\n")); if ((done_ccb->ccb_h.func_code & XPT_FC_QUEUED) == 0) return; xpt_done_process(&done_ccb->ccb_h); } union ccb * xpt_alloc_ccb() { union ccb *new_ccb; new_ccb = malloc(sizeof(*new_ccb), M_CAMCCB, M_ZERO|M_WAITOK); return (new_ccb); } union ccb * xpt_alloc_ccb_nowait() { union ccb *new_ccb; new_ccb = malloc(sizeof(*new_ccb), M_CAMCCB, M_ZERO|M_NOWAIT); return (new_ccb); } void xpt_free_ccb(union ccb *free_ccb) { free(free_ccb, M_CAMCCB); } /* Private XPT functions */ /* * Get a CAM control block for the caller. Charge the structure to the device * referenced by the path. If we don't have sufficient resources to allocate * more ccbs, we return NULL. */ static union ccb * xpt_get_ccb_nowait(struct cam_periph *periph) { union ccb *new_ccb; new_ccb = malloc(sizeof(*new_ccb), M_CAMCCB, M_ZERO|M_NOWAIT); if (new_ccb == NULL) return (NULL); periph->periph_allocated++; cam_ccbq_take_opening(&periph->path->device->ccbq); return (new_ccb); } static union ccb * xpt_get_ccb(struct cam_periph *periph) { union ccb *new_ccb; cam_periph_unlock(periph); new_ccb = malloc(sizeof(*new_ccb), M_CAMCCB, M_ZERO|M_WAITOK); cam_periph_lock(periph); periph->periph_allocated++; cam_ccbq_take_opening(&periph->path->device->ccbq); return (new_ccb); } union ccb * cam_periph_getccb(struct cam_periph *periph, u_int32_t priority) { struct ccb_hdr *ccb_h; CAM_DEBUG(periph->path, CAM_DEBUG_TRACE, ("cam_periph_getccb\n")); cam_periph_assert(periph, MA_OWNED); while ((ccb_h = SLIST_FIRST(&periph->ccb_list)) == NULL || ccb_h->pinfo.priority != priority) { if (priority < periph->immediate_priority) { periph->immediate_priority = priority; xpt_run_allocq(periph, 0); } else cam_periph_sleep(periph, &periph->ccb_list, PRIBIO, "cgticb", 0); } SLIST_REMOVE_HEAD(&periph->ccb_list, periph_links.sle); return ((union ccb *)ccb_h); } static void xpt_acquire_bus(struct cam_eb *bus) { xpt_lock_buses(); bus->refcount++; xpt_unlock_buses(); } static void xpt_release_bus(struct cam_eb *bus) { xpt_lock_buses(); KASSERT(bus->refcount >= 1, ("bus->refcount >= 1")); if (--bus->refcount > 0) { xpt_unlock_buses(); return; } TAILQ_REMOVE(&xsoftc.xpt_busses, bus, links); xsoftc.bus_generation++; xpt_unlock_buses(); KASSERT(TAILQ_EMPTY(&bus->et_entries), ("destroying bus, but target list is not empty")); cam_sim_release(bus->sim); mtx_destroy(&bus->eb_mtx); free(bus, M_CAMXPT); } static struct cam_et * xpt_alloc_target(struct cam_eb *bus, target_id_t target_id) { struct cam_et *cur_target, *target; mtx_assert(&xsoftc.xpt_topo_lock, MA_OWNED); mtx_assert(&bus->eb_mtx, MA_OWNED); target = (struct cam_et *)malloc(sizeof(*target), M_CAMXPT, M_NOWAIT|M_ZERO); if (target == NULL) return (NULL); TAILQ_INIT(&target->ed_entries); target->bus = bus; target->target_id = target_id; target->refcount = 1; target->generation = 0; target->luns = NULL; mtx_init(&target->luns_mtx, "CAM LUNs lock", NULL, MTX_DEF); timevalclear(&target->last_reset); /* * Hold a reference to our parent bus so it * will not go away before we do. */ bus->refcount++; /* Insertion sort into our bus's target list */ cur_target = TAILQ_FIRST(&bus->et_entries); while (cur_target != NULL && cur_target->target_id < target_id) cur_target = TAILQ_NEXT(cur_target, links); if (cur_target != NULL) { TAILQ_INSERT_BEFORE(cur_target, target, links); } else { TAILQ_INSERT_TAIL(&bus->et_entries, target, links); } bus->generation++; return (target); } static void xpt_acquire_target(struct cam_et *target) { struct cam_eb *bus = target->bus; mtx_lock(&bus->eb_mtx); target->refcount++; mtx_unlock(&bus->eb_mtx); } static void xpt_release_target(struct cam_et *target) { struct cam_eb *bus = target->bus; mtx_lock(&bus->eb_mtx); if (--target->refcount > 0) { mtx_unlock(&bus->eb_mtx); return; } TAILQ_REMOVE(&bus->et_entries, target, links); bus->generation++; mtx_unlock(&bus->eb_mtx); KASSERT(TAILQ_EMPTY(&target->ed_entries), ("destroying target, but device list is not empty")); xpt_release_bus(bus); mtx_destroy(&target->luns_mtx); if (target->luns) free(target->luns, M_CAMXPT); free(target, M_CAMXPT); } static struct cam_ed * xpt_alloc_device_default(struct cam_eb *bus, struct cam_et *target, lun_id_t lun_id) { struct cam_ed *device; device = xpt_alloc_device(bus, target, lun_id); if (device == NULL) return (NULL); device->mintags = 1; device->maxtags = 1; return (device); } static void xpt_destroy_device(void *context, int pending) { struct cam_ed *device = context; mtx_lock(&device->device_mtx); mtx_destroy(&device->device_mtx); free(device, M_CAMDEV); } struct cam_ed * xpt_alloc_device(struct cam_eb *bus, struct cam_et *target, lun_id_t lun_id) { struct cam_ed *cur_device, *device; struct cam_devq *devq; cam_status status; mtx_assert(&bus->eb_mtx, MA_OWNED); /* Make space for us in the device queue on our bus */ devq = bus->sim->devq; mtx_lock(&devq->send_mtx); status = cam_devq_resize(devq, devq->send_queue.array_size + 1); mtx_unlock(&devq->send_mtx); if (status != CAM_REQ_CMP) return (NULL); device = (struct cam_ed *)malloc(sizeof(*device), M_CAMDEV, M_NOWAIT|M_ZERO); if (device == NULL) return (NULL); cam_init_pinfo(&device->devq_entry); device->target = target; device->lun_id = lun_id; device->sim = bus->sim; if (cam_ccbq_init(&device->ccbq, bus->sim->max_dev_openings) != 0) { free(device, M_CAMDEV); return (NULL); } SLIST_INIT(&device->asyncs); SLIST_INIT(&device->periphs); device->generation = 0; device->flags = CAM_DEV_UNCONFIGURED; device->tag_delay_count = 0; device->tag_saved_openings = 0; device->refcount = 1; mtx_init(&device->device_mtx, "CAM device lock", NULL, MTX_DEF); callout_init_mtx(&device->callout, &devq->send_mtx, 0); TASK_INIT(&device->device_destroy_task, 0, xpt_destroy_device, device); /* * Hold a reference to our parent bus so it * will not go away before we do. */ target->refcount++; cur_device = TAILQ_FIRST(&target->ed_entries); while (cur_device != NULL && cur_device->lun_id < lun_id) cur_device = TAILQ_NEXT(cur_device, links); if (cur_device != NULL) TAILQ_INSERT_BEFORE(cur_device, device, links); else TAILQ_INSERT_TAIL(&target->ed_entries, device, links); target->generation++; return (device); } void xpt_acquire_device(struct cam_ed *device) { struct cam_eb *bus = device->target->bus; mtx_lock(&bus->eb_mtx); device->refcount++; mtx_unlock(&bus->eb_mtx); } void xpt_release_device(struct cam_ed *device) { struct cam_eb *bus = device->target->bus; struct cam_devq *devq; mtx_lock(&bus->eb_mtx); if (--device->refcount > 0) { mtx_unlock(&bus->eb_mtx); return; } TAILQ_REMOVE(&device->target->ed_entries, device,links); device->target->generation++; mtx_unlock(&bus->eb_mtx); /* Release our slot in the devq */ devq = bus->sim->devq; mtx_lock(&devq->send_mtx); cam_devq_resize(devq, devq->send_queue.array_size - 1); mtx_unlock(&devq->send_mtx); KASSERT(SLIST_EMPTY(&device->periphs), ("destroying device, but periphs list is not empty")); KASSERT(device->devq_entry.index == CAM_UNQUEUED_INDEX, ("destroying device while still queued for ccbs")); if ((device->flags & CAM_DEV_REL_TIMEOUT_PENDING) != 0) callout_stop(&device->callout); xpt_release_target(device->target); cam_ccbq_fini(&device->ccbq); /* * Free allocated memory. free(9) does nothing if the * supplied pointer is NULL, so it is safe to call without * checking. */ free(device->supported_vpds, M_CAMXPT); free(device->device_id, M_CAMXPT); free(device->ext_inq, M_CAMXPT); free(device->physpath, M_CAMXPT); free(device->rcap_buf, M_CAMXPT); free(device->serial_num, M_CAMXPT); taskqueue_enqueue(xsoftc.xpt_taskq, &device->device_destroy_task); } u_int32_t xpt_dev_ccbq_resize(struct cam_path *path, int newopenings) { int result; struct cam_ed *dev; dev = path->device; mtx_lock(&dev->sim->devq->send_mtx); result = cam_ccbq_resize(&dev->ccbq, newopenings); mtx_unlock(&dev->sim->devq->send_mtx); if ((dev->flags & CAM_DEV_TAG_AFTER_COUNT) != 0 || (dev->inq_flags & SID_CmdQue) != 0) dev->tag_saved_openings = newopenings; return (result); } static struct cam_eb * xpt_find_bus(path_id_t path_id) { struct cam_eb *bus; xpt_lock_buses(); for (bus = TAILQ_FIRST(&xsoftc.xpt_busses); bus != NULL; bus = TAILQ_NEXT(bus, links)) { if (bus->path_id == path_id) { bus->refcount++; break; } } xpt_unlock_buses(); return (bus); } static struct cam_et * xpt_find_target(struct cam_eb *bus, target_id_t target_id) { struct cam_et *target; mtx_assert(&bus->eb_mtx, MA_OWNED); for (target = TAILQ_FIRST(&bus->et_entries); target != NULL; target = TAILQ_NEXT(target, links)) { if (target->target_id == target_id) { target->refcount++; break; } } return (target); } static struct cam_ed * xpt_find_device(struct cam_et *target, lun_id_t lun_id) { struct cam_ed *device; mtx_assert(&target->bus->eb_mtx, MA_OWNED); for (device = TAILQ_FIRST(&target->ed_entries); device != NULL; device = TAILQ_NEXT(device, links)) { if (device->lun_id == lun_id) { device->refcount++; break; } } return (device); } void xpt_start_tags(struct cam_path *path) { struct ccb_relsim crs; struct cam_ed *device; struct cam_sim *sim; int newopenings; device = path->device; sim = path->bus->sim; device->flags &= ~CAM_DEV_TAG_AFTER_COUNT; xpt_freeze_devq(path, /*count*/1); device->inq_flags |= SID_CmdQue; if (device->tag_saved_openings != 0) newopenings = device->tag_saved_openings; else newopenings = min(device->maxtags, sim->max_tagged_dev_openings); xpt_dev_ccbq_resize(path, newopenings); xpt_async(AC_GETDEV_CHANGED, path, NULL); xpt_setup_ccb(&crs.ccb_h, path, CAM_PRIORITY_NORMAL); crs.ccb_h.func_code = XPT_REL_SIMQ; crs.release_flags = RELSIM_RELEASE_AFTER_QEMPTY; crs.openings = crs.release_timeout = crs.qfrozen_cnt = 0; xpt_action((union ccb *)&crs); } void xpt_stop_tags(struct cam_path *path) { struct ccb_relsim crs; struct cam_ed *device; struct cam_sim *sim; device = path->device; sim = path->bus->sim; device->flags &= ~CAM_DEV_TAG_AFTER_COUNT; device->tag_delay_count = 0; xpt_freeze_devq(path, /*count*/1); device->inq_flags &= ~SID_CmdQue; xpt_dev_ccbq_resize(path, sim->max_dev_openings); xpt_async(AC_GETDEV_CHANGED, path, NULL); xpt_setup_ccb(&crs.ccb_h, path, CAM_PRIORITY_NORMAL); crs.ccb_h.func_code = XPT_REL_SIMQ; crs.release_flags = RELSIM_RELEASE_AFTER_QEMPTY; crs.openings = crs.release_timeout = crs.qfrozen_cnt = 0; xpt_action((union ccb *)&crs); } static void xpt_boot_delay(void *arg) { xpt_release_boot(); } static void xpt_config(void *arg) { /* * Now that interrupts are enabled, go find our devices */ if (taskqueue_start_threads(&xsoftc.xpt_taskq, 1, PRIBIO, "CAM taskq")) printf("xpt_config: failed to create taskqueue thread.\n"); /* Setup debugging path */ if (cam_dflags != CAM_DEBUG_NONE) { if (xpt_create_path(&cam_dpath, NULL, CAM_DEBUG_BUS, CAM_DEBUG_TARGET, CAM_DEBUG_LUN) != CAM_REQ_CMP) { printf("xpt_config: xpt_create_path() failed for debug" " target %d:%d:%d, debugging disabled\n", CAM_DEBUG_BUS, CAM_DEBUG_TARGET, CAM_DEBUG_LUN); cam_dflags = CAM_DEBUG_NONE; } } else cam_dpath = NULL; periphdriver_init(1); xpt_hold_boot(); callout_init(&xsoftc.boot_callout, 1); callout_reset_sbt(&xsoftc.boot_callout, SBT_1MS * xsoftc.boot_delay, 0, xpt_boot_delay, NULL, 0); /* Fire up rescan thread. */ if (kproc_kthread_add(xpt_scanner_thread, NULL, &cam_proc, NULL, 0, 0, "cam", "scanner")) { printf("xpt_config: failed to create rescan thread.\n"); } } void xpt_hold_boot(void) { xpt_lock_buses(); xsoftc.buses_to_config++; xpt_unlock_buses(); } void xpt_release_boot(void) { xpt_lock_buses(); xsoftc.buses_to_config--; if (xsoftc.buses_to_config == 0 && xsoftc.buses_config_done == 0) { struct xpt_task *task; xsoftc.buses_config_done = 1; xpt_unlock_buses(); /* Call manually because we don't have any busses */ task = malloc(sizeof(struct xpt_task), M_CAMXPT, M_NOWAIT); if (task != NULL) { TASK_INIT(&task->task, 0, xpt_finishconfig_task, task); taskqueue_enqueue(taskqueue_thread, &task->task); } } else xpt_unlock_buses(); } /* * If the given device only has one peripheral attached to it, and if that * peripheral is the passthrough driver, announce it. This insures that the * user sees some sort of announcement for every peripheral in their system. */ static int xptpassannouncefunc(struct cam_ed *device, void *arg) { struct cam_periph *periph; int i; for (periph = SLIST_FIRST(&device->periphs), i = 0; periph != NULL; periph = SLIST_NEXT(periph, periph_links), i++); periph = SLIST_FIRST(&device->periphs); if ((i == 1) && (strncmp(periph->periph_name, "pass", 4) == 0)) xpt_announce_periph(periph, NULL); return(1); } static void xpt_finishconfig_task(void *context, int pending) { periphdriver_init(2); /* * Check for devices with no "standard" peripheral driver * attached. For any devices like that, announce the * passthrough driver so the user will see something. */ if (!bootverbose) xpt_for_all_devices(xptpassannouncefunc, NULL); /* Release our hook so that the boot can continue. */ config_intrhook_disestablish(xsoftc.xpt_config_hook); free(xsoftc.xpt_config_hook, M_CAMXPT); xsoftc.xpt_config_hook = NULL; free(context, M_CAMXPT); } cam_status xpt_register_async(int event, ac_callback_t *cbfunc, void *cbarg, struct cam_path *path) { struct ccb_setasync csa; cam_status status; int xptpath = 0; if (path == NULL) { status = xpt_create_path(&path, /*periph*/NULL, CAM_XPT_PATH_ID, CAM_TARGET_WILDCARD, CAM_LUN_WILDCARD); if (status != CAM_REQ_CMP) return (status); xpt_path_lock(path); xptpath = 1; } xpt_setup_ccb(&csa.ccb_h, path, CAM_PRIORITY_NORMAL); csa.ccb_h.func_code = XPT_SASYNC_CB; csa.event_enable = event; csa.callback = cbfunc; csa.callback_arg = cbarg; xpt_action((union ccb *)&csa); status = csa.ccb_h.status; if (xptpath) { xpt_path_unlock(path); xpt_free_path(path); } if ((status == CAM_REQ_CMP) && (csa.event_enable & AC_FOUND_DEVICE)) { /* * Get this peripheral up to date with all * the currently existing devices. */ xpt_for_all_devices(xptsetasyncfunc, &csa); } if ((status == CAM_REQ_CMP) && (csa.event_enable & AC_PATH_REGISTERED)) { /* * Get this peripheral up to date with all * the currently existing busses. */ xpt_for_all_busses(xptsetasyncbusfunc, &csa); } return (status); } static void xptaction(struct cam_sim *sim, union ccb *work_ccb) { CAM_DEBUG(work_ccb->ccb_h.path, CAM_DEBUG_TRACE, ("xptaction\n")); switch (work_ccb->ccb_h.func_code) { /* Common cases first */ case XPT_PATH_INQ: /* Path routing inquiry */ { struct ccb_pathinq *cpi; cpi = &work_ccb->cpi; cpi->version_num = 1; /* XXX??? */ cpi->hba_inquiry = 0; cpi->target_sprt = 0; cpi->hba_misc = 0; cpi->hba_eng_cnt = 0; cpi->max_target = 0; cpi->max_lun = 0; cpi->initiator_id = 0; strncpy(cpi->sim_vid, "FreeBSD", SIM_IDLEN); strncpy(cpi->hba_vid, "", HBA_IDLEN); strncpy(cpi->dev_name, sim->sim_name, DEV_IDLEN); cpi->unit_number = sim->unit_number; cpi->bus_id = sim->bus_id; cpi->base_transfer_speed = 0; cpi->protocol = PROTO_UNSPECIFIED; cpi->protocol_version = PROTO_VERSION_UNSPECIFIED; cpi->transport = XPORT_UNSPECIFIED; cpi->transport_version = XPORT_VERSION_UNSPECIFIED; cpi->ccb_h.status = CAM_REQ_CMP; xpt_done(work_ccb); break; } default: work_ccb->ccb_h.status = CAM_REQ_INVALID; xpt_done(work_ccb); break; } } /* * The xpt as a "controller" has no interrupt sources, so polling * is a no-op. */ static void xptpoll(struct cam_sim *sim) { } void xpt_lock_buses(void) { mtx_lock(&xsoftc.xpt_topo_lock); } void xpt_unlock_buses(void) { mtx_unlock(&xsoftc.xpt_topo_lock); } struct mtx * xpt_path_mtx(struct cam_path *path) { return (&path->device->device_mtx); } static void xpt_done_process(struct ccb_hdr *ccb_h) { struct cam_sim *sim; struct cam_devq *devq; struct mtx *mtx = NULL; if (ccb_h->flags & CAM_HIGH_POWER) { struct highpowerlist *hphead; struct cam_ed *device; mtx_lock(&xsoftc.xpt_highpower_lock); hphead = &xsoftc.highpowerq; device = STAILQ_FIRST(hphead); /* * Increment the count since this command is done. */ xsoftc.num_highpower++; /* * Any high powered commands queued up? */ if (device != NULL) { STAILQ_REMOVE_HEAD(hphead, highpowerq_entry); mtx_unlock(&xsoftc.xpt_highpower_lock); mtx_lock(&device->sim->devq->send_mtx); xpt_release_devq_device(device, /*count*/1, /*runqueue*/TRUE); mtx_unlock(&device->sim->devq->send_mtx); } else mtx_unlock(&xsoftc.xpt_highpower_lock); } sim = ccb_h->path->bus->sim; if (ccb_h->status & CAM_RELEASE_SIMQ) { xpt_release_simq(sim, /*run_queue*/FALSE); ccb_h->status &= ~CAM_RELEASE_SIMQ; } if ((ccb_h->flags & CAM_DEV_QFRZDIS) && (ccb_h->status & CAM_DEV_QFRZN)) { xpt_release_devq(ccb_h->path, /*count*/1, /*run_queue*/TRUE); ccb_h->status &= ~CAM_DEV_QFRZN; } devq = sim->devq; if ((ccb_h->func_code & XPT_FC_USER_CCB) == 0) { struct cam_ed *dev = ccb_h->path->device; mtx_lock(&devq->send_mtx); devq->send_active--; devq->send_openings++; cam_ccbq_ccb_done(&dev->ccbq, (union ccb *)ccb_h); if (((dev->flags & CAM_DEV_REL_ON_QUEUE_EMPTY) != 0 && (dev->ccbq.dev_active == 0))) { dev->flags &= ~CAM_DEV_REL_ON_QUEUE_EMPTY; xpt_release_devq_device(dev, /*count*/1, /*run_queue*/FALSE); } if (((dev->flags & CAM_DEV_REL_ON_COMPLETE) != 0 && (ccb_h->status&CAM_STATUS_MASK) != CAM_REQUEUE_REQ)) { dev->flags &= ~CAM_DEV_REL_ON_COMPLETE; xpt_release_devq_device(dev, /*count*/1, /*run_queue*/FALSE); } if (!device_is_queued(dev)) (void)xpt_schedule_devq(devq, dev); xpt_run_devq(devq); mtx_unlock(&devq->send_mtx); if ((dev->flags & CAM_DEV_TAG_AFTER_COUNT) != 0) { mtx = xpt_path_mtx(ccb_h->path); mtx_lock(mtx); if ((dev->flags & CAM_DEV_TAG_AFTER_COUNT) != 0 && (--dev->tag_delay_count == 0)) xpt_start_tags(ccb_h->path); } } if ((ccb_h->flags & CAM_UNLOCKED) == 0) { if (mtx == NULL) { mtx = xpt_path_mtx(ccb_h->path); mtx_lock(mtx); } } else { if (mtx != NULL) { mtx_unlock(mtx); mtx = NULL; } } /* Call the peripheral driver's callback */ ccb_h->pinfo.index = CAM_UNQUEUED_INDEX; (*ccb_h->cbfcnp)(ccb_h->path->periph, (union ccb *)ccb_h); if (mtx != NULL) mtx_unlock(mtx); } void xpt_done_td(void *arg) { struct cam_doneq *queue = arg; struct ccb_hdr *ccb_h; STAILQ_HEAD(, ccb_hdr) doneq; STAILQ_INIT(&doneq); mtx_lock(&queue->cam_doneq_mtx); while (1) { while (STAILQ_EMPTY(&queue->cam_doneq)) { queue->cam_doneq_sleep = 1; msleep(&queue->cam_doneq, &queue->cam_doneq_mtx, PRIBIO, "-", 0); queue->cam_doneq_sleep = 0; } STAILQ_CONCAT(&doneq, &queue->cam_doneq); mtx_unlock(&queue->cam_doneq_mtx); THREAD_NO_SLEEPING(); while ((ccb_h = STAILQ_FIRST(&doneq)) != NULL) { STAILQ_REMOVE_HEAD(&doneq, sim_links.stqe); xpt_done_process(ccb_h); } THREAD_SLEEPING_OK(); mtx_lock(&queue->cam_doneq_mtx); } } static void camisr_runqueue(void) { struct ccb_hdr *ccb_h; struct cam_doneq *queue; int i; /* Process global queues. */ for (i = 0; i < cam_num_doneqs; i++) { queue = &cam_doneqs[i]; mtx_lock(&queue->cam_doneq_mtx); while ((ccb_h = STAILQ_FIRST(&queue->cam_doneq)) != NULL) { STAILQ_REMOVE_HEAD(&queue->cam_doneq, sim_links.stqe); mtx_unlock(&queue->cam_doneq_mtx); xpt_done_process(ccb_h); mtx_lock(&queue->cam_doneq_mtx); } mtx_unlock(&queue->cam_doneq_mtx); } } Index: user/ngie/more-tests/sys/compat/freebsd32/freebsd32_misc.c =================================================================== --- user/ngie/more-tests/sys/compat/freebsd32/freebsd32_misc.c (revision 281584) +++ user/ngie/more-tests/sys/compat/freebsd32/freebsd32_misc.c (revision 281585) @@ -1,3124 +1,3125 @@ /*- * Copyright (c) 2002 Doug Rabson * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ #include __FBSDID("$FreeBSD$"); #include "opt_compat.h" #include "opt_inet.h" #include "opt_inet6.h" #define __ELF_WORD_SIZE 32 #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include /* Must come after sys/malloc.h */ #include #include #include #include #include #include #include #include #include #include #include #include #include #include /* Must come after sys/selinfo.h */ #include /* Must come after sys/selinfo.h */ #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #ifdef INET #include #endif #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include FEATURE(compat_freebsd_32bit, "Compatible with 32-bit FreeBSD"); #ifndef __mips__ CTASSERT(sizeof(struct timeval32) == 8); CTASSERT(sizeof(struct timespec32) == 8); CTASSERT(sizeof(struct itimerval32) == 16); #endif CTASSERT(sizeof(struct statfs32) == 256); #ifndef __mips__ CTASSERT(sizeof(struct rusage32) == 72); #endif CTASSERT(sizeof(struct sigaltstack32) == 12); CTASSERT(sizeof(struct kevent32) == 20); CTASSERT(sizeof(struct iovec32) == 8); CTASSERT(sizeof(struct msghdr32) == 28); #ifndef __mips__ CTASSERT(sizeof(struct stat32) == 96); #endif CTASSERT(sizeof(struct sigaction32) == 24); static int freebsd32_kevent_copyout(void *arg, struct kevent *kevp, int count); static int freebsd32_kevent_copyin(void *arg, struct kevent *kevp, int count); void freebsd32_rusage_out(const struct rusage *s, struct rusage32 *s32) { TV_CP(*s, *s32, ru_utime); TV_CP(*s, *s32, ru_stime); CP(*s, *s32, ru_maxrss); CP(*s, *s32, ru_ixrss); CP(*s, *s32, ru_idrss); CP(*s, *s32, ru_isrss); CP(*s, *s32, ru_minflt); CP(*s, *s32, ru_majflt); CP(*s, *s32, ru_nswap); CP(*s, *s32, ru_inblock); CP(*s, *s32, ru_oublock); CP(*s, *s32, ru_msgsnd); CP(*s, *s32, ru_msgrcv); CP(*s, *s32, ru_nsignals); CP(*s, *s32, ru_nvcsw); CP(*s, *s32, ru_nivcsw); } int freebsd32_wait4(struct thread *td, struct freebsd32_wait4_args *uap) { int error, status; struct rusage32 ru32; struct rusage ru, *rup; if (uap->rusage != NULL) rup = &ru; else rup = NULL; error = kern_wait(td, uap->pid, &status, uap->options, rup); if (error) return (error); if (uap->status != NULL) error = copyout(&status, uap->status, sizeof(status)); if (uap->rusage != NULL && error == 0) { freebsd32_rusage_out(&ru, &ru32); error = copyout(&ru32, uap->rusage, sizeof(ru32)); } return (error); } int freebsd32_wait6(struct thread *td, struct freebsd32_wait6_args *uap) { struct wrusage32 wru32; struct __wrusage wru, *wrup; struct siginfo32 si32; struct __siginfo si, *sip; int error, status; if (uap->wrusage != NULL) wrup = &wru; else wrup = NULL; if (uap->info != NULL) { sip = &si; bzero(sip, sizeof(*sip)); } else sip = NULL; error = kern_wait6(td, uap->idtype, PAIR32TO64(id_t, uap->id), &status, uap->options, wrup, sip); if (error != 0) return (error); if (uap->status != NULL) error = copyout(&status, uap->status, sizeof(status)); if (uap->wrusage != NULL && error == 0) { freebsd32_rusage_out(&wru.wru_self, &wru32.wru_self); freebsd32_rusage_out(&wru.wru_children, &wru32.wru_children); error = copyout(&wru32, uap->wrusage, sizeof(wru32)); } if (uap->info != NULL && error == 0) { siginfo_to_siginfo32 (&si, &si32); error = copyout(&si32, uap->info, sizeof(si32)); } return (error); } #ifdef COMPAT_FREEBSD4 static void copy_statfs(struct statfs *in, struct statfs32 *out) { statfs_scale_blocks(in, INT32_MAX); bzero(out, sizeof(*out)); CP(*in, *out, f_bsize); out->f_iosize = MIN(in->f_iosize, INT32_MAX); CP(*in, *out, f_blocks); CP(*in, *out, f_bfree); CP(*in, *out, f_bavail); out->f_files = MIN(in->f_files, INT32_MAX); out->f_ffree = MIN(in->f_ffree, INT32_MAX); CP(*in, *out, f_fsid); CP(*in, *out, f_owner); CP(*in, *out, f_type); CP(*in, *out, f_flags); out->f_syncwrites = MIN(in->f_syncwrites, INT32_MAX); out->f_asyncwrites = MIN(in->f_asyncwrites, INT32_MAX); strlcpy(out->f_fstypename, in->f_fstypename, MFSNAMELEN); strlcpy(out->f_mntonname, in->f_mntonname, min(MNAMELEN, FREEBSD4_MNAMELEN)); out->f_syncreads = MIN(in->f_syncreads, INT32_MAX); out->f_asyncreads = MIN(in->f_asyncreads, INT32_MAX); strlcpy(out->f_mntfromname, in->f_mntfromname, min(MNAMELEN, FREEBSD4_MNAMELEN)); } #endif #ifdef COMPAT_FREEBSD4 int freebsd4_freebsd32_getfsstat(struct thread *td, struct freebsd4_freebsd32_getfsstat_args *uap) { struct statfs *buf, *sp; struct statfs32 stat32; size_t count, size; int error; count = uap->bufsize / sizeof(struct statfs32); size = count * sizeof(struct statfs); - error = kern_getfsstat(td, &buf, size, UIO_SYSSPACE, uap->flags); + error = kern_getfsstat(td, &buf, size, &count, UIO_SYSSPACE, uap->flags); if (size > 0) { - count = td->td_retval[0]; sp = buf; while (count > 0 && error == 0) { copy_statfs(sp, &stat32); error = copyout(&stat32, uap->buf, sizeof(stat32)); sp++; uap->buf++; count--; } free(buf, M_TEMP); } + if (error == 0) + td->td_retval[0] = count; return (error); } #endif int freebsd32_sigaltstack(struct thread *td, struct freebsd32_sigaltstack_args *uap) { struct sigaltstack32 s32; struct sigaltstack ss, oss, *ssp; int error; if (uap->ss != NULL) { error = copyin(uap->ss, &s32, sizeof(s32)); if (error) return (error); PTRIN_CP(s32, ss, ss_sp); CP(s32, ss, ss_size); CP(s32, ss, ss_flags); ssp = &ss; } else ssp = NULL; error = kern_sigaltstack(td, ssp, &oss); if (error == 0 && uap->oss != NULL) { PTROUT_CP(oss, s32, ss_sp); CP(oss, s32, ss_size); CP(oss, s32, ss_flags); error = copyout(&s32, uap->oss, sizeof(s32)); } return (error); } /* * Custom version of exec_copyin_args() so that we can translate * the pointers. */ int freebsd32_exec_copyin_args(struct image_args *args, char *fname, enum uio_seg segflg, u_int32_t *argv, u_int32_t *envv) { char *argp, *envp; u_int32_t *p32, arg; size_t length; int error; bzero(args, sizeof(*args)); if (argv == NULL) return (EFAULT); /* * Allocate demand-paged memory for the file name, argument, and * environment strings. */ error = exec_alloc_args(args); if (error != 0) return (error); /* * Copy the file name. */ if (fname != NULL) { args->fname = args->buf; error = (segflg == UIO_SYSSPACE) ? copystr(fname, args->fname, PATH_MAX, &length) : copyinstr(fname, args->fname, PATH_MAX, &length); if (error != 0) goto err_exit; } else length = 0; args->begin_argv = args->buf + length; args->endp = args->begin_argv; args->stringspace = ARG_MAX; /* * extract arguments first */ p32 = argv; for (;;) { error = copyin(p32++, &arg, sizeof(arg)); if (error) goto err_exit; if (arg == 0) break; argp = PTRIN(arg); error = copyinstr(argp, args->endp, args->stringspace, &length); if (error) { if (error == ENAMETOOLONG) error = E2BIG; goto err_exit; } args->stringspace -= length; args->endp += length; args->argc++; } args->begin_envv = args->endp; /* * extract environment strings */ if (envv) { p32 = envv; for (;;) { error = copyin(p32++, &arg, sizeof(arg)); if (error) goto err_exit; if (arg == 0) break; envp = PTRIN(arg); error = copyinstr(envp, args->endp, args->stringspace, &length); if (error) { if (error == ENAMETOOLONG) error = E2BIG; goto err_exit; } args->stringspace -= length; args->endp += length; args->envc++; } } return (0); err_exit: exec_free_args(args); return (error); } int freebsd32_execve(struct thread *td, struct freebsd32_execve_args *uap) { struct image_args eargs; int error; error = freebsd32_exec_copyin_args(&eargs, uap->fname, UIO_USERSPACE, uap->argv, uap->envv); if (error == 0) error = kern_execve(td, &eargs, NULL); return (error); } int freebsd32_fexecve(struct thread *td, struct freebsd32_fexecve_args *uap) { struct image_args eargs; int error; error = freebsd32_exec_copyin_args(&eargs, NULL, UIO_SYSSPACE, uap->argv, uap->envv); if (error == 0) { eargs.fd = uap->fd; error = kern_execve(td, &eargs, NULL); } return (error); } int freebsd32_mprotect(struct thread *td, struct freebsd32_mprotect_args *uap) { struct mprotect_args ap; ap.addr = PTRIN(uap->addr); ap.len = uap->len; ap.prot = uap->prot; #if defined(__amd64__) if (i386_read_exec && (ap.prot & PROT_READ) != 0) ap.prot |= PROT_EXEC; #endif return (sys_mprotect(td, &ap)); } int freebsd32_mmap(struct thread *td, struct freebsd32_mmap_args *uap) { struct mmap_args ap; vm_offset_t addr = (vm_offset_t) uap->addr; vm_size_t len = uap->len; int prot = uap->prot; int flags = uap->flags; int fd = uap->fd; off_t pos = PAIR32TO64(off_t,uap->pos); #if defined(__amd64__) if (i386_read_exec && (prot & PROT_READ)) prot |= PROT_EXEC; #endif ap.addr = (void *) addr; ap.len = len; ap.prot = prot; ap.flags = flags; ap.fd = fd; ap.pos = pos; return (sys_mmap(td, &ap)); } #ifdef COMPAT_FREEBSD6 int freebsd6_freebsd32_mmap(struct thread *td, struct freebsd6_freebsd32_mmap_args *uap) { struct freebsd32_mmap_args ap; ap.addr = uap->addr; ap.len = uap->len; ap.prot = uap->prot; ap.flags = uap->flags; ap.fd = uap->fd; ap.pos1 = uap->pos1; ap.pos2 = uap->pos2; return (freebsd32_mmap(td, &ap)); } #endif int freebsd32_setitimer(struct thread *td, struct freebsd32_setitimer_args *uap) { struct itimerval itv, oitv, *itvp; struct itimerval32 i32; int error; if (uap->itv != NULL) { error = copyin(uap->itv, &i32, sizeof(i32)); if (error) return (error); TV_CP(i32, itv, it_interval); TV_CP(i32, itv, it_value); itvp = &itv; } else itvp = NULL; error = kern_setitimer(td, uap->which, itvp, &oitv); if (error || uap->oitv == NULL) return (error); TV_CP(oitv, i32, it_interval); TV_CP(oitv, i32, it_value); return (copyout(&i32, uap->oitv, sizeof(i32))); } int freebsd32_getitimer(struct thread *td, struct freebsd32_getitimer_args *uap) { struct itimerval itv; struct itimerval32 i32; int error; error = kern_getitimer(td, uap->which, &itv); if (error || uap->itv == NULL) return (error); TV_CP(itv, i32, it_interval); TV_CP(itv, i32, it_value); return (copyout(&i32, uap->itv, sizeof(i32))); } int freebsd32_select(struct thread *td, struct freebsd32_select_args *uap) { struct timeval32 tv32; struct timeval tv, *tvp; int error; if (uap->tv != NULL) { error = copyin(uap->tv, &tv32, sizeof(tv32)); if (error) return (error); CP(tv32, tv, tv_sec); CP(tv32, tv, tv_usec); tvp = &tv; } else tvp = NULL; /* * XXX Do pointers need PTRIN()? */ return (kern_select(td, uap->nd, uap->in, uap->ou, uap->ex, tvp, sizeof(int32_t) * 8)); } int freebsd32_pselect(struct thread *td, struct freebsd32_pselect_args *uap) { struct timespec32 ts32; struct timespec ts; struct timeval tv, *tvp; sigset_t set, *uset; int error; if (uap->ts != NULL) { error = copyin(uap->ts, &ts32, sizeof(ts32)); if (error != 0) return (error); CP(ts32, ts, tv_sec); CP(ts32, ts, tv_nsec); TIMESPEC_TO_TIMEVAL(&tv, &ts); tvp = &tv; } else tvp = NULL; if (uap->sm != NULL) { error = copyin(uap->sm, &set, sizeof(set)); if (error != 0) return (error); uset = &set; } else uset = NULL; /* * XXX Do pointers need PTRIN()? */ error = kern_pselect(td, uap->nd, uap->in, uap->ou, uap->ex, tvp, uset, sizeof(int32_t) * 8); return (error); } /* * Copy 'count' items into the destination list pointed to by uap->eventlist. */ static int freebsd32_kevent_copyout(void *arg, struct kevent *kevp, int count) { struct freebsd32_kevent_args *uap; struct kevent32 ks32[KQ_NEVENTS]; int i, error = 0; KASSERT(count <= KQ_NEVENTS, ("count (%d) > KQ_NEVENTS", count)); uap = (struct freebsd32_kevent_args *)arg; for (i = 0; i < count; i++) { CP(kevp[i], ks32[i], ident); CP(kevp[i], ks32[i], filter); CP(kevp[i], ks32[i], flags); CP(kevp[i], ks32[i], fflags); CP(kevp[i], ks32[i], data); PTROUT_CP(kevp[i], ks32[i], udata); } error = copyout(ks32, uap->eventlist, count * sizeof *ks32); if (error == 0) uap->eventlist += count; return (error); } /* * Copy 'count' items from the list pointed to by uap->changelist. */ static int freebsd32_kevent_copyin(void *arg, struct kevent *kevp, int count) { struct freebsd32_kevent_args *uap; struct kevent32 ks32[KQ_NEVENTS]; int i, error = 0; KASSERT(count <= KQ_NEVENTS, ("count (%d) > KQ_NEVENTS", count)); uap = (struct freebsd32_kevent_args *)arg; error = copyin(uap->changelist, ks32, count * sizeof *ks32); if (error) goto done; uap->changelist += count; for (i = 0; i < count; i++) { CP(ks32[i], kevp[i], ident); CP(ks32[i], kevp[i], filter); CP(ks32[i], kevp[i], flags); CP(ks32[i], kevp[i], fflags); CP(ks32[i], kevp[i], data); PTRIN_CP(ks32[i], kevp[i], udata); } done: return (error); } int freebsd32_kevent(struct thread *td, struct freebsd32_kevent_args *uap) { struct timespec32 ts32; struct timespec ts, *tsp; struct kevent_copyops k_ops = { uap, freebsd32_kevent_copyout, freebsd32_kevent_copyin}; int error; if (uap->timeout) { error = copyin(uap->timeout, &ts32, sizeof(ts32)); if (error) return (error); CP(ts32, ts, tv_sec); CP(ts32, ts, tv_nsec); tsp = &ts; } else tsp = NULL; error = kern_kevent(td, uap->fd, uap->nchanges, uap->nevents, &k_ops, tsp); return (error); } int freebsd32_gettimeofday(struct thread *td, struct freebsd32_gettimeofday_args *uap) { struct timeval atv; struct timeval32 atv32; struct timezone rtz; int error = 0; if (uap->tp) { microtime(&atv); CP(atv, atv32, tv_sec); CP(atv, atv32, tv_usec); error = copyout(&atv32, uap->tp, sizeof (atv32)); } if (error == 0 && uap->tzp != NULL) { rtz.tz_minuteswest = tz_minuteswest; rtz.tz_dsttime = tz_dsttime; error = copyout(&rtz, uap->tzp, sizeof (rtz)); } return (error); } int freebsd32_getrusage(struct thread *td, struct freebsd32_getrusage_args *uap) { struct rusage32 s32; struct rusage s; int error; error = kern_getrusage(td, uap->who, &s); if (error) return (error); if (uap->rusage != NULL) { freebsd32_rusage_out(&s, &s32); error = copyout(&s32, uap->rusage, sizeof(s32)); } return (error); } static int freebsd32_copyinuio(struct iovec32 *iovp, u_int iovcnt, struct uio **uiop) { struct iovec32 iov32; struct iovec *iov; struct uio *uio; u_int iovlen; int error, i; *uiop = NULL; if (iovcnt > UIO_MAXIOV) return (EINVAL); iovlen = iovcnt * sizeof(struct iovec); uio = malloc(iovlen + sizeof *uio, M_IOV, M_WAITOK); iov = (struct iovec *)(uio + 1); for (i = 0; i < iovcnt; i++) { error = copyin(&iovp[i], &iov32, sizeof(struct iovec32)); if (error) { free(uio, M_IOV); return (error); } iov[i].iov_base = PTRIN(iov32.iov_base); iov[i].iov_len = iov32.iov_len; } uio->uio_iov = iov; uio->uio_iovcnt = iovcnt; uio->uio_segflg = UIO_USERSPACE; uio->uio_offset = -1; uio->uio_resid = 0; for (i = 0; i < iovcnt; i++) { if (iov->iov_len > INT_MAX - uio->uio_resid) { free(uio, M_IOV); return (EINVAL); } uio->uio_resid += iov->iov_len; iov++; } *uiop = uio; return (0); } int freebsd32_readv(struct thread *td, struct freebsd32_readv_args *uap) { struct uio *auio; int error; error = freebsd32_copyinuio(uap->iovp, uap->iovcnt, &auio); if (error) return (error); error = kern_readv(td, uap->fd, auio); free(auio, M_IOV); return (error); } int freebsd32_writev(struct thread *td, struct freebsd32_writev_args *uap) { struct uio *auio; int error; error = freebsd32_copyinuio(uap->iovp, uap->iovcnt, &auio); if (error) return (error); error = kern_writev(td, uap->fd, auio); free(auio, M_IOV); return (error); } int freebsd32_preadv(struct thread *td, struct freebsd32_preadv_args *uap) { struct uio *auio; int error; error = freebsd32_copyinuio(uap->iovp, uap->iovcnt, &auio); if (error) return (error); error = kern_preadv(td, uap->fd, auio, PAIR32TO64(off_t,uap->offset)); free(auio, M_IOV); return (error); } int freebsd32_pwritev(struct thread *td, struct freebsd32_pwritev_args *uap) { struct uio *auio; int error; error = freebsd32_copyinuio(uap->iovp, uap->iovcnt, &auio); if (error) return (error); error = kern_pwritev(td, uap->fd, auio, PAIR32TO64(off_t,uap->offset)); free(auio, M_IOV); return (error); } int freebsd32_copyiniov(struct iovec32 *iovp32, u_int iovcnt, struct iovec **iovp, int error) { struct iovec32 iov32; struct iovec *iov; u_int iovlen; int i; *iovp = NULL; if (iovcnt > UIO_MAXIOV) return (error); iovlen = iovcnt * sizeof(struct iovec); iov = malloc(iovlen, M_IOV, M_WAITOK); for (i = 0; i < iovcnt; i++) { error = copyin(&iovp32[i], &iov32, sizeof(struct iovec32)); if (error) { free(iov, M_IOV); return (error); } iov[i].iov_base = PTRIN(iov32.iov_base); iov[i].iov_len = iov32.iov_len; } *iovp = iov; return (0); } static int freebsd32_copyinmsghdr(struct msghdr32 *msg32, struct msghdr *msg) { struct msghdr32 m32; int error; error = copyin(msg32, &m32, sizeof(m32)); if (error) return (error); msg->msg_name = PTRIN(m32.msg_name); msg->msg_namelen = m32.msg_namelen; msg->msg_iov = PTRIN(m32.msg_iov); msg->msg_iovlen = m32.msg_iovlen; msg->msg_control = PTRIN(m32.msg_control); msg->msg_controllen = m32.msg_controllen; msg->msg_flags = m32.msg_flags; return (0); } static int freebsd32_copyoutmsghdr(struct msghdr *msg, struct msghdr32 *msg32) { struct msghdr32 m32; int error; m32.msg_name = PTROUT(msg->msg_name); m32.msg_namelen = msg->msg_namelen; m32.msg_iov = PTROUT(msg->msg_iov); m32.msg_iovlen = msg->msg_iovlen; m32.msg_control = PTROUT(msg->msg_control); m32.msg_controllen = msg->msg_controllen; m32.msg_flags = msg->msg_flags; error = copyout(&m32, msg32, sizeof(m32)); return (error); } #ifndef __mips__ #define FREEBSD32_ALIGNBYTES (sizeof(int) - 1) #else #define FREEBSD32_ALIGNBYTES (sizeof(long) - 1) #endif #define FREEBSD32_ALIGN(p) \ (((u_long)(p) + FREEBSD32_ALIGNBYTES) & ~FREEBSD32_ALIGNBYTES) #define FREEBSD32_CMSG_SPACE(l) \ (FREEBSD32_ALIGN(sizeof(struct cmsghdr)) + FREEBSD32_ALIGN(l)) #define FREEBSD32_CMSG_DATA(cmsg) ((unsigned char *)(cmsg) + \ FREEBSD32_ALIGN(sizeof(struct cmsghdr))) static int freebsd32_copy_msg_out(struct msghdr *msg, struct mbuf *control) { struct cmsghdr *cm; void *data; socklen_t clen, datalen; int error; caddr_t ctlbuf; int len, maxlen, copylen; struct mbuf *m; error = 0; len = msg->msg_controllen; maxlen = msg->msg_controllen; msg->msg_controllen = 0; m = control; ctlbuf = msg->msg_control; while (m && len > 0) { cm = mtod(m, struct cmsghdr *); clen = m->m_len; while (cm != NULL) { if (sizeof(struct cmsghdr) > clen || cm->cmsg_len > clen) { error = EINVAL; break; } data = CMSG_DATA(cm); datalen = (caddr_t)cm + cm->cmsg_len - (caddr_t)data; /* Adjust message length */ cm->cmsg_len = FREEBSD32_ALIGN(sizeof(struct cmsghdr)) + datalen; /* Copy cmsghdr */ copylen = sizeof(struct cmsghdr); if (len < copylen) { msg->msg_flags |= MSG_CTRUNC; copylen = len; } error = copyout(cm,ctlbuf,copylen); if (error) goto exit; ctlbuf += FREEBSD32_ALIGN(copylen); len -= FREEBSD32_ALIGN(copylen); if (len <= 0) break; /* Copy data */ copylen = datalen; if (len < copylen) { msg->msg_flags |= MSG_CTRUNC; copylen = len; } error = copyout(data,ctlbuf,copylen); if (error) goto exit; ctlbuf += FREEBSD32_ALIGN(copylen); len -= FREEBSD32_ALIGN(copylen); if (CMSG_SPACE(datalen) < clen) { clen -= CMSG_SPACE(datalen); cm = (struct cmsghdr *) ((caddr_t)cm + CMSG_SPACE(datalen)); } else { clen = 0; cm = NULL; } } m = m->m_next; } msg->msg_controllen = (len <= 0) ? maxlen : ctlbuf - (caddr_t)msg->msg_control; exit: return (error); } int freebsd32_recvmsg(td, uap) struct thread *td; struct freebsd32_recvmsg_args /* { int s; struct msghdr32 *msg; int flags; } */ *uap; { struct msghdr msg; struct msghdr32 m32; struct iovec *uiov, *iov; struct mbuf *control = NULL; struct mbuf **controlp; int error; error = copyin(uap->msg, &m32, sizeof(m32)); if (error) return (error); error = freebsd32_copyinmsghdr(uap->msg, &msg); if (error) return (error); error = freebsd32_copyiniov(PTRIN(m32.msg_iov), m32.msg_iovlen, &iov, EMSGSIZE); if (error) return (error); msg.msg_flags = uap->flags; uiov = msg.msg_iov; msg.msg_iov = iov; controlp = (msg.msg_control != NULL) ? &control : NULL; error = kern_recvit(td, uap->s, &msg, UIO_USERSPACE, controlp); if (error == 0) { msg.msg_iov = uiov; if (control != NULL) error = freebsd32_copy_msg_out(&msg, control); else msg.msg_controllen = 0; if (error == 0) error = freebsd32_copyoutmsghdr(&msg, uap->msg); } free(iov, M_IOV); if (control != NULL) m_freem(control); return (error); } /* * Copy-in the array of control messages constructed using alignment * and padding suitable for a 32-bit environment and construct an * mbuf using alignment and padding suitable for a 64-bit kernel. * The alignment and padding are defined indirectly by CMSG_DATA(), * CMSG_SPACE() and CMSG_LEN(). */ static int freebsd32_copyin_control(struct mbuf **mp, caddr_t buf, u_int buflen) { struct mbuf *m; void *md; u_int idx, len, msglen; int error; buflen = FREEBSD32_ALIGN(buflen); if (buflen > MCLBYTES) return (EINVAL); /* * Iterate over the buffer and get the length of each message * in there. This has 32-bit alignment and padding. Use it to * determine the length of these messages when using 64-bit * alignment and padding. */ idx = 0; len = 0; while (idx < buflen) { error = copyin(buf + idx, &msglen, sizeof(msglen)); if (error) return (error); if (msglen < sizeof(struct cmsghdr)) return (EINVAL); msglen = FREEBSD32_ALIGN(msglen); if (idx + msglen > buflen) return (EINVAL); idx += msglen; msglen += CMSG_ALIGN(sizeof(struct cmsghdr)) - FREEBSD32_ALIGN(sizeof(struct cmsghdr)); len += CMSG_ALIGN(msglen); } if (len > MCLBYTES) return (EINVAL); m = m_get(M_WAITOK, MT_CONTROL); if (len > MLEN) MCLGET(m, M_WAITOK); m->m_len = len; md = mtod(m, void *); while (buflen > 0) { error = copyin(buf, md, sizeof(struct cmsghdr)); if (error) break; msglen = *(u_int *)md; msglen = FREEBSD32_ALIGN(msglen); /* Modify the message length to account for alignment. */ *(u_int *)md = msglen + CMSG_ALIGN(sizeof(struct cmsghdr)) - FREEBSD32_ALIGN(sizeof(struct cmsghdr)); md = (char *)md + CMSG_ALIGN(sizeof(struct cmsghdr)); buf += FREEBSD32_ALIGN(sizeof(struct cmsghdr)); buflen -= FREEBSD32_ALIGN(sizeof(struct cmsghdr)); msglen -= FREEBSD32_ALIGN(sizeof(struct cmsghdr)); if (msglen > 0) { error = copyin(buf, md, msglen); if (error) break; md = (char *)md + CMSG_ALIGN(msglen); buf += msglen; buflen -= msglen; } } if (error) m_free(m); else *mp = m; return (error); } int freebsd32_sendmsg(struct thread *td, struct freebsd32_sendmsg_args *uap) { struct msghdr msg; struct msghdr32 m32; struct iovec *iov; struct mbuf *control = NULL; struct sockaddr *to = NULL; int error; error = copyin(uap->msg, &m32, sizeof(m32)); if (error) return (error); error = freebsd32_copyinmsghdr(uap->msg, &msg); if (error) return (error); error = freebsd32_copyiniov(PTRIN(m32.msg_iov), m32.msg_iovlen, &iov, EMSGSIZE); if (error) return (error); msg.msg_iov = iov; if (msg.msg_name != NULL) { error = getsockaddr(&to, msg.msg_name, msg.msg_namelen); if (error) { to = NULL; goto out; } msg.msg_name = to; } if (msg.msg_control) { if (msg.msg_controllen < sizeof(struct cmsghdr)) { error = EINVAL; goto out; } error = freebsd32_copyin_control(&control, msg.msg_control, msg.msg_controllen); if (error) goto out; msg.msg_control = NULL; msg.msg_controllen = 0; } error = kern_sendit(td, uap->s, &msg, uap->flags, control, UIO_USERSPACE); out: free(iov, M_IOV); if (to) free(to, M_SONAME); return (error); } int freebsd32_recvfrom(struct thread *td, struct freebsd32_recvfrom_args *uap) { struct msghdr msg; struct iovec aiov; int error; if (uap->fromlenaddr) { error = copyin(PTRIN(uap->fromlenaddr), &msg.msg_namelen, sizeof(msg.msg_namelen)); if (error) return (error); } else { msg.msg_namelen = 0; } msg.msg_name = PTRIN(uap->from); msg.msg_iov = &aiov; msg.msg_iovlen = 1; aiov.iov_base = PTRIN(uap->buf); aiov.iov_len = uap->len; msg.msg_control = NULL; msg.msg_flags = uap->flags; error = kern_recvit(td, uap->s, &msg, UIO_USERSPACE, NULL); if (error == 0 && uap->fromlenaddr) error = copyout(&msg.msg_namelen, PTRIN(uap->fromlenaddr), sizeof (msg.msg_namelen)); return (error); } int freebsd32_settimeofday(struct thread *td, struct freebsd32_settimeofday_args *uap) { struct timeval32 tv32; struct timeval tv, *tvp; struct timezone tz, *tzp; int error; if (uap->tv) { error = copyin(uap->tv, &tv32, sizeof(tv32)); if (error) return (error); CP(tv32, tv, tv_sec); CP(tv32, tv, tv_usec); tvp = &tv; } else tvp = NULL; if (uap->tzp) { error = copyin(uap->tzp, &tz, sizeof(tz)); if (error) return (error); tzp = &tz; } else tzp = NULL; return (kern_settimeofday(td, tvp, tzp)); } int freebsd32_utimes(struct thread *td, struct freebsd32_utimes_args *uap) { struct timeval32 s32[2]; struct timeval s[2], *sp; int error; if (uap->tptr != NULL) { error = copyin(uap->tptr, s32, sizeof(s32)); if (error) return (error); CP(s32[0], s[0], tv_sec); CP(s32[0], s[0], tv_usec); CP(s32[1], s[1], tv_sec); CP(s32[1], s[1], tv_usec); sp = s; } else sp = NULL; return (kern_utimesat(td, AT_FDCWD, uap->path, UIO_USERSPACE, sp, UIO_SYSSPACE)); } int freebsd32_lutimes(struct thread *td, struct freebsd32_lutimes_args *uap) { struct timeval32 s32[2]; struct timeval s[2], *sp; int error; if (uap->tptr != NULL) { error = copyin(uap->tptr, s32, sizeof(s32)); if (error) return (error); CP(s32[0], s[0], tv_sec); CP(s32[0], s[0], tv_usec); CP(s32[1], s[1], tv_sec); CP(s32[1], s[1], tv_usec); sp = s; } else sp = NULL; return (kern_lutimes(td, uap->path, UIO_USERSPACE, sp, UIO_SYSSPACE)); } int freebsd32_futimes(struct thread *td, struct freebsd32_futimes_args *uap) { struct timeval32 s32[2]; struct timeval s[2], *sp; int error; if (uap->tptr != NULL) { error = copyin(uap->tptr, s32, sizeof(s32)); if (error) return (error); CP(s32[0], s[0], tv_sec); CP(s32[0], s[0], tv_usec); CP(s32[1], s[1], tv_sec); CP(s32[1], s[1], tv_usec); sp = s; } else sp = NULL; return (kern_futimes(td, uap->fd, sp, UIO_SYSSPACE)); } int freebsd32_futimesat(struct thread *td, struct freebsd32_futimesat_args *uap) { struct timeval32 s32[2]; struct timeval s[2], *sp; int error; if (uap->times != NULL) { error = copyin(uap->times, s32, sizeof(s32)); if (error) return (error); CP(s32[0], s[0], tv_sec); CP(s32[0], s[0], tv_usec); CP(s32[1], s[1], tv_sec); CP(s32[1], s[1], tv_usec); sp = s; } else sp = NULL; return (kern_utimesat(td, uap->fd, uap->path, UIO_USERSPACE, sp, UIO_SYSSPACE)); } int freebsd32_futimens(struct thread *td, struct freebsd32_futimens_args *uap) { struct timespec32 ts32[2]; struct timespec ts[2], *tsp; int error; if (uap->times != NULL) { error = copyin(uap->times, ts32, sizeof(ts32)); if (error) return (error); CP(ts32[0], ts[0], tv_sec); CP(ts32[0], ts[0], tv_nsec); CP(ts32[1], ts[1], tv_sec); CP(ts32[1], ts[1], tv_nsec); tsp = ts; } else tsp = NULL; return (kern_futimens(td, uap->fd, tsp, UIO_SYSSPACE)); } int freebsd32_utimensat(struct thread *td, struct freebsd32_utimensat_args *uap) { struct timespec32 ts32[2]; struct timespec ts[2], *tsp; int error; if (uap->times != NULL) { error = copyin(uap->times, ts32, sizeof(ts32)); if (error) return (error); CP(ts32[0], ts[0], tv_sec); CP(ts32[0], ts[0], tv_nsec); CP(ts32[1], ts[1], tv_sec); CP(ts32[1], ts[1], tv_nsec); tsp = ts; } else tsp = NULL; return (kern_utimensat(td, uap->fd, uap->path, UIO_USERSPACE, tsp, UIO_SYSSPACE, uap->flag)); } int freebsd32_adjtime(struct thread *td, struct freebsd32_adjtime_args *uap) { struct timeval32 tv32; struct timeval delta, olddelta, *deltap; int error; if (uap->delta) { error = copyin(uap->delta, &tv32, sizeof(tv32)); if (error) return (error); CP(tv32, delta, tv_sec); CP(tv32, delta, tv_usec); deltap = δ } else deltap = NULL; error = kern_adjtime(td, deltap, &olddelta); if (uap->olddelta && error == 0) { CP(olddelta, tv32, tv_sec); CP(olddelta, tv32, tv_usec); error = copyout(&tv32, uap->olddelta, sizeof(tv32)); } return (error); } #ifdef COMPAT_FREEBSD4 int freebsd4_freebsd32_statfs(struct thread *td, struct freebsd4_freebsd32_statfs_args *uap) { struct statfs32 s32; struct statfs s; int error; error = kern_statfs(td, uap->path, UIO_USERSPACE, &s); if (error) return (error); copy_statfs(&s, &s32); return (copyout(&s32, uap->buf, sizeof(s32))); } #endif #ifdef COMPAT_FREEBSD4 int freebsd4_freebsd32_fstatfs(struct thread *td, struct freebsd4_freebsd32_fstatfs_args *uap) { struct statfs32 s32; struct statfs s; int error; error = kern_fstatfs(td, uap->fd, &s); if (error) return (error); copy_statfs(&s, &s32); return (copyout(&s32, uap->buf, sizeof(s32))); } #endif #ifdef COMPAT_FREEBSD4 int freebsd4_freebsd32_fhstatfs(struct thread *td, struct freebsd4_freebsd32_fhstatfs_args *uap) { struct statfs32 s32; struct statfs s; fhandle_t fh; int error; if ((error = copyin(uap->u_fhp, &fh, sizeof(fhandle_t))) != 0) return (error); error = kern_fhstatfs(td, fh, &s); if (error) return (error); copy_statfs(&s, &s32); return (copyout(&s32, uap->buf, sizeof(s32))); } #endif int freebsd32_pread(struct thread *td, struct freebsd32_pread_args *uap) { struct pread_args ap; ap.fd = uap->fd; ap.buf = uap->buf; ap.nbyte = uap->nbyte; ap.offset = PAIR32TO64(off_t,uap->offset); return (sys_pread(td, &ap)); } int freebsd32_pwrite(struct thread *td, struct freebsd32_pwrite_args *uap) { struct pwrite_args ap; ap.fd = uap->fd; ap.buf = uap->buf; ap.nbyte = uap->nbyte; ap.offset = PAIR32TO64(off_t,uap->offset); return (sys_pwrite(td, &ap)); } #ifdef COMPAT_43 int ofreebsd32_lseek(struct thread *td, struct ofreebsd32_lseek_args *uap) { struct lseek_args nuap; nuap.fd = uap->fd; nuap.offset = uap->offset; nuap.whence = uap->whence; return (sys_lseek(td, &nuap)); } #endif int freebsd32_lseek(struct thread *td, struct freebsd32_lseek_args *uap) { int error; struct lseek_args ap; off_t pos; ap.fd = uap->fd; ap.offset = PAIR32TO64(off_t,uap->offset); ap.whence = uap->whence; error = sys_lseek(td, &ap); /* Expand the quad return into two parts for eax and edx */ pos = td->td_uretoff.tdu_off; td->td_retval[RETVAL_LO] = pos & 0xffffffff; /* %eax */ td->td_retval[RETVAL_HI] = pos >> 32; /* %edx */ return error; } int freebsd32_truncate(struct thread *td, struct freebsd32_truncate_args *uap) { struct truncate_args ap; ap.path = uap->path; ap.length = PAIR32TO64(off_t,uap->length); return (sys_truncate(td, &ap)); } int freebsd32_ftruncate(struct thread *td, struct freebsd32_ftruncate_args *uap) { struct ftruncate_args ap; ap.fd = uap->fd; ap.length = PAIR32TO64(off_t,uap->length); return (sys_ftruncate(td, &ap)); } #ifdef COMPAT_43 int ofreebsd32_getdirentries(struct thread *td, struct ofreebsd32_getdirentries_args *uap) { struct ogetdirentries_args ap; int error; long loff; int32_t loff_cut; ap.fd = uap->fd; ap.buf = uap->buf; ap.count = uap->count; ap.basep = NULL; error = kern_ogetdirentries(td, &ap, &loff); if (error == 0) { loff_cut = loff; error = copyout(&loff_cut, uap->basep, sizeof(int32_t)); } return (error); } #endif int freebsd32_getdirentries(struct thread *td, struct freebsd32_getdirentries_args *uap) { long base; int32_t base32; int error; error = kern_getdirentries(td, uap->fd, uap->buf, uap->count, &base, NULL, UIO_USERSPACE); if (error) return (error); if (uap->basep != NULL) { base32 = base; error = copyout(&base32, uap->basep, sizeof(int32_t)); } return (error); } #ifdef COMPAT_FREEBSD6 /* versions with the 'int pad' argument */ int freebsd6_freebsd32_pread(struct thread *td, struct freebsd6_freebsd32_pread_args *uap) { struct pread_args ap; ap.fd = uap->fd; ap.buf = uap->buf; ap.nbyte = uap->nbyte; ap.offset = PAIR32TO64(off_t,uap->offset); return (sys_pread(td, &ap)); } int freebsd6_freebsd32_pwrite(struct thread *td, struct freebsd6_freebsd32_pwrite_args *uap) { struct pwrite_args ap; ap.fd = uap->fd; ap.buf = uap->buf; ap.nbyte = uap->nbyte; ap.offset = PAIR32TO64(off_t,uap->offset); return (sys_pwrite(td, &ap)); } int freebsd6_freebsd32_lseek(struct thread *td, struct freebsd6_freebsd32_lseek_args *uap) { int error; struct lseek_args ap; off_t pos; ap.fd = uap->fd; ap.offset = PAIR32TO64(off_t,uap->offset); ap.whence = uap->whence; error = sys_lseek(td, &ap); /* Expand the quad return into two parts for eax and edx */ pos = *(off_t *)(td->td_retval); td->td_retval[RETVAL_LO] = pos & 0xffffffff; /* %eax */ td->td_retval[RETVAL_HI] = pos >> 32; /* %edx */ return error; } int freebsd6_freebsd32_truncate(struct thread *td, struct freebsd6_freebsd32_truncate_args *uap) { struct truncate_args ap; ap.path = uap->path; ap.length = PAIR32TO64(off_t,uap->length); return (sys_truncate(td, &ap)); } int freebsd6_freebsd32_ftruncate(struct thread *td, struct freebsd6_freebsd32_ftruncate_args *uap) { struct ftruncate_args ap; ap.fd = uap->fd; ap.length = PAIR32TO64(off_t,uap->length); return (sys_ftruncate(td, &ap)); } #endif /* COMPAT_FREEBSD6 */ struct sf_hdtr32 { uint32_t headers; int hdr_cnt; uint32_t trailers; int trl_cnt; }; static int freebsd32_do_sendfile(struct thread *td, struct freebsd32_sendfile_args *uap, int compat) { struct sf_hdtr32 hdtr32; struct sf_hdtr hdtr; struct uio *hdr_uio, *trl_uio; struct file *fp; cap_rights_t rights; struct iovec32 *iov32; off_t offset, sbytes; int error; offset = PAIR32TO64(off_t, uap->offset); if (offset < 0) return (EINVAL); hdr_uio = trl_uio = NULL; if (uap->hdtr != NULL) { error = copyin(uap->hdtr, &hdtr32, sizeof(hdtr32)); if (error) goto out; PTRIN_CP(hdtr32, hdtr, headers); CP(hdtr32, hdtr, hdr_cnt); PTRIN_CP(hdtr32, hdtr, trailers); CP(hdtr32, hdtr, trl_cnt); if (hdtr.headers != NULL) { iov32 = PTRIN(hdtr32.headers); error = freebsd32_copyinuio(iov32, hdtr32.hdr_cnt, &hdr_uio); if (error) goto out; } if (hdtr.trailers != NULL) { iov32 = PTRIN(hdtr32.trailers); error = freebsd32_copyinuio(iov32, hdtr32.trl_cnt, &trl_uio); if (error) goto out; } } AUDIT_ARG_FD(uap->fd); if ((error = fget_read(td, uap->fd, cap_rights_init(&rights, CAP_PREAD), &fp)) != 0) goto out; error = fo_sendfile(fp, uap->s, hdr_uio, trl_uio, offset, uap->nbytes, &sbytes, uap->flags, compat ? SFK_COMPAT : 0, td); fdrop(fp, td); if (uap->sbytes != NULL) copyout(&sbytes, uap->sbytes, sizeof(off_t)); out: if (hdr_uio) free(hdr_uio, M_IOV); if (trl_uio) free(trl_uio, M_IOV); return (error); } #ifdef COMPAT_FREEBSD4 int freebsd4_freebsd32_sendfile(struct thread *td, struct freebsd4_freebsd32_sendfile_args *uap) { return (freebsd32_do_sendfile(td, (struct freebsd32_sendfile_args *)uap, 1)); } #endif int freebsd32_sendfile(struct thread *td, struct freebsd32_sendfile_args *uap) { return (freebsd32_do_sendfile(td, uap, 0)); } static void copy_stat(struct stat *in, struct stat32 *out) { CP(*in, *out, st_dev); CP(*in, *out, st_ino); CP(*in, *out, st_mode); CP(*in, *out, st_nlink); CP(*in, *out, st_uid); CP(*in, *out, st_gid); CP(*in, *out, st_rdev); TS_CP(*in, *out, st_atim); TS_CP(*in, *out, st_mtim); TS_CP(*in, *out, st_ctim); CP(*in, *out, st_size); CP(*in, *out, st_blocks); CP(*in, *out, st_blksize); CP(*in, *out, st_flags); CP(*in, *out, st_gen); TS_CP(*in, *out, st_birthtim); } #ifdef COMPAT_43 static void copy_ostat(struct stat *in, struct ostat32 *out) { CP(*in, *out, st_dev); CP(*in, *out, st_ino); CP(*in, *out, st_mode); CP(*in, *out, st_nlink); CP(*in, *out, st_uid); CP(*in, *out, st_gid); CP(*in, *out, st_rdev); CP(*in, *out, st_size); TS_CP(*in, *out, st_atim); TS_CP(*in, *out, st_mtim); TS_CP(*in, *out, st_ctim); CP(*in, *out, st_blksize); CP(*in, *out, st_blocks); CP(*in, *out, st_flags); CP(*in, *out, st_gen); } #endif int freebsd32_stat(struct thread *td, struct freebsd32_stat_args *uap) { struct stat sb; struct stat32 sb32; int error; error = kern_statat(td, 0, AT_FDCWD, uap->path, UIO_USERSPACE, &sb, NULL); if (error) return (error); copy_stat(&sb, &sb32); error = copyout(&sb32, uap->ub, sizeof (sb32)); return (error); } #ifdef COMPAT_43 int ofreebsd32_stat(struct thread *td, struct ofreebsd32_stat_args *uap) { struct stat sb; struct ostat32 sb32; int error; error = kern_statat(td, 0, AT_FDCWD, uap->path, UIO_USERSPACE, &sb, NULL); if (error) return (error); copy_ostat(&sb, &sb32); error = copyout(&sb32, uap->ub, sizeof (sb32)); return (error); } #endif int freebsd32_fstat(struct thread *td, struct freebsd32_fstat_args *uap) { struct stat ub; struct stat32 ub32; int error; error = kern_fstat(td, uap->fd, &ub); if (error) return (error); copy_stat(&ub, &ub32); error = copyout(&ub32, uap->ub, sizeof(ub32)); return (error); } #ifdef COMPAT_43 int ofreebsd32_fstat(struct thread *td, struct ofreebsd32_fstat_args *uap) { struct stat ub; struct ostat32 ub32; int error; error = kern_fstat(td, uap->fd, &ub); if (error) return (error); copy_ostat(&ub, &ub32); error = copyout(&ub32, uap->ub, sizeof(ub32)); return (error); } #endif int freebsd32_fstatat(struct thread *td, struct freebsd32_fstatat_args *uap) { struct stat ub; struct stat32 ub32; int error; error = kern_statat(td, uap->flag, uap->fd, uap->path, UIO_USERSPACE, &ub, NULL); if (error) return (error); copy_stat(&ub, &ub32); error = copyout(&ub32, uap->buf, sizeof(ub32)); return (error); } int freebsd32_lstat(struct thread *td, struct freebsd32_lstat_args *uap) { struct stat sb; struct stat32 sb32; int error; error = kern_statat(td, AT_SYMLINK_NOFOLLOW, AT_FDCWD, uap->path, UIO_USERSPACE, &sb, NULL); if (error) return (error); copy_stat(&sb, &sb32); error = copyout(&sb32, uap->ub, sizeof (sb32)); return (error); } #ifdef COMPAT_43 int ofreebsd32_lstat(struct thread *td, struct ofreebsd32_lstat_args *uap) { struct stat sb; struct ostat32 sb32; int error; error = kern_statat(td, AT_SYMLINK_NOFOLLOW, AT_FDCWD, uap->path, UIO_USERSPACE, &sb, NULL); if (error) return (error); copy_ostat(&sb, &sb32); error = copyout(&sb32, uap->ub, sizeof (sb32)); return (error); } #endif int freebsd32_sysctl(struct thread *td, struct freebsd32_sysctl_args *uap) { int error, name[CTL_MAXNAME]; size_t j, oldlen; uint32_t tmp; if (uap->namelen > CTL_MAXNAME || uap->namelen < 2) return (EINVAL); error = copyin(uap->name, name, uap->namelen * sizeof(int)); if (error) return (error); if (uap->oldlenp) { error = fueword32(uap->oldlenp, &tmp); oldlen = tmp; } else { oldlen = 0; } if (error != 0) return (EFAULT); error = userland_sysctl(td, name, uap->namelen, uap->old, &oldlen, 1, uap->new, uap->newlen, &j, SCTL_MASK32); if (error && error != ENOMEM) return (error); if (uap->oldlenp) suword32(uap->oldlenp, j); return (0); } int freebsd32_jail(struct thread *td, struct freebsd32_jail_args *uap) { uint32_t version; int error; struct jail j; error = copyin(uap->jail, &version, sizeof(uint32_t)); if (error) return (error); switch (version) { case 0: { /* FreeBSD single IPv4 jails. */ struct jail32_v0 j32_v0; bzero(&j, sizeof(struct jail)); error = copyin(uap->jail, &j32_v0, sizeof(struct jail32_v0)); if (error) return (error); CP(j32_v0, j, version); PTRIN_CP(j32_v0, j, path); PTRIN_CP(j32_v0, j, hostname); j.ip4s = htonl(j32_v0.ip_number); /* jail_v0 is host order */ break; } case 1: /* * Version 1 was used by multi-IPv4 jail implementations * that never made it into the official kernel. */ return (EINVAL); case 2: /* JAIL_API_VERSION */ { /* FreeBSD multi-IPv4/IPv6,noIP jails. */ struct jail32 j32; error = copyin(uap->jail, &j32, sizeof(struct jail32)); if (error) return (error); CP(j32, j, version); PTRIN_CP(j32, j, path); PTRIN_CP(j32, j, hostname); PTRIN_CP(j32, j, jailname); CP(j32, j, ip4s); CP(j32, j, ip6s); PTRIN_CP(j32, j, ip4); PTRIN_CP(j32, j, ip6); break; } default: /* Sci-Fi jails are not supported, sorry. */ return (EINVAL); } return (kern_jail(td, &j)); } int freebsd32_jail_set(struct thread *td, struct freebsd32_jail_set_args *uap) { struct uio *auio; int error; /* Check that we have an even number of iovecs. */ if (uap->iovcnt & 1) return (EINVAL); error = freebsd32_copyinuio(uap->iovp, uap->iovcnt, &auio); if (error) return (error); error = kern_jail_set(td, auio, uap->flags); free(auio, M_IOV); return (error); } int freebsd32_jail_get(struct thread *td, struct freebsd32_jail_get_args *uap) { struct iovec32 iov32; struct uio *auio; int error, i; /* Check that we have an even number of iovecs. */ if (uap->iovcnt & 1) return (EINVAL); error = freebsd32_copyinuio(uap->iovp, uap->iovcnt, &auio); if (error) return (error); error = kern_jail_get(td, auio, uap->flags); if (error == 0) for (i = 0; i < uap->iovcnt; i++) { PTROUT_CP(auio->uio_iov[i], iov32, iov_base); CP(auio->uio_iov[i], iov32, iov_len); error = copyout(&iov32, uap->iovp + i, sizeof(iov32)); if (error != 0) break; } free(auio, M_IOV); return (error); } int freebsd32_sigaction(struct thread *td, struct freebsd32_sigaction_args *uap) { struct sigaction32 s32; struct sigaction sa, osa, *sap; int error; if (uap->act) { error = copyin(uap->act, &s32, sizeof(s32)); if (error) return (error); sa.sa_handler = PTRIN(s32.sa_u); CP(s32, sa, sa_flags); CP(s32, sa, sa_mask); sap = &sa; } else sap = NULL; error = kern_sigaction(td, uap->sig, sap, &osa, 0); if (error == 0 && uap->oact != NULL) { s32.sa_u = PTROUT(osa.sa_handler); CP(osa, s32, sa_flags); CP(osa, s32, sa_mask); error = copyout(&s32, uap->oact, sizeof(s32)); } return (error); } #ifdef COMPAT_FREEBSD4 int freebsd4_freebsd32_sigaction(struct thread *td, struct freebsd4_freebsd32_sigaction_args *uap) { struct sigaction32 s32; struct sigaction sa, osa, *sap; int error; if (uap->act) { error = copyin(uap->act, &s32, sizeof(s32)); if (error) return (error); sa.sa_handler = PTRIN(s32.sa_u); CP(s32, sa, sa_flags); CP(s32, sa, sa_mask); sap = &sa; } else sap = NULL; error = kern_sigaction(td, uap->sig, sap, &osa, KSA_FREEBSD4); if (error == 0 && uap->oact != NULL) { s32.sa_u = PTROUT(osa.sa_handler); CP(osa, s32, sa_flags); CP(osa, s32, sa_mask); error = copyout(&s32, uap->oact, sizeof(s32)); } return (error); } #endif #ifdef COMPAT_43 struct osigaction32 { u_int32_t sa_u; osigset_t sa_mask; int sa_flags; }; #define ONSIG 32 int ofreebsd32_sigaction(struct thread *td, struct ofreebsd32_sigaction_args *uap) { struct osigaction32 s32; struct sigaction sa, osa, *sap; int error; if (uap->signum <= 0 || uap->signum >= ONSIG) return (EINVAL); if (uap->nsa) { error = copyin(uap->nsa, &s32, sizeof(s32)); if (error) return (error); sa.sa_handler = PTRIN(s32.sa_u); CP(s32, sa, sa_flags); OSIG2SIG(s32.sa_mask, sa.sa_mask); sap = &sa; } else sap = NULL; error = kern_sigaction(td, uap->signum, sap, &osa, KSA_OSIGSET); if (error == 0 && uap->osa != NULL) { s32.sa_u = PTROUT(osa.sa_handler); CP(osa, s32, sa_flags); SIG2OSIG(osa.sa_mask, s32.sa_mask); error = copyout(&s32, uap->osa, sizeof(s32)); } return (error); } int ofreebsd32_sigprocmask(struct thread *td, struct ofreebsd32_sigprocmask_args *uap) { sigset_t set, oset; int error; OSIG2SIG(uap->mask, set); error = kern_sigprocmask(td, uap->how, &set, &oset, SIGPROCMASK_OLD); SIG2OSIG(oset, td->td_retval[0]); return (error); } int ofreebsd32_sigpending(struct thread *td, struct ofreebsd32_sigpending_args *uap) { struct proc *p = td->td_proc; sigset_t siglist; PROC_LOCK(p); siglist = p->p_siglist; SIGSETOR(siglist, td->td_siglist); PROC_UNLOCK(p); SIG2OSIG(siglist, td->td_retval[0]); return (0); } struct sigvec32 { u_int32_t sv_handler; int sv_mask; int sv_flags; }; int ofreebsd32_sigvec(struct thread *td, struct ofreebsd32_sigvec_args *uap) { struct sigvec32 vec; struct sigaction sa, osa, *sap; int error; if (uap->signum <= 0 || uap->signum >= ONSIG) return (EINVAL); if (uap->nsv) { error = copyin(uap->nsv, &vec, sizeof(vec)); if (error) return (error); sa.sa_handler = PTRIN(vec.sv_handler); OSIG2SIG(vec.sv_mask, sa.sa_mask); sa.sa_flags = vec.sv_flags; sa.sa_flags ^= SA_RESTART; sap = &sa; } else sap = NULL; error = kern_sigaction(td, uap->signum, sap, &osa, KSA_OSIGSET); if (error == 0 && uap->osv != NULL) { vec.sv_handler = PTROUT(osa.sa_handler); SIG2OSIG(osa.sa_mask, vec.sv_mask); vec.sv_flags = osa.sa_flags; vec.sv_flags &= ~SA_NOCLDWAIT; vec.sv_flags ^= SA_RESTART; error = copyout(&vec, uap->osv, sizeof(vec)); } return (error); } int ofreebsd32_sigblock(struct thread *td, struct ofreebsd32_sigblock_args *uap) { sigset_t set, oset; OSIG2SIG(uap->mask, set); kern_sigprocmask(td, SIG_BLOCK, &set, &oset, 0); SIG2OSIG(oset, td->td_retval[0]); return (0); } int ofreebsd32_sigsetmask(struct thread *td, struct ofreebsd32_sigsetmask_args *uap) { sigset_t set, oset; OSIG2SIG(uap->mask, set); kern_sigprocmask(td, SIG_SETMASK, &set, &oset, 0); SIG2OSIG(oset, td->td_retval[0]); return (0); } int ofreebsd32_sigsuspend(struct thread *td, struct ofreebsd32_sigsuspend_args *uap) { sigset_t mask; OSIG2SIG(uap->mask, mask); return (kern_sigsuspend(td, mask)); } struct sigstack32 { u_int32_t ss_sp; int ss_onstack; }; int ofreebsd32_sigstack(struct thread *td, struct ofreebsd32_sigstack_args *uap) { struct sigstack32 s32; struct sigstack nss, oss; int error = 0, unss; if (uap->nss != NULL) { error = copyin(uap->nss, &s32, sizeof(s32)); if (error) return (error); nss.ss_sp = PTRIN(s32.ss_sp); CP(s32, nss, ss_onstack); unss = 1; } else { unss = 0; } oss.ss_sp = td->td_sigstk.ss_sp; oss.ss_onstack = sigonstack(cpu_getstack(td)); if (unss) { td->td_sigstk.ss_sp = nss.ss_sp; td->td_sigstk.ss_size = 0; td->td_sigstk.ss_flags |= (nss.ss_onstack & SS_ONSTACK); td->td_pflags |= TDP_ALTSTACK; } if (uap->oss != NULL) { s32.ss_sp = PTROUT(oss.ss_sp); CP(oss, s32, ss_onstack); error = copyout(&s32, uap->oss, sizeof(s32)); } return (error); } #endif int freebsd32_nanosleep(struct thread *td, struct freebsd32_nanosleep_args *uap) { struct timespec32 rmt32, rqt32; struct timespec rmt, rqt; int error; error = copyin(uap->rqtp, &rqt32, sizeof(rqt32)); if (error) return (error); CP(rqt32, rqt, tv_sec); CP(rqt32, rqt, tv_nsec); if (uap->rmtp && !useracc((caddr_t)uap->rmtp, sizeof(rmt), VM_PROT_WRITE)) return (EFAULT); error = kern_nanosleep(td, &rqt, &rmt); if (error && uap->rmtp) { int error2; CP(rmt, rmt32, tv_sec); CP(rmt, rmt32, tv_nsec); error2 = copyout(&rmt32, uap->rmtp, sizeof(rmt32)); if (error2) error = error2; } return (error); } int freebsd32_clock_gettime(struct thread *td, struct freebsd32_clock_gettime_args *uap) { struct timespec ats; struct timespec32 ats32; int error; error = kern_clock_gettime(td, uap->clock_id, &ats); if (error == 0) { CP(ats, ats32, tv_sec); CP(ats, ats32, tv_nsec); error = copyout(&ats32, uap->tp, sizeof(ats32)); } return (error); } int freebsd32_clock_settime(struct thread *td, struct freebsd32_clock_settime_args *uap) { struct timespec ats; struct timespec32 ats32; int error; error = copyin(uap->tp, &ats32, sizeof(ats32)); if (error) return (error); CP(ats32, ats, tv_sec); CP(ats32, ats, tv_nsec); return (kern_clock_settime(td, uap->clock_id, &ats)); } int freebsd32_clock_getres(struct thread *td, struct freebsd32_clock_getres_args *uap) { struct timespec ts; struct timespec32 ts32; int error; if (uap->tp == NULL) return (0); error = kern_clock_getres(td, uap->clock_id, &ts); if (error == 0) { CP(ts, ts32, tv_sec); CP(ts, ts32, tv_nsec); error = copyout(&ts32, uap->tp, sizeof(ts32)); } return (error); } int freebsd32_ktimer_create(struct thread *td, struct freebsd32_ktimer_create_args *uap) { struct sigevent32 ev32; struct sigevent ev, *evp; int error, id; if (uap->evp == NULL) { evp = NULL; } else { evp = &ev; error = copyin(uap->evp, &ev32, sizeof(ev32)); if (error != 0) return (error); error = convert_sigevent32(&ev32, &ev); if (error != 0) return (error); } error = kern_ktimer_create(td, uap->clock_id, evp, &id, -1); if (error == 0) { error = copyout(&id, uap->timerid, sizeof(int)); if (error != 0) kern_ktimer_delete(td, id); } return (error); } int freebsd32_ktimer_settime(struct thread *td, struct freebsd32_ktimer_settime_args *uap) { struct itimerspec32 val32, oval32; struct itimerspec val, oval, *ovalp; int error; error = copyin(uap->value, &val32, sizeof(val32)); if (error != 0) return (error); ITS_CP(val32, val); ovalp = uap->ovalue != NULL ? &oval : NULL; error = kern_ktimer_settime(td, uap->timerid, uap->flags, &val, ovalp); if (error == 0 && uap->ovalue != NULL) { ITS_CP(oval, oval32); error = copyout(&oval32, uap->ovalue, sizeof(oval32)); } return (error); } int freebsd32_ktimer_gettime(struct thread *td, struct freebsd32_ktimer_gettime_args *uap) { struct itimerspec32 val32; struct itimerspec val; int error; error = kern_ktimer_gettime(td, uap->timerid, &val); if (error == 0) { ITS_CP(val, val32); error = copyout(&val32, uap->value, sizeof(val32)); } return (error); } int freebsd32_clock_getcpuclockid2(struct thread *td, struct freebsd32_clock_getcpuclockid2_args *uap) { clockid_t clk_id; int error; error = kern_clock_getcpuclockid2(td, PAIR32TO64(id_t, uap->id), uap->which, &clk_id); if (error == 0) error = copyout(&clk_id, uap->clock_id, sizeof(clockid_t)); return (error); } int freebsd32_thr_new(struct thread *td, struct freebsd32_thr_new_args *uap) { struct thr_param32 param32; struct thr_param param; int error; if (uap->param_size < 0 || uap->param_size > sizeof(struct thr_param32)) return (EINVAL); bzero(¶m, sizeof(struct thr_param)); bzero(¶m32, sizeof(struct thr_param32)); error = copyin(uap->param, ¶m32, uap->param_size); if (error != 0) return (error); param.start_func = PTRIN(param32.start_func); param.arg = PTRIN(param32.arg); param.stack_base = PTRIN(param32.stack_base); param.stack_size = param32.stack_size; param.tls_base = PTRIN(param32.tls_base); param.tls_size = param32.tls_size; param.child_tid = PTRIN(param32.child_tid); param.parent_tid = PTRIN(param32.parent_tid); param.flags = param32.flags; param.rtp = PTRIN(param32.rtp); param.spare[0] = PTRIN(param32.spare[0]); param.spare[1] = PTRIN(param32.spare[1]); param.spare[2] = PTRIN(param32.spare[2]); return (kern_thr_new(td, ¶m)); } int freebsd32_thr_suspend(struct thread *td, struct freebsd32_thr_suspend_args *uap) { struct timespec32 ts32; struct timespec ts, *tsp; int error; error = 0; tsp = NULL; if (uap->timeout != NULL) { error = copyin((const void *)uap->timeout, (void *)&ts32, sizeof(struct timespec32)); if (error != 0) return (error); ts.tv_sec = ts32.tv_sec; ts.tv_nsec = ts32.tv_nsec; tsp = &ts; } return (kern_thr_suspend(td, tsp)); } void siginfo_to_siginfo32(const siginfo_t *src, struct siginfo32 *dst) { bzero(dst, sizeof(*dst)); dst->si_signo = src->si_signo; dst->si_errno = src->si_errno; dst->si_code = src->si_code; dst->si_pid = src->si_pid; dst->si_uid = src->si_uid; dst->si_status = src->si_status; dst->si_addr = (uintptr_t)src->si_addr; dst->si_value.sival_int = src->si_value.sival_int; dst->si_timerid = src->si_timerid; dst->si_overrun = src->si_overrun; } int freebsd32_sigtimedwait(struct thread *td, struct freebsd32_sigtimedwait_args *uap) { struct timespec32 ts32; struct timespec ts; struct timespec *timeout; sigset_t set; ksiginfo_t ksi; struct siginfo32 si32; int error; if (uap->timeout) { error = copyin(uap->timeout, &ts32, sizeof(ts32)); if (error) return (error); ts.tv_sec = ts32.tv_sec; ts.tv_nsec = ts32.tv_nsec; timeout = &ts; } else timeout = NULL; error = copyin(uap->set, &set, sizeof(set)); if (error) return (error); error = kern_sigtimedwait(td, set, &ksi, timeout); if (error) return (error); if (uap->info) { siginfo_to_siginfo32(&ksi.ksi_info, &si32); error = copyout(&si32, uap->info, sizeof(struct siginfo32)); } if (error == 0) td->td_retval[0] = ksi.ksi_signo; return (error); } /* * MPSAFE */ int freebsd32_sigwaitinfo(struct thread *td, struct freebsd32_sigwaitinfo_args *uap) { ksiginfo_t ksi; struct siginfo32 si32; sigset_t set; int error; error = copyin(uap->set, &set, sizeof(set)); if (error) return (error); error = kern_sigtimedwait(td, set, &ksi, NULL); if (error) return (error); if (uap->info) { siginfo_to_siginfo32(&ksi.ksi_info, &si32); error = copyout(&si32, uap->info, sizeof(struct siginfo32)); } if (error == 0) td->td_retval[0] = ksi.ksi_signo; return (error); } int freebsd32_cpuset_setid(struct thread *td, struct freebsd32_cpuset_setid_args *uap) { struct cpuset_setid_args ap; ap.which = uap->which; ap.id = PAIR32TO64(id_t,uap->id); ap.setid = uap->setid; return (sys_cpuset_setid(td, &ap)); } int freebsd32_cpuset_getid(struct thread *td, struct freebsd32_cpuset_getid_args *uap) { struct cpuset_getid_args ap; ap.level = uap->level; ap.which = uap->which; ap.id = PAIR32TO64(id_t,uap->id); ap.setid = uap->setid; return (sys_cpuset_getid(td, &ap)); } int freebsd32_cpuset_getaffinity(struct thread *td, struct freebsd32_cpuset_getaffinity_args *uap) { struct cpuset_getaffinity_args ap; ap.level = uap->level; ap.which = uap->which; ap.id = PAIR32TO64(id_t,uap->id); ap.cpusetsize = uap->cpusetsize; ap.mask = uap->mask; return (sys_cpuset_getaffinity(td, &ap)); } int freebsd32_cpuset_setaffinity(struct thread *td, struct freebsd32_cpuset_setaffinity_args *uap) { struct cpuset_setaffinity_args ap; ap.level = uap->level; ap.which = uap->which; ap.id = PAIR32TO64(id_t,uap->id); ap.cpusetsize = uap->cpusetsize; ap.mask = uap->mask; return (sys_cpuset_setaffinity(td, &ap)); } int freebsd32_nmount(struct thread *td, struct freebsd32_nmount_args /* { struct iovec *iovp; unsigned int iovcnt; int flags; } */ *uap) { struct uio *auio; uint64_t flags; int error; /* * Mount flags are now 64-bits. On 32-bit archtectures only * 32-bits are passed in, but from here on everything handles * 64-bit flags correctly. */ flags = uap->flags; AUDIT_ARG_FFLAGS(flags); /* * Filter out MNT_ROOTFS. We do not want clients of nmount() in * userspace to set this flag, but we must filter it out if we want * MNT_UPDATE on the root file system to work. * MNT_ROOTFS should only be set by the kernel when mounting its * root file system. */ flags &= ~MNT_ROOTFS; /* * check that we have an even number of iovec's * and that we have at least two options. */ if ((uap->iovcnt & 1) || (uap->iovcnt < 4)) return (EINVAL); error = freebsd32_copyinuio(uap->iovp, uap->iovcnt, &auio); if (error) return (error); error = vfs_donmount(td, flags, auio); free(auio, M_IOV); return error; } #if 0 int freebsd32_xxx(struct thread *td, struct freebsd32_xxx_args *uap) { struct yyy32 *p32, s32; struct yyy *p = NULL, s; struct xxx_arg ap; int error; if (uap->zzz) { error = copyin(uap->zzz, &s32, sizeof(s32)); if (error) return (error); /* translate in */ p = &s; } error = kern_xxx(td, p); if (error) return (error); if (uap->zzz) { /* translate out */ error = copyout(&s32, p32, sizeof(s32)); } return (error); } #endif int syscall32_register(int *offset, struct sysent *new_sysent, struct sysent *old_sysent, int flags) { if ((flags & ~SY_THR_STATIC) != 0) return (EINVAL); if (*offset == NO_SYSCALL) { int i; for (i = 1; i < SYS_MAXSYSCALL; ++i) if (freebsd32_sysent[i].sy_call == (sy_call_t *)lkmnosys) break; if (i == SYS_MAXSYSCALL) return (ENFILE); *offset = i; } else if (*offset < 0 || *offset >= SYS_MAXSYSCALL) return (EINVAL); else if (freebsd32_sysent[*offset].sy_call != (sy_call_t *)lkmnosys && freebsd32_sysent[*offset].sy_call != (sy_call_t *)lkmressys) return (EEXIST); *old_sysent = freebsd32_sysent[*offset]; freebsd32_sysent[*offset] = *new_sysent; atomic_store_rel_32(&freebsd32_sysent[*offset].sy_thrcnt, flags); return (0); } int syscall32_deregister(int *offset, struct sysent *old_sysent) { if (*offset == 0) return (0); freebsd32_sysent[*offset] = *old_sysent; return (0); } int syscall32_module_handler(struct module *mod, int what, void *arg) { struct syscall_module_data *data = (struct syscall_module_data*)arg; modspecific_t ms; int error; switch (what) { case MOD_LOAD: error = syscall32_register(data->offset, data->new_sysent, &data->old_sysent, SY_THR_STATIC_KLD); if (error) { /* Leave a mark so we know to safely unload below. */ data->offset = NULL; return error; } ms.intval = *data->offset; MOD_XLOCK; module_setspecific(mod, &ms); MOD_XUNLOCK; if (data->chainevh) error = data->chainevh(mod, what, data->chainarg); return (error); case MOD_UNLOAD: /* * MOD_LOAD failed, so just return without calling the * chained handler since we didn't pass along the MOD_LOAD * event. */ if (data->offset == NULL) return (0); if (data->chainevh) { error = data->chainevh(mod, what, data->chainarg); if (error) return (error); } error = syscall32_deregister(data->offset, &data->old_sysent); return (error); default: error = EOPNOTSUPP; if (data->chainevh) error = data->chainevh(mod, what, data->chainarg); return (error); } } int syscall32_helper_register(struct syscall_helper_data *sd, int flags) { struct syscall_helper_data *sd1; int error; for (sd1 = sd; sd1->syscall_no != NO_SYSCALL; sd1++) { error = syscall32_register(&sd1->syscall_no, &sd1->new_sysent, &sd1->old_sysent, flags); if (error != 0) { syscall32_helper_unregister(sd); return (error); } sd1->registered = 1; } return (0); } int syscall32_helper_unregister(struct syscall_helper_data *sd) { struct syscall_helper_data *sd1; for (sd1 = sd; sd1->registered != 0; sd1++) { syscall32_deregister(&sd1->syscall_no, &sd1->old_sysent); sd1->registered = 0; } return (0); } register_t * freebsd32_copyout_strings(struct image_params *imgp) { int argc, envc, i; u_int32_t *vectp; char *stringp; uintptr_t destp; u_int32_t *stack_base; struct freebsd32_ps_strings *arginfo; char canary[sizeof(long) * 8]; int32_t pagesizes32[MAXPAGESIZES]; size_t execpath_len; int szsigcode; /* * Calculate string base and vector table pointers. * Also deal with signal trampoline code for this exec type. */ if (imgp->execpath != NULL && imgp->auxargs != NULL) execpath_len = strlen(imgp->execpath) + 1; else execpath_len = 0; arginfo = (struct freebsd32_ps_strings *)curproc->p_sysent-> sv_psstrings; if (imgp->proc->p_sysent->sv_sigcode_base == 0) szsigcode = *(imgp->proc->p_sysent->sv_szsigcode); else szsigcode = 0; destp = (uintptr_t)arginfo; /* * install sigcode */ if (szsigcode != 0) { destp -= szsigcode; destp = rounddown2(destp, sizeof(uint32_t)); copyout(imgp->proc->p_sysent->sv_sigcode, (void *)destp, szsigcode); } /* * Copy the image path for the rtld. */ if (execpath_len != 0) { destp -= execpath_len; imgp->execpathp = destp; copyout(imgp->execpath, (void *)destp, execpath_len); } /* * Prepare the canary for SSP. */ arc4rand(canary, sizeof(canary), 0); destp -= sizeof(canary); imgp->canary = destp; copyout(canary, (void *)destp, sizeof(canary)); imgp->canarylen = sizeof(canary); /* * Prepare the pagesizes array. */ for (i = 0; i < MAXPAGESIZES; i++) pagesizes32[i] = (uint32_t)pagesizes[i]; destp -= sizeof(pagesizes32); destp = rounddown2(destp, sizeof(uint32_t)); imgp->pagesizes = destp; copyout(pagesizes32, (void *)destp, sizeof(pagesizes32)); imgp->pagesizeslen = sizeof(pagesizes32); destp -= ARG_MAX - imgp->args->stringspace; destp = rounddown2(destp, sizeof(uint32_t)); /* * If we have a valid auxargs ptr, prepare some room * on the stack. */ if (imgp->auxargs) { /* * 'AT_COUNT*2' is size for the ELF Auxargs data. This is for * lower compatibility. */ imgp->auxarg_size = (imgp->auxarg_size) ? imgp->auxarg_size : (AT_COUNT * 2); /* * The '+ 2' is for the null pointers at the end of each of * the arg and env vector sets,and imgp->auxarg_size is room * for argument of Runtime loader. */ vectp = (u_int32_t *) (destp - (imgp->args->argc + imgp->args->envc + 2 + imgp->auxarg_size + execpath_len) * sizeof(u_int32_t)); } else { /* * The '+ 2' is for the null pointers at the end of each of * the arg and env vector sets */ vectp = (u_int32_t *)(destp - (imgp->args->argc + imgp->args->envc + 2) * sizeof(u_int32_t)); } /* * vectp also becomes our initial stack base */ stack_base = vectp; stringp = imgp->args->begin_argv; argc = imgp->args->argc; envc = imgp->args->envc; /* * Copy out strings - arguments and environment. */ copyout(stringp, (void *)destp, ARG_MAX - imgp->args->stringspace); /* * Fill in "ps_strings" struct for ps, w, etc. */ suword32(&arginfo->ps_argvstr, (u_int32_t)(intptr_t)vectp); suword32(&arginfo->ps_nargvstr, argc); /* * Fill in argument portion of vector table. */ for (; argc > 0; --argc) { suword32(vectp++, (u_int32_t)(intptr_t)destp); while (*stringp++ != 0) destp++; destp++; } /* a null vector table pointer separates the argp's from the envp's */ suword32(vectp++, 0); suword32(&arginfo->ps_envstr, (u_int32_t)(intptr_t)vectp); suword32(&arginfo->ps_nenvstr, envc); /* * Fill in environment portion of vector table. */ for (; envc > 0; --envc) { suword32(vectp++, (u_int32_t)(intptr_t)destp); while (*stringp++ != 0) destp++; destp++; } /* end of vector table is a null pointer */ suword32(vectp, 0); return ((register_t *)stack_base); } int freebsd32_kldstat(struct thread *td, struct freebsd32_kldstat_args *uap) { struct kld_file_stat stat; struct kld32_file_stat stat32; int error, version; if ((error = copyin(&uap->stat->version, &version, sizeof(version))) != 0) return (error); if (version != sizeof(struct kld32_file_stat_1) && version != sizeof(struct kld32_file_stat)) return (EINVAL); error = kern_kldstat(td, uap->fileid, &stat); if (error != 0) return (error); bcopy(&stat.name[0], &stat32.name[0], sizeof(stat.name)); CP(stat, stat32, refs); CP(stat, stat32, id); PTROUT_CP(stat, stat32, address); CP(stat, stat32, size); bcopy(&stat.pathname[0], &stat32.pathname[0], sizeof(stat.pathname)); return (copyout(&stat32, uap->stat, version)); } int freebsd32_posix_fallocate(struct thread *td, struct freebsd32_posix_fallocate_args *uap) { td->td_retval[0] = kern_posix_fallocate(td, uap->fd, PAIR32TO64(off_t, uap->offset), PAIR32TO64(off_t, uap->len)); return (0); } int freebsd32_posix_fadvise(struct thread *td, struct freebsd32_posix_fadvise_args *uap) { td->td_retval[0] = kern_posix_fadvise(td, uap->fd, PAIR32TO64(off_t, uap->offset), PAIR32TO64(off_t, uap->len), uap->advice); return (0); } int convert_sigevent32(struct sigevent32 *sig32, struct sigevent *sig) { CP(*sig32, *sig, sigev_notify); switch (sig->sigev_notify) { case SIGEV_NONE: break; case SIGEV_THREAD_ID: CP(*sig32, *sig, sigev_notify_thread_id); /* FALLTHROUGH */ case SIGEV_SIGNAL: CP(*sig32, *sig, sigev_signo); PTRIN_CP(*sig32, *sig, sigev_value.sival_ptr); break; case SIGEV_KEVENT: CP(*sig32, *sig, sigev_notify_kqueue); CP(*sig32, *sig, sigev_notify_kevent_flags); PTRIN_CP(*sig32, *sig, sigev_value.sival_ptr); break; default: return (EINVAL); } return (0); } int freebsd32_procctl(struct thread *td, struct freebsd32_procctl_args *uap) { void *data; union { struct procctl_reaper_status rs; struct procctl_reaper_pids rp; struct procctl_reaper_kill rk; } x; union { struct procctl_reaper_pids32 rp; } x32; int error, error1, flags; switch (uap->com) { case PROC_SPROTECT: case PROC_TRACE_CTL: error = copyin(PTRIN(uap->data), &flags, sizeof(flags)); if (error != 0) return (error); data = &flags; break; case PROC_REAP_ACQUIRE: case PROC_REAP_RELEASE: if (uap->data != NULL) return (EINVAL); data = NULL; break; case PROC_REAP_STATUS: data = &x.rs; break; case PROC_REAP_GETPIDS: error = copyin(uap->data, &x32.rp, sizeof(x32.rp)); if (error != 0) return (error); CP(x32.rp, x.rp, rp_count); PTRIN_CP(x32.rp, x.rp, rp_pids); data = &x.rp; break; case PROC_REAP_KILL: error = copyin(uap->data, &x.rk, sizeof(x.rk)); if (error != 0) return (error); data = &x.rk; break; case PROC_TRACE_STATUS: data = &flags; break; default: return (EINVAL); } error = kern_procctl(td, uap->idtype, PAIR32TO64(id_t, uap->id), uap->com, data); switch (uap->com) { case PROC_REAP_STATUS: if (error == 0) error = copyout(&x.rs, uap->data, sizeof(x.rs)); break; case PROC_REAP_KILL: error1 = copyout(&x.rk, uap->data, sizeof(x.rk)); if (error == 0) error = error1; break; case PROC_TRACE_STATUS: if (error == 0) error = copyout(&flags, uap->data, sizeof(flags)); break; } return (error); } int freebsd32_fcntl(struct thread *td, struct freebsd32_fcntl_args *uap) { long tmp; switch (uap->cmd) { /* * Do unsigned conversion for arg when operation * interprets it as flags or pointer. */ case F_SETLK_REMOTE: case F_SETLKW: case F_SETLK: case F_GETLK: case F_SETFD: case F_SETFL: case F_OGETLK: case F_OSETLK: case F_OSETLKW: tmp = (unsigned int)(uap->arg); break; default: tmp = uap->arg; break; } return (kern_fcntl_freebsd(td, uap->fd, uap->cmd, tmp)); } int freebsd32_ppoll(struct thread *td, struct freebsd32_ppoll_args *uap) { struct timespec32 ts32; struct timespec ts, *tsp; sigset_t set, *ssp; int error; if (uap->ts != NULL) { error = copyin(uap->ts, &ts32, sizeof(ts32)); if (error != 0) return (error); CP(ts32, ts, tv_sec); CP(ts32, ts, tv_nsec); tsp = &ts; } else tsp = NULL; if (uap->set != NULL) { error = copyin(uap->set, &set, sizeof(set)); if (error != 0) return (error); ssp = &set; } else ssp = NULL; return (kern_poll(td, uap->fds, uap->nfds, tsp, ssp)); } Index: user/ngie/more-tests/sys/compat/linprocfs/linprocfs.c =================================================================== --- user/ngie/more-tests/sys/compat/linprocfs/linprocfs.c (revision 281584) +++ user/ngie/more-tests/sys/compat/linprocfs/linprocfs.c (revision 281585) @@ -1,1467 +1,1476 @@ /*- * Copyright (c) 2000 Dag-Erling Coïdan Smørgrav * Copyright (c) 1999 Pierre Beyssac * Copyright (c) 1993 Jan-Simon Pendry * Copyright (c) 1993 * The Regents of the University of California. All rights reserved. * * This code is derived from software contributed to Berkeley by * Jan-Simon Pendry. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. All advertising materials mentioning features or use of this software * must display the following acknowledgement: * This product includes software developed by the University of * California, Berkeley and its contributors. * 4. Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * @(#)procfs_status.c 8.4 (Berkeley) 6/15/94 */ #include "opt_compat.h" #include __FBSDID("$FreeBSD$"); #include #include #include #include #include #include #include #include #include +#include #include #include #include -#include #include #include #include #include #include #include #include #include #include #include +#include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #if defined(__i386__) || defined(__amd64__) #include #include #endif /* __i386__ || __amd64__ */ #ifdef COMPAT_FREEBSD32 #include #endif #include #include #include #include #include #include /* * Various conversion macros */ #define T2J(x) ((long)(((x) * 100ULL) / (stathz ? stathz : hz))) /* ticks to jiffies */ #define T2CS(x) ((unsigned long)(((x) * 100ULL) / (stathz ? stathz : hz))) /* ticks to centiseconds */ #define T2S(x) ((x) / (stathz ? stathz : hz)) /* ticks to seconds */ #define B2K(x) ((x) >> 10) /* bytes to kbytes */ #define B2P(x) ((x) >> PAGE_SHIFT) /* bytes to pages */ #define P2B(x) ((x) << PAGE_SHIFT) /* pages to bytes */ #define P2K(x) ((x) << (PAGE_SHIFT - 10)) /* pages to kbytes */ #define TV2J(x) ((x)->tv_sec * 100UL + (x)->tv_usec / 10000) /** * @brief Mapping of ki_stat in struct kinfo_proc to the linux state * * The linux procfs state field displays one of the characters RSDZTW to * denote running, sleeping in an interruptible wait, waiting in an * uninterruptible disk sleep, a zombie process, process is being traced * or stopped, or process is paging respectively. * * Our struct kinfo_proc contains the variable ki_stat which contains a * value out of SIDL, SRUN, SSLEEP, SSTOP, SZOMB, SWAIT and SLOCK. * * This character array is used with ki_stati-1 as an index and tries to * map our states to suitable linux states. */ static char linux_state[] = "RRSTZDD"; /* * Filler function for proc/meminfo */ static int linprocfs_domeminfo(PFS_FILL_ARGS) { unsigned long memtotal; /* total memory in bytes */ unsigned long memused; /* used memory in bytes */ unsigned long memfree; /* free memory in bytes */ unsigned long memshared; /* shared memory ??? */ unsigned long buffers, cached; /* buffer / cache memory ??? */ unsigned long long swaptotal; /* total swap space in bytes */ unsigned long long swapused; /* used swap space in bytes */ unsigned long long swapfree; /* free swap space in bytes */ vm_object_t object; int i, j; memtotal = physmem * PAGE_SIZE; /* * The correct thing here would be: * memfree = vm_cnt.v_free_count * PAGE_SIZE; memused = memtotal - memfree; * * but it might mislead linux binaries into thinking there * is very little memory left, so we cheat and tell them that * all memory that isn't wired down is free. */ memused = vm_cnt.v_wire_count * PAGE_SIZE; memfree = memtotal - memused; swap_pager_status(&i, &j); swaptotal = (unsigned long long)i * PAGE_SIZE; swapused = (unsigned long long)j * PAGE_SIZE; swapfree = swaptotal - swapused; memshared = 0; mtx_lock(&vm_object_list_mtx); TAILQ_FOREACH(object, &vm_object_list, object_list) if (object->shadow_count > 1) memshared += object->resident_page_count; mtx_unlock(&vm_object_list_mtx); memshared *= PAGE_SIZE; /* * We'd love to be able to write: * buffers = bufspace; * * but bufspace is internal to vfs_bio.c and we don't feel * like unstaticizing it just for linprocfs's sake. */ buffers = 0; cached = vm_cnt.v_cache_count * PAGE_SIZE; sbuf_printf(sb, " total: used: free: shared: buffers: cached:\n" "Mem: %lu %lu %lu %lu %lu %lu\n" "Swap: %llu %llu %llu\n" "MemTotal: %9lu kB\n" "MemFree: %9lu kB\n" "MemShared:%9lu kB\n" "Buffers: %9lu kB\n" "Cached: %9lu kB\n" "SwapTotal:%9llu kB\n" "SwapFree: %9llu kB\n", memtotal, memused, memfree, memshared, buffers, cached, swaptotal, swapused, swapfree, B2K(memtotal), B2K(memfree), B2K(memshared), B2K(buffers), B2K(cached), B2K(swaptotal), B2K(swapfree)); return (0); } #if defined(__i386__) || defined(__amd64__) /* * Filler function for proc/cpuinfo (i386 & amd64 version) */ static int linprocfs_docpuinfo(PFS_FILL_ARGS) { int hw_model[2]; char model[128]; uint64_t freq; size_t size; int class, fqmhz, fqkhz; int i; /* * We default the flags to include all non-conflicting flags, * and the Intel versions of conflicting flags. */ static char *flags[] = { "fpu", "vme", "de", "pse", "tsc", "msr", "pae", "mce", "cx8", "apic", "sep", "sep", "mtrr", "pge", "mca", "cmov", "pat", "pse36", "pn", "b19", "b20", "b21", "mmxext", "mmx", "fxsr", "xmm", "sse2", "b27", "b28", "b29", "3dnowext", "3dnow" }; switch (cpu_class) { #ifdef __i386__ case CPUCLASS_286: class = 2; break; case CPUCLASS_386: class = 3; break; case CPUCLASS_486: class = 4; break; case CPUCLASS_586: class = 5; break; case CPUCLASS_686: class = 6; break; default: class = 0; break; #else /* __amd64__ */ default: class = 15; break; #endif } hw_model[0] = CTL_HW; hw_model[1] = HW_MODEL; model[0] = '\0'; size = sizeof(model); if (kernel_sysctl(td, hw_model, 2, &model, &size, 0, 0, 0, 0) != 0) strcpy(model, "unknown"); for (i = 0; i < mp_ncpus; ++i) { sbuf_printf(sb, "processor\t: %d\n" "vendor_id\t: %.20s\n" "cpu family\t: %u\n" "model\t\t: %u\n" "model name\t: %s\n" "stepping\t: %u\n\n", i, cpu_vendor, CPUID_TO_FAMILY(cpu_id), CPUID_TO_MODEL(cpu_id), model, cpu_id & CPUID_STEPPING); /* XXX per-cpu vendor / class / model / id? */ } sbuf_cat(sb, "flags\t\t:"); #ifdef __i386__ switch (cpu_vendor_id) { case CPU_VENDOR_AMD: if (class < 6) flags[16] = "fcmov"; break; case CPU_VENDOR_CYRIX: flags[24] = "cxmmx"; break; } #endif for (i = 0; i < 32; i++) if (cpu_feature & (1 << i)) sbuf_printf(sb, " %s", flags[i]); sbuf_cat(sb, "\n"); freq = atomic_load_acq_64(&tsc_freq); if (freq != 0) { fqmhz = (freq + 4999) / 1000000; fqkhz = ((freq + 4999) / 10000) % 100; sbuf_printf(sb, "cpu MHz\t\t: %d.%02d\n" "bogomips\t: %d.%02d\n", fqmhz, fqkhz, fqmhz, fqkhz); } return (0); } #endif /* __i386__ || __amd64__ */ /* * Filler function for proc/mtab * * This file doesn't exist in Linux' procfs, but is included here so * users can symlink /compat/linux/etc/mtab to /proc/mtab */ static int linprocfs_domtab(PFS_FILL_ARGS) { struct nameidata nd; - struct mount *mp; const char *lep; char *dlep, *flep, *mntto, *mntfrom, *fstype; size_t lep_len; int error; + struct statfs *buf, *sp; + size_t count; /* resolve symlinks etc. in the emulation tree prefix */ NDINIT(&nd, LOOKUP, FOLLOW, UIO_SYSSPACE, linux_emul_path, td); flep = NULL; error = namei(&nd); lep = linux_emul_path; if (error == 0) { if (vn_fullpath(td, nd.ni_vp, &dlep, &flep) == 0) lep = dlep; vrele(nd.ni_vp); } lep_len = strlen(lep); - mtx_lock(&mountlist_mtx); - error = 0; - TAILQ_FOREACH(mp, &mountlist, mnt_list) { + buf = NULL; + error = kern_getfsstat(td, &buf, SIZE_T_MAX, &count, + UIO_SYSSPACE, MNT_WAIT); + if (error != 0) { + free(buf, M_TEMP); + free(flep, M_TEMP); + return (error); + } + + for (sp = buf; count > 0; sp++, count--) { /* determine device name */ - mntfrom = mp->mnt_stat.f_mntfromname; + mntfrom = sp->f_mntfromname; /* determine mount point */ - mntto = mp->mnt_stat.f_mntonname; - if (strncmp(mntto, lep, lep_len) == 0 && - mntto[lep_len] == '/') + mntto = sp->f_mntonname; + if (strncmp(mntto, lep, lep_len) == 0 && mntto[lep_len] == '/') mntto += lep_len; /* determine fs type */ - fstype = mp->mnt_stat.f_fstypename; + fstype = sp->f_fstypename; if (strcmp(fstype, pn->pn_info->pi_name) == 0) mntfrom = fstype = "proc"; else if (strcmp(fstype, "procfs") == 0) continue; if (strcmp(fstype, "linsysfs") == 0) { sbuf_printf(sb, "/sys %s sysfs %s", mntto, - mp->mnt_stat.f_flags & MNT_RDONLY ? "ro" : "rw"); + sp->f_flags & MNT_RDONLY ? "ro" : "rw"); } else { /* For Linux msdosfs is called vfat */ if (strcmp(fstype, "msdosfs") == 0) fstype = "vfat"; sbuf_printf(sb, "%s %s %s %s", mntfrom, mntto, fstype, - mp->mnt_stat.f_flags & MNT_RDONLY ? "ro" : "rw"); + sp->f_flags & MNT_RDONLY ? "ro" : "rw"); } #define ADD_OPTION(opt, name) \ - if (mp->mnt_stat.f_flags & (opt)) sbuf_printf(sb, "," name); + if (sp->f_flags & (opt)) sbuf_printf(sb, "," name); ADD_OPTION(MNT_SYNCHRONOUS, "sync"); ADD_OPTION(MNT_NOEXEC, "noexec"); ADD_OPTION(MNT_NOSUID, "nosuid"); ADD_OPTION(MNT_UNION, "union"); ADD_OPTION(MNT_ASYNC, "async"); ADD_OPTION(MNT_SUIDDIR, "suiddir"); ADD_OPTION(MNT_NOSYMFOLLOW, "nosymfollow"); ADD_OPTION(MNT_NOATIME, "noatime"); #undef ADD_OPTION /* a real Linux mtab will also show NFS options */ sbuf_printf(sb, " 0 0\n"); } - mtx_unlock(&mountlist_mtx); + + free(buf, M_TEMP); free(flep, M_TEMP); return (error); } /* * Filler function for proc/partitions */ static int linprocfs_dopartitions(PFS_FILL_ARGS) { struct g_class *cp; struct g_geom *gp; struct g_provider *pp; int major, minor; g_topology_lock(); sbuf_printf(sb, "major minor #blocks name rio rmerge rsect " "ruse wio wmerge wsect wuse running use aveq\n"); LIST_FOREACH(cp, &g_classes, class) { if (strcmp(cp->name, "DISK") == 0 || strcmp(cp->name, "PART") == 0) LIST_FOREACH(gp, &cp->geom, geom) { LIST_FOREACH(pp, &gp->provider, provider) { if (linux_driver_get_major_minor( pp->name, &major, &minor) != 0) { major = 0; minor = 0; } sbuf_printf(sb, "%d %d %lld %s " "%d %d %d %d %d " "%d %d %d %d %d %d\n", major, minor, (long long)pp->mediasize, pp->name, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0); } } } g_topology_unlock(); return (0); } /* * Filler function for proc/stat */ static int linprocfs_dostat(PFS_FILL_ARGS) { struct pcpu *pcpu; long cp_time[CPUSTATES]; long *cp; int i; read_cpu_time(cp_time); sbuf_printf(sb, "cpu %ld %ld %ld %ld\n", T2J(cp_time[CP_USER]), T2J(cp_time[CP_NICE]), T2J(cp_time[CP_SYS] /*+ cp_time[CP_INTR]*/), T2J(cp_time[CP_IDLE])); CPU_FOREACH(i) { pcpu = pcpu_find(i); cp = pcpu->pc_cp_time; sbuf_printf(sb, "cpu%d %ld %ld %ld %ld\n", i, T2J(cp[CP_USER]), T2J(cp[CP_NICE]), T2J(cp[CP_SYS] /*+ cp[CP_INTR]*/), T2J(cp[CP_IDLE])); } sbuf_printf(sb, "disk 0 0 0 0\n" "page %u %u\n" "swap %u %u\n" "intr %u\n" "ctxt %u\n" "btime %lld\n", vm_cnt.v_vnodepgsin, vm_cnt.v_vnodepgsout, vm_cnt.v_swappgsin, vm_cnt.v_swappgsout, vm_cnt.v_intr, vm_cnt.v_swtch, (long long)boottime.tv_sec); return (0); } static int linprocfs_doswaps(PFS_FILL_ARGS) { struct xswdev xsw; uintmax_t total, used; int n; char devname[SPECNAMELEN + 1]; sbuf_printf(sb, "Filename\t\t\t\tType\t\tSize\tUsed\tPriority\n"); mtx_lock(&Giant); for (n = 0; ; n++) { if (swap_dev_info(n, &xsw, devname, sizeof(devname)) != 0) break; total = (uintmax_t)xsw.xsw_nblks * PAGE_SIZE / 1024; used = (uintmax_t)xsw.xsw_used * PAGE_SIZE / 1024; /* * The space and not tab after the device name is on * purpose. Linux does so. */ sbuf_printf(sb, "/dev/%-34s unknown\t\t%jd\t%jd\t-1\n", devname, total, used); } mtx_unlock(&Giant); return (0); } /* * Filler function for proc/uptime */ static int linprocfs_douptime(PFS_FILL_ARGS) { long cp_time[CPUSTATES]; struct timeval tv; getmicrouptime(&tv); read_cpu_time(cp_time); sbuf_printf(sb, "%lld.%02ld %ld.%02lu\n", (long long)tv.tv_sec, tv.tv_usec / 10000, T2S(cp_time[CP_IDLE] / mp_ncpus), T2CS(cp_time[CP_IDLE] / mp_ncpus) % 100); return (0); } /* * Get OS build date */ static void linprocfs_osbuild(struct thread *td, struct sbuf *sb) { #if 0 char osbuild[256]; char *cp1, *cp2; strncpy(osbuild, version, 256); osbuild[255] = '\0'; cp1 = strstr(osbuild, "\n"); cp2 = strstr(osbuild, ":"); if (cp1 && cp2) { *cp1 = *cp2 = '\0'; cp1 = strstr(osbuild, "#"); } else cp1 = NULL; if (cp1) sbuf_printf(sb, "%s%s", cp1, cp2 + 1); else #endif sbuf_cat(sb, "#4 Sun Dec 18 04:30:00 CET 1977"); } /* * Get OS builder */ static void linprocfs_osbuilder(struct thread *td, struct sbuf *sb) { #if 0 char builder[256]; char *cp; cp = strstr(version, "\n "); if (cp) { strncpy(builder, cp + 5, 256); builder[255] = '\0'; cp = strstr(builder, ":"); if (cp) *cp = '\0'; } if (cp) sbuf_cat(sb, builder); else #endif sbuf_cat(sb, "des@freebsd.org"); } /* * Filler function for proc/version */ static int linprocfs_doversion(PFS_FILL_ARGS) { char osname[LINUX_MAX_UTSNAME]; char osrelease[LINUX_MAX_UTSNAME]; linux_get_osname(td, osname); linux_get_osrelease(td, osrelease); sbuf_printf(sb, "%s version %s (", osname, osrelease); linprocfs_osbuilder(td, sb); sbuf_cat(sb, ") (gcc version " __VERSION__ ") "); linprocfs_osbuild(td, sb); sbuf_cat(sb, "\n"); return (0); } /* * Filler function for proc/loadavg */ static int linprocfs_doloadavg(PFS_FILL_ARGS) { sbuf_printf(sb, "%d.%02d %d.%02d %d.%02d %d/%d %d\n", (int)(averunnable.ldavg[0] / averunnable.fscale), (int)(averunnable.ldavg[0] * 100 / averunnable.fscale % 100), (int)(averunnable.ldavg[1] / averunnable.fscale), (int)(averunnable.ldavg[1] * 100 / averunnable.fscale % 100), (int)(averunnable.ldavg[2] / averunnable.fscale), (int)(averunnable.ldavg[2] * 100 / averunnable.fscale % 100), 1, /* number of running tasks */ nprocs, /* number of tasks */ lastpid /* the last pid */ ); return (0); } /* * Filler function for proc/pid/stat */ static int linprocfs_doprocstat(PFS_FILL_ARGS) { struct kinfo_proc kp; char state; static int ratelimit = 0; vm_offset_t startcode, startdata; sx_slock(&proctree_lock); PROC_LOCK(p); fill_kinfo_proc(p, &kp); sx_sunlock(&proctree_lock); if (p->p_vmspace) { startcode = (vm_offset_t)p->p_vmspace->vm_taddr; startdata = (vm_offset_t)p->p_vmspace->vm_daddr; } else { startcode = 0; startdata = 0; }; sbuf_printf(sb, "%d", p->p_pid); #define PS_ADD(name, fmt, arg) sbuf_printf(sb, " " fmt, arg) PS_ADD("comm", "(%s)", p->p_comm); if (kp.ki_stat > sizeof(linux_state)) { state = 'R'; if (ratelimit == 0) { printf("linprocfs: don't know how to handle unknown FreeBSD state %d/%zd, mapping to R\n", kp.ki_stat, sizeof(linux_state)); ++ratelimit; } } else state = linux_state[kp.ki_stat - 1]; PS_ADD("state", "%c", state); PS_ADD("ppid", "%d", p->p_pptr ? p->p_pptr->p_pid : 0); PS_ADD("pgrp", "%d", p->p_pgid); PS_ADD("session", "%d", p->p_session->s_sid); PROC_UNLOCK(p); PS_ADD("tty", "%ju", (uintmax_t)kp.ki_tdev); PS_ADD("tpgid", "%d", kp.ki_tpgid); PS_ADD("flags", "%u", 0); /* XXX */ PS_ADD("minflt", "%lu", kp.ki_rusage.ru_minflt); PS_ADD("cminflt", "%lu", kp.ki_rusage_ch.ru_minflt); PS_ADD("majflt", "%lu", kp.ki_rusage.ru_majflt); PS_ADD("cmajflt", "%lu", kp.ki_rusage_ch.ru_majflt); PS_ADD("utime", "%ld", TV2J(&kp.ki_rusage.ru_utime)); PS_ADD("stime", "%ld", TV2J(&kp.ki_rusage.ru_stime)); PS_ADD("cutime", "%ld", TV2J(&kp.ki_rusage_ch.ru_utime)); PS_ADD("cstime", "%ld", TV2J(&kp.ki_rusage_ch.ru_stime)); PS_ADD("priority", "%d", kp.ki_pri.pri_user); PS_ADD("nice", "%d", kp.ki_nice); /* 19 (nicest) to -19 */ PS_ADD("0", "%d", 0); /* removed field */ PS_ADD("itrealvalue", "%d", 0); /* XXX */ PS_ADD("starttime", "%lu", TV2J(&kp.ki_start) - TV2J(&boottime)); PS_ADD("vsize", "%ju", P2K((uintmax_t)kp.ki_size)); PS_ADD("rss", "%ju", (uintmax_t)kp.ki_rssize); PS_ADD("rlim", "%lu", kp.ki_rusage.ru_maxrss); PS_ADD("startcode", "%ju", (uintmax_t)startcode); PS_ADD("endcode", "%ju", (uintmax_t)startdata); PS_ADD("startstack", "%u", 0); /* XXX */ PS_ADD("kstkesp", "%u", 0); /* XXX */ PS_ADD("kstkeip", "%u", 0); /* XXX */ PS_ADD("signal", "%u", 0); /* XXX */ PS_ADD("blocked", "%u", 0); /* XXX */ PS_ADD("sigignore", "%u", 0); /* XXX */ PS_ADD("sigcatch", "%u", 0); /* XXX */ PS_ADD("wchan", "%u", 0); /* XXX */ PS_ADD("nswap", "%lu", kp.ki_rusage.ru_nswap); PS_ADD("cnswap", "%lu", kp.ki_rusage_ch.ru_nswap); PS_ADD("exitsignal", "%d", 0); /* XXX */ PS_ADD("processor", "%u", kp.ki_lastcpu); PS_ADD("rt_priority", "%u", 0); /* XXX */ /* >= 2.5.19 */ PS_ADD("policy", "%u", kp.ki_pri.pri_class); /* >= 2.5.19 */ #undef PS_ADD sbuf_putc(sb, '\n'); return (0); } /* * Filler function for proc/pid/statm */ static int linprocfs_doprocstatm(PFS_FILL_ARGS) { struct kinfo_proc kp; segsz_t lsize; sx_slock(&proctree_lock); PROC_LOCK(p); fill_kinfo_proc(p, &kp); PROC_UNLOCK(p); sx_sunlock(&proctree_lock); /* * See comments in linprocfs_doprocstatus() regarding the * computation of lsize. */ /* size resident share trs drs lrs dt */ sbuf_printf(sb, "%ju ", B2P((uintmax_t)kp.ki_size)); sbuf_printf(sb, "%ju ", (uintmax_t)kp.ki_rssize); sbuf_printf(sb, "%ju ", (uintmax_t)0); /* XXX */ sbuf_printf(sb, "%ju ", (uintmax_t)kp.ki_tsize); sbuf_printf(sb, "%ju ", (uintmax_t)(kp.ki_dsize + kp.ki_ssize)); lsize = B2P(kp.ki_size) - kp.ki_dsize - kp.ki_ssize - kp.ki_tsize - 1; sbuf_printf(sb, "%ju ", (uintmax_t)lsize); sbuf_printf(sb, "%ju\n", (uintmax_t)0); /* XXX */ return (0); } /* * Filler function for proc/pid/status */ static int linprocfs_doprocstatus(PFS_FILL_ARGS) { struct kinfo_proc kp; char *state; segsz_t lsize; struct thread *td2; struct sigacts *ps; int i; sx_slock(&proctree_lock); PROC_LOCK(p); td2 = FIRST_THREAD_IN_PROC(p); /* XXXKSE pretend only one thread */ if (P_SHOULDSTOP(p)) { state = "T (stopped)"; } else { switch(p->p_state) { case PRS_NEW: state = "I (idle)"; break; case PRS_NORMAL: if (p->p_flag & P_WEXIT) { state = "X (exiting)"; break; } switch(td2->td_state) { case TDS_INHIBITED: state = "S (sleeping)"; break; case TDS_RUNQ: case TDS_RUNNING: state = "R (running)"; break; default: state = "? (unknown)"; break; } break; case PRS_ZOMBIE: state = "Z (zombie)"; break; default: state = "? (unknown)"; break; } } fill_kinfo_proc(p, &kp); sx_sunlock(&proctree_lock); sbuf_printf(sb, "Name:\t%s\n", p->p_comm); /* XXX escape */ sbuf_printf(sb, "State:\t%s\n", state); /* * Credentials */ sbuf_printf(sb, "Pid:\t%d\n", p->p_pid); sbuf_printf(sb, "PPid:\t%d\n", p->p_pptr ? p->p_pptr->p_pid : 0); sbuf_printf(sb, "Uid:\t%d %d %d %d\n", p->p_ucred->cr_ruid, p->p_ucred->cr_uid, p->p_ucred->cr_svuid, /* FreeBSD doesn't have fsuid */ p->p_ucred->cr_uid); sbuf_printf(sb, "Gid:\t%d %d %d %d\n", p->p_ucred->cr_rgid, p->p_ucred->cr_gid, p->p_ucred->cr_svgid, /* FreeBSD doesn't have fsgid */ p->p_ucred->cr_gid); sbuf_cat(sb, "Groups:\t"); for (i = 0; i < p->p_ucred->cr_ngroups; i++) sbuf_printf(sb, "%d ", p->p_ucred->cr_groups[i]); PROC_UNLOCK(p); sbuf_putc(sb, '\n'); /* * Memory * * While our approximation of VmLib may not be accurate (I * don't know of a simple way to verify it, and I'm not sure * it has much meaning anyway), I believe it's good enough. * * The same code that could (I think) accurately compute VmLib * could also compute VmLck, but I don't really care enough to * implement it. Submissions are welcome. */ sbuf_printf(sb, "VmSize:\t%8ju kB\n", B2K((uintmax_t)kp.ki_size)); sbuf_printf(sb, "VmLck:\t%8u kB\n", P2K(0)); /* XXX */ sbuf_printf(sb, "VmRSS:\t%8ju kB\n", P2K((uintmax_t)kp.ki_rssize)); sbuf_printf(sb, "VmData:\t%8ju kB\n", P2K((uintmax_t)kp.ki_dsize)); sbuf_printf(sb, "VmStk:\t%8ju kB\n", P2K((uintmax_t)kp.ki_ssize)); sbuf_printf(sb, "VmExe:\t%8ju kB\n", P2K((uintmax_t)kp.ki_tsize)); lsize = B2P(kp.ki_size) - kp.ki_dsize - kp.ki_ssize - kp.ki_tsize - 1; sbuf_printf(sb, "VmLib:\t%8ju kB\n", P2K((uintmax_t)lsize)); /* * Signal masks * * We support up to 128 signals, while Linux supports 32, * but we only define 32 (the same 32 as Linux, to boot), so * just show the lower 32 bits of each mask. XXX hack. * * NB: on certain platforms (Sparc at least) Linux actually * supports 64 signals, but this code is a long way from * running on anything but i386, so ignore that for now. */ PROC_LOCK(p); sbuf_printf(sb, "SigPnd:\t%08x\n", p->p_siglist.__bits[0]); /* * I can't seem to find out where the signal mask is in * relation to struct proc, so SigBlk is left unimplemented. */ sbuf_printf(sb, "SigBlk:\t%08x\n", 0); /* XXX */ ps = p->p_sigacts; mtx_lock(&ps->ps_mtx); sbuf_printf(sb, "SigIgn:\t%08x\n", ps->ps_sigignore.__bits[0]); sbuf_printf(sb, "SigCgt:\t%08x\n", ps->ps_sigcatch.__bits[0]); mtx_unlock(&ps->ps_mtx); PROC_UNLOCK(p); /* * Linux also prints the capability masks, but we don't have * capabilities yet, and when we do get them they're likely to * be meaningless to Linux programs, so we lie. XXX */ sbuf_printf(sb, "CapInh:\t%016x\n", 0); sbuf_printf(sb, "CapPrm:\t%016x\n", 0); sbuf_printf(sb, "CapEff:\t%016x\n", 0); return (0); } /* * Filler function for proc/pid/cwd */ static int linprocfs_doproccwd(PFS_FILL_ARGS) { char *fullpath = "unknown"; char *freepath = NULL; vn_fullpath(td, p->p_fd->fd_cdir, &fullpath, &freepath); sbuf_printf(sb, "%s", fullpath); if (freepath) free(freepath, M_TEMP); return (0); } /* * Filler function for proc/pid/root */ static int linprocfs_doprocroot(PFS_FILL_ARGS) { struct vnode *rvp; char *fullpath = "unknown"; char *freepath = NULL; rvp = jailed(p->p_ucred) ? p->p_fd->fd_jdir : p->p_fd->fd_rdir; vn_fullpath(td, rvp, &fullpath, &freepath); sbuf_printf(sb, "%s", fullpath); if (freepath) free(freepath, M_TEMP); return (0); } /* * Filler function for proc/pid/cmdline */ static int linprocfs_doproccmdline(PFS_FILL_ARGS) { int ret; PROC_LOCK(p); if ((ret = p_cansee(td, p)) != 0) { PROC_UNLOCK(p); return (ret); } /* * Mimic linux behavior and pass only processes with usermode * address space as valid. Return zero silently otherwize. */ if (p->p_vmspace == &vmspace0) { PROC_UNLOCK(p); return (0); } if (p->p_args != NULL) { sbuf_bcpy(sb, p->p_args->ar_args, p->p_args->ar_length); PROC_UNLOCK(p); return (0); } if ((p->p_flag & P_SYSTEM) != 0) { PROC_UNLOCK(p); return (0); } PROC_UNLOCK(p); ret = proc_getargv(td, p, sb); return (ret); } /* * Filler function for proc/pid/environ */ static int linprocfs_doprocenviron(PFS_FILL_ARGS) { int ret; PROC_LOCK(p); if ((ret = p_candebug(td, p)) != 0) { PROC_UNLOCK(p); return (ret); } /* * Mimic linux behavior and pass only processes with usermode * address space as valid. Return zero silently otherwize. */ if (p->p_vmspace == &vmspace0) { PROC_UNLOCK(p); return (0); } if ((p->p_flag & P_SYSTEM) != 0) { PROC_UNLOCK(p); return (0); } PROC_UNLOCK(p); ret = proc_getenvv(td, p, sb); return (ret); } /* * Filler function for proc/pid/maps */ static int linprocfs_doprocmaps(PFS_FILL_ARGS) { struct vmspace *vm; vm_map_t map; vm_map_entry_t entry, tmp_entry; vm_object_t obj, tobj, lobj; vm_offset_t e_start, e_end; vm_ooffset_t off = 0; vm_prot_t e_prot; unsigned int last_timestamp; char *name = "", *freename = NULL; ino_t ino; int ref_count, shadow_count, flags; int error; struct vnode *vp; struct vattr vat; PROC_LOCK(p); error = p_candebug(td, p); PROC_UNLOCK(p); if (error) return (error); if (uio->uio_rw != UIO_READ) return (EOPNOTSUPP); error = 0; vm = vmspace_acquire_ref(p); if (vm == NULL) return (ESRCH); map = &vm->vm_map; vm_map_lock_read(map); for (entry = map->header.next; entry != &map->header; entry = entry->next) { name = ""; freename = NULL; if (entry->eflags & MAP_ENTRY_IS_SUB_MAP) continue; e_prot = entry->protection; e_start = entry->start; e_end = entry->end; obj = entry->object.vm_object; for (lobj = tobj = obj; tobj; tobj = tobj->backing_object) { VM_OBJECT_RLOCK(tobj); if (lobj != obj) VM_OBJECT_RUNLOCK(lobj); lobj = tobj; } last_timestamp = map->timestamp; vm_map_unlock_read(map); ino = 0; if (lobj) { off = IDX_TO_OFF(lobj->size); if (lobj->type == OBJT_VNODE) { vp = lobj->handle; if (vp) vref(vp); } else vp = NULL; if (lobj != obj) VM_OBJECT_RUNLOCK(lobj); flags = obj->flags; ref_count = obj->ref_count; shadow_count = obj->shadow_count; VM_OBJECT_RUNLOCK(obj); if (vp) { vn_fullpath(td, vp, &name, &freename); vn_lock(vp, LK_SHARED | LK_RETRY); VOP_GETATTR(vp, &vat, td->td_ucred); ino = vat.va_fileid; vput(vp); } } else { flags = 0; ref_count = 0; shadow_count = 0; } /* * format: * start, end, access, offset, major, minor, inode, name. */ error = sbuf_printf(sb, "%08lx-%08lx %s%s%s%s %08lx %02x:%02x %lu%s%s\n", (u_long)e_start, (u_long)e_end, (e_prot & VM_PROT_READ)?"r":"-", (e_prot & VM_PROT_WRITE)?"w":"-", (e_prot & VM_PROT_EXECUTE)?"x":"-", "p", (u_long)off, 0, 0, (u_long)ino, *name ? " " : "", name ); if (freename) free(freename, M_TEMP); vm_map_lock_read(map); if (error == -1) { error = 0; break; } if (last_timestamp != map->timestamp) { /* * Look again for the entry because the map was * modified while it was unlocked. Specifically, * the entry may have been clipped, merged, or deleted. */ vm_map_lookup_entry(map, e_end - 1, &tmp_entry); entry = tmp_entry; } } vm_map_unlock_read(map); vmspace_free(vm); return (error); } /* * Filler function for proc/net/dev */ static int linprocfs_donetdev(PFS_FILL_ARGS) { char ifname[16]; /* XXX LINUX_IFNAMSIZ */ struct ifnet *ifp; sbuf_printf(sb, "%6s|%58s|%s\n" "%6s|%58s|%58s\n", "Inter-", " Receive", " Transmit", " face", "bytes packets errs drop fifo frame compressed multicast", "bytes packets errs drop fifo colls carrier compressed"); CURVNET_SET(TD_TO_VNET(curthread)); IFNET_RLOCK(); TAILQ_FOREACH(ifp, &V_ifnet, if_link) { linux_ifname(ifp, ifname, sizeof ifname); sbuf_printf(sb, "%6.6s: ", ifname); sbuf_printf(sb, "%7ju %7ju %4ju %4ju %4lu %5lu %10lu %9ju ", (uintmax_t )ifp->if_get_counter(ifp, IFCOUNTER_IBYTES), (uintmax_t )ifp->if_get_counter(ifp, IFCOUNTER_IPACKETS), (uintmax_t )ifp->if_get_counter(ifp, IFCOUNTER_IERRORS), (uintmax_t )ifp->if_get_counter(ifp, IFCOUNTER_IQDROPS), /* rx_missed_errors */ 0UL, /* rx_fifo_errors */ 0UL, /* rx_length_errors + * rx_over_errors + * rx_crc_errors + * rx_frame_errors */ 0UL, /* rx_compressed */ (uintmax_t )ifp->if_get_counter(ifp, IFCOUNTER_IMCASTS)); /* XXX-BZ rx only? */ sbuf_printf(sb, "%8ju %7ju %4ju %4ju %4lu %5ju %7lu %10lu\n", (uintmax_t )ifp->if_get_counter(ifp, IFCOUNTER_OBYTES), (uintmax_t )ifp->if_get_counter(ifp, IFCOUNTER_OPACKETS), (uintmax_t )ifp->if_get_counter(ifp, IFCOUNTER_OERRORS), (uintmax_t )ifp->if_get_counter(ifp, IFCOUNTER_OQDROPS), 0UL, /* tx_fifo_errors */ (uintmax_t )ifp->if_get_counter(ifp, IFCOUNTER_COLLISIONS), 0UL, /* tx_carrier_errors + * tx_aborted_errors + * tx_window_errors + * tx_heartbeat_errors*/ 0UL); /* tx_compressed */ } IFNET_RUNLOCK(); CURVNET_RESTORE(); return (0); } /* * Filler function for proc/sys/kernel/osrelease */ static int linprocfs_doosrelease(PFS_FILL_ARGS) { char osrelease[LINUX_MAX_UTSNAME]; linux_get_osrelease(td, osrelease); sbuf_printf(sb, "%s\n", osrelease); return (0); } /* * Filler function for proc/sys/kernel/ostype */ static int linprocfs_doostype(PFS_FILL_ARGS) { char osname[LINUX_MAX_UTSNAME]; linux_get_osname(td, osname); sbuf_printf(sb, "%s\n", osname); return (0); } /* * Filler function for proc/sys/kernel/version */ static int linprocfs_doosbuild(PFS_FILL_ARGS) { linprocfs_osbuild(td, sb); sbuf_cat(sb, "\n"); return (0); } /* * Filler function for proc/sys/kernel/msgmni */ static int linprocfs_domsgmni(PFS_FILL_ARGS) { sbuf_printf(sb, "%d\n", msginfo.msgmni); return (0); } /* * Filler function for proc/sys/kernel/pid_max */ static int linprocfs_dopid_max(PFS_FILL_ARGS) { sbuf_printf(sb, "%i\n", PID_MAX); return (0); } /* * Filler function for proc/sys/kernel/sem */ static int linprocfs_dosem(PFS_FILL_ARGS) { sbuf_printf(sb, "%d %d %d %d\n", seminfo.semmsl, seminfo.semmns, seminfo.semopm, seminfo.semmni); return (0); } /* * Filler function for proc/scsi/device_info */ static int linprocfs_doscsidevinfo(PFS_FILL_ARGS) { return (0); } /* * Filler function for proc/scsi/scsi */ static int linprocfs_doscsiscsi(PFS_FILL_ARGS) { return (0); } extern struct cdevsw *cdevsw[]; /* * Filler function for proc/devices */ static int linprocfs_dodevices(PFS_FILL_ARGS) { char *char_devices; sbuf_printf(sb, "Character devices:\n"); char_devices = linux_get_char_devices(); sbuf_printf(sb, "%s", char_devices); linux_free_get_char_devices(char_devices); sbuf_printf(sb, "\nBlock devices:\n"); return (0); } /* * Filler function for proc/cmdline */ static int linprocfs_docmdline(PFS_FILL_ARGS) { sbuf_printf(sb, "BOOT_IMAGE=%s", kernelname); sbuf_printf(sb, " ro root=302\n"); return (0); } /* * Filler function for proc/filesystems */ static int linprocfs_dofilesystems(PFS_FILL_ARGS) { struct vfsconf *vfsp; mtx_lock(&Giant); TAILQ_FOREACH(vfsp, &vfsconf, vfc_list) { if (vfsp->vfc_flags & VFCF_SYNTHETIC) sbuf_printf(sb, "nodev"); sbuf_printf(sb, "\t%s\n", vfsp->vfc_name); } mtx_unlock(&Giant); return(0); } #if 0 /* * Filler function for proc/modules */ static int linprocfs_domodules(PFS_FILL_ARGS) { struct linker_file *lf; TAILQ_FOREACH(lf, &linker_files, link) { sbuf_printf(sb, "%-20s%8lu%4d\n", lf->filename, (unsigned long)lf->size, lf->refs); } return (0); } #endif /* * Filler function for proc/pid/fd */ static int linprocfs_dofdescfs(PFS_FILL_ARGS) { if (p == curproc) sbuf_printf(sb, "/dev/fd"); else sbuf_printf(sb, "unknown"); return (0); } /* * Filler function for proc/sys/kernel/random/uuid */ static int linprocfs_douuid(PFS_FILL_ARGS) { struct uuid uuid; kern_uuidgen(&uuid, 1); sbuf_printf_uuid(sb, &uuid); sbuf_printf(sb, "\n"); return(0); } /* * Constructor */ static int linprocfs_init(PFS_INIT_ARGS) { struct pfs_node *root; struct pfs_node *dir; root = pi->pi_root; /* /proc/... */ pfs_create_file(root, "cmdline", &linprocfs_docmdline, NULL, NULL, NULL, PFS_RD); pfs_create_file(root, "cpuinfo", &linprocfs_docpuinfo, NULL, NULL, NULL, PFS_RD); pfs_create_file(root, "devices", &linprocfs_dodevices, NULL, NULL, NULL, PFS_RD); pfs_create_file(root, "filesystems", &linprocfs_dofilesystems, NULL, NULL, NULL, PFS_RD); pfs_create_file(root, "loadavg", &linprocfs_doloadavg, NULL, NULL, NULL, PFS_RD); pfs_create_file(root, "meminfo", &linprocfs_domeminfo, NULL, NULL, NULL, PFS_RD); #if 0 pfs_create_file(root, "modules", &linprocfs_domodules, NULL, NULL, NULL, PFS_RD); #endif pfs_create_file(root, "mounts", &linprocfs_domtab, NULL, NULL, NULL, PFS_RD); pfs_create_file(root, "mtab", &linprocfs_domtab, NULL, NULL, NULL, PFS_RD); pfs_create_file(root, "partitions", &linprocfs_dopartitions, NULL, NULL, NULL, PFS_RD); pfs_create_link(root, "self", &procfs_docurproc, NULL, NULL, NULL, 0); pfs_create_file(root, "stat", &linprocfs_dostat, NULL, NULL, NULL, PFS_RD); pfs_create_file(root, "swaps", &linprocfs_doswaps, NULL, NULL, NULL, PFS_RD); pfs_create_file(root, "uptime", &linprocfs_douptime, NULL, NULL, NULL, PFS_RD); pfs_create_file(root, "version", &linprocfs_doversion, NULL, NULL, NULL, PFS_RD); /* /proc/net/... */ dir = pfs_create_dir(root, "net", NULL, NULL, NULL, 0); pfs_create_file(dir, "dev", &linprocfs_donetdev, NULL, NULL, NULL, PFS_RD); /* /proc//... */ dir = pfs_create_dir(root, "pid", NULL, NULL, NULL, PFS_PROCDEP); pfs_create_file(dir, "cmdline", &linprocfs_doproccmdline, NULL, NULL, NULL, PFS_RD); pfs_create_link(dir, "cwd", &linprocfs_doproccwd, NULL, NULL, NULL, 0); pfs_create_file(dir, "environ", &linprocfs_doprocenviron, NULL, NULL, NULL, PFS_RD); pfs_create_link(dir, "exe", &procfs_doprocfile, NULL, &procfs_notsystem, NULL, 0); pfs_create_file(dir, "maps", &linprocfs_doprocmaps, NULL, NULL, NULL, PFS_RD); pfs_create_file(dir, "mem", &procfs_doprocmem, &procfs_attr, &procfs_candebug, NULL, PFS_RDWR|PFS_RAW); pfs_create_link(dir, "root", &linprocfs_doprocroot, NULL, NULL, NULL, 0); pfs_create_file(dir, "stat", &linprocfs_doprocstat, NULL, NULL, NULL, PFS_RD); pfs_create_file(dir, "statm", &linprocfs_doprocstatm, NULL, NULL, NULL, PFS_RD); pfs_create_file(dir, "status", &linprocfs_doprocstatus, NULL, NULL, NULL, PFS_RD); pfs_create_link(dir, "fd", &linprocfs_dofdescfs, NULL, NULL, NULL, 0); /* /proc/scsi/... */ dir = pfs_create_dir(root, "scsi", NULL, NULL, NULL, 0); pfs_create_file(dir, "device_info", &linprocfs_doscsidevinfo, NULL, NULL, NULL, PFS_RD); pfs_create_file(dir, "scsi", &linprocfs_doscsiscsi, NULL, NULL, NULL, PFS_RD); /* /proc/sys/... */ dir = pfs_create_dir(root, "sys", NULL, NULL, NULL, 0); /* /proc/sys/kernel/... */ dir = pfs_create_dir(dir, "kernel", NULL, NULL, NULL, 0); pfs_create_file(dir, "osrelease", &linprocfs_doosrelease, NULL, NULL, NULL, PFS_RD); pfs_create_file(dir, "ostype", &linprocfs_doostype, NULL, NULL, NULL, PFS_RD); pfs_create_file(dir, "version", &linprocfs_doosbuild, NULL, NULL, NULL, PFS_RD); pfs_create_file(dir, "msgmni", &linprocfs_domsgmni, NULL, NULL, NULL, PFS_RD); pfs_create_file(dir, "pid_max", &linprocfs_dopid_max, NULL, NULL, NULL, PFS_RD); pfs_create_file(dir, "sem", &linprocfs_dosem, NULL, NULL, NULL, PFS_RD); /* /proc/sys/kernel/random/... */ dir = pfs_create_dir(dir, "random", NULL, NULL, NULL, 0); pfs_create_file(dir, "uuid", &linprocfs_douuid, NULL, NULL, NULL, PFS_RD); return (0); } /* * Destructor */ static int linprocfs_uninit(PFS_INIT_ARGS) { /* nothing to do, pseudofs will GC */ return (0); } PSEUDOFS(linprocfs, 1, 0); MODULE_DEPEND(linprocfs, linux, 1, 1, 1); MODULE_DEPEND(linprocfs, procfs, 1, 1, 1); MODULE_DEPEND(linprocfs, sysvmsg, 1, 1, 1); MODULE_DEPEND(linprocfs, sysvsem, 1, 1, 1); Index: user/ngie/more-tests/sys/dev/sound/pci/hda/hdaa_patches.c =================================================================== --- user/ngie/more-tests/sys/dev/sound/pci/hda/hdaa_patches.c (revision 281584) +++ user/ngie/more-tests/sys/dev/sound/pci/hda/hdaa_patches.c (revision 281585) @@ -1,739 +1,746 @@ /*- * Copyright (c) 2006 Stephane E. Potvin * Copyright (c) 2006 Ariff Abdullah * Copyright (c) 2008-2012 Alexander Motin * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ /* * Intel High Definition Audio (Audio function quirks) driver for FreeBSD. */ #ifdef HAVE_KERNEL_OPTION_HEADERS #include "opt_snd.h" #endif #include #include #include #include #include SND_DECLARE_FILE("$FreeBSD$"); static const struct { uint32_t model; uint32_t id; uint32_t subsystemid; uint32_t set, unset; uint32_t gpio; } hdac_quirks[] = { /* * XXX Force stereo quirk. Monoural recording / playback * on few codecs (especially ALC880) seems broken or * perhaps unsupported. */ { HDA_MATCH_ALL, HDA_MATCH_ALL, HDA_MATCH_ALL, HDAA_QUIRK_FORCESTEREO | HDAA_QUIRK_IVREF, 0, 0 }, { ACER_ALL_SUBVENDOR, HDA_MATCH_ALL, HDA_MATCH_ALL, 0, 0, HDAA_GPIO_SET(0) }, { ASUS_G2K_SUBVENDOR, HDA_CODEC_ALC660, HDA_MATCH_ALL, 0, 0, HDAA_GPIO_SET(0) }, { ASUS_M5200_SUBVENDOR, HDA_CODEC_ALC880, HDA_MATCH_ALL, 0, 0, HDAA_GPIO_SET(0) }, { ASUS_A7M_SUBVENDOR, HDA_CODEC_ALC880, HDA_MATCH_ALL, 0, 0, HDAA_GPIO_SET(0) }, { ASUS_A7T_SUBVENDOR, HDA_CODEC_ALC882, HDA_MATCH_ALL, 0, 0, HDAA_GPIO_SET(0) }, { ASUS_W2J_SUBVENDOR, HDA_CODEC_ALC882, HDA_MATCH_ALL, 0, 0, HDAA_GPIO_SET(0) }, { ASUS_U5F_SUBVENDOR, HDA_CODEC_AD1986A, HDA_MATCH_ALL, HDAA_QUIRK_EAPDINV, 0, 0 }, { ASUS_A8X_SUBVENDOR, HDA_CODEC_AD1986A, HDA_MATCH_ALL, HDAA_QUIRK_EAPDINV, 0, 0 }, { ASUS_F3JC_SUBVENDOR, HDA_CODEC_ALC861, HDA_MATCH_ALL, HDAA_QUIRK_OVREF, 0, 0 }, { UNIWILL_9075_SUBVENDOR, HDA_CODEC_ALC861, HDA_MATCH_ALL, HDAA_QUIRK_OVREF, 0, 0 }, /*{ ASUS_M2N_SUBVENDOR, HDA_CODEC_AD1988, HDA_MATCH_ALL, HDAA_QUIRK_IVREF80, HDAA_QUIRK_IVREF50 | HDAA_QUIRK_IVREF100, 0 },*/ { MEDION_MD95257_SUBVENDOR, HDA_CODEC_ALC880, HDA_MATCH_ALL, 0, 0, HDAA_GPIO_SET(1) }, { LENOVO_3KN100_SUBVENDOR, HDA_CODEC_AD1986A, HDA_MATCH_ALL, HDAA_QUIRK_EAPDINV | HDAA_QUIRK_SENSEINV, 0, 0 }, { SAMSUNG_Q1_SUBVENDOR, HDA_CODEC_AD1986A, HDA_MATCH_ALL, HDAA_QUIRK_EAPDINV, 0, 0 }, { APPLE_MB3_SUBVENDOR, HDA_CODEC_ALC885, HDA_MATCH_ALL, HDAA_QUIRK_OVREF50, 0, HDAA_GPIO_SET(0) }, { APPLE_INTEL_MAC, HDA_CODEC_STAC9221, HDA_MATCH_ALL, 0, 0, HDAA_GPIO_SET(0) | HDAA_GPIO_SET(1) }, { APPLE_MACBOOKAIR31, HDA_CODEC_CS4206, HDA_MATCH_ALL, 0, 0, HDAA_GPIO_SET(1) | HDAA_GPIO_SET(3) }, { APPLE_MACBOOKPRO55, HDA_CODEC_CS4206, HDA_MATCH_ALL, 0, 0, HDAA_GPIO_SET(1) | HDAA_GPIO_SET(3) }, { APPLE_MACBOOKPRO71, HDA_CODEC_CS4206, HDA_MATCH_ALL, 0, 0, HDAA_GPIO_SET(1) | HDAA_GPIO_SET(3) }, { HDA_INTEL_MACBOOKPRO92, HDA_CODEC_CS4206, HDA_MATCH_ALL, 0, 0, HDAA_GPIO_SET(1) | HDAA_GPIO_SET(3) }, { DELL_D630_SUBVENDOR, HDA_CODEC_STAC9205X, HDA_MATCH_ALL, 0, 0, HDAA_GPIO_SET(0) }, { DELL_V1400_SUBVENDOR, HDA_CODEC_STAC9228X, HDA_MATCH_ALL, 0, 0, HDAA_GPIO_SET(2) }, { DELL_V1500_SUBVENDOR, HDA_CODEC_STAC9205X, HDA_MATCH_ALL, 0, 0, HDAA_GPIO_SET(0) }, { HDA_MATCH_ALL, HDA_CODEC_AD1988, HDA_MATCH_ALL, HDAA_QUIRK_IVREF80, HDAA_QUIRK_IVREF50 | HDAA_QUIRK_IVREF100, 0 }, { HDA_MATCH_ALL, HDA_CODEC_AD1988B, HDA_MATCH_ALL, HDAA_QUIRK_IVREF80, HDAA_QUIRK_IVREF50 | HDAA_QUIRK_IVREF100, 0 }, { HDA_MATCH_ALL, HDA_CODEC_CX20549, HDA_MATCH_ALL, 0, HDAA_QUIRK_FORCESTEREO, 0 }, /* Mac Pro 1,1 requires ovref for proper volume level. */ { 0x00000000, HDA_CODEC_ALC885, 0x106b0c00, 0, HDAA_QUIRK_OVREF, 0 } }; static void hdac_pin_patch(struct hdaa_widget *w) { const char *patch = NULL; uint32_t config, orig, id, subid; nid_t nid = w->nid; config = orig = w->wclass.pin.config; id = hdaa_codec_id(w->devinfo); subid = hdaa_card_id(w->devinfo); /* XXX: Old patches require complete review. * Now they may create more problem then solve due to * incorrect associations. */ if (id == HDA_CODEC_ALC880 && subid == LG_LW20_SUBVENDOR) { switch (nid) { case 26: config &= ~HDA_CONFIG_DEFAULTCONF_DEVICE_MASK; config |= HDA_CONFIG_DEFAULTCONF_DEVICE_LINE_IN; break; case 27: config &= ~HDA_CONFIG_DEFAULTCONF_DEVICE_MASK; config |= HDA_CONFIG_DEFAULTCONF_DEVICE_HP_OUT; break; default: break; } } else if (id == HDA_CODEC_ALC880 && (subid == CLEVO_D900T_SUBVENDOR || subid == ASUS_M5200_SUBVENDOR)) { /* * Super broken BIOS */ switch (nid) { case 24: /* MIC1 */ config &= ~HDA_CONFIG_DEFAULTCONF_DEVICE_MASK; config |= HDA_CONFIG_DEFAULTCONF_DEVICE_MIC_IN; break; case 25: /* XXX MIC2 */ config &= ~HDA_CONFIG_DEFAULTCONF_DEVICE_MASK; config |= HDA_CONFIG_DEFAULTCONF_DEVICE_MIC_IN; break; case 26: /* LINE1 */ config &= ~HDA_CONFIG_DEFAULTCONF_DEVICE_MASK; config |= HDA_CONFIG_DEFAULTCONF_DEVICE_LINE_IN; break; case 27: /* XXX LINE2 */ config &= ~HDA_CONFIG_DEFAULTCONF_DEVICE_MASK; config |= HDA_CONFIG_DEFAULTCONF_DEVICE_LINE_IN; break; case 28: /* CD */ config &= ~HDA_CONFIG_DEFAULTCONF_DEVICE_MASK; config |= HDA_CONFIG_DEFAULTCONF_DEVICE_CD; break; } } else if (id == HDA_CODEC_ALC883 && (subid == MSI_MS034A_SUBVENDOR || HDA_DEV_MATCH(ACER_ALL_SUBVENDOR, subid))) { switch (nid) { case 25: config &= ~(HDA_CONFIG_DEFAULTCONF_DEVICE_MASK | HDA_CONFIG_DEFAULTCONF_CONNECTIVITY_MASK); config |= (HDA_CONFIG_DEFAULTCONF_DEVICE_MIC_IN | HDA_CONFIG_DEFAULTCONF_CONNECTIVITY_FIXED); break; case 28: config &= ~(HDA_CONFIG_DEFAULTCONF_DEVICE_MASK | HDA_CONFIG_DEFAULTCONF_CONNECTIVITY_MASK); config |= (HDA_CONFIG_DEFAULTCONF_DEVICE_CD | HDA_CONFIG_DEFAULTCONF_CONNECTIVITY_FIXED); break; } } else if (id == HDA_CODEC_CX20549 && subid == HP_V3000_SUBVENDOR) { switch (nid) { case 18: config &= ~HDA_CONFIG_DEFAULTCONF_CONNECTIVITY_MASK; config |= HDA_CONFIG_DEFAULTCONF_CONNECTIVITY_NONE; break; case 20: config &= ~(HDA_CONFIG_DEFAULTCONF_DEVICE_MASK | HDA_CONFIG_DEFAULTCONF_CONNECTIVITY_MASK); config |= (HDA_CONFIG_DEFAULTCONF_DEVICE_MIC_IN | HDA_CONFIG_DEFAULTCONF_CONNECTIVITY_FIXED); break; case 21: config &= ~(HDA_CONFIG_DEFAULTCONF_DEVICE_MASK | HDA_CONFIG_DEFAULTCONF_CONNECTIVITY_MASK); config |= (HDA_CONFIG_DEFAULTCONF_DEVICE_CD | HDA_CONFIG_DEFAULTCONF_CONNECTIVITY_FIXED); break; } } else if (id == HDA_CODEC_CX20551 && subid == HP_DV5000_SUBVENDOR) { switch (nid) { case 20: case 21: config &= ~HDA_CONFIG_DEFAULTCONF_CONNECTIVITY_MASK; config |= HDA_CONFIG_DEFAULTCONF_CONNECTIVITY_NONE; break; } } else if (id == HDA_CODEC_ALC861 && subid == ASUS_W6F_SUBVENDOR) { switch (nid) { case 11: config &= ~(HDA_CONFIG_DEFAULTCONF_DEVICE_MASK | HDA_CONFIG_DEFAULTCONF_CONNECTIVITY_MASK); config |= (HDA_CONFIG_DEFAULTCONF_DEVICE_LINE_OUT | HDA_CONFIG_DEFAULTCONF_CONNECTIVITY_FIXED); break; case 12: case 14: case 16: case 31: case 32: config &= ~(HDA_CONFIG_DEFAULTCONF_DEVICE_MASK | HDA_CONFIG_DEFAULTCONF_CONNECTIVITY_MASK); config |= (HDA_CONFIG_DEFAULTCONF_DEVICE_MIC_IN | HDA_CONFIG_DEFAULTCONF_CONNECTIVITY_FIXED); break; case 15: config &= ~(HDA_CONFIG_DEFAULTCONF_DEVICE_MASK | HDA_CONFIG_DEFAULTCONF_CONNECTIVITY_MASK); config |= (HDA_CONFIG_DEFAULTCONF_DEVICE_HP_OUT | HDA_CONFIG_DEFAULTCONF_CONNECTIVITY_JACK); break; } } else if (id == HDA_CODEC_ALC861 && subid == UNIWILL_9075_SUBVENDOR) { switch (nid) { case 15: config &= ~(HDA_CONFIG_DEFAULTCONF_DEVICE_MASK | HDA_CONFIG_DEFAULTCONF_CONNECTIVITY_MASK); config |= (HDA_CONFIG_DEFAULTCONF_DEVICE_HP_OUT | HDA_CONFIG_DEFAULTCONF_CONNECTIVITY_JACK); break; } } /* New patches */ if (id == HDA_CODEC_AD1984A && subid == LENOVO_X300_SUBVENDOR) { switch (nid) { case 17: /* Headphones with redirection */ patch = "as=1 seq=15"; break; case 20: /* Two mics together */ patch = "as=2 seq=15"; break; } } else if (id == HDA_CODEC_AD1986A && (subid == ASUS_M2NPVMX_SUBVENDOR || subid == ASUS_A8NVMCSM_SUBVENDOR || subid == ASUS_P5PL2_SUBVENDOR)) { switch (nid) { case 26: /* Headphones with redirection */ patch = "as=1 seq=15"; break; case 28: /* 5.1 out => 2.0 out + 1 input */ patch = "device=Line-in as=8 seq=1"; break; case 29: /* Can't use this as input, as the only available mic * preamplifier is busy by front panel mic (nid 31). * If you want to use this rear connector as mic input, * you have to disable the front panel one. */ patch = "as=0"; break; case 31: /* Lot of inputs configured with as=15 and unusable */ patch = "as=8 seq=3"; break; case 32: patch = "as=8 seq=4"; break; case 34: patch = "as=8 seq=5"; break; case 36: patch = "as=8 seq=6"; break; } } else if (id == HDA_CODEC_ALC260 && HDA_DEV_MATCH(SONY_S5_SUBVENDOR, subid)) { switch (nid) { case 16: patch = "seq=15 device=Headphones"; break; } } else if (id == HDA_CODEC_ALC268) { if (subid == ACER_T5320_SUBVENDOR) { switch (nid) { case 20: /* Headphones Jack */ patch = "as=1 seq=15"; break; } } } else if (id == HDA_CODEC_CX20561 && subid == LENOVO_B450_SUBVENDOR) { switch (nid) { case 22: patch = "as=1 seq=15"; break; } } else if (id == HDA_CODEC_CX20561 && subid == LENOVO_T400_SUBVENDOR) { switch (nid) { case 22: patch = "as=1 seq=15"; break; case 26: patch = "as=1 seq=0"; break; } } else if (id == HDA_CODEC_CX20590 && (subid == LENOVO_X1_SUBVENDOR || subid == LENOVO_X220_SUBVENDOR || subid == LENOVO_T420_SUBVENDOR || subid == LENOVO_T520_SUBVENDOR || subid == LENOVO_G580_SUBVENDOR)) { switch (nid) { case 25: patch = "as=1 seq=15"; break; /* * Group onboard mic and headphone mic * together. Fixes onboard mic. */ case 27: patch = "as=2 seq=15"; break; case 35: patch = "as=2"; break; } } else if (id == HDA_CODEC_ALC269 && (subid == LENOVO_X1CRBN_SUBVENDOR || subid == LENOVO_T430_SUBVENDOR || subid == LENOVO_T430S_SUBVENDOR || subid == LENOVO_T530_SUBVENDOR)) { switch (nid) { case 21: patch = "as=1 seq=15"; break; } } else if (id == HDA_CODEC_ALC269 && subid == ASUS_UX31A_SUBVENDOR) { switch (nid) { case 33: patch = "as=1 seq=15"; break; } } else if (id == HDA_CODEC_ALC892 && subid == INTEL_DH87RL_SUBVENDOR) { switch (nid) { case 27: patch = "as=1 seq=15"; break; } + } else if (id == HDA_CODEC_ALC292 && + subid == LENOVO_X120BS_SUBVENDOR) { + switch (nid) { + case 21: + patch = "as=1 seq=15"; + break; + } } if (patch != NULL) config = hdaa_widget_pin_patch(config, patch); HDA_BOOTVERBOSE( if (config != orig) device_printf(w->devinfo->dev, "Patching pin config nid=%u 0x%08x -> 0x%08x\n", nid, orig, config); ); w->wclass.pin.config = config; } static void hdaa_widget_patch(struct hdaa_widget *w) { struct hdaa_devinfo *devinfo = w->devinfo; uint32_t orig; nid_t beeper = -1; orig = w->param.widget_cap; /* On some codecs beeper is an input pin, but it is not recordable alone. Also most of BIOSes does not declare beeper pin. Change beeper pin node type to beeper to help parser. */ switch (hdaa_codec_id(devinfo)) { case HDA_CODEC_AD1882: case HDA_CODEC_AD1883: case HDA_CODEC_AD1984: case HDA_CODEC_AD1984A: case HDA_CODEC_AD1984B: case HDA_CODEC_AD1987: case HDA_CODEC_AD1988: case HDA_CODEC_AD1988B: case HDA_CODEC_AD1989B: beeper = 26; break; case HDA_CODEC_ALC260: beeper = 23; break; } if (hda_get_vendor_id(devinfo->dev) == REALTEK_VENDORID && hdaa_codec_id(devinfo) != HDA_CODEC_ALC260) beeper = 29; if (w->nid == beeper) { w->param.widget_cap &= ~HDA_PARAM_AUDIO_WIDGET_CAP_TYPE_MASK; w->param.widget_cap |= HDA_PARAM_AUDIO_WIDGET_CAP_TYPE_BEEP_WIDGET << HDA_PARAM_AUDIO_WIDGET_CAP_TYPE_SHIFT; w->waspin = 1; } /* * Clear "digital" flag from digital mic input, as its signal then goes * to "analog" mixer and this separation just limits functionaity. */ if (hdaa_codec_id(devinfo) == HDA_CODEC_AD1984A && w->nid == 23) w->param.widget_cap &= ~HDA_PARAM_AUDIO_WIDGET_CAP_DIGITAL_MASK; HDA_BOOTVERBOSE( if (w->param.widget_cap != orig) { device_printf(w->devinfo->dev, "Patching widget caps nid=%u 0x%08x -> 0x%08x\n", w->nid, orig, w->param.widget_cap); } ); if (w->type == HDA_PARAM_AUDIO_WIDGET_CAP_TYPE_PIN_COMPLEX) hdac_pin_patch(w); } void hdaa_patch(struct hdaa_devinfo *devinfo) { struct hdaa_widget *w; uint32_t id, subid, subsystemid; int i; id = hdaa_codec_id(devinfo); subid = hdaa_card_id(devinfo); subsystemid = hda_get_subsystem_id(devinfo->dev); /* * Quirks */ for (i = 0; i < nitems(hdac_quirks); i++) { if (!(HDA_DEV_MATCH(hdac_quirks[i].model, subid) && HDA_DEV_MATCH(hdac_quirks[i].id, id) && HDA_DEV_MATCH(hdac_quirks[i].subsystemid, subsystemid))) continue; devinfo->quirks |= hdac_quirks[i].set; devinfo->quirks &= ~(hdac_quirks[i].unset); devinfo->gpio = hdac_quirks[i].gpio; } /* Apply per-widget patch. */ for (i = devinfo->startnode; i < devinfo->endnode; i++) { w = hdaa_widget_get(devinfo, i); if (w == NULL) continue; hdaa_widget_patch(w); } switch (id) { case HDA_CODEC_AD1983: /* * This CODEC has several possible usages, but none * fit the parser best. Help parser to choose better. */ /* Disable direct unmixed playback to get pcm volume. */ w = hdaa_widget_get(devinfo, 5); if (w != NULL) w->connsenable[0] = 0; w = hdaa_widget_get(devinfo, 6); if (w != NULL) w->connsenable[0] = 0; w = hdaa_widget_get(devinfo, 11); if (w != NULL) w->connsenable[0] = 0; /* Disable mic and line selectors. */ w = hdaa_widget_get(devinfo, 12); if (w != NULL) w->connsenable[1] = 0; w = hdaa_widget_get(devinfo, 13); if (w != NULL) w->connsenable[1] = 0; /* Disable recording from mono playback mix. */ w = hdaa_widget_get(devinfo, 20); if (w != NULL) w->connsenable[3] = 0; break; case HDA_CODEC_AD1986A: /* * This CODEC has overcomplicated input mixing. * Make some cleaning there. */ /* Disable input mono mixer. Not needed and not supported. */ w = hdaa_widget_get(devinfo, 43); if (w != NULL) w->enable = 0; /* Disable any with any input mixing mesh. Use separately. */ w = hdaa_widget_get(devinfo, 39); if (w != NULL) w->enable = 0; w = hdaa_widget_get(devinfo, 40); if (w != NULL) w->enable = 0; w = hdaa_widget_get(devinfo, 41); if (w != NULL) w->enable = 0; w = hdaa_widget_get(devinfo, 42); if (w != NULL) w->enable = 0; /* Disable duplicate mixer node connector. */ w = hdaa_widget_get(devinfo, 15); if (w != NULL) w->connsenable[3] = 0; /* There is only one mic preamplifier, use it effectively. */ w = hdaa_widget_get(devinfo, 31); if (w != NULL) { if ((w->wclass.pin.config & HDA_CONFIG_DEFAULTCONF_DEVICE_MASK) == HDA_CONFIG_DEFAULTCONF_DEVICE_MIC_IN) { w = hdaa_widget_get(devinfo, 16); if (w != NULL) w->connsenable[2] = 0; } else { w = hdaa_widget_get(devinfo, 15); if (w != NULL) w->connsenable[0] = 0; } } w = hdaa_widget_get(devinfo, 32); if (w != NULL) { if ((w->wclass.pin.config & HDA_CONFIG_DEFAULTCONF_DEVICE_MASK) == HDA_CONFIG_DEFAULTCONF_DEVICE_MIC_IN) { w = hdaa_widget_get(devinfo, 16); if (w != NULL) w->connsenable[0] = 0; } else { w = hdaa_widget_get(devinfo, 15); if (w != NULL) w->connsenable[1] = 0; } } if (subid == ASUS_A8X_SUBVENDOR) { /* * This is just plain ridiculous.. There * are several A8 series that share the same * pci id but works differently (EAPD). */ w = hdaa_widget_get(devinfo, 26); if (w != NULL && w->type == HDA_PARAM_AUDIO_WIDGET_CAP_TYPE_PIN_COMPLEX && (w->wclass.pin.config & HDA_CONFIG_DEFAULTCONF_CONNECTIVITY_MASK) != HDA_CONFIG_DEFAULTCONF_CONNECTIVITY_NONE) devinfo->quirks &= ~HDAA_QUIRK_EAPDINV; } break; case HDA_CODEC_AD1981HD: /* * This CODEC has very unusual design with several * points inappropriate for the present parser. */ /* Disable recording from mono playback mix. */ w = hdaa_widget_get(devinfo, 21); if (w != NULL) w->connsenable[3] = 0; /* Disable rear to front mic mixer, use separately. */ w = hdaa_widget_get(devinfo, 31); if (w != NULL) w->enable = 0; /* Disable direct playback, use mixer. */ w = hdaa_widget_get(devinfo, 5); if (w != NULL) w->connsenable[0] = 0; w = hdaa_widget_get(devinfo, 6); if (w != NULL) w->connsenable[0] = 0; w = hdaa_widget_get(devinfo, 9); if (w != NULL) w->connsenable[0] = 0; w = hdaa_widget_get(devinfo, 24); if (w != NULL) w->connsenable[0] = 0; break; case HDA_CODEC_ALC269: /* * ASUS EeePC 1001px has strange variant of ALC269 CODEC, * that mutes speaker if unused mixer at NID 15 is muted. * Probably CODEC incorrectly reports internal connections. * Hide that muter from the driver. There are several CODECs * sharing this ID and I have not enough information about * them to implement more universal solution. */ if (subid == 0x84371043) { w = hdaa_widget_get(devinfo, 15); if (w != NULL) w->param.inamp_cap = 0; } break; case HDA_CODEC_CX20582: case HDA_CODEC_CX20583: case HDA_CODEC_CX20584: case HDA_CODEC_CX20585: case HDA_CODEC_CX20590: /* * These codecs have extra connectivity on record side * too reach for the present parser. */ w = hdaa_widget_get(devinfo, 20); if (w != NULL) w->connsenable[1] = 0; w = hdaa_widget_get(devinfo, 21); if (w != NULL) w->connsenable[1] = 0; w = hdaa_widget_get(devinfo, 22); if (w != NULL) w->connsenable[0] = 0; break; case HDA_CODEC_VT1708S_0: case HDA_CODEC_VT1708S_1: case HDA_CODEC_VT1708S_2: case HDA_CODEC_VT1708S_3: case HDA_CODEC_VT1708S_4: case HDA_CODEC_VT1708S_5: case HDA_CODEC_VT1708S_6: case HDA_CODEC_VT1708S_7: /* * These codecs have hidden mic boost controls. */ w = hdaa_widget_get(devinfo, 26); if (w != NULL) w->param.inamp_cap = (40 << HDA_PARAM_OUTPUT_AMP_CAP_STEPSIZE_SHIFT) | (3 << HDA_PARAM_OUTPUT_AMP_CAP_NUMSTEPS_SHIFT) | (0 << HDA_PARAM_OUTPUT_AMP_CAP_OFFSET_SHIFT); w = hdaa_widget_get(devinfo, 30); if (w != NULL) w->param.inamp_cap = (40 << HDA_PARAM_OUTPUT_AMP_CAP_STEPSIZE_SHIFT) | (3 << HDA_PARAM_OUTPUT_AMP_CAP_NUMSTEPS_SHIFT) | (0 << HDA_PARAM_OUTPUT_AMP_CAP_OFFSET_SHIFT); break; } } void hdaa_patch_direct(struct hdaa_devinfo *devinfo) { device_t dev = devinfo->dev; uint32_t id, subid, val; id = hdaa_codec_id(devinfo); subid = hdaa_card_id(devinfo); switch (id) { case HDA_CODEC_VT1708S_0: case HDA_CODEC_VT1708S_1: case HDA_CODEC_VT1708S_2: case HDA_CODEC_VT1708S_3: case HDA_CODEC_VT1708S_4: case HDA_CODEC_VT1708S_5: case HDA_CODEC_VT1708S_6: case HDA_CODEC_VT1708S_7: /* Enable Mic Boost Volume controls. */ hda_command(dev, HDA_CMD_12BIT(0, devinfo->nid, 0xf98, 0x01)); /* Fall though */ case HDA_CODEC_VT1818S: /* Don't bypass mixer. */ hda_command(dev, HDA_CMD_12BIT(0, devinfo->nid, 0xf88, 0xc0)); break; } if (subid == APPLE_INTEL_MAC) hda_command(dev, HDA_CMD_12BIT(0, devinfo->nid, 0x7e7, 0)); if (id == HDA_CODEC_ALC269) { if (subid == 0x16e31043 || subid == 0x831a1043 || subid == 0x834a1043 || subid == 0x83981043 || subid == 0x83ce1043) { /* * The ditital mics on some Asus laptops produce * differential signals instead of expected stereo. * That results in silence if downmix it to mono. * To workaround, make codec to handle signal as mono. */ hda_command(dev, HDA_CMD_SET_COEFF_INDEX(0, 0x20, 0x07)); val = hda_command(dev, HDA_CMD_GET_PROCESSING_COEFF(0, 0x20)); hda_command(dev, HDA_CMD_SET_COEFF_INDEX(0, 0x20, 0x07)); hda_command(dev, HDA_CMD_SET_PROCESSING_COEFF(0, 0x20, val|0x80)); } } } Index: user/ngie/more-tests/sys/dev/sound/pci/hda/hdac.c =================================================================== --- user/ngie/more-tests/sys/dev/sound/pci/hda/hdac.c (revision 281584) +++ user/ngie/more-tests/sys/dev/sound/pci/hda/hdac.c (revision 281585) @@ -1,2088 +1,2090 @@ /*- * Copyright (c) 2006 Stephane E. Potvin * Copyright (c) 2006 Ariff Abdullah * Copyright (c) 2008-2012 Alexander Motin * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ /* * Intel High Definition Audio (Controller) driver for FreeBSD. */ #ifdef HAVE_KERNEL_OPTION_HEADERS #include "opt_snd.h" #endif #include #include #include #include #include #include #include #include #include #define HDA_DRV_TEST_REV "20120126_0002" SND_DECLARE_FILE("$FreeBSD$"); #define hdac_lock(sc) snd_mtxlock((sc)->lock) #define hdac_unlock(sc) snd_mtxunlock((sc)->lock) #define hdac_lockassert(sc) snd_mtxassert((sc)->lock) #define hdac_lockowned(sc) mtx_owned((sc)->lock) #define HDAC_QUIRK_64BIT (1 << 0) #define HDAC_QUIRK_DMAPOS (1 << 1) #define HDAC_QUIRK_MSI (1 << 2) static const struct { const char *key; uint32_t value; } hdac_quirks_tab[] = { { "64bit", HDAC_QUIRK_DMAPOS }, { "dmapos", HDAC_QUIRK_DMAPOS }, { "msi", HDAC_QUIRK_MSI }, }; MALLOC_DEFINE(M_HDAC, "hdac", "HDA Controller"); static const struct { uint32_t model; const char *desc; char quirks_on; char quirks_off; } hdac_devices[] = { { HDA_INTEL_OAK, "Intel Oaktrail", 0, 0 }, { HDA_INTEL_BAY, "Intel BayTrail", 0, 0 }, { HDA_INTEL_HSW1, "Intel Haswell", 0, 0 }, { HDA_INTEL_HSW2, "Intel Haswell", 0, 0 }, { HDA_INTEL_HSW3, "Intel Haswell", 0, 0 }, + { HDA_INTEL_BDW1, "Intel Broadwell", 0, 0 }, + { HDA_INTEL_BDW2, "Intel Broadwell", 0, 0 }, { HDA_INTEL_CPT, "Intel Cougar Point", 0, 0 }, { HDA_INTEL_PATSBURG,"Intel Patsburg", 0, 0 }, { HDA_INTEL_PPT1, "Intel Panther Point", 0, 0 }, { HDA_INTEL_LPT1, "Intel Lynx Point", 0, 0 }, { HDA_INTEL_LPT2, "Intel Lynx Point", 0, 0 }, { HDA_INTEL_WCPT, "Intel Wildcat Point", 0, 0 }, { HDA_INTEL_WELLS1, "Intel Wellsburg", 0, 0 }, { HDA_INTEL_WELLS2, "Intel Wellsburg", 0, 0 }, { HDA_INTEL_LPTLP1, "Intel Lynx Point-LP", 0, 0 }, { HDA_INTEL_LPTLP2, "Intel Lynx Point-LP", 0, 0 }, { HDA_INTEL_82801F, "Intel 82801F", 0, 0 }, { HDA_INTEL_63XXESB, "Intel 631x/632xESB", 0, 0 }, { HDA_INTEL_82801G, "Intel 82801G", 0, 0 }, { HDA_INTEL_82801H, "Intel 82801H", 0, 0 }, { HDA_INTEL_82801I, "Intel 82801I", 0, 0 }, { HDA_INTEL_82801JI, "Intel 82801JI", 0, 0 }, { HDA_INTEL_82801JD, "Intel 82801JD", 0, 0 }, { HDA_INTEL_PCH, "Intel 5 Series/3400 Series", 0, 0 }, { HDA_INTEL_PCH2, "Intel 5 Series/3400 Series", 0, 0 }, { HDA_INTEL_SCH, "Intel SCH", 0, 0 }, { HDA_NVIDIA_MCP51, "NVIDIA MCP51", 0, HDAC_QUIRK_MSI }, { HDA_NVIDIA_MCP55, "NVIDIA MCP55", 0, HDAC_QUIRK_MSI }, { HDA_NVIDIA_MCP61_1, "NVIDIA MCP61", 0, 0 }, { HDA_NVIDIA_MCP61_2, "NVIDIA MCP61", 0, 0 }, { HDA_NVIDIA_MCP65_1, "NVIDIA MCP65", 0, 0 }, { HDA_NVIDIA_MCP65_2, "NVIDIA MCP65", 0, 0 }, { HDA_NVIDIA_MCP67_1, "NVIDIA MCP67", 0, 0 }, { HDA_NVIDIA_MCP67_2, "NVIDIA MCP67", 0, 0 }, { HDA_NVIDIA_MCP73_1, "NVIDIA MCP73", 0, 0 }, { HDA_NVIDIA_MCP73_2, "NVIDIA MCP73", 0, 0 }, { HDA_NVIDIA_MCP78_1, "NVIDIA MCP78", 0, HDAC_QUIRK_64BIT }, { HDA_NVIDIA_MCP78_2, "NVIDIA MCP78", 0, HDAC_QUIRK_64BIT }, { HDA_NVIDIA_MCP78_3, "NVIDIA MCP78", 0, HDAC_QUIRK_64BIT }, { HDA_NVIDIA_MCP78_4, "NVIDIA MCP78", 0, HDAC_QUIRK_64BIT }, { HDA_NVIDIA_MCP79_1, "NVIDIA MCP79", 0, 0 }, { HDA_NVIDIA_MCP79_2, "NVIDIA MCP79", 0, 0 }, { HDA_NVIDIA_MCP79_3, "NVIDIA MCP79", 0, 0 }, { HDA_NVIDIA_MCP79_4, "NVIDIA MCP79", 0, 0 }, { HDA_NVIDIA_MCP89_1, "NVIDIA MCP89", 0, 0 }, { HDA_NVIDIA_MCP89_2, "NVIDIA MCP89", 0, 0 }, { HDA_NVIDIA_MCP89_3, "NVIDIA MCP89", 0, 0 }, { HDA_NVIDIA_MCP89_4, "NVIDIA MCP89", 0, 0 }, { HDA_NVIDIA_0BE2, "NVIDIA (0x0be2)", 0, HDAC_QUIRK_MSI }, { HDA_NVIDIA_0BE3, "NVIDIA (0x0be3)", 0, HDAC_QUIRK_MSI }, { HDA_NVIDIA_0BE4, "NVIDIA (0x0be4)", 0, HDAC_QUIRK_MSI }, { HDA_NVIDIA_GT100, "NVIDIA GT100", 0, HDAC_QUIRK_MSI }, { HDA_NVIDIA_GT104, "NVIDIA GT104", 0, HDAC_QUIRK_MSI }, { HDA_NVIDIA_GT106, "NVIDIA GT106", 0, HDAC_QUIRK_MSI }, { HDA_NVIDIA_GT108, "NVIDIA GT108", 0, HDAC_QUIRK_MSI }, { HDA_NVIDIA_GT116, "NVIDIA GT116", 0, HDAC_QUIRK_MSI }, { HDA_NVIDIA_GF119, "NVIDIA GF119", 0, 0 }, { HDA_NVIDIA_GF110_1, "NVIDIA GF110", 0, HDAC_QUIRK_MSI }, { HDA_NVIDIA_GF110_2, "NVIDIA GF110", 0, HDAC_QUIRK_MSI }, { HDA_ATI_SB450, "ATI SB450", 0, 0 }, { HDA_ATI_SB600, "ATI SB600", 0, 0 }, { HDA_ATI_RS600, "ATI RS600", 0, 0 }, { HDA_ATI_RS690, "ATI RS690", 0, 0 }, { HDA_ATI_RS780, "ATI RS780", 0, 0 }, { HDA_ATI_R600, "ATI R600", 0, 0 }, { HDA_ATI_RV610, "ATI RV610", 0, 0 }, { HDA_ATI_RV620, "ATI RV620", 0, 0 }, { HDA_ATI_RV630, "ATI RV630", 0, 0 }, { HDA_ATI_RV635, "ATI RV635", 0, 0 }, { HDA_ATI_RV710, "ATI RV710", 0, 0 }, { HDA_ATI_RV730, "ATI RV730", 0, 0 }, { HDA_ATI_RV740, "ATI RV740", 0, 0 }, { HDA_ATI_RV770, "ATI RV770", 0, 0 }, { HDA_ATI_RV810, "ATI RV810", 0, 0 }, { HDA_ATI_RV830, "ATI RV830", 0, 0 }, { HDA_ATI_RV840, "ATI RV840", 0, 0 }, { HDA_ATI_RV870, "ATI RV870", 0, 0 }, { HDA_ATI_RV910, "ATI RV910", 0, 0 }, { HDA_ATI_RV930, "ATI RV930", 0, 0 }, { HDA_ATI_RV940, "ATI RV940", 0, 0 }, { HDA_ATI_RV970, "ATI RV970", 0, 0 }, { HDA_ATI_R1000, "ATI R1000", 0, 0 }, { HDA_RDC_M3010, "RDC M3010", 0, 0 }, { HDA_VIA_VT82XX, "VIA VT8251/8237A",0, 0 }, { HDA_SIS_966, "SiS 966", 0, 0 }, { HDA_ULI_M5461, "ULI M5461", 0, 0 }, /* Unknown */ { HDA_INTEL_ALL, "Intel", 0, 0 }, { HDA_NVIDIA_ALL, "NVIDIA", 0, 0 }, { HDA_ATI_ALL, "ATI", 0, 0 }, { HDA_VIA_ALL, "VIA", 0, 0 }, { HDA_SIS_ALL, "SiS", 0, 0 }, { HDA_ULI_ALL, "ULI", 0, 0 }, }; static const struct { uint16_t vendor; uint8_t reg; uint8_t mask; uint8_t enable; } hdac_pcie_snoop[] = { { INTEL_VENDORID, 0x00, 0x00, 0x00 }, { ATI_VENDORID, 0x42, 0xf8, 0x02 }, { NVIDIA_VENDORID, 0x4e, 0xf0, 0x0f }, }; /**************************************************************************** * Function prototypes ****************************************************************************/ static void hdac_intr_handler(void *); static int hdac_reset(struct hdac_softc *, int); static int hdac_get_capabilities(struct hdac_softc *); static void hdac_dma_cb(void *, bus_dma_segment_t *, int, int); static int hdac_dma_alloc(struct hdac_softc *, struct hdac_dma *, bus_size_t); static void hdac_dma_free(struct hdac_softc *, struct hdac_dma *); static int hdac_mem_alloc(struct hdac_softc *); static void hdac_mem_free(struct hdac_softc *); static int hdac_irq_alloc(struct hdac_softc *); static void hdac_irq_free(struct hdac_softc *); static void hdac_corb_init(struct hdac_softc *); static void hdac_rirb_init(struct hdac_softc *); static void hdac_corb_start(struct hdac_softc *); static void hdac_rirb_start(struct hdac_softc *); static void hdac_attach2(void *); static uint32_t hdac_send_command(struct hdac_softc *, nid_t, uint32_t); static int hdac_probe(device_t); static int hdac_attach(device_t); static int hdac_detach(device_t); static int hdac_suspend(device_t); static int hdac_resume(device_t); static int hdac_rirb_flush(struct hdac_softc *sc); static int hdac_unsolq_flush(struct hdac_softc *sc); #define hdac_command(a1, a2, a3) \ hdac_send_command(a1, a3, a2) /* This function surely going to make its way into upper level someday. */ static void hdac_config_fetch(struct hdac_softc *sc, uint32_t *on, uint32_t *off) { const char *res = NULL; int i = 0, j, k, len, inv; if (resource_string_value(device_get_name(sc->dev), device_get_unit(sc->dev), "config", &res) != 0) return; if (!(res != NULL && strlen(res) > 0)) return; HDA_BOOTVERBOSE( device_printf(sc->dev, "Config options:"); ); for (;;) { while (res[i] != '\0' && (res[i] == ',' || isspace(res[i]) != 0)) i++; if (res[i] == '\0') { HDA_BOOTVERBOSE( printf("\n"); ); return; } j = i; while (res[j] != '\0' && !(res[j] == ',' || isspace(res[j]) != 0)) j++; len = j - i; if (len > 2 && strncmp(res + i, "no", 2) == 0) inv = 2; else inv = 0; for (k = 0; len > inv && k < nitems(hdac_quirks_tab); k++) { if (strncmp(res + i + inv, hdac_quirks_tab[k].key, len - inv) != 0) continue; if (len - inv != strlen(hdac_quirks_tab[k].key)) continue; HDA_BOOTVERBOSE( printf(" %s%s", (inv != 0) ? "no" : "", hdac_quirks_tab[k].key); ); if (inv == 0) { *on |= hdac_quirks_tab[k].value; *on &= ~hdac_quirks_tab[k].value; } else if (inv != 0) { *off |= hdac_quirks_tab[k].value; *off &= ~hdac_quirks_tab[k].value; } break; } i = j; } } /**************************************************************************** * void hdac_intr_handler(void *) * * Interrupt handler. Processes interrupts received from the hdac. ****************************************************************************/ static void hdac_intr_handler(void *context) { struct hdac_softc *sc; device_t dev; uint32_t intsts; uint8_t rirbsts; int i; sc = (struct hdac_softc *)context; hdac_lock(sc); /* Do we have anything to do? */ intsts = HDAC_READ_4(&sc->mem, HDAC_INTSTS); if ((intsts & HDAC_INTSTS_GIS) == 0) { hdac_unlock(sc); return; } /* Was this a controller interrupt? */ if (intsts & HDAC_INTSTS_CIS) { rirbsts = HDAC_READ_1(&sc->mem, HDAC_RIRBSTS); /* Get as many responses that we can */ while (rirbsts & HDAC_RIRBSTS_RINTFL) { HDAC_WRITE_1(&sc->mem, HDAC_RIRBSTS, HDAC_RIRBSTS_RINTFL); hdac_rirb_flush(sc); rirbsts = HDAC_READ_1(&sc->mem, HDAC_RIRBSTS); } if (sc->unsolq_rp != sc->unsolq_wp) taskqueue_enqueue(taskqueue_thread, &sc->unsolq_task); } if (intsts & HDAC_INTSTS_SIS_MASK) { for (i = 0; i < sc->num_ss; i++) { if ((intsts & (1 << i)) == 0) continue; HDAC_WRITE_1(&sc->mem, (i << 5) + HDAC_SDSTS, HDAC_SDSTS_DESE | HDAC_SDSTS_FIFOE | HDAC_SDSTS_BCIS ); if ((dev = sc->streams[i].dev) != NULL) { HDAC_STREAM_INTR(dev, sc->streams[i].dir, sc->streams[i].stream); } } } HDAC_WRITE_4(&sc->mem, HDAC_INTSTS, intsts); hdac_unlock(sc); } static void hdac_poll_callback(void *arg) { struct hdac_softc *sc = arg; if (sc == NULL) return; hdac_lock(sc); if (sc->polling == 0) { hdac_unlock(sc); return; } callout_reset(&sc->poll_callout, sc->poll_ival, hdac_poll_callback, sc); hdac_unlock(sc); hdac_intr_handler(sc); } /**************************************************************************** * int hdac_reset(hdac_softc *, int) * * Reset the hdac to a quiescent and known state. ****************************************************************************/ static int hdac_reset(struct hdac_softc *sc, int wakeup) { uint32_t gctl; int count, i; /* * Stop all Streams DMA engine */ for (i = 0; i < sc->num_iss; i++) HDAC_WRITE_4(&sc->mem, HDAC_ISDCTL(sc, i), 0x0); for (i = 0; i < sc->num_oss; i++) HDAC_WRITE_4(&sc->mem, HDAC_OSDCTL(sc, i), 0x0); for (i = 0; i < sc->num_bss; i++) HDAC_WRITE_4(&sc->mem, HDAC_BSDCTL(sc, i), 0x0); /* * Stop Control DMA engines. */ HDAC_WRITE_1(&sc->mem, HDAC_CORBCTL, 0x0); HDAC_WRITE_1(&sc->mem, HDAC_RIRBCTL, 0x0); /* * Reset DMA position buffer. */ HDAC_WRITE_4(&sc->mem, HDAC_DPIBLBASE, 0x0); HDAC_WRITE_4(&sc->mem, HDAC_DPIBUBASE, 0x0); /* * Reset the controller. The reset must remain asserted for * a minimum of 100us. */ gctl = HDAC_READ_4(&sc->mem, HDAC_GCTL); HDAC_WRITE_4(&sc->mem, HDAC_GCTL, gctl & ~HDAC_GCTL_CRST); count = 10000; do { gctl = HDAC_READ_4(&sc->mem, HDAC_GCTL); if (!(gctl & HDAC_GCTL_CRST)) break; DELAY(10); } while (--count); if (gctl & HDAC_GCTL_CRST) { device_printf(sc->dev, "Unable to put hdac in reset\n"); return (ENXIO); } /* If wakeup is not requested - leave the controller in reset state. */ if (!wakeup) return (0); DELAY(100); gctl = HDAC_READ_4(&sc->mem, HDAC_GCTL); HDAC_WRITE_4(&sc->mem, HDAC_GCTL, gctl | HDAC_GCTL_CRST); count = 10000; do { gctl = HDAC_READ_4(&sc->mem, HDAC_GCTL); if (gctl & HDAC_GCTL_CRST) break; DELAY(10); } while (--count); if (!(gctl & HDAC_GCTL_CRST)) { device_printf(sc->dev, "Device stuck in reset\n"); return (ENXIO); } /* * Wait for codecs to finish their own reset sequence. The delay here * should be of 250us but for some reasons, on it's not enough on my * computer. Let's use twice as much as necessary to make sure that * it's reset properly. */ DELAY(1000); return (0); } /**************************************************************************** * int hdac_get_capabilities(struct hdac_softc *); * * Retreive the general capabilities of the hdac; * Number of Input Streams * Number of Output Streams * Number of bidirectional Streams * 64bit ready * CORB and RIRB sizes ****************************************************************************/ static int hdac_get_capabilities(struct hdac_softc *sc) { uint16_t gcap; uint8_t corbsize, rirbsize; gcap = HDAC_READ_2(&sc->mem, HDAC_GCAP); sc->num_iss = HDAC_GCAP_ISS(gcap); sc->num_oss = HDAC_GCAP_OSS(gcap); sc->num_bss = HDAC_GCAP_BSS(gcap); sc->num_ss = sc->num_iss + sc->num_oss + sc->num_bss; sc->num_sdo = HDAC_GCAP_NSDO(gcap); sc->support_64bit = (gcap & HDAC_GCAP_64OK) != 0; if (sc->quirks_on & HDAC_QUIRK_64BIT) sc->support_64bit = 1; else if (sc->quirks_off & HDAC_QUIRK_64BIT) sc->support_64bit = 0; corbsize = HDAC_READ_1(&sc->mem, HDAC_CORBSIZE); if ((corbsize & HDAC_CORBSIZE_CORBSZCAP_256) == HDAC_CORBSIZE_CORBSZCAP_256) sc->corb_size = 256; else if ((corbsize & HDAC_CORBSIZE_CORBSZCAP_16) == HDAC_CORBSIZE_CORBSZCAP_16) sc->corb_size = 16; else if ((corbsize & HDAC_CORBSIZE_CORBSZCAP_2) == HDAC_CORBSIZE_CORBSZCAP_2) sc->corb_size = 2; else { device_printf(sc->dev, "%s: Invalid corb size (%x)\n", __func__, corbsize); return (ENXIO); } rirbsize = HDAC_READ_1(&sc->mem, HDAC_RIRBSIZE); if ((rirbsize & HDAC_RIRBSIZE_RIRBSZCAP_256) == HDAC_RIRBSIZE_RIRBSZCAP_256) sc->rirb_size = 256; else if ((rirbsize & HDAC_RIRBSIZE_RIRBSZCAP_16) == HDAC_RIRBSIZE_RIRBSZCAP_16) sc->rirb_size = 16; else if ((rirbsize & HDAC_RIRBSIZE_RIRBSZCAP_2) == HDAC_RIRBSIZE_RIRBSZCAP_2) sc->rirb_size = 2; else { device_printf(sc->dev, "%s: Invalid rirb size (%x)\n", __func__, rirbsize); return (ENXIO); } HDA_BOOTVERBOSE( device_printf(sc->dev, "Caps: OSS %d, ISS %d, BSS %d, " "NSDO %d%s, CORB %d, RIRB %d\n", sc->num_oss, sc->num_iss, sc->num_bss, 1 << sc->num_sdo, sc->support_64bit ? ", 64bit" : "", sc->corb_size, sc->rirb_size); ); return (0); } /**************************************************************************** * void hdac_dma_cb * * This function is called by bus_dmamap_load when the mapping has been * established. We just record the physical address of the mapping into * the struct hdac_dma passed in. ****************************************************************************/ static void hdac_dma_cb(void *callback_arg, bus_dma_segment_t *segs, int nseg, int error) { struct hdac_dma *dma; if (error == 0) { dma = (struct hdac_dma *)callback_arg; dma->dma_paddr = segs[0].ds_addr; } } /**************************************************************************** * int hdac_dma_alloc * * This function allocate and setup a dma region (struct hdac_dma). * It must be freed by a corresponding hdac_dma_free. ****************************************************************************/ static int hdac_dma_alloc(struct hdac_softc *sc, struct hdac_dma *dma, bus_size_t size) { bus_size_t roundsz; int result; roundsz = roundup2(size, HDA_DMA_ALIGNMENT); bzero(dma, sizeof(*dma)); /* * Create a DMA tag */ result = bus_dma_tag_create( bus_get_dma_tag(sc->dev), /* parent */ HDA_DMA_ALIGNMENT, /* alignment */ 0, /* boundary */ (sc->support_64bit) ? BUS_SPACE_MAXADDR : BUS_SPACE_MAXADDR_32BIT, /* lowaddr */ BUS_SPACE_MAXADDR, /* highaddr */ NULL, /* filtfunc */ NULL, /* fistfuncarg */ roundsz, /* maxsize */ 1, /* nsegments */ roundsz, /* maxsegsz */ 0, /* flags */ NULL, /* lockfunc */ NULL, /* lockfuncarg */ &dma->dma_tag); /* dmat */ if (result != 0) { device_printf(sc->dev, "%s: bus_dma_tag_create failed (%x)\n", __func__, result); goto hdac_dma_alloc_fail; } /* * Allocate DMA memory */ result = bus_dmamem_alloc(dma->dma_tag, (void **)&dma->dma_vaddr, BUS_DMA_NOWAIT | BUS_DMA_ZERO | ((sc->flags & HDAC_F_DMA_NOCACHE) ? BUS_DMA_NOCACHE : 0), &dma->dma_map); if (result != 0) { device_printf(sc->dev, "%s: bus_dmamem_alloc failed (%x)\n", __func__, result); goto hdac_dma_alloc_fail; } dma->dma_size = roundsz; /* * Map the memory */ result = bus_dmamap_load(dma->dma_tag, dma->dma_map, (void *)dma->dma_vaddr, roundsz, hdac_dma_cb, (void *)dma, 0); if (result != 0 || dma->dma_paddr == 0) { if (result == 0) result = ENOMEM; device_printf(sc->dev, "%s: bus_dmamem_load failed (%x)\n", __func__, result); goto hdac_dma_alloc_fail; } HDA_BOOTHVERBOSE( device_printf(sc->dev, "%s: size=%ju -> roundsz=%ju\n", __func__, (uintmax_t)size, (uintmax_t)roundsz); ); return (0); hdac_dma_alloc_fail: hdac_dma_free(sc, dma); return (result); } /**************************************************************************** * void hdac_dma_free(struct hdac_softc *, struct hdac_dma *) * * Free a struct dhac_dma that has been previously allocated via the * hdac_dma_alloc function. ****************************************************************************/ static void hdac_dma_free(struct hdac_softc *sc, struct hdac_dma *dma) { if (dma->dma_paddr != 0) { #if 0 /* Flush caches */ bus_dmamap_sync(dma->dma_tag, dma->dma_map, BUS_DMASYNC_POSTREAD | BUS_DMASYNC_POSTWRITE); #endif bus_dmamap_unload(dma->dma_tag, dma->dma_map); dma->dma_paddr = 0; } if (dma->dma_vaddr != NULL) { bus_dmamem_free(dma->dma_tag, dma->dma_vaddr, dma->dma_map); dma->dma_vaddr = NULL; } if (dma->dma_tag != NULL) { bus_dma_tag_destroy(dma->dma_tag); dma->dma_tag = NULL; } dma->dma_size = 0; } /**************************************************************************** * int hdac_mem_alloc(struct hdac_softc *) * * Allocate all the bus resources necessary to speak with the physical * controller. ****************************************************************************/ static int hdac_mem_alloc(struct hdac_softc *sc) { struct hdac_mem *mem; mem = &sc->mem; mem->mem_rid = PCIR_BAR(0); mem->mem_res = bus_alloc_resource_any(sc->dev, SYS_RES_MEMORY, &mem->mem_rid, RF_ACTIVE); if (mem->mem_res == NULL) { device_printf(sc->dev, "%s: Unable to allocate memory resource\n", __func__); return (ENOMEM); } mem->mem_tag = rman_get_bustag(mem->mem_res); mem->mem_handle = rman_get_bushandle(mem->mem_res); return (0); } /**************************************************************************** * void hdac_mem_free(struct hdac_softc *) * * Free up resources previously allocated by hdac_mem_alloc. ****************************************************************************/ static void hdac_mem_free(struct hdac_softc *sc) { struct hdac_mem *mem; mem = &sc->mem; if (mem->mem_res != NULL) bus_release_resource(sc->dev, SYS_RES_MEMORY, mem->mem_rid, mem->mem_res); mem->mem_res = NULL; } /**************************************************************************** * int hdac_irq_alloc(struct hdac_softc *) * * Allocate and setup the resources necessary for interrupt handling. ****************************************************************************/ static int hdac_irq_alloc(struct hdac_softc *sc) { struct hdac_irq *irq; int result; irq = &sc->irq; irq->irq_rid = 0x0; if ((sc->quirks_off & HDAC_QUIRK_MSI) == 0 && (result = pci_msi_count(sc->dev)) == 1 && pci_alloc_msi(sc->dev, &result) == 0) irq->irq_rid = 0x1; irq->irq_res = bus_alloc_resource_any(sc->dev, SYS_RES_IRQ, &irq->irq_rid, RF_SHAREABLE | RF_ACTIVE); if (irq->irq_res == NULL) { device_printf(sc->dev, "%s: Unable to allocate irq\n", __func__); goto hdac_irq_alloc_fail; } result = bus_setup_intr(sc->dev, irq->irq_res, INTR_MPSAFE | INTR_TYPE_AV, NULL, hdac_intr_handler, sc, &irq->irq_handle); if (result != 0) { device_printf(sc->dev, "%s: Unable to setup interrupt handler (%x)\n", __func__, result); goto hdac_irq_alloc_fail; } return (0); hdac_irq_alloc_fail: hdac_irq_free(sc); return (ENXIO); } /**************************************************************************** * void hdac_irq_free(struct hdac_softc *) * * Free up resources previously allocated by hdac_irq_alloc. ****************************************************************************/ static void hdac_irq_free(struct hdac_softc *sc) { struct hdac_irq *irq; irq = &sc->irq; if (irq->irq_res != NULL && irq->irq_handle != NULL) bus_teardown_intr(sc->dev, irq->irq_res, irq->irq_handle); if (irq->irq_res != NULL) bus_release_resource(sc->dev, SYS_RES_IRQ, irq->irq_rid, irq->irq_res); if (irq->irq_rid == 0x1) pci_release_msi(sc->dev); irq->irq_handle = NULL; irq->irq_res = NULL; irq->irq_rid = 0x0; } /**************************************************************************** * void hdac_corb_init(struct hdac_softc *) * * Initialize the corb registers for operations but do not start it up yet. * The CORB engine must not be running when this function is called. ****************************************************************************/ static void hdac_corb_init(struct hdac_softc *sc) { uint8_t corbsize; uint64_t corbpaddr; /* Setup the CORB size. */ switch (sc->corb_size) { case 256: corbsize = HDAC_CORBSIZE_CORBSIZE(HDAC_CORBSIZE_CORBSIZE_256); break; case 16: corbsize = HDAC_CORBSIZE_CORBSIZE(HDAC_CORBSIZE_CORBSIZE_16); break; case 2: corbsize = HDAC_CORBSIZE_CORBSIZE(HDAC_CORBSIZE_CORBSIZE_2); break; default: panic("%s: Invalid CORB size (%x)\n", __func__, sc->corb_size); } HDAC_WRITE_1(&sc->mem, HDAC_CORBSIZE, corbsize); /* Setup the CORB Address in the hdac */ corbpaddr = (uint64_t)sc->corb_dma.dma_paddr; HDAC_WRITE_4(&sc->mem, HDAC_CORBLBASE, (uint32_t)corbpaddr); HDAC_WRITE_4(&sc->mem, HDAC_CORBUBASE, (uint32_t)(corbpaddr >> 32)); /* Set the WP and RP */ sc->corb_wp = 0; HDAC_WRITE_2(&sc->mem, HDAC_CORBWP, sc->corb_wp); HDAC_WRITE_2(&sc->mem, HDAC_CORBRP, HDAC_CORBRP_CORBRPRST); /* * The HDA specification indicates that the CORBRPRST bit will always * read as zero. Unfortunately, it seems that at least the 82801G * doesn't reset the bit to zero, which stalls the corb engine. * manually reset the bit to zero before continuing. */ HDAC_WRITE_2(&sc->mem, HDAC_CORBRP, 0x0); /* Enable CORB error reporting */ #if 0 HDAC_WRITE_1(&sc->mem, HDAC_CORBCTL, HDAC_CORBCTL_CMEIE); #endif } /**************************************************************************** * void hdac_rirb_init(struct hdac_softc *) * * Initialize the rirb registers for operations but do not start it up yet. * The RIRB engine must not be running when this function is called. ****************************************************************************/ static void hdac_rirb_init(struct hdac_softc *sc) { uint8_t rirbsize; uint64_t rirbpaddr; /* Setup the RIRB size. */ switch (sc->rirb_size) { case 256: rirbsize = HDAC_RIRBSIZE_RIRBSIZE(HDAC_RIRBSIZE_RIRBSIZE_256); break; case 16: rirbsize = HDAC_RIRBSIZE_RIRBSIZE(HDAC_RIRBSIZE_RIRBSIZE_16); break; case 2: rirbsize = HDAC_RIRBSIZE_RIRBSIZE(HDAC_RIRBSIZE_RIRBSIZE_2); break; default: panic("%s: Invalid RIRB size (%x)\n", __func__, sc->rirb_size); } HDAC_WRITE_1(&sc->mem, HDAC_RIRBSIZE, rirbsize); /* Setup the RIRB Address in the hdac */ rirbpaddr = (uint64_t)sc->rirb_dma.dma_paddr; HDAC_WRITE_4(&sc->mem, HDAC_RIRBLBASE, (uint32_t)rirbpaddr); HDAC_WRITE_4(&sc->mem, HDAC_RIRBUBASE, (uint32_t)(rirbpaddr >> 32)); /* Setup the WP and RP */ sc->rirb_rp = 0; HDAC_WRITE_2(&sc->mem, HDAC_RIRBWP, HDAC_RIRBWP_RIRBWPRST); /* Setup the interrupt threshold */ HDAC_WRITE_2(&sc->mem, HDAC_RINTCNT, sc->rirb_size / 2); /* Enable Overrun and response received reporting */ #if 0 HDAC_WRITE_1(&sc->mem, HDAC_RIRBCTL, HDAC_RIRBCTL_RIRBOIC | HDAC_RIRBCTL_RINTCTL); #else HDAC_WRITE_1(&sc->mem, HDAC_RIRBCTL, HDAC_RIRBCTL_RINTCTL); #endif #if 0 /* * Make sure that the Host CPU cache doesn't contain any dirty * cache lines that falls in the rirb. If I understood correctly, it * should be sufficient to do this only once as the rirb is purely * read-only from now on. */ bus_dmamap_sync(sc->rirb_dma.dma_tag, sc->rirb_dma.dma_map, BUS_DMASYNC_PREREAD); #endif } /**************************************************************************** * void hdac_corb_start(hdac_softc *) * * Startup the corb DMA engine ****************************************************************************/ static void hdac_corb_start(struct hdac_softc *sc) { uint32_t corbctl; corbctl = HDAC_READ_1(&sc->mem, HDAC_CORBCTL); corbctl |= HDAC_CORBCTL_CORBRUN; HDAC_WRITE_1(&sc->mem, HDAC_CORBCTL, corbctl); } /**************************************************************************** * void hdac_rirb_start(hdac_softc *) * * Startup the rirb DMA engine ****************************************************************************/ static void hdac_rirb_start(struct hdac_softc *sc) { uint32_t rirbctl; rirbctl = HDAC_READ_1(&sc->mem, HDAC_RIRBCTL); rirbctl |= HDAC_RIRBCTL_RIRBDMAEN; HDAC_WRITE_1(&sc->mem, HDAC_RIRBCTL, rirbctl); } static int hdac_rirb_flush(struct hdac_softc *sc) { struct hdac_rirb *rirb_base, *rirb; nid_t cad; uint32_t resp; uint8_t rirbwp; int ret; rirb_base = (struct hdac_rirb *)sc->rirb_dma.dma_vaddr; rirbwp = HDAC_READ_1(&sc->mem, HDAC_RIRBWP); #if 0 bus_dmamap_sync(sc->rirb_dma.dma_tag, sc->rirb_dma.dma_map, BUS_DMASYNC_POSTREAD); #endif ret = 0; while (sc->rirb_rp != rirbwp) { sc->rirb_rp++; sc->rirb_rp %= sc->rirb_size; rirb = &rirb_base[sc->rirb_rp]; cad = HDAC_RIRB_RESPONSE_EX_SDATA_IN(rirb->response_ex); resp = rirb->response; if (rirb->response_ex & HDAC_RIRB_RESPONSE_EX_UNSOLICITED) { sc->unsolq[sc->unsolq_wp++] = resp; sc->unsolq_wp %= HDAC_UNSOLQ_MAX; sc->unsolq[sc->unsolq_wp++] = cad; sc->unsolq_wp %= HDAC_UNSOLQ_MAX; } else if (sc->codecs[cad].pending <= 0) { device_printf(sc->dev, "Unexpected unsolicited " "response from address %d: %08x\n", cad, resp); } else { sc->codecs[cad].response = resp; sc->codecs[cad].pending--; } ret++; } return (ret); } static int hdac_unsolq_flush(struct hdac_softc *sc) { device_t child; nid_t cad; uint32_t resp; int ret = 0; if (sc->unsolq_st == HDAC_UNSOLQ_READY) { sc->unsolq_st = HDAC_UNSOLQ_BUSY; while (sc->unsolq_rp != sc->unsolq_wp) { resp = sc->unsolq[sc->unsolq_rp++]; sc->unsolq_rp %= HDAC_UNSOLQ_MAX; cad = sc->unsolq[sc->unsolq_rp++]; sc->unsolq_rp %= HDAC_UNSOLQ_MAX; if ((child = sc->codecs[cad].dev) != NULL) HDAC_UNSOL_INTR(child, resp); ret++; } sc->unsolq_st = HDAC_UNSOLQ_READY; } return (ret); } /**************************************************************************** * uint32_t hdac_command_sendone_internal * * Wrapper function that sends only one command to a given codec ****************************************************************************/ static uint32_t hdac_send_command(struct hdac_softc *sc, nid_t cad, uint32_t verb) { int timeout; uint32_t *corb; if (!hdac_lockowned(sc)) device_printf(sc->dev, "WARNING!!!! mtx not owned!!!!\n"); verb &= ~HDA_CMD_CAD_MASK; verb |= ((uint32_t)cad) << HDA_CMD_CAD_SHIFT; sc->codecs[cad].response = HDA_INVALID; sc->codecs[cad].pending++; sc->corb_wp++; sc->corb_wp %= sc->corb_size; corb = (uint32_t *)sc->corb_dma.dma_vaddr; #if 0 bus_dmamap_sync(sc->corb_dma.dma_tag, sc->corb_dma.dma_map, BUS_DMASYNC_PREWRITE); #endif corb[sc->corb_wp] = verb; #if 0 bus_dmamap_sync(sc->corb_dma.dma_tag, sc->corb_dma.dma_map, BUS_DMASYNC_POSTWRITE); #endif HDAC_WRITE_2(&sc->mem, HDAC_CORBWP, sc->corb_wp); timeout = 10000; do { if (hdac_rirb_flush(sc) == 0) DELAY(10); } while (sc->codecs[cad].pending != 0 && --timeout); if (sc->codecs[cad].pending != 0) { device_printf(sc->dev, "Command timeout on address %d\n", cad); sc->codecs[cad].pending = 0; } if (sc->unsolq_rp != sc->unsolq_wp) taskqueue_enqueue(taskqueue_thread, &sc->unsolq_task); return (sc->codecs[cad].response); } /**************************************************************************** * Device Methods ****************************************************************************/ /**************************************************************************** * int hdac_probe(device_t) * * Probe for the presence of an hdac. If none is found, check for a generic * match using the subclass of the device. ****************************************************************************/ static int hdac_probe(device_t dev) { int i, result; uint32_t model; uint16_t class, subclass; char desc[64]; model = (uint32_t)pci_get_device(dev) << 16; model |= (uint32_t)pci_get_vendor(dev) & 0x0000ffff; class = pci_get_class(dev); subclass = pci_get_subclass(dev); bzero(desc, sizeof(desc)); result = ENXIO; for (i = 0; i < nitems(hdac_devices); i++) { if (hdac_devices[i].model == model) { strlcpy(desc, hdac_devices[i].desc, sizeof(desc)); result = BUS_PROBE_DEFAULT; break; } if (HDA_DEV_MATCH(hdac_devices[i].model, model) && class == PCIC_MULTIMEDIA && subclass == PCIS_MULTIMEDIA_HDA) { snprintf(desc, sizeof(desc), "%s (0x%04x)", hdac_devices[i].desc, pci_get_device(dev)); result = BUS_PROBE_GENERIC; break; } } if (result == ENXIO && class == PCIC_MULTIMEDIA && subclass == PCIS_MULTIMEDIA_HDA) { snprintf(desc, sizeof(desc), "Generic (0x%08x)", model); result = BUS_PROBE_GENERIC; } if (result != ENXIO) { strlcat(desc, " HDA Controller", sizeof(desc)); device_set_desc_copy(dev, desc); } return (result); } static void hdac_unsolq_task(void *context, int pending) { struct hdac_softc *sc; sc = (struct hdac_softc *)context; hdac_lock(sc); hdac_unsolq_flush(sc); hdac_unlock(sc); } /**************************************************************************** * int hdac_attach(device_t) * * Attach the device into the kernel. Interrupts usually won't be enabled * when this function is called. Setup everything that doesn't require * interrupts and defer probing of codecs until interrupts are enabled. ****************************************************************************/ static int hdac_attach(device_t dev) { struct hdac_softc *sc; int result; int i, devid = -1; uint32_t model; uint16_t class, subclass; uint16_t vendor; uint8_t v; sc = device_get_softc(dev); HDA_BOOTVERBOSE( device_printf(dev, "PCI card vendor: 0x%04x, device: 0x%04x\n", pci_get_subvendor(dev), pci_get_subdevice(dev)); device_printf(dev, "HDA Driver Revision: %s\n", HDA_DRV_TEST_REV); ); model = (uint32_t)pci_get_device(dev) << 16; model |= (uint32_t)pci_get_vendor(dev) & 0x0000ffff; class = pci_get_class(dev); subclass = pci_get_subclass(dev); for (i = 0; i < nitems(hdac_devices); i++) { if (hdac_devices[i].model == model) { devid = i; break; } if (HDA_DEV_MATCH(hdac_devices[i].model, model) && class == PCIC_MULTIMEDIA && subclass == PCIS_MULTIMEDIA_HDA) { devid = i; break; } } sc->lock = snd_mtxcreate(device_get_nameunit(dev), "HDA driver mutex"); sc->dev = dev; TASK_INIT(&sc->unsolq_task, 0, hdac_unsolq_task, sc); callout_init(&sc->poll_callout, CALLOUT_MPSAFE); for (i = 0; i < HDAC_CODEC_MAX; i++) sc->codecs[i].dev = NULL; if (devid >= 0) { sc->quirks_on = hdac_devices[devid].quirks_on; sc->quirks_off = hdac_devices[devid].quirks_off; } else { sc->quirks_on = 0; sc->quirks_off = 0; } if (resource_int_value(device_get_name(dev), device_get_unit(dev), "msi", &i) == 0) { if (i == 0) sc->quirks_off |= HDAC_QUIRK_MSI; else { sc->quirks_on |= HDAC_QUIRK_MSI; sc->quirks_off |= ~HDAC_QUIRK_MSI; } } hdac_config_fetch(sc, &sc->quirks_on, &sc->quirks_off); HDA_BOOTVERBOSE( device_printf(sc->dev, "Config options: on=0x%08x off=0x%08x\n", sc->quirks_on, sc->quirks_off); ); sc->poll_ival = hz; if (resource_int_value(device_get_name(dev), device_get_unit(dev), "polling", &i) == 0 && i != 0) sc->polling = 1; else sc->polling = 0; pci_enable_busmaster(dev); vendor = pci_get_vendor(dev); if (vendor == INTEL_VENDORID) { /* TCSEL -> TC0 */ v = pci_read_config(dev, 0x44, 1); pci_write_config(dev, 0x44, v & 0xf8, 1); HDA_BOOTHVERBOSE( device_printf(dev, "TCSEL: 0x%02d -> 0x%02d\n", v, pci_read_config(dev, 0x44, 1)); ); } #if defined(__i386__) || defined(__amd64__) sc->flags |= HDAC_F_DMA_NOCACHE; if (resource_int_value(device_get_name(dev), device_get_unit(dev), "snoop", &i) == 0 && i != 0) { #else sc->flags &= ~HDAC_F_DMA_NOCACHE; #endif /* * Try to enable PCIe snoop to avoid messing around with * uncacheable DMA attribute. Since PCIe snoop register * config is pretty much vendor specific, there are no * general solutions on how to enable it, forcing us (even * Microsoft) to enable uncacheable or write combined DMA * by default. * * http://msdn2.microsoft.com/en-us/library/ms790324.aspx */ for (i = 0; i < nitems(hdac_pcie_snoop); i++) { if (hdac_pcie_snoop[i].vendor != vendor) continue; sc->flags &= ~HDAC_F_DMA_NOCACHE; if (hdac_pcie_snoop[i].reg == 0x00) break; v = pci_read_config(dev, hdac_pcie_snoop[i].reg, 1); if ((v & hdac_pcie_snoop[i].enable) == hdac_pcie_snoop[i].enable) break; v &= hdac_pcie_snoop[i].mask; v |= hdac_pcie_snoop[i].enable; pci_write_config(dev, hdac_pcie_snoop[i].reg, v, 1); v = pci_read_config(dev, hdac_pcie_snoop[i].reg, 1); if ((v & hdac_pcie_snoop[i].enable) != hdac_pcie_snoop[i].enable) { HDA_BOOTVERBOSE( device_printf(dev, "WARNING: Failed to enable PCIe " "snoop!\n"); ); #if defined(__i386__) || defined(__amd64__) sc->flags |= HDAC_F_DMA_NOCACHE; #endif } break; } #if defined(__i386__) || defined(__amd64__) } #endif HDA_BOOTHVERBOSE( device_printf(dev, "DMA Coherency: %s / vendor=0x%04x\n", (sc->flags & HDAC_F_DMA_NOCACHE) ? "Uncacheable" : "PCIe snoop", vendor); ); /* Allocate resources */ result = hdac_mem_alloc(sc); if (result != 0) goto hdac_attach_fail; result = hdac_irq_alloc(sc); if (result != 0) goto hdac_attach_fail; /* Get Capabilities */ result = hdac_get_capabilities(sc); if (result != 0) goto hdac_attach_fail; /* Allocate CORB, RIRB, POS and BDLs dma memory */ result = hdac_dma_alloc(sc, &sc->corb_dma, sc->corb_size * sizeof(uint32_t)); if (result != 0) goto hdac_attach_fail; result = hdac_dma_alloc(sc, &sc->rirb_dma, sc->rirb_size * sizeof(struct hdac_rirb)); if (result != 0) goto hdac_attach_fail; sc->streams = malloc(sizeof(struct hdac_stream) * sc->num_ss, M_HDAC, M_ZERO | M_WAITOK); for (i = 0; i < sc->num_ss; i++) { result = hdac_dma_alloc(sc, &sc->streams[i].bdl, sizeof(struct hdac_bdle) * HDA_BDL_MAX); if (result != 0) goto hdac_attach_fail; } if (sc->quirks_on & HDAC_QUIRK_DMAPOS) { if (hdac_dma_alloc(sc, &sc->pos_dma, (sc->num_ss) * 8) != 0) { HDA_BOOTVERBOSE( device_printf(dev, "Failed to " "allocate DMA pos buffer " "(non-fatal)\n"); ); } else { uint64_t addr = sc->pos_dma.dma_paddr; HDAC_WRITE_4(&sc->mem, HDAC_DPIBUBASE, addr >> 32); HDAC_WRITE_4(&sc->mem, HDAC_DPIBLBASE, (addr & HDAC_DPLBASE_DPLBASE_MASK) | HDAC_DPLBASE_DPLBASE_DMAPBE); } } result = bus_dma_tag_create( bus_get_dma_tag(sc->dev), /* parent */ HDA_DMA_ALIGNMENT, /* alignment */ 0, /* boundary */ (sc->support_64bit) ? BUS_SPACE_MAXADDR : BUS_SPACE_MAXADDR_32BIT, /* lowaddr */ BUS_SPACE_MAXADDR, /* highaddr */ NULL, /* filtfunc */ NULL, /* fistfuncarg */ HDA_BUFSZ_MAX, /* maxsize */ 1, /* nsegments */ HDA_BUFSZ_MAX, /* maxsegsz */ 0, /* flags */ NULL, /* lockfunc */ NULL, /* lockfuncarg */ &sc->chan_dmat); /* dmat */ if (result != 0) { device_printf(dev, "%s: bus_dma_tag_create failed (%x)\n", __func__, result); goto hdac_attach_fail; } /* Quiesce everything */ HDA_BOOTHVERBOSE( device_printf(dev, "Reset controller...\n"); ); hdac_reset(sc, 1); /* Initialize the CORB and RIRB */ hdac_corb_init(sc); hdac_rirb_init(sc); /* Defer remaining of initialization until interrupts are enabled */ sc->intrhook.ich_func = hdac_attach2; sc->intrhook.ich_arg = (void *)sc; if (cold == 0 || config_intrhook_establish(&sc->intrhook) != 0) { sc->intrhook.ich_func = NULL; hdac_attach2((void *)sc); } return (0); hdac_attach_fail: hdac_irq_free(sc); for (i = 0; i < sc->num_ss; i++) hdac_dma_free(sc, &sc->streams[i].bdl); free(sc->streams, M_HDAC); hdac_dma_free(sc, &sc->rirb_dma); hdac_dma_free(sc, &sc->corb_dma); hdac_mem_free(sc); snd_mtxfree(sc->lock); return (ENXIO); } static int sysctl_hdac_pindump(SYSCTL_HANDLER_ARGS) { struct hdac_softc *sc; device_t *devlist; device_t dev; int devcount, i, err, val; dev = oidp->oid_arg1; sc = device_get_softc(dev); if (sc == NULL) return (EINVAL); val = 0; err = sysctl_handle_int(oidp, &val, 0, req); if (err != 0 || req->newptr == NULL || val == 0) return (err); /* XXX: Temporary. For debugging. */ if (val == 100) { hdac_suspend(dev); return (0); } else if (val == 101) { hdac_resume(dev); return (0); } if ((err = device_get_children(dev, &devlist, &devcount)) != 0) return (err); hdac_lock(sc); for (i = 0; i < devcount; i++) HDAC_PINDUMP(devlist[i]); hdac_unlock(sc); free(devlist, M_TEMP); return (0); } static int hdac_mdata_rate(uint16_t fmt) { static const int mbits[8] = { 8, 16, 32, 32, 32, 32, 32, 32 }; int rate, bits; if (fmt & (1 << 14)) rate = 44100; else rate = 48000; rate *= ((fmt >> 11) & 0x07) + 1; rate /= ((fmt >> 8) & 0x07) + 1; bits = mbits[(fmt >> 4) & 0x03]; bits *= (fmt & 0x0f) + 1; return (rate * bits); } static int hdac_bdata_rate(uint16_t fmt, int output) { static const int bbits[8] = { 8, 16, 20, 24, 32, 32, 32, 32 }; int rate, bits; rate = 48000; rate *= ((fmt >> 11) & 0x07) + 1; bits = bbits[(fmt >> 4) & 0x03]; bits *= (fmt & 0x0f) + 1; if (!output) bits = ((bits + 7) & ~0x07) + 10; return (rate * bits); } static void hdac_poll_reinit(struct hdac_softc *sc) { int i, pollticks, min = 1000000; struct hdac_stream *s; if (sc->polling == 0) return; if (sc->unsol_registered > 0) min = hz / 2; for (i = 0; i < sc->num_ss; i++) { s = &sc->streams[i]; if (s->running == 0) continue; pollticks = ((uint64_t)hz * s->blksz) / (hdac_mdata_rate(s->format) / 8); pollticks >>= 1; if (pollticks > hz) pollticks = hz; if (pollticks < 1) { HDA_BOOTVERBOSE( device_printf(sc->dev, "poll interval < 1 tick !\n"); ); pollticks = 1; } if (min > pollticks) min = pollticks; } HDA_BOOTVERBOSE( device_printf(sc->dev, "poll interval %d -> %d ticks\n", sc->poll_ival, min); ); sc->poll_ival = min; if (min == 1000000) callout_stop(&sc->poll_callout); else callout_reset(&sc->poll_callout, 1, hdac_poll_callback, sc); } static int sysctl_hdac_polling(SYSCTL_HANDLER_ARGS) { struct hdac_softc *sc; device_t dev; uint32_t ctl; int err, val; dev = oidp->oid_arg1; sc = device_get_softc(dev); if (sc == NULL) return (EINVAL); hdac_lock(sc); val = sc->polling; hdac_unlock(sc); err = sysctl_handle_int(oidp, &val, 0, req); if (err != 0 || req->newptr == NULL) return (err); if (val < 0 || val > 1) return (EINVAL); hdac_lock(sc); if (val != sc->polling) { if (val == 0) { callout_stop(&sc->poll_callout); hdac_unlock(sc); callout_drain(&sc->poll_callout); hdac_lock(sc); sc->polling = 0; ctl = HDAC_READ_4(&sc->mem, HDAC_INTCTL); ctl |= HDAC_INTCTL_GIE; HDAC_WRITE_4(&sc->mem, HDAC_INTCTL, ctl); } else { ctl = HDAC_READ_4(&sc->mem, HDAC_INTCTL); ctl &= ~HDAC_INTCTL_GIE; HDAC_WRITE_4(&sc->mem, HDAC_INTCTL, ctl); sc->polling = 1; hdac_poll_reinit(sc); } } hdac_unlock(sc); return (err); } static void hdac_attach2(void *arg) { struct hdac_softc *sc; device_t child; uint32_t vendorid, revisionid; int i; uint16_t statests; sc = (struct hdac_softc *)arg; hdac_lock(sc); /* Remove ourselves from the config hooks */ if (sc->intrhook.ich_func != NULL) { config_intrhook_disestablish(&sc->intrhook); sc->intrhook.ich_func = NULL; } HDA_BOOTHVERBOSE( device_printf(sc->dev, "Starting CORB Engine...\n"); ); hdac_corb_start(sc); HDA_BOOTHVERBOSE( device_printf(sc->dev, "Starting RIRB Engine...\n"); ); hdac_rirb_start(sc); HDA_BOOTHVERBOSE( device_printf(sc->dev, "Enabling controller interrupt...\n"); ); HDAC_WRITE_4(&sc->mem, HDAC_GCTL, HDAC_READ_4(&sc->mem, HDAC_GCTL) | HDAC_GCTL_UNSOL); if (sc->polling == 0) { HDAC_WRITE_4(&sc->mem, HDAC_INTCTL, HDAC_INTCTL_CIE | HDAC_INTCTL_GIE); } DELAY(1000); HDA_BOOTHVERBOSE( device_printf(sc->dev, "Scanning HDA codecs ...\n"); ); statests = HDAC_READ_2(&sc->mem, HDAC_STATESTS); hdac_unlock(sc); for (i = 0; i < HDAC_CODEC_MAX; i++) { if (HDAC_STATESTS_SDIWAKE(statests, i)) { HDA_BOOTHVERBOSE( device_printf(sc->dev, "Found CODEC at address %d\n", i); ); hdac_lock(sc); vendorid = hdac_send_command(sc, i, HDA_CMD_GET_PARAMETER(0, 0x0, HDA_PARAM_VENDOR_ID)); revisionid = hdac_send_command(sc, i, HDA_CMD_GET_PARAMETER(0, 0x0, HDA_PARAM_REVISION_ID)); hdac_unlock(sc); if (vendorid == HDA_INVALID && revisionid == HDA_INVALID) { device_printf(sc->dev, "CODEC is not responding!\n"); continue; } sc->codecs[i].vendor_id = HDA_PARAM_VENDOR_ID_VENDOR_ID(vendorid); sc->codecs[i].device_id = HDA_PARAM_VENDOR_ID_DEVICE_ID(vendorid); sc->codecs[i].revision_id = HDA_PARAM_REVISION_ID_REVISION_ID(revisionid); sc->codecs[i].stepping_id = HDA_PARAM_REVISION_ID_STEPPING_ID(revisionid); child = device_add_child(sc->dev, "hdacc", -1); if (child == NULL) { device_printf(sc->dev, "Failed to add CODEC device\n"); continue; } device_set_ivars(child, (void *)(intptr_t)i); sc->codecs[i].dev = child; } } bus_generic_attach(sc->dev); SYSCTL_ADD_PROC(device_get_sysctl_ctx(sc->dev), SYSCTL_CHILDREN(device_get_sysctl_tree(sc->dev)), OID_AUTO, "pindump", CTLTYPE_INT | CTLFLAG_RW, sc->dev, sizeof(sc->dev), sysctl_hdac_pindump, "I", "Dump pin states/data"); SYSCTL_ADD_PROC(device_get_sysctl_ctx(sc->dev), SYSCTL_CHILDREN(device_get_sysctl_tree(sc->dev)), OID_AUTO, "polling", CTLTYPE_INT | CTLFLAG_RW, sc->dev, sizeof(sc->dev), sysctl_hdac_polling, "I", "Enable polling mode"); } /**************************************************************************** * int hdac_suspend(device_t) * * Suspend and power down HDA bus and codecs. ****************************************************************************/ static int hdac_suspend(device_t dev) { struct hdac_softc *sc = device_get_softc(dev); HDA_BOOTHVERBOSE( device_printf(dev, "Suspend...\n"); ); bus_generic_suspend(dev); hdac_lock(sc); HDA_BOOTHVERBOSE( device_printf(dev, "Reset controller...\n"); ); callout_stop(&sc->poll_callout); hdac_reset(sc, 0); hdac_unlock(sc); callout_drain(&sc->poll_callout); taskqueue_drain(taskqueue_thread, &sc->unsolq_task); HDA_BOOTHVERBOSE( device_printf(dev, "Suspend done\n"); ); return (0); } /**************************************************************************** * int hdac_resume(device_t) * * Powerup and restore HDA bus and codecs state. ****************************************************************************/ static int hdac_resume(device_t dev) { struct hdac_softc *sc = device_get_softc(dev); int error; HDA_BOOTHVERBOSE( device_printf(dev, "Resume...\n"); ); hdac_lock(sc); /* Quiesce everything */ HDA_BOOTHVERBOSE( device_printf(dev, "Reset controller...\n"); ); hdac_reset(sc, 1); /* Initialize the CORB and RIRB */ hdac_corb_init(sc); hdac_rirb_init(sc); HDA_BOOTHVERBOSE( device_printf(dev, "Starting CORB Engine...\n"); ); hdac_corb_start(sc); HDA_BOOTHVERBOSE( device_printf(dev, "Starting RIRB Engine...\n"); ); hdac_rirb_start(sc); HDA_BOOTHVERBOSE( device_printf(dev, "Enabling controller interrupt...\n"); ); HDAC_WRITE_4(&sc->mem, HDAC_GCTL, HDAC_READ_4(&sc->mem, HDAC_GCTL) | HDAC_GCTL_UNSOL); HDAC_WRITE_4(&sc->mem, HDAC_INTCTL, HDAC_INTCTL_CIE | HDAC_INTCTL_GIE); DELAY(1000); hdac_poll_reinit(sc); hdac_unlock(sc); error = bus_generic_resume(dev); HDA_BOOTHVERBOSE( device_printf(dev, "Resume done\n"); ); return (error); } /**************************************************************************** * int hdac_detach(device_t) * * Detach and free up resources utilized by the hdac device. ****************************************************************************/ static int hdac_detach(device_t dev) { struct hdac_softc *sc = device_get_softc(dev); device_t *devlist; int cad, i, devcount, error; if ((error = device_get_children(dev, &devlist, &devcount)) != 0) return (error); for (i = 0; i < devcount; i++) { cad = (intptr_t)device_get_ivars(devlist[i]); if ((error = device_delete_child(dev, devlist[i])) != 0) { free(devlist, M_TEMP); return (error); } sc->codecs[cad].dev = NULL; } free(devlist, M_TEMP); hdac_lock(sc); hdac_reset(sc, 0); hdac_unlock(sc); taskqueue_drain(taskqueue_thread, &sc->unsolq_task); hdac_irq_free(sc); for (i = 0; i < sc->num_ss; i++) hdac_dma_free(sc, &sc->streams[i].bdl); free(sc->streams, M_HDAC); hdac_dma_free(sc, &sc->pos_dma); hdac_dma_free(sc, &sc->rirb_dma); hdac_dma_free(sc, &sc->corb_dma); if (sc->chan_dmat != NULL) { bus_dma_tag_destroy(sc->chan_dmat); sc->chan_dmat = NULL; } hdac_mem_free(sc); snd_mtxfree(sc->lock); return (0); } static bus_dma_tag_t hdac_get_dma_tag(device_t dev, device_t child) { struct hdac_softc *sc = device_get_softc(dev); return (sc->chan_dmat); } static int hdac_print_child(device_t dev, device_t child) { int retval; retval = bus_print_child_header(dev, child); retval += printf(" at cad %d", (int)(intptr_t)device_get_ivars(child)); retval += bus_print_child_footer(dev, child); return (retval); } static int hdac_child_location_str(device_t dev, device_t child, char *buf, size_t buflen) { snprintf(buf, buflen, "cad=%d", (int)(intptr_t)device_get_ivars(child)); return (0); } static int hdac_child_pnpinfo_str_method(device_t dev, device_t child, char *buf, size_t buflen) { struct hdac_softc *sc = device_get_softc(dev); nid_t cad = (uintptr_t)device_get_ivars(child); snprintf(buf, buflen, "vendor=0x%04x device=0x%04x revision=0x%02x " "stepping=0x%02x", sc->codecs[cad].vendor_id, sc->codecs[cad].device_id, sc->codecs[cad].revision_id, sc->codecs[cad].stepping_id); return (0); } static int hdac_read_ivar(device_t dev, device_t child, int which, uintptr_t *result) { struct hdac_softc *sc = device_get_softc(dev); nid_t cad = (uintptr_t)device_get_ivars(child); switch (which) { case HDA_IVAR_CODEC_ID: *result = cad; break; case HDA_IVAR_VENDOR_ID: *result = sc->codecs[cad].vendor_id; break; case HDA_IVAR_DEVICE_ID: *result = sc->codecs[cad].device_id; break; case HDA_IVAR_REVISION_ID: *result = sc->codecs[cad].revision_id; break; case HDA_IVAR_STEPPING_ID: *result = sc->codecs[cad].stepping_id; break; case HDA_IVAR_SUBVENDOR_ID: *result = pci_get_subvendor(dev); break; case HDA_IVAR_SUBDEVICE_ID: *result = pci_get_subdevice(dev); break; case HDA_IVAR_DMA_NOCACHE: *result = (sc->flags & HDAC_F_DMA_NOCACHE) != 0; break; default: return (ENOENT); } return (0); } static struct mtx * hdac_get_mtx(device_t dev, device_t child) { struct hdac_softc *sc = device_get_softc(dev); return (sc->lock); } static uint32_t hdac_codec_command(device_t dev, device_t child, uint32_t verb) { return (hdac_send_command(device_get_softc(dev), (intptr_t)device_get_ivars(child), verb)); } static int hdac_find_stream(struct hdac_softc *sc, int dir, int stream) { int i, ss; ss = -1; /* Allocate ISS/BSS first. */ if (dir == 0) { for (i = 0; i < sc->num_iss; i++) { if (sc->streams[i].stream == stream) { ss = i; break; } } } else { for (i = 0; i < sc->num_oss; i++) { if (sc->streams[i + sc->num_iss].stream == stream) { ss = i + sc->num_iss; break; } } } /* Fallback to BSS. */ if (ss == -1) { for (i = 0; i < sc->num_bss; i++) { if (sc->streams[i + sc->num_iss + sc->num_oss].stream == stream) { ss = i + sc->num_iss + sc->num_oss; break; } } } return (ss); } static int hdac_stream_alloc(device_t dev, device_t child, int dir, int format, int stripe, uint32_t **dmapos) { struct hdac_softc *sc = device_get_softc(dev); nid_t cad = (uintptr_t)device_get_ivars(child); int stream, ss, bw, maxbw, prevbw; /* Look for empty stream. */ ss = hdac_find_stream(sc, dir, 0); /* Return if found nothing. */ if (ss < 0) return (0); /* Check bus bandwidth. */ bw = hdac_bdata_rate(format, dir); if (dir == 1) { bw *= 1 << (sc->num_sdo - stripe); prevbw = sc->sdo_bw_used; maxbw = 48000 * 960 * (1 << sc->num_sdo); } else { prevbw = sc->codecs[cad].sdi_bw_used; maxbw = 48000 * 464; } HDA_BOOTHVERBOSE( device_printf(dev, "%dKbps of %dKbps bandwidth used%s\n", (bw + prevbw) / 1000, maxbw / 1000, bw + prevbw > maxbw ? " -- OVERFLOW!" : ""); ); if (bw + prevbw > maxbw) return (0); if (dir == 1) sc->sdo_bw_used += bw; else sc->codecs[cad].sdi_bw_used += bw; /* Allocate stream number */ if (ss >= sc->num_iss + sc->num_oss) stream = 15 - (ss - sc->num_iss + sc->num_oss); else if (ss >= sc->num_iss) stream = ss - sc->num_iss + 1; else stream = ss + 1; sc->streams[ss].dev = child; sc->streams[ss].dir = dir; sc->streams[ss].stream = stream; sc->streams[ss].bw = bw; sc->streams[ss].format = format; sc->streams[ss].stripe = stripe; if (dmapos != NULL) { if (sc->pos_dma.dma_vaddr != NULL) *dmapos = (uint32_t *)(sc->pos_dma.dma_vaddr + ss * 8); else *dmapos = NULL; } return (stream); } static void hdac_stream_free(device_t dev, device_t child, int dir, int stream) { struct hdac_softc *sc = device_get_softc(dev); nid_t cad = (uintptr_t)device_get_ivars(child); int ss; ss = hdac_find_stream(sc, dir, stream); KASSERT(ss >= 0, ("Free for not allocated stream (%d/%d)\n", dir, stream)); if (dir == 1) sc->sdo_bw_used -= sc->streams[ss].bw; else sc->codecs[cad].sdi_bw_used -= sc->streams[ss].bw; sc->streams[ss].stream = 0; sc->streams[ss].dev = NULL; } static int hdac_stream_start(device_t dev, device_t child, int dir, int stream, bus_addr_t buf, int blksz, int blkcnt) { struct hdac_softc *sc = device_get_softc(dev); struct hdac_bdle *bdle; uint64_t addr; int i, ss, off; uint32_t ctl; ss = hdac_find_stream(sc, dir, stream); KASSERT(ss >= 0, ("Start for not allocated stream (%d/%d)\n", dir, stream)); addr = (uint64_t)buf; bdle = (struct hdac_bdle *)sc->streams[ss].bdl.dma_vaddr; for (i = 0; i < blkcnt; i++, bdle++) { bdle->addrl = (uint32_t)addr; bdle->addrh = (uint32_t)(addr >> 32); bdle->len = blksz; bdle->ioc = 1; addr += blksz; } off = ss << 5; HDAC_WRITE_4(&sc->mem, off + HDAC_SDCBL, blksz * blkcnt); HDAC_WRITE_2(&sc->mem, off + HDAC_SDLVI, blkcnt - 1); addr = sc->streams[ss].bdl.dma_paddr; HDAC_WRITE_4(&sc->mem, off + HDAC_SDBDPL, (uint32_t)addr); HDAC_WRITE_4(&sc->mem, off + HDAC_SDBDPU, (uint32_t)(addr >> 32)); ctl = HDAC_READ_1(&sc->mem, off + HDAC_SDCTL2); if (dir) ctl |= HDAC_SDCTL2_DIR; else ctl &= ~HDAC_SDCTL2_DIR; ctl &= ~HDAC_SDCTL2_STRM_MASK; ctl |= stream << HDAC_SDCTL2_STRM_SHIFT; ctl &= ~HDAC_SDCTL2_STRIPE_MASK; ctl |= sc->streams[ss].stripe << HDAC_SDCTL2_STRIPE_SHIFT; HDAC_WRITE_1(&sc->mem, off + HDAC_SDCTL2, ctl); HDAC_WRITE_2(&sc->mem, off + HDAC_SDFMT, sc->streams[ss].format); ctl = HDAC_READ_4(&sc->mem, HDAC_INTCTL); ctl |= 1 << ss; HDAC_WRITE_4(&sc->mem, HDAC_INTCTL, ctl); HDAC_WRITE_1(&sc->mem, off + HDAC_SDSTS, HDAC_SDSTS_DESE | HDAC_SDSTS_FIFOE | HDAC_SDSTS_BCIS); ctl = HDAC_READ_1(&sc->mem, off + HDAC_SDCTL0); ctl |= HDAC_SDCTL_IOCE | HDAC_SDCTL_FEIE | HDAC_SDCTL_DEIE | HDAC_SDCTL_RUN; HDAC_WRITE_1(&sc->mem, off + HDAC_SDCTL0, ctl); sc->streams[ss].blksz = blksz; sc->streams[ss].running = 1; hdac_poll_reinit(sc); return (0); } static void hdac_stream_stop(device_t dev, device_t child, int dir, int stream) { struct hdac_softc *sc = device_get_softc(dev); int ss, off; uint32_t ctl; ss = hdac_find_stream(sc, dir, stream); KASSERT(ss >= 0, ("Stop for not allocated stream (%d/%d)\n", dir, stream)); off = ss << 5; ctl = HDAC_READ_1(&sc->mem, off + HDAC_SDCTL0); ctl &= ~(HDAC_SDCTL_IOCE | HDAC_SDCTL_FEIE | HDAC_SDCTL_DEIE | HDAC_SDCTL_RUN); HDAC_WRITE_1(&sc->mem, off + HDAC_SDCTL0, ctl); ctl = HDAC_READ_4(&sc->mem, HDAC_INTCTL); ctl &= ~(1 << ss); HDAC_WRITE_4(&sc->mem, HDAC_INTCTL, ctl); sc->streams[ss].running = 0; hdac_poll_reinit(sc); } static void hdac_stream_reset(device_t dev, device_t child, int dir, int stream) { struct hdac_softc *sc = device_get_softc(dev); int timeout = 1000; int to = timeout; int ss, off; uint32_t ctl; ss = hdac_find_stream(sc, dir, stream); KASSERT(ss >= 0, ("Reset for not allocated stream (%d/%d)\n", dir, stream)); off = ss << 5; ctl = HDAC_READ_1(&sc->mem, off + HDAC_SDCTL0); ctl |= HDAC_SDCTL_SRST; HDAC_WRITE_1(&sc->mem, off + HDAC_SDCTL0, ctl); do { ctl = HDAC_READ_1(&sc->mem, off + HDAC_SDCTL0); if (ctl & HDAC_SDCTL_SRST) break; DELAY(10); } while (--to); if (!(ctl & HDAC_SDCTL_SRST)) device_printf(dev, "Reset setting timeout\n"); ctl &= ~HDAC_SDCTL_SRST; HDAC_WRITE_1(&sc->mem, off + HDAC_SDCTL0, ctl); to = timeout; do { ctl = HDAC_READ_1(&sc->mem, off + HDAC_SDCTL0); if (!(ctl & HDAC_SDCTL_SRST)) break; DELAY(10); } while (--to); if (ctl & HDAC_SDCTL_SRST) device_printf(dev, "Reset timeout!\n"); } static uint32_t hdac_stream_getptr(device_t dev, device_t child, int dir, int stream) { struct hdac_softc *sc = device_get_softc(dev); int ss, off; ss = hdac_find_stream(sc, dir, stream); KASSERT(ss >= 0, ("Reset for not allocated stream (%d/%d)\n", dir, stream)); off = ss << 5; return (HDAC_READ_4(&sc->mem, off + HDAC_SDLPIB)); } static int hdac_unsol_alloc(device_t dev, device_t child, int tag) { struct hdac_softc *sc = device_get_softc(dev); sc->unsol_registered++; hdac_poll_reinit(sc); return (tag); } static void hdac_unsol_free(device_t dev, device_t child, int tag) { struct hdac_softc *sc = device_get_softc(dev); sc->unsol_registered--; hdac_poll_reinit(sc); } static device_method_t hdac_methods[] = { /* device interface */ DEVMETHOD(device_probe, hdac_probe), DEVMETHOD(device_attach, hdac_attach), DEVMETHOD(device_detach, hdac_detach), DEVMETHOD(device_suspend, hdac_suspend), DEVMETHOD(device_resume, hdac_resume), /* Bus interface */ DEVMETHOD(bus_get_dma_tag, hdac_get_dma_tag), DEVMETHOD(bus_print_child, hdac_print_child), DEVMETHOD(bus_child_location_str, hdac_child_location_str), DEVMETHOD(bus_child_pnpinfo_str, hdac_child_pnpinfo_str_method), DEVMETHOD(bus_read_ivar, hdac_read_ivar), DEVMETHOD(hdac_get_mtx, hdac_get_mtx), DEVMETHOD(hdac_codec_command, hdac_codec_command), DEVMETHOD(hdac_stream_alloc, hdac_stream_alloc), DEVMETHOD(hdac_stream_free, hdac_stream_free), DEVMETHOD(hdac_stream_start, hdac_stream_start), DEVMETHOD(hdac_stream_stop, hdac_stream_stop), DEVMETHOD(hdac_stream_reset, hdac_stream_reset), DEVMETHOD(hdac_stream_getptr, hdac_stream_getptr), DEVMETHOD(hdac_unsol_alloc, hdac_unsol_alloc), DEVMETHOD(hdac_unsol_free, hdac_unsol_free), DEVMETHOD_END }; static driver_t hdac_driver = { "hdac", hdac_methods, sizeof(struct hdac_softc), }; static devclass_t hdac_devclass; DRIVER_MODULE(snd_hda, pci, hdac_driver, hdac_devclass, NULL, NULL); Index: user/ngie/more-tests/sys/dev/sound/pci/hda/hdac.h =================================================================== --- user/ngie/more-tests/sys/dev/sound/pci/hda/hdac.h (revision 281584) +++ user/ngie/more-tests/sys/dev/sound/pci/hda/hdac.h (revision 281585) @@ -1,708 +1,713 @@ /*- * Copyright (c) 2006 Stephane E. Potvin * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * $FreeBSD$ */ #ifndef _HDAC_H_ #define _HDAC_H_ #include "hdac_if.h" /**************************************************************************** * Miscellanious defines ****************************************************************************/ /* Controller models */ #define HDA_MODEL_CONSTRUCT(vendor, model) \ (((uint32_t)(model) << 16) | ((vendor##_VENDORID) & 0xffff)) /* Intel */ #define INTEL_VENDORID 0x8086 #define HDA_INTEL_OAK HDA_MODEL_CONSTRUCT(INTEL, 0x080a) #define HDA_INTEL_BAY HDA_MODEL_CONSTRUCT(INTEL, 0x0f04) #define HDA_INTEL_HSW1 HDA_MODEL_CONSTRUCT(INTEL, 0x0a0c) #define HDA_INTEL_HSW2 HDA_MODEL_CONSTRUCT(INTEL, 0x0c0c) #define HDA_INTEL_HSW3 HDA_MODEL_CONSTRUCT(INTEL, 0x0d0c) +#define HDA_INTEL_BDW1 HDA_MODEL_CONSTRUCT(INTEL, 0x160c) #define HDA_INTEL_CPT HDA_MODEL_CONSTRUCT(INTEL, 0x1c20) #define HDA_INTEL_PATSBURG HDA_MODEL_CONSTRUCT(INTEL, 0x1d20) #define HDA_INTEL_PPT1 HDA_MODEL_CONSTRUCT(INTEL, 0x1e20) #define HDA_INTEL_82801F HDA_MODEL_CONSTRUCT(INTEL, 0x2668) #define HDA_INTEL_63XXESB HDA_MODEL_CONSTRUCT(INTEL, 0x269a) #define HDA_INTEL_82801G HDA_MODEL_CONSTRUCT(INTEL, 0x27d8) #define HDA_INTEL_82801H HDA_MODEL_CONSTRUCT(INTEL, 0x284b) #define HDA_INTEL_82801I HDA_MODEL_CONSTRUCT(INTEL, 0x293e) #define HDA_INTEL_82801JI HDA_MODEL_CONSTRUCT(INTEL, 0x3a3e) #define HDA_INTEL_82801JD HDA_MODEL_CONSTRUCT(INTEL, 0x3a6e) #define HDA_INTEL_PCH HDA_MODEL_CONSTRUCT(INTEL, 0x3b56) #define HDA_INTEL_PCH2 HDA_MODEL_CONSTRUCT(INTEL, 0x3b57) #define HDA_INTEL_MACBOOKPRO92 HDA_MODEL_CONSTRUCT(INTEL, 0x7270) #define HDA_INTEL_SCH HDA_MODEL_CONSTRUCT(INTEL, 0x811b) #define HDA_INTEL_LPT1 HDA_MODEL_CONSTRUCT(INTEL, 0x8c20) #define HDA_INTEL_LPT2 HDA_MODEL_CONSTRUCT(INTEL, 0x8c21) #define HDA_INTEL_WCPT HDA_MODEL_CONSTRUCT(INTEL, 0x8ca0) #define HDA_INTEL_WELLS1 HDA_MODEL_CONSTRUCT(INTEL, 0x8d20) #define HDA_INTEL_WELLS2 HDA_MODEL_CONSTRUCT(INTEL, 0x8d21) #define HDA_INTEL_LPTLP1 HDA_MODEL_CONSTRUCT(INTEL, 0x9c20) #define HDA_INTEL_LPTLP2 HDA_MODEL_CONSTRUCT(INTEL, 0x9c21) +#define HDA_INTEL_BDW2 HDA_MODEL_CONSTRUCT(INTEL, 0x9ca0) #define HDA_INTEL_ALL HDA_MODEL_CONSTRUCT(INTEL, 0xffff) /* Nvidia */ #define NVIDIA_VENDORID 0x10de #define HDA_NVIDIA_MCP51 HDA_MODEL_CONSTRUCT(NVIDIA, 0x026c) #define HDA_NVIDIA_MCP55 HDA_MODEL_CONSTRUCT(NVIDIA, 0x0371) #define HDA_NVIDIA_MCP61_1 HDA_MODEL_CONSTRUCT(NVIDIA, 0x03e4) #define HDA_NVIDIA_MCP61_2 HDA_MODEL_CONSTRUCT(NVIDIA, 0x03f0) #define HDA_NVIDIA_MCP65_1 HDA_MODEL_CONSTRUCT(NVIDIA, 0x044a) #define HDA_NVIDIA_MCP65_2 HDA_MODEL_CONSTRUCT(NVIDIA, 0x044b) #define HDA_NVIDIA_MCP67_1 HDA_MODEL_CONSTRUCT(NVIDIA, 0x055c) #define HDA_NVIDIA_MCP67_2 HDA_MODEL_CONSTRUCT(NVIDIA, 0x055d) #define HDA_NVIDIA_MCP78_1 HDA_MODEL_CONSTRUCT(NVIDIA, 0x0774) #define HDA_NVIDIA_MCP78_2 HDA_MODEL_CONSTRUCT(NVIDIA, 0x0775) #define HDA_NVIDIA_MCP78_3 HDA_MODEL_CONSTRUCT(NVIDIA, 0x0776) #define HDA_NVIDIA_MCP78_4 HDA_MODEL_CONSTRUCT(NVIDIA, 0x0777) #define HDA_NVIDIA_MCP73_1 HDA_MODEL_CONSTRUCT(NVIDIA, 0x07fc) #define HDA_NVIDIA_MCP73_2 HDA_MODEL_CONSTRUCT(NVIDIA, 0x07fd) #define HDA_NVIDIA_MCP79_1 HDA_MODEL_CONSTRUCT(NVIDIA, 0x0ac0) #define HDA_NVIDIA_MCP79_2 HDA_MODEL_CONSTRUCT(NVIDIA, 0x0ac1) #define HDA_NVIDIA_MCP79_3 HDA_MODEL_CONSTRUCT(NVIDIA, 0x0ac2) #define HDA_NVIDIA_MCP79_4 HDA_MODEL_CONSTRUCT(NVIDIA, 0x0ac3) #define HDA_NVIDIA_0BE2 HDA_MODEL_CONSTRUCT(NVIDIA, 0x0be2) #define HDA_NVIDIA_0BE3 HDA_MODEL_CONSTRUCT(NVIDIA, 0x0be3) #define HDA_NVIDIA_0BE4 HDA_MODEL_CONSTRUCT(NVIDIA, 0x0be4) #define HDA_NVIDIA_GT100 HDA_MODEL_CONSTRUCT(NVIDIA, 0x0be5) #define HDA_NVIDIA_GT106 HDA_MODEL_CONSTRUCT(NVIDIA, 0x0be9) #define HDA_NVIDIA_GT108 HDA_MODEL_CONSTRUCT(NVIDIA, 0x0bea) #define HDA_NVIDIA_GT104 HDA_MODEL_CONSTRUCT(NVIDIA, 0x0beb) #define HDA_NVIDIA_GT116 HDA_MODEL_CONSTRUCT(NVIDIA, 0x0bee) #define HDA_NVIDIA_MCP89_1 HDA_MODEL_CONSTRUCT(NVIDIA, 0x0d94) #define HDA_NVIDIA_MCP89_2 HDA_MODEL_CONSTRUCT(NVIDIA, 0x0d95) #define HDA_NVIDIA_MCP89_3 HDA_MODEL_CONSTRUCT(NVIDIA, 0x0d96) #define HDA_NVIDIA_MCP89_4 HDA_MODEL_CONSTRUCT(NVIDIA, 0x0d97) #define HDA_NVIDIA_GF119 HDA_MODEL_CONSTRUCT(NVIDIA, 0x0e08) #define HDA_NVIDIA_GF110_1 HDA_MODEL_CONSTRUCT(NVIDIA, 0x0e09) #define HDA_NVIDIA_GF110_2 HDA_MODEL_CONSTRUCT(NVIDIA, 0x0e0c) #define HDA_NVIDIA_ALL HDA_MODEL_CONSTRUCT(NVIDIA, 0xffff) /* ATI */ #define ATI_VENDORID 0x1002 #define HDA_ATI_SB450 HDA_MODEL_CONSTRUCT(ATI, 0x437b) #define HDA_ATI_SB600 HDA_MODEL_CONSTRUCT(ATI, 0x4383) #define HDA_ATI_RS600 HDA_MODEL_CONSTRUCT(ATI, 0x793b) #define HDA_ATI_RS690 HDA_MODEL_CONSTRUCT(ATI, 0x7919) #define HDA_ATI_RS780 HDA_MODEL_CONSTRUCT(ATI, 0x960f) #define HDA_ATI_R600 HDA_MODEL_CONSTRUCT(ATI, 0xaa00) #define HDA_ATI_RV630 HDA_MODEL_CONSTRUCT(ATI, 0xaa08) #define HDA_ATI_RV610 HDA_MODEL_CONSTRUCT(ATI, 0xaa10) #define HDA_ATI_RV670 HDA_MODEL_CONSTRUCT(ATI, 0xaa18) #define HDA_ATI_RV635 HDA_MODEL_CONSTRUCT(ATI, 0xaa20) #define HDA_ATI_RV620 HDA_MODEL_CONSTRUCT(ATI, 0xaa28) #define HDA_ATI_RV770 HDA_MODEL_CONSTRUCT(ATI, 0xaa30) #define HDA_ATI_RV730 HDA_MODEL_CONSTRUCT(ATI, 0xaa38) #define HDA_ATI_RV710 HDA_MODEL_CONSTRUCT(ATI, 0xaa40) #define HDA_ATI_RV740 HDA_MODEL_CONSTRUCT(ATI, 0xaa48) #define HDA_ATI_RV870 HDA_MODEL_CONSTRUCT(ATI, 0xaa50) #define HDA_ATI_RV840 HDA_MODEL_CONSTRUCT(ATI, 0xaa58) #define HDA_ATI_RV830 HDA_MODEL_CONSTRUCT(ATI, 0xaa60) #define HDA_ATI_RV810 HDA_MODEL_CONSTRUCT(ATI, 0xaa68) #define HDA_ATI_RV970 HDA_MODEL_CONSTRUCT(ATI, 0xaa80) #define HDA_ATI_RV940 HDA_MODEL_CONSTRUCT(ATI, 0xaa88) #define HDA_ATI_RV930 HDA_MODEL_CONSTRUCT(ATI, 0xaa90) #define HDA_ATI_RV910 HDA_MODEL_CONSTRUCT(ATI, 0xaa98) #define HDA_ATI_R1000 HDA_MODEL_CONSTRUCT(ATI, 0xaaa0) #define HDA_ATI_ALL HDA_MODEL_CONSTRUCT(ATI, 0xffff) /* RDC */ #define RDC_VENDORID 0x17f3 #define HDA_RDC_M3010 HDA_MODEL_CONSTRUCT(RDC, 0x3010) /* VIA */ #define VIA_VENDORID 0x1106 #define HDA_VIA_VT82XX HDA_MODEL_CONSTRUCT(VIA, 0x3288) #define HDA_VIA_ALL HDA_MODEL_CONSTRUCT(VIA, 0xffff) /* SiS */ #define SIS_VENDORID 0x1039 #define HDA_SIS_966 HDA_MODEL_CONSTRUCT(SIS, 0x7502) #define HDA_SIS_ALL HDA_MODEL_CONSTRUCT(SIS, 0xffff) /* ULI */ #define ULI_VENDORID 0x10b9 #define HDA_ULI_M5461 HDA_MODEL_CONSTRUCT(ULI, 0x5461) #define HDA_ULI_ALL HDA_MODEL_CONSTRUCT(ULI, 0xffff) /* OEM/subvendors */ /* Intel */ #define INTEL_DH87RL_SUBVENDOR HDA_MODEL_CONSTRUCT(INTEL, 0x204a) #define INTEL_D101GGC_SUBVENDOR HDA_MODEL_CONSTRUCT(INTEL, 0xd600) /* HP/Compaq */ #define HP_VENDORID 0x103c #define HP_V3000_SUBVENDOR HDA_MODEL_CONSTRUCT(HP, 0x30b5) #define HP_NX7400_SUBVENDOR HDA_MODEL_CONSTRUCT(HP, 0x30a2) #define HP_NX6310_SUBVENDOR HDA_MODEL_CONSTRUCT(HP, 0x30aa) #define HP_NX6325_SUBVENDOR HDA_MODEL_CONSTRUCT(HP, 0x30b0) #define HP_XW4300_SUBVENDOR HDA_MODEL_CONSTRUCT(HP, 0x3013) #define HP_3010_SUBVENDOR HDA_MODEL_CONSTRUCT(HP, 0x3010) #define HP_DV5000_SUBVENDOR HDA_MODEL_CONSTRUCT(HP, 0x30a5) #define HP_DC7700S_SUBVENDOR HDA_MODEL_CONSTRUCT(HP, 0x2801) #define HP_DC7700_SUBVENDOR HDA_MODEL_CONSTRUCT(HP, 0x2802) #define HP_ALL_SUBVENDOR HDA_MODEL_CONSTRUCT(HP, 0xffff) /* What is wrong with XN 2563 anyway? (Got the picture ?) */ #define HP_NX6325_SUBVENDORX 0x103c30b0 /* Dell */ #define DELL_VENDORID 0x1028 #define DELL_D630_SUBVENDOR HDA_MODEL_CONSTRUCT(DELL, 0x01f9) #define DELL_D820_SUBVENDOR HDA_MODEL_CONSTRUCT(DELL, 0x01cc) #define DELL_V1400_SUBVENDOR HDA_MODEL_CONSTRUCT(DELL, 0x0227) #define DELL_V1500_SUBVENDOR HDA_MODEL_CONSTRUCT(DELL, 0x0228) #define DELL_I1300_SUBVENDOR HDA_MODEL_CONSTRUCT(DELL, 0x01c9) #define DELL_XPSM1210_SUBVENDOR HDA_MODEL_CONSTRUCT(DELL, 0x01d7) #define DELL_OPLX745_SUBVENDOR HDA_MODEL_CONSTRUCT(DELL, 0x01da) #define DELL_ALL_SUBVENDOR HDA_MODEL_CONSTRUCT(DELL, 0xffff) /* Clevo */ #define CLEVO_VENDORID 0x1558 #define CLEVO_D900T_SUBVENDOR HDA_MODEL_CONSTRUCT(CLEVO, 0x0900) #define CLEVO_ALL_SUBVENDOR HDA_MODEL_CONSTRUCT(CLEVO, 0xffff) /* Acer */ #define ACER_VENDORID 0x1025 #define ACER_A5050_SUBVENDOR HDA_MODEL_CONSTRUCT(ACER, 0x010f) #define ACER_A4520_SUBVENDOR HDA_MODEL_CONSTRUCT(ACER, 0x0127) #define ACER_A4710_SUBVENDOR HDA_MODEL_CONSTRUCT(ACER, 0x012f) #define ACER_A4715_SUBVENDOR HDA_MODEL_CONSTRUCT(ACER, 0x0133) #define ACER_3681WXM_SUBVENDOR HDA_MODEL_CONSTRUCT(ACER, 0x0110) #define ACER_T6292_SUBVENDOR HDA_MODEL_CONSTRUCT(ACER, 0x011b) #define ACER_T5320_SUBVENDOR HDA_MODEL_CONSTRUCT(ACER, 0x011f) #define ACER_ALL_SUBVENDOR HDA_MODEL_CONSTRUCT(ACER, 0xffff) /* Asus */ #define ASUS_VENDORID 0x1043 #define ASUS_A8X_SUBVENDOR HDA_MODEL_CONSTRUCT(ASUS, 0x1153) #define ASUS_U5F_SUBVENDOR HDA_MODEL_CONSTRUCT(ASUS, 0x1263) #define ASUS_W6F_SUBVENDOR HDA_MODEL_CONSTRUCT(ASUS, 0x1263) #define ASUS_A7M_SUBVENDOR HDA_MODEL_CONSTRUCT(ASUS, 0x1323) #define ASUS_F3JC_SUBVENDOR HDA_MODEL_CONSTRUCT(ASUS, 0x1338) #define ASUS_G2K_SUBVENDOR HDA_MODEL_CONSTRUCT(ASUS, 0x1339) #define ASUS_A7T_SUBVENDOR HDA_MODEL_CONSTRUCT(ASUS, 0x13c2) #define ASUS_UX31A_SUBVENDOR HDA_MODEL_CONSTRUCT(ASUS, 0x1517) #define ASUS_W2J_SUBVENDOR HDA_MODEL_CONSTRUCT(ASUS, 0x1971) #define ASUS_M5200_SUBVENDOR HDA_MODEL_CONSTRUCT(ASUS, 0x1993) #define ASUS_P5PL2_SUBVENDOR HDA_MODEL_CONSTRUCT(ASUS, 0x817f) #define ASUS_P1AH2_SUBVENDOR HDA_MODEL_CONSTRUCT(ASUS, 0x81cb) #define ASUS_M2NPVMX_SUBVENDOR HDA_MODEL_CONSTRUCT(ASUS, 0x81cb) #define ASUS_M2V_SUBVENDOR HDA_MODEL_CONSTRUCT(ASUS, 0x81e7) #define ASUS_P5BWD_SUBVENDOR HDA_MODEL_CONSTRUCT(ASUS, 0x81ec) #define ASUS_M2N_SUBVENDOR HDA_MODEL_CONSTRUCT(ASUS, 0x8234) #define ASUS_A8NVMCSM_SUBVENDOR HDA_MODEL_CONSTRUCT(NVIDIA, 0xcb84) #define ASUS_ALL_SUBVENDOR HDA_MODEL_CONSTRUCT(ASUS, 0xffff) /* IBM / Lenovo */ #define IBM_VENDORID 0x1014 #define IBM_M52_SUBVENDOR HDA_MODEL_CONSTRUCT(IBM, 0x02f6) #define IBM_ALL_SUBVENDOR HDA_MODEL_CONSTRUCT(IBM, 0xffff) /* Lenovo */ #define LENOVO_VENDORID 0x17aa #define LENOVO_3KN100_SUBVENDOR HDA_MODEL_CONSTRUCT(LENOVO, 0x2066) #define LENOVO_3KN200_SUBVENDOR HDA_MODEL_CONSTRUCT(LENOVO, 0x384e) #define LENOVO_B450_SUBVENDOR HDA_MODEL_CONSTRUCT(LENOVO, 0x3a0d) #define LENOVO_TCA55_SUBVENDOR HDA_MODEL_CONSTRUCT(LENOVO, 0x1015) #define LENOVO_X1_SUBVENDOR HDA_MODEL_CONSTRUCT(LENOVO, 0x21e8) #define LENOVO_X1CRBN_SUBVENDOR HDA_MODEL_CONSTRUCT(LENOVO, 0x21f9) +#define LENOVO_X120BS_SUBVENDOR HDA_MODEL_CONSTRUCT(LENOVO, 0x2227) #define LENOVO_X220_SUBVENDOR HDA_MODEL_CONSTRUCT(LENOVO, 0x21da) #define LENOVO_X300_SUBVENDOR HDA_MODEL_CONSTRUCT(LENOVO, 0x20ac) #define LENOVO_T400_SUBVENDOR HDA_MODEL_CONSTRUCT(LENOVO, 0x20f2) #define LENOVO_T420_SUBVENDOR HDA_MODEL_CONSTRUCT(LENOVO, 0x21ce) #define LENOVO_T430_SUBVENDOR HDA_MODEL_CONSTRUCT(LENOVO, 0x21f3) #define LENOVO_T430S_SUBVENDOR HDA_MODEL_CONSTRUCT(LENOVO, 0x21fb) #define LENOVO_T520_SUBVENDOR HDA_MODEL_CONSTRUCT(LENOVO, 0x21cf) #define LENOVO_T530_SUBVENDOR HDA_MODEL_CONSTRUCT(LENOVO, 0x21f6) #define LENOVO_G580_SUBVENDOR HDA_MODEL_CONSTRUCT(LENOVO, 0x3977) #define LENOVO_ALL_SUBVENDOR HDA_MODEL_CONSTRUCT(LENOVO, 0xffff) /* Samsung */ #define SAMSUNG_VENDORID 0x144d #define SAMSUNG_Q1_SUBVENDOR HDA_MODEL_CONSTRUCT(SAMSUNG, 0xc027) #define SAMSUNG_ALL_SUBVENDOR HDA_MODEL_CONSTRUCT(SAMSUNG, 0xffff) /* Medion ? */ #define MEDION_VENDORID 0x161f #define MEDION_MD95257_SUBVENDOR HDA_MODEL_CONSTRUCT(MEDION, 0x203d) #define MEDION_ALL_SUBVENDOR HDA_MODEL_CONSTRUCT(MEDION, 0xffff) /* Apple Computer Inc. */ #define APPLE_VENDORID 0x106b #define APPLE_MB3_SUBVENDOR HDA_MODEL_CONSTRUCT(APPLE, 0x00a1) /* Sony */ #define SONY_VENDORID 0x104d #define SONY_S5_SUBVENDOR HDA_MODEL_CONSTRUCT(SONY, 0x81cc) #define SONY_ALL_SUBVENDOR HDA_MODEL_CONSTRUCT(SONY, 0xffff) /* * Apple Intel MacXXXX seems using Sigmatel codec/vendor id * instead of their own, which is beyond my comprehension * (see HDA_CODEC_STAC9221 below). */ #define APPLE_INTEL_MAC 0x76808384 #define APPLE_MACBOOKAIR31 0x0d9410de #define APPLE_MACBOOKPRO55 0xcb7910de #define APPLE_MACBOOKPRO71 0xcb8910de /* LG Electronics */ #define LG_VENDORID 0x1854 #define LG_LW20_SUBVENDOR HDA_MODEL_CONSTRUCT(LG, 0x0018) #define LG_ALL_SUBVENDOR HDA_MODEL_CONSTRUCT(LG, 0xffff) /* Fujitsu Siemens */ #define FS_VENDORID 0x1734 #define FS_PA1510_SUBVENDOR HDA_MODEL_CONSTRUCT(FS, 0x10b8) #define FS_SI1848_SUBVENDOR HDA_MODEL_CONSTRUCT(FS, 0x10cd) #define FS_ALL_SUBVENDOR HDA_MODEL_CONSTRUCT(FS, 0xffff) /* Fujitsu Limited */ #define FL_VENDORID 0x10cf #define FL_S7020D_SUBVENDOR HDA_MODEL_CONSTRUCT(FL, 0x1326) #define FL_U1010_SUBVENDOR HDA_MODEL_CONSTRUCT(FL, 0x142d) #define FL_ALL_SUBVENDOR HDA_MODEL_CONSTRUCT(FL, 0xffff) /* Toshiba */ #define TOSHIBA_VENDORID 0x1179 #define TOSHIBA_U200_SUBVENDOR HDA_MODEL_CONSTRUCT(TOSHIBA, 0x0001) #define TOSHIBA_A135_SUBVENDOR HDA_MODEL_CONSTRUCT(TOSHIBA, 0xff01) #define TOSHIBA_ALL_SUBVENDOR HDA_MODEL_CONSTRUCT(TOSHIBA, 0xffff) /* Micro-Star International (MSI) */ #define MSI_VENDORID 0x1462 #define MSI_MS1034_SUBVENDOR HDA_MODEL_CONSTRUCT(MSI, 0x0349) #define MSI_MS034A_SUBVENDOR HDA_MODEL_CONSTRUCT(MSI, 0x034a) #define MSI_ALL_SUBVENDOR HDA_MODEL_CONSTRUCT(MSI, 0xffff) /* Giga-Byte Technology */ #define GB_VENDORID 0x1458 #define GB_G33S2H_SUBVENDOR HDA_MODEL_CONSTRUCT(GB, 0xa022) #define GP_ALL_SUBVENDOR HDA_MODEL_CONSTRUCT(GB, 0xffff) /* Uniwill ? */ #define UNIWILL_VENDORID 0x1584 #define UNIWILL_9075_SUBVENDOR HDA_MODEL_CONSTRUCT(UNIWILL, 0x9075) #define UNIWILL_9080_SUBVENDOR HDA_MODEL_CONSTRUCT(UNIWILL, 0x9080) /* All codecs you can eat... */ #define HDA_CODEC_CONSTRUCT(vendor, id) \ (((uint32_t)(vendor##_VENDORID) << 16) | ((id) & 0xffff)) /* Cirrus Logic */ #define CIRRUSLOGIC_VENDORID 0x1013 #define HDA_CODEC_CS4206 HDA_CODEC_CONSTRUCT(CIRRUSLOGIC, 0x4206) #define HDA_CODEC_CS4207 HDA_CODEC_CONSTRUCT(CIRRUSLOGIC, 0x4207) #define HDA_CODEC_CS4210 HDA_CODEC_CONSTRUCT(CIRRUSLOGIC, 0x4210) #define HDA_CODEC_CSXXXX HDA_CODEC_CONSTRUCT(CIRRUSLOGIC, 0xffff) /* Realtek */ #define REALTEK_VENDORID 0x10ec #define HDA_CODEC_ALC221 HDA_CODEC_CONSTRUCT(REALTEK, 0x0221) #define HDA_CODEC_ALC260 HDA_CODEC_CONSTRUCT(REALTEK, 0x0260) #define HDA_CODEC_ALC262 HDA_CODEC_CONSTRUCT(REALTEK, 0x0262) #define HDA_CODEC_ALC267 HDA_CODEC_CONSTRUCT(REALTEK, 0x0267) #define HDA_CODEC_ALC268 HDA_CODEC_CONSTRUCT(REALTEK, 0x0268) #define HDA_CODEC_ALC269 HDA_CODEC_CONSTRUCT(REALTEK, 0x0269) #define HDA_CODEC_ALC270 HDA_CODEC_CONSTRUCT(REALTEK, 0x0270) #define HDA_CODEC_ALC272 HDA_CODEC_CONSTRUCT(REALTEK, 0x0272) #define HDA_CODEC_ALC273 HDA_CODEC_CONSTRUCT(REALTEK, 0x0273) #define HDA_CODEC_ALC275 HDA_CODEC_CONSTRUCT(REALTEK, 0x0275) #define HDA_CODEC_ALC276 HDA_CODEC_CONSTRUCT(REALTEK, 0x0276) +#define HDA_CODEC_ALC292 HDA_CODEC_CONSTRUCT(REALTEK, 0x0292) #define HDA_CODEC_ALC660 HDA_CODEC_CONSTRUCT(REALTEK, 0x0660) #define HDA_CODEC_ALC662 HDA_CODEC_CONSTRUCT(REALTEK, 0x0662) #define HDA_CODEC_ALC663 HDA_CODEC_CONSTRUCT(REALTEK, 0x0663) #define HDA_CODEC_ALC665 HDA_CODEC_CONSTRUCT(REALTEK, 0x0665) #define HDA_CODEC_ALC670 HDA_CODEC_CONSTRUCT(REALTEK, 0x0670) #define HDA_CODEC_ALC680 HDA_CODEC_CONSTRUCT(REALTEK, 0x0680) #define HDA_CODEC_ALC861 HDA_CODEC_CONSTRUCT(REALTEK, 0x0861) #define HDA_CODEC_ALC861VD HDA_CODEC_CONSTRUCT(REALTEK, 0x0862) #define HDA_CODEC_ALC880 HDA_CODEC_CONSTRUCT(REALTEK, 0x0880) #define HDA_CODEC_ALC882 HDA_CODEC_CONSTRUCT(REALTEK, 0x0882) #define HDA_CODEC_ALC883 HDA_CODEC_CONSTRUCT(REALTEK, 0x0883) #define HDA_CODEC_ALC885 HDA_CODEC_CONSTRUCT(REALTEK, 0x0885) #define HDA_CODEC_ALC887 HDA_CODEC_CONSTRUCT(REALTEK, 0x0887) #define HDA_CODEC_ALC888 HDA_CODEC_CONSTRUCT(REALTEK, 0x0888) #define HDA_CODEC_ALC889 HDA_CODEC_CONSTRUCT(REALTEK, 0x0889) #define HDA_CODEC_ALC892 HDA_CODEC_CONSTRUCT(REALTEK, 0x0892) #define HDA_CODEC_ALC899 HDA_CODEC_CONSTRUCT(REALTEK, 0x0899) #define HDA_CODEC_ALCXXXX HDA_CODEC_CONSTRUCT(REALTEK, 0xffff) /* Motorola */ #define MOTO_VENDORID 0x1057 #define HDA_CODEC_MOTOXXXX HDA_CODEC_CONSTRUCT(MOTO, 0xffff) /* Creative */ #define CREATIVE_VENDORID 0x1102 #define HDA_CODEC_CA0110 HDA_CODEC_CONSTRUCT(CREATIVE, 0x000a) #define HDA_CODEC_CA0110_2 HDA_CODEC_CONSTRUCT(CREATIVE, 0x000b) #define HDA_CODEC_SB0880 HDA_CODEC_CONSTRUCT(CREATIVE, 0x000d) #define HDA_CODEC_CA0132 HDA_CODEC_CONSTRUCT(CREATIVE, 0x0011) #define HDA_CODEC_CAXXXX HDA_CODEC_CONSTRUCT(CREATIVE, 0xffff) /* Analog Devices */ #define ANALOGDEVICES_VENDORID 0x11d4 #define HDA_CODEC_AD1884A HDA_CODEC_CONSTRUCT(ANALOGDEVICES, 0x184a) #define HDA_CODEC_AD1882 HDA_CODEC_CONSTRUCT(ANALOGDEVICES, 0x1882) #define HDA_CODEC_AD1883 HDA_CODEC_CONSTRUCT(ANALOGDEVICES, 0x1883) #define HDA_CODEC_AD1884 HDA_CODEC_CONSTRUCT(ANALOGDEVICES, 0x1884) #define HDA_CODEC_AD1984A HDA_CODEC_CONSTRUCT(ANALOGDEVICES, 0x194a) #define HDA_CODEC_AD1984B HDA_CODEC_CONSTRUCT(ANALOGDEVICES, 0x194b) #define HDA_CODEC_AD1981HD HDA_CODEC_CONSTRUCT(ANALOGDEVICES, 0x1981) #define HDA_CODEC_AD1983 HDA_CODEC_CONSTRUCT(ANALOGDEVICES, 0x1983) #define HDA_CODEC_AD1984 HDA_CODEC_CONSTRUCT(ANALOGDEVICES, 0x1984) #define HDA_CODEC_AD1986A HDA_CODEC_CONSTRUCT(ANALOGDEVICES, 0x1986) #define HDA_CODEC_AD1987 HDA_CODEC_CONSTRUCT(ANALOGDEVICES, 0x1987) #define HDA_CODEC_AD1988 HDA_CODEC_CONSTRUCT(ANALOGDEVICES, 0x1988) #define HDA_CODEC_AD1988B HDA_CODEC_CONSTRUCT(ANALOGDEVICES, 0x198b) #define HDA_CODEC_AD1882A HDA_CODEC_CONSTRUCT(ANALOGDEVICES, 0x882a) #define HDA_CODEC_AD1989A HDA_CODEC_CONSTRUCT(ANALOGDEVICES, 0x989a) #define HDA_CODEC_AD1989B HDA_CODEC_CONSTRUCT(ANALOGDEVICES, 0x989b) #define HDA_CODEC_ADXXXX HDA_CODEC_CONSTRUCT(ANALOGDEVICES, 0xffff) /* CMedia */ #define CMEDIA_VENDORID 0x13f6 #define HDA_CODEC_CMI9880 HDA_CODEC_CONSTRUCT(CMEDIA, 0x9880) #define HDA_CODEC_CMIXXXX HDA_CODEC_CONSTRUCT(CMEDIA, 0xffff) #define CMEDIA2_VENDORID 0x434d #define HDA_CODEC_CMI98802 HDA_CODEC_CONSTRUCT(CMEDIA2, 0x4980) #define HDA_CODEC_CMIXXXX2 HDA_CODEC_CONSTRUCT(CMEDIA2, 0xffff) /* Sigmatel */ #define SIGMATEL_VENDORID 0x8384 #define HDA_CODEC_STAC9230X HDA_CODEC_CONSTRUCT(SIGMATEL, 0x7612) #define HDA_CODEC_STAC9230D HDA_CODEC_CONSTRUCT(SIGMATEL, 0x7613) #define HDA_CODEC_STAC9229X HDA_CODEC_CONSTRUCT(SIGMATEL, 0x7614) #define HDA_CODEC_STAC9229D HDA_CODEC_CONSTRUCT(SIGMATEL, 0x7615) #define HDA_CODEC_STAC9228X HDA_CODEC_CONSTRUCT(SIGMATEL, 0x7616) #define HDA_CODEC_STAC9228D HDA_CODEC_CONSTRUCT(SIGMATEL, 0x7617) #define HDA_CODEC_STAC9227X HDA_CODEC_CONSTRUCT(SIGMATEL, 0x7618) #define HDA_CODEC_STAC9227D HDA_CODEC_CONSTRUCT(SIGMATEL, 0x7619) #define HDA_CODEC_STAC9274 HDA_CODEC_CONSTRUCT(SIGMATEL, 0x7620) #define HDA_CODEC_STAC9274D HDA_CODEC_CONSTRUCT(SIGMATEL, 0x7621) #define HDA_CODEC_STAC9273X HDA_CODEC_CONSTRUCT(SIGMATEL, 0x7622) #define HDA_CODEC_STAC9273D HDA_CODEC_CONSTRUCT(SIGMATEL, 0x7623) #define HDA_CODEC_STAC9272X HDA_CODEC_CONSTRUCT(SIGMATEL, 0x7624) #define HDA_CODEC_STAC9272D HDA_CODEC_CONSTRUCT(SIGMATEL, 0x7625) #define HDA_CODEC_STAC9271X HDA_CODEC_CONSTRUCT(SIGMATEL, 0x7626) #define HDA_CODEC_STAC9271D HDA_CODEC_CONSTRUCT(SIGMATEL, 0x7627) #define HDA_CODEC_STAC9274X5NH HDA_CODEC_CONSTRUCT(SIGMATEL, 0x7628) #define HDA_CODEC_STAC9274D5NH HDA_CODEC_CONSTRUCT(SIGMATEL, 0x7629) #define HDA_CODEC_STAC9250 HDA_CODEC_CONSTRUCT(SIGMATEL, 0x7634) #define HDA_CODEC_STAC9251 HDA_CODEC_CONSTRUCT(SIGMATEL, 0x7636) #define HDA_CODEC_IDT92HD700X HDA_CODEC_CONSTRUCT(SIGMATEL, 0x7638) #define HDA_CODEC_IDT92HD700D HDA_CODEC_CONSTRUCT(SIGMATEL, 0x7639) #define HDA_CODEC_IDT92HD206X HDA_CODEC_CONSTRUCT(SIGMATEL, 0x7645) #define HDA_CODEC_IDT92HD206D HDA_CODEC_CONSTRUCT(SIGMATEL, 0x7646) #define HDA_CODEC_CXD9872RDK HDA_CODEC_CONSTRUCT(SIGMATEL, 0x7661) #define HDA_CODEC_STAC9872AK HDA_CODEC_CONSTRUCT(SIGMATEL, 0x7662) #define HDA_CODEC_CXD9872AKD HDA_CODEC_CONSTRUCT(SIGMATEL, 0x7664) #define HDA_CODEC_STAC9221 HDA_CODEC_CONSTRUCT(SIGMATEL, 0x7680) #define HDA_CODEC_STAC922XD HDA_CODEC_CONSTRUCT(SIGMATEL, 0x7681) #define HDA_CODEC_STAC9221_A2 HDA_CODEC_CONSTRUCT(SIGMATEL, 0x7682) #define HDA_CODEC_STAC9221D HDA_CODEC_CONSTRUCT(SIGMATEL, 0x7683) #define HDA_CODEC_STAC9220 HDA_CODEC_CONSTRUCT(SIGMATEL, 0x7690) #define HDA_CODEC_STAC9200D HDA_CODEC_CONSTRUCT(SIGMATEL, 0x7691) #define HDA_CODEC_IDT92HD005 HDA_CODEC_CONSTRUCT(SIGMATEL, 0x7698) #define HDA_CODEC_IDT92HD005D HDA_CODEC_CONSTRUCT(SIGMATEL, 0x7699) #define HDA_CODEC_STAC9205X HDA_CODEC_CONSTRUCT(SIGMATEL, 0x76a0) #define HDA_CODEC_STAC9205D HDA_CODEC_CONSTRUCT(SIGMATEL, 0x76a1) #define HDA_CODEC_STAC9204X HDA_CODEC_CONSTRUCT(SIGMATEL, 0x76a2) #define HDA_CODEC_STAC9204D HDA_CODEC_CONSTRUCT(SIGMATEL, 0x76a3) #define HDA_CODEC_STAC9255 HDA_CODEC_CONSTRUCT(SIGMATEL, 0x76a4) #define HDA_CODEC_STAC9255D HDA_CODEC_CONSTRUCT(SIGMATEL, 0x76a5) #define HDA_CODEC_STAC9254 HDA_CODEC_CONSTRUCT(SIGMATEL, 0x76a6) #define HDA_CODEC_STAC9254D HDA_CODEC_CONSTRUCT(SIGMATEL, 0x76a7) #define HDA_CODEC_STAC9220_A2 HDA_CODEC_CONSTRUCT(SIGMATEL, 0x7880) #define HDA_CODEC_STAC9220_A1 HDA_CODEC_CONSTRUCT(SIGMATEL, 0x7882) #define HDA_CODEC_STACXXXX HDA_CODEC_CONSTRUCT(SIGMATEL, 0xffff) /* IDT */ #define IDT_VENDORID 0x111d #define HDA_CODEC_IDT92HD75BX HDA_CODEC_CONSTRUCT(IDT, 0x7603) #define HDA_CODEC_IDT92HD83C1X HDA_CODEC_CONSTRUCT(IDT, 0x7604) #define HDA_CODEC_IDT92HD81B1X HDA_CODEC_CONSTRUCT(IDT, 0x7605) #define HDA_CODEC_IDT92HD75B3 HDA_CODEC_CONSTRUCT(IDT, 0x7608) #define HDA_CODEC_IDT92HD73D1 HDA_CODEC_CONSTRUCT(IDT, 0x7674) #define HDA_CODEC_IDT92HD73C1 HDA_CODEC_CONSTRUCT(IDT, 0x7675) #define HDA_CODEC_IDT92HD73E1 HDA_CODEC_CONSTRUCT(IDT, 0x7676) #define HDA_CODEC_IDT92HD71B8 HDA_CODEC_CONSTRUCT(IDT, 0x76b0) #define HDA_CODEC_IDT92HD71B8_2 HDA_CODEC_CONSTRUCT(IDT, 0x76b1) #define HDA_CODEC_IDT92HD71B7 HDA_CODEC_CONSTRUCT(IDT, 0x76b2) #define HDA_CODEC_IDT92HD71B7_2 HDA_CODEC_CONSTRUCT(IDT, 0x76b3) #define HDA_CODEC_IDT92HD71B6 HDA_CODEC_CONSTRUCT(IDT, 0x76b4) #define HDA_CODEC_IDT92HD71B6_2 HDA_CODEC_CONSTRUCT(IDT, 0x76b5) #define HDA_CODEC_IDT92HD71B5 HDA_CODEC_CONSTRUCT(IDT, 0x76b6) #define HDA_CODEC_IDT92HD71B5_2 HDA_CODEC_CONSTRUCT(IDT, 0x76b7) #define HDA_CODEC_IDT92HD89C3 HDA_CODEC_CONSTRUCT(IDT, 0x76c0) #define HDA_CODEC_IDT92HD89C2 HDA_CODEC_CONSTRUCT(IDT, 0x76c1) #define HDA_CODEC_IDT92HD89C1 HDA_CODEC_CONSTRUCT(IDT, 0x76c2) #define HDA_CODEC_IDT92HD89B3 HDA_CODEC_CONSTRUCT(IDT, 0x76c3) #define HDA_CODEC_IDT92HD89B2 HDA_CODEC_CONSTRUCT(IDT, 0x76c4) #define HDA_CODEC_IDT92HD89B1 HDA_CODEC_CONSTRUCT(IDT, 0x76c5) #define HDA_CODEC_IDT92HD89E3 HDA_CODEC_CONSTRUCT(IDT, 0x76c6) #define HDA_CODEC_IDT92HD89E2 HDA_CODEC_CONSTRUCT(IDT, 0x76c7) #define HDA_CODEC_IDT92HD89E1 HDA_CODEC_CONSTRUCT(IDT, 0x76c8) #define HDA_CODEC_IDT92HD89D3 HDA_CODEC_CONSTRUCT(IDT, 0x76c9) #define HDA_CODEC_IDT92HD89D2 HDA_CODEC_CONSTRUCT(IDT, 0x76ca) #define HDA_CODEC_IDT92HD89D1 HDA_CODEC_CONSTRUCT(IDT, 0x76cb) #define HDA_CODEC_IDT92HD89F3 HDA_CODEC_CONSTRUCT(IDT, 0x76cc) #define HDA_CODEC_IDT92HD89F2 HDA_CODEC_CONSTRUCT(IDT, 0x76cd) #define HDA_CODEC_IDT92HD89F1 HDA_CODEC_CONSTRUCT(IDT, 0x76ce) #define HDA_CODEC_IDT92HD87B1_3 HDA_CODEC_CONSTRUCT(IDT, 0x76d1) #define HDA_CODEC_IDT92HD83C1C HDA_CODEC_CONSTRUCT(IDT, 0x76d4) #define HDA_CODEC_IDT92HD81B1C HDA_CODEC_CONSTRUCT(IDT, 0x76d5) #define HDA_CODEC_IDT92HD87B2_4 HDA_CODEC_CONSTRUCT(IDT, 0x76d9) #define HDA_CODEC_IDT92HD93BXX HDA_CODEC_CONSTRUCT(IDT, 0x76df) #define HDA_CODEC_IDT92HD91BXX HDA_CODEC_CONSTRUCT(IDT, 0x76e0) #define HDA_CODEC_IDT92HD98BXX HDA_CODEC_CONSTRUCT(IDT, 0x76e3) #define HDA_CODEC_IDT92HD99BXX HDA_CODEC_CONSTRUCT(IDT, 0x76e5) #define HDA_CODEC_IDT92HD90BXX HDA_CODEC_CONSTRUCT(IDT, 0x76e7) #define HDA_CODEC_IDT92HD66B1X5 HDA_CODEC_CONSTRUCT(IDT, 0x76e8) #define HDA_CODEC_IDT92HD66B2X5 HDA_CODEC_CONSTRUCT(IDT, 0x76e9) #define HDA_CODEC_IDT92HD66B3X5 HDA_CODEC_CONSTRUCT(IDT, 0x76ea) #define HDA_CODEC_IDT92HD66C1X5 HDA_CODEC_CONSTRUCT(IDT, 0x76eb) #define HDA_CODEC_IDT92HD66C2X5 HDA_CODEC_CONSTRUCT(IDT, 0x76ec) #define HDA_CODEC_IDT92HD66C3X5 HDA_CODEC_CONSTRUCT(IDT, 0x76ed) #define HDA_CODEC_IDT92HD66B1X3 HDA_CODEC_CONSTRUCT(IDT, 0x76ee) #define HDA_CODEC_IDT92HD66B2X3 HDA_CODEC_CONSTRUCT(IDT, 0x76ef) #define HDA_CODEC_IDT92HD66B3X3 HDA_CODEC_CONSTRUCT(IDT, 0x76f0) #define HDA_CODEC_IDT92HD66C1X3 HDA_CODEC_CONSTRUCT(IDT, 0x76f1) #define HDA_CODEC_IDT92HD66C2X3 HDA_CODEC_CONSTRUCT(IDT, 0x76f2) #define HDA_CODEC_IDT92HD66C3_65 HDA_CODEC_CONSTRUCT(IDT, 0x76f3) #define HDA_CODEC_IDTXXXX HDA_CODEC_CONSTRUCT(IDT, 0xffff) /* Silicon Image */ #define SII_VENDORID 0x1095 #define HDA_CODEC_SII1390 HDA_CODEC_CONSTRUCT(SII, 0x1390) #define HDA_CODEC_SII1392 HDA_CODEC_CONSTRUCT(SII, 0x1392) #define HDA_CODEC_SIIXXXX HDA_CODEC_CONSTRUCT(SII, 0xffff) /* Lucent/Agere */ #define AGERE_VENDORID 0x11c1 #define HDA_CODEC_AGEREXXXX HDA_CODEC_CONSTRUCT(AGERE, 0xffff) /* Conexant */ #define CONEXANT_VENDORID 0x14f1 #define HDA_CODEC_CX20549 HDA_CODEC_CONSTRUCT(CONEXANT, 0x5045) #define HDA_CODEC_CX20551 HDA_CODEC_CONSTRUCT(CONEXANT, 0x5047) #define HDA_CODEC_CX20561 HDA_CODEC_CONSTRUCT(CONEXANT, 0x5051) #define HDA_CODEC_CX20582 HDA_CODEC_CONSTRUCT(CONEXANT, 0x5066) #define HDA_CODEC_CX20583 HDA_CODEC_CONSTRUCT(CONEXANT, 0x5067) #define HDA_CODEC_CX20584 HDA_CODEC_CONSTRUCT(CONEXANT, 0x5068) #define HDA_CODEC_CX20585 HDA_CODEC_CONSTRUCT(CONEXANT, 0x5069) #define HDA_CODEC_CX20588 HDA_CODEC_CONSTRUCT(CONEXANT, 0x506c) #define HDA_CODEC_CX20590 HDA_CODEC_CONSTRUCT(CONEXANT, 0x506e) #define HDA_CODEC_CX20631 HDA_CODEC_CONSTRUCT(CONEXANT, 0x5097) #define HDA_CODEC_CX20632 HDA_CODEC_CONSTRUCT(CONEXANT, 0x5098) #define HDA_CODEC_CX20641 HDA_CODEC_CONSTRUCT(CONEXANT, 0x50a1) #define HDA_CODEC_CX20642 HDA_CODEC_CONSTRUCT(CONEXANT, 0x50a2) #define HDA_CODEC_CX20651 HDA_CODEC_CONSTRUCT(CONEXANT, 0x50ab) #define HDA_CODEC_CX20652 HDA_CODEC_CONSTRUCT(CONEXANT, 0x50ac) #define HDA_CODEC_CX20664 HDA_CODEC_CONSTRUCT(CONEXANT, 0x50b8) #define HDA_CODEC_CX20665 HDA_CODEC_CONSTRUCT(CONEXANT, 0x50b9) #define HDA_CODEC_CXXXXX HDA_CODEC_CONSTRUCT(CONEXANT, 0xffff) /* VIA */ #define HDA_CODEC_VT1708_8 HDA_CODEC_CONSTRUCT(VIA, 0x1708) #define HDA_CODEC_VT1708_9 HDA_CODEC_CONSTRUCT(VIA, 0x1709) #define HDA_CODEC_VT1708_A HDA_CODEC_CONSTRUCT(VIA, 0x170a) #define HDA_CODEC_VT1708_B HDA_CODEC_CONSTRUCT(VIA, 0x170b) #define HDA_CODEC_VT1709_0 HDA_CODEC_CONSTRUCT(VIA, 0xe710) #define HDA_CODEC_VT1709_1 HDA_CODEC_CONSTRUCT(VIA, 0xe711) #define HDA_CODEC_VT1709_2 HDA_CODEC_CONSTRUCT(VIA, 0xe712) #define HDA_CODEC_VT1709_3 HDA_CODEC_CONSTRUCT(VIA, 0xe713) #define HDA_CODEC_VT1709_4 HDA_CODEC_CONSTRUCT(VIA, 0xe714) #define HDA_CODEC_VT1709_5 HDA_CODEC_CONSTRUCT(VIA, 0xe715) #define HDA_CODEC_VT1709_6 HDA_CODEC_CONSTRUCT(VIA, 0xe716) #define HDA_CODEC_VT1709_7 HDA_CODEC_CONSTRUCT(VIA, 0xe717) #define HDA_CODEC_VT1708B_0 HDA_CODEC_CONSTRUCT(VIA, 0xe720) #define HDA_CODEC_VT1708B_1 HDA_CODEC_CONSTRUCT(VIA, 0xe721) #define HDA_CODEC_VT1708B_2 HDA_CODEC_CONSTRUCT(VIA, 0xe722) #define HDA_CODEC_VT1708B_3 HDA_CODEC_CONSTRUCT(VIA, 0xe723) #define HDA_CODEC_VT1708B_4 HDA_CODEC_CONSTRUCT(VIA, 0xe724) #define HDA_CODEC_VT1708B_5 HDA_CODEC_CONSTRUCT(VIA, 0xe725) #define HDA_CODEC_VT1708B_6 HDA_CODEC_CONSTRUCT(VIA, 0xe726) #define HDA_CODEC_VT1708B_7 HDA_CODEC_CONSTRUCT(VIA, 0xe727) #define HDA_CODEC_VT1708S_0 HDA_CODEC_CONSTRUCT(VIA, 0x0397) #define HDA_CODEC_VT1708S_1 HDA_CODEC_CONSTRUCT(VIA, 0x1397) #define HDA_CODEC_VT1708S_2 HDA_CODEC_CONSTRUCT(VIA, 0x2397) #define HDA_CODEC_VT1708S_3 HDA_CODEC_CONSTRUCT(VIA, 0x3397) #define HDA_CODEC_VT1708S_4 HDA_CODEC_CONSTRUCT(VIA, 0x4397) #define HDA_CODEC_VT1708S_5 HDA_CODEC_CONSTRUCT(VIA, 0x5397) #define HDA_CODEC_VT1708S_6 HDA_CODEC_CONSTRUCT(VIA, 0x6397) #define HDA_CODEC_VT1708S_7 HDA_CODEC_CONSTRUCT(VIA, 0x7397) #define HDA_CODEC_VT1702_0 HDA_CODEC_CONSTRUCT(VIA, 0x0398) #define HDA_CODEC_VT1702_1 HDA_CODEC_CONSTRUCT(VIA, 0x1398) #define HDA_CODEC_VT1702_2 HDA_CODEC_CONSTRUCT(VIA, 0x2398) #define HDA_CODEC_VT1702_3 HDA_CODEC_CONSTRUCT(VIA, 0x3398) #define HDA_CODEC_VT1702_4 HDA_CODEC_CONSTRUCT(VIA, 0x4398) #define HDA_CODEC_VT1702_5 HDA_CODEC_CONSTRUCT(VIA, 0x5398) #define HDA_CODEC_VT1702_6 HDA_CODEC_CONSTRUCT(VIA, 0x6398) #define HDA_CODEC_VT1702_7 HDA_CODEC_CONSTRUCT(VIA, 0x7398) #define HDA_CODEC_VT1716S_0 HDA_CODEC_CONSTRUCT(VIA, 0x0433) #define HDA_CODEC_VT1716S_1 HDA_CODEC_CONSTRUCT(VIA, 0xa721) #define HDA_CODEC_VT1718S_0 HDA_CODEC_CONSTRUCT(VIA, 0x0428) #define HDA_CODEC_VT1718S_1 HDA_CODEC_CONSTRUCT(VIA, 0x4428) #define HDA_CODEC_VT1802_0 HDA_CODEC_CONSTRUCT(VIA, 0x0446) #define HDA_CODEC_VT1802_1 HDA_CODEC_CONSTRUCT(VIA, 0x8446) #define HDA_CODEC_VT1812 HDA_CODEC_CONSTRUCT(VIA, 0x0448) #define HDA_CODEC_VT1818S HDA_CODEC_CONSTRUCT(VIA, 0x0440) #define HDA_CODEC_VT1828S HDA_CODEC_CONSTRUCT(VIA, 0x4441) #define HDA_CODEC_VT2002P_0 HDA_CODEC_CONSTRUCT(VIA, 0x0438) #define HDA_CODEC_VT2002P_1 HDA_CODEC_CONSTRUCT(VIA, 0x4438) #define HDA_CODEC_VT2020 HDA_CODEC_CONSTRUCT(VIA, 0x0441) #define HDA_CODEC_VTXXXX HDA_CODEC_CONSTRUCT(VIA, 0xffff) /* ATI */ #define HDA_CODEC_ATIRS600_1 HDA_CODEC_CONSTRUCT(ATI, 0x793c) #define HDA_CODEC_ATIRS600_2 HDA_CODEC_CONSTRUCT(ATI, 0x7919) #define HDA_CODEC_ATIRS690 HDA_CODEC_CONSTRUCT(ATI, 0x791a) #define HDA_CODEC_ATIR6XX HDA_CODEC_CONSTRUCT(ATI, 0xaa01) #define HDA_CODEC_ATIXXXX HDA_CODEC_CONSTRUCT(ATI, 0xffff) /* NVIDIA */ #define HDA_CODEC_NVIDIAMCP78 HDA_CODEC_CONSTRUCT(NVIDIA, 0x0002) #define HDA_CODEC_NVIDIAMCP78_2 HDA_CODEC_CONSTRUCT(NVIDIA, 0x0003) #define HDA_CODEC_NVIDIAMCP78_3 HDA_CODEC_CONSTRUCT(NVIDIA, 0x0005) #define HDA_CODEC_NVIDIAMCP78_4 HDA_CODEC_CONSTRUCT(NVIDIA, 0x0006) #define HDA_CODEC_NVIDIAMCP7A HDA_CODEC_CONSTRUCT(NVIDIA, 0x0007) #define HDA_CODEC_NVIDIAGT220 HDA_CODEC_CONSTRUCT(NVIDIA, 0x000a) #define HDA_CODEC_NVIDIAGT21X HDA_CODEC_CONSTRUCT(NVIDIA, 0x000b) #define HDA_CODEC_NVIDIAMCP89 HDA_CODEC_CONSTRUCT(NVIDIA, 0x000c) #define HDA_CODEC_NVIDIAGT240 HDA_CODEC_CONSTRUCT(NVIDIA, 0x000d) #define HDA_CODEC_NVIDIAGTS450 HDA_CODEC_CONSTRUCT(NVIDIA, 0x0011) #define HDA_CODEC_NVIDIAGT440 HDA_CODEC_CONSTRUCT(NVIDIA, 0x0014) #define HDA_CODEC_NVIDIAGTX550 HDA_CODEC_CONSTRUCT(NVIDIA, 0x0015) #define HDA_CODEC_NVIDIAGTX570 HDA_CODEC_CONSTRUCT(NVIDIA, 0x0018) #define HDA_CODEC_NVIDIAMCP67 HDA_CODEC_CONSTRUCT(NVIDIA, 0x0067) #define HDA_CODEC_NVIDIAMCP73 HDA_CODEC_CONSTRUCT(NVIDIA, 0x8001) #define HDA_CODEC_NVIDIAXXXX HDA_CODEC_CONSTRUCT(NVIDIA, 0xffff) /* Chrontel */ #define CHRONTEL_VENDORID 0x17e8 #define HDA_CODEC_CHXXXX HDA_CODEC_CONSTRUCT(CHRONTEL, 0xffff) /* INTEL */ #define HDA_CODEC_INTELIP HDA_CODEC_CONSTRUCT(INTEL, 0x0054) #define HDA_CODEC_INTELBL HDA_CODEC_CONSTRUCT(INTEL, 0x2801) #define HDA_CODEC_INTELCA HDA_CODEC_CONSTRUCT(INTEL, 0x2802) #define HDA_CODEC_INTELEL HDA_CODEC_CONSTRUCT(INTEL, 0x2803) #define HDA_CODEC_INTELIP2 HDA_CODEC_CONSTRUCT(INTEL, 0x2804) #define HDA_CODEC_INTELCPT HDA_CODEC_CONSTRUCT(INTEL, 0x2805) #define HDA_CODEC_INTELPPT HDA_CODEC_CONSTRUCT(INTEL, 0x2806) #define HDA_CODEC_INTELHSW HDA_CODEC_CONSTRUCT(INTEL, 0x2807) +#define HDA_CODEC_INTELBDW HDA_CODEC_CONSTRUCT(INTEL, 0x2808) #define HDA_CODEC_INTELCL HDA_CODEC_CONSTRUCT(INTEL, 0x29fb) #define HDA_CODEC_INTELXXXX HDA_CODEC_CONSTRUCT(INTEL, 0xffff) /**************************************************************************** * Helper Macros ****************************************************************************/ #define HDA_DMA_ALIGNMENT 128 #define HDA_BDL_MIN 2 #define HDA_BDL_MAX 256 #define HDA_BDL_DEFAULT HDA_BDL_MIN #define HDA_BLK_MIN HDA_DMA_ALIGNMENT #define HDA_BLK_ALIGN (~(HDA_BLK_MIN - 1)) #define HDA_BUFSZ_MIN (HDA_BDL_MIN * HDA_BLK_MIN) #define HDA_BUFSZ_MAX 262144 #define HDA_BUFSZ_DEFAULT 65536 #define HDA_GPIO_MAX 8 #define HDA_DEV_MATCH(fl, v) ((fl) == (v) || \ (fl) == 0xffffffff || \ (((fl) & 0xffff0000) == 0xffff0000 && \ ((fl) & 0x0000ffff) == ((v) & 0x0000ffff)) || \ (((fl) & 0x0000ffff) == 0x0000ffff && \ ((fl) & 0xffff0000) == ((v) & 0xffff0000))) #define HDA_MATCH_ALL 0xffffffff #define HDA_INVALID 0xffffffff #define HDA_BOOTVERBOSE(stmt) do { \ if (bootverbose != 0 || snd_verbose > 3) { \ stmt \ } \ } while (0) #define HDA_BOOTHVERBOSE(stmt) do { \ if (snd_verbose > 3) { \ stmt \ } \ } while (0) #define hda_command(dev, verb) \ HDAC_CODEC_COMMAND(device_get_parent(dev), (dev), (verb)) typedef int nid_t; /**************************************************************************** * Simplified Accessors for HDA devices ****************************************************************************/ enum hdac_device_ivars { HDA_IVAR_CODEC_ID, HDA_IVAR_NODE_ID, HDA_IVAR_VENDOR_ID, HDA_IVAR_DEVICE_ID, HDA_IVAR_REVISION_ID, HDA_IVAR_STEPPING_ID, HDA_IVAR_SUBVENDOR_ID, HDA_IVAR_SUBDEVICE_ID, HDA_IVAR_SUBSYSTEM_ID, HDA_IVAR_NODE_TYPE, HDA_IVAR_DMA_NOCACHE, }; #define HDA_ACCESSOR(var, ivar, type) \ __BUS_ACCESSOR(hda, var, HDA, ivar, type) HDA_ACCESSOR(codec_id, CODEC_ID, uint8_t); HDA_ACCESSOR(node_id, NODE_ID, uint8_t); HDA_ACCESSOR(vendor_id, VENDOR_ID, uint16_t); HDA_ACCESSOR(device_id, DEVICE_ID, uint16_t); HDA_ACCESSOR(revision_id, REVISION_ID, uint8_t); HDA_ACCESSOR(stepping_id, STEPPING_ID, uint8_t); HDA_ACCESSOR(subvendor_id, SUBVENDOR_ID, uint16_t); HDA_ACCESSOR(subdevice_id, SUBDEVICE_ID, uint16_t); HDA_ACCESSOR(subsystem_id, SUBSYSTEM_ID, uint32_t); HDA_ACCESSOR(node_type, NODE_TYPE, uint8_t); HDA_ACCESSOR(dma_nocache, DMA_NOCACHE, uint8_t); #define PCIS_MULTIMEDIA_HDA 0x03 #endif Index: user/ngie/more-tests/sys/dev/sound/pci/hda/hdacc.c =================================================================== --- user/ngie/more-tests/sys/dev/sound/pci/hda/hdacc.c (revision 281584) +++ user/ngie/more-tests/sys/dev/sound/pci/hda/hdacc.c (revision 281585) @@ -1,726 +1,728 @@ /*- * Copyright (c) 2006 Stephane E. Potvin * Copyright (c) 2006 Ariff Abdullah * Copyright (c) 2008-2012 Alexander Motin * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ /* * Intel High Definition Audio (CODEC) driver for FreeBSD. */ #ifdef HAVE_KERNEL_OPTION_HEADERS #include "opt_snd.h" #endif #include #include #include #include SND_DECLARE_FILE("$FreeBSD$"); struct hdacc_fg { device_t dev; nid_t nid; uint8_t type; uint32_t subsystem_id; }; struct hdacc_softc { device_t dev; struct mtx *lock; nid_t cad; device_t streams[2][16]; device_t tags[64]; int fgcnt; struct hdacc_fg *fgs; }; #define hdacc_lock(codec) snd_mtxlock((codec)->lock) #define hdacc_unlock(codec) snd_mtxunlock((codec)->lock) #define hdacc_lockassert(codec) snd_mtxassert((codec)->lock) #define hdacc_lockowned(codec) mtx_owned((codec)->lock) MALLOC_DEFINE(M_HDACC, "hdacc", "HDA CODEC"); /* CODECs */ static const struct { uint32_t id; uint16_t revid; const char *name; } hdacc_codecs[] = { { HDA_CODEC_CS4206, 0, "Cirrus Logic CS4206" }, { HDA_CODEC_CS4207, 0, "Cirrus Logic CS4207" }, { HDA_CODEC_CS4210, 0, "Cirrus Logic CS4210" }, { HDA_CODEC_ALC221, 0, "Realtek ALC221" }, { HDA_CODEC_ALC260, 0, "Realtek ALC260" }, { HDA_CODEC_ALC262, 0, "Realtek ALC262" }, { HDA_CODEC_ALC267, 0, "Realtek ALC267" }, { HDA_CODEC_ALC268, 0, "Realtek ALC268" }, { HDA_CODEC_ALC269, 0, "Realtek ALC269" }, { HDA_CODEC_ALC270, 0, "Realtek ALC270" }, { HDA_CODEC_ALC272, 0, "Realtek ALC272" }, { HDA_CODEC_ALC273, 0, "Realtek ALC273" }, { HDA_CODEC_ALC275, 0, "Realtek ALC275" }, { HDA_CODEC_ALC276, 0, "Realtek ALC276" }, + { HDA_CODEC_ALC292, 0, "Realtek ALC292" }, { HDA_CODEC_ALC660, 0, "Realtek ALC660-VD" }, { HDA_CODEC_ALC662, 0x0002, "Realtek ALC662 rev2" }, { HDA_CODEC_ALC662, 0, "Realtek ALC662" }, { HDA_CODEC_ALC663, 0, "Realtek ALC663" }, { HDA_CODEC_ALC665, 0, "Realtek ALC665" }, { HDA_CODEC_ALC670, 0, "Realtek ALC670" }, { HDA_CODEC_ALC680, 0, "Realtek ALC680" }, { HDA_CODEC_ALC861, 0x0340, "Realtek ALC660" }, { HDA_CODEC_ALC861, 0, "Realtek ALC861" }, { HDA_CODEC_ALC861VD, 0, "Realtek ALC861-VD" }, { HDA_CODEC_ALC880, 0, "Realtek ALC880" }, { HDA_CODEC_ALC882, 0, "Realtek ALC882" }, { HDA_CODEC_ALC883, 0, "Realtek ALC883" }, { HDA_CODEC_ALC885, 0x0101, "Realtek ALC889A" }, { HDA_CODEC_ALC885, 0x0103, "Realtek ALC889A" }, { HDA_CODEC_ALC885, 0, "Realtek ALC885" }, { HDA_CODEC_ALC887, 0, "Realtek ALC887" }, { HDA_CODEC_ALC888, 0x0101, "Realtek ALC1200" }, { HDA_CODEC_ALC888, 0, "Realtek ALC888" }, { HDA_CODEC_ALC889, 0, "Realtek ALC889" }, { HDA_CODEC_ALC892, 0, "Realtek ALC892" }, { HDA_CODEC_ALC899, 0, "Realtek ALC899" }, { HDA_CODEC_AD1882, 0, "Analog Devices AD1882" }, { HDA_CODEC_AD1882A, 0, "Analog Devices AD1882A" }, { HDA_CODEC_AD1883, 0, "Analog Devices AD1883" }, { HDA_CODEC_AD1884, 0, "Analog Devices AD1884" }, { HDA_CODEC_AD1884A, 0, "Analog Devices AD1884A" }, { HDA_CODEC_AD1981HD, 0, "Analog Devices AD1981HD" }, { HDA_CODEC_AD1983, 0, "Analog Devices AD1983" }, { HDA_CODEC_AD1984, 0, "Analog Devices AD1984" }, { HDA_CODEC_AD1984A, 0, "Analog Devices AD1984A" }, { HDA_CODEC_AD1984B, 0, "Analog Devices AD1984B" }, { HDA_CODEC_AD1986A, 0, "Analog Devices AD1986A" }, { HDA_CODEC_AD1987, 0, "Analog Devices AD1987" }, { HDA_CODEC_AD1988, 0, "Analog Devices AD1988A" }, { HDA_CODEC_AD1988B, 0, "Analog Devices AD1988B" }, { HDA_CODEC_AD1989A, 0, "Analog Devices AD1989A" }, { HDA_CODEC_AD1989B, 0, "Analog Devices AD1989B" }, { HDA_CODEC_CA0110, 0, "Creative CA0110-IBG" }, { HDA_CODEC_CA0110_2, 0, "Creative CA0110-IBG" }, { HDA_CODEC_CA0132, 0, "Creative CA0132" }, { HDA_CODEC_SB0880, 0, "Creative SB0880 X-Fi" }, { HDA_CODEC_CMI9880, 0, "CMedia CMI9880" }, { HDA_CODEC_CMI98802, 0, "CMedia CMI9880" }, { HDA_CODEC_CXD9872RDK, 0, "Sigmatel CXD9872RD/K" }, { HDA_CODEC_CXD9872AKD, 0, "Sigmatel CXD9872AKD" }, { HDA_CODEC_STAC9200D, 0, "Sigmatel STAC9200D" }, { HDA_CODEC_STAC9204X, 0, "Sigmatel STAC9204X" }, { HDA_CODEC_STAC9204D, 0, "Sigmatel STAC9204D" }, { HDA_CODEC_STAC9205X, 0, "Sigmatel STAC9205X" }, { HDA_CODEC_STAC9205D, 0, "Sigmatel STAC9205D" }, { HDA_CODEC_STAC9220, 0, "Sigmatel STAC9220" }, { HDA_CODEC_STAC9220_A1, 0, "Sigmatel STAC9220_A1" }, { HDA_CODEC_STAC9220_A2, 0, "Sigmatel STAC9220_A2" }, { HDA_CODEC_STAC9221, 0, "Sigmatel STAC9221" }, { HDA_CODEC_STAC9221_A2, 0, "Sigmatel STAC9221_A2" }, { HDA_CODEC_STAC9221D, 0, "Sigmatel STAC9221D" }, { HDA_CODEC_STAC922XD, 0, "Sigmatel STAC9220D/9223D" }, { HDA_CODEC_STAC9227X, 0, "Sigmatel STAC9227X" }, { HDA_CODEC_STAC9227D, 0, "Sigmatel STAC9227D" }, { HDA_CODEC_STAC9228X, 0, "Sigmatel STAC9228X" }, { HDA_CODEC_STAC9228D, 0, "Sigmatel STAC9228D" }, { HDA_CODEC_STAC9229X, 0, "Sigmatel STAC9229X" }, { HDA_CODEC_STAC9229D, 0, "Sigmatel STAC9229D" }, { HDA_CODEC_STAC9230X, 0, "Sigmatel STAC9230X" }, { HDA_CODEC_STAC9230D, 0, "Sigmatel STAC9230D" }, { HDA_CODEC_STAC9250, 0, "Sigmatel STAC9250" }, { HDA_CODEC_STAC9251, 0, "Sigmatel STAC9251" }, { HDA_CODEC_STAC9255, 0, "Sigmatel STAC9255" }, { HDA_CODEC_STAC9255D, 0, "Sigmatel STAC9255D" }, { HDA_CODEC_STAC9254, 0, "Sigmatel STAC9254" }, { HDA_CODEC_STAC9254D, 0, "Sigmatel STAC9254D" }, { HDA_CODEC_STAC9271X, 0, "Sigmatel STAC9271X" }, { HDA_CODEC_STAC9271D, 0, "Sigmatel STAC9271D" }, { HDA_CODEC_STAC9272X, 0, "Sigmatel STAC9272X" }, { HDA_CODEC_STAC9272D, 0, "Sigmatel STAC9272D" }, { HDA_CODEC_STAC9273X, 0, "Sigmatel STAC9273X" }, { HDA_CODEC_STAC9273D, 0, "Sigmatel STAC9273D" }, { HDA_CODEC_STAC9274, 0, "Sigmatel STAC9274" }, { HDA_CODEC_STAC9274D, 0, "Sigmatel STAC9274D" }, { HDA_CODEC_STAC9274X5NH, 0, "Sigmatel STAC9274X5NH" }, { HDA_CODEC_STAC9274D5NH, 0, "Sigmatel STAC9274D5NH" }, { HDA_CODEC_STAC9872AK, 0, "Sigmatel STAC9872AK" }, { HDA_CODEC_IDT92HD005, 0, "IDT 92HD005" }, { HDA_CODEC_IDT92HD005D, 0, "IDT 92HD005D" }, { HDA_CODEC_IDT92HD206X, 0, "IDT 92HD206X" }, { HDA_CODEC_IDT92HD206D, 0, "IDT 92HD206D" }, { HDA_CODEC_IDT92HD66B1X5, 0, "IDT 92HD66B1X5" }, { HDA_CODEC_IDT92HD66B2X5, 0, "IDT 92HD66B2X5" }, { HDA_CODEC_IDT92HD66B3X5, 0, "IDT 92HD66B3X5" }, { HDA_CODEC_IDT92HD66C1X5, 0, "IDT 92HD66C1X5" }, { HDA_CODEC_IDT92HD66C2X5, 0, "IDT 92HD66C2X5" }, { HDA_CODEC_IDT92HD66C3X5, 0, "IDT 92HD66C3X5" }, { HDA_CODEC_IDT92HD66B1X3, 0, "IDT 92HD66B1X3" }, { HDA_CODEC_IDT92HD66B2X3, 0, "IDT 92HD66B2X3" }, { HDA_CODEC_IDT92HD66B3X3, 0, "IDT 92HD66B3X3" }, { HDA_CODEC_IDT92HD66C1X3, 0, "IDT 92HD66C1X3" }, { HDA_CODEC_IDT92HD66C2X3, 0, "IDT 92HD66C2X3" }, { HDA_CODEC_IDT92HD66C3_65, 0, "IDT 92HD66C3_65" }, { HDA_CODEC_IDT92HD700X, 0, "IDT 92HD700X" }, { HDA_CODEC_IDT92HD700D, 0, "IDT 92HD700D" }, { HDA_CODEC_IDT92HD71B5, 0, "IDT 92HD71B5" }, { HDA_CODEC_IDT92HD71B5_2, 0, "IDT 92HD71B5" }, { HDA_CODEC_IDT92HD71B6, 0, "IDT 92HD71B6" }, { HDA_CODEC_IDT92HD71B6_2, 0, "IDT 92HD71B6" }, { HDA_CODEC_IDT92HD71B7, 0, "IDT 92HD71B7" }, { HDA_CODEC_IDT92HD71B7_2, 0, "IDT 92HD71B7" }, { HDA_CODEC_IDT92HD71B8, 0, "IDT 92HD71B8" }, { HDA_CODEC_IDT92HD71B8_2, 0, "IDT 92HD71B8" }, { HDA_CODEC_IDT92HD73C1, 0, "IDT 92HD73C1" }, { HDA_CODEC_IDT92HD73D1, 0, "IDT 92HD73D1" }, { HDA_CODEC_IDT92HD73E1, 0, "IDT 92HD73E1" }, { HDA_CODEC_IDT92HD75B3, 0, "IDT 92HD75B3" }, { HDA_CODEC_IDT92HD75BX, 0, "IDT 92HD75BX" }, { HDA_CODEC_IDT92HD81B1C, 0, "IDT 92HD81B1C" }, { HDA_CODEC_IDT92HD81B1X, 0, "IDT 92HD81B1X" }, { HDA_CODEC_IDT92HD83C1C, 0, "IDT 92HD83C1C" }, { HDA_CODEC_IDT92HD83C1X, 0, "IDT 92HD83C1X" }, { HDA_CODEC_IDT92HD87B1_3, 0, "IDT 92HD87B1/3" }, { HDA_CODEC_IDT92HD87B2_4, 0, "IDT 92HD87B2/4" }, { HDA_CODEC_IDT92HD89C3, 0, "IDT 92HD89C3" }, { HDA_CODEC_IDT92HD89C2, 0, "IDT 92HD89C2" }, { HDA_CODEC_IDT92HD89C1, 0, "IDT 92HD89C1" }, { HDA_CODEC_IDT92HD89B3, 0, "IDT 92HD89B3" }, { HDA_CODEC_IDT92HD89B2, 0, "IDT 92HD89B2" }, { HDA_CODEC_IDT92HD89B1, 0, "IDT 92HD89B1" }, { HDA_CODEC_IDT92HD89E3, 0, "IDT 92HD89E3" }, { HDA_CODEC_IDT92HD89E2, 0, "IDT 92HD89E2" }, { HDA_CODEC_IDT92HD89E1, 0, "IDT 92HD89E1" }, { HDA_CODEC_IDT92HD89D3, 0, "IDT 92HD89D3" }, { HDA_CODEC_IDT92HD89D2, 0, "IDT 92HD89D2" }, { HDA_CODEC_IDT92HD89D1, 0, "IDT 92HD89D1" }, { HDA_CODEC_IDT92HD89F3, 0, "IDT 92HD89F3" }, { HDA_CODEC_IDT92HD89F2, 0, "IDT 92HD89F2" }, { HDA_CODEC_IDT92HD89F1, 0, "IDT 92HD89F1" }, { HDA_CODEC_IDT92HD90BXX, 0, "IDT 92HD90BXX" }, { HDA_CODEC_IDT92HD91BXX, 0, "IDT 92HD91BXX" }, { HDA_CODEC_IDT92HD93BXX, 0, "IDT 92HD93BXX" }, { HDA_CODEC_IDT92HD98BXX, 0, "IDT 92HD98BXX" }, { HDA_CODEC_IDT92HD99BXX, 0, "IDT 92HD99BXX" }, { HDA_CODEC_CX20549, 0, "Conexant CX20549 (Venice)" }, { HDA_CODEC_CX20551, 0, "Conexant CX20551 (Waikiki)" }, { HDA_CODEC_CX20561, 0, "Conexant CX20561 (Hermosa)" }, { HDA_CODEC_CX20582, 0, "Conexant CX20582 (Pebble)" }, { HDA_CODEC_CX20583, 0, "Conexant CX20583 (Pebble HSF)" }, { HDA_CODEC_CX20584, 0, "Conexant CX20584" }, { HDA_CODEC_CX20585, 0, "Conexant CX20585" }, { HDA_CODEC_CX20588, 0, "Conexant CX20588" }, { HDA_CODEC_CX20590, 0, "Conexant CX20590" }, { HDA_CODEC_CX20631, 0, "Conexant CX20631" }, { HDA_CODEC_CX20632, 0, "Conexant CX20632" }, { HDA_CODEC_CX20641, 0, "Conexant CX20641" }, { HDA_CODEC_CX20642, 0, "Conexant CX20642" }, { HDA_CODEC_CX20651, 0, "Conexant CX20651" }, { HDA_CODEC_CX20652, 0, "Conexant CX20652" }, { HDA_CODEC_CX20664, 0, "Conexant CX20664" }, { HDA_CODEC_CX20665, 0, "Conexant CX20665" }, { HDA_CODEC_VT1708_8, 0, "VIA VT1708_8" }, { HDA_CODEC_VT1708_9, 0, "VIA VT1708_9" }, { HDA_CODEC_VT1708_A, 0, "VIA VT1708_A" }, { HDA_CODEC_VT1708_B, 0, "VIA VT1708_B" }, { HDA_CODEC_VT1709_0, 0, "VIA VT1709_0" }, { HDA_CODEC_VT1709_1, 0, "VIA VT1709_1" }, { HDA_CODEC_VT1709_2, 0, "VIA VT1709_2" }, { HDA_CODEC_VT1709_3, 0, "VIA VT1709_3" }, { HDA_CODEC_VT1709_4, 0, "VIA VT1709_4" }, { HDA_CODEC_VT1709_5, 0, "VIA VT1709_5" }, { HDA_CODEC_VT1709_6, 0, "VIA VT1709_6" }, { HDA_CODEC_VT1709_7, 0, "VIA VT1709_7" }, { HDA_CODEC_VT1708B_0, 0, "VIA VT1708B_0" }, { HDA_CODEC_VT1708B_1, 0, "VIA VT1708B_1" }, { HDA_CODEC_VT1708B_2, 0, "VIA VT1708B_2" }, { HDA_CODEC_VT1708B_3, 0, "VIA VT1708B_3" }, { HDA_CODEC_VT1708B_4, 0, "VIA VT1708B_4" }, { HDA_CODEC_VT1708B_5, 0, "VIA VT1708B_5" }, { HDA_CODEC_VT1708B_6, 0, "VIA VT1708B_6" }, { HDA_CODEC_VT1708B_7, 0, "VIA VT1708B_7" }, { HDA_CODEC_VT1708S_0, 0, "VIA VT1708S_0" }, { HDA_CODEC_VT1708S_1, 0, "VIA VT1708S_1" }, { HDA_CODEC_VT1708S_2, 0, "VIA VT1708S_2" }, { HDA_CODEC_VT1708S_3, 0, "VIA VT1708S_3" }, { HDA_CODEC_VT1708S_4, 0, "VIA VT1708S_4" }, { HDA_CODEC_VT1708S_5, 0, "VIA VT1708S_5" }, { HDA_CODEC_VT1708S_6, 0, "VIA VT1708S_6" }, { HDA_CODEC_VT1708S_7, 0, "VIA VT1708S_7" }, { HDA_CODEC_VT1702_0, 0, "VIA VT1702_0" }, { HDA_CODEC_VT1702_1, 0, "VIA VT1702_1" }, { HDA_CODEC_VT1702_2, 0, "VIA VT1702_2" }, { HDA_CODEC_VT1702_3, 0, "VIA VT1702_3" }, { HDA_CODEC_VT1702_4, 0, "VIA VT1702_4" }, { HDA_CODEC_VT1702_5, 0, "VIA VT1702_5" }, { HDA_CODEC_VT1702_6, 0, "VIA VT1702_6" }, { HDA_CODEC_VT1702_7, 0, "VIA VT1702_7" }, { HDA_CODEC_VT1716S_0, 0, "VIA VT1716S_0" }, { HDA_CODEC_VT1716S_1, 0, "VIA VT1716S_1" }, { HDA_CODEC_VT1718S_0, 0, "VIA VT1718S_0" }, { HDA_CODEC_VT1718S_1, 0, "VIA VT1718S_1" }, { HDA_CODEC_VT1802_0, 0, "VIA VT1802_0" }, { HDA_CODEC_VT1802_1, 0, "VIA VT1802_1" }, { HDA_CODEC_VT1812, 0, "VIA VT1812" }, { HDA_CODEC_VT1818S, 0, "VIA VT1818S" }, { HDA_CODEC_VT1828S, 0, "VIA VT1828S" }, { HDA_CODEC_VT2002P_0, 0, "VIA VT2002P_0" }, { HDA_CODEC_VT2002P_1, 0, "VIA VT2002P_1" }, { HDA_CODEC_VT2020, 0, "VIA VT2020" }, { HDA_CODEC_ATIRS600_1, 0, "ATI RS600" }, { HDA_CODEC_ATIRS600_2, 0, "ATI RS600" }, { HDA_CODEC_ATIRS690, 0, "ATI RS690/780" }, { HDA_CODEC_ATIR6XX, 0, "ATI R6xx" }, { HDA_CODEC_NVIDIAMCP67, 0, "NVIDIA MCP67" }, { HDA_CODEC_NVIDIAMCP73, 0, "NVIDIA MCP73" }, { HDA_CODEC_NVIDIAMCP78, 0, "NVIDIA MCP78" }, { HDA_CODEC_NVIDIAMCP78_2, 0, "NVIDIA MCP78" }, { HDA_CODEC_NVIDIAMCP78_3, 0, "NVIDIA MCP78" }, { HDA_CODEC_NVIDIAMCP78_4, 0, "NVIDIA MCP78" }, { HDA_CODEC_NVIDIAMCP7A, 0, "NVIDIA MCP7A" }, { HDA_CODEC_NVIDIAGT220, 0, "NVIDIA GT220" }, { HDA_CODEC_NVIDIAGT21X, 0, "NVIDIA GT21x" }, { HDA_CODEC_NVIDIAMCP89, 0, "NVIDIA MCP89" }, { HDA_CODEC_NVIDIAGT240, 0, "NVIDIA GT240" }, { HDA_CODEC_NVIDIAGTS450, 0, "NVIDIA GTS450" }, { HDA_CODEC_NVIDIAGT440, 0, "NVIDIA GT440" }, { HDA_CODEC_NVIDIAGTX550, 0, "NVIDIA GTX550" }, { HDA_CODEC_NVIDIAGTX570, 0, "NVIDIA GTX570" }, { HDA_CODEC_INTELIP, 0, "Intel Ibex Peak" }, { HDA_CODEC_INTELBL, 0, "Intel Bearlake" }, { HDA_CODEC_INTELCA, 0, "Intel Cantiga" }, { HDA_CODEC_INTELEL, 0, "Intel Eaglelake" }, { HDA_CODEC_INTELIP2, 0, "Intel Ibex Peak" }, { HDA_CODEC_INTELCPT, 0, "Intel Cougar Point" }, { HDA_CODEC_INTELPPT, 0, "Intel Panther Point" }, { HDA_CODEC_INTELHSW, 0, "Intel Haswell" }, + { HDA_CODEC_INTELBDW, 0, "Intel Broadwell" }, { HDA_CODEC_INTELCL, 0, "Intel Crestline" }, { HDA_CODEC_SII1390, 0, "Silicon Image SiI1390" }, { HDA_CODEC_SII1392, 0, "Silicon Image SiI1392" }, /* Unknown CODECs */ { HDA_CODEC_ADXXXX, 0, "Analog Devices" }, { HDA_CODEC_AGEREXXXX, 0, "Lucent/Agere Systems" }, { HDA_CODEC_ALCXXXX, 0, "Realtek" }, { HDA_CODEC_ATIXXXX, 0, "ATI" }, { HDA_CODEC_CAXXXX, 0, "Creative" }, { HDA_CODEC_CMIXXXX, 0, "CMedia" }, { HDA_CODEC_CMIXXXX2, 0, "CMedia" }, { HDA_CODEC_CSXXXX, 0, "Cirrus Logic" }, { HDA_CODEC_CXXXXX, 0, "Conexant" }, { HDA_CODEC_CHXXXX, 0, "Chrontel" }, { HDA_CODEC_IDTXXXX, 0, "IDT" }, { HDA_CODEC_INTELXXXX, 0, "Intel" }, { HDA_CODEC_MOTOXXXX, 0, "Motorola" }, { HDA_CODEC_NVIDIAXXXX, 0, "NVIDIA" }, { HDA_CODEC_SIIXXXX, 0, "Silicon Image" }, { HDA_CODEC_STACXXXX, 0, "Sigmatel" }, { HDA_CODEC_VTXXXX, 0, "VIA" }, }; static int hdacc_suspend(device_t dev) { HDA_BOOTHVERBOSE( device_printf(dev, "Suspend...\n"); ); bus_generic_suspend(dev); HDA_BOOTHVERBOSE( device_printf(dev, "Suspend done\n"); ); return (0); } static int hdacc_resume(device_t dev) { HDA_BOOTHVERBOSE( device_printf(dev, "Resume...\n"); ); bus_generic_resume(dev); HDA_BOOTHVERBOSE( device_printf(dev, "Resume done\n"); ); return (0); } static int hdacc_probe(device_t dev) { uint32_t id, revid; char buf[128]; int i; id = ((uint32_t)hda_get_vendor_id(dev) << 16) + hda_get_device_id(dev); revid = ((uint32_t)hda_get_revision_id(dev) << 8) + hda_get_stepping_id(dev); for (i = 0; i < nitems(hdacc_codecs); i++) { if (!HDA_DEV_MATCH(hdacc_codecs[i].id, id)) continue; if (hdacc_codecs[i].revid != 0 && hdacc_codecs[i].revid != revid) continue; break; } if (i < nitems(hdacc_codecs)) { if ((hdacc_codecs[i].id & 0xffff) != 0xffff) strlcpy(buf, hdacc_codecs[i].name, sizeof(buf)); else snprintf(buf, sizeof(buf), "%s (0x%04x)", hdacc_codecs[i].name, hda_get_device_id(dev)); } else snprintf(buf, sizeof(buf), "Generic (0x%04x)", id); strlcat(buf, " HDA CODEC", sizeof(buf)); device_set_desc_copy(dev, buf); return (BUS_PROBE_DEFAULT); } static int hdacc_attach(device_t dev) { struct hdacc_softc *codec = device_get_softc(dev); device_t child; int cad = (intptr_t)device_get_ivars(dev); uint32_t subnode; int startnode; int endnode; int i, n; codec->lock = HDAC_GET_MTX(device_get_parent(dev), dev); codec->dev = dev; codec->cad = cad; hdacc_lock(codec); subnode = hda_command(dev, HDA_CMD_GET_PARAMETER(0, 0x0, HDA_PARAM_SUB_NODE_COUNT)); hdacc_unlock(codec); if (subnode == HDA_INVALID) return (EIO); codec->fgcnt = HDA_PARAM_SUB_NODE_COUNT_TOTAL(subnode); startnode = HDA_PARAM_SUB_NODE_COUNT_START(subnode); endnode = startnode + codec->fgcnt; HDA_BOOTHVERBOSE( device_printf(dev, "Root Node at nid=0: %d subnodes %d-%d\n", HDA_PARAM_SUB_NODE_COUNT_TOTAL(subnode), startnode, endnode - 1); ); codec->fgs = malloc(sizeof(struct hdacc_fg) * codec->fgcnt, M_HDACC, M_ZERO | M_WAITOK); for (i = startnode, n = 0; i < endnode; i++, n++) { codec->fgs[n].nid = i; hdacc_lock(codec); codec->fgs[n].type = HDA_PARAM_FCT_GRP_TYPE_NODE_TYPE(hda_command(dev, HDA_CMD_GET_PARAMETER(0, i, HDA_PARAM_FCT_GRP_TYPE))); codec->fgs[n].subsystem_id = hda_command(dev, HDA_CMD_GET_SUBSYSTEM_ID(0, i)); hdacc_unlock(codec); codec->fgs[n].dev = child = device_add_child(dev, NULL, -1); if (child == NULL) { device_printf(dev, "Failed to add function device\n"); continue; } device_set_ivars(child, &codec->fgs[n]); } bus_generic_attach(dev); return (0); } static int hdacc_detach(device_t dev) { struct hdacc_softc *codec = device_get_softc(dev); int error; error = device_delete_children(dev); free(codec->fgs, M_HDACC); return (error); } static int hdacc_child_location_str(device_t dev, device_t child, char *buf, size_t buflen) { struct hdacc_fg *fg = device_get_ivars(child); snprintf(buf, buflen, "nid=%d", fg->nid); return (0); } static int hdacc_child_pnpinfo_str_method(device_t dev, device_t child, char *buf, size_t buflen) { struct hdacc_fg *fg = device_get_ivars(child); snprintf(buf, buflen, "type=0x%02x subsystem=0x%08x", fg->type, fg->subsystem_id); return (0); } static int hdacc_print_child(device_t dev, device_t child) { struct hdacc_fg *fg = device_get_ivars(child); int retval; retval = bus_print_child_header(dev, child); retval += printf(" at nid %d", fg->nid); retval += bus_print_child_footer(dev, child); return (retval); } static void hdacc_probe_nomatch(device_t dev, device_t child) { struct hdacc_softc *codec = device_get_softc(dev); struct hdacc_fg *fg = device_get_ivars(child); device_printf(child, "<%s %s Function Group> at nid %d on %s " "(no driver attached)\n", device_get_desc(dev), fg->type == HDA_PARAM_FCT_GRP_TYPE_NODE_TYPE_AUDIO ? "Audio" : (fg->type == HDA_PARAM_FCT_GRP_TYPE_NODE_TYPE_MODEM ? "Modem" : "Unknown"), fg->nid, device_get_nameunit(dev)); HDA_BOOTVERBOSE( device_printf(dev, "Subsystem ID: 0x%08x\n", hda_get_subsystem_id(dev)); ); HDA_BOOTHVERBOSE( device_printf(dev, "Power down FG nid=%d to the D3 state...\n", fg->nid); ); hdacc_lock(codec); hda_command(dev, HDA_CMD_SET_POWER_STATE(0, fg->nid, HDA_CMD_POWER_STATE_D3)); hdacc_unlock(codec); } static int hdacc_read_ivar(device_t dev, device_t child, int which, uintptr_t *result) { struct hdacc_fg *fg = device_get_ivars(child); switch (which) { case HDA_IVAR_NODE_ID: *result = fg->nid; break; case HDA_IVAR_NODE_TYPE: *result = fg->type; break; case HDA_IVAR_SUBSYSTEM_ID: *result = fg->subsystem_id; break; default: return(BUS_READ_IVAR(device_get_parent(dev), dev, which, result)); } return (0); } static struct mtx * hdacc_get_mtx(device_t dev, device_t child) { struct hdacc_softc *codec = device_get_softc(dev); return (codec->lock); } static uint32_t hdacc_codec_command(device_t dev, device_t child, uint32_t verb) { return (HDAC_CODEC_COMMAND(device_get_parent(dev), dev, verb)); } static int hdacc_stream_alloc(device_t dev, device_t child, int dir, int format, int stripe, uint32_t **dmapos) { struct hdacc_softc *codec = device_get_softc(dev); int stream; stream = HDAC_STREAM_ALLOC(device_get_parent(dev), dev, dir, format, stripe, dmapos); if (stream > 0) codec->streams[dir][stream] = child; return (stream); } static void hdacc_stream_free(device_t dev, device_t child, int dir, int stream) { struct hdacc_softc *codec = device_get_softc(dev); codec->streams[dir][stream] = NULL; HDAC_STREAM_FREE(device_get_parent(dev), dev, dir, stream); } static int hdacc_stream_start(device_t dev, device_t child, int dir, int stream, bus_addr_t buf, int blksz, int blkcnt) { return (HDAC_STREAM_START(device_get_parent(dev), dev, dir, stream, buf, blksz, blkcnt)); } static void hdacc_stream_stop(device_t dev, device_t child, int dir, int stream) { HDAC_STREAM_STOP(device_get_parent(dev), dev, dir, stream); } static void hdacc_stream_reset(device_t dev, device_t child, int dir, int stream) { HDAC_STREAM_RESET(device_get_parent(dev), dev, dir, stream); } static uint32_t hdacc_stream_getptr(device_t dev, device_t child, int dir, int stream) { return (HDAC_STREAM_GETPTR(device_get_parent(dev), dev, dir, stream)); } static void hdacc_stream_intr(device_t dev, int dir, int stream) { struct hdacc_softc *codec = device_get_softc(dev); device_t child; if ((child = codec->streams[dir][stream]) != NULL) HDAC_STREAM_INTR(child, dir, stream); } static int hdacc_unsol_alloc(device_t dev, device_t child, int wanted) { struct hdacc_softc *codec = device_get_softc(dev); int tag; wanted &= 0x3f; tag = wanted; do { if (codec->tags[tag] == NULL) { codec->tags[tag] = child; HDAC_UNSOL_ALLOC(device_get_parent(dev), dev, tag); return (tag); } tag++; tag &= 0x3f; } while (tag != wanted); return (-1); } static void hdacc_unsol_free(device_t dev, device_t child, int tag) { struct hdacc_softc *codec = device_get_softc(dev); KASSERT(tag >= 0 && tag <= 0x3f, ("Wrong tag value %d\n", tag)); codec->tags[tag] = NULL; HDAC_UNSOL_FREE(device_get_parent(dev), dev, tag); } static void hdacc_unsol_intr(device_t dev, uint32_t resp) { struct hdacc_softc *codec = device_get_softc(dev); device_t child; int tag; tag = resp >> 26; if ((child = codec->tags[tag]) != NULL) HDAC_UNSOL_INTR(child, resp); else device_printf(codec->dev, "Unexpected unsolicited " "response with tag %d: %08x\n", tag, resp); } static void hdacc_pindump(device_t dev) { device_t *devlist; int devcount, i; if (device_get_children(dev, &devlist, &devcount) != 0) return; for (i = 0; i < devcount; i++) HDAC_PINDUMP(devlist[i]); free(devlist, M_TEMP); } static device_method_t hdacc_methods[] = { /* device interface */ DEVMETHOD(device_probe, hdacc_probe), DEVMETHOD(device_attach, hdacc_attach), DEVMETHOD(device_detach, hdacc_detach), DEVMETHOD(device_suspend, hdacc_suspend), DEVMETHOD(device_resume, hdacc_resume), /* Bus interface */ DEVMETHOD(bus_child_location_str, hdacc_child_location_str), DEVMETHOD(bus_child_pnpinfo_str, hdacc_child_pnpinfo_str_method), DEVMETHOD(bus_print_child, hdacc_print_child), DEVMETHOD(bus_probe_nomatch, hdacc_probe_nomatch), DEVMETHOD(bus_read_ivar, hdacc_read_ivar), DEVMETHOD(hdac_get_mtx, hdacc_get_mtx), DEVMETHOD(hdac_codec_command, hdacc_codec_command), DEVMETHOD(hdac_stream_alloc, hdacc_stream_alloc), DEVMETHOD(hdac_stream_free, hdacc_stream_free), DEVMETHOD(hdac_stream_start, hdacc_stream_start), DEVMETHOD(hdac_stream_stop, hdacc_stream_stop), DEVMETHOD(hdac_stream_reset, hdacc_stream_reset), DEVMETHOD(hdac_stream_getptr, hdacc_stream_getptr), DEVMETHOD(hdac_stream_intr, hdacc_stream_intr), DEVMETHOD(hdac_unsol_alloc, hdacc_unsol_alloc), DEVMETHOD(hdac_unsol_free, hdacc_unsol_free), DEVMETHOD(hdac_unsol_intr, hdacc_unsol_intr), DEVMETHOD(hdac_pindump, hdacc_pindump), DEVMETHOD_END }; static driver_t hdacc_driver = { "hdacc", hdacc_methods, sizeof(struct hdacc_softc), }; static devclass_t hdacc_devclass; DRIVER_MODULE(snd_hda, hdac, hdacc_driver, hdacc_devclass, NULL, NULL); Index: user/ngie/more-tests/sys/dev/vt/vt_font.c =================================================================== --- user/ngie/more-tests/sys/dev/vt/vt_font.c (revision 281584) +++ user/ngie/more-tests/sys/dev/vt/vt_font.c (revision 281585) @@ -1,224 +1,224 @@ /*- * Copyright (c) 2009 The FreeBSD Foundation * All rights reserved. * * This software was developed by Ed Schouten under sponsorship from the * FreeBSD Foundation. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ #include __FBSDID("$FreeBSD$"); #include #include #include #include #include #include static MALLOC_DEFINE(M_VTFONT, "vtfont", "vt font"); /* Some limits to prevent abnormal fonts from being loaded. */ -#define VTFONT_MAXMAPPINGS 8192 -#define VTFONT_MAXGLYPHSIZE 1048576 +#define VTFONT_MAXMAPPINGS 65536 +#define VTFONT_MAXGLYPHSIZE 2097152 #define VTFONT_MAXDIMENSION 128 static uint16_t vtfont_bisearch(const struct vt_font_map *map, unsigned int len, uint32_t src) { int min, mid, max; min = 0; max = len - 1; /* Empty font map. */ if (len == 0) return (0); /* Character below minimal entry. */ if (src < map[0].vfm_src) return (0); /* Optimization: ASCII characters occur very often. */ if (src <= map[0].vfm_src + map[0].vfm_len) return (src - map[0].vfm_src + map[0].vfm_dst); /* Character above maximum entry. */ if (src > map[max].vfm_src + map[max].vfm_len) return (0); /* Binary search. */ while (max >= min) { mid = (min + max) / 2; if (src < map[mid].vfm_src) max = mid - 1; else if (src > map[mid].vfm_src + map[mid].vfm_len) min = mid + 1; else return (src - map[mid].vfm_src + map[mid].vfm_dst); } return (0); } const uint8_t * vtfont_lookup(const struct vt_font *vf, term_char_t c) { uint32_t src; uint16_t dst; size_t stride; unsigned int normal_map; unsigned int bold_map; /* * No support for printing right hand sides for CJK fullwidth * characters. Simply print a space and assume that the left * hand side describes the entire character. */ src = TCHAR_CHARACTER(c); if (TCHAR_FORMAT(c) & TF_CJK_RIGHT) { normal_map = VFNT_MAP_NORMAL_RIGHT; bold_map = VFNT_MAP_BOLD_RIGHT; } else { normal_map = VFNT_MAP_NORMAL; bold_map = VFNT_MAP_BOLD; } if (TCHAR_FORMAT(c) & TF_BOLD) { dst = vtfont_bisearch(vf->vf_map[bold_map], vf->vf_map_count[bold_map], src); if (dst != 0) goto found; } dst = vtfont_bisearch(vf->vf_map[normal_map], vf->vf_map_count[normal_map], src); found: stride = howmany(vf->vf_width, 8) * vf->vf_height; return (&vf->vf_bytes[dst * stride]); } struct vt_font * vtfont_ref(struct vt_font *vf) { refcount_acquire(&vf->vf_refcount); return (vf); } void vtfont_unref(struct vt_font *vf) { unsigned int i; if (refcount_release(&vf->vf_refcount)) { for (i = 0; i < VFNT_MAPS; i++) free(vf->vf_map[i], M_VTFONT); free(vf->vf_bytes, M_VTFONT); free(vf, M_VTFONT); } } static int vtfont_validate_map(struct vt_font_map *vfm, unsigned int length, unsigned int glyph_count) { unsigned int i, last = 0; for (i = 0; i < length; i++) { /* Not ordered. */ if (i > 0 && vfm[i].vfm_src <= last) return (EINVAL); /* * Destination extends amount of glyphs. */ if (vfm[i].vfm_dst >= glyph_count || vfm[i].vfm_dst + vfm[i].vfm_len >= glyph_count) return (EINVAL); last = vfm[i].vfm_src + vfm[i].vfm_len; } return (0); } int vtfont_load(vfnt_t *f, struct vt_font **ret) { size_t glyphsize, mapsize; struct vt_font *vf; int error; unsigned int i; /* Make sure the dimensions are valid. */ if (f->width < 1 || f->height < 1) return (EINVAL); if (f->width > VTFONT_MAXDIMENSION || f->height > VTFONT_MAXDIMENSION) return (E2BIG); /* Not too many mappings. */ for (i = 0; i < VFNT_MAPS; i++) if (f->map_count[i] > VTFONT_MAXMAPPINGS) return (E2BIG); /* Character 0 must always be present. */ if (f->glyph_count < 1) return (EINVAL); glyphsize = howmany(f->width, 8) * f->height * f->glyph_count; if (glyphsize > VTFONT_MAXGLYPHSIZE) return (E2BIG); /* Allocate new font structure. */ vf = malloc(sizeof *vf, M_VTFONT, M_WAITOK | M_ZERO); vf->vf_bytes = malloc(glyphsize, M_VTFONT, M_WAITOK); vf->vf_height = f->height; vf->vf_width = f->width; vf->vf_refcount = 1; /* Allocate, copy in, and validate mappings. */ for (i = 0; i < VFNT_MAPS; i++) { vf->vf_map_count[i] = f->map_count[i]; if (f->map_count[i] == 0) continue; mapsize = f->map_count[i] * sizeof(struct vt_font_map); vf->vf_map[i] = malloc(mapsize, M_VTFONT, M_WAITOK); error = copyin(f->map[i], vf->vf_map[i], mapsize); if (error) goto bad; error = vtfont_validate_map(vf->vf_map[i], vf->vf_map_count[i], f->glyph_count); if (error) goto bad; } /* Copy in glyph data. */ error = copyin(f->glyphs, vf->vf_bytes, glyphsize); if (error) goto bad; /* Success. */ *ret = vf; return (0); bad: vtfont_unref(vf); return (error); } Index: user/ngie/more-tests/sys/fs/ext2fs/ext2_vfsops.c =================================================================== --- user/ngie/more-tests/sys/fs/ext2fs/ext2_vfsops.c (revision 281584) +++ user/ngie/more-tests/sys/fs/ext2fs/ext2_vfsops.c (revision 281585) @@ -1,1114 +1,1115 @@ /*- * modified for EXT2FS support in Lites 1.1 * * Aug 1995, Godmar Back (gback@cs.utah.edu) * University of Utah, Department of Computer Science */ /*- * Copyright (c) 1989, 1991, 1993, 1994 * The Regents of the University of California. All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 4. Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * @(#)ffs_vfsops.c 8.8 (Berkeley) 4/18/94 * $FreeBSD$ */ #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include static int ext2_flushfiles(struct mount *mp, int flags, struct thread *td); static int ext2_mountfs(struct vnode *, struct mount *); static int ext2_reload(struct mount *mp, struct thread *td); static int ext2_sbupdate(struct ext2mount *, int); static int ext2_cgupdate(struct ext2mount *, int); static vfs_unmount_t ext2_unmount; static vfs_root_t ext2_root; static vfs_statfs_t ext2_statfs; static vfs_sync_t ext2_sync; static vfs_vget_t ext2_vget; static vfs_fhtovp_t ext2_fhtovp; static vfs_mount_t ext2_mount; MALLOC_DEFINE(M_EXT2NODE, "ext2_node", "EXT2 vnode private part"); static MALLOC_DEFINE(M_EXT2MNT, "ext2_mount", "EXT2 mount structure"); static struct vfsops ext2fs_vfsops = { .vfs_fhtovp = ext2_fhtovp, .vfs_mount = ext2_mount, .vfs_root = ext2_root, /* root inode via vget */ .vfs_statfs = ext2_statfs, .vfs_sync = ext2_sync, .vfs_unmount = ext2_unmount, .vfs_vget = ext2_vget, }; VFS_SET(ext2fs_vfsops, ext2fs, 0); static int ext2_check_sb_compat(struct ext2fs *es, struct cdev *dev, int ronly); static int compute_sb_data(struct vnode * devvp, struct ext2fs * es, struct m_ext2fs * fs); static const char *ext2_opts[] = { "acls", "async", "noatime", "noclusterr", "noclusterw", "noexec", "export", "force", "from", "multilabel", "suiddir", "nosymfollow", "sync", "union", NULL }; /* * VFS Operations. * * mount system call */ static int ext2_mount(struct mount *mp) { struct vfsoptlist *opts; struct vnode *devvp; struct thread *td; struct ext2mount *ump = NULL; struct m_ext2fs *fs; struct nameidata nd, *ndp = &nd; accmode_t accmode; char *path, *fspec; int error, flags, len; td = curthread; opts = mp->mnt_optnew; if (vfs_filteropt(opts, ext2_opts)) return (EINVAL); vfs_getopt(opts, "fspath", (void **)&path, NULL); /* Double-check the length of path.. */ if (strlen(path) >= MAXMNTLEN) return (ENAMETOOLONG); fspec = NULL; error = vfs_getopt(opts, "from", (void **)&fspec, &len); if (!error && fspec[len - 1] != '\0') return (EINVAL); /* * If updating, check whether changing from read-only to * read/write; if there is no device name, that's all we do. */ if (mp->mnt_flag & MNT_UPDATE) { ump = VFSTOEXT2(mp); fs = ump->um_e2fs; error = 0; if (fs->e2fs_ronly == 0 && vfs_flagopt(opts, "ro", NULL, 0)) { error = VFS_SYNC(mp, MNT_WAIT); if (error) return (error); flags = WRITECLOSE; if (mp->mnt_flag & MNT_FORCE) flags |= FORCECLOSE; error = ext2_flushfiles(mp, flags, td); if ( error == 0 && fs->e2fs_wasvalid && ext2_cgupdate(ump, MNT_WAIT) == 0) { fs->e2fs->e2fs_state |= E2FS_ISCLEAN; ext2_sbupdate(ump, MNT_WAIT); } fs->e2fs_ronly = 1; vfs_flagopt(opts, "ro", &mp->mnt_flag, MNT_RDONLY); DROP_GIANT(); g_topology_lock(); g_access(ump->um_cp, 0, -1, 0); g_topology_unlock(); PICKUP_GIANT(); } if (!error && (mp->mnt_flag & MNT_RELOAD)) error = ext2_reload(mp, td); if (error) return (error); devvp = ump->um_devvp; if (fs->e2fs_ronly && !vfs_flagopt(opts, "ro", NULL, 0)) { if (ext2_check_sb_compat(fs->e2fs, devvp->v_rdev, 0)) return (EPERM); /* * If upgrade to read-write by non-root, then verify * that user has necessary permissions on the device. */ vn_lock(devvp, LK_EXCLUSIVE | LK_RETRY); error = VOP_ACCESS(devvp, VREAD | VWRITE, td->td_ucred, td); if (error) error = priv_check(td, PRIV_VFS_MOUNT_PERM); if (error) { VOP_UNLOCK(devvp, 0); return (error); } VOP_UNLOCK(devvp, 0); DROP_GIANT(); g_topology_lock(); error = g_access(ump->um_cp, 0, 1, 0); g_topology_unlock(); PICKUP_GIANT(); if (error) return (error); if ((fs->e2fs->e2fs_state & E2FS_ISCLEAN) == 0 || (fs->e2fs->e2fs_state & E2FS_ERRORS)) { if (mp->mnt_flag & MNT_FORCE) { printf( "WARNING: %s was not properly dismounted\n", fs->e2fs_fsmnt); } else { printf( "WARNING: R/W mount of %s denied. Filesystem is not clean - run fsck\n", fs->e2fs_fsmnt); return (EPERM); } } fs->e2fs->e2fs_state &= ~E2FS_ISCLEAN; (void)ext2_cgupdate(ump, MNT_WAIT); fs->e2fs_ronly = 0; MNT_ILOCK(mp); mp->mnt_flag &= ~MNT_RDONLY; MNT_IUNLOCK(mp); } if (vfs_flagopt(opts, "export", NULL, 0)) { /* Process export requests in vfs_mount.c. */ return (error); } } /* * Not an update, or updating the name: look up the name * and verify that it refers to a sensible disk device. */ if (fspec == NULL) return (EINVAL); NDINIT(ndp, LOOKUP, FOLLOW | LOCKLEAF, UIO_SYSSPACE, fspec, td); if ((error = namei(ndp)) != 0) return (error); NDFREE(ndp, NDF_ONLY_PNBUF); devvp = ndp->ni_vp; if (!vn_isdisk(devvp, &error)) { vput(devvp); return (error); } /* * If mount by non-root, then verify that user has necessary * permissions on the device. * * XXXRW: VOP_ACCESS() enough? */ accmode = VREAD; if ((mp->mnt_flag & MNT_RDONLY) == 0) accmode |= VWRITE; error = VOP_ACCESS(devvp, accmode, td->td_ucred, td); if (error) error = priv_check(td, PRIV_VFS_MOUNT_PERM); if (error) { vput(devvp); return (error); } if ((mp->mnt_flag & MNT_UPDATE) == 0) { error = ext2_mountfs(devvp, mp); } else { if (devvp != ump->um_devvp) { vput(devvp); return (EINVAL); /* needs translation */ } else vput(devvp); } if (error) { vrele(devvp); return (error); } ump = VFSTOEXT2(mp); fs = ump->um_e2fs; /* * Note that this strncpy() is ok because of a check at the start * of ext2_mount(). */ strncpy(fs->e2fs_fsmnt, path, MAXMNTLEN); fs->e2fs_fsmnt[MAXMNTLEN - 1] = '\0'; vfs_mountedfrom(mp, fspec); return (0); } static int ext2_check_sb_compat(struct ext2fs *es, struct cdev *dev, int ronly) { if (es->e2fs_magic != E2FS_MAGIC) { printf("ext2fs: %s: wrong magic number %#x (expected %#x)\n", devtoname(dev), es->e2fs_magic, E2FS_MAGIC); return (1); } if (es->e2fs_rev > E2FS_REV0) { if (es->e2fs_features_incompat & ~(EXT2F_INCOMPAT_SUPP | EXT4F_RO_INCOMPAT_SUPP)) { printf( "WARNING: mount of %s denied due to unsupported optional features\n", devtoname(dev)); return (1); } if (!ronly && (es->e2fs_features_rocompat & ~EXT2F_ROCOMPAT_SUPP)) { printf("WARNING: R/W mount of %s denied due to " "unsupported optional features\n", devtoname(dev)); return (1); } } return (0); } /* * This computes the fields of the ext2_sb_info structure from the * data in the ext2_super_block structure read in. */ static int compute_sb_data(struct vnode *devvp, struct ext2fs *es, struct m_ext2fs *fs) { int db_count, error; int i; int logic_sb_block = 1; /* XXX for now */ struct buf *bp; uint32_t e2fs_descpb; fs->e2fs_bshift = EXT2_MIN_BLOCK_LOG_SIZE + es->e2fs_log_bsize; fs->e2fs_bsize = 1U << fs->e2fs_bshift; fs->e2fs_fsbtodb = es->e2fs_log_bsize + 1; fs->e2fs_qbmask = fs->e2fs_bsize - 1; fs->e2fs_fsize = EXT2_MIN_FRAG_SIZE << es->e2fs_log_fsize; if (fs->e2fs_fsize) fs->e2fs_fpb = fs->e2fs_bsize / fs->e2fs_fsize; fs->e2fs_bpg = es->e2fs_bpg; fs->e2fs_fpg = es->e2fs_fpg; fs->e2fs_ipg = es->e2fs_ipg; if (es->e2fs_rev == E2FS_REV0) { fs->e2fs_isize = E2FS_REV0_INODE_SIZE ; } else { fs->e2fs_isize = es->e2fs_inode_size; /* * Simple sanity check for superblock inode size value. */ if (EXT2_INODE_SIZE(fs) < E2FS_REV0_INODE_SIZE || EXT2_INODE_SIZE(fs) > fs->e2fs_bsize || (fs->e2fs_isize & (fs->e2fs_isize - 1)) != 0) { printf("ext2fs: invalid inode size %d\n", fs->e2fs_isize); return (EIO); } } /* Check for extra isize in big inodes. */ if (EXT2_HAS_RO_COMPAT_FEATURE(fs, EXT2F_ROCOMPAT_EXTRA_ISIZE) && EXT2_INODE_SIZE(fs) < sizeof(struct ext2fs_dinode)) { printf("ext2fs: no space for extra inode timestamps\n"); return (EINVAL); } fs->e2fs_ipb = fs->e2fs_bsize / EXT2_INODE_SIZE(fs); fs->e2fs_itpg = fs->e2fs_ipg / fs->e2fs_ipb; /* s_resuid / s_resgid ? */ fs->e2fs_gcount = (es->e2fs_bcount - es->e2fs_first_dblock + EXT2_BLOCKS_PER_GROUP(fs) - 1) / EXT2_BLOCKS_PER_GROUP(fs); e2fs_descpb = fs->e2fs_bsize / sizeof(struct ext2_gd); db_count = (fs->e2fs_gcount + e2fs_descpb - 1) / e2fs_descpb; fs->e2fs_gdbcount = db_count; fs->e2fs_gd = malloc(db_count * fs->e2fs_bsize, M_EXT2MNT, M_WAITOK); fs->e2fs_contigdirs = malloc(fs->e2fs_gcount * sizeof(*fs->e2fs_contigdirs), M_EXT2MNT, M_WAITOK | M_ZERO); /* * Adjust logic_sb_block. * Godmar thinks: if the blocksize is greater than 1024, then * the superblock is logically part of block zero. */ if(fs->e2fs_bsize > SBSIZE) logic_sb_block = 0; for (i = 0; i < db_count; i++) { error = bread(devvp , fsbtodb(fs, logic_sb_block + i + 1 ), fs->e2fs_bsize, NOCRED, &bp); if (error) { free(fs->e2fs_contigdirs, M_EXT2MNT); free(fs->e2fs_gd, M_EXT2MNT); brelse(bp); return (error); } e2fs_cgload((struct ext2_gd *)bp->b_data, &fs->e2fs_gd[ i * fs->e2fs_bsize / sizeof(struct ext2_gd)], fs->e2fs_bsize); brelse(bp); bp = NULL; } /* Initialization for the ext2 Orlov allocator variant. */ fs->e2fs_total_dir = 0; for (i = 0; i < fs->e2fs_gcount; i++) fs->e2fs_total_dir += fs->e2fs_gd[i].ext2bgd_ndirs; if (es->e2fs_rev == E2FS_REV0 || !EXT2_HAS_RO_COMPAT_FEATURE(fs, EXT2F_ROCOMPAT_LARGEFILE)) fs->e2fs_maxfilesize = 0x7fffffff; else { fs->e2fs_maxfilesize = 0xffffffffffff; if (EXT2_HAS_RO_COMPAT_FEATURE(fs, EXT2F_ROCOMPAT_HUGE_FILE)) fs->e2fs_maxfilesize = 0x7fffffffffffffff; } if (es->e4fs_flags & E2FS_UNSIGNED_HASH) { fs->e2fs_uhash = 3; } else if ((es->e4fs_flags & E2FS_SIGNED_HASH) == 0) { #ifdef __CHAR_UNSIGNED__ es->e4fs_flags |= E2FS_UNSIGNED_HASH; fs->e2fs_uhash = 3; #else es->e4fs_flags |= E2FS_SIGNED_HASH; #endif } return (0); } /* * Reload all incore data for a filesystem (used after running fsck on * the root filesystem and finding things to fix). The filesystem must * be mounted read-only. * * Things to do to update the mount: * 1) invalidate all cached meta-data. * 2) re-read superblock from disk. * 3) invalidate all cluster summary information. * 4) invalidate all inactive vnodes. * 5) invalidate all cached file data. * 6) re-read inode data for all active vnodes. * XXX we are missing some steps, in particular # 3, this has to be reviewed. */ static int ext2_reload(struct mount *mp, struct thread *td) { struct vnode *vp, *mvp, *devvp; struct inode *ip; struct buf *bp; struct ext2fs *es; struct m_ext2fs *fs; struct csum *sump; int error, i; int32_t *lp; if ((mp->mnt_flag & MNT_RDONLY) == 0) return (EINVAL); /* * Step 1: invalidate all cached meta-data. */ devvp = VFSTOEXT2(mp)->um_devvp; vn_lock(devvp, LK_EXCLUSIVE | LK_RETRY); if (vinvalbuf(devvp, 0, 0, 0) != 0) panic("ext2_reload: dirty1"); VOP_UNLOCK(devvp, 0); /* * Step 2: re-read superblock from disk. * constants have been adjusted for ext2 */ if ((error = bread(devvp, SBLOCK, SBSIZE, NOCRED, &bp)) != 0) return (error); es = (struct ext2fs *)bp->b_data; if (ext2_check_sb_compat(es, devvp->v_rdev, 0) != 0) { brelse(bp); return (EIO); /* XXX needs translation */ } fs = VFSTOEXT2(mp)->um_e2fs; bcopy(bp->b_data, fs->e2fs, sizeof(struct ext2fs)); if((error = compute_sb_data(devvp, es, fs)) != 0) { brelse(bp); return (error); } #ifdef UNKLAR if (fs->fs_sbsize < SBSIZE) bp->b_flags |= B_INVAL; #endif brelse(bp); /* * Step 3: invalidate all cluster summary information. */ if (fs->e2fs_contigsumsize > 0) { lp = fs->e2fs_maxcluster; sump = fs->e2fs_clustersum; for (i = 0; i < fs->e2fs_gcount; i++, sump++) { *lp++ = fs->e2fs_contigsumsize; sump->cs_init = 0; bzero(sump->cs_sum, fs->e2fs_contigsumsize + 1); } } loop: MNT_VNODE_FOREACH_ALL(vp, mp, mvp) { /* * Step 4: invalidate all cached file data. */ if (vget(vp, LK_EXCLUSIVE | LK_INTERLOCK, td)) { MNT_VNODE_FOREACH_ALL_ABORT(mp, mvp); goto loop; } if (vinvalbuf(vp, 0, 0, 0)) panic("ext2_reload: dirty2"); /* * Step 5: re-read inode data for all active vnodes. */ ip = VTOI(vp); error = bread(devvp, fsbtodb(fs, ino_to_fsba(fs, ip->i_number)), (int)fs->e2fs_bsize, NOCRED, &bp); if (error) { VOP_UNLOCK(vp, 0); vrele(vp); MNT_VNODE_FOREACH_ALL_ABORT(mp, mvp); return (error); } ext2_ei2i((struct ext2fs_dinode *) ((char *)bp->b_data + EXT2_INODE_SIZE(fs) * ino_to_fsbo(fs, ip->i_number)), ip); brelse(bp); VOP_UNLOCK(vp, 0); vrele(vp); } return (0); } /* * Common code for mount and mountroot. */ static int ext2_mountfs(struct vnode *devvp, struct mount *mp) { struct ext2mount *ump; struct buf *bp; struct m_ext2fs *fs; struct ext2fs *es; struct cdev *dev = devvp->v_rdev; struct g_consumer *cp; struct bufobj *bo; struct csum *sump; int error; int ronly; int i, size; int32_t *lp; int32_t e2fs_maxcontig; ronly = vfs_flagopt(mp->mnt_optnew, "ro", NULL, 0); /* XXX: use VOP_ACESS to check FS perms */ DROP_GIANT(); g_topology_lock(); error = g_vfs_open(devvp, &cp, "ext2fs", ronly ? 0 : 1); g_topology_unlock(); PICKUP_GIANT(); VOP_UNLOCK(devvp, 0); if (error) return (error); /* XXX: should we check for some sectorsize or 512 instead? */ if (((SBSIZE % cp->provider->sectorsize) != 0) || (SBSIZE < cp->provider->sectorsize)) { DROP_GIANT(); g_topology_lock(); g_vfs_close(cp); g_topology_unlock(); PICKUP_GIANT(); return (EINVAL); } bo = &devvp->v_bufobj; bo->bo_private = cp; bo->bo_ops = g_vfs_bufops; if (devvp->v_rdev->si_iosize_max != 0) mp->mnt_iosize_max = devvp->v_rdev->si_iosize_max; if (mp->mnt_iosize_max > MAXPHYS) mp->mnt_iosize_max = MAXPHYS; bp = NULL; ump = NULL; if ((error = bread(devvp, SBLOCK, SBSIZE, NOCRED, &bp)) != 0) goto out; es = (struct ext2fs *)bp->b_data; if (ext2_check_sb_compat(es, dev, ronly) != 0) { error = EINVAL; /* XXX needs translation */ goto out; } if ((es->e2fs_state & E2FS_ISCLEAN) == 0 || (es->e2fs_state & E2FS_ERRORS)) { if (ronly || (mp->mnt_flag & MNT_FORCE)) { printf( "WARNING: Filesystem was not properly dismounted\n"); } else { printf( "WARNING: R/W mount denied. Filesystem is not clean - run fsck\n"); error = EPERM; goto out; } } ump = malloc(sizeof(*ump), M_EXT2MNT, M_WAITOK | M_ZERO); /* * I don't know whether this is the right strategy. Note that * we dynamically allocate both an ext2_sb_info and an ext2_super_block * while Linux keeps the super block in a locked buffer. */ ump->um_e2fs = malloc(sizeof(struct m_ext2fs), M_EXT2MNT, M_WAITOK); ump->um_e2fs->e2fs = malloc(sizeof(struct ext2fs), M_EXT2MNT, M_WAITOK); mtx_init(EXT2_MTX(ump), "EXT2FS", "EXT2FS Lock", MTX_DEF); bcopy(es, ump->um_e2fs->e2fs, (u_int)sizeof(struct ext2fs)); if ((error = compute_sb_data(devvp, ump->um_e2fs->e2fs, ump->um_e2fs))) goto out; /* * Calculate the maximum contiguous blocks and size of cluster summary * array. In FFS this is done by newfs; however, the superblock * in ext2fs doesn't have these variables, so we can calculate * them here. */ e2fs_maxcontig = MAX(1, MAXPHYS / ump->um_e2fs->e2fs_bsize); ump->um_e2fs->e2fs_contigsumsize = MIN(e2fs_maxcontig, EXT2_MAXCONTIG); if (ump->um_e2fs->e2fs_contigsumsize > 0) { size = ump->um_e2fs->e2fs_gcount * sizeof(int32_t); ump->um_e2fs->e2fs_maxcluster = malloc(size, M_EXT2MNT, M_WAITOK); size = ump->um_e2fs->e2fs_gcount * sizeof(struct csum); ump->um_e2fs->e2fs_clustersum = malloc(size, M_EXT2MNT, M_WAITOK); lp = ump->um_e2fs->e2fs_maxcluster; sump = ump->um_e2fs->e2fs_clustersum; for (i = 0; i < ump->um_e2fs->e2fs_gcount; i++, sump++) { *lp++ = ump->um_e2fs->e2fs_contigsumsize; sump->cs_init = 0; sump->cs_sum = malloc((ump->um_e2fs->e2fs_contigsumsize + 1) * sizeof(int32_t), M_EXT2MNT, M_WAITOK | M_ZERO); } } brelse(bp); bp = NULL; fs = ump->um_e2fs; fs->e2fs_ronly = ronly; /* ronly is set according to mnt_flags */ /* * If the fs is not mounted read-only, make sure the super block is * always written back on a sync(). */ fs->e2fs_wasvalid = fs->e2fs->e2fs_state & E2FS_ISCLEAN ? 1 : 0; if (ronly == 0) { fs->e2fs_fmod = 1; /* mark it modified */ fs->e2fs->e2fs_state &= ~E2FS_ISCLEAN; /* set fs invalid */ } mp->mnt_data = ump; mp->mnt_stat.f_fsid.val[0] = dev2udev(dev); mp->mnt_stat.f_fsid.val[1] = mp->mnt_vfc->vfc_typenum; mp->mnt_maxsymlinklen = EXT2_MAXSYMLINKLEN; MNT_ILOCK(mp); mp->mnt_flag |= MNT_LOCAL; MNT_IUNLOCK(mp); ump->um_mountp = mp; ump->um_dev = dev; ump->um_devvp = devvp; ump->um_bo = &devvp->v_bufobj; ump->um_cp = cp; /* * Setting those two parameters allowed us to use * ufs_bmap w/o changse! */ ump->um_nindir = EXT2_ADDR_PER_BLOCK(fs); ump->um_bptrtodb = fs->e2fs->e2fs_log_bsize + 1; ump->um_seqinc = EXT2_FRAGS_PER_BLOCK(fs); if (ronly == 0) ext2_sbupdate(ump, MNT_WAIT); /* * Initialize filesystem stat information in mount struct. */ MNT_ILOCK(mp); - mp->mnt_kern_flag |= MNTK_LOOKUP_SHARED | MNTK_EXTENDED_SHARED; + mp->mnt_kern_flag |= MNTK_LOOKUP_SHARED | MNTK_EXTENDED_SHARED | + MNTK_USES_BCACHE; MNT_IUNLOCK(mp); return (0); out: if (bp) brelse(bp); if (cp != NULL) { DROP_GIANT(); g_topology_lock(); g_vfs_close(cp); g_topology_unlock(); PICKUP_GIANT(); } if (ump) { mtx_destroy(EXT2_MTX(ump)); free(ump->um_e2fs->e2fs_gd, M_EXT2MNT); free(ump->um_e2fs->e2fs_contigdirs, M_EXT2MNT); free(ump->um_e2fs->e2fs, M_EXT2MNT); free(ump->um_e2fs, M_EXT2MNT); free(ump, M_EXT2MNT); mp->mnt_data = NULL; } return (error); } /* * Unmount system call. */ static int ext2_unmount(struct mount *mp, int mntflags) { struct ext2mount *ump; struct m_ext2fs *fs; struct csum *sump; int error, flags, i, ronly; flags = 0; if (mntflags & MNT_FORCE) { if (mp->mnt_flag & MNT_ROOTFS) return (EINVAL); flags |= FORCECLOSE; } if ((error = ext2_flushfiles(mp, flags, curthread)) != 0) return (error); ump = VFSTOEXT2(mp); fs = ump->um_e2fs; ronly = fs->e2fs_ronly; if (ronly == 0 && ext2_cgupdate(ump, MNT_WAIT) == 0) { if (fs->e2fs_wasvalid) fs->e2fs->e2fs_state |= E2FS_ISCLEAN; ext2_sbupdate(ump, MNT_WAIT); } DROP_GIANT(); g_topology_lock(); g_vfs_close(ump->um_cp); g_topology_unlock(); PICKUP_GIANT(); vrele(ump->um_devvp); sump = fs->e2fs_clustersum; for (i = 0; i < fs->e2fs_gcount; i++, sump++) free(sump->cs_sum, M_EXT2MNT); free(fs->e2fs_clustersum, M_EXT2MNT); free(fs->e2fs_maxcluster, M_EXT2MNT); free(fs->e2fs_gd, M_EXT2MNT); free(fs->e2fs_contigdirs, M_EXT2MNT); free(fs->e2fs, M_EXT2MNT); free(fs, M_EXT2MNT); free(ump, M_EXT2MNT); mp->mnt_data = NULL; MNT_ILOCK(mp); mp->mnt_flag &= ~MNT_LOCAL; MNT_IUNLOCK(mp); return (error); } /* * Flush out all the files in a filesystem. */ static int ext2_flushfiles(struct mount *mp, int flags, struct thread *td) { int error; error = vflush(mp, 0, flags, td); return (error); } /* * Get filesystem statistics. */ int ext2_statfs(struct mount *mp, struct statfs *sbp) { struct ext2mount *ump; struct m_ext2fs *fs; uint32_t overhead, overhead_per_group, ngdb; int i, ngroups; ump = VFSTOEXT2(mp); fs = ump->um_e2fs; if (fs->e2fs->e2fs_magic != E2FS_MAGIC) panic("ext2_statfs"); /* * Compute the overhead (FS structures) */ overhead_per_group = 1 /* block bitmap */ + 1 /* inode bitmap */ + fs->e2fs_itpg; overhead = fs->e2fs->e2fs_first_dblock + fs->e2fs_gcount * overhead_per_group; if (fs->e2fs->e2fs_rev > E2FS_REV0 && fs->e2fs->e2fs_features_rocompat & EXT2F_ROCOMPAT_SPARSESUPER) { for (i = 0, ngroups = 0; i < fs->e2fs_gcount; i++) { if (cg_has_sb(i)) ngroups++; } } else { ngroups = fs->e2fs_gcount; } ngdb = fs->e2fs_gdbcount; if (fs->e2fs->e2fs_rev > E2FS_REV0 && fs->e2fs->e2fs_features_compat & EXT2F_COMPAT_RESIZE) ngdb += fs->e2fs->e2fs_reserved_ngdb; overhead += ngroups * (1 /* superblock */ + ngdb); sbp->f_bsize = EXT2_FRAG_SIZE(fs); sbp->f_iosize = EXT2_BLOCK_SIZE(fs); sbp->f_blocks = fs->e2fs->e2fs_bcount - overhead; sbp->f_bfree = fs->e2fs->e2fs_fbcount; sbp->f_bavail = sbp->f_bfree - fs->e2fs->e2fs_rbcount; sbp->f_files = fs->e2fs->e2fs_icount; sbp->f_ffree = fs->e2fs->e2fs_ficount; return (0); } /* * Go through the disk queues to initiate sandbagged IO; * go through the inodes to write those that have been modified; * initiate the writing of the super block if it has been modified. * * Note: we are always called with the filesystem marked `MPBUSY'. */ static int ext2_sync(struct mount *mp, int waitfor) { struct vnode *mvp, *vp; struct thread *td; struct inode *ip; struct ext2mount *ump = VFSTOEXT2(mp); struct m_ext2fs *fs; int error, allerror = 0; td = curthread; fs = ump->um_e2fs; if (fs->e2fs_fmod != 0 && fs->e2fs_ronly != 0) { /* XXX */ printf("fs = %s\n", fs->e2fs_fsmnt); panic("ext2_sync: rofs mod"); } /* * Write back each (modified) inode. */ loop: MNT_VNODE_FOREACH_ALL(vp, mp, mvp) { if (vp->v_type == VNON) { VI_UNLOCK(vp); continue; } ip = VTOI(vp); if ((ip->i_flag & (IN_ACCESS | IN_CHANGE | IN_MODIFIED | IN_UPDATE)) == 0 && (vp->v_bufobj.bo_dirty.bv_cnt == 0 || waitfor == MNT_LAZY)) { VI_UNLOCK(vp); continue; } error = vget(vp, LK_EXCLUSIVE | LK_NOWAIT | LK_INTERLOCK, td); if (error) { if (error == ENOENT) { MNT_VNODE_FOREACH_ALL_ABORT(mp, mvp); goto loop; } continue; } if ((error = VOP_FSYNC(vp, waitfor, td)) != 0) allerror = error; VOP_UNLOCK(vp, 0); vrele(vp); } /* * Force stale filesystem control information to be flushed. */ if (waitfor != MNT_LAZY) { vn_lock(ump->um_devvp, LK_EXCLUSIVE | LK_RETRY); if ((error = VOP_FSYNC(ump->um_devvp, waitfor, td)) != 0) allerror = error; VOP_UNLOCK(ump->um_devvp, 0); } /* * Write back modified superblock. */ if (fs->e2fs_fmod != 0) { fs->e2fs_fmod = 0; fs->e2fs->e2fs_wtime = time_second; if ((error = ext2_cgupdate(ump, waitfor)) != 0) allerror = error; } return (allerror); } /* * Look up an EXT2FS dinode number to find its incore vnode, otherwise read it * in from disk. If it is in core, wait for the lock bit to clear, then * return the inode locked. Detection and handling of mount points must be * done by the calling routine. */ static int ext2_vget(struct mount *mp, ino_t ino, int flags, struct vnode **vpp) { struct m_ext2fs *fs; struct inode *ip; struct ext2mount *ump; struct buf *bp; struct vnode *vp; struct thread *td; int i, error; int used_blocks; td = curthread; error = vfs_hash_get(mp, ino, flags, td, vpp, NULL, NULL); if (error || *vpp != NULL) return (error); ump = VFSTOEXT2(mp); ip = malloc(sizeof(struct inode), M_EXT2NODE, M_WAITOK | M_ZERO); /* Allocate a new vnode/inode. */ if ((error = getnewvnode("ext2fs", mp, &ext2_vnodeops, &vp)) != 0) { *vpp = NULL; free(ip, M_EXT2NODE); return (error); } vp->v_data = ip; ip->i_vnode = vp; ip->i_e2fs = fs = ump->um_e2fs; ip->i_ump = ump; ip->i_number = ino; lockmgr(vp->v_vnlock, LK_EXCLUSIVE, NULL); error = insmntque(vp, mp); if (error != 0) { free(ip, M_EXT2NODE); *vpp = NULL; return (error); } error = vfs_hash_insert(vp, ino, flags, td, vpp, NULL, NULL); if (error || *vpp != NULL) return (error); /* Read in the disk contents for the inode, copy into the inode. */ if ((error = bread(ump->um_devvp, fsbtodb(fs, ino_to_fsba(fs, ino)), (int)fs->e2fs_bsize, NOCRED, &bp)) != 0) { /* * The inode does not contain anything useful, so it would * be misleading to leave it on its hash chain. With mode * still zero, it will be unlinked and returned to the free * list by vput(). */ brelse(bp); vput(vp); *vpp = NULL; return (error); } /* convert ext2 inode to dinode */ ext2_ei2i((struct ext2fs_dinode *) ((char *)bp->b_data + EXT2_INODE_SIZE(fs) * ino_to_fsbo(fs, ino)), ip); ip->i_block_group = ino_to_cg(fs, ino); ip->i_next_alloc_block = 0; ip->i_next_alloc_goal = 0; /* * Now we want to make sure that block pointers for unused * blocks are zeroed out - ext2_balloc depends on this * although for regular files and directories only * * If IN_E4EXTENTS is enabled, unused blocks are not zeroed * out because we could corrupt the extent tree. */ if (!(ip->i_flag & IN_E4EXTENTS) && (S_ISDIR(ip->i_mode) || S_ISREG(ip->i_mode))) { used_blocks = (ip->i_size+fs->e2fs_bsize-1) / fs->e2fs_bsize; for (i = used_blocks; i < EXT2_NDIR_BLOCKS; i++) ip->i_db[i] = 0; } #ifdef EXT2FS_DEBUG ext2_print_inode(ip); #endif bqrelse(bp); /* * Initialize the vnode from the inode, check for aliases. * Note that the underlying vnode may have changed. */ if ((error = ext2_vinit(mp, &ext2_fifoops, &vp)) != 0) { vput(vp); *vpp = NULL; return (error); } /* * Finish inode initialization. */ /* * Set up a generation number for this inode if it does not * already have one. This should only happen on old filesystems. */ if (ip->i_gen == 0) { ip->i_gen = random() + 1; if ((vp->v_mount->mnt_flag & MNT_RDONLY) == 0) ip->i_flag |= IN_MODIFIED; } *vpp = vp; return (0); } /* * File handle to vnode * * Have to be really careful about stale file handles: * - check that the inode number is valid * - call ext2_vget() to get the locked inode * - check for an unallocated inode (i_mode == 0) * - check that the given client host has export rights and return * those rights via. exflagsp and credanonp */ static int ext2_fhtovp(struct mount *mp, struct fid *fhp, int flags, struct vnode **vpp) { struct inode *ip; struct ufid *ufhp; struct vnode *nvp; struct m_ext2fs *fs; int error; ufhp = (struct ufid *)fhp; fs = VFSTOEXT2(mp)->um_e2fs; if (ufhp->ufid_ino < EXT2_ROOTINO || ufhp->ufid_ino > fs->e2fs_gcount * fs->e2fs->e2fs_ipg) return (ESTALE); error = VFS_VGET(mp, ufhp->ufid_ino, LK_EXCLUSIVE, &nvp); if (error) { *vpp = NULLVP; return (error); } ip = VTOI(nvp); if (ip->i_mode == 0 || ip->i_gen != ufhp->ufid_gen || ip->i_nlink <= 0) { vput(nvp); *vpp = NULLVP; return (ESTALE); } *vpp = nvp; vnode_create_vobject(*vpp, 0, curthread); return (0); } /* * Write a superblock and associated information back to disk. */ static int ext2_sbupdate(struct ext2mount *mp, int waitfor) { struct m_ext2fs *fs = mp->um_e2fs; struct ext2fs *es = fs->e2fs; struct buf *bp; int error = 0; bp = getblk(mp->um_devvp, SBLOCK, SBSIZE, 0, 0, 0); bcopy((caddr_t)es, bp->b_data, (u_int)sizeof(struct ext2fs)); if (waitfor == MNT_WAIT) error = bwrite(bp); else bawrite(bp); /* * The buffers for group descriptors, inode bitmaps and block bitmaps * are not busy at this point and are (hopefully) written by the * usual sync mechanism. No need to write them here. */ return (error); } int ext2_cgupdate(struct ext2mount *mp, int waitfor) { struct m_ext2fs *fs = mp->um_e2fs; struct buf *bp; int i, error = 0, allerror = 0; allerror = ext2_sbupdate(mp, waitfor); for (i = 0; i < fs->e2fs_gdbcount; i++) { bp = getblk(mp->um_devvp, fsbtodb(fs, fs->e2fs->e2fs_first_dblock + 1 /* superblock */ + i), fs->e2fs_bsize, 0, 0, 0); e2fs_cgsave(&fs->e2fs_gd[ i * fs->e2fs_bsize / sizeof(struct ext2_gd)], (struct ext2_gd *)bp->b_data, fs->e2fs_bsize); if (waitfor == MNT_WAIT) error = bwrite(bp); else bawrite(bp); } if (!allerror && error) allerror = error; return (allerror); } /* * Return the root of a filesystem. */ static int ext2_root(struct mount *mp, int flags, struct vnode **vpp) { struct vnode *nvp; int error; error = VFS_VGET(mp, EXT2_ROOTINO, LK_EXCLUSIVE, &nvp); if (error) return (error); *vpp = nvp; return (0); } Index: user/ngie/more-tests/sys/fs/fuse/fuse_vfsops.c =================================================================== --- user/ngie/more-tests/sys/fs/fuse/fuse_vfsops.c (revision 281584) +++ user/ngie/more-tests/sys/fs/fuse/fuse_vfsops.c (revision 281585) @@ -1,533 +1,534 @@ /* * Copyright (c) 2007-2009 Google Inc. and Amit Singh * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions are * met: * * * Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * * Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following disclaimer * in the documentation and/or other materials provided with the * distribution. * * Neither the name of Google Inc. nor the names of its * contributors may be used to endorse or promote products derived from * this software without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. * * Copyright (C) 2005 Csaba Henk. * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ #include __FBSDID("$FreeBSD$"); #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include "fuse.h" #include "fuse_param.h" #include "fuse_node.h" #include "fuse_ipc.h" #include "fuse_internal.h" #include #include #define FUSE_DEBUG_MODULE VFSOPS #include "fuse_debug.h" /* This will do for privilege types for now */ #ifndef PRIV_VFS_FUSE_ALLOWOTHER #define PRIV_VFS_FUSE_ALLOWOTHER PRIV_VFS_MOUNT_NONUSER #endif #ifndef PRIV_VFS_FUSE_MOUNT_NONUSER #define PRIV_VFS_FUSE_MOUNT_NONUSER PRIV_VFS_MOUNT_NONUSER #endif #ifndef PRIV_VFS_FUSE_SYNC_UNMOUNT #define PRIV_VFS_FUSE_SYNC_UNMOUNT PRIV_VFS_MOUNT_NONUSER #endif static vfs_mount_t fuse_vfsop_mount; static vfs_unmount_t fuse_vfsop_unmount; static vfs_root_t fuse_vfsop_root; static vfs_statfs_t fuse_vfsop_statfs; struct vfsops fuse_vfsops = { .vfs_mount = fuse_vfsop_mount, .vfs_unmount = fuse_vfsop_unmount, .vfs_root = fuse_vfsop_root, .vfs_statfs = fuse_vfsop_statfs, }; SYSCTL_INT(_vfs_fuse, OID_AUTO, init_backgrounded, CTLFLAG_RD, SYSCTL_NULL_INT_PTR, 1, "indicate async handshake"); static int fuse_enforce_dev_perms = 0; SYSCTL_INT(_vfs_fuse, OID_AUTO, enforce_dev_perms, CTLFLAG_RW, &fuse_enforce_dev_perms, 0, "enforce fuse device permissions for secondary mounts"); static unsigned sync_unmount = 1; SYSCTL_UINT(_vfs_fuse, OID_AUTO, sync_unmount, CTLFLAG_RW, &sync_unmount, 0, "specify when to use synchronous unmount"); MALLOC_DEFINE(M_FUSEVFS, "fuse_filesystem", "buffer for fuse vfs layer"); static int fuse_getdevice(const char *fspec, struct thread *td, struct cdev **fdevp) { struct nameidata nd, *ndp = &nd; struct vnode *devvp; struct cdev *fdev; int err; /* * Not an update, or updating the name: look up the name * and verify that it refers to a sensible disk device. */ NDINIT(ndp, LOOKUP, FOLLOW, UIO_SYSSPACE, fspec, td); if ((err = namei(ndp)) != 0) return err; NDFREE(ndp, NDF_ONLY_PNBUF); devvp = ndp->ni_vp; if (devvp->v_type != VCHR) { vrele(devvp); return ENXIO; } fdev = devvp->v_rdev; dev_ref(fdev); if (fuse_enforce_dev_perms) { /* * Check if mounter can open the fuse device. * * This has significance only if we are doing a secondary mount * which doesn't involve actually opening fuse devices, but we * still want to enforce the permissions of the device (in * order to keep control over the circle of fuse users). * * (In case of primary mounts, we are either the superuser so * we can do anything anyway, or we can mount only if the * device is already opened by us, ie. we are permitted to open * the device.) */ #if 0 #ifdef MAC err = mac_check_vnode_open(td->td_ucred, devvp, VREAD | VWRITE); if (!err) #endif #endif /* 0 */ err = VOP_ACCESS(devvp, VREAD | VWRITE, td->td_ucred, td); if (err) { vrele(devvp); dev_rel(fdev); return err; } } /* * according to coda code, no extra lock is needed -- * although in sys/vnode.h this field is marked "v" */ vrele(devvp); if (!fdev->si_devsw || strcmp("fuse", fdev->si_devsw->d_name)) { dev_rel(fdev); return ENXIO; } *fdevp = fdev; return 0; } #define FUSE_FLAGOPT(fnam, fval) do { \ vfs_flagopt(opts, #fnam, &mntopts, fval); \ vfs_flagopt(opts, "__" #fnam, &__mntopts, fval); \ } while (0) static int fuse_vfsop_mount(struct mount *mp) { int err; uint64_t mntopts, __mntopts; int max_read_set; uint32_t max_read; int daemon_timeout; int fd; size_t len; struct cdev *fdev; struct fuse_data *data; struct thread *td; struct file *fp, *fptmp; char *fspec, *subtype; struct vfsoptlist *opts; cap_rights_t rights; subtype = NULL; max_read_set = 0; max_read = ~0; err = 0; mntopts = 0; __mntopts = 0; td = curthread; fuse_trace_printf_vfsop(); if (mp->mnt_flag & MNT_UPDATE) return EOPNOTSUPP; MNT_ILOCK(mp); mp->mnt_flag |= MNT_SYNCHRONOUS; mp->mnt_data = NULL; MNT_IUNLOCK(mp); /* Get the new options passed to mount */ opts = mp->mnt_optnew; if (!opts) return EINVAL; /* `fspath' contains the mount point (eg. /mnt/fuse/sshfs); REQUIRED */ if (!vfs_getopts(opts, "fspath", &err)) return err; /* `from' contains the device name (eg. /dev/fuse0); REQUIRED */ fspec = vfs_getopts(opts, "from", &err); if (!fspec) return err; /* `fd' contains the filedescriptor for this session; REQUIRED */ if (vfs_scanopt(opts, "fd", "%d", &fd) != 1) return EINVAL; err = fuse_getdevice(fspec, td, &fdev); if (err != 0) return err; /* * With the help of underscored options the mount program * can inform us from the flags it sets by default */ FUSE_FLAGOPT(allow_other, FSESS_DAEMON_CAN_SPY); FUSE_FLAGOPT(push_symlinks_in, FSESS_PUSH_SYMLINKS_IN); FUSE_FLAGOPT(default_permissions, FSESS_DEFAULT_PERMISSIONS); FUSE_FLAGOPT(no_attrcache, FSESS_NO_ATTRCACHE); FUSE_FLAGOPT(no_readahed, FSESS_NO_READAHEAD); FUSE_FLAGOPT(no_datacache, FSESS_NO_DATACACHE); FUSE_FLAGOPT(no_namecache, FSESS_NO_NAMECACHE); FUSE_FLAGOPT(no_mmap, FSESS_NO_MMAP); FUSE_FLAGOPT(brokenio, FSESS_BROKENIO); if (vfs_scanopt(opts, "max_read=", "%u", &max_read) == 1) max_read_set = 1; if (vfs_scanopt(opts, "timeout=", "%u", &daemon_timeout) == 1) { if (daemon_timeout < FUSE_MIN_DAEMON_TIMEOUT) daemon_timeout = FUSE_MIN_DAEMON_TIMEOUT; else if (daemon_timeout > FUSE_MAX_DAEMON_TIMEOUT) daemon_timeout = FUSE_MAX_DAEMON_TIMEOUT; } else { daemon_timeout = FUSE_DEFAULT_DAEMON_TIMEOUT; } subtype = vfs_getopts(opts, "subtype=", &err); FS_DEBUG2G("mntopts 0x%jx\n", (uintmax_t)mntopts); err = fget(td, fd, cap_rights_init(&rights, CAP_READ), &fp); if (err != 0) { FS_DEBUG("invalid or not opened device: data=%p\n", data); goto out; } fptmp = td->td_fpop; td->td_fpop = fp; err = devfs_get_cdevpriv((void **)&data); td->td_fpop = fptmp; fdrop(fp, td); FUSE_LOCK(); if (err != 0 || data == NULL || data->mp != NULL) { FS_DEBUG("invalid or not opened device: data=%p data.mp=%p\n", data, data != NULL ? data->mp : NULL); err = ENXIO; FUSE_UNLOCK(); goto out; } if (fdata_get_dead(data)) { FS_DEBUG("device is dead during mount: data=%p\n", data); err = ENOTCONN; FUSE_UNLOCK(); goto out; } /* Sanity + permission checks */ if (!data->daemoncred) panic("fuse daemon found, but identity unknown"); if (mntopts & FSESS_DAEMON_CAN_SPY) err = priv_check(td, PRIV_VFS_FUSE_ALLOWOTHER); if (err == 0 && td->td_ucred->cr_uid != data->daemoncred->cr_uid) /* are we allowed to do the first mount? */ err = priv_check(td, PRIV_VFS_FUSE_MOUNT_NONUSER); if (err) { FUSE_UNLOCK(); goto out; } data->ref++; data->mp = mp; data->dataflags |= mntopts; data->max_read = max_read; data->daemon_timeout = daemon_timeout; FUSE_UNLOCK(); vfs_getnewfsid(mp); MNT_ILOCK(mp); mp->mnt_data = data; mp->mnt_flag |= MNT_LOCAL; + mp->mnt_kern_flag |= MNTK_USES_BCACHE; MNT_IUNLOCK(mp); /* We need this here as this slot is used by getnewvnode() */ mp->mnt_stat.f_iosize = PAGE_SIZE; if (subtype) { strlcat(mp->mnt_stat.f_fstypename, ".", MFSNAMELEN); strlcat(mp->mnt_stat.f_fstypename, subtype, MFSNAMELEN); } copystr(fspec, mp->mnt_stat.f_mntfromname, MNAMELEN - 1, &len); bzero(mp->mnt_stat.f_mntfromname + len, MNAMELEN - len); FS_DEBUG2G("mp %p: %s\n", mp, mp->mnt_stat.f_mntfromname); /* Now handshaking with daemon */ fuse_internal_send_init(data, td); out: if (err) { FUSE_LOCK(); if (data->mp == mp) { /* * Destroy device only if we acquired reference to * it */ FS_DEBUG("mount failed, destroy device: data=%p mp=%p" " err=%d\n", data, mp, err); data->mp = NULL; fdata_trydestroy(data); } FUSE_UNLOCK(); dev_rel(fdev); } return err; } static int fuse_vfsop_unmount(struct mount *mp, int mntflags) { int err = 0; int flags = 0; struct cdev *fdev; struct fuse_data *data; struct fuse_dispatcher fdi; struct thread *td = curthread; fuse_trace_printf_vfsop(); if (mntflags & MNT_FORCE) { flags |= FORCECLOSE; } data = fuse_get_mpdata(mp); if (!data) { panic("no private data for mount point?"); } /* There is 1 extra root vnode reference (mp->mnt_data). */ FUSE_LOCK(); if (data->vroot != NULL) { struct vnode *vroot = data->vroot; data->vroot = NULL; FUSE_UNLOCK(); vrele(vroot); } else FUSE_UNLOCK(); err = vflush(mp, 0, flags, td); if (err) { debug_printf("vflush failed"); return err; } if (fdata_get_dead(data)) { goto alreadydead; } fdisp_init(&fdi, 0); fdisp_make(&fdi, FUSE_DESTROY, mp, 0, td, NULL); err = fdisp_wait_answ(&fdi); fdisp_destroy(&fdi); fdata_set_dead(data); alreadydead: FUSE_LOCK(); data->mp = NULL; fdev = data->fdev; fdata_trydestroy(data); FUSE_UNLOCK(); MNT_ILOCK(mp); mp->mnt_data = NULL; mp->mnt_flag &= ~MNT_LOCAL; MNT_IUNLOCK(mp); dev_rel(fdev); return 0; } static int fuse_vfsop_root(struct mount *mp, int lkflags, struct vnode **vpp) { struct fuse_data *data = fuse_get_mpdata(mp); int err = 0; if (data->vroot != NULL) { err = vget(data->vroot, lkflags, curthread); if (err == 0) *vpp = data->vroot; } else { err = fuse_vnode_get(mp, FUSE_ROOT_ID, NULL, vpp, NULL, VDIR); if (err == 0) { FUSE_LOCK(); MPASS(data->vroot == NULL || data->vroot == *vpp); if (data->vroot == NULL) { FS_DEBUG("new root vnode\n"); data->vroot = *vpp; FUSE_UNLOCK(); vref(*vpp); } else if (data->vroot != *vpp) { FS_DEBUG("root vnode race\n"); FUSE_UNLOCK(); VOP_UNLOCK(*vpp, 0); vrele(*vpp); vrecycle(*vpp); *vpp = data->vroot; } else FUSE_UNLOCK(); } } return err; } static int fuse_vfsop_statfs(struct mount *mp, struct statfs *sbp) { struct fuse_dispatcher fdi; int err = 0; struct fuse_statfs_out *fsfo; struct fuse_data *data; FS_DEBUG2G("mp %p: %s\n", mp, mp->mnt_stat.f_mntfromname); data = fuse_get_mpdata(mp); if (!(data->dataflags & FSESS_INITED)) goto fake; fdisp_init(&fdi, 0); fdisp_make(&fdi, FUSE_STATFS, mp, FUSE_ROOT_ID, NULL, NULL); err = fdisp_wait_answ(&fdi); if (err) { fdisp_destroy(&fdi); if (err == ENOTCONN) { /* * We want to seem a legitimate fs even if the daemon * is stiff dead... (so that, eg., we can still do path * based unmounting after the daemon dies). */ goto fake; } return err; } fsfo = fdi.answ; sbp->f_blocks = fsfo->st.blocks; sbp->f_bfree = fsfo->st.bfree; sbp->f_bavail = fsfo->st.bavail; sbp->f_files = fsfo->st.files; sbp->f_ffree = fsfo->st.ffree; /* cast from uint64_t to int64_t */ sbp->f_namemax = fsfo->st.namelen; sbp->f_bsize = fsfo->st.frsize; /* cast from uint32_t to uint64_t */ FS_DEBUG("fuse_statfs_out -- blocks: %llu, bfree: %llu, bavail: %llu, " "fil es: %llu, ffree: %llu, bsize: %i, namelen: %i\n", (unsigned long long)fsfo->st.blocks, (unsigned long long)fsfo->st.bfree, (unsigned long long)fsfo->st.bavail, (unsigned long long)fsfo->st.files, (unsigned long long)fsfo->st.ffree, fsfo->st.bsize, fsfo->st.namelen); fdisp_destroy(&fdi); return 0; fake: sbp->f_blocks = 0; sbp->f_bfree = 0; sbp->f_bavail = 0; sbp->f_files = 0; sbp->f_ffree = 0; sbp->f_namemax = 0; sbp->f_bsize = FUSE_DEFAULT_BLOCKSIZE; return 0; } Index: user/ngie/more-tests/sys/fs/msdosfs/msdosfs_vfsops.c =================================================================== --- user/ngie/more-tests/sys/fs/msdosfs/msdosfs_vfsops.c (revision 281584) +++ user/ngie/more-tests/sys/fs/msdosfs/msdosfs_vfsops.c (revision 281585) @@ -1,1035 +1,1036 @@ /* $FreeBSD$ */ /* $NetBSD: msdosfs_vfsops.c,v 1.51 1997/11/17 15:36:58 ws Exp $ */ /*- * Copyright (C) 1994, 1995, 1997 Wolfgang Solfrank. * Copyright (C) 1994, 1995, 1997 TooLs GmbH. * All rights reserved. * Original code by Paul Popelka (paulp@uts.amdahl.com) (see below). * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. All advertising materials mentioning features or use of this software * must display the following acknowledgement: * This product includes software developed by TooLs GmbH. * 4. The name of TooLs GmbH may not be used to endorse or promote products * derived from this software without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY TOOLS GMBH ``AS IS'' AND ANY EXPRESS OR * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. * IN NO EVENT SHALL TOOLS GMBH BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, * PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; * OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, * WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR * OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF * ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. */ /*- * Written by Paul Popelka (paulp@uts.amdahl.com) * * You can do anything you want with this software, just don't say you wrote * it, and don't remove this notice. * * This software is provided "as is". * * The author supplies this software to be publicly redistributed on the * understanding that the author is not responsible for the correct * functioning of this software in any circumstances and is not liable for * any damages caused by this software. * * October 1992 */ #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include static const char msdosfs_lock_msg[] = "fatlk"; /* Mount options that we support. */ static const char *msdosfs_opts[] = { "async", "noatime", "noclusterr", "noclusterw", "export", "force", "from", "sync", "cs_dos", "cs_local", "cs_win", "dirmask", "gid", "kiconv", "large", "longname", "longnames", "mask", "shortname", "shortnames", "uid", "win95", "nowin95", NULL }; #if 1 /*def PC98*/ /* * XXX - The boot signature formatted by NEC PC-98 DOS looks like a * garbage or a random value :-{ * If you want to use that broken-signatured media, define the * following symbol even though PC/AT. * (ex. mount PC-98 DOS formatted FD on PC/AT) */ #define MSDOSFS_NOCHECKSIG #endif MALLOC_DEFINE(M_MSDOSFSMNT, "msdosfs_mount", "MSDOSFS mount structure"); static MALLOC_DEFINE(M_MSDOSFSFAT, "msdosfs_fat", "MSDOSFS file allocation table"); struct iconv_functions *msdosfs_iconv; static int update_mp(struct mount *mp, struct thread *td); static int mountmsdosfs(struct vnode *devvp, struct mount *mp); static vfs_fhtovp_t msdosfs_fhtovp; static vfs_mount_t msdosfs_mount; static vfs_root_t msdosfs_root; static vfs_statfs_t msdosfs_statfs; static vfs_sync_t msdosfs_sync; static vfs_unmount_t msdosfs_unmount; /* Maximum length of a character set name (arbitrary). */ #define MAXCSLEN 64 static int update_mp(struct mount *mp, struct thread *td) { struct msdosfsmount *pmp = VFSTOMSDOSFS(mp); void *dos, *win, *local; int error, v; if (!vfs_getopt(mp->mnt_optnew, "kiconv", NULL, NULL)) { if (msdosfs_iconv != NULL) { error = vfs_getopt(mp->mnt_optnew, "cs_win", &win, NULL); if (!error) error = vfs_getopt(mp->mnt_optnew, "cs_local", &local, NULL); if (!error) error = vfs_getopt(mp->mnt_optnew, "cs_dos", &dos, NULL); if (!error) { msdosfs_iconv->open(win, local, &pmp->pm_u2w); msdosfs_iconv->open(local, win, &pmp->pm_w2u); msdosfs_iconv->open(dos, local, &pmp->pm_u2d); msdosfs_iconv->open(local, dos, &pmp->pm_d2u); } if (error != 0) return (error); } else { pmp->pm_w2u = NULL; pmp->pm_u2w = NULL; pmp->pm_d2u = NULL; pmp->pm_u2d = NULL; } } if (vfs_scanopt(mp->mnt_optnew, "gid", "%d", &v) == 1) pmp->pm_gid = v; if (vfs_scanopt(mp->mnt_optnew, "uid", "%d", &v) == 1) pmp->pm_uid = v; if (vfs_scanopt(mp->mnt_optnew, "mask", "%d", &v) == 1) pmp->pm_mask = v & ALLPERMS; if (vfs_scanopt(mp->mnt_optnew, "dirmask", "%d", &v) == 1) pmp->pm_dirmask = v & ALLPERMS; vfs_flagopt(mp->mnt_optnew, "shortname", &pmp->pm_flags, MSDOSFSMNT_SHORTNAME); vfs_flagopt(mp->mnt_optnew, "shortnames", &pmp->pm_flags, MSDOSFSMNT_SHORTNAME); vfs_flagopt(mp->mnt_optnew, "longname", &pmp->pm_flags, MSDOSFSMNT_LONGNAME); vfs_flagopt(mp->mnt_optnew, "longnames", &pmp->pm_flags, MSDOSFSMNT_LONGNAME); vfs_flagopt(mp->mnt_optnew, "kiconv", &pmp->pm_flags, MSDOSFSMNT_KICONV); if (vfs_getopt(mp->mnt_optnew, "nowin95", NULL, NULL) == 0) pmp->pm_flags |= MSDOSFSMNT_NOWIN95; else pmp->pm_flags &= ~MSDOSFSMNT_NOWIN95; if (pmp->pm_flags & MSDOSFSMNT_NOWIN95) pmp->pm_flags |= MSDOSFSMNT_SHORTNAME; else if (!(pmp->pm_flags & (MSDOSFSMNT_SHORTNAME | MSDOSFSMNT_LONGNAME))) { struct vnode *rootvp; /* * Try to divine whether to support Win'95 long filenames */ if (FAT32(pmp)) pmp->pm_flags |= MSDOSFSMNT_LONGNAME; else { if ((error = msdosfs_root(mp, LK_EXCLUSIVE, &rootvp)) != 0) return error; pmp->pm_flags |= findwin95(VTODE(rootvp)) ? MSDOSFSMNT_LONGNAME : MSDOSFSMNT_SHORTNAME; vput(rootvp); } } return 0; } static int msdosfs_cmount(struct mntarg *ma, void *data, uint64_t flags) { struct msdosfs_args args; struct export_args exp; int error; if (data == NULL) return (EINVAL); error = copyin(data, &args, sizeof args); if (error) return (error); vfs_oexport_conv(&args.export, &exp); ma = mount_argsu(ma, "from", args.fspec, MAXPATHLEN); ma = mount_arg(ma, "export", &exp, sizeof(exp)); ma = mount_argf(ma, "uid", "%d", args.uid); ma = mount_argf(ma, "gid", "%d", args.gid); ma = mount_argf(ma, "mask", "%d", args.mask); ma = mount_argf(ma, "dirmask", "%d", args.dirmask); ma = mount_argb(ma, args.flags & MSDOSFSMNT_SHORTNAME, "noshortname"); ma = mount_argb(ma, args.flags & MSDOSFSMNT_LONGNAME, "nolongname"); ma = mount_argb(ma, !(args.flags & MSDOSFSMNT_NOWIN95), "nowin95"); ma = mount_argb(ma, args.flags & MSDOSFSMNT_KICONV, "nokiconv"); ma = mount_argsu(ma, "cs_win", args.cs_win, MAXCSLEN); ma = mount_argsu(ma, "cs_dos", args.cs_dos, MAXCSLEN); ma = mount_argsu(ma, "cs_local", args.cs_local, MAXCSLEN); error = kernel_mount(ma, flags); return (error); } /* * mp - path - addr in user space of mount point (ie /usr or whatever) * data - addr in user space of mount params including the name of the block * special file to treat as a filesystem. */ static int msdosfs_mount(struct mount *mp) { struct vnode *devvp; /* vnode for blk device to mount */ struct thread *td; /* msdosfs specific mount control block */ struct msdosfsmount *pmp = NULL; struct nameidata ndp; int error, flags; accmode_t accmode; char *from; td = curthread; if (vfs_filteropt(mp->mnt_optnew, msdosfs_opts)) return (EINVAL); /* * If updating, check whether changing from read-only to * read/write; if there is no device name, that's all we do. */ if (mp->mnt_flag & MNT_UPDATE) { pmp = VFSTOMSDOSFS(mp); if (vfs_flagopt(mp->mnt_optnew, "export", NULL, 0)) { /* * Forbid export requests if filesystem has * MSDOSFS_LARGEFS flag set. */ if ((pmp->pm_flags & MSDOSFS_LARGEFS) != 0) { vfs_mount_error(mp, "MSDOSFS_LARGEFS flag set, cannot export"); return (EOPNOTSUPP); } } if (!(pmp->pm_flags & MSDOSFSMNT_RONLY) && vfs_flagopt(mp->mnt_optnew, "ro", NULL, 0)) { error = VFS_SYNC(mp, MNT_WAIT); if (error) return (error); flags = WRITECLOSE; if (mp->mnt_flag & MNT_FORCE) flags |= FORCECLOSE; error = vflush(mp, 0, flags, td); if (error) return (error); /* * Now the volume is clean. Mark it so while the * device is still rw. */ error = markvoldirty(pmp, 0); if (error) { (void)markvoldirty(pmp, 1); return (error); } /* Downgrade the device from rw to ro. */ DROP_GIANT(); g_topology_lock(); error = g_access(pmp->pm_cp, 0, -1, 0); g_topology_unlock(); PICKUP_GIANT(); if (error) { (void)markvoldirty(pmp, 1); return (error); } /* * Backing out after an error was painful in the * above. Now we are committed to succeeding. */ pmp->pm_fmod = 0; pmp->pm_flags |= MSDOSFSMNT_RONLY; MNT_ILOCK(mp); mp->mnt_flag |= MNT_RDONLY; MNT_IUNLOCK(mp); } else if ((pmp->pm_flags & MSDOSFSMNT_RONLY) && !vfs_flagopt(mp->mnt_optnew, "ro", NULL, 0)) { /* * If upgrade to read-write by non-root, then verify * that user has necessary permissions on the device. */ devvp = pmp->pm_devvp; vn_lock(devvp, LK_EXCLUSIVE | LK_RETRY); error = VOP_ACCESS(devvp, VREAD | VWRITE, td->td_ucred, td); if (error) error = priv_check(td, PRIV_VFS_MOUNT_PERM); if (error) { VOP_UNLOCK(devvp, 0); return (error); } VOP_UNLOCK(devvp, 0); DROP_GIANT(); g_topology_lock(); error = g_access(pmp->pm_cp, 0, 1, 0); g_topology_unlock(); PICKUP_GIANT(); if (error) return (error); pmp->pm_fmod = 1; pmp->pm_flags &= ~MSDOSFSMNT_RONLY; MNT_ILOCK(mp); mp->mnt_flag &= ~MNT_RDONLY; MNT_IUNLOCK(mp); /* Now that the volume is modifiable, mark it dirty. */ error = markvoldirty(pmp, 1); if (error) return (error); } } /* * Not an update, or updating the name: look up the name * and verify that it refers to a sensible disk device. */ if (vfs_getopt(mp->mnt_optnew, "from", (void **)&from, NULL)) return (EINVAL); NDINIT(&ndp, LOOKUP, FOLLOW | LOCKLEAF, UIO_SYSSPACE, from, td); error = namei(&ndp); if (error) return (error); devvp = ndp.ni_vp; NDFREE(&ndp, NDF_ONLY_PNBUF); if (!vn_isdisk(devvp, &error)) { vput(devvp); return (error); } /* * If mount by non-root, then verify that user has necessary * permissions on the device. */ accmode = VREAD; if ((mp->mnt_flag & MNT_RDONLY) == 0) accmode |= VWRITE; error = VOP_ACCESS(devvp, accmode, td->td_ucred, td); if (error) error = priv_check(td, PRIV_VFS_MOUNT_PERM); if (error) { vput(devvp); return (error); } if ((mp->mnt_flag & MNT_UPDATE) == 0) { error = mountmsdosfs(devvp, mp); #ifdef MSDOSFS_DEBUG /* only needed for the printf below */ pmp = VFSTOMSDOSFS(mp); #endif } else { vput(devvp); if (devvp != pmp->pm_devvp) return (EINVAL); /* XXX needs translation */ } if (error) { vrele(devvp); return (error); } error = update_mp(mp, td); if (error) { if ((mp->mnt_flag & MNT_UPDATE) == 0) msdosfs_unmount(mp, MNT_FORCE); return error; } if (devvp->v_type == VCHR && devvp->v_rdev != NULL) devvp->v_rdev->si_mountpt = mp; vfs_mountedfrom(mp, from); #ifdef MSDOSFS_DEBUG printf("msdosfs_mount(): mp %p, pmp %p, inusemap %p\n", mp, pmp, pmp->pm_inusemap); #endif return (0); } static int mountmsdosfs(struct vnode *devvp, struct mount *mp) { struct msdosfsmount *pmp; struct buf *bp; struct cdev *dev; union bootsector *bsp; struct byte_bpb33 *b33; struct byte_bpb50 *b50; struct byte_bpb710 *b710; u_int8_t SecPerClust; u_long clusters; int ronly, error; struct g_consumer *cp; struct bufobj *bo; bp = NULL; /* This and pmp both used in error_exit. */ pmp = NULL; ronly = (mp->mnt_flag & MNT_RDONLY) != 0; dev = devvp->v_rdev; dev_ref(dev); DROP_GIANT(); g_topology_lock(); error = g_vfs_open(devvp, &cp, "msdosfs", ronly ? 0 : 1); g_topology_unlock(); PICKUP_GIANT(); VOP_UNLOCK(devvp, 0); if (error) goto error_exit; bo = &devvp->v_bufobj; /* * Read the boot sector of the filesystem, and then check the * boot signature. If not a dos boot sector then error out. * * NOTE: 8192 is a magic size that works for ffs. */ error = bread(devvp, 0, 8192, NOCRED, &bp); if (error) goto error_exit; bp->b_flags |= B_AGE; bsp = (union bootsector *)bp->b_data; b33 = (struct byte_bpb33 *)bsp->bs33.bsBPB; b50 = (struct byte_bpb50 *)bsp->bs50.bsBPB; b710 = (struct byte_bpb710 *)bsp->bs710.bsBPB; #ifndef MSDOSFS_NOCHECKSIG if (bsp->bs50.bsBootSectSig0 != BOOTSIG0 || bsp->bs50.bsBootSectSig1 != BOOTSIG1) { error = EINVAL; goto error_exit; } #endif pmp = malloc(sizeof *pmp, M_MSDOSFSMNT, M_WAITOK | M_ZERO); pmp->pm_mountp = mp; pmp->pm_cp = cp; pmp->pm_bo = bo; lockinit(&pmp->pm_fatlock, 0, msdosfs_lock_msg, 0, 0); /* * Initialize ownerships and permissions, since nothing else will * initialize them iff we are mounting root. */ pmp->pm_uid = UID_ROOT; pmp->pm_gid = GID_WHEEL; pmp->pm_mask = pmp->pm_dirmask = S_IXUSR | S_IXGRP | S_IXOTH | S_IRUSR | S_IRGRP | S_IROTH | S_IWUSR; /* * Experimental support for large MS-DOS filesystems. * WARNING: This uses at least 32 bytes of kernel memory (which is not * reclaimed until the FS is unmounted) for each file on disk to map * between the 32-bit inode numbers used by VFS and the 64-bit * pseudo-inode numbers used internally by msdosfs. This is only * safe to use in certain controlled situations (e.g. read-only FS * with less than 1 million files). * Since the mappings do not persist across unmounts (or reboots), these * filesystems are not suitable for exporting through NFS, or any other * application that requires fixed inode numbers. */ vfs_flagopt(mp->mnt_optnew, "large", &pmp->pm_flags, MSDOSFS_LARGEFS); /* * Compute several useful quantities from the bpb in the * bootsector. Copy in the dos 5 variant of the bpb then fix up * the fields that are different between dos 5 and dos 3.3. */ SecPerClust = b50->bpbSecPerClust; pmp->pm_BytesPerSec = getushort(b50->bpbBytesPerSec); if (pmp->pm_BytesPerSec < DEV_BSIZE) { error = EINVAL; goto error_exit; } pmp->pm_ResSectors = getushort(b50->bpbResSectors); pmp->pm_FATs = b50->bpbFATs; pmp->pm_RootDirEnts = getushort(b50->bpbRootDirEnts); pmp->pm_Sectors = getushort(b50->bpbSectors); pmp->pm_FATsecs = getushort(b50->bpbFATsecs); pmp->pm_SecPerTrack = getushort(b50->bpbSecPerTrack); pmp->pm_Heads = getushort(b50->bpbHeads); pmp->pm_Media = b50->bpbMedia; /* calculate the ratio of sector size to DEV_BSIZE */ pmp->pm_BlkPerSec = pmp->pm_BytesPerSec / DEV_BSIZE; /* * We don't check pm_Heads nor pm_SecPerTrack, because * these may not be set for EFI file systems. We don't * use these anyway, so we're unaffected if they are * invalid. */ if (!pmp->pm_BytesPerSec || !SecPerClust) { error = EINVAL; goto error_exit; } if (pmp->pm_Sectors == 0) { pmp->pm_HiddenSects = getulong(b50->bpbHiddenSecs); pmp->pm_HugeSectors = getulong(b50->bpbHugeSectors); } else { pmp->pm_HiddenSects = getushort(b33->bpbHiddenSecs); pmp->pm_HugeSectors = pmp->pm_Sectors; } if (!(pmp->pm_flags & MSDOSFS_LARGEFS)) { if (pmp->pm_HugeSectors > 0xffffffff / (pmp->pm_BytesPerSec / sizeof(struct direntry)) + 1) { /* * We cannot deal currently with this size of disk * due to fileid limitations (see msdosfs_getattr and * msdosfs_readdir) */ error = EINVAL; vfs_mount_error(mp, "Disk too big, try '-o large' mount option"); goto error_exit; } } if (pmp->pm_RootDirEnts == 0) { if (pmp->pm_FATsecs || getushort(b710->bpbFSVers)) { error = EINVAL; #ifdef MSDOSFS_DEBUG printf("mountmsdosfs(): bad FAT32 filesystem\n"); #endif goto error_exit; } pmp->pm_fatmask = FAT32_MASK; pmp->pm_fatmult = 4; pmp->pm_fatdiv = 1; pmp->pm_FATsecs = getulong(b710->bpbBigFATsecs); if (getushort(b710->bpbExtFlags) & FATMIRROR) pmp->pm_curfat = getushort(b710->bpbExtFlags) & FATNUM; else pmp->pm_flags |= MSDOSFS_FATMIRROR; } else pmp->pm_flags |= MSDOSFS_FATMIRROR; /* * Check a few values (could do some more): * - logical sector size: power of 2, >= block size * - sectors per cluster: power of 2, >= 1 * - number of sectors: >= 1, <= size of partition * - number of FAT sectors: >= 1 */ if ( (SecPerClust == 0) || (SecPerClust & (SecPerClust - 1)) || (pmp->pm_BytesPerSec < DEV_BSIZE) || (pmp->pm_BytesPerSec & (pmp->pm_BytesPerSec - 1)) || (pmp->pm_HugeSectors == 0) || (pmp->pm_FATsecs == 0) || (SecPerClust * pmp->pm_BlkPerSec > MAXBSIZE / DEV_BSIZE) ) { error = EINVAL; goto error_exit; } pmp->pm_HugeSectors *= pmp->pm_BlkPerSec; pmp->pm_HiddenSects *= pmp->pm_BlkPerSec; /* XXX not used? */ pmp->pm_FATsecs *= pmp->pm_BlkPerSec; SecPerClust *= pmp->pm_BlkPerSec; pmp->pm_fatblk = pmp->pm_ResSectors * pmp->pm_BlkPerSec; if (FAT32(pmp)) { pmp->pm_rootdirblk = getulong(b710->bpbRootClust); pmp->pm_firstcluster = pmp->pm_fatblk + (pmp->pm_FATs * pmp->pm_FATsecs); pmp->pm_fsinfo = getushort(b710->bpbFSInfo) * pmp->pm_BlkPerSec; } else { pmp->pm_rootdirblk = pmp->pm_fatblk + (pmp->pm_FATs * pmp->pm_FATsecs); pmp->pm_rootdirsize = (pmp->pm_RootDirEnts * sizeof(struct direntry) + DEV_BSIZE - 1) / DEV_BSIZE; /* in blocks */ pmp->pm_firstcluster = pmp->pm_rootdirblk + pmp->pm_rootdirsize; } pmp->pm_maxcluster = (pmp->pm_HugeSectors - pmp->pm_firstcluster) / SecPerClust + 1; pmp->pm_fatsize = pmp->pm_FATsecs * DEV_BSIZE; /* XXX not used? */ if (pmp->pm_fatmask == 0) { if (pmp->pm_maxcluster <= ((CLUST_RSRVD - CLUST_FIRST) & FAT12_MASK)) { /* * This will usually be a floppy disk. This size makes * sure that one fat entry will not be split across * multiple blocks. */ pmp->pm_fatmask = FAT12_MASK; pmp->pm_fatmult = 3; pmp->pm_fatdiv = 2; } else { pmp->pm_fatmask = FAT16_MASK; pmp->pm_fatmult = 2; pmp->pm_fatdiv = 1; } } clusters = (pmp->pm_fatsize / pmp->pm_fatmult) * pmp->pm_fatdiv; if (pmp->pm_maxcluster >= clusters) { #ifdef MSDOSFS_DEBUG printf("Warning: number of clusters (%ld) exceeds FAT " "capacity (%ld)\n", pmp->pm_maxcluster + 1, clusters); #endif pmp->pm_maxcluster = clusters - 1; } if (FAT12(pmp)) pmp->pm_fatblocksize = 3 * 512; else pmp->pm_fatblocksize = PAGE_SIZE; pmp->pm_fatblocksize = roundup(pmp->pm_fatblocksize, pmp->pm_BytesPerSec); pmp->pm_fatblocksec = pmp->pm_fatblocksize / DEV_BSIZE; pmp->pm_bnshift = ffs(DEV_BSIZE) - 1; /* * Compute mask and shift value for isolating cluster relative byte * offsets and cluster numbers from a file offset. */ pmp->pm_bpcluster = SecPerClust * DEV_BSIZE; pmp->pm_crbomask = pmp->pm_bpcluster - 1; pmp->pm_cnshift = ffs(pmp->pm_bpcluster) - 1; /* * Check for valid cluster size * must be a power of 2 */ if (pmp->pm_bpcluster ^ (1 << pmp->pm_cnshift)) { error = EINVAL; goto error_exit; } /* * Release the bootsector buffer. */ brelse(bp); bp = NULL; /* * Check the fsinfo sector if we have one. Silently fix up our * in-core copy of fp->fsinxtfree if it is unknown (0xffffffff) * or too large. Ignore fp->fsinfree for now, since we need to * read the entire FAT anyway to fill the inuse map. */ if (pmp->pm_fsinfo) { struct fsinfo *fp; if ((error = bread(devvp, pmp->pm_fsinfo, pmp->pm_BytesPerSec, NOCRED, &bp)) != 0) goto error_exit; fp = (struct fsinfo *)bp->b_data; if (!bcmp(fp->fsisig1, "RRaA", 4) && !bcmp(fp->fsisig2, "rrAa", 4) && !bcmp(fp->fsisig3, "\0\0\125\252", 4)) { pmp->pm_nxtfree = getulong(fp->fsinxtfree); if (pmp->pm_nxtfree > pmp->pm_maxcluster) pmp->pm_nxtfree = CLUST_FIRST; } else pmp->pm_fsinfo = 0; brelse(bp); bp = NULL; } /* * Finish initializing pmp->pm_nxtfree (just in case the first few * sectors aren't properly reserved in the FAT). This completes * the fixup for fp->fsinxtfree, and fixes up the zero-initialized * value if there is no fsinfo. We will use pmp->pm_nxtfree * internally even if there is no fsinfo. */ if (pmp->pm_nxtfree < CLUST_FIRST) pmp->pm_nxtfree = CLUST_FIRST; /* * Allocate memory for the bitmap of allocated clusters, and then * fill it in. */ pmp->pm_inusemap = malloc(howmany(pmp->pm_maxcluster + 1, N_INUSEBITS) * sizeof(*pmp->pm_inusemap), M_MSDOSFSFAT, M_WAITOK); /* * fillinusemap() needs pm_devvp. */ pmp->pm_devvp = devvp; pmp->pm_dev = dev; /* * Have the inuse map filled in. */ MSDOSFS_LOCK_MP(pmp); error = fillinusemap(pmp); MSDOSFS_UNLOCK_MP(pmp); if (error != 0) goto error_exit; /* * If they want fat updates to be synchronous then let them suffer * the performance degradation in exchange for the on disk copy of * the fat being correct just about all the time. I suppose this * would be a good thing to turn on if the kernel is still flakey. */ if (mp->mnt_flag & MNT_SYNCHRONOUS) pmp->pm_flags |= MSDOSFSMNT_WAITONFAT; /* * Finish up. */ if (ronly) pmp->pm_flags |= MSDOSFSMNT_RONLY; else { if ((error = markvoldirty(pmp, 1)) != 0) { (void)markvoldirty(pmp, 0); goto error_exit; } pmp->pm_fmod = 1; } mp->mnt_data = pmp; mp->mnt_stat.f_fsid.val[0] = dev2udev(dev); mp->mnt_stat.f_fsid.val[1] = mp->mnt_vfc->vfc_typenum; MNT_ILOCK(mp); mp->mnt_flag |= MNT_LOCAL; + mp->mnt_kern_flag |= MNTK_USES_BCACHE; MNT_IUNLOCK(mp); if (pmp->pm_flags & MSDOSFS_LARGEFS) msdosfs_fileno_init(mp); return 0; error_exit: if (bp) brelse(bp); if (cp != NULL) { DROP_GIANT(); g_topology_lock(); g_vfs_close(cp); g_topology_unlock(); PICKUP_GIANT(); } if (pmp) { lockdestroy(&pmp->pm_fatlock); if (pmp->pm_inusemap) free(pmp->pm_inusemap, M_MSDOSFSFAT); free(pmp, M_MSDOSFSMNT); mp->mnt_data = NULL; } dev_rel(dev); return (error); } /* * Unmount the filesystem described by mp. */ static int msdosfs_unmount(struct mount *mp, int mntflags) { struct msdosfsmount *pmp; int error, flags; error = flags = 0; pmp = VFSTOMSDOSFS(mp); if ((pmp->pm_flags & MSDOSFSMNT_RONLY) == 0) error = msdosfs_sync(mp, MNT_WAIT); if ((mntflags & MNT_FORCE) != 0) flags |= FORCECLOSE; else if (error != 0) return (error); error = vflush(mp, 0, flags, curthread); if (error != 0 && error != ENXIO) return (error); if ((pmp->pm_flags & MSDOSFSMNT_RONLY) == 0) { error = markvoldirty(pmp, 0); if (error && error != ENXIO) { (void)markvoldirty(pmp, 1); return (error); } } if (pmp->pm_flags & MSDOSFSMNT_KICONV && msdosfs_iconv) { if (pmp->pm_w2u) msdosfs_iconv->close(pmp->pm_w2u); if (pmp->pm_u2w) msdosfs_iconv->close(pmp->pm_u2w); if (pmp->pm_d2u) msdosfs_iconv->close(pmp->pm_d2u); if (pmp->pm_u2d) msdosfs_iconv->close(pmp->pm_u2d); } #ifdef MSDOSFS_DEBUG { struct vnode *vp = pmp->pm_devvp; struct bufobj *bo; bo = &vp->v_bufobj; BO_LOCK(bo); VI_LOCK(vp); vn_printf(vp, "msdosfs_umount(): just before calling VOP_CLOSE()\n"); printf("freef %p, freeb %p, mount %p\n", TAILQ_NEXT(vp, v_actfreelist), vp->v_actfreelist.tqe_prev, vp->v_mount); printf("cleanblkhd %p, dirtyblkhd %p, numoutput %ld, type %d\n", TAILQ_FIRST(&vp->v_bufobj.bo_clean.bv_hd), TAILQ_FIRST(&vp->v_bufobj.bo_dirty.bv_hd), vp->v_bufobj.bo_numoutput, vp->v_type); VI_UNLOCK(vp); BO_UNLOCK(bo); } #endif DROP_GIANT(); if (pmp->pm_devvp->v_type == VCHR && pmp->pm_devvp->v_rdev != NULL) pmp->pm_devvp->v_rdev->si_mountpt = NULL; g_topology_lock(); g_vfs_close(pmp->pm_cp); g_topology_unlock(); PICKUP_GIANT(); vrele(pmp->pm_devvp); dev_rel(pmp->pm_dev); free(pmp->pm_inusemap, M_MSDOSFSFAT); if (pmp->pm_flags & MSDOSFS_LARGEFS) msdosfs_fileno_free(mp); lockdestroy(&pmp->pm_fatlock); free(pmp, M_MSDOSFSMNT); mp->mnt_data = NULL; MNT_ILOCK(mp); mp->mnt_flag &= ~MNT_LOCAL; MNT_IUNLOCK(mp); return (error); } static int msdosfs_root(struct mount *mp, int flags, struct vnode **vpp) { struct msdosfsmount *pmp = VFSTOMSDOSFS(mp); struct denode *ndep; int error; #ifdef MSDOSFS_DEBUG printf("msdosfs_root(); mp %p, pmp %p\n", mp, pmp); #endif error = deget(pmp, MSDOSFSROOT, MSDOSFSROOT_OFS, &ndep); if (error) return (error); *vpp = DETOV(ndep); return (0); } static int msdosfs_statfs(struct mount *mp, struct statfs *sbp) { struct msdosfsmount *pmp; pmp = VFSTOMSDOSFS(mp); sbp->f_bsize = pmp->pm_bpcluster; sbp->f_iosize = pmp->pm_bpcluster; sbp->f_blocks = pmp->pm_maxcluster + 1; sbp->f_bfree = pmp->pm_freeclustercount; sbp->f_bavail = pmp->pm_freeclustercount; sbp->f_files = pmp->pm_RootDirEnts; /* XXX */ sbp->f_ffree = 0; /* what to put in here? */ return (0); } /* * If we have an FSInfo block, update it. */ static int msdosfs_fsiflush(struct msdosfsmount *pmp, int waitfor) { struct fsinfo *fp; struct buf *bp; int error; MSDOSFS_LOCK_MP(pmp); if (pmp->pm_fsinfo == 0 || (pmp->pm_flags & MSDOSFS_FSIMOD) == 0) { error = 0; goto unlock; } error = bread(pmp->pm_devvp, pmp->pm_fsinfo, pmp->pm_BytesPerSec, NOCRED, &bp); if (error != 0) { brelse(bp); goto unlock; } fp = (struct fsinfo *)bp->b_data; putulong(fp->fsinfree, pmp->pm_freeclustercount); putulong(fp->fsinxtfree, pmp->pm_nxtfree); pmp->pm_flags &= ~MSDOSFS_FSIMOD; if (waitfor == MNT_WAIT) error = bwrite(bp); else bawrite(bp); unlock: MSDOSFS_UNLOCK_MP(pmp); return (error); } static int msdosfs_sync(struct mount *mp, int waitfor) { struct vnode *vp, *nvp; struct thread *td; struct denode *dep; struct msdosfsmount *pmp = VFSTOMSDOSFS(mp); int error, allerror = 0; td = curthread; /* * If we ever switch to not updating all of the fats all the time, * this would be the place to update them from the first one. */ if (pmp->pm_fmod != 0) { if (pmp->pm_flags & MSDOSFSMNT_RONLY) panic("msdosfs_sync: rofs mod"); else { /* update fats here */ } } /* * Write back each (modified) denode. */ loop: MNT_VNODE_FOREACH_ALL(vp, mp, nvp) { if (vp->v_type == VNON) { VI_UNLOCK(vp); continue; } dep = VTODE(vp); if ((dep->de_flag & (DE_ACCESS | DE_CREATE | DE_UPDATE | DE_MODIFIED)) == 0 && (vp->v_bufobj.bo_dirty.bv_cnt == 0 || waitfor == MNT_LAZY)) { VI_UNLOCK(vp); continue; } error = vget(vp, LK_EXCLUSIVE | LK_NOWAIT | LK_INTERLOCK, td); if (error) { if (error == ENOENT) goto loop; continue; } error = VOP_FSYNC(vp, waitfor, td); if (error) allerror = error; VOP_UNLOCK(vp, 0); vrele(vp); } /* * Flush filesystem control info. */ if (waitfor != MNT_LAZY) { vn_lock(pmp->pm_devvp, LK_EXCLUSIVE | LK_RETRY); error = VOP_FSYNC(pmp->pm_devvp, waitfor, td); if (error) allerror = error; VOP_UNLOCK(pmp->pm_devvp, 0); } error = msdosfs_fsiflush(pmp, waitfor); if (error != 0) allerror = error; return (allerror); } static int msdosfs_fhtovp(struct mount *mp, struct fid *fhp, int flags, struct vnode **vpp) { struct msdosfsmount *pmp = VFSTOMSDOSFS(mp); struct defid *defhp = (struct defid *) fhp; struct denode *dep; int error; error = deget(pmp, defhp->defid_dirclust, defhp->defid_dirofs, &dep); if (error) { *vpp = NULLVP; return (error); } *vpp = DETOV(dep); vnode_create_vobject(*vpp, dep->de_FileSize, curthread); return (0); } static struct vfsops msdosfs_vfsops = { .vfs_fhtovp = msdosfs_fhtovp, .vfs_mount = msdosfs_mount, .vfs_cmount = msdosfs_cmount, .vfs_root = msdosfs_root, .vfs_statfs = msdosfs_statfs, .vfs_sync = msdosfs_sync, .vfs_unmount = msdosfs_unmount, }; VFS_SET(msdosfs_vfsops, msdosfs, 0); MODULE_VERSION(msdosfs, 1); Index: user/ngie/more-tests/sys/fs/nandfs/nandfs_vfsops.c =================================================================== --- user/ngie/more-tests/sys/fs/nandfs/nandfs_vfsops.c (revision 281584) +++ user/ngie/more-tests/sys/fs/nandfs/nandfs_vfsops.c (revision 281585) @@ -1,1597 +1,1598 @@ /*- * Copyright (c) 2010-2012 Semihalf * Copyright (c) 2008, 2009 Reinoud Zandijk * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. * * From: NetBSD: nilfs_vfsops.c,v 1.1 2009/07/18 16:31:42 reinoud Exp */ #include __FBSDID("$FreeBSD$"); #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include static MALLOC_DEFINE(M_NANDFSMNT, "nandfs_mount", "NANDFS mount structure"); #define NANDFS_SET_SYSTEMFILE(vp) { \ (vp)->v_vflag |= VV_SYSTEM; \ vref(vp); \ vput(vp); } #define NANDFS_UNSET_SYSTEMFILE(vp) { \ VOP_LOCK(vp, LK_EXCLUSIVE); \ MPASS(vp->v_bufobj.bo_dirty.bv_cnt == 0); \ (vp)->v_vflag &= ~VV_SYSTEM; \ vgone(vp); \ vput(vp); } /* Globals */ struct _nandfs_devices nandfs_devices; /* Parameters */ int nandfs_verbose = 0; static void nandfs_tunable_init(void *arg) { TUNABLE_INT_FETCH("vfs.nandfs.verbose", &nandfs_verbose); } SYSINIT(nandfs_tunables, SI_SUB_VFS, SI_ORDER_ANY, nandfs_tunable_init, NULL); static SYSCTL_NODE(_vfs, OID_AUTO, nandfs, CTLFLAG_RD, 0, "NAND filesystem"); static SYSCTL_NODE(_vfs_nandfs, OID_AUTO, mount, CTLFLAG_RD, 0, "NANDFS mountpoints"); SYSCTL_INT(_vfs_nandfs, OID_AUTO, verbose, CTLFLAG_RW, &nandfs_verbose, 0, ""); #define NANDFS_CONSTR_INTERVAL 5 int nandfs_sync_interval = NANDFS_CONSTR_INTERVAL; /* sync every 5 seconds */ SYSCTL_UINT(_vfs_nandfs, OID_AUTO, sync_interval, CTLFLAG_RW, &nandfs_sync_interval, 0, ""); #define NANDFS_MAX_DIRTY_SEGS 5 int nandfs_max_dirty_segs = NANDFS_MAX_DIRTY_SEGS; /* sync when 5 dirty seg */ SYSCTL_UINT(_vfs_nandfs, OID_AUTO, max_dirty_segs, CTLFLAG_RW, &nandfs_max_dirty_segs, 0, ""); #define NANDFS_CPS_BETWEEN_SBLOCKS 5 int nandfs_cps_between_sblocks = NANDFS_CPS_BETWEEN_SBLOCKS; /* write superblock every 5 checkpoints */ SYSCTL_UINT(_vfs_nandfs, OID_AUTO, cps_between_sblocks, CTLFLAG_RW, &nandfs_cps_between_sblocks, 0, ""); #define NANDFS_CLEANER_ENABLE 1 int nandfs_cleaner_enable = NANDFS_CLEANER_ENABLE; SYSCTL_UINT(_vfs_nandfs, OID_AUTO, cleaner_enable, CTLFLAG_RW, &nandfs_cleaner_enable, 0, ""); #define NANDFS_CLEANER_INTERVAL 5 int nandfs_cleaner_interval = NANDFS_CLEANER_INTERVAL; SYSCTL_UINT(_vfs_nandfs, OID_AUTO, cleaner_interval, CTLFLAG_RW, &nandfs_cleaner_interval, 0, ""); #define NANDFS_CLEANER_SEGMENTS 5 int nandfs_cleaner_segments = NANDFS_CLEANER_SEGMENTS; SYSCTL_UINT(_vfs_nandfs, OID_AUTO, cleaner_segments, CTLFLAG_RW, &nandfs_cleaner_segments, 0, ""); static int nandfs_mountfs(struct vnode *devvp, struct mount *mp); static vfs_mount_t nandfs_mount; static vfs_root_t nandfs_root; static vfs_statfs_t nandfs_statfs; static vfs_unmount_t nandfs_unmount; static vfs_vget_t nandfs_vget; static vfs_sync_t nandfs_sync; static const char *nandfs_opts[] = { "snap", "from", "noatime", NULL }; /* System nodes */ static int nandfs_create_system_nodes(struct nandfs_device *nandfsdev) { int error; error = nandfs_get_node_raw(nandfsdev, NULL, NANDFS_DAT_INO, &nandfsdev->nd_super_root.sr_dat, &nandfsdev->nd_dat_node); if (error) goto errorout; error = nandfs_get_node_raw(nandfsdev, NULL, NANDFS_CPFILE_INO, &nandfsdev->nd_super_root.sr_cpfile, &nandfsdev->nd_cp_node); if (error) goto errorout; error = nandfs_get_node_raw(nandfsdev, NULL, NANDFS_SUFILE_INO, &nandfsdev->nd_super_root.sr_sufile, &nandfsdev->nd_su_node); if (error) goto errorout; error = nandfs_get_node_raw(nandfsdev, NULL, NANDFS_GC_INO, NULL, &nandfsdev->nd_gc_node); if (error) goto errorout; NANDFS_SET_SYSTEMFILE(NTOV(nandfsdev->nd_dat_node)); NANDFS_SET_SYSTEMFILE(NTOV(nandfsdev->nd_cp_node)); NANDFS_SET_SYSTEMFILE(NTOV(nandfsdev->nd_su_node)); NANDFS_SET_SYSTEMFILE(NTOV(nandfsdev->nd_gc_node)); DPRINTF(VOLUMES, ("System vnodes: dat: %p cp: %p su: %p\n", NTOV(nandfsdev->nd_dat_node), NTOV(nandfsdev->nd_cp_node), NTOV(nandfsdev->nd_su_node))); return (0); errorout: nandfs_dispose_node(&nandfsdev->nd_gc_node); nandfs_dispose_node(&nandfsdev->nd_dat_node); nandfs_dispose_node(&nandfsdev->nd_cp_node); nandfs_dispose_node(&nandfsdev->nd_su_node); return (error); } static void nandfs_release_system_nodes(struct nandfs_device *nandfsdev) { if (!nandfsdev) return; if (nandfsdev->nd_refcnt > 0) return; if (nandfsdev->nd_gc_node) NANDFS_UNSET_SYSTEMFILE(NTOV(nandfsdev->nd_gc_node)); if (nandfsdev->nd_dat_node) NANDFS_UNSET_SYSTEMFILE(NTOV(nandfsdev->nd_dat_node)); if (nandfsdev->nd_cp_node) NANDFS_UNSET_SYSTEMFILE(NTOV(nandfsdev->nd_cp_node)); if (nandfsdev->nd_su_node) NANDFS_UNSET_SYSTEMFILE(NTOV(nandfsdev->nd_su_node)); } static int nandfs_check_fsdata_crc(struct nandfs_fsdata *fsdata) { uint32_t fsdata_crc, comp_crc; if (fsdata->f_magic != NANDFS_FSDATA_MAGIC) return (0); /* Preserve CRC */ fsdata_crc = fsdata->f_sum; /* Calculate */ fsdata->f_sum = (0); comp_crc = crc32((uint8_t *)fsdata, fsdata->f_bytes); /* Restore */ fsdata->f_sum = fsdata_crc; /* Check CRC */ return (fsdata_crc == comp_crc); } static int nandfs_check_superblock_crc(struct nandfs_fsdata *fsdata, struct nandfs_super_block *super) { uint32_t super_crc, comp_crc; /* Check super block magic */ if (super->s_magic != NANDFS_SUPER_MAGIC) return (0); /* Preserve CRC */ super_crc = super->s_sum; /* Calculate */ super->s_sum = (0); comp_crc = crc32((uint8_t *)super, fsdata->f_sbbytes); /* Restore */ super->s_sum = super_crc; /* Check CRC */ return (super_crc == comp_crc); } static void nandfs_calc_superblock_crc(struct nandfs_fsdata *fsdata, struct nandfs_super_block *super) { uint32_t comp_crc; /* Calculate */ super->s_sum = 0; comp_crc = crc32((uint8_t *)super, fsdata->f_sbbytes); /* Restore */ super->s_sum = comp_crc; } static int nandfs_is_empty(u_char *area, int size) { int i; for (i = 0; i < size; i++) if (area[i] != 0xff) return (0); return (1); } static __inline int nandfs_sblocks_in_esize(struct nandfs_device *fsdev) { return ((fsdev->nd_erasesize - NANDFS_SBLOCK_OFFSET_BYTES) / sizeof(struct nandfs_super_block)); } static __inline int nandfs_max_sblocks(struct nandfs_device *fsdev) { return (NANDFS_NFSAREAS * nandfs_sblocks_in_esize(fsdev)); } static __inline int nandfs_sblocks_in_block(struct nandfs_device *fsdev) { return (fsdev->nd_devblocksize / sizeof(struct nandfs_super_block)); } #if 0 static __inline int nandfs_sblocks_in_first_block(struct nandfs_device *fsdev) { int n; n = nandfs_sblocks_in_block(fsdev) - NANDFS_SBLOCK_OFFSET_BYTES / sizeof(struct nandfs_super_block); if (n < 0) n = 0; return (n); } #endif static int nandfs_write_superblock_at(struct nandfs_device *fsdev, struct nandfs_fsarea *fstp) { struct nandfs_super_block *super, *supert; struct buf *bp; int sb_per_sector, sbs_in_fsd, read_block; int index, pos, error; off_t offset; DPRINTF(SYNC, ("%s: last_used %d nandfs_sblocks_in_esize %d\n", __func__, fstp->last_used, nandfs_sblocks_in_esize(fsdev))); if (fstp->last_used == nandfs_sblocks_in_esize(fsdev) - 1) index = 0; else index = fstp->last_used + 1; super = &fsdev->nd_super; supert = NULL; sb_per_sector = nandfs_sblocks_in_block(fsdev); sbs_in_fsd = sizeof(struct nandfs_fsdata) / sizeof(struct nandfs_super_block); index += sbs_in_fsd; offset = fstp->offset; DPRINTF(SYNC, ("%s: offset %#jx s_last_pseg %#jx s_last_cno %#jx " "s_last_seq %#jx wtime %jd index %d\n", __func__, offset, super->s_last_pseg, super->s_last_cno, super->s_last_seq, super->s_wtime, index)); read_block = btodb(offset + ((index / sb_per_sector) * sb_per_sector) * sizeof(struct nandfs_super_block)); DPRINTF(SYNC, ("%s: read_block %#x\n", __func__, read_block)); if (index == sbs_in_fsd) { error = nandfs_erase(fsdev, offset, fsdev->nd_erasesize); if (error) return (error); error = bread(fsdev->nd_devvp, btodb(offset), fsdev->nd_devblocksize, NOCRED, &bp); if (error) { printf("NANDFS: couldn't read initial data: %d\n", error); brelse(bp); return (error); } memcpy(bp->b_data, &fsdev->nd_fsdata, sizeof(fsdev->nd_fsdata)); /* * 0xff-out the rest. This bp could be cached, so potentially * b_data contains stale super blocks. * * We don't mind cached bp since most of the time we just add * super blocks to already 0xff-out b_data and don't need to * perform actual read. */ if (fsdev->nd_devblocksize > sizeof(fsdev->nd_fsdata)) memset(bp->b_data + sizeof(fsdev->nd_fsdata), 0xff, fsdev->nd_devblocksize - sizeof(fsdev->nd_fsdata)); error = bwrite(bp); if (error) { printf("NANDFS: cannot rewrite initial data at %jx\n", offset); return (error); } } error = bread(fsdev->nd_devvp, read_block, fsdev->nd_devblocksize, NOCRED, &bp); if (error) { brelse(bp); return (error); } supert = (struct nandfs_super_block *)(bp->b_data); pos = index % sb_per_sector; DPRINTF(SYNC, ("%s: storing at %d\n", __func__, pos)); memcpy(&supert[pos], super, sizeof(struct nandfs_super_block)); /* * See comment above in code that performs erase. */ if (pos == 0) memset(&supert[1], 0xff, (sb_per_sector - 1) * sizeof(struct nandfs_super_block)); error = bwrite(bp); if (error) { printf("NANDFS: cannot update superblock at %jx\n", offset); return (error); } DPRINTF(SYNC, ("%s: fstp->last_used %d -> %d\n", __func__, fstp->last_used, index - sbs_in_fsd)); fstp->last_used = index - sbs_in_fsd; return (0); } int nandfs_write_superblock(struct nandfs_device *fsdev) { struct nandfs_super_block *super; struct timespec ts; int error; int i, j; vfs_timestamp(&ts); super = &fsdev->nd_super; super->s_last_pseg = fsdev->nd_last_pseg; super->s_last_cno = fsdev->nd_last_cno; super->s_last_seq = fsdev->nd_seg_sequence; super->s_wtime = ts.tv_sec; nandfs_calc_superblock_crc(&fsdev->nd_fsdata, super); error = 0; for (i = 0, j = fsdev->nd_last_fsarea; i < NANDFS_NFSAREAS; i++, j = (j + 1 % NANDFS_NFSAREAS)) { if (fsdev->nd_fsarea[j].flags & NANDFS_FSSTOR_FAILED) { DPRINTF(SYNC, ("%s: skipping %d\n", __func__, j)); continue; } error = nandfs_write_superblock_at(fsdev, &fsdev->nd_fsarea[j]); if (error) { printf("NANDFS: writing superblock at offset %d failed:" "%d\n", j * fsdev->nd_erasesize, error); fsdev->nd_fsarea[j].flags |= NANDFS_FSSTOR_FAILED; } else break; } if (i == NANDFS_NFSAREAS) { printf("NANDFS: superblock was not written\n"); /* * TODO: switch to read-only? */ return (error); } else fsdev->nd_last_fsarea = (j + 1) % NANDFS_NFSAREAS; return (0); } static int nandfs_select_fsdata(struct nandfs_device *fsdev, struct nandfs_fsdata *fsdatat, struct nandfs_fsdata **fsdata, int nfsds) { int i; *fsdata = NULL; for (i = 0; i < nfsds; i++) { DPRINTF(VOLUMES, ("%s: i %d f_magic %x f_crc %x\n", __func__, i, fsdatat[i].f_magic, fsdatat[i].f_sum)); if (!nandfs_check_fsdata_crc(&fsdatat[i])) continue; *fsdata = &fsdatat[i]; break; } return (*fsdata != NULL ? 0 : EINVAL); } static int nandfs_select_sb(struct nandfs_device *fsdev, struct nandfs_super_block *supert, struct nandfs_super_block **super, int nsbs) { int i; *super = NULL; for (i = 0; i < nsbs; i++) { if (!nandfs_check_superblock_crc(&fsdev->nd_fsdata, &supert[i])) continue; DPRINTF(SYNC, ("%s: i %d s_last_cno %jx s_magic %x " "s_wtime %jd\n", __func__, i, supert[i].s_last_cno, supert[i].s_magic, supert[i].s_wtime)); if (*super == NULL || supert[i].s_last_cno > (*super)->s_last_cno) *super = &supert[i]; } return (*super != NULL ? 0 : EINVAL); } static int nandfs_read_structures_at(struct nandfs_device *fsdev, struct nandfs_fsarea *fstp, struct nandfs_fsdata *fsdata, struct nandfs_super_block *super) { struct nandfs_super_block *tsuper, *tsuperd; struct buf *bp; int error, read_size; int i; int offset; offset = fstp->offset; if (fsdev->nd_erasesize > MAXBSIZE) read_size = MAXBSIZE; else read_size = fsdev->nd_erasesize; error = bread(fsdev->nd_devvp, btodb(offset), read_size, NOCRED, &bp); if (error) { printf("couldn't read: %d\n", error); brelse(bp); fstp->flags |= NANDFS_FSSTOR_FAILED; return (error); } tsuper = super; memcpy(fsdata, bp->b_data, sizeof(struct nandfs_fsdata)); memcpy(tsuper, (bp->b_data + sizeof(struct nandfs_fsdata)), read_size - sizeof(struct nandfs_fsdata)); brelse(bp); tsuper += (read_size - sizeof(struct nandfs_fsdata)) / sizeof(struct nandfs_super_block); for (i = 1; i < fsdev->nd_erasesize / read_size; i++) { error = bread(fsdev->nd_devvp, btodb(offset + i * read_size), read_size, NOCRED, &bp); if (error) { printf("couldn't read: %d\n", error); brelse(bp); fstp->flags |= NANDFS_FSSTOR_FAILED; return (error); } memcpy(tsuper, bp->b_data, read_size); tsuper += read_size / sizeof(struct nandfs_super_block); brelse(bp); } tsuper -= 1; fstp->last_used = nandfs_sblocks_in_esize(fsdev) - 1; for (tsuperd = super - 1; (tsuper != tsuperd); tsuper -= 1) { if (nandfs_is_empty((u_char *)tsuper, sizeof(*tsuper))) fstp->last_used--; else break; } DPRINTF(VOLUMES, ("%s: last_used %d\n", __func__, fstp->last_used)); return (0); } static int nandfs_read_structures(struct nandfs_device *fsdev) { struct nandfs_fsdata *fsdata, *fsdatat; struct nandfs_super_block *sblocks, *ssblock; int nsbs, nfsds, i; int error = 0; int nrsbs; nfsds = NANDFS_NFSAREAS; nsbs = nandfs_max_sblocks(fsdev); fsdatat = malloc(sizeof(struct nandfs_fsdata) * nfsds, M_NANDFSTEMP, M_WAITOK | M_ZERO); sblocks = malloc(sizeof(struct nandfs_super_block) * nsbs, M_NANDFSTEMP, M_WAITOK | M_ZERO); nrsbs = 0; for (i = 0; i < NANDFS_NFSAREAS; i++) { fsdev->nd_fsarea[i].offset = i * fsdev->nd_erasesize; error = nandfs_read_structures_at(fsdev, &fsdev->nd_fsarea[i], &fsdatat[i], sblocks + nrsbs); if (error) continue; nrsbs += (fsdev->nd_fsarea[i].last_used + 1); if (fsdev->nd_fsarea[fsdev->nd_last_fsarea].last_used > fsdev->nd_fsarea[i].last_used) fsdev->nd_last_fsarea = i; } if (nrsbs == 0) { printf("nandfs: no valid superblocks found\n"); error = EINVAL; goto out; } error = nandfs_select_fsdata(fsdev, fsdatat, &fsdata, nfsds); if (error) goto out; memcpy(&fsdev->nd_fsdata, fsdata, sizeof(struct nandfs_fsdata)); error = nandfs_select_sb(fsdev, sblocks, &ssblock, nsbs); if (error) goto out; memcpy(&fsdev->nd_super, ssblock, sizeof(struct nandfs_super_block)); out: free(fsdatat, M_NANDFSTEMP); free(sblocks, M_NANDFSTEMP); if (error == 0) DPRINTF(VOLUMES, ("%s: selected sb with w_time %jd " "last_pseg %#jx\n", __func__, fsdev->nd_super.s_wtime, fsdev->nd_super.s_last_pseg)); return (error); } static void nandfs_unmount_base(struct nandfs_device *nandfsdev) { int error; if (!nandfsdev) return; /* Remove all our information */ error = vinvalbuf(nandfsdev->nd_devvp, V_SAVE, 0, 0); if (error) { /* * Flushing buffers failed when fs was umounting, can't do * much now, just printf error and continue with umount. */ nandfs_error("%s(): error:%d when umounting FS\n", __func__, error); } /* Release the device's system nodes */ nandfs_release_system_nodes(nandfsdev); } static void nandfs_get_ncleanseg(struct nandfs_device *nandfsdev) { struct nandfs_seg_stat nss; nandfs_get_seg_stat(nandfsdev, &nss); nandfsdev->nd_clean_segs = nss.nss_ncleansegs; DPRINTF(VOLUMES, ("nandfs_mount: clean segs: %jx\n", (uintmax_t)nandfsdev->nd_clean_segs)); } static int nandfs_mount_base(struct nandfs_device *nandfsdev, struct mount *mp, struct nandfs_args *args) { uint32_t log_blocksize; int error; /* Flush out any old buffers remaining from a previous use. */ if ((error = vinvalbuf(nandfsdev->nd_devvp, V_SAVE, 0, 0))) return (error); error = nandfs_read_structures(nandfsdev); if (error) { printf("nandfs: could not get valid filesystem structures\n"); return (error); } if (nandfsdev->nd_fsdata.f_rev_level != NANDFS_CURRENT_REV) { printf("nandfs: unsupported file system revision: %d " "(supported is %d).\n", nandfsdev->nd_fsdata.f_rev_level, NANDFS_CURRENT_REV); return (EINVAL); } if (nandfsdev->nd_fsdata.f_erasesize != nandfsdev->nd_erasesize) { printf("nandfs: erasesize mismatch (device %#x, fs %#x)\n", nandfsdev->nd_erasesize, nandfsdev->nd_fsdata.f_erasesize); return (EINVAL); } /* Get our blocksize */ log_blocksize = nandfsdev->nd_fsdata.f_log_block_size; nandfsdev->nd_blocksize = (uint64_t) 1 << (log_blocksize + 10); DPRINTF(VOLUMES, ("%s: blocksize:%x\n", __func__, nandfsdev->nd_blocksize)); DPRINTF(VOLUMES, ("%s: accepted super block with cp %#jx\n", __func__, (uintmax_t)nandfsdev->nd_super.s_last_cno)); /* Calculate dat structure parameters */ nandfs_calc_mdt_consts(nandfsdev, &nandfsdev->nd_dat_mdt, nandfsdev->nd_fsdata.f_dat_entry_size); nandfs_calc_mdt_consts(nandfsdev, &nandfsdev->nd_ifile_mdt, nandfsdev->nd_fsdata.f_inode_size); /* Search for the super root and roll forward when needed */ if (nandfs_search_super_root(nandfsdev)) { printf("Cannot find valid SuperRoot\n"); return (EINVAL); } nandfsdev->nd_mount_state = nandfsdev->nd_super.s_state; if (nandfsdev->nd_mount_state != NANDFS_VALID_FS) { printf("FS is seriously damaged, needs repairing\n"); printf("aborting mount\n"); return (EINVAL); } /* * FS should be ok now. The superblock and the last segsum could be * updated from the repair so extract running values again. */ nandfsdev->nd_last_pseg = nandfsdev->nd_super.s_last_pseg; nandfsdev->nd_seg_sequence = nandfsdev->nd_super.s_last_seq; nandfsdev->nd_seg_num = nandfs_get_segnum_of_block(nandfsdev, nandfsdev->nd_last_pseg); nandfsdev->nd_next_seg_num = nandfs_get_segnum_of_block(nandfsdev, nandfsdev->nd_last_segsum.ss_next); nandfsdev->nd_ts.tv_sec = nandfsdev->nd_last_segsum.ss_create; nandfsdev->nd_last_cno = nandfsdev->nd_super.s_last_cno; nandfsdev->nd_fakevblk = 1; /* * FIXME: bogus calculation. Should use actual number of usable segments * instead of total amount. */ nandfsdev->nd_segs_reserved = nandfsdev->nd_fsdata.f_nsegments * nandfsdev->nd_fsdata.f_r_segments_percentage / 100; nandfsdev->nd_last_ino = NANDFS_USER_INO; DPRINTF(VOLUMES, ("%s: last_pseg %#jx last_cno %#jx last_seq %#jx\n" "fsdev: last_seg: seq %#jx num %#jx, next_seg_num %#jx " "segs_reserved %#jx\n", __func__, (uintmax_t)nandfsdev->nd_last_pseg, (uintmax_t)nandfsdev->nd_last_cno, (uintmax_t)nandfsdev->nd_seg_sequence, (uintmax_t)nandfsdev->nd_seg_sequence, (uintmax_t)nandfsdev->nd_seg_num, (uintmax_t)nandfsdev->nd_next_seg_num, (uintmax_t)nandfsdev->nd_segs_reserved)); DPRINTF(VOLUMES, ("nandfs_mount: accepted super root\n")); /* Create system vnodes for DAT, CP and SEGSUM */ error = nandfs_create_system_nodes(nandfsdev); if (error) nandfs_unmount_base(nandfsdev); nandfs_get_ncleanseg(nandfsdev); return (error); } static void nandfs_unmount_device(struct nandfs_device *nandfsdev) { /* Is there anything? */ if (nandfsdev == NULL) return; /* Remove the device only if we're the last reference */ nandfsdev->nd_refcnt--; if (nandfsdev->nd_refcnt >= 1) return; MPASS(nandfsdev->nd_syncer == NULL); MPASS(nandfsdev->nd_cleaner == NULL); MPASS(nandfsdev->nd_free_base == NULL); /* Unmount our base */ nandfs_unmount_base(nandfsdev); /* Remove from our device list */ SLIST_REMOVE(&nandfs_devices, nandfsdev, nandfs_device, nd_next_device); DROP_GIANT(); g_topology_lock(); g_vfs_close(nandfsdev->nd_gconsumer); g_topology_unlock(); PICKUP_GIANT(); DPRINTF(VOLUMES, ("closing device\n")); /* Clear our mount reference and release device node */ vrele(nandfsdev->nd_devvp); dev_rel(nandfsdev->nd_devvp->v_rdev); /* Free our device info */ cv_destroy(&nandfsdev->nd_sync_cv); mtx_destroy(&nandfsdev->nd_sync_mtx); cv_destroy(&nandfsdev->nd_clean_cv); mtx_destroy(&nandfsdev->nd_clean_mtx); mtx_destroy(&nandfsdev->nd_mutex); lockdestroy(&nandfsdev->nd_seg_const); free(nandfsdev, M_NANDFSMNT); } static int nandfs_check_mounts(struct nandfs_device *nandfsdev, struct mount *mp, struct nandfs_args *args) { struct nandfsmount *nmp; uint64_t last_cno; /* no double-mounting of the same checkpoint */ STAILQ_FOREACH(nmp, &nandfsdev->nd_mounts, nm_next_mount) { if (nmp->nm_mount_args.cpno == args->cpno) return (EBUSY); } /* Allow readonly mounts without questioning here */ if (mp->mnt_flag & MNT_RDONLY) return (0); /* Read/write mount */ STAILQ_FOREACH(nmp, &nandfsdev->nd_mounts, nm_next_mount) { /* Only one RW mount on this device! */ if ((nmp->nm_vfs_mountp->mnt_flag & MNT_RDONLY)==0) return (EROFS); /* RDONLY on last mountpoint is device busy */ last_cno = nmp->nm_nandfsdev->nd_super.s_last_cno; if (nmp->nm_mount_args.cpno == last_cno) return (EBUSY); } /* OK for now */ return (0); } static int nandfs_mount_device(struct vnode *devvp, struct mount *mp, struct nandfs_args *args, struct nandfs_device **nandfsdev_p) { struct nandfs_device *nandfsdev; struct g_provider *pp; struct g_consumer *cp; struct cdev *dev; uint32_t erasesize; int error, size; int ronly; DPRINTF(VOLUMES, ("Mounting NANDFS device\n")); ronly = (mp->mnt_flag & MNT_RDONLY) != 0; /* Look up device in our nandfs_mountpoints */ *nandfsdev_p = NULL; SLIST_FOREACH(nandfsdev, &nandfs_devices, nd_next_device) if (nandfsdev->nd_devvp == devvp) break; if (nandfsdev) { DPRINTF(VOLUMES, ("device already mounted\n")); error = nandfs_check_mounts(nandfsdev, mp, args); if (error) return error; nandfsdev->nd_refcnt++; *nandfsdev_p = nandfsdev; if (!ronly) { DROP_GIANT(); g_topology_lock(); error = g_access(nandfsdev->nd_gconsumer, 0, 1, 0); g_topology_unlock(); PICKUP_GIANT(); } return (error); } vn_lock(devvp, LK_EXCLUSIVE | LK_RETRY); dev = devvp->v_rdev; dev_ref(dev); DROP_GIANT(); g_topology_lock(); error = g_vfs_open(devvp, &cp, "nandfs", ronly ? 0 : 1); pp = g_dev_getprovider(dev); g_topology_unlock(); PICKUP_GIANT(); VOP_UNLOCK(devvp, 0); if (error) { dev_rel(dev); return (error); } nandfsdev = malloc(sizeof(struct nandfs_device), M_NANDFSMNT, M_WAITOK | M_ZERO); /* Initialise */ nandfsdev->nd_refcnt = 1; nandfsdev->nd_devvp = devvp; nandfsdev->nd_syncing = 0; nandfsdev->nd_cleaning = 0; nandfsdev->nd_gconsumer = cp; cv_init(&nandfsdev->nd_sync_cv, "nandfssync"); mtx_init(&nandfsdev->nd_sync_mtx, "nffssyncmtx", NULL, MTX_DEF); cv_init(&nandfsdev->nd_clean_cv, "nandfsclean"); mtx_init(&nandfsdev->nd_clean_mtx, "nffscleanmtx", NULL, MTX_DEF); mtx_init(&nandfsdev->nd_mutex, "nandfsdev lock", NULL, MTX_DEF); lockinit(&nandfsdev->nd_seg_const, PVFS, "nffssegcon", VLKTIMEOUT, LK_CANRECURSE); STAILQ_INIT(&nandfsdev->nd_mounts); nandfsdev->nd_devsize = pp->mediasize; nandfsdev->nd_devblocksize = pp->sectorsize; size = sizeof(erasesize); error = g_io_getattr("NAND::blocksize", nandfsdev->nd_gconsumer, &size, &erasesize); if (error) { DPRINTF(VOLUMES, ("couldn't get erasesize: %d\n", error)); if (error == ENOIOCTL || error == EOPNOTSUPP) { /* * We conclude that this is not NAND storage */ erasesize = NANDFS_DEF_ERASESIZE; } else { DROP_GIANT(); g_topology_lock(); g_vfs_close(nandfsdev->nd_gconsumer); g_topology_unlock(); PICKUP_GIANT(); dev_rel(dev); free(nandfsdev, M_NANDFSMNT); return (error); } } nandfsdev->nd_erasesize = erasesize; DPRINTF(VOLUMES, ("%s: erasesize %x\n", __func__, nandfsdev->nd_erasesize)); /* Register nandfs_device in list */ SLIST_INSERT_HEAD(&nandfs_devices, nandfsdev, nd_next_device); error = nandfs_mount_base(nandfsdev, mp, args); if (error) { /* Remove all our information */ nandfs_unmount_device(nandfsdev); return (EINVAL); } nandfsdev->nd_maxfilesize = nandfs_get_maxfilesize(nandfsdev); *nandfsdev_p = nandfsdev; DPRINTF(VOLUMES, ("NANDFS device mounted ok\n")); return (0); } static int nandfs_mount_checkpoint(struct nandfsmount *nmp) { struct nandfs_cpfile_header *cphdr; struct nandfs_checkpoint *cp; struct nandfs_inode ifile_inode; struct nandfs_node *cp_node; struct buf *bp; uint64_t ncp, nsn, cpno, fcpno, blocknr, last_cno; uint32_t off, dlen; int cp_per_block, error; cpno = nmp->nm_mount_args.cpno; if (cpno == 0) cpno = nmp->nm_nandfsdev->nd_super.s_last_cno; DPRINTF(VOLUMES, ("%s: trying to mount checkpoint number %"PRIu64"\n", __func__, cpno)); cp_node = nmp->nm_nandfsdev->nd_cp_node; VOP_LOCK(NTOV(cp_node), LK_SHARED); /* Get cpfile header from 1st block of cp file */ error = nandfs_bread(cp_node, 0, NOCRED, 0, &bp); if (error) { brelse(bp); VOP_UNLOCK(NTOV(cp_node), 0); return (error); } cphdr = (struct nandfs_cpfile_header *) bp->b_data; ncp = cphdr->ch_ncheckpoints; nsn = cphdr->ch_nsnapshots; brelse(bp); DPRINTF(VOLUMES, ("mount_nandfs: checkpoint header read in\n")); DPRINTF(VOLUMES, ("\tNumber of checkpoints %"PRIu64"\n", ncp)); DPRINTF(VOLUMES, ("\tNumber of snapshots %"PRIu64"\n", nsn)); /* Read in our specified checkpoint */ dlen = nmp->nm_nandfsdev->nd_fsdata.f_checkpoint_size; cp_per_block = nmp->nm_nandfsdev->nd_blocksize / dlen; fcpno = cpno + NANDFS_CPFILE_FIRST_CHECKPOINT_OFFSET - 1; blocknr = fcpno / cp_per_block; off = (fcpno % cp_per_block) * dlen; error = nandfs_bread(cp_node, blocknr, NOCRED, 0, &bp); if (error) { brelse(bp); VOP_UNLOCK(NTOV(cp_node), 0); printf("mount_nandfs: couldn't read cp block %"PRIu64"\n", fcpno); return (EINVAL); } /* Needs to be a valid checkpoint */ cp = (struct nandfs_checkpoint *) ((uint8_t *) bp->b_data + off); if (cp->cp_flags & NANDFS_CHECKPOINT_INVALID) { printf("mount_nandfs: checkpoint marked invalid\n"); brelse(bp); VOP_UNLOCK(NTOV(cp_node), 0); return (EINVAL); } /* Is this really the checkpoint we want? */ if (cp->cp_cno != cpno) { printf("mount_nandfs: checkpoint file corrupt? " "expected cpno %"PRIu64", found cpno %"PRIu64"\n", cpno, cp->cp_cno); brelse(bp); VOP_UNLOCK(NTOV(cp_node), 0); return (EINVAL); } /* Check if it's a snapshot ! */ last_cno = nmp->nm_nandfsdev->nd_super.s_last_cno; if (cpno != last_cno) { /* Only allow snapshots if not mounting on the last cp */ if ((cp->cp_flags & NANDFS_CHECKPOINT_SNAPSHOT) == 0) { printf( "mount_nandfs: checkpoint %"PRIu64" is not a " "snapshot\n", cpno); brelse(bp); VOP_UNLOCK(NTOV(cp_node), 0); return (EINVAL); } } ifile_inode = cp->cp_ifile_inode; brelse(bp); /* Get ifile inode */ error = nandfs_get_node_raw(nmp->nm_nandfsdev, NULL, NANDFS_IFILE_INO, &ifile_inode, &nmp->nm_ifile_node); if (error) { printf("mount_nandfs: can't read ifile node\n"); VOP_UNLOCK(NTOV(cp_node), 0); return (EINVAL); } NANDFS_SET_SYSTEMFILE(NTOV(nmp->nm_ifile_node)); VOP_UNLOCK(NTOV(cp_node), 0); /* Get root node? */ return (0); } static void free_nandfs_mountinfo(struct mount *mp) { struct nandfsmount *nmp = VFSTONANDFS(mp); if (nmp == NULL) return; free(nmp, M_NANDFSMNT); } void nandfs_wakeup_wait_sync(struct nandfs_device *nffsdev, int reason) { char *reasons[] = { "umount", "vfssync", "bdflush", "fforce", "fsync", "ro_upd" }; DPRINTF(SYNC, ("%s: %s\n", __func__, reasons[reason])); mtx_lock(&nffsdev->nd_sync_mtx); if (nffsdev->nd_syncing) cv_wait(&nffsdev->nd_sync_cv, &nffsdev->nd_sync_mtx); if (reason == SYNCER_UMOUNT) nffsdev->nd_syncer_exit = 1; nffsdev->nd_syncing = 1; wakeup(&nffsdev->nd_syncing); cv_wait(&nffsdev->nd_sync_cv, &nffsdev->nd_sync_mtx); mtx_unlock(&nffsdev->nd_sync_mtx); } static void nandfs_gc_finished(struct nandfs_device *nffsdev, int exit) { int error; mtx_lock(&nffsdev->nd_sync_mtx); nffsdev->nd_syncing = 0; DPRINTF(SYNC, ("%s: cleaner finish\n", __func__)); cv_broadcast(&nffsdev->nd_sync_cv); mtx_unlock(&nffsdev->nd_sync_mtx); if (!exit) { error = tsleep(&nffsdev->nd_syncing, PRIBIO, "-", hz * nandfs_sync_interval); DPRINTF(SYNC, ("%s: cleaner waked up: %d\n", __func__, error)); } } static void nandfs_syncer(struct nandfsmount *nmp) { struct nandfs_device *nffsdev; struct mount *mp; int flags, error; mp = nmp->nm_vfs_mountp; nffsdev = nmp->nm_nandfsdev; tsleep(&nffsdev->nd_syncing, PRIBIO, "-", hz * nandfs_sync_interval); while (!nffsdev->nd_syncer_exit) { DPRINTF(SYNC, ("%s: syncer run\n", __func__)); nffsdev->nd_syncing = 1; flags = (nmp->nm_flags & (NANDFS_FORCE_SYNCER | NANDFS_UMOUNT)); error = nandfs_segment_constructor(nmp, flags); if (error) nandfs_error("%s: error:%d when creating segments\n", __func__, error); nmp->nm_flags &= ~flags; nandfs_gc_finished(nffsdev, 0); } MPASS(nffsdev->nd_cleaner == NULL); error = nandfs_segment_constructor(nmp, NANDFS_FORCE_SYNCER | NANDFS_UMOUNT); if (error) nandfs_error("%s: error:%d when creating segments\n", __func__, error); nandfs_gc_finished(nffsdev, 1); nffsdev->nd_syncer = NULL; MPASS(nffsdev->nd_free_base == NULL); DPRINTF(SYNC, ("%s: exiting\n", __func__)); kthread_exit(); } static int start_syncer(struct nandfsmount *nmp) { int error; MPASS(nmp->nm_nandfsdev->nd_syncer == NULL); DPRINTF(SYNC, ("%s: start syncer\n", __func__)); nmp->nm_nandfsdev->nd_syncer_exit = 0; error = kthread_add((void(*)(void *))nandfs_syncer, nmp, NULL, &nmp->nm_nandfsdev->nd_syncer, 0, 0, "nandfs_syncer"); if (error) printf("nandfs: could not start syncer: %d\n", error); return (error); } static int stop_syncer(struct nandfsmount *nmp) { MPASS(nmp->nm_nandfsdev->nd_syncer != NULL); nandfs_wakeup_wait_sync(nmp->nm_nandfsdev, SYNCER_UMOUNT); DPRINTF(SYNC, ("%s: stop syncer\n", __func__)); return (0); } /* * Mount null layer */ static int nandfs_mount(struct mount *mp) { struct nandfsmount *nmp; struct vnode *devvp; struct nameidata nd; struct vfsoptlist *opts; struct thread *td; char *from; int error = 0, flags; DPRINTF(VOLUMES, ("%s: mp = %p\n", __func__, (void *)mp)); td = curthread; opts = mp->mnt_optnew; if (vfs_filteropt(opts, nandfs_opts)) return (EINVAL); /* * Update is a no-op */ if (mp->mnt_flag & MNT_UPDATE) { nmp = VFSTONANDFS(mp); if (vfs_flagopt(mp->mnt_optnew, "export", NULL, 0)) { return (error); } if (!(nmp->nm_ronly) && vfs_flagopt(opts, "ro", NULL, 0)) { vn_start_write(NULL, &mp, V_WAIT); error = VFS_SYNC(mp, MNT_WAIT); if (error) return (error); vn_finished_write(mp); flags = WRITECLOSE; if (mp->mnt_flag & MNT_FORCE) flags |= FORCECLOSE; nandfs_wakeup_wait_sync(nmp->nm_nandfsdev, SYNCER_ROUPD); error = vflush(mp, 0, flags, td); if (error) return (error); nandfs_stop_cleaner(nmp->nm_nandfsdev); stop_syncer(nmp); DROP_GIANT(); g_topology_lock(); g_access(nmp->nm_nandfsdev->nd_gconsumer, 0, -1, 0); g_topology_unlock(); PICKUP_GIANT(); MNT_ILOCK(mp); mp->mnt_flag |= MNT_RDONLY; MNT_IUNLOCK(mp); nmp->nm_ronly = 1; } else if ((nmp->nm_ronly) && !vfs_flagopt(opts, "ro", NULL, 0)) { /* * Don't allow read-write snapshots. */ if (nmp->nm_mount_args.cpno != 0) return (EROFS); /* * If upgrade to read-write by non-root, then verify * that user has necessary permissions on the device. */ devvp = nmp->nm_nandfsdev->nd_devvp; vn_lock(devvp, LK_EXCLUSIVE | LK_RETRY); error = VOP_ACCESS(devvp, VREAD | VWRITE, td->td_ucred, td); if (error) { error = priv_check(td, PRIV_VFS_MOUNT_PERM); if (error) { VOP_UNLOCK(devvp, 0); return (error); } } VOP_UNLOCK(devvp, 0); DROP_GIANT(); g_topology_lock(); error = g_access(nmp->nm_nandfsdev->nd_gconsumer, 0, 1, 0); g_topology_unlock(); PICKUP_GIANT(); if (error) return (error); MNT_ILOCK(mp); mp->mnt_flag &= ~MNT_RDONLY; MNT_IUNLOCK(mp); error = start_syncer(nmp); if (error == 0) error = nandfs_start_cleaner(nmp->nm_nandfsdev); if (error) { DROP_GIANT(); g_topology_lock(); g_access(nmp->nm_nandfsdev->nd_gconsumer, 0, -1, 0); g_topology_unlock(); PICKUP_GIANT(); return (error); } nmp->nm_ronly = 0; } return (0); } from = vfs_getopts(opts, "from", &error); if (error) return (error); /* * Find device node */ NDINIT(&nd, LOOKUP, FOLLOW|LOCKLEAF, UIO_SYSSPACE, from, curthread); error = namei(&nd); if (error) return (error); NDFREE(&nd, NDF_ONLY_PNBUF); devvp = nd.ni_vp; if (!vn_isdisk(devvp, &error)) { vput(devvp); return (error); } /* Check the access rights on the mount device */ error = VOP_ACCESS(devvp, VREAD, curthread->td_ucred, curthread); if (error) error = priv_check(curthread, PRIV_VFS_MOUNT_PERM); if (error) { vput(devvp); return (error); } vfs_getnewfsid(mp); error = nandfs_mountfs(devvp, mp); if (error) return (error); vfs_mountedfrom(mp, from); return (0); } static int nandfs_mountfs(struct vnode *devvp, struct mount *mp) { struct nandfsmount *nmp = NULL; struct nandfs_args *args = NULL; struct nandfs_device *nandfsdev; char *from; int error, ronly; char *cpno; ronly = (mp->mnt_flag & MNT_RDONLY) != 0; if (devvp->v_rdev->si_iosize_max != 0) mp->mnt_iosize_max = devvp->v_rdev->si_iosize_max; VOP_UNLOCK(devvp, 0); if (mp->mnt_iosize_max > MAXPHYS) mp->mnt_iosize_max = MAXPHYS; from = vfs_getopts(mp->mnt_optnew, "from", &error); if (error) goto error; error = vfs_getopt(mp->mnt_optnew, "snap", (void **)&cpno, NULL); if (error == ENOENT) cpno = NULL; else if (error) goto error; args = (struct nandfs_args *)malloc(sizeof(struct nandfs_args), M_NANDFSMNT, M_WAITOK | M_ZERO); if (cpno != NULL) args->cpno = strtoul(cpno, (char **)NULL, 10); else args->cpno = 0; args->fspec = from; if (args->cpno != 0 && !ronly) { error = EROFS; goto error; } printf("WARNING: NANDFS is considered to be a highly experimental " "feature in FreeBSD.\n"); error = nandfs_mount_device(devvp, mp, args, &nandfsdev); if (error) goto error; nmp = (struct nandfsmount *) malloc(sizeof(struct nandfsmount), M_NANDFSMNT, M_WAITOK | M_ZERO); mp->mnt_data = nmp; nmp->nm_vfs_mountp = mp; nmp->nm_ronly = ronly; MNT_ILOCK(mp); mp->mnt_flag |= MNT_LOCAL; + mp->mnt_kern_flag |= MNTK_USES_BCACHE; MNT_IUNLOCK(mp); nmp->nm_nandfsdev = nandfsdev; /* Add our mountpoint */ STAILQ_INSERT_TAIL(&nandfsdev->nd_mounts, nmp, nm_next_mount); if (args->cpno > nandfsdev->nd_last_cno) { printf("WARNING: supplied checkpoint number (%jd) is greater " "than last known checkpoint on filesystem (%jd). Mounting" " checkpoint %jd\n", (uintmax_t)args->cpno, (uintmax_t)nandfsdev->nd_last_cno, (uintmax_t)nandfsdev->nd_last_cno); args->cpno = nandfsdev->nd_last_cno; } /* Setting up other parameters */ nmp->nm_mount_args = *args; free(args, M_NANDFSMNT); error = nandfs_mount_checkpoint(nmp); if (error) { nandfs_unmount(mp, MNT_FORCE); goto unmounted; } if (!ronly) { error = start_syncer(nmp); if (error == 0) error = nandfs_start_cleaner(nmp->nm_nandfsdev); if (error) nandfs_unmount(mp, MNT_FORCE); } return (0); error: if (args != NULL) free(args, M_NANDFSMNT); if (nmp != NULL) { free(nmp, M_NANDFSMNT); mp->mnt_data = NULL; } unmounted: return (error); } static int nandfs_unmount(struct mount *mp, int mntflags) { struct nandfs_device *nandfsdev; struct nandfsmount *nmp; int error; int flags = 0; DPRINTF(VOLUMES, ("%s: mp = %p\n", __func__, (void *)mp)); if (mntflags & MNT_FORCE) flags |= FORCECLOSE; nmp = mp->mnt_data; nandfsdev = nmp->nm_nandfsdev; error = vflush(mp, 0, flags | SKIPSYSTEM, curthread); if (error) return (error); if (!(nmp->nm_ronly)) { nandfs_stop_cleaner(nandfsdev); stop_syncer(nmp); } if (nmp->nm_ifile_node) NANDFS_UNSET_SYSTEMFILE(NTOV(nmp->nm_ifile_node)); /* Remove our mount point */ STAILQ_REMOVE(&nandfsdev->nd_mounts, nmp, nandfsmount, nm_next_mount); /* Unmount the device itself when we're the last one */ nandfs_unmount_device(nandfsdev); free_nandfs_mountinfo(mp); /* * Finally, throw away the null_mount structure */ mp->mnt_data = 0; MNT_ILOCK(mp); mp->mnt_flag &= ~MNT_LOCAL; MNT_IUNLOCK(mp); return (0); } static int nandfs_statfs(struct mount *mp, struct statfs *sbp) { struct nandfsmount *nmp; struct nandfs_device *nandfsdev; struct nandfs_fsdata *fsdata; struct nandfs_super_block *sb; struct nandfs_block_group_desc *groups; struct nandfs_node *ifile; struct nandfs_mdt *mdt; struct buf *bp; int i, error; uint32_t entries_per_group; uint64_t files = 0; nmp = mp->mnt_data; nandfsdev = nmp->nm_nandfsdev; fsdata = &nandfsdev->nd_fsdata; sb = &nandfsdev->nd_super; ifile = nmp->nm_ifile_node; mdt = &nandfsdev->nd_ifile_mdt; entries_per_group = mdt->entries_per_group; VOP_LOCK(NTOV(ifile), LK_SHARED); error = nandfs_bread(ifile, 0, NOCRED, 0, &bp); if (error) { brelse(bp); VOP_UNLOCK(NTOV(ifile), 0); return (error); } groups = (struct nandfs_block_group_desc *)bp->b_data; for (i = 0; i < mdt->groups_per_desc_block; i++) files += (entries_per_group - groups[i].bg_nfrees); brelse(bp); VOP_UNLOCK(NTOV(ifile), 0); sbp->f_bsize = nandfsdev->nd_blocksize; sbp->f_iosize = sbp->f_bsize; sbp->f_blocks = fsdata->f_blocks_per_segment * fsdata->f_nsegments; sbp->f_bfree = sb->s_free_blocks_count; sbp->f_bavail = sbp->f_bfree; sbp->f_files = files; sbp->f_ffree = 0; return (0); } static int nandfs_root(struct mount *mp, int flags, struct vnode **vpp) { struct nandfsmount *nmp = VFSTONANDFS(mp); struct nandfs_node *node; int error; error = nandfs_get_node(nmp, NANDFS_ROOT_INO, &node); if (error) return (error); KASSERT(NTOV(node)->v_vflag & VV_ROOT, ("root_vp->v_vflag & VV_ROOT")); *vpp = NTOV(node); return (error); } static int nandfs_vget(struct mount *mp, ino_t ino, int flags, struct vnode **vpp) { struct nandfsmount *nmp = VFSTONANDFS(mp); struct nandfs_node *node; int error; error = nandfs_get_node(nmp, ino, &node); if (node) *vpp = NTOV(node); return (error); } static int nandfs_sync(struct mount *mp, int waitfor) { struct nandfsmount *nmp = VFSTONANDFS(mp); DPRINTF(SYNC, ("%s: mp %p waitfor %d\n", __func__, mp, waitfor)); /* * XXX: A hack to be removed soon */ if (waitfor == MNT_LAZY) return (0); if (waitfor == MNT_SUSPEND) return (0); nandfs_wakeup_wait_sync(nmp->nm_nandfsdev, SYNCER_VFS_SYNC); return (0); } static struct vfsops nandfs_vfsops = { .vfs_init = nandfs_init, .vfs_mount = nandfs_mount, .vfs_root = nandfs_root, .vfs_statfs = nandfs_statfs, .vfs_uninit = nandfs_uninit, .vfs_unmount = nandfs_unmount, .vfs_vget = nandfs_vget, .vfs_sync = nandfs_sync, }; VFS_SET(nandfs_vfsops, nandfs, VFCF_LOOPBACK); Index: user/ngie/more-tests/sys/fs/nfsclient/nfs_clvfsops.c =================================================================== --- user/ngie/more-tests/sys/fs/nfsclient/nfs_clvfsops.c (revision 281584) +++ user/ngie/more-tests/sys/fs/nfsclient/nfs_clvfsops.c (revision 281585) @@ -1,1876 +1,1877 @@ /*- * Copyright (c) 1989, 1993, 1995 * The Regents of the University of California. All rights reserved. * * This code is derived from software contributed to Berkeley by * Rick Macklem at The University of Guelph. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 4. Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * from nfs_vfsops.c 8.12 (Berkeley) 5/20/95 */ #include __FBSDID("$FreeBSD$"); #include "opt_bootp.h" #include "opt_nfsroot.h" #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include FEATURE(nfscl, "NFSv4 client"); extern int nfscl_ticks; extern struct timeval nfsboottime; extern struct nfsstats newnfsstats; extern int nfsrv_useacl; extern int nfscl_debuglevel; extern enum nfsiod_state ncl_iodwant[NFS_MAXASYNCDAEMON]; extern struct nfsmount *ncl_iodmount[NFS_MAXASYNCDAEMON]; extern struct mtx ncl_iod_mutex; NFSCLSTATEMUTEX; MALLOC_DEFINE(M_NEWNFSREQ, "newnfsclient_req", "New NFS request header"); MALLOC_DEFINE(M_NEWNFSMNT, "newnfsmnt", "New NFS mount struct"); SYSCTL_DECL(_vfs_nfs); static int nfs_ip_paranoia = 1; SYSCTL_INT(_vfs_nfs, OID_AUTO, nfs_ip_paranoia, CTLFLAG_RW, &nfs_ip_paranoia, 0, ""); static int nfs_tprintf_initial_delay = NFS_TPRINTF_INITIAL_DELAY; SYSCTL_INT(_vfs_nfs, NFS_TPRINTF_INITIAL_DELAY, downdelayinitial, CTLFLAG_RW, &nfs_tprintf_initial_delay, 0, ""); /* how long between console messages "nfs server foo not responding" */ static int nfs_tprintf_delay = NFS_TPRINTF_DELAY; SYSCTL_INT(_vfs_nfs, NFS_TPRINTF_DELAY, downdelayinterval, CTLFLAG_RW, &nfs_tprintf_delay, 0, ""); #ifdef NFS_DEBUG int nfs_debug; SYSCTL_INT(_vfs_nfs, OID_AUTO, debug, CTLFLAG_RW, &nfs_debug, 0, "Toggle debug flag"); #endif static int nfs_mountroot(struct mount *); static void nfs_sec_name(char *, int *); static void nfs_decode_args(struct mount *mp, struct nfsmount *nmp, struct nfs_args *argp, const char *, struct ucred *, struct thread *); static int mountnfs(struct nfs_args *, struct mount *, struct sockaddr *, char *, u_char *, int, u_char *, int, u_char *, int, struct vnode **, struct ucred *, struct thread *, int, int, int); static void nfs_getnlminfo(struct vnode *, uint8_t *, size_t *, struct sockaddr_storage *, int *, off_t *, struct timeval *); static vfs_mount_t nfs_mount; static vfs_cmount_t nfs_cmount; static vfs_unmount_t nfs_unmount; static vfs_root_t nfs_root; static vfs_statfs_t nfs_statfs; static vfs_sync_t nfs_sync; static vfs_sysctl_t nfs_sysctl; static vfs_purge_t nfs_purge; /* * nfs vfs operations. */ static struct vfsops nfs_vfsops = { .vfs_init = ncl_init, .vfs_mount = nfs_mount, .vfs_cmount = nfs_cmount, .vfs_root = nfs_root, .vfs_statfs = nfs_statfs, .vfs_sync = nfs_sync, .vfs_uninit = ncl_uninit, .vfs_unmount = nfs_unmount, .vfs_sysctl = nfs_sysctl, .vfs_purge = nfs_purge, }; VFS_SET(nfs_vfsops, nfs, VFCF_NETWORK | VFCF_SBDRY); /* So that loader and kldload(2) can find us, wherever we are.. */ MODULE_VERSION(nfs, 1); MODULE_DEPEND(nfs, nfscommon, 1, 1, 1); MODULE_DEPEND(nfs, krpc, 1, 1, 1); MODULE_DEPEND(nfs, nfssvc, 1, 1, 1); MODULE_DEPEND(nfs, nfslock, 1, 1, 1); /* * This structure is now defined in sys/nfs/nfs_diskless.c so that it * can be shared by both NFS clients. It is declared here so that it * will be defined for kernels built without NFS_ROOT, although it * isn't used in that case. */ #if !defined(NFS_ROOT) struct nfs_diskless nfs_diskless = { { { 0 } } }; struct nfsv3_diskless nfsv3_diskless = { { { 0 } } }; int nfs_diskless_valid = 0; #endif SYSCTL_INT(_vfs_nfs, OID_AUTO, diskless_valid, CTLFLAG_RD, &nfs_diskless_valid, 0, "Has the diskless struct been filled correctly"); SYSCTL_STRING(_vfs_nfs, OID_AUTO, diskless_rootpath, CTLFLAG_RD, nfsv3_diskless.root_hostnam, 0, "Path to nfs root"); SYSCTL_OPAQUE(_vfs_nfs, OID_AUTO, diskless_rootaddr, CTLFLAG_RD, &nfsv3_diskless.root_saddr, sizeof(nfsv3_diskless.root_saddr), "%Ssockaddr_in", "Diskless root nfs address"); void newnfsargs_ntoh(struct nfs_args *); static int nfs_mountdiskless(char *, struct sockaddr_in *, struct nfs_args *, struct thread *, struct vnode **, struct mount *); static void nfs_convert_diskless(void); static void nfs_convert_oargs(struct nfs_args *args, struct onfs_args *oargs); int newnfs_iosize(struct nfsmount *nmp) { int iosize, maxio; /* First, set the upper limit for iosize */ if (nmp->nm_flag & NFSMNT_NFSV4) { maxio = NFS_MAXBSIZE; } else if (nmp->nm_flag & NFSMNT_NFSV3) { if (nmp->nm_sotype == SOCK_DGRAM) maxio = NFS_MAXDGRAMDATA; else maxio = NFS_MAXBSIZE; } else { maxio = NFS_V2MAXDATA; } if (nmp->nm_rsize > maxio || nmp->nm_rsize == 0) nmp->nm_rsize = maxio; if (nmp->nm_rsize > MAXBSIZE) nmp->nm_rsize = MAXBSIZE; if (nmp->nm_readdirsize > maxio || nmp->nm_readdirsize == 0) nmp->nm_readdirsize = maxio; if (nmp->nm_readdirsize > nmp->nm_rsize) nmp->nm_readdirsize = nmp->nm_rsize; if (nmp->nm_wsize > maxio || nmp->nm_wsize == 0) nmp->nm_wsize = maxio; if (nmp->nm_wsize > MAXBSIZE) nmp->nm_wsize = MAXBSIZE; /* * Calculate the size used for io buffers. Use the larger * of the two sizes to minimise nfs requests but make sure * that it is at least one VM page to avoid wasting buffer * space. */ iosize = imax(nmp->nm_rsize, nmp->nm_wsize); iosize = imax(iosize, PAGE_SIZE); nmp->nm_mountp->mnt_stat.f_iosize = iosize; return (iosize); } static void nfs_convert_oargs(struct nfs_args *args, struct onfs_args *oargs) { args->version = NFS_ARGSVERSION; args->addr = oargs->addr; args->addrlen = oargs->addrlen; args->sotype = oargs->sotype; args->proto = oargs->proto; args->fh = oargs->fh; args->fhsize = oargs->fhsize; args->flags = oargs->flags; args->wsize = oargs->wsize; args->rsize = oargs->rsize; args->readdirsize = oargs->readdirsize; args->timeo = oargs->timeo; args->retrans = oargs->retrans; args->readahead = oargs->readahead; args->hostname = oargs->hostname; } static void nfs_convert_diskless(void) { bcopy(&nfs_diskless.myif, &nfsv3_diskless.myif, sizeof(struct ifaliasreq)); bcopy(&nfs_diskless.mygateway, &nfsv3_diskless.mygateway, sizeof(struct sockaddr_in)); nfs_convert_oargs(&nfsv3_diskless.root_args,&nfs_diskless.root_args); if (nfsv3_diskless.root_args.flags & NFSMNT_NFSV3) { nfsv3_diskless.root_fhsize = NFSX_MYFH; bcopy(nfs_diskless.root_fh, nfsv3_diskless.root_fh, NFSX_MYFH); } else { nfsv3_diskless.root_fhsize = NFSX_V2FH; bcopy(nfs_diskless.root_fh, nfsv3_diskless.root_fh, NFSX_V2FH); } bcopy(&nfs_diskless.root_saddr,&nfsv3_diskless.root_saddr, sizeof(struct sockaddr_in)); bcopy(nfs_diskless.root_hostnam, nfsv3_diskless.root_hostnam, MNAMELEN); nfsv3_diskless.root_time = nfs_diskless.root_time; bcopy(nfs_diskless.my_hostnam, nfsv3_diskless.my_hostnam, MAXHOSTNAMELEN); nfs_diskless_valid = 3; } /* * nfs statfs call */ static int nfs_statfs(struct mount *mp, struct statfs *sbp) { struct vnode *vp; struct thread *td; struct nfsmount *nmp = VFSTONFS(mp); struct nfsvattr nfsva; struct nfsfsinfo fs; struct nfsstatfs sb; int error = 0, attrflag, gotfsinfo = 0, ret; struct nfsnode *np; td = curthread; error = vfs_busy(mp, MBF_NOWAIT); if (error) return (error); error = ncl_nget(mp, nmp->nm_fh, nmp->nm_fhsize, &np, LK_EXCLUSIVE); if (error) { vfs_unbusy(mp); return (error); } vp = NFSTOV(np); mtx_lock(&nmp->nm_mtx); if (NFSHASNFSV3(nmp) && !NFSHASGOTFSINFO(nmp)) { mtx_unlock(&nmp->nm_mtx); error = nfsrpc_fsinfo(vp, &fs, td->td_ucred, td, &nfsva, &attrflag, NULL); if (!error) gotfsinfo = 1; } else mtx_unlock(&nmp->nm_mtx); if (!error) error = nfsrpc_statfs(vp, &sb, &fs, td->td_ucred, td, &nfsva, &attrflag, NULL); if (error != 0) NFSCL_DEBUG(2, "statfs=%d\n", error); if (attrflag == 0) { ret = nfsrpc_getattrnovp(nmp, nmp->nm_fh, nmp->nm_fhsize, 1, td->td_ucred, td, &nfsva, NULL, NULL); if (ret) { /* * Just set default values to get things going. */ NFSBZERO((caddr_t)&nfsva, sizeof (struct nfsvattr)); nfsva.na_vattr.va_type = VDIR; nfsva.na_vattr.va_mode = 0777; nfsva.na_vattr.va_nlink = 100; nfsva.na_vattr.va_uid = (uid_t)0; nfsva.na_vattr.va_gid = (gid_t)0; nfsva.na_vattr.va_fileid = 2; nfsva.na_vattr.va_gen = 1; nfsva.na_vattr.va_blocksize = NFS_FABLKSIZE; nfsva.na_vattr.va_size = 512 * 1024; } } (void) nfscl_loadattrcache(&vp, &nfsva, NULL, NULL, 0, 1); if (!error) { mtx_lock(&nmp->nm_mtx); if (gotfsinfo || (nmp->nm_flag & NFSMNT_NFSV4)) nfscl_loadfsinfo(nmp, &fs); nfscl_loadsbinfo(nmp, &sb, sbp); sbp->f_iosize = newnfs_iosize(nmp); mtx_unlock(&nmp->nm_mtx); if (sbp != &mp->mnt_stat) { bcopy(mp->mnt_stat.f_mntonname, sbp->f_mntonname, MNAMELEN); bcopy(mp->mnt_stat.f_mntfromname, sbp->f_mntfromname, MNAMELEN); } strncpy(&sbp->f_fstypename[0], mp->mnt_vfc->vfc_name, MFSNAMELEN); } else if (NFS_ISV4(vp)) { error = nfscl_maperr(td, error, (uid_t)0, (gid_t)0); } vput(vp); vfs_unbusy(mp); return (error); } /* * nfs version 3 fsinfo rpc call */ int ncl_fsinfo(struct nfsmount *nmp, struct vnode *vp, struct ucred *cred, struct thread *td) { struct nfsfsinfo fs; struct nfsvattr nfsva; int error, attrflag; error = nfsrpc_fsinfo(vp, &fs, cred, td, &nfsva, &attrflag, NULL); if (!error) { if (attrflag) (void) nfscl_loadattrcache(&vp, &nfsva, NULL, NULL, 0, 1); mtx_lock(&nmp->nm_mtx); nfscl_loadfsinfo(nmp, &fs); mtx_unlock(&nmp->nm_mtx); } return (error); } /* * Mount a remote root fs via. nfs. This depends on the info in the * nfs_diskless structure that has been filled in properly by some primary * bootstrap. * It goes something like this: * - do enough of "ifconfig" by calling ifioctl() so that the system * can talk to the server * - If nfs_diskless.mygateway is filled in, use that address as * a default gateway. * - build the rootfs mount point and call mountnfs() to do the rest. * * It is assumed to be safe to read, modify, and write the nfsv3_diskless * structure, as well as other global NFS client variables here, as * nfs_mountroot() will be called once in the boot before any other NFS * client activity occurs. */ static int nfs_mountroot(struct mount *mp) { struct thread *td = curthread; struct nfsv3_diskless *nd = &nfsv3_diskless; struct socket *so; struct vnode *vp; struct ifreq ir; int error; u_long l; char buf[128]; char *cp; #if defined(BOOTP_NFSROOT) && defined(BOOTP) bootpc_init(); /* use bootp to get nfs_diskless filled in */ #elif defined(NFS_ROOT) nfs_setup_diskless(); #endif if (nfs_diskless_valid == 0) return (-1); if (nfs_diskless_valid == 1) nfs_convert_diskless(); /* * XXX splnet, so networks will receive... */ splnet(); /* * Do enough of ifconfig(8) so that the critical net interface can * talk to the server. */ error = socreate(nd->myif.ifra_addr.sa_family, &so, nd->root_args.sotype, 0, td->td_ucred, td); if (error) panic("nfs_mountroot: socreate(%04x): %d", nd->myif.ifra_addr.sa_family, error); #if 0 /* XXX Bad idea */ /* * We might not have been told the right interface, so we pass * over the first ten interfaces of the same kind, until we get * one of them configured. */ for (i = strlen(nd->myif.ifra_name) - 1; nd->myif.ifra_name[i] >= '0' && nd->myif.ifra_name[i] <= '9'; nd->myif.ifra_name[i] ++) { error = ifioctl(so, SIOCAIFADDR, (caddr_t)&nd->myif, td); if(!error) break; } #endif error = ifioctl(so, SIOCAIFADDR, (caddr_t)&nd->myif, td); if (error) panic("nfs_mountroot: SIOCAIFADDR: %d", error); if ((cp = kern_getenv("boot.netif.mtu")) != NULL) { ir.ifr_mtu = strtol(cp, NULL, 10); bcopy(nd->myif.ifra_name, ir.ifr_name, IFNAMSIZ); freeenv(cp); error = ifioctl(so, SIOCSIFMTU, (caddr_t)&ir, td); if (error) printf("nfs_mountroot: SIOCSIFMTU: %d", error); } soclose(so); /* * If the gateway field is filled in, set it as the default route. * Note that pxeboot will set a default route of 0 if the route * is not set by the DHCP server. Check also for a value of 0 * to avoid panicking inappropriately in that situation. */ if (nd->mygateway.sin_len != 0 && nd->mygateway.sin_addr.s_addr != 0) { struct sockaddr_in mask, sin; bzero((caddr_t)&mask, sizeof(mask)); sin = mask; sin.sin_family = AF_INET; sin.sin_len = sizeof(sin); /* XXX MRT use table 0 for this sort of thing */ CURVNET_SET(TD_TO_VNET(td)); error = rtrequest_fib(RTM_ADD, (struct sockaddr *)&sin, (struct sockaddr *)&nd->mygateway, (struct sockaddr *)&mask, RTF_UP | RTF_GATEWAY, NULL, RT_DEFAULT_FIB); CURVNET_RESTORE(); if (error) panic("nfs_mountroot: RTM_ADD: %d", error); } /* * Create the rootfs mount point. */ nd->root_args.fh = nd->root_fh; nd->root_args.fhsize = nd->root_fhsize; l = ntohl(nd->root_saddr.sin_addr.s_addr); snprintf(buf, sizeof(buf), "%ld.%ld.%ld.%ld:%s", (l >> 24) & 0xff, (l >> 16) & 0xff, (l >> 8) & 0xff, (l >> 0) & 0xff, nd->root_hostnam); printf("NFS ROOT: %s\n", buf); nd->root_args.hostname = buf; if ((error = nfs_mountdiskless(buf, &nd->root_saddr, &nd->root_args, td, &vp, mp)) != 0) { return (error); } /* * This is not really an nfs issue, but it is much easier to * set hostname here and then let the "/etc/rc.xxx" files * mount the right /var based upon its preset value. */ mtx_lock(&prison0.pr_mtx); strlcpy(prison0.pr_hostname, nd->my_hostnam, sizeof(prison0.pr_hostname)); mtx_unlock(&prison0.pr_mtx); inittodr(ntohl(nd->root_time)); return (0); } /* * Internal version of mount system call for diskless setup. */ static int nfs_mountdiskless(char *path, struct sockaddr_in *sin, struct nfs_args *args, struct thread *td, struct vnode **vpp, struct mount *mp) { struct sockaddr *nam; int dirlen, error; char *dirpath; /* * Find the directory path in "path", which also has the server's * name/ip address in it. */ dirpath = strchr(path, ':'); if (dirpath != NULL) dirlen = strlen(++dirpath); else dirlen = 0; nam = sodupsockaddr((struct sockaddr *)sin, M_WAITOK); if ((error = mountnfs(args, mp, nam, path, NULL, 0, dirpath, dirlen, NULL, 0, vpp, td->td_ucred, td, NFS_DEFAULT_NAMETIMEO, NFS_DEFAULT_NEGNAMETIMEO, 0)) != 0) { printf("nfs_mountroot: mount %s on /: %d\n", path, error); return (error); } return (0); } static void nfs_sec_name(char *sec, int *flagsp) { if (!strcmp(sec, "krb5")) *flagsp |= NFSMNT_KERB; else if (!strcmp(sec, "krb5i")) *flagsp |= (NFSMNT_KERB | NFSMNT_INTEGRITY); else if (!strcmp(sec, "krb5p")) *flagsp |= (NFSMNT_KERB | NFSMNT_PRIVACY); } static void nfs_decode_args(struct mount *mp, struct nfsmount *nmp, struct nfs_args *argp, const char *hostname, struct ucred *cred, struct thread *td) { int s; int adjsock; char *p; s = splnet(); /* * Set read-only flag if requested; otherwise, clear it if this is * an update. If this is not an update, then either the read-only * flag is already clear, or this is a root mount and it was set * intentionally at some previous point. */ if (vfs_getopt(mp->mnt_optnew, "ro", NULL, NULL) == 0) { MNT_ILOCK(mp); mp->mnt_flag |= MNT_RDONLY; MNT_IUNLOCK(mp); } else if (mp->mnt_flag & MNT_UPDATE) { MNT_ILOCK(mp); mp->mnt_flag &= ~MNT_RDONLY; MNT_IUNLOCK(mp); } /* * Silently clear NFSMNT_NOCONN if it's a TCP mount, it makes * no sense in that context. Also, set up appropriate retransmit * and soft timeout behavior. */ if (argp->sotype == SOCK_STREAM) { nmp->nm_flag &= ~NFSMNT_NOCONN; nmp->nm_timeo = NFS_MAXTIMEO; if ((argp->flags & NFSMNT_NFSV4) != 0) nmp->nm_retry = INT_MAX; else nmp->nm_retry = NFS_RETRANS_TCP; } /* Also clear RDIRPLUS if NFSv2, it crashes some servers */ if ((argp->flags & (NFSMNT_NFSV3 | NFSMNT_NFSV4)) == 0) { argp->flags &= ~NFSMNT_RDIRPLUS; nmp->nm_flag &= ~NFSMNT_RDIRPLUS; } /* Re-bind if rsrvd port requested and wasn't on one */ adjsock = !(nmp->nm_flag & NFSMNT_RESVPORT) && (argp->flags & NFSMNT_RESVPORT); /* Also re-bind if we're switching to/from a connected UDP socket */ adjsock |= ((nmp->nm_flag & NFSMNT_NOCONN) != (argp->flags & NFSMNT_NOCONN)); /* Update flags atomically. Don't change the lock bits. */ nmp->nm_flag = argp->flags | nmp->nm_flag; splx(s); if ((argp->flags & NFSMNT_TIMEO) && argp->timeo > 0) { nmp->nm_timeo = (argp->timeo * NFS_HZ + 5) / 10; if (nmp->nm_timeo < NFS_MINTIMEO) nmp->nm_timeo = NFS_MINTIMEO; else if (nmp->nm_timeo > NFS_MAXTIMEO) nmp->nm_timeo = NFS_MAXTIMEO; } if ((argp->flags & NFSMNT_RETRANS) && argp->retrans > 1) { nmp->nm_retry = argp->retrans; if (nmp->nm_retry > NFS_MAXREXMIT) nmp->nm_retry = NFS_MAXREXMIT; } if ((argp->flags & NFSMNT_WSIZE) && argp->wsize > 0) { nmp->nm_wsize = argp->wsize; /* * Clip at the power of 2 below the size. There is an * issue (not isolated) that causes intermittent page * faults if this is not done. */ if (nmp->nm_wsize > NFS_FABLKSIZE) nmp->nm_wsize = 1 << (fls(nmp->nm_wsize) - 1); else nmp->nm_wsize = NFS_FABLKSIZE; } if ((argp->flags & NFSMNT_RSIZE) && argp->rsize > 0) { nmp->nm_rsize = argp->rsize; /* * Clip at the power of 2 below the size. There is an * issue (not isolated) that causes intermittent page * faults if this is not done. */ if (nmp->nm_rsize > NFS_FABLKSIZE) nmp->nm_rsize = 1 << (fls(nmp->nm_rsize) - 1); else nmp->nm_rsize = NFS_FABLKSIZE; } if ((argp->flags & NFSMNT_READDIRSIZE) && argp->readdirsize > 0) { nmp->nm_readdirsize = argp->readdirsize; } if ((argp->flags & NFSMNT_ACREGMIN) && argp->acregmin >= 0) nmp->nm_acregmin = argp->acregmin; else nmp->nm_acregmin = NFS_MINATTRTIMO; if ((argp->flags & NFSMNT_ACREGMAX) && argp->acregmax >= 0) nmp->nm_acregmax = argp->acregmax; else nmp->nm_acregmax = NFS_MAXATTRTIMO; if ((argp->flags & NFSMNT_ACDIRMIN) && argp->acdirmin >= 0) nmp->nm_acdirmin = argp->acdirmin; else nmp->nm_acdirmin = NFS_MINDIRATTRTIMO; if ((argp->flags & NFSMNT_ACDIRMAX) && argp->acdirmax >= 0) nmp->nm_acdirmax = argp->acdirmax; else nmp->nm_acdirmax = NFS_MAXDIRATTRTIMO; if (nmp->nm_acdirmin > nmp->nm_acdirmax) nmp->nm_acdirmin = nmp->nm_acdirmax; if (nmp->nm_acregmin > nmp->nm_acregmax) nmp->nm_acregmin = nmp->nm_acregmax; if ((argp->flags & NFSMNT_READAHEAD) && argp->readahead >= 0) { if (argp->readahead <= NFS_MAXRAHEAD) nmp->nm_readahead = argp->readahead; else nmp->nm_readahead = NFS_MAXRAHEAD; } if ((argp->flags & NFSMNT_WCOMMITSIZE) && argp->wcommitsize >= 0) { if (argp->wcommitsize < nmp->nm_wsize) nmp->nm_wcommitsize = nmp->nm_wsize; else nmp->nm_wcommitsize = argp->wcommitsize; } adjsock |= ((nmp->nm_sotype != argp->sotype) || (nmp->nm_soproto != argp->proto)); if (nmp->nm_client != NULL && adjsock) { int haslock = 0, error = 0; if (nmp->nm_sotype == SOCK_STREAM) { error = newnfs_sndlock(&nmp->nm_sockreq.nr_lock); if (!error) haslock = 1; } if (!error) { newnfs_disconnect(&nmp->nm_sockreq); if (haslock) newnfs_sndunlock(&nmp->nm_sockreq.nr_lock); nmp->nm_sotype = argp->sotype; nmp->nm_soproto = argp->proto; if (nmp->nm_sotype == SOCK_DGRAM) while (newnfs_connect(nmp, &nmp->nm_sockreq, cred, td, 0)) { printf("newnfs_args: retrying connect\n"); (void) nfs_catnap(PSOCK, 0, "nfscon"); } } } else { nmp->nm_sotype = argp->sotype; nmp->nm_soproto = argp->proto; } if (hostname != NULL) { strlcpy(nmp->nm_hostname, hostname, sizeof(nmp->nm_hostname)); p = strchr(nmp->nm_hostname, ':'); if (p != NULL) *p = '\0'; } } static const char *nfs_opts[] = { "from", "nfs_args", "noac", "noatime", "noexec", "suiddir", "nosuid", "nosymfollow", "union", "noclusterr", "noclusterw", "multilabel", "acls", "force", "update", "async", "noconn", "nolockd", "conn", "lockd", "intr", "rdirplus", "readdirsize", "soft", "hard", "mntudp", "tcp", "udp", "wsize", "rsize", "retrans", "actimeo", "acregmin", "acregmax", "acdirmin", "acdirmax", "resvport", "readahead", "hostname", "timeo", "timeout", "addr", "fh", "nfsv3", "sec", "principal", "nfsv4", "gssname", "allgssname", "dirpath", "minorversion", "nametimeo", "negnametimeo", "nocto", "noncontigwr", "pnfs", "wcommitsize", NULL }; /* * VFS Operations. * * mount system call * It seems a bit dumb to copyinstr() the host and path here and then * bcopy() them in mountnfs(), but I wanted to detect errors before * doing the sockargs() call because sockargs() allocates an mbuf and * an error after that means that I have to release the mbuf. */ /* ARGSUSED */ static int nfs_mount(struct mount *mp) { struct nfs_args args = { .version = NFS_ARGSVERSION, .addr = NULL, .addrlen = sizeof (struct sockaddr_in), .sotype = SOCK_STREAM, .proto = 0, .fh = NULL, .fhsize = 0, .flags = NFSMNT_RESVPORT, .wsize = NFS_WSIZE, .rsize = NFS_RSIZE, .readdirsize = NFS_READDIRSIZE, .timeo = 10, .retrans = NFS_RETRANS, .readahead = NFS_DEFRAHEAD, .wcommitsize = 0, /* was: NQ_DEFLEASE */ .hostname = NULL, .acregmin = NFS_MINATTRTIMO, .acregmax = NFS_MAXATTRTIMO, .acdirmin = NFS_MINDIRATTRTIMO, .acdirmax = NFS_MAXDIRATTRTIMO, }; int error = 0, ret, len; struct sockaddr *nam = NULL; struct vnode *vp; struct thread *td; char hst[MNAMELEN]; u_char nfh[NFSX_FHMAX], krbname[100], dirpath[100], srvkrbname[100]; char *opt, *name, *secname; int nametimeo = NFS_DEFAULT_NAMETIMEO; int negnametimeo = NFS_DEFAULT_NEGNAMETIMEO; int minvers = 0; int dirlen, has_nfs_args_opt, krbnamelen, srvkrbnamelen; size_t hstlen; has_nfs_args_opt = 0; if (vfs_filteropt(mp->mnt_optnew, nfs_opts)) { error = EINVAL; goto out; } td = curthread; if ((mp->mnt_flag & (MNT_ROOTFS | MNT_UPDATE)) == MNT_ROOTFS) { error = nfs_mountroot(mp); goto out; } nfscl_init(); /* * The old mount_nfs program passed the struct nfs_args * from userspace to kernel. The new mount_nfs program * passes string options via nmount() from userspace to kernel * and we populate the struct nfs_args in the kernel. */ if (vfs_getopt(mp->mnt_optnew, "nfs_args", NULL, NULL) == 0) { error = vfs_copyopt(mp->mnt_optnew, "nfs_args", &args, sizeof(args)); if (error != 0) goto out; if (args.version != NFS_ARGSVERSION) { error = EPROGMISMATCH; goto out; } has_nfs_args_opt = 1; } /* Handle the new style options. */ if (vfs_getopt(mp->mnt_optnew, "noac", NULL, NULL) == 0) { args.acdirmin = args.acdirmax = args.acregmin = args.acregmax = 0; args.flags |= NFSMNT_ACDIRMIN | NFSMNT_ACDIRMAX | NFSMNT_ACREGMIN | NFSMNT_ACREGMAX; } if (vfs_getopt(mp->mnt_optnew, "noconn", NULL, NULL) == 0) args.flags |= NFSMNT_NOCONN; if (vfs_getopt(mp->mnt_optnew, "conn", NULL, NULL) == 0) args.flags &= ~NFSMNT_NOCONN; if (vfs_getopt(mp->mnt_optnew, "nolockd", NULL, NULL) == 0) args.flags |= NFSMNT_NOLOCKD; if (vfs_getopt(mp->mnt_optnew, "lockd", NULL, NULL) == 0) args.flags &= ~NFSMNT_NOLOCKD; if (vfs_getopt(mp->mnt_optnew, "intr", NULL, NULL) == 0) args.flags |= NFSMNT_INT; if (vfs_getopt(mp->mnt_optnew, "rdirplus", NULL, NULL) == 0) args.flags |= NFSMNT_RDIRPLUS; if (vfs_getopt(mp->mnt_optnew, "resvport", NULL, NULL) == 0) args.flags |= NFSMNT_RESVPORT; if (vfs_getopt(mp->mnt_optnew, "noresvport", NULL, NULL) == 0) args.flags &= ~NFSMNT_RESVPORT; if (vfs_getopt(mp->mnt_optnew, "soft", NULL, NULL) == 0) args.flags |= NFSMNT_SOFT; if (vfs_getopt(mp->mnt_optnew, "hard", NULL, NULL) == 0) args.flags &= ~NFSMNT_SOFT; if (vfs_getopt(mp->mnt_optnew, "mntudp", NULL, NULL) == 0) args.sotype = SOCK_DGRAM; if (vfs_getopt(mp->mnt_optnew, "udp", NULL, NULL) == 0) args.sotype = SOCK_DGRAM; if (vfs_getopt(mp->mnt_optnew, "tcp", NULL, NULL) == 0) args.sotype = SOCK_STREAM; if (vfs_getopt(mp->mnt_optnew, "nfsv3", NULL, NULL) == 0) args.flags |= NFSMNT_NFSV3; if (vfs_getopt(mp->mnt_optnew, "nfsv4", NULL, NULL) == 0) { args.flags |= NFSMNT_NFSV4; args.sotype = SOCK_STREAM; } if (vfs_getopt(mp->mnt_optnew, "allgssname", NULL, NULL) == 0) args.flags |= NFSMNT_ALLGSSNAME; if (vfs_getopt(mp->mnt_optnew, "nocto", NULL, NULL) == 0) args.flags |= NFSMNT_NOCTO; if (vfs_getopt(mp->mnt_optnew, "noncontigwr", NULL, NULL) == 0) args.flags |= NFSMNT_NONCONTIGWR; if (vfs_getopt(mp->mnt_optnew, "pnfs", NULL, NULL) == 0) args.flags |= NFSMNT_PNFS; if (vfs_getopt(mp->mnt_optnew, "readdirsize", (void **)&opt, NULL) == 0) { if (opt == NULL) { vfs_mount_error(mp, "illegal readdirsize"); error = EINVAL; goto out; } ret = sscanf(opt, "%d", &args.readdirsize); if (ret != 1 || args.readdirsize <= 0) { vfs_mount_error(mp, "illegal readdirsize: %s", opt); error = EINVAL; goto out; } args.flags |= NFSMNT_READDIRSIZE; } if (vfs_getopt(mp->mnt_optnew, "readahead", (void **)&opt, NULL) == 0) { if (opt == NULL) { vfs_mount_error(mp, "illegal readahead"); error = EINVAL; goto out; } ret = sscanf(opt, "%d", &args.readahead); if (ret != 1 || args.readahead <= 0) { vfs_mount_error(mp, "illegal readahead: %s", opt); error = EINVAL; goto out; } args.flags |= NFSMNT_READAHEAD; } if (vfs_getopt(mp->mnt_optnew, "wsize", (void **)&opt, NULL) == 0) { if (opt == NULL) { vfs_mount_error(mp, "illegal wsize"); error = EINVAL; goto out; } ret = sscanf(opt, "%d", &args.wsize); if (ret != 1 || args.wsize <= 0) { vfs_mount_error(mp, "illegal wsize: %s", opt); error = EINVAL; goto out; } args.flags |= NFSMNT_WSIZE; } if (vfs_getopt(mp->mnt_optnew, "rsize", (void **)&opt, NULL) == 0) { if (opt == NULL) { vfs_mount_error(mp, "illegal rsize"); error = EINVAL; goto out; } ret = sscanf(opt, "%d", &args.rsize); if (ret != 1 || args.rsize <= 0) { vfs_mount_error(mp, "illegal wsize: %s", opt); error = EINVAL; goto out; } args.flags |= NFSMNT_RSIZE; } if (vfs_getopt(mp->mnt_optnew, "retrans", (void **)&opt, NULL) == 0) { if (opt == NULL) { vfs_mount_error(mp, "illegal retrans"); error = EINVAL; goto out; } ret = sscanf(opt, "%d", &args.retrans); if (ret != 1 || args.retrans <= 0) { vfs_mount_error(mp, "illegal retrans: %s", opt); error = EINVAL; goto out; } args.flags |= NFSMNT_RETRANS; } if (vfs_getopt(mp->mnt_optnew, "actimeo", (void **)&opt, NULL) == 0) { ret = sscanf(opt, "%d", &args.acregmin); if (ret != 1 || args.acregmin < 0) { vfs_mount_error(mp, "illegal actimeo: %s", opt); error = EINVAL; goto out; } args.acdirmin = args.acdirmax = args.acregmax = args.acregmin; args.flags |= NFSMNT_ACDIRMIN | NFSMNT_ACDIRMAX | NFSMNT_ACREGMIN | NFSMNT_ACREGMAX; } if (vfs_getopt(mp->mnt_optnew, "acregmin", (void **)&opt, NULL) == 0) { ret = sscanf(opt, "%d", &args.acregmin); if (ret != 1 || args.acregmin < 0) { vfs_mount_error(mp, "illegal acregmin: %s", opt); error = EINVAL; goto out; } args.flags |= NFSMNT_ACREGMIN; } if (vfs_getopt(mp->mnt_optnew, "acregmax", (void **)&opt, NULL) == 0) { ret = sscanf(opt, "%d", &args.acregmax); if (ret != 1 || args.acregmax < 0) { vfs_mount_error(mp, "illegal acregmax: %s", opt); error = EINVAL; goto out; } args.flags |= NFSMNT_ACREGMAX; } if (vfs_getopt(mp->mnt_optnew, "acdirmin", (void **)&opt, NULL) == 0) { ret = sscanf(opt, "%d", &args.acdirmin); if (ret != 1 || args.acdirmin < 0) { vfs_mount_error(mp, "illegal acdirmin: %s", opt); error = EINVAL; goto out; } args.flags |= NFSMNT_ACDIRMIN; } if (vfs_getopt(mp->mnt_optnew, "acdirmax", (void **)&opt, NULL) == 0) { ret = sscanf(opt, "%d", &args.acdirmax); if (ret != 1 || args.acdirmax < 0) { vfs_mount_error(mp, "illegal acdirmax: %s", opt); error = EINVAL; goto out; } args.flags |= NFSMNT_ACDIRMAX; } if (vfs_getopt(mp->mnt_optnew, "wcommitsize", (void **)&opt, NULL) == 0) { ret = sscanf(opt, "%d", &args.wcommitsize); if (ret != 1 || args.wcommitsize < 0) { vfs_mount_error(mp, "illegal wcommitsize: %s", opt); error = EINVAL; goto out; } args.flags |= NFSMNT_WCOMMITSIZE; } if (vfs_getopt(mp->mnt_optnew, "timeo", (void **)&opt, NULL) == 0) { ret = sscanf(opt, "%d", &args.timeo); if (ret != 1 || args.timeo <= 0) { vfs_mount_error(mp, "illegal timeo: %s", opt); error = EINVAL; goto out; } args.flags |= NFSMNT_TIMEO; } if (vfs_getopt(mp->mnt_optnew, "timeout", (void **)&opt, NULL) == 0) { ret = sscanf(opt, "%d", &args.timeo); if (ret != 1 || args.timeo <= 0) { vfs_mount_error(mp, "illegal timeout: %s", opt); error = EINVAL; goto out; } args.flags |= NFSMNT_TIMEO; } if (vfs_getopt(mp->mnt_optnew, "nametimeo", (void **)&opt, NULL) == 0) { ret = sscanf(opt, "%d", &nametimeo); if (ret != 1 || nametimeo < 0) { vfs_mount_error(mp, "illegal nametimeo: %s", opt); error = EINVAL; goto out; } } if (vfs_getopt(mp->mnt_optnew, "negnametimeo", (void **)&opt, NULL) == 0) { ret = sscanf(opt, "%d", &negnametimeo); if (ret != 1 || negnametimeo < 0) { vfs_mount_error(mp, "illegal negnametimeo: %s", opt); error = EINVAL; goto out; } } if (vfs_getopt(mp->mnt_optnew, "minorversion", (void **)&opt, NULL) == 0) { ret = sscanf(opt, "%d", &minvers); if (ret != 1 || minvers < 0 || minvers > 1 || (args.flags & NFSMNT_NFSV4) == 0) { vfs_mount_error(mp, "illegal minorversion: %s", opt); error = EINVAL; goto out; } } if (vfs_getopt(mp->mnt_optnew, "sec", (void **) &secname, NULL) == 0) nfs_sec_name(secname, &args.flags); if (mp->mnt_flag & MNT_UPDATE) { struct nfsmount *nmp = VFSTONFS(mp); if (nmp == NULL) { error = EIO; goto out; } /* * If a change from TCP->UDP is done and there are thread(s) * that have I/O RPC(s) in progress with a tranfer size * greater than NFS_MAXDGRAMDATA, those thread(s) will be * hung, retrying the RPC(s) forever. Usually these threads * will be seen doing an uninterruptible sleep on wait channel * "nfsreq". */ if (args.sotype == SOCK_DGRAM && nmp->nm_sotype == SOCK_STREAM) tprintf(td->td_proc, LOG_WARNING, "Warning: mount -u that changes TCP->UDP can result in hung threads\n"); /* * When doing an update, we can't change version, * security, switch lockd strategies or change cookie * translation */ args.flags = (args.flags & ~(NFSMNT_NFSV3 | NFSMNT_NFSV4 | NFSMNT_KERB | NFSMNT_INTEGRITY | NFSMNT_PRIVACY | NFSMNT_NOLOCKD /*|NFSMNT_XLATECOOKIE*/)) | (nmp->nm_flag & (NFSMNT_NFSV3 | NFSMNT_NFSV4 | NFSMNT_KERB | NFSMNT_INTEGRITY | NFSMNT_PRIVACY | NFSMNT_NOLOCKD /*|NFSMNT_XLATECOOKIE*/)); nfs_decode_args(mp, nmp, &args, NULL, td->td_ucred, td); goto out; } /* * Make the nfs_ip_paranoia sysctl serve as the default connection * or no-connection mode for those protocols that support * no-connection mode (the flag will be cleared later for protocols * that do not support no-connection mode). This will allow a client * to receive replies from a different IP then the request was * sent to. Note: default value for nfs_ip_paranoia is 1 (paranoid), * not 0. */ if (nfs_ip_paranoia == 0) args.flags |= NFSMNT_NOCONN; if (has_nfs_args_opt != 0) { /* * In the 'nfs_args' case, the pointers in the args * structure are in userland - we copy them in here. */ if (args.fhsize < 0 || args.fhsize > NFSX_V3FHMAX) { vfs_mount_error(mp, "Bad file handle"); error = EINVAL; goto out; } error = copyin((caddr_t)args.fh, (caddr_t)nfh, args.fhsize); if (error != 0) goto out; error = copyinstr(args.hostname, hst, MNAMELEN - 1, &hstlen); if (error != 0) goto out; bzero(&hst[hstlen], MNAMELEN - hstlen); args.hostname = hst; /* sockargs() call must be after above copyin() calls */ error = getsockaddr(&nam, (caddr_t)args.addr, args.addrlen); if (error != 0) goto out; } else { if (vfs_getopt(mp->mnt_optnew, "fh", (void **)&args.fh, &args.fhsize) == 0) { if (args.fhsize < 0 || args.fhsize > NFSX_FHMAX) { vfs_mount_error(mp, "Bad file handle"); error = EINVAL; goto out; } bcopy(args.fh, nfh, args.fhsize); } else { args.fhsize = 0; } (void) vfs_getopt(mp->mnt_optnew, "hostname", (void **)&args.hostname, &len); if (args.hostname == NULL) { vfs_mount_error(mp, "Invalid hostname"); error = EINVAL; goto out; } bcopy(args.hostname, hst, MNAMELEN); hst[MNAMELEN - 1] = '\0'; } if (vfs_getopt(mp->mnt_optnew, "principal", (void **)&name, NULL) == 0) strlcpy(srvkrbname, name, sizeof (srvkrbname)); else snprintf(srvkrbname, sizeof (srvkrbname), "nfs@%s", hst); srvkrbnamelen = strlen(srvkrbname); if (vfs_getopt(mp->mnt_optnew, "gssname", (void **)&name, NULL) == 0) strlcpy(krbname, name, sizeof (krbname)); else krbname[0] = '\0'; krbnamelen = strlen(krbname); if (vfs_getopt(mp->mnt_optnew, "dirpath", (void **)&name, NULL) == 0) strlcpy(dirpath, name, sizeof (dirpath)); else dirpath[0] = '\0'; dirlen = strlen(dirpath); if (has_nfs_args_opt == 0) { if (vfs_getopt(mp->mnt_optnew, "addr", (void **)&args.addr, &args.addrlen) == 0) { if (args.addrlen > SOCK_MAXADDRLEN) { error = ENAMETOOLONG; goto out; } nam = malloc(args.addrlen, M_SONAME, M_WAITOK); bcopy(args.addr, nam, args.addrlen); nam->sa_len = args.addrlen; } else { vfs_mount_error(mp, "No server address"); error = EINVAL; goto out; } } args.fh = nfh; error = mountnfs(&args, mp, nam, hst, krbname, krbnamelen, dirpath, dirlen, srvkrbname, srvkrbnamelen, &vp, td->td_ucred, td, nametimeo, negnametimeo, minvers); out: if (!error) { MNT_ILOCK(mp); - mp->mnt_kern_flag |= MNTK_LOOKUP_SHARED | MNTK_NO_IOPF; + mp->mnt_kern_flag |= MNTK_LOOKUP_SHARED | MNTK_NO_IOPF | + MNTK_USES_BCACHE; MNT_IUNLOCK(mp); } return (error); } /* * VFS Operations. * * mount system call * It seems a bit dumb to copyinstr() the host and path here and then * bcopy() them in mountnfs(), but I wanted to detect errors before * doing the sockargs() call because sockargs() allocates an mbuf and * an error after that means that I have to release the mbuf. */ /* ARGSUSED */ static int nfs_cmount(struct mntarg *ma, void *data, uint64_t flags) { int error; struct nfs_args args; error = copyin(data, &args, sizeof (struct nfs_args)); if (error) return error; ma = mount_arg(ma, "nfs_args", &args, sizeof args); error = kernel_mount(ma, flags); return (error); } /* * Common code for mount and mountroot */ static int mountnfs(struct nfs_args *argp, struct mount *mp, struct sockaddr *nam, char *hst, u_char *krbname, int krbnamelen, u_char *dirpath, int dirlen, u_char *srvkrbname, int srvkrbnamelen, struct vnode **vpp, struct ucred *cred, struct thread *td, int nametimeo, int negnametimeo, int minvers) { struct nfsmount *nmp; struct nfsnode *np; int error, trycnt, ret; struct nfsvattr nfsva; struct nfsclclient *clp; struct nfsclds *dsp, *tdsp; uint32_t lease; static u_int64_t clval = 0; NFSCL_DEBUG(3, "in mnt\n"); clp = NULL; if (mp->mnt_flag & MNT_UPDATE) { nmp = VFSTONFS(mp); printf("%s: MNT_UPDATE is no longer handled here\n", __func__); FREE(nam, M_SONAME); return (0); } else { MALLOC(nmp, struct nfsmount *, sizeof (struct nfsmount) + krbnamelen + dirlen + srvkrbnamelen + 2, M_NEWNFSMNT, M_WAITOK | M_ZERO); TAILQ_INIT(&nmp->nm_bufq); if (clval == 0) clval = (u_int64_t)nfsboottime.tv_sec; nmp->nm_clval = clval++; nmp->nm_krbnamelen = krbnamelen; nmp->nm_dirpathlen = dirlen; nmp->nm_srvkrbnamelen = srvkrbnamelen; if (td->td_ucred->cr_uid != (uid_t)0) { /* * nm_uid is used to get KerberosV credentials for * the nfsv4 state handling operations if there is * no host based principal set. Use the uid of * this user if not root, since they are doing the * mount. I don't think setting this for root will * work, since root normally does not have user * credentials in a credentials cache. */ nmp->nm_uid = td->td_ucred->cr_uid; } else { /* * Just set to -1, so it won't be used. */ nmp->nm_uid = (uid_t)-1; } /* Copy and null terminate all the names */ if (nmp->nm_krbnamelen > 0) { bcopy(krbname, nmp->nm_krbname, nmp->nm_krbnamelen); nmp->nm_name[nmp->nm_krbnamelen] = '\0'; } if (nmp->nm_dirpathlen > 0) { bcopy(dirpath, NFSMNT_DIRPATH(nmp), nmp->nm_dirpathlen); nmp->nm_name[nmp->nm_krbnamelen + nmp->nm_dirpathlen + 1] = '\0'; } if (nmp->nm_srvkrbnamelen > 0) { bcopy(srvkrbname, NFSMNT_SRVKRBNAME(nmp), nmp->nm_srvkrbnamelen); nmp->nm_name[nmp->nm_krbnamelen + nmp->nm_dirpathlen + nmp->nm_srvkrbnamelen + 2] = '\0'; } nmp->nm_sockreq.nr_cred = crhold(cred); mtx_init(&nmp->nm_sockreq.nr_mtx, "nfssock", NULL, MTX_DEF); mp->mnt_data = nmp; nmp->nm_getinfo = nfs_getnlminfo; nmp->nm_vinvalbuf = ncl_vinvalbuf; } vfs_getnewfsid(mp); nmp->nm_mountp = mp; mtx_init(&nmp->nm_mtx, "NFSmount lock", NULL, MTX_DEF | MTX_DUPOK); /* * Since nfs_decode_args() might optionally set them, these * need to be set to defaults before the call, so that the * optional settings aren't overwritten. */ nmp->nm_nametimeo = nametimeo; nmp->nm_negnametimeo = negnametimeo; nmp->nm_timeo = NFS_TIMEO; nmp->nm_retry = NFS_RETRANS; nmp->nm_readahead = NFS_DEFRAHEAD; if (desiredvnodes >= 11000) nmp->nm_wcommitsize = hibufspace / (desiredvnodes / 1000); else nmp->nm_wcommitsize = hibufspace / 10; if ((argp->flags & NFSMNT_NFSV4) != 0) nmp->nm_minorvers = minvers; else nmp->nm_minorvers = 0; nfs_decode_args(mp, nmp, argp, hst, cred, td); /* * V2 can only handle 32 bit filesizes. A 4GB-1 limit may be too * high, depending on whether we end up with negative offsets in * the client or server somewhere. 2GB-1 may be safer. * * For V3, ncl_fsinfo will adjust this as necessary. Assume maximum * that we can handle until we find out otherwise. */ if ((argp->flags & (NFSMNT_NFSV3 | NFSMNT_NFSV4)) == 0) nmp->nm_maxfilesize = 0xffffffffLL; else nmp->nm_maxfilesize = OFF_MAX; if ((argp->flags & (NFSMNT_NFSV3 | NFSMNT_NFSV4)) == 0) { nmp->nm_wsize = NFS_WSIZE; nmp->nm_rsize = NFS_RSIZE; nmp->nm_readdirsize = NFS_READDIRSIZE; } nmp->nm_numgrps = NFS_MAXGRPS; nmp->nm_tprintf_delay = nfs_tprintf_delay; if (nmp->nm_tprintf_delay < 0) nmp->nm_tprintf_delay = 0; nmp->nm_tprintf_initial_delay = nfs_tprintf_initial_delay; if (nmp->nm_tprintf_initial_delay < 0) nmp->nm_tprintf_initial_delay = 0; nmp->nm_fhsize = argp->fhsize; if (nmp->nm_fhsize > 0) bcopy((caddr_t)argp->fh, (caddr_t)nmp->nm_fh, argp->fhsize); bcopy(hst, mp->mnt_stat.f_mntfromname, MNAMELEN); nmp->nm_nam = nam; /* Set up the sockets and per-host congestion */ nmp->nm_sotype = argp->sotype; nmp->nm_soproto = argp->proto; nmp->nm_sockreq.nr_prog = NFS_PROG; if ((argp->flags & NFSMNT_NFSV4)) nmp->nm_sockreq.nr_vers = NFS_VER4; else if ((argp->flags & NFSMNT_NFSV3)) nmp->nm_sockreq.nr_vers = NFS_VER3; else nmp->nm_sockreq.nr_vers = NFS_VER2; if ((error = newnfs_connect(nmp, &nmp->nm_sockreq, cred, td, 0))) goto bad; /* For NFSv4.1, get the clientid now. */ if (nmp->nm_minorvers > 0) { NFSCL_DEBUG(3, "at getcl\n"); error = nfscl_getcl(mp, cred, td, 0, &clp); NFSCL_DEBUG(3, "aft getcl=%d\n", error); if (error != 0) goto bad; } if (nmp->nm_fhsize == 0 && (nmp->nm_flag & NFSMNT_NFSV4) && nmp->nm_dirpathlen > 0) { NFSCL_DEBUG(3, "in dirp\n"); /* * If the fhsize on the mount point == 0 for V4, the mount * path needs to be looked up. */ trycnt = 3; do { error = nfsrpc_getdirpath(nmp, NFSMNT_DIRPATH(nmp), cred, td); NFSCL_DEBUG(3, "aft dirp=%d\n", error); if (error) (void) nfs_catnap(PZERO, error, "nfsgetdirp"); } while (error && --trycnt > 0); if (error) { error = nfscl_maperr(td, error, (uid_t)0, (gid_t)0); goto bad; } } /* * A reference count is needed on the nfsnode representing the * remote root. If this object is not persistent, then backward * traversals of the mount point (i.e. "..") will not work if * the nfsnode gets flushed out of the cache. Ufs does not have * this problem, because one can identify root inodes by their * number == ROOTINO (2). */ if (nmp->nm_fhsize > 0) { /* * Set f_iosize to NFS_DIRBLKSIZ so that bo_bsize gets set * non-zero for the root vnode. f_iosize will be set correctly * by nfs_statfs() before any I/O occurs. */ mp->mnt_stat.f_iosize = NFS_DIRBLKSIZ; error = ncl_nget(mp, nmp->nm_fh, nmp->nm_fhsize, &np, LK_EXCLUSIVE); if (error) goto bad; *vpp = NFSTOV(np); /* * Get file attributes and transfer parameters for the * mountpoint. This has the side effect of filling in * (*vpp)->v_type with the correct value. */ ret = nfsrpc_getattrnovp(nmp, nmp->nm_fh, nmp->nm_fhsize, 1, cred, td, &nfsva, NULL, &lease); if (ret) { /* * Just set default values to get things going. */ NFSBZERO((caddr_t)&nfsva, sizeof (struct nfsvattr)); nfsva.na_vattr.va_type = VDIR; nfsva.na_vattr.va_mode = 0777; nfsva.na_vattr.va_nlink = 100; nfsva.na_vattr.va_uid = (uid_t)0; nfsva.na_vattr.va_gid = (gid_t)0; nfsva.na_vattr.va_fileid = 2; nfsva.na_vattr.va_gen = 1; nfsva.na_vattr.va_blocksize = NFS_FABLKSIZE; nfsva.na_vattr.va_size = 512 * 1024; lease = 60; } (void) nfscl_loadattrcache(vpp, &nfsva, NULL, NULL, 0, 1); if (nmp->nm_minorvers > 0) { NFSCL_DEBUG(3, "lease=%d\n", (int)lease); NFSLOCKCLSTATE(); clp->nfsc_renew = NFSCL_RENEW(lease); clp->nfsc_expire = NFSD_MONOSEC + clp->nfsc_renew; clp->nfsc_clientidrev++; if (clp->nfsc_clientidrev == 0) clp->nfsc_clientidrev++; NFSUNLOCKCLSTATE(); /* * Mount will succeed, so the renew thread can be * started now. */ nfscl_start_renewthread(clp); nfscl_clientrelease(clp); } if (argp->flags & NFSMNT_NFSV3) ncl_fsinfo(nmp, *vpp, cred, td); /* Mark if the mount point supports NFSv4 ACLs. */ if ((argp->flags & NFSMNT_NFSV4) != 0 && nfsrv_useacl != 0 && ret == 0 && NFSISSET_ATTRBIT(&nfsva.na_suppattr, NFSATTRBIT_ACL)) { MNT_ILOCK(mp); mp->mnt_flag |= MNT_NFS4ACLS; MNT_IUNLOCK(mp); } /* * Lose the lock but keep the ref. */ NFSVOPUNLOCK(*vpp, 0); return (0); } error = EIO; bad: if (clp != NULL) nfscl_clientrelease(clp); newnfs_disconnect(&nmp->nm_sockreq); crfree(nmp->nm_sockreq.nr_cred); if (nmp->nm_sockreq.nr_auth != NULL) AUTH_DESTROY(nmp->nm_sockreq.nr_auth); mtx_destroy(&nmp->nm_sockreq.nr_mtx); mtx_destroy(&nmp->nm_mtx); if (nmp->nm_clp != NULL) { NFSLOCKCLSTATE(); LIST_REMOVE(nmp->nm_clp, nfsc_list); NFSUNLOCKCLSTATE(); free(nmp->nm_clp, M_NFSCLCLIENT); } TAILQ_FOREACH_SAFE(dsp, &nmp->nm_sess, nfsclds_list, tdsp) nfscl_freenfsclds(dsp); FREE(nmp, M_NEWNFSMNT); FREE(nam, M_SONAME); return (error); } /* * unmount system call */ static int nfs_unmount(struct mount *mp, int mntflags) { struct thread *td; struct nfsmount *nmp; int error, flags = 0, i, trycnt = 0; struct nfsclds *dsp, *tdsp; td = curthread; if (mntflags & MNT_FORCE) flags |= FORCECLOSE; nmp = VFSTONFS(mp); /* * Goes something like this.. * - Call vflush() to clear out vnodes for this filesystem * - Close the socket * - Free up the data structures */ /* In the forced case, cancel any outstanding requests. */ if (mntflags & MNT_FORCE) { error = newnfs_nmcancelreqs(nmp); if (error) goto out; /* For a forced close, get rid of the renew thread now */ nfscl_umount(nmp, td); } /* We hold 1 extra ref on the root vnode; see comment in mountnfs(). */ do { error = vflush(mp, 1, flags, td); if ((mntflags & MNT_FORCE) && error != 0 && ++trycnt < 30) (void) nfs_catnap(PSOCK, error, "newndm"); } while ((mntflags & MNT_FORCE) && error != 0 && trycnt < 30); if (error) goto out; /* * We are now committed to the unmount. */ if ((mntflags & MNT_FORCE) == 0) nfscl_umount(nmp, td); /* Make sure no nfsiods are assigned to this mount. */ mtx_lock(&ncl_iod_mutex); for (i = 0; i < NFS_MAXASYNCDAEMON; i++) if (ncl_iodmount[i] == nmp) { ncl_iodwant[i] = NFSIOD_AVAILABLE; ncl_iodmount[i] = NULL; } mtx_unlock(&ncl_iod_mutex); newnfs_disconnect(&nmp->nm_sockreq); crfree(nmp->nm_sockreq.nr_cred); FREE(nmp->nm_nam, M_SONAME); if (nmp->nm_sockreq.nr_auth != NULL) AUTH_DESTROY(nmp->nm_sockreq.nr_auth); mtx_destroy(&nmp->nm_sockreq.nr_mtx); mtx_destroy(&nmp->nm_mtx); TAILQ_FOREACH_SAFE(dsp, &nmp->nm_sess, nfsclds_list, tdsp) nfscl_freenfsclds(dsp); FREE(nmp, M_NEWNFSMNT); out: return (error); } /* * Return root of a filesystem */ static int nfs_root(struct mount *mp, int flags, struct vnode **vpp) { struct vnode *vp; struct nfsmount *nmp; struct nfsnode *np; int error; nmp = VFSTONFS(mp); error = ncl_nget(mp, nmp->nm_fh, nmp->nm_fhsize, &np, flags); if (error) return error; vp = NFSTOV(np); /* * Get transfer parameters and attributes for root vnode once. */ mtx_lock(&nmp->nm_mtx); if (NFSHASNFSV3(nmp) && !NFSHASGOTFSINFO(nmp)) { mtx_unlock(&nmp->nm_mtx); ncl_fsinfo(nmp, vp, curthread->td_ucred, curthread); } else mtx_unlock(&nmp->nm_mtx); if (vp->v_type == VNON) vp->v_type = VDIR; vp->v_vflag |= VV_ROOT; *vpp = vp; return (0); } /* * Flush out the buffer cache */ /* ARGSUSED */ static int nfs_sync(struct mount *mp, int waitfor) { struct vnode *vp, *mvp; struct thread *td; int error, allerror = 0; td = curthread; MNT_ILOCK(mp); /* * If a forced dismount is in progress, return from here so that * the umount(2) syscall doesn't get stuck in VFS_SYNC() before * calling VFS_UNMOUNT(). */ if ((mp->mnt_kern_flag & MNTK_UNMOUNTF) != 0) { MNT_IUNLOCK(mp); return (EBADF); } MNT_IUNLOCK(mp); /* * Force stale buffer cache information to be flushed. */ loop: MNT_VNODE_FOREACH_ALL(vp, mp, mvp) { /* XXX Racy bv_cnt check. */ if (NFSVOPISLOCKED(vp) || vp->v_bufobj.bo_dirty.bv_cnt == 0 || waitfor == MNT_LAZY) { VI_UNLOCK(vp); continue; } if (vget(vp, LK_EXCLUSIVE | LK_INTERLOCK, td)) { MNT_VNODE_FOREACH_ALL_ABORT(mp, mvp); goto loop; } error = VOP_FSYNC(vp, waitfor, td); if (error) allerror = error; NFSVOPUNLOCK(vp, 0); vrele(vp); } return (allerror); } static int nfs_sysctl(struct mount *mp, fsctlop_t op, struct sysctl_req *req) { struct nfsmount *nmp = VFSTONFS(mp); struct vfsquery vq; int error; bzero(&vq, sizeof(vq)); switch (op) { #if 0 case VFS_CTL_NOLOCKS: val = (nmp->nm_flag & NFSMNT_NOLOCKS) ? 1 : 0; if (req->oldptr != NULL) { error = SYSCTL_OUT(req, &val, sizeof(val)); if (error) return (error); } if (req->newptr != NULL) { error = SYSCTL_IN(req, &val, sizeof(val)); if (error) return (error); if (val) nmp->nm_flag |= NFSMNT_NOLOCKS; else nmp->nm_flag &= ~NFSMNT_NOLOCKS; } break; #endif case VFS_CTL_QUERY: mtx_lock(&nmp->nm_mtx); if (nmp->nm_state & NFSSTA_TIMEO) vq.vq_flags |= VQ_NOTRESP; mtx_unlock(&nmp->nm_mtx); #if 0 if (!(nmp->nm_flag & NFSMNT_NOLOCKS) && (nmp->nm_state & NFSSTA_LOCKTIMEO)) vq.vq_flags |= VQ_NOTRESPLOCK; #endif error = SYSCTL_OUT(req, &vq, sizeof(vq)); break; case VFS_CTL_TIMEO: if (req->oldptr != NULL) { error = SYSCTL_OUT(req, &nmp->nm_tprintf_initial_delay, sizeof(nmp->nm_tprintf_initial_delay)); if (error) return (error); } if (req->newptr != NULL) { error = vfs_suser(mp, req->td); if (error) return (error); error = SYSCTL_IN(req, &nmp->nm_tprintf_initial_delay, sizeof(nmp->nm_tprintf_initial_delay)); if (error) return (error); if (nmp->nm_tprintf_initial_delay < 0) nmp->nm_tprintf_initial_delay = 0; } break; default: return (ENOTSUP); } return (0); } /* * Purge any RPCs in progress, so that they will all return errors. * This allows dounmount() to continue as far as VFS_UNMOUNT() for a * forced dismount. */ static void nfs_purge(struct mount *mp) { struct nfsmount *nmp = VFSTONFS(mp); newnfs_nmcancelreqs(nmp); } /* * Extract the information needed by the nlm from the nfs vnode. */ static void nfs_getnlminfo(struct vnode *vp, uint8_t *fhp, size_t *fhlenp, struct sockaddr_storage *sp, int *is_v3p, off_t *sizep, struct timeval *timeop) { struct nfsmount *nmp; struct nfsnode *np = VTONFS(vp); nmp = VFSTONFS(vp->v_mount); if (fhlenp != NULL) *fhlenp = (size_t)np->n_fhp->nfh_len; if (fhp != NULL) bcopy(np->n_fhp->nfh_fh, fhp, np->n_fhp->nfh_len); if (sp != NULL) bcopy(nmp->nm_nam, sp, min(nmp->nm_nam->sa_len, sizeof(*sp))); if (is_v3p != NULL) *is_v3p = NFS_ISV3(vp); if (sizep != NULL) *sizep = np->n_size; if (timeop != NULL) { timeop->tv_sec = nmp->nm_timeo / NFS_HZ; timeop->tv_usec = (nmp->nm_timeo % NFS_HZ) * (1000000 / NFS_HZ); } } /* * This function prints out an option name, based on the conditional * argument. */ static __inline void nfscl_printopt(struct nfsmount *nmp, int testval, char *opt, char **buf, size_t *blen) { int len; if (testval != 0 && *blen > strlen(opt)) { len = snprintf(*buf, *blen, "%s", opt); if (len != strlen(opt)) printf("EEK!!\n"); *buf += len; *blen -= len; } } /* * This function printf out an options integer value. */ static __inline void nfscl_printoptval(struct nfsmount *nmp, int optval, char *opt, char **buf, size_t *blen) { int len; if (*blen > strlen(opt) + 1) { /* Could result in truncated output string. */ len = snprintf(*buf, *blen, "%s=%d", opt, optval); if (len < *blen) { *buf += len; *blen -= len; } } } /* * Load the option flags and values into the buffer. */ void nfscl_retopts(struct nfsmount *nmp, char *buffer, size_t buflen) { char *buf; size_t blen; buf = buffer; blen = buflen; nfscl_printopt(nmp, (nmp->nm_flag & NFSMNT_NFSV4) != 0, "nfsv4", &buf, &blen); if ((nmp->nm_flag & NFSMNT_NFSV4) != 0) { nfscl_printoptval(nmp, nmp->nm_minorvers, ",minorversion", &buf, &blen); nfscl_printopt(nmp, (nmp->nm_flag & NFSMNT_PNFS) != 0, ",pnfs", &buf, &blen); } nfscl_printopt(nmp, (nmp->nm_flag & NFSMNT_NFSV3) != 0, "nfsv3", &buf, &blen); nfscl_printopt(nmp, (nmp->nm_flag & (NFSMNT_NFSV3 | NFSMNT_NFSV4)) == 0, "nfsv2", &buf, &blen); nfscl_printopt(nmp, nmp->nm_sotype == SOCK_STREAM, ",tcp", &buf, &blen); nfscl_printopt(nmp, nmp->nm_sotype != SOCK_STREAM, ",udp", &buf, &blen); nfscl_printopt(nmp, (nmp->nm_flag & NFSMNT_RESVPORT) != 0, ",resvport", &buf, &blen); nfscl_printopt(nmp, (nmp->nm_flag & NFSMNT_NOCONN) != 0, ",noconn", &buf, &blen); nfscl_printopt(nmp, (nmp->nm_flag & NFSMNT_SOFT) == 0, ",hard", &buf, &blen); nfscl_printopt(nmp, (nmp->nm_flag & NFSMNT_SOFT) != 0, ",soft", &buf, &blen); nfscl_printopt(nmp, (nmp->nm_flag & NFSMNT_INT) != 0, ",intr", &buf, &blen); nfscl_printopt(nmp, (nmp->nm_flag & NFSMNT_NOCTO) == 0, ",cto", &buf, &blen); nfscl_printopt(nmp, (nmp->nm_flag & NFSMNT_NOCTO) != 0, ",nocto", &buf, &blen); nfscl_printopt(nmp, (nmp->nm_flag & NFSMNT_NONCONTIGWR) != 0, ",noncontigwr", &buf, &blen); nfscl_printopt(nmp, (nmp->nm_flag & (NFSMNT_NOLOCKD | NFSMNT_NFSV4)) == 0, ",lockd", &buf, &blen); nfscl_printopt(nmp, (nmp->nm_flag & (NFSMNT_NOLOCKD | NFSMNT_NFSV4)) == NFSMNT_NOLOCKD, ",nolockd", &buf, &blen); nfscl_printopt(nmp, (nmp->nm_flag & NFSMNT_RDIRPLUS) != 0, ",rdirplus", &buf, &blen); nfscl_printopt(nmp, (nmp->nm_flag & NFSMNT_KERB) == 0, ",sec=sys", &buf, &blen); nfscl_printopt(nmp, (nmp->nm_flag & (NFSMNT_KERB | NFSMNT_INTEGRITY | NFSMNT_PRIVACY)) == NFSMNT_KERB, ",sec=krb5", &buf, &blen); nfscl_printopt(nmp, (nmp->nm_flag & (NFSMNT_KERB | NFSMNT_INTEGRITY | NFSMNT_PRIVACY)) == (NFSMNT_KERB | NFSMNT_INTEGRITY), ",sec=krb5i", &buf, &blen); nfscl_printopt(nmp, (nmp->nm_flag & (NFSMNT_KERB | NFSMNT_INTEGRITY | NFSMNT_PRIVACY)) == (NFSMNT_KERB | NFSMNT_PRIVACY), ",sec=krb5p", &buf, &blen); nfscl_printoptval(nmp, nmp->nm_acdirmin, ",acdirmin", &buf, &blen); nfscl_printoptval(nmp, nmp->nm_acdirmax, ",acdirmax", &buf, &blen); nfscl_printoptval(nmp, nmp->nm_acregmin, ",acregmin", &buf, &blen); nfscl_printoptval(nmp, nmp->nm_acregmax, ",acregmax", &buf, &blen); nfscl_printoptval(nmp, nmp->nm_nametimeo, ",nametimeo", &buf, &blen); nfscl_printoptval(nmp, nmp->nm_negnametimeo, ",negnametimeo", &buf, &blen); nfscl_printoptval(nmp, nmp->nm_rsize, ",rsize", &buf, &blen); nfscl_printoptval(nmp, nmp->nm_wsize, ",wsize", &buf, &blen); nfscl_printoptval(nmp, nmp->nm_readdirsize, ",readdirsize", &buf, &blen); nfscl_printoptval(nmp, nmp->nm_readahead, ",readahead", &buf, &blen); nfscl_printoptval(nmp, nmp->nm_wcommitsize, ",wcommitsize", &buf, &blen); nfscl_printoptval(nmp, nmp->nm_timeo, ",timeout", &buf, &blen); nfscl_printoptval(nmp, nmp->nm_retry, ",retrans", &buf, &blen); } Index: user/ngie/more-tests/sys/fs/nfsserver/nfs_nfsdport.c =================================================================== --- user/ngie/more-tests/sys/fs/nfsserver/nfs_nfsdport.c (revision 281584) +++ user/ngie/more-tests/sys/fs/nfsserver/nfs_nfsdport.c (revision 281585) @@ -1,3408 +1,3411 @@ /*- * Copyright (c) 1989, 1993 * The Regents of the University of California. All rights reserved. * * This code is derived from software contributed to Berkeley by * Rick Macklem at The University of Guelph. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 4. Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * */ #include __FBSDID("$FreeBSD$"); #include /* * Functions that perform the vfs operations required by the routines in * nfsd_serv.c. It is hoped that this change will make the server more * portable. */ #include #include #include #include #include FEATURE(nfsd, "NFSv4 server"); extern u_int32_t newnfs_true, newnfs_false, newnfs_xdrneg1; extern int nfsrv_useacl; extern int newnfs_numnfsd; extern struct mount nfsv4root_mnt; extern struct nfsrv_stablefirst nfsrv_stablefirst; extern void (*nfsd_call_servertimer)(void); extern SVCPOOL *nfsrvd_pool; extern struct nfsv4lock nfsd_suspend_lock; extern struct nfssessionhash nfssessionhash[NFSSESSIONHASHSIZE]; struct vfsoptlist nfsv4root_opt, nfsv4root_newopt; NFSDLOCKMUTEX; struct nfsrchash_bucket nfsrchash_table[NFSRVCACHE_HASHSIZE]; struct nfsrchash_bucket nfsrcahash_table[NFSRVCACHE_HASHSIZE]; struct mtx nfsrc_udpmtx; struct mtx nfs_v4root_mutex; struct nfsrvfh nfs_rootfh, nfs_pubfh; int nfs_pubfhset = 0, nfs_rootfhset = 0; struct proc *nfsd_master_proc = NULL; int nfsd_debuglevel = 0; static pid_t nfsd_master_pid = (pid_t)-1; static char nfsd_master_comm[MAXCOMLEN + 1]; static struct timeval nfsd_master_start; static uint32_t nfsv4_sysid = 0; static int nfssvc_srvcall(struct thread *, struct nfssvc_args *, struct ucred *); int nfsrv_enable_crossmntpt = 1; static int nfs_commit_blks; static int nfs_commit_miss; extern int nfsrv_issuedelegs; extern int nfsrv_dolocallocks; extern int nfsd_enable_stringtouid; SYSCTL_NODE(_vfs, OID_AUTO, nfsd, CTLFLAG_RW, 0, "New NFS server"); SYSCTL_INT(_vfs_nfsd, OID_AUTO, mirrormnt, CTLFLAG_RW, &nfsrv_enable_crossmntpt, 0, "Enable nfsd to cross mount points"); SYSCTL_INT(_vfs_nfsd, OID_AUTO, commit_blks, CTLFLAG_RW, &nfs_commit_blks, 0, ""); SYSCTL_INT(_vfs_nfsd, OID_AUTO, commit_miss, CTLFLAG_RW, &nfs_commit_miss, 0, ""); SYSCTL_INT(_vfs_nfsd, OID_AUTO, issue_delegations, CTLFLAG_RW, &nfsrv_issuedelegs, 0, "Enable nfsd to issue delegations"); SYSCTL_INT(_vfs_nfsd, OID_AUTO, enable_locallocks, CTLFLAG_RW, &nfsrv_dolocallocks, 0, "Enable nfsd to acquire local locks on files"); SYSCTL_INT(_vfs_nfsd, OID_AUTO, debuglevel, CTLFLAG_RW, &nfsd_debuglevel, 0, "Debug level for new nfs server"); SYSCTL_INT(_vfs_nfsd, OID_AUTO, enable_stringtouid, CTLFLAG_RW, &nfsd_enable_stringtouid, 0, "Enable nfsd to accept numeric owner_names"); #define MAX_REORDERED_RPC 16 #define NUM_HEURISTIC 1031 #define NHUSE_INIT 64 #define NHUSE_INC 16 #define NHUSE_MAX 2048 static struct nfsheur { struct vnode *nh_vp; /* vp to match (unreferenced pointer) */ off_t nh_nextoff; /* next offset for sequential detection */ int nh_use; /* use count for selection */ int nh_seqcount; /* heuristic */ } nfsheur[NUM_HEURISTIC]; /* * Heuristic to detect sequential operation. */ static struct nfsheur * nfsrv_sequential_heuristic(struct uio *uio, struct vnode *vp) { struct nfsheur *nh; int hi, try; /* Locate best candidate. */ try = 32; hi = ((int)(vm_offset_t)vp / sizeof(struct vnode)) % NUM_HEURISTIC; nh = &nfsheur[hi]; while (try--) { if (nfsheur[hi].nh_vp == vp) { nh = &nfsheur[hi]; break; } if (nfsheur[hi].nh_use > 0) --nfsheur[hi].nh_use; hi = (hi + 1) % NUM_HEURISTIC; if (nfsheur[hi].nh_use < nh->nh_use) nh = &nfsheur[hi]; } /* Initialize hint if this is a new file. */ if (nh->nh_vp != vp) { nh->nh_vp = vp; nh->nh_nextoff = uio->uio_offset; nh->nh_use = NHUSE_INIT; if (uio->uio_offset == 0) nh->nh_seqcount = 4; else nh->nh_seqcount = 1; } /* Calculate heuristic. */ if ((uio->uio_offset == 0 && nh->nh_seqcount > 0) || uio->uio_offset == nh->nh_nextoff) { /* See comments in vfs_vnops.c:sequential_heuristic(). */ nh->nh_seqcount += howmany(uio->uio_resid, 16384); if (nh->nh_seqcount > IO_SEQMAX) nh->nh_seqcount = IO_SEQMAX; } else if (qabs(uio->uio_offset - nh->nh_nextoff) <= MAX_REORDERED_RPC * imax(vp->v_mount->mnt_stat.f_iosize, uio->uio_resid)) { /* Probably a reordered RPC, leave seqcount alone. */ } else if (nh->nh_seqcount > 1) { nh->nh_seqcount /= 2; } else { nh->nh_seqcount = 0; } nh->nh_use += NHUSE_INC; if (nh->nh_use > NHUSE_MAX) nh->nh_use = NHUSE_MAX; return (nh); } /* * Get attributes into nfsvattr structure. */ int nfsvno_getattr(struct vnode *vp, struct nfsvattr *nvap, struct ucred *cred, struct thread *p, int vpislocked) { int error, lockedit = 0; if (vpislocked == 0) { /* * When vpislocked == 0, the vnode is either exclusively * locked by this thread or not locked by this thread. * As such, shared lock it, if not exclusively locked. */ if (NFSVOPISLOCKED(vp) != LK_EXCLUSIVE) { lockedit = 1; NFSVOPLOCK(vp, LK_SHARED | LK_RETRY); } } error = VOP_GETATTR(vp, &nvap->na_vattr, cred); if (lockedit != 0) NFSVOPUNLOCK(vp, 0); NFSEXITCODE(error); return (error); } /* * Get a file handle for a vnode. */ int nfsvno_getfh(struct vnode *vp, fhandle_t *fhp, struct thread *p) { int error; NFSBZERO((caddr_t)fhp, sizeof(fhandle_t)); fhp->fh_fsid = vp->v_mount->mnt_stat.f_fsid; error = VOP_VPTOFH(vp, &fhp->fh_fid); NFSEXITCODE(error); return (error); } /* * Perform access checking for vnodes obtained from file handles that would * refer to files already opened by a Unix client. You cannot just use * vn_writechk() and VOP_ACCESSX() for two reasons. * 1 - You must check for exported rdonly as well as MNT_RDONLY for the write * case. * 2 - The owner is to be given access irrespective of mode bits for some * operations, so that processes that chmod after opening a file don't * break. */ int nfsvno_accchk(struct vnode *vp, accmode_t accmode, struct ucred *cred, struct nfsexstuff *exp, struct thread *p, int override, int vpislocked, u_int32_t *supportedtypep) { struct vattr vattr; int error = 0, getret = 0; if (vpislocked == 0) { if (NFSVOPLOCK(vp, LK_SHARED) != 0) { error = EPERM; goto out; } } if (accmode & VWRITE) { /* Just vn_writechk() changed to check rdonly */ /* * Disallow write attempts on read-only file systems; * unless the file is a socket or a block or character * device resident on the file system. */ if (NFSVNO_EXRDONLY(exp) || (vp->v_mount->mnt_flag & MNT_RDONLY)) { switch (vp->v_type) { case VREG: case VDIR: case VLNK: error = EROFS; default: break; } } /* * If there's shared text associated with * the inode, try to free it up once. If * we fail, we can't allow writing. */ if (VOP_IS_TEXT(vp) && error == 0) error = ETXTBSY; } if (error != 0) { if (vpislocked == 0) NFSVOPUNLOCK(vp, 0); goto out; } /* * Should the override still be applied when ACLs are enabled? */ error = VOP_ACCESSX(vp, accmode, cred, p); if (error != 0 && (accmode & (VDELETE | VDELETE_CHILD))) { /* * Try again with VEXPLICIT_DENY, to see if the test for * deletion is supported. */ error = VOP_ACCESSX(vp, accmode | VEXPLICIT_DENY, cred, p); if (error == 0) { if (vp->v_type == VDIR) { accmode &= ~(VDELETE | VDELETE_CHILD); accmode |= VWRITE; error = VOP_ACCESSX(vp, accmode, cred, p); } else if (supportedtypep != NULL) { *supportedtypep &= ~NFSACCESS_DELETE; } } } /* * Allow certain operations for the owner (reads and writes * on files that are already open). */ if (override != NFSACCCHK_NOOVERRIDE && (error == EPERM || error == EACCES)) { if (cred->cr_uid == 0 && (override & NFSACCCHK_ALLOWROOT)) error = 0; else if (override & NFSACCCHK_ALLOWOWNER) { getret = VOP_GETATTR(vp, &vattr, cred); if (getret == 0 && cred->cr_uid == vattr.va_uid) error = 0; } } if (vpislocked == 0) NFSVOPUNLOCK(vp, 0); out: NFSEXITCODE(error); return (error); } /* * Set attribute(s) vnop. */ int nfsvno_setattr(struct vnode *vp, struct nfsvattr *nvap, struct ucred *cred, struct thread *p, struct nfsexstuff *exp) { int error; error = VOP_SETATTR(vp, &nvap->na_vattr, cred); NFSEXITCODE(error); return (error); } /* * Set up nameidata for a lookup() call and do it. */ int nfsvno_namei(struct nfsrv_descript *nd, struct nameidata *ndp, struct vnode *dp, int islocked, struct nfsexstuff *exp, struct thread *p, struct vnode **retdirp) { struct componentname *cnp = &ndp->ni_cnd; int i; struct iovec aiov; struct uio auio; int lockleaf = (cnp->cn_flags & LOCKLEAF) != 0, linklen; int error = 0, crossmnt; char *cp; *retdirp = NULL; cnp->cn_nameptr = cnp->cn_pnbuf; ndp->ni_strictrelative = 0; /* * Extract and set starting directory. */ if (dp->v_type != VDIR) { if (islocked) vput(dp); else vrele(dp); nfsvno_relpathbuf(ndp); error = ENOTDIR; goto out1; } if (islocked) NFSVOPUNLOCK(dp, 0); VREF(dp); *retdirp = dp; if (NFSVNO_EXRDONLY(exp)) cnp->cn_flags |= RDONLY; ndp->ni_segflg = UIO_SYSSPACE; crossmnt = 1; if (nd->nd_flag & ND_PUBLOOKUP) { ndp->ni_loopcnt = 0; if (cnp->cn_pnbuf[0] == '/') { vrele(dp); /* * Check for degenerate pathnames here, since lookup() * panics on them. */ for (i = 1; i < ndp->ni_pathlen; i++) if (cnp->cn_pnbuf[i] != '/') break; if (i == ndp->ni_pathlen) { error = NFSERR_ACCES; goto out; } dp = rootvnode; VREF(dp); } } else if ((nfsrv_enable_crossmntpt == 0 && NFSVNO_EXPORTED(exp)) || (nd->nd_flag & ND_NFSV4) == 0) { /* * Only cross mount points for NFSv4 when doing a * mount while traversing the file system above * the mount point, unless nfsrv_enable_crossmntpt is set. */ cnp->cn_flags |= NOCROSSMOUNT; crossmnt = 0; } /* * Initialize for scan, set ni_startdir and bump ref on dp again * because lookup() will dereference ni_startdir. */ cnp->cn_thread = p; ndp->ni_startdir = dp; ndp->ni_rootdir = rootvnode; ndp->ni_topdir = NULL; if (!lockleaf) cnp->cn_flags |= LOCKLEAF; for (;;) { cnp->cn_nameptr = cnp->cn_pnbuf; /* * Call lookup() to do the real work. If an error occurs, * ndp->ni_vp and ni_dvp are left uninitialized or NULL and * we do not have to dereference anything before returning. * In either case ni_startdir will be dereferenced and NULLed * out. */ error = lookup(ndp); if (error) break; /* * Check for encountering a symbolic link. Trivial * termination occurs if no symlink encountered. */ if ((cnp->cn_flags & ISSYMLINK) == 0) { if ((cnp->cn_flags & (SAVENAME | SAVESTART)) == 0) nfsvno_relpathbuf(ndp); if (ndp->ni_vp && !lockleaf) NFSVOPUNLOCK(ndp->ni_vp, 0); break; } /* * Validate symlink */ if ((cnp->cn_flags & LOCKPARENT) && ndp->ni_pathlen == 1) NFSVOPUNLOCK(ndp->ni_dvp, 0); if (!(nd->nd_flag & ND_PUBLOOKUP)) { error = EINVAL; goto badlink2; } if (ndp->ni_loopcnt++ >= MAXSYMLINKS) { error = ELOOP; goto badlink2; } if (ndp->ni_pathlen > 1) cp = uma_zalloc(namei_zone, M_WAITOK); else cp = cnp->cn_pnbuf; aiov.iov_base = cp; aiov.iov_len = MAXPATHLEN; auio.uio_iov = &aiov; auio.uio_iovcnt = 1; auio.uio_offset = 0; auio.uio_rw = UIO_READ; auio.uio_segflg = UIO_SYSSPACE; auio.uio_td = NULL; auio.uio_resid = MAXPATHLEN; error = VOP_READLINK(ndp->ni_vp, &auio, cnp->cn_cred); if (error) { badlink1: if (ndp->ni_pathlen > 1) uma_zfree(namei_zone, cp); badlink2: vrele(ndp->ni_dvp); vput(ndp->ni_vp); break; } linklen = MAXPATHLEN - auio.uio_resid; if (linklen == 0) { error = ENOENT; goto badlink1; } if (linklen + ndp->ni_pathlen >= MAXPATHLEN) { error = ENAMETOOLONG; goto badlink1; } /* * Adjust or replace path */ if (ndp->ni_pathlen > 1) { NFSBCOPY(ndp->ni_next, cp + linklen, ndp->ni_pathlen); uma_zfree(namei_zone, cnp->cn_pnbuf); cnp->cn_pnbuf = cp; } else cnp->cn_pnbuf[linklen] = '\0'; ndp->ni_pathlen += linklen; /* * Cleanup refs for next loop and check if root directory * should replace current directory. Normally ni_dvp * becomes the new base directory and is cleaned up when * we loop. Explicitly null pointers after invalidation * to clarify operation. */ vput(ndp->ni_vp); ndp->ni_vp = NULL; if (cnp->cn_pnbuf[0] == '/') { vrele(ndp->ni_dvp); ndp->ni_dvp = ndp->ni_rootdir; VREF(ndp->ni_dvp); } ndp->ni_startdir = ndp->ni_dvp; ndp->ni_dvp = NULL; } if (!lockleaf) cnp->cn_flags &= ~LOCKLEAF; out: if (error) { nfsvno_relpathbuf(ndp); ndp->ni_vp = NULL; ndp->ni_dvp = NULL; ndp->ni_startdir = NULL; } else if ((ndp->ni_cnd.cn_flags & (WANTPARENT|LOCKPARENT)) == 0) { ndp->ni_dvp = NULL; } out1: NFSEXITCODE2(error, nd); return (error); } /* * Set up a pathname buffer and return a pointer to it and, optionally * set a hash pointer. */ void nfsvno_setpathbuf(struct nameidata *ndp, char **bufpp, u_long **hashpp) { struct componentname *cnp = &ndp->ni_cnd; cnp->cn_flags |= (NOMACCHECK | HASBUF); cnp->cn_pnbuf = uma_zalloc(namei_zone, M_WAITOK); if (hashpp != NULL) *hashpp = NULL; *bufpp = cnp->cn_pnbuf; } /* * Release the above path buffer, if not released by nfsvno_namei(). */ void nfsvno_relpathbuf(struct nameidata *ndp) { if ((ndp->ni_cnd.cn_flags & HASBUF) == 0) panic("nfsrelpath"); uma_zfree(namei_zone, ndp->ni_cnd.cn_pnbuf); ndp->ni_cnd.cn_flags &= ~HASBUF; } /* * Readlink vnode op into an mbuf list. */ int nfsvno_readlink(struct vnode *vp, struct ucred *cred, struct thread *p, struct mbuf **mpp, struct mbuf **mpendp, int *lenp) { struct iovec iv[(NFS_MAXPATHLEN+MLEN-1)/MLEN]; struct iovec *ivp = iv; struct uio io, *uiop = &io; struct mbuf *mp, *mp2 = NULL, *mp3 = NULL; int i, len, tlen, error = 0; len = 0; i = 0; while (len < NFS_MAXPATHLEN) { NFSMGET(mp); MCLGET(mp, M_WAITOK); mp->m_len = M_SIZE(mp); if (len == 0) { mp3 = mp2 = mp; } else { mp2->m_next = mp; mp2 = mp; } if ((len + mp->m_len) > NFS_MAXPATHLEN) { mp->m_len = NFS_MAXPATHLEN - len; len = NFS_MAXPATHLEN; } else { len += mp->m_len; } ivp->iov_base = mtod(mp, caddr_t); ivp->iov_len = mp->m_len; i++; ivp++; } uiop->uio_iov = iv; uiop->uio_iovcnt = i; uiop->uio_offset = 0; uiop->uio_resid = len; uiop->uio_rw = UIO_READ; uiop->uio_segflg = UIO_SYSSPACE; uiop->uio_td = NULL; error = VOP_READLINK(vp, uiop, cred); if (error) { m_freem(mp3); *lenp = 0; goto out; } if (uiop->uio_resid > 0) { len -= uiop->uio_resid; tlen = NFSM_RNDUP(len); nfsrv_adj(mp3, NFS_MAXPATHLEN - tlen, tlen - len); } *lenp = len; *mpp = mp3; *mpendp = mp; out: NFSEXITCODE(error); return (error); } /* * Read vnode op call into mbuf list. */ int nfsvno_read(struct vnode *vp, off_t off, int cnt, struct ucred *cred, struct thread *p, struct mbuf **mpp, struct mbuf **mpendp) { struct mbuf *m; int i; struct iovec *iv; struct iovec *iv2; int error = 0, len, left, siz, tlen, ioflag = 0; struct mbuf *m2 = NULL, *m3; struct uio io, *uiop = &io; struct nfsheur *nh; len = left = NFSM_RNDUP(cnt); m3 = NULL; /* * Generate the mbuf list with the uio_iov ref. to it. */ i = 0; while (left > 0) { NFSMGET(m); MCLGET(m, M_WAITOK); m->m_len = 0; siz = min(M_TRAILINGSPACE(m), left); left -= siz; i++; if (m3) m2->m_next = m; else m3 = m; m2 = m; } MALLOC(iv, struct iovec *, i * sizeof (struct iovec), M_TEMP, M_WAITOK); uiop->uio_iov = iv2 = iv; m = m3; left = len; i = 0; while (left > 0) { if (m == NULL) panic("nfsvno_read iov"); siz = min(M_TRAILINGSPACE(m), left); if (siz > 0) { iv->iov_base = mtod(m, caddr_t) + m->m_len; iv->iov_len = siz; m->m_len += siz; left -= siz; iv++; i++; } m = m->m_next; } uiop->uio_iovcnt = i; uiop->uio_offset = off; uiop->uio_resid = len; uiop->uio_rw = UIO_READ; uiop->uio_segflg = UIO_SYSSPACE; uiop->uio_td = NULL; nh = nfsrv_sequential_heuristic(uiop, vp); ioflag |= nh->nh_seqcount << IO_SEQSHIFT; error = VOP_READ(vp, uiop, IO_NODELOCKED | ioflag, cred); FREE((caddr_t)iv2, M_TEMP); if (error) { m_freem(m3); *mpp = NULL; goto out; } nh->nh_nextoff = uiop->uio_offset; tlen = len - uiop->uio_resid; cnt = cnt < tlen ? cnt : tlen; tlen = NFSM_RNDUP(cnt); if (tlen == 0) { m_freem(m3); m3 = NULL; } else if (len != tlen || tlen != cnt) nfsrv_adj(m3, len - tlen, tlen - cnt); *mpp = m3; *mpendp = m2; out: NFSEXITCODE(error); return (error); } /* * Write vnode op from an mbuf list. */ int nfsvno_write(struct vnode *vp, off_t off, int retlen, int cnt, int stable, struct mbuf *mp, char *cp, struct ucred *cred, struct thread *p) { struct iovec *ivp; int i, len; struct iovec *iv; int ioflags, error; struct uio io, *uiop = &io; struct nfsheur *nh; MALLOC(ivp, struct iovec *, cnt * sizeof (struct iovec), M_TEMP, M_WAITOK); uiop->uio_iov = iv = ivp; uiop->uio_iovcnt = cnt; i = mtod(mp, caddr_t) + mp->m_len - cp; len = retlen; while (len > 0) { if (mp == NULL) panic("nfsvno_write"); if (i > 0) { i = min(i, len); ivp->iov_base = cp; ivp->iov_len = i; ivp++; len -= i; } mp = mp->m_next; if (mp) { i = mp->m_len; cp = mtod(mp, caddr_t); } } if (stable == NFSWRITE_UNSTABLE) ioflags = IO_NODELOCKED; else ioflags = (IO_SYNC | IO_NODELOCKED); uiop->uio_resid = retlen; uiop->uio_rw = UIO_WRITE; uiop->uio_segflg = UIO_SYSSPACE; NFSUIOPROC(uiop, p); uiop->uio_offset = off; nh = nfsrv_sequential_heuristic(uiop, vp); ioflags |= nh->nh_seqcount << IO_SEQSHIFT; error = VOP_WRITE(vp, uiop, ioflags, cred); if (error == 0) nh->nh_nextoff = uiop->uio_offset; FREE((caddr_t)iv, M_TEMP); NFSEXITCODE(error); return (error); } /* * Common code for creating a regular file (plus special files for V2). */ int nfsvno_createsub(struct nfsrv_descript *nd, struct nameidata *ndp, struct vnode **vpp, struct nfsvattr *nvap, int *exclusive_flagp, int32_t *cverf, NFSDEV_T rdev, struct thread *p, struct nfsexstuff *exp) { u_quad_t tempsize; int error; error = nd->nd_repstat; if (!error && ndp->ni_vp == NULL) { if (nvap->na_type == VREG || nvap->na_type == VSOCK) { vrele(ndp->ni_startdir); error = VOP_CREATE(ndp->ni_dvp, &ndp->ni_vp, &ndp->ni_cnd, &nvap->na_vattr); vput(ndp->ni_dvp); nfsvno_relpathbuf(ndp); if (!error) { if (*exclusive_flagp) { *exclusive_flagp = 0; NFSVNO_ATTRINIT(nvap); nvap->na_atime.tv_sec = cverf[0]; nvap->na_atime.tv_nsec = cverf[1]; error = VOP_SETATTR(ndp->ni_vp, &nvap->na_vattr, nd->nd_cred); } } /* * NFS V2 Only. nfsrvd_mknod() does this for V3. * (This implies, just get out on an error.) */ } else if (nvap->na_type == VCHR || nvap->na_type == VBLK || nvap->na_type == VFIFO) { if (nvap->na_type == VCHR && rdev == 0xffffffff) nvap->na_type = VFIFO; if (nvap->na_type != VFIFO && (error = priv_check_cred(nd->nd_cred, PRIV_VFS_MKNOD_DEV, 0))) { vrele(ndp->ni_startdir); nfsvno_relpathbuf(ndp); vput(ndp->ni_dvp); goto out; } nvap->na_rdev = rdev; error = VOP_MKNOD(ndp->ni_dvp, &ndp->ni_vp, &ndp->ni_cnd, &nvap->na_vattr); vput(ndp->ni_dvp); nfsvno_relpathbuf(ndp); vrele(ndp->ni_startdir); if (error) goto out; } else { vrele(ndp->ni_startdir); nfsvno_relpathbuf(ndp); vput(ndp->ni_dvp); error = ENXIO; goto out; } *vpp = ndp->ni_vp; } else { /* * Handle cases where error is already set and/or * the file exists. * 1 - clean up the lookup * 2 - iff !error and na_size set, truncate it */ vrele(ndp->ni_startdir); nfsvno_relpathbuf(ndp); *vpp = ndp->ni_vp; if (ndp->ni_dvp == *vpp) vrele(ndp->ni_dvp); else vput(ndp->ni_dvp); if (!error && nvap->na_size != VNOVAL) { error = nfsvno_accchk(*vpp, VWRITE, nd->nd_cred, exp, p, NFSACCCHK_NOOVERRIDE, NFSACCCHK_VPISLOCKED, NULL); if (!error) { tempsize = nvap->na_size; NFSVNO_ATTRINIT(nvap); nvap->na_size = tempsize; error = VOP_SETATTR(*vpp, &nvap->na_vattr, nd->nd_cred); } } if (error) vput(*vpp); } out: NFSEXITCODE(error); return (error); } /* * Do a mknod vnode op. */ int nfsvno_mknod(struct nameidata *ndp, struct nfsvattr *nvap, struct ucred *cred, struct thread *p) { int error = 0; enum vtype vtyp; vtyp = nvap->na_type; /* * Iff doesn't exist, create it. */ if (ndp->ni_vp) { vrele(ndp->ni_startdir); nfsvno_relpathbuf(ndp); vput(ndp->ni_dvp); vrele(ndp->ni_vp); error = EEXIST; goto out; } if (vtyp != VCHR && vtyp != VBLK && vtyp != VSOCK && vtyp != VFIFO) { vrele(ndp->ni_startdir); nfsvno_relpathbuf(ndp); vput(ndp->ni_dvp); error = NFSERR_BADTYPE; goto out; } if (vtyp == VSOCK) { vrele(ndp->ni_startdir); error = VOP_CREATE(ndp->ni_dvp, &ndp->ni_vp, &ndp->ni_cnd, &nvap->na_vattr); vput(ndp->ni_dvp); nfsvno_relpathbuf(ndp); } else { if (nvap->na_type != VFIFO && (error = priv_check_cred(cred, PRIV_VFS_MKNOD_DEV, 0))) { vrele(ndp->ni_startdir); nfsvno_relpathbuf(ndp); vput(ndp->ni_dvp); goto out; } error = VOP_MKNOD(ndp->ni_dvp, &ndp->ni_vp, &ndp->ni_cnd, &nvap->na_vattr); vput(ndp->ni_dvp); nfsvno_relpathbuf(ndp); vrele(ndp->ni_startdir); /* * Since VOP_MKNOD returns the ni_vp, I can't * see any reason to do the lookup. */ } out: NFSEXITCODE(error); return (error); } /* * Mkdir vnode op. */ int nfsvno_mkdir(struct nameidata *ndp, struct nfsvattr *nvap, uid_t saved_uid, struct ucred *cred, struct thread *p, struct nfsexstuff *exp) { int error = 0; if (ndp->ni_vp != NULL) { if (ndp->ni_dvp == ndp->ni_vp) vrele(ndp->ni_dvp); else vput(ndp->ni_dvp); vrele(ndp->ni_vp); nfsvno_relpathbuf(ndp); error = EEXIST; goto out; } error = VOP_MKDIR(ndp->ni_dvp, &ndp->ni_vp, &ndp->ni_cnd, &nvap->na_vattr); vput(ndp->ni_dvp); nfsvno_relpathbuf(ndp); out: NFSEXITCODE(error); return (error); } /* * symlink vnode op. */ int nfsvno_symlink(struct nameidata *ndp, struct nfsvattr *nvap, char *pathcp, int pathlen, int not_v2, uid_t saved_uid, struct ucred *cred, struct thread *p, struct nfsexstuff *exp) { int error = 0; if (ndp->ni_vp) { vrele(ndp->ni_startdir); nfsvno_relpathbuf(ndp); if (ndp->ni_dvp == ndp->ni_vp) vrele(ndp->ni_dvp); else vput(ndp->ni_dvp); vrele(ndp->ni_vp); error = EEXIST; goto out; } error = VOP_SYMLINK(ndp->ni_dvp, &ndp->ni_vp, &ndp->ni_cnd, &nvap->na_vattr, pathcp); vput(ndp->ni_dvp); vrele(ndp->ni_startdir); nfsvno_relpathbuf(ndp); /* * Although FreeBSD still had the lookup code in * it for 7/current, there doesn't seem to be any * point, since VOP_SYMLINK() returns the ni_vp. * Just vput it for v2. */ if (!not_v2 && !error) vput(ndp->ni_vp); out: NFSEXITCODE(error); return (error); } /* * Parse symbolic link arguments. * This function has an ugly side effect. It will MALLOC() an area for * the symlink and set iov_base to point to it, only if it succeeds. * So, if it returns with uiop->uio_iov->iov_base != NULL, that must * be FREE'd later. */ int nfsvno_getsymlink(struct nfsrv_descript *nd, struct nfsvattr *nvap, struct thread *p, char **pathcpp, int *lenp) { u_int32_t *tl; char *pathcp = NULL; int error = 0, len; struct nfsv2_sattr *sp; *pathcpp = NULL; *lenp = 0; if ((nd->nd_flag & ND_NFSV3) && (error = nfsrv_sattr(nd, NULL, nvap, NULL, NULL, p))) goto nfsmout; NFSM_DISSECT(tl, u_int32_t *, NFSX_UNSIGNED); len = fxdr_unsigned(int, *tl); if (len > NFS_MAXPATHLEN || len <= 0) { error = EBADRPC; goto nfsmout; } MALLOC(pathcp, caddr_t, len + 1, M_TEMP, M_WAITOK); error = nfsrv_mtostr(nd, pathcp, len); if (error) goto nfsmout; if (nd->nd_flag & ND_NFSV2) { NFSM_DISSECT(sp, struct nfsv2_sattr *, NFSX_V2SATTR); nvap->na_mode = fxdr_unsigned(u_int16_t, sp->sa_mode); } *pathcpp = pathcp; *lenp = len; NFSEXITCODE2(0, nd); return (0); nfsmout: if (pathcp) free(pathcp, M_TEMP); NFSEXITCODE2(error, nd); return (error); } /* * Remove a non-directory object. */ int nfsvno_removesub(struct nameidata *ndp, int is_v4, struct ucred *cred, struct thread *p, struct nfsexstuff *exp) { struct vnode *vp; int error = 0; vp = ndp->ni_vp; if (vp->v_type == VDIR) error = NFSERR_ISDIR; else if (is_v4) error = nfsrv_checkremove(vp, 1, p); if (!error) error = VOP_REMOVE(ndp->ni_dvp, vp, &ndp->ni_cnd); if (ndp->ni_dvp == vp) vrele(ndp->ni_dvp); else vput(ndp->ni_dvp); vput(vp); if ((ndp->ni_cnd.cn_flags & SAVENAME) != 0) nfsvno_relpathbuf(ndp); NFSEXITCODE(error); return (error); } /* * Remove a directory. */ int nfsvno_rmdirsub(struct nameidata *ndp, int is_v4, struct ucred *cred, struct thread *p, struct nfsexstuff *exp) { struct vnode *vp; int error = 0; vp = ndp->ni_vp; if (vp->v_type != VDIR) { error = ENOTDIR; goto out; } /* * No rmdir "." please. */ if (ndp->ni_dvp == vp) { error = EINVAL; goto out; } /* * The root of a mounted filesystem cannot be deleted. */ if (vp->v_vflag & VV_ROOT) error = EBUSY; out: if (!error) error = VOP_RMDIR(ndp->ni_dvp, vp, &ndp->ni_cnd); if (ndp->ni_dvp == vp) vrele(ndp->ni_dvp); else vput(ndp->ni_dvp); vput(vp); if ((ndp->ni_cnd.cn_flags & SAVENAME) != 0) nfsvno_relpathbuf(ndp); NFSEXITCODE(error); return (error); } /* * Rename vnode op. */ int nfsvno_rename(struct nameidata *fromndp, struct nameidata *tondp, u_int32_t ndstat, u_int32_t ndflag, struct ucred *cred, struct thread *p) { struct vnode *fvp, *tvp, *tdvp; int error = 0; fvp = fromndp->ni_vp; if (ndstat) { vrele(fromndp->ni_dvp); vrele(fvp); error = ndstat; goto out1; } tdvp = tondp->ni_dvp; tvp = tondp->ni_vp; if (tvp != NULL) { if (fvp->v_type == VDIR && tvp->v_type != VDIR) { error = (ndflag & ND_NFSV2) ? EISDIR : EEXIST; goto out; } else if (fvp->v_type != VDIR && tvp->v_type == VDIR) { error = (ndflag & ND_NFSV2) ? ENOTDIR : EEXIST; goto out; } if (tvp->v_type == VDIR && tvp->v_mountedhere) { error = (ndflag & ND_NFSV2) ? ENOTEMPTY : EXDEV; goto out; } /* * A rename to '.' or '..' results in a prematurely * unlocked vnode on FreeBSD5, so I'm just going to fail that * here. */ if ((tondp->ni_cnd.cn_namelen == 1 && tondp->ni_cnd.cn_nameptr[0] == '.') || (tondp->ni_cnd.cn_namelen == 2 && tondp->ni_cnd.cn_nameptr[0] == '.' && tondp->ni_cnd.cn_nameptr[1] == '.')) { error = EINVAL; goto out; } } if (fvp->v_type == VDIR && fvp->v_mountedhere) { error = (ndflag & ND_NFSV2) ? ENOTEMPTY : EXDEV; goto out; } if (fvp->v_mount != tdvp->v_mount) { error = (ndflag & ND_NFSV2) ? ENOTEMPTY : EXDEV; goto out; } if (fvp == tdvp) { error = (ndflag & ND_NFSV2) ? ENOTEMPTY : EINVAL; goto out; } if (fvp == tvp) { /* * If source and destination are the same, there is nothing to * do. Set error to -1 to indicate this. */ error = -1; goto out; } if (ndflag & ND_NFSV4) { if (NFSVOPLOCK(fvp, LK_EXCLUSIVE) == 0) { error = nfsrv_checkremove(fvp, 0, p); NFSVOPUNLOCK(fvp, 0); } else error = EPERM; if (tvp && !error) error = nfsrv_checkremove(tvp, 1, p); } else { /* * For NFSv2 and NFSv3, try to get rid of the delegation, so * that the NFSv4 client won't be confused by the rename. * Since nfsd_recalldelegation() can only be called on an * unlocked vnode at this point and fvp is the file that will * still exist after the rename, just do fvp. */ nfsd_recalldelegation(fvp, p); } out: if (!error) { error = VOP_RENAME(fromndp->ni_dvp, fromndp->ni_vp, &fromndp->ni_cnd, tondp->ni_dvp, tondp->ni_vp, &tondp->ni_cnd); } else { if (tdvp == tvp) vrele(tdvp); else vput(tdvp); if (tvp) vput(tvp); vrele(fromndp->ni_dvp); vrele(fvp); if (error == -1) error = 0; } vrele(tondp->ni_startdir); nfsvno_relpathbuf(tondp); out1: vrele(fromndp->ni_startdir); nfsvno_relpathbuf(fromndp); NFSEXITCODE(error); return (error); } /* * Link vnode op. */ int nfsvno_link(struct nameidata *ndp, struct vnode *vp, struct ucred *cred, struct thread *p, struct nfsexstuff *exp) { struct vnode *xp; int error = 0; xp = ndp->ni_vp; if (xp != NULL) { error = EEXIST; } else { xp = ndp->ni_dvp; if (vp->v_mount != xp->v_mount) error = EXDEV; } if (!error) { NFSVOPLOCK(vp, LK_EXCLUSIVE | LK_RETRY); if ((vp->v_iflag & VI_DOOMED) == 0) error = VOP_LINK(ndp->ni_dvp, vp, &ndp->ni_cnd); else error = EPERM; if (ndp->ni_dvp == vp) vrele(ndp->ni_dvp); else vput(ndp->ni_dvp); NFSVOPUNLOCK(vp, 0); } else { if (ndp->ni_dvp == ndp->ni_vp) vrele(ndp->ni_dvp); else vput(ndp->ni_dvp); if (ndp->ni_vp) vrele(ndp->ni_vp); } nfsvno_relpathbuf(ndp); NFSEXITCODE(error); return (error); } /* * Do the fsync() appropriate for the commit. */ int nfsvno_fsync(struct vnode *vp, u_int64_t off, int cnt, struct ucred *cred, struct thread *td) { int error = 0; /* * RFC 1813 3.3.21: if count is 0, a flush from offset to the end of * file is done. At this time VOP_FSYNC does not accept offset and * byte count parameters so call VOP_FSYNC the whole file for now. * The same is true for NFSv4: RFC 3530 Sec. 14.2.3. + * File systems that do not use the buffer cache (as indicated + * by MNTK_USES_BCACHE not being set) must use VOP_FSYNC(). */ - if (cnt == 0 || cnt > MAX_COMMIT_COUNT) { + if (cnt == 0 || cnt > MAX_COMMIT_COUNT || + (vp->v_mount->mnt_kern_flag & MNTK_USES_BCACHE) == 0) { /* * Give up and do the whole thing */ if (vp->v_object && (vp->v_object->flags & OBJ_MIGHTBEDIRTY)) { VM_OBJECT_WLOCK(vp->v_object); vm_object_page_clean(vp->v_object, 0, 0, OBJPC_SYNC); VM_OBJECT_WUNLOCK(vp->v_object); } error = VOP_FSYNC(vp, MNT_WAIT, td); } else { /* * Locate and synchronously write any buffers that fall * into the requested range. Note: we are assuming that * f_iosize is a power of 2. */ int iosize = vp->v_mount->mnt_stat.f_iosize; int iomask = iosize - 1; struct bufobj *bo; daddr_t lblkno; /* * Align to iosize boundry, super-align to page boundry. */ if (off & iomask) { cnt += off & iomask; off &= ~(u_quad_t)iomask; } if (off & PAGE_MASK) { cnt += off & PAGE_MASK; off &= ~(u_quad_t)PAGE_MASK; } lblkno = off / iosize; if (vp->v_object && (vp->v_object->flags & OBJ_MIGHTBEDIRTY)) { VM_OBJECT_WLOCK(vp->v_object); vm_object_page_clean(vp->v_object, off, off + cnt, OBJPC_SYNC); VM_OBJECT_WUNLOCK(vp->v_object); } bo = &vp->v_bufobj; BO_LOCK(bo); while (cnt > 0) { struct buf *bp; /* * If we have a buffer and it is marked B_DELWRI we * have to lock and write it. Otherwise the prior * write is assumed to have already been committed. * * gbincore() can return invalid buffers now so we * have to check that bit as well (though B_DELWRI * should not be set if B_INVAL is set there could be * a race here since we haven't locked the buffer). */ if ((bp = gbincore(&vp->v_bufobj, lblkno)) != NULL) { if (BUF_LOCK(bp, LK_EXCLUSIVE | LK_SLEEPFAIL | LK_INTERLOCK, BO_LOCKPTR(bo)) == ENOLCK) { BO_LOCK(bo); continue; /* retry */ } if ((bp->b_flags & (B_DELWRI|B_INVAL)) == B_DELWRI) { bremfree(bp); bp->b_flags &= ~B_ASYNC; bwrite(bp); ++nfs_commit_miss; } else BUF_UNLOCK(bp); BO_LOCK(bo); } ++nfs_commit_blks; if (cnt < iosize) break; cnt -= iosize; ++lblkno; } BO_UNLOCK(bo); } NFSEXITCODE(error); return (error); } /* * Statfs vnode op. */ int nfsvno_statfs(struct vnode *vp, struct statfs *sf) { int error; error = VFS_STATFS(vp->v_mount, sf); if (error == 0) { /* * Since NFS handles these values as unsigned on the * wire, there is no way to represent negative values, * so set them to 0. Without this, they will appear * to be very large positive values for clients like * Solaris10. */ if (sf->f_bavail < 0) sf->f_bavail = 0; if (sf->f_ffree < 0) sf->f_ffree = 0; } NFSEXITCODE(error); return (error); } /* * Do the vnode op stuff for Open. Similar to nfsvno_createsub(), but * must handle nfsrv_opencheck() calls after any other access checks. */ void nfsvno_open(struct nfsrv_descript *nd, struct nameidata *ndp, nfsquad_t clientid, nfsv4stateid_t *stateidp, struct nfsstate *stp, int *exclusive_flagp, struct nfsvattr *nvap, int32_t *cverf, int create, NFSACL_T *aclp, nfsattrbit_t *attrbitp, struct ucred *cred, struct thread *p, struct nfsexstuff *exp, struct vnode **vpp) { struct vnode *vp = NULL; u_quad_t tempsize; struct nfsexstuff nes; if (ndp->ni_vp == NULL) nd->nd_repstat = nfsrv_opencheck(clientid, stateidp, stp, NULL, nd, p, nd->nd_repstat); if (!nd->nd_repstat) { if (ndp->ni_vp == NULL) { vrele(ndp->ni_startdir); nd->nd_repstat = VOP_CREATE(ndp->ni_dvp, &ndp->ni_vp, &ndp->ni_cnd, &nvap->na_vattr); vput(ndp->ni_dvp); nfsvno_relpathbuf(ndp); if (!nd->nd_repstat) { if (*exclusive_flagp) { *exclusive_flagp = 0; NFSVNO_ATTRINIT(nvap); nvap->na_atime.tv_sec = cverf[0]; nvap->na_atime.tv_nsec = cverf[1]; nd->nd_repstat = VOP_SETATTR(ndp->ni_vp, &nvap->na_vattr, cred); } else { nfsrv_fixattr(nd, ndp->ni_vp, nvap, aclp, p, attrbitp, exp); } } vp = ndp->ni_vp; } else { if (ndp->ni_startdir) vrele(ndp->ni_startdir); nfsvno_relpathbuf(ndp); vp = ndp->ni_vp; if (create == NFSV4OPEN_CREATE) { if (ndp->ni_dvp == vp) vrele(ndp->ni_dvp); else vput(ndp->ni_dvp); } if (NFSVNO_ISSETSIZE(nvap) && vp->v_type == VREG) { if (ndp->ni_cnd.cn_flags & RDONLY) NFSVNO_SETEXRDONLY(&nes); else NFSVNO_EXINIT(&nes); nd->nd_repstat = nfsvno_accchk(vp, VWRITE, cred, &nes, p, NFSACCCHK_NOOVERRIDE, NFSACCCHK_VPISLOCKED, NULL); nd->nd_repstat = nfsrv_opencheck(clientid, stateidp, stp, vp, nd, p, nd->nd_repstat); if (!nd->nd_repstat) { tempsize = nvap->na_size; NFSVNO_ATTRINIT(nvap); nvap->na_size = tempsize; nd->nd_repstat = VOP_SETATTR(vp, &nvap->na_vattr, cred); } } else if (vp->v_type == VREG) { nd->nd_repstat = nfsrv_opencheck(clientid, stateidp, stp, vp, nd, p, nd->nd_repstat); } } } else { if (ndp->ni_cnd.cn_flags & HASBUF) nfsvno_relpathbuf(ndp); if (ndp->ni_startdir && create == NFSV4OPEN_CREATE) { vrele(ndp->ni_startdir); if (ndp->ni_dvp == ndp->ni_vp) vrele(ndp->ni_dvp); else vput(ndp->ni_dvp); if (ndp->ni_vp) vput(ndp->ni_vp); } } *vpp = vp; NFSEXITCODE2(0, nd); } /* * Updates the file rev and sets the mtime and ctime * to the current clock time, returning the va_filerev and va_Xtime * values. * Return ESTALE to indicate the vnode is VI_DOOMED. */ int nfsvno_updfilerev(struct vnode *vp, struct nfsvattr *nvap, struct ucred *cred, struct thread *p) { struct vattr va; VATTR_NULL(&va); vfs_timestamp(&va.va_mtime); if (NFSVOPISLOCKED(vp) != LK_EXCLUSIVE) { NFSVOPLOCK(vp, LK_UPGRADE | LK_RETRY); if ((vp->v_iflag & VI_DOOMED) != 0) return (ESTALE); } (void) VOP_SETATTR(vp, &va, cred); (void) nfsvno_getattr(vp, nvap, cred, p, 1); return (0); } /* * Glue routine to nfsv4_fillattr(). */ int nfsvno_fillattr(struct nfsrv_descript *nd, struct mount *mp, struct vnode *vp, struct nfsvattr *nvap, fhandle_t *fhp, int rderror, nfsattrbit_t *attrbitp, struct ucred *cred, struct thread *p, int isdgram, int reterr, int supports_nfsv4acls, int at_root, uint64_t mounted_on_fileno) { int error; error = nfsv4_fillattr(nd, mp, vp, NULL, &nvap->na_vattr, fhp, rderror, attrbitp, cred, p, isdgram, reterr, supports_nfsv4acls, at_root, mounted_on_fileno); NFSEXITCODE2(0, nd); return (error); } /* Since the Readdir vnode ops vary, put the entire functions in here. */ /* * nfs readdir service * - mallocs what it thinks is enough to read * count rounded up to a multiple of DIRBLKSIZ <= NFS_MAXREADDIR * - calls VOP_READDIR() * - loops around building the reply * if the output generated exceeds count break out of loop * The NFSM_CLGET macro is used here so that the reply will be packed * tightly in mbuf clusters. * - it trims out records with d_fileno == 0 * this doesn't matter for Unix clients, but they might confuse clients * for other os'. * - it trims out records with d_type == DT_WHT * these cannot be seen through NFS (unless we extend the protocol) * The alternate call nfsrvd_readdirplus() does lookups as well. * PS: The NFS protocol spec. does not clarify what the "count" byte * argument is a count of.. just name strings and file id's or the * entire reply rpc or ... * I tried just file name and id sizes and it confused the Sun client, * so I am using the full rpc size now. The "paranoia.." comment refers * to including the status longwords that are not a part of the dir. * "entry" structures, but are in the rpc. */ int nfsrvd_readdir(struct nfsrv_descript *nd, int isdgram, struct vnode *vp, struct thread *p, struct nfsexstuff *exp) { struct dirent *dp; u_int32_t *tl; int dirlen; char *cpos, *cend, *rbuf; struct nfsvattr at; int nlen, error = 0, getret = 1; int siz, cnt, fullsiz, eofflag, ncookies; u_int64_t off, toff, verf; u_long *cookies = NULL, *cookiep; struct uio io; struct iovec iv; int is_ufs; if (nd->nd_repstat) { nfsrv_postopattr(nd, getret, &at); goto out; } if (nd->nd_flag & ND_NFSV2) { NFSM_DISSECT(tl, u_int32_t *, 2 * NFSX_UNSIGNED); off = fxdr_unsigned(u_quad_t, *tl++); } else { NFSM_DISSECT(tl, u_int32_t *, 5 * NFSX_UNSIGNED); off = fxdr_hyper(tl); tl += 2; verf = fxdr_hyper(tl); tl += 2; } toff = off; cnt = fxdr_unsigned(int, *tl); if (cnt > NFS_SRVMAXDATA(nd) || cnt < 0) cnt = NFS_SRVMAXDATA(nd); siz = ((cnt + DIRBLKSIZ - 1) & ~(DIRBLKSIZ - 1)); fullsiz = siz; if (nd->nd_flag & ND_NFSV3) { nd->nd_repstat = getret = nfsvno_getattr(vp, &at, nd->nd_cred, p, 1); #if 0 /* * va_filerev is not sufficient as a cookie verifier, * since it is not supposed to change when entries are * removed/added unless that offset cookies returned to * the client are no longer valid. */ if (!nd->nd_repstat && toff && verf != at.na_filerev) nd->nd_repstat = NFSERR_BAD_COOKIE; #endif } if (!nd->nd_repstat && vp->v_type != VDIR) nd->nd_repstat = NFSERR_NOTDIR; if (nd->nd_repstat == 0 && cnt == 0) { if (nd->nd_flag & ND_NFSV2) /* NFSv2 does not have NFSERR_TOOSMALL */ nd->nd_repstat = EPERM; else nd->nd_repstat = NFSERR_TOOSMALL; } if (!nd->nd_repstat) nd->nd_repstat = nfsvno_accchk(vp, VEXEC, nd->nd_cred, exp, p, NFSACCCHK_NOOVERRIDE, NFSACCCHK_VPISLOCKED, NULL); if (nd->nd_repstat) { vput(vp); if (nd->nd_flag & ND_NFSV3) nfsrv_postopattr(nd, getret, &at); goto out; } is_ufs = strcmp(vp->v_mount->mnt_vfc->vfc_name, "ufs") == 0; MALLOC(rbuf, caddr_t, siz, M_TEMP, M_WAITOK); again: eofflag = 0; if (cookies) { free((caddr_t)cookies, M_TEMP); cookies = NULL; } iv.iov_base = rbuf; iv.iov_len = siz; io.uio_iov = &iv; io.uio_iovcnt = 1; io.uio_offset = (off_t)off; io.uio_resid = siz; io.uio_segflg = UIO_SYSSPACE; io.uio_rw = UIO_READ; io.uio_td = NULL; nd->nd_repstat = VOP_READDIR(vp, &io, nd->nd_cred, &eofflag, &ncookies, &cookies); off = (u_int64_t)io.uio_offset; if (io.uio_resid) siz -= io.uio_resid; if (!cookies && !nd->nd_repstat) nd->nd_repstat = NFSERR_PERM; if (nd->nd_flag & ND_NFSV3) { getret = nfsvno_getattr(vp, &at, nd->nd_cred, p, 1); if (!nd->nd_repstat) nd->nd_repstat = getret; } /* * Handles the failed cases. nd->nd_repstat == 0 past here. */ if (nd->nd_repstat) { vput(vp); free((caddr_t)rbuf, M_TEMP); if (cookies) free((caddr_t)cookies, M_TEMP); if (nd->nd_flag & ND_NFSV3) nfsrv_postopattr(nd, getret, &at); goto out; } /* * If nothing read, return eof * rpc reply */ if (siz == 0) { vput(vp); if (nd->nd_flag & ND_NFSV2) { NFSM_BUILD(tl, u_int32_t *, 2 * NFSX_UNSIGNED); } else { nfsrv_postopattr(nd, getret, &at); NFSM_BUILD(tl, u_int32_t *, 4 * NFSX_UNSIGNED); txdr_hyper(at.na_filerev, tl); tl += 2; } *tl++ = newnfs_false; *tl = newnfs_true; FREE((caddr_t)rbuf, M_TEMP); FREE((caddr_t)cookies, M_TEMP); goto out; } /* * Check for degenerate cases of nothing useful read. * If so go try again */ cpos = rbuf; cend = rbuf + siz; dp = (struct dirent *)cpos; cookiep = cookies; /* * For some reason FreeBSD's ufs_readdir() chooses to back the * directory offset up to a block boundary, so it is necessary to * skip over the records that precede the requested offset. This * requires the assumption that file offset cookies monotonically * increase. */ while (cpos < cend && ncookies > 0 && (dp->d_fileno == 0 || dp->d_type == DT_WHT || (is_ufs == 1 && ((u_quad_t)(*cookiep)) <= toff))) { cpos += dp->d_reclen; dp = (struct dirent *)cpos; cookiep++; ncookies--; } if (cpos >= cend || ncookies == 0) { siz = fullsiz; toff = off; goto again; } vput(vp); /* * dirlen is the size of the reply, including all XDR and must * not exceed cnt. For NFSv2, RFC1094 didn't clearly indicate * if the XDR should be included in "count", but to be safe, we do. * (Include the two booleans at the end of the reply in dirlen now.) */ if (nd->nd_flag & ND_NFSV3) { nfsrv_postopattr(nd, getret, &at); NFSM_BUILD(tl, u_int32_t *, 2 * NFSX_UNSIGNED); txdr_hyper(at.na_filerev, tl); dirlen = NFSX_V3POSTOPATTR + NFSX_VERF + 2 * NFSX_UNSIGNED; } else { dirlen = 2 * NFSX_UNSIGNED; } /* Loop through the records and build reply */ while (cpos < cend && ncookies > 0) { nlen = dp->d_namlen; if (dp->d_fileno != 0 && dp->d_type != DT_WHT && nlen <= NFS_MAXNAMLEN) { if (nd->nd_flag & ND_NFSV3) dirlen += (6*NFSX_UNSIGNED + NFSM_RNDUP(nlen)); else dirlen += (4*NFSX_UNSIGNED + NFSM_RNDUP(nlen)); if (dirlen > cnt) { eofflag = 0; break; } /* * Build the directory record xdr from * the dirent entry. */ if (nd->nd_flag & ND_NFSV3) { NFSM_BUILD(tl, u_int32_t *, 3 * NFSX_UNSIGNED); *tl++ = newnfs_true; *tl++ = 0; } else { NFSM_BUILD(tl, u_int32_t *, 2 * NFSX_UNSIGNED); *tl++ = newnfs_true; } *tl = txdr_unsigned(dp->d_fileno); (void) nfsm_strtom(nd, dp->d_name, nlen); if (nd->nd_flag & ND_NFSV3) { NFSM_BUILD(tl, u_int32_t *, 2 * NFSX_UNSIGNED); *tl++ = 0; } else NFSM_BUILD(tl, u_int32_t *, NFSX_UNSIGNED); *tl = txdr_unsigned(*cookiep); } cpos += dp->d_reclen; dp = (struct dirent *)cpos; cookiep++; ncookies--; } if (cpos < cend) eofflag = 0; NFSM_BUILD(tl, u_int32_t *, 2 * NFSX_UNSIGNED); *tl++ = newnfs_false; if (eofflag) *tl = newnfs_true; else *tl = newnfs_false; FREE((caddr_t)rbuf, M_TEMP); FREE((caddr_t)cookies, M_TEMP); out: NFSEXITCODE2(0, nd); return (0); nfsmout: vput(vp); NFSEXITCODE2(error, nd); return (error); } /* * Readdirplus for V3 and Readdir for V4. */ int nfsrvd_readdirplus(struct nfsrv_descript *nd, int isdgram, struct vnode *vp, struct thread *p, struct nfsexstuff *exp) { struct dirent *dp; u_int32_t *tl; int dirlen; char *cpos, *cend, *rbuf; struct vnode *nvp; fhandle_t nfh; struct nfsvattr nva, at, *nvap = &nva; struct mbuf *mb0, *mb1; struct nfsreferral *refp; int nlen, r, error = 0, getret = 1, usevget = 1; int siz, cnt, fullsiz, eofflag, ncookies, entrycnt; caddr_t bpos0, bpos1; u_int64_t off, toff, verf; u_long *cookies = NULL, *cookiep; nfsattrbit_t attrbits, rderrbits, savbits; struct uio io; struct iovec iv; struct componentname cn; int at_root, is_ufs, is_zfs, needs_unbusy, supports_nfsv4acls; struct mount *mp, *new_mp; uint64_t mounted_on_fileno; if (nd->nd_repstat) { nfsrv_postopattr(nd, getret, &at); goto out; } NFSM_DISSECT(tl, u_int32_t *, 6 * NFSX_UNSIGNED); off = fxdr_hyper(tl); toff = off; tl += 2; verf = fxdr_hyper(tl); tl += 2; siz = fxdr_unsigned(int, *tl++); cnt = fxdr_unsigned(int, *tl); /* * Use the server's maximum data transfer size as the upper bound * on reply datalen. */ if (cnt > NFS_SRVMAXDATA(nd) || cnt < 0) cnt = NFS_SRVMAXDATA(nd); /* * siz is a "hint" of how much directory information (name, fileid, * cookie) should be in the reply. At least one client "hints" 0, * so I set it to cnt for that case. I also round it up to the * next multiple of DIRBLKSIZ. */ if (siz <= 0) siz = cnt; siz = ((siz + DIRBLKSIZ - 1) & ~(DIRBLKSIZ - 1)); if (nd->nd_flag & ND_NFSV4) { error = nfsrv_getattrbits(nd, &attrbits, NULL, NULL); if (error) goto nfsmout; NFSSET_ATTRBIT(&savbits, &attrbits); NFSCLRNOTFILLABLE_ATTRBIT(&attrbits); NFSZERO_ATTRBIT(&rderrbits); NFSSETBIT_ATTRBIT(&rderrbits, NFSATTRBIT_RDATTRERROR); } else { NFSZERO_ATTRBIT(&attrbits); } fullsiz = siz; nd->nd_repstat = getret = nfsvno_getattr(vp, &at, nd->nd_cred, p, 1); if (!nd->nd_repstat) { if (off && verf != at.na_filerev) { /* * va_filerev is not sufficient as a cookie verifier, * since it is not supposed to change when entries are * removed/added unless that offset cookies returned to * the client are no longer valid. */ #if 0 if (nd->nd_flag & ND_NFSV4) { nd->nd_repstat = NFSERR_NOTSAME; } else { nd->nd_repstat = NFSERR_BAD_COOKIE; } #endif } else if ((nd->nd_flag & ND_NFSV4) && off == 0 && verf != 0) { nd->nd_repstat = NFSERR_BAD_COOKIE; } } if (!nd->nd_repstat && vp->v_type != VDIR) nd->nd_repstat = NFSERR_NOTDIR; if (!nd->nd_repstat && cnt == 0) nd->nd_repstat = NFSERR_TOOSMALL; if (!nd->nd_repstat) nd->nd_repstat = nfsvno_accchk(vp, VEXEC, nd->nd_cred, exp, p, NFSACCCHK_NOOVERRIDE, NFSACCCHK_VPISLOCKED, NULL); if (nd->nd_repstat) { vput(vp); if (nd->nd_flag & ND_NFSV3) nfsrv_postopattr(nd, getret, &at); goto out; } is_ufs = strcmp(vp->v_mount->mnt_vfc->vfc_name, "ufs") == 0; is_zfs = strcmp(vp->v_mount->mnt_vfc->vfc_name, "zfs") == 0; MALLOC(rbuf, caddr_t, siz, M_TEMP, M_WAITOK); again: eofflag = 0; if (cookies) { free((caddr_t)cookies, M_TEMP); cookies = NULL; } iv.iov_base = rbuf; iv.iov_len = siz; io.uio_iov = &iv; io.uio_iovcnt = 1; io.uio_offset = (off_t)off; io.uio_resid = siz; io.uio_segflg = UIO_SYSSPACE; io.uio_rw = UIO_READ; io.uio_td = NULL; nd->nd_repstat = VOP_READDIR(vp, &io, nd->nd_cred, &eofflag, &ncookies, &cookies); off = (u_int64_t)io.uio_offset; if (io.uio_resid) siz -= io.uio_resid; getret = nfsvno_getattr(vp, &at, nd->nd_cred, p, 1); if (!cookies && !nd->nd_repstat) nd->nd_repstat = NFSERR_PERM; if (!nd->nd_repstat) nd->nd_repstat = getret; if (nd->nd_repstat) { vput(vp); if (cookies) free((caddr_t)cookies, M_TEMP); free((caddr_t)rbuf, M_TEMP); if (nd->nd_flag & ND_NFSV3) nfsrv_postopattr(nd, getret, &at); goto out; } /* * If nothing read, return eof * rpc reply */ if (siz == 0) { vput(vp); if (nd->nd_flag & ND_NFSV3) nfsrv_postopattr(nd, getret, &at); NFSM_BUILD(tl, u_int32_t *, 4 * NFSX_UNSIGNED); txdr_hyper(at.na_filerev, tl); tl += 2; *tl++ = newnfs_false; *tl = newnfs_true; free((caddr_t)cookies, M_TEMP); free((caddr_t)rbuf, M_TEMP); goto out; } /* * Check for degenerate cases of nothing useful read. * If so go try again */ cpos = rbuf; cend = rbuf + siz; dp = (struct dirent *)cpos; cookiep = cookies; /* * For some reason FreeBSD's ufs_readdir() chooses to back the * directory offset up to a block boundary, so it is necessary to * skip over the records that precede the requested offset. This * requires the assumption that file offset cookies monotonically * increase. */ while (cpos < cend && ncookies > 0 && (dp->d_fileno == 0 || dp->d_type == DT_WHT || (is_ufs == 1 && ((u_quad_t)(*cookiep)) <= toff) || ((nd->nd_flag & ND_NFSV4) && ((dp->d_namlen == 1 && dp->d_name[0] == '.') || (dp->d_namlen==2 && dp->d_name[0]=='.' && dp->d_name[1]=='.'))))) { cpos += dp->d_reclen; dp = (struct dirent *)cpos; cookiep++; ncookies--; } if (cpos >= cend || ncookies == 0) { siz = fullsiz; toff = off; goto again; } /* * Busy the file system so that the mount point won't go away * and, as such, VFS_VGET() can be used safely. */ mp = vp->v_mount; vfs_ref(mp); NFSVOPUNLOCK(vp, 0); nd->nd_repstat = vfs_busy(mp, 0); vfs_rel(mp); if (nd->nd_repstat != 0) { vrele(vp); free(cookies, M_TEMP); free(rbuf, M_TEMP); if (nd->nd_flag & ND_NFSV3) nfsrv_postopattr(nd, getret, &at); goto out; } /* * Check to see if entries in this directory can be safely acquired * via VFS_VGET() or if a switch to VOP_LOOKUP() is required. * ZFS snapshot directories need VOP_LOOKUP(), so that any * automount of the snapshot directory that is required will * be done. * This needs to be done here for NFSv4, since NFSv4 never does * a VFS_VGET() for "." or "..". */ if (is_zfs == 1) { r = VFS_VGET(mp, at.na_fileid, LK_SHARED, &nvp); if (r == EOPNOTSUPP) { usevget = 0; cn.cn_nameiop = LOOKUP; cn.cn_lkflags = LK_SHARED | LK_RETRY; cn.cn_cred = nd->nd_cred; cn.cn_thread = p; } else if (r == 0) vput(nvp); } /* * Save this position, in case there is an error before one entry * is created. */ mb0 = nd->nd_mb; bpos0 = nd->nd_bpos; /* * Fill in the first part of the reply. * dirlen is the reply length in bytes and cannot exceed cnt. * (Include the two booleans at the end of the reply in dirlen now, * so we recognize when we have exceeded cnt.) */ if (nd->nd_flag & ND_NFSV3) { dirlen = NFSX_V3POSTOPATTR + NFSX_VERF + 2 * NFSX_UNSIGNED; nfsrv_postopattr(nd, getret, &at); } else { dirlen = NFSX_VERF + 2 * NFSX_UNSIGNED; } NFSM_BUILD(tl, u_int32_t *, NFSX_VERF); txdr_hyper(at.na_filerev, tl); /* * Save this position, in case there is an empty reply needed. */ mb1 = nd->nd_mb; bpos1 = nd->nd_bpos; /* Loop through the records and build reply */ entrycnt = 0; while (cpos < cend && ncookies > 0 && dirlen < cnt) { nlen = dp->d_namlen; if (dp->d_fileno != 0 && dp->d_type != DT_WHT && nlen <= NFS_MAXNAMLEN && ((nd->nd_flag & ND_NFSV3) || nlen > 2 || (nlen==2 && (dp->d_name[0]!='.' || dp->d_name[1]!='.')) || (nlen == 1 && dp->d_name[0] != '.'))) { /* * Save the current position in the reply, in case * this entry exceeds cnt. */ mb1 = nd->nd_mb; bpos1 = nd->nd_bpos; /* * For readdir_and_lookup get the vnode using * the file number. */ nvp = NULL; refp = NULL; r = 0; at_root = 0; needs_unbusy = 0; new_mp = mp; mounted_on_fileno = (uint64_t)dp->d_fileno; if ((nd->nd_flag & ND_NFSV3) || NFSNONZERO_ATTRBIT(&savbits)) { if (nd->nd_flag & ND_NFSV4) refp = nfsv4root_getreferral(NULL, vp, dp->d_fileno); if (refp == NULL) { if (usevget) r = VFS_VGET(mp, dp->d_fileno, LK_SHARED, &nvp); else r = EOPNOTSUPP; if (r == EOPNOTSUPP) { if (usevget) { usevget = 0; cn.cn_nameiop = LOOKUP; cn.cn_lkflags = LK_SHARED | LK_RETRY; cn.cn_cred = nd->nd_cred; cn.cn_thread = p; } cn.cn_nameptr = dp->d_name; cn.cn_namelen = nlen; cn.cn_flags = ISLASTCN | NOFOLLOW | LOCKLEAF; if (nlen == 2 && dp->d_name[0] == '.' && dp->d_name[1] == '.') cn.cn_flags |= ISDOTDOT; if (NFSVOPLOCK(vp, LK_SHARED) != 0) { nd->nd_repstat = EPERM; break; } if ((vp->v_vflag & VV_ROOT) != 0 && (cn.cn_flags & ISDOTDOT) != 0) { vref(vp); nvp = vp; r = 0; } else { r = VOP_LOOKUP(vp, &nvp, &cn); if (vp != nvp) NFSVOPUNLOCK(vp, 0); } } /* * For NFSv4, check to see if nvp is * a mount point and get the mount * point vnode, as required. */ if (r == 0 && nfsrv_enable_crossmntpt != 0 && (nd->nd_flag & ND_NFSV4) != 0 && nvp->v_type == VDIR && nvp->v_mountedhere != NULL) { new_mp = nvp->v_mountedhere; r = vfs_busy(new_mp, 0); vput(nvp); nvp = NULL; if (r == 0) { r = VFS_ROOT(new_mp, LK_SHARED, &nvp); needs_unbusy = 1; if (r == 0) at_root = 1; } } } if (!r) { if (refp == NULL && ((nd->nd_flag & ND_NFSV3) || NFSNONZERO_ATTRBIT(&attrbits))) { r = nfsvno_getfh(nvp, &nfh, p); if (!r) r = nfsvno_getattr(nvp, nvap, nd->nd_cred, p, 1); if (r == 0 && is_zfs == 1 && nfsrv_enable_crossmntpt != 0 && (nd->nd_flag & ND_NFSV4) != 0 && nvp->v_type == VDIR && vp->v_mount != nvp->v_mount) { /* * For a ZFS snapshot, there is a * pseudo mount that does not set * v_mountedhere, so it needs to * be detected via a different * mount structure. */ at_root = 1; if (new_mp == mp) new_mp = nvp->v_mount; } } } else { nvp = NULL; } if (r) { if (!NFSISSET_ATTRBIT(&attrbits, NFSATTRBIT_RDATTRERROR)) { if (nvp != NULL) vput(nvp); if (needs_unbusy != 0) vfs_unbusy(new_mp); nd->nd_repstat = r; break; } } } /* * Build the directory record xdr */ if (nd->nd_flag & ND_NFSV3) { NFSM_BUILD(tl, u_int32_t *, 3 * NFSX_UNSIGNED); *tl++ = newnfs_true; *tl++ = 0; *tl = txdr_unsigned(dp->d_fileno); dirlen += nfsm_strtom(nd, dp->d_name, nlen); NFSM_BUILD(tl, u_int32_t *, 2 * NFSX_UNSIGNED); *tl++ = 0; *tl = txdr_unsigned(*cookiep); nfsrv_postopattr(nd, 0, nvap); dirlen += nfsm_fhtom(nd,(u_int8_t *)&nfh,0,1); dirlen += (5*NFSX_UNSIGNED+NFSX_V3POSTOPATTR); if (nvp != NULL) vput(nvp); } else { NFSM_BUILD(tl, u_int32_t *, 3 * NFSX_UNSIGNED); *tl++ = newnfs_true; *tl++ = 0; *tl = txdr_unsigned(*cookiep); dirlen += nfsm_strtom(nd, dp->d_name, nlen); if (nvp != NULL) { supports_nfsv4acls = nfs_supportsnfsv4acls(nvp); NFSVOPUNLOCK(nvp, 0); } else supports_nfsv4acls = 0; if (refp != NULL) { dirlen += nfsrv_putreferralattr(nd, &savbits, refp, 0, &nd->nd_repstat); if (nd->nd_repstat) { if (nvp != NULL) vrele(nvp); if (needs_unbusy != 0) vfs_unbusy(new_mp); break; } } else if (r) { dirlen += nfsvno_fillattr(nd, new_mp, nvp, nvap, &nfh, r, &rderrbits, nd->nd_cred, p, isdgram, 0, supports_nfsv4acls, at_root, mounted_on_fileno); } else { dirlen += nfsvno_fillattr(nd, new_mp, nvp, nvap, &nfh, r, &attrbits, nd->nd_cred, p, isdgram, 0, supports_nfsv4acls, at_root, mounted_on_fileno); } if (nvp != NULL) vrele(nvp); dirlen += (3 * NFSX_UNSIGNED); } if (needs_unbusy != 0) vfs_unbusy(new_mp); if (dirlen <= cnt) entrycnt++; } cpos += dp->d_reclen; dp = (struct dirent *)cpos; cookiep++; ncookies--; } vrele(vp); vfs_unbusy(mp); /* * If dirlen > cnt, we must strip off the last entry. If that * results in an empty reply, report NFSERR_TOOSMALL. */ if (dirlen > cnt || nd->nd_repstat) { if (!nd->nd_repstat && entrycnt == 0) nd->nd_repstat = NFSERR_TOOSMALL; if (nd->nd_repstat) { newnfs_trimtrailing(nd, mb0, bpos0); if (nd->nd_flag & ND_NFSV3) nfsrv_postopattr(nd, getret, &at); } else newnfs_trimtrailing(nd, mb1, bpos1); eofflag = 0; } else if (cpos < cend) eofflag = 0; if (!nd->nd_repstat) { NFSM_BUILD(tl, u_int32_t *, 2 * NFSX_UNSIGNED); *tl++ = newnfs_false; if (eofflag) *tl = newnfs_true; else *tl = newnfs_false; } FREE((caddr_t)cookies, M_TEMP); FREE((caddr_t)rbuf, M_TEMP); out: NFSEXITCODE2(0, nd); return (0); nfsmout: vput(vp); NFSEXITCODE2(error, nd); return (error); } /* * Get the settable attributes out of the mbuf list. * (Return 0 or EBADRPC) */ int nfsrv_sattr(struct nfsrv_descript *nd, vnode_t vp, struct nfsvattr *nvap, nfsattrbit_t *attrbitp, NFSACL_T *aclp, struct thread *p) { u_int32_t *tl; struct nfsv2_sattr *sp; int error = 0, toclient = 0; switch (nd->nd_flag & (ND_NFSV2 | ND_NFSV3 | ND_NFSV4)) { case ND_NFSV2: NFSM_DISSECT(sp, struct nfsv2_sattr *, NFSX_V2SATTR); /* * Some old clients didn't fill in the high order 16bits. * --> check the low order 2 bytes for 0xffff */ if ((fxdr_unsigned(int, sp->sa_mode) & 0xffff) != 0xffff) nvap->na_mode = nfstov_mode(sp->sa_mode); if (sp->sa_uid != newnfs_xdrneg1) nvap->na_uid = fxdr_unsigned(uid_t, sp->sa_uid); if (sp->sa_gid != newnfs_xdrneg1) nvap->na_gid = fxdr_unsigned(gid_t, sp->sa_gid); if (sp->sa_size != newnfs_xdrneg1) nvap->na_size = fxdr_unsigned(u_quad_t, sp->sa_size); if (sp->sa_atime.nfsv2_sec != newnfs_xdrneg1) { #ifdef notyet fxdr_nfsv2time(&sp->sa_atime, &nvap->na_atime); #else nvap->na_atime.tv_sec = fxdr_unsigned(u_int32_t,sp->sa_atime.nfsv2_sec); nvap->na_atime.tv_nsec = 0; #endif } if (sp->sa_mtime.nfsv2_sec != newnfs_xdrneg1) fxdr_nfsv2time(&sp->sa_mtime, &nvap->na_mtime); break; case ND_NFSV3: NFSM_DISSECT(tl, u_int32_t *, NFSX_UNSIGNED); if (*tl == newnfs_true) { NFSM_DISSECT(tl, u_int32_t *, NFSX_UNSIGNED); nvap->na_mode = nfstov_mode(*tl); } NFSM_DISSECT(tl, u_int32_t *, NFSX_UNSIGNED); if (*tl == newnfs_true) { NFSM_DISSECT(tl, u_int32_t *, NFSX_UNSIGNED); nvap->na_uid = fxdr_unsigned(uid_t, *tl); } NFSM_DISSECT(tl, u_int32_t *, NFSX_UNSIGNED); if (*tl == newnfs_true) { NFSM_DISSECT(tl, u_int32_t *, NFSX_UNSIGNED); nvap->na_gid = fxdr_unsigned(gid_t, *tl); } NFSM_DISSECT(tl, u_int32_t *, NFSX_UNSIGNED); if (*tl == newnfs_true) { NFSM_DISSECT(tl, u_int32_t *, 2 * NFSX_UNSIGNED); nvap->na_size = fxdr_hyper(tl); } NFSM_DISSECT(tl, u_int32_t *, NFSX_UNSIGNED); switch (fxdr_unsigned(int, *tl)) { case NFSV3SATTRTIME_TOCLIENT: NFSM_DISSECT(tl, u_int32_t *, 2 * NFSX_UNSIGNED); fxdr_nfsv3time(tl, &nvap->na_atime); toclient = 1; break; case NFSV3SATTRTIME_TOSERVER: vfs_timestamp(&nvap->na_atime); nvap->na_vaflags |= VA_UTIMES_NULL; break; }; NFSM_DISSECT(tl, u_int32_t *, NFSX_UNSIGNED); switch (fxdr_unsigned(int, *tl)) { case NFSV3SATTRTIME_TOCLIENT: NFSM_DISSECT(tl, u_int32_t *, 2 * NFSX_UNSIGNED); fxdr_nfsv3time(tl, &nvap->na_mtime); nvap->na_vaflags &= ~VA_UTIMES_NULL; break; case NFSV3SATTRTIME_TOSERVER: vfs_timestamp(&nvap->na_mtime); if (!toclient) nvap->na_vaflags |= VA_UTIMES_NULL; break; }; break; case ND_NFSV4: error = nfsv4_sattr(nd, vp, nvap, attrbitp, aclp, p); }; nfsmout: NFSEXITCODE2(error, nd); return (error); } /* * Handle the setable attributes for V4. * Returns NFSERR_BADXDR if it can't be parsed, 0 otherwise. */ int nfsv4_sattr(struct nfsrv_descript *nd, vnode_t vp, struct nfsvattr *nvap, nfsattrbit_t *attrbitp, NFSACL_T *aclp, struct thread *p) { u_int32_t *tl; int attrsum = 0; int i, j; int error, attrsize, bitpos, aclsize, aceerr, retnotsup = 0; int toclient = 0; u_char *cp, namestr[NFSV4_SMALLSTR + 1]; uid_t uid; gid_t gid; error = nfsrv_getattrbits(nd, attrbitp, NULL, &retnotsup); if (error) goto nfsmout; NFSM_DISSECT(tl, u_int32_t *, NFSX_UNSIGNED); attrsize = fxdr_unsigned(int, *tl); /* * Loop around getting the setable attributes. If an unsupported * one is found, set nd_repstat == NFSERR_ATTRNOTSUPP and return. */ if (retnotsup) { nd->nd_repstat = NFSERR_ATTRNOTSUPP; bitpos = NFSATTRBIT_MAX; } else { bitpos = 0; } for (; bitpos < NFSATTRBIT_MAX; bitpos++) { if (attrsum > attrsize) { error = NFSERR_BADXDR; goto nfsmout; } if (NFSISSET_ATTRBIT(attrbitp, bitpos)) switch (bitpos) { case NFSATTRBIT_SIZE: NFSM_DISSECT(tl, u_int32_t *, NFSX_HYPER); if (vp != NULL && vp->v_type != VREG) { error = (vp->v_type == VDIR) ? NFSERR_ISDIR : NFSERR_INVAL; goto nfsmout; } nvap->na_size = fxdr_hyper(tl); attrsum += NFSX_HYPER; break; case NFSATTRBIT_ACL: error = nfsrv_dissectacl(nd, aclp, &aceerr, &aclsize, p); if (error) goto nfsmout; if (aceerr && !nd->nd_repstat) nd->nd_repstat = aceerr; attrsum += aclsize; break; case NFSATTRBIT_ARCHIVE: NFSM_DISSECT(tl, u_int32_t *, NFSX_UNSIGNED); if (!nd->nd_repstat) nd->nd_repstat = NFSERR_ATTRNOTSUPP; attrsum += NFSX_UNSIGNED; break; case NFSATTRBIT_HIDDEN: NFSM_DISSECT(tl, u_int32_t *, NFSX_UNSIGNED); if (!nd->nd_repstat) nd->nd_repstat = NFSERR_ATTRNOTSUPP; attrsum += NFSX_UNSIGNED; break; case NFSATTRBIT_MIMETYPE: NFSM_DISSECT(tl, u_int32_t *, NFSX_UNSIGNED); i = fxdr_unsigned(int, *tl); error = nfsm_advance(nd, NFSM_RNDUP(i), -1); if (error) goto nfsmout; if (!nd->nd_repstat) nd->nd_repstat = NFSERR_ATTRNOTSUPP; attrsum += (NFSX_UNSIGNED + NFSM_RNDUP(i)); break; case NFSATTRBIT_MODE: NFSM_DISSECT(tl, u_int32_t *, NFSX_UNSIGNED); nvap->na_mode = nfstov_mode(*tl); attrsum += NFSX_UNSIGNED; break; case NFSATTRBIT_OWNER: NFSM_DISSECT(tl, u_int32_t *, NFSX_UNSIGNED); j = fxdr_unsigned(int, *tl); if (j < 0) { error = NFSERR_BADXDR; goto nfsmout; } if (j > NFSV4_SMALLSTR) cp = malloc(j + 1, M_NFSSTRING, M_WAITOK); else cp = namestr; error = nfsrv_mtostr(nd, cp, j); if (error) { if (j > NFSV4_SMALLSTR) free(cp, M_NFSSTRING); goto nfsmout; } if (!nd->nd_repstat) { nd->nd_repstat = nfsv4_strtouid(nd, cp, j, &uid, p); if (!nd->nd_repstat) nvap->na_uid = uid; } if (j > NFSV4_SMALLSTR) free(cp, M_NFSSTRING); attrsum += (NFSX_UNSIGNED + NFSM_RNDUP(j)); break; case NFSATTRBIT_OWNERGROUP: NFSM_DISSECT(tl, u_int32_t *, NFSX_UNSIGNED); j = fxdr_unsigned(int, *tl); if (j < 0) { error = NFSERR_BADXDR; goto nfsmout; } if (j > NFSV4_SMALLSTR) cp = malloc(j + 1, M_NFSSTRING, M_WAITOK); else cp = namestr; error = nfsrv_mtostr(nd, cp, j); if (error) { if (j > NFSV4_SMALLSTR) free(cp, M_NFSSTRING); goto nfsmout; } if (!nd->nd_repstat) { nd->nd_repstat = nfsv4_strtogid(nd, cp, j, &gid, p); if (!nd->nd_repstat) nvap->na_gid = gid; } if (j > NFSV4_SMALLSTR) free(cp, M_NFSSTRING); attrsum += (NFSX_UNSIGNED + NFSM_RNDUP(j)); break; case NFSATTRBIT_SYSTEM: NFSM_DISSECT(tl, u_int32_t *, NFSX_UNSIGNED); if (!nd->nd_repstat) nd->nd_repstat = NFSERR_ATTRNOTSUPP; attrsum += NFSX_UNSIGNED; break; case NFSATTRBIT_TIMEACCESSSET: NFSM_DISSECT(tl, u_int32_t *, NFSX_UNSIGNED); attrsum += NFSX_UNSIGNED; if (fxdr_unsigned(int, *tl)==NFSV4SATTRTIME_TOCLIENT) { NFSM_DISSECT(tl, u_int32_t *, NFSX_V4TIME); fxdr_nfsv4time(tl, &nvap->na_atime); toclient = 1; attrsum += NFSX_V4TIME; } else { vfs_timestamp(&nvap->na_atime); nvap->na_vaflags |= VA_UTIMES_NULL; } break; case NFSATTRBIT_TIMEBACKUP: NFSM_DISSECT(tl, u_int32_t *, NFSX_V4TIME); if (!nd->nd_repstat) nd->nd_repstat = NFSERR_ATTRNOTSUPP; attrsum += NFSX_V4TIME; break; case NFSATTRBIT_TIMECREATE: NFSM_DISSECT(tl, u_int32_t *, NFSX_V4TIME); if (!nd->nd_repstat) nd->nd_repstat = NFSERR_ATTRNOTSUPP; attrsum += NFSX_V4TIME; break; case NFSATTRBIT_TIMEMODIFYSET: NFSM_DISSECT(tl, u_int32_t *, NFSX_UNSIGNED); attrsum += NFSX_UNSIGNED; if (fxdr_unsigned(int, *tl)==NFSV4SATTRTIME_TOCLIENT) { NFSM_DISSECT(tl, u_int32_t *, NFSX_V4TIME); fxdr_nfsv4time(tl, &nvap->na_mtime); nvap->na_vaflags &= ~VA_UTIMES_NULL; attrsum += NFSX_V4TIME; } else { vfs_timestamp(&nvap->na_mtime); if (!toclient) nvap->na_vaflags |= VA_UTIMES_NULL; } break; default: nd->nd_repstat = NFSERR_ATTRNOTSUPP; /* * set bitpos so we drop out of the loop. */ bitpos = NFSATTRBIT_MAX; break; }; } /* * some clients pad the attrlist, so we need to skip over the * padding. */ if (attrsum > attrsize) { error = NFSERR_BADXDR; } else { attrsize = NFSM_RNDUP(attrsize); if (attrsum < attrsize) error = nfsm_advance(nd, attrsize - attrsum, -1); } nfsmout: NFSEXITCODE2(error, nd); return (error); } /* * Check/setup export credentials. */ int nfsd_excred(struct nfsrv_descript *nd, struct nfsexstuff *exp, struct ucred *credanon) { int error = 0; /* * Check/setup credentials. */ if (nd->nd_flag & ND_GSS) exp->nes_exflag &= ~MNT_EXPORTANON; /* * Check to see if the operation is allowed for this security flavor. * RFC2623 suggests that the NFSv3 Fsinfo RPC be allowed to * AUTH_NONE or AUTH_SYS for file systems requiring RPCSEC_GSS. * Also, allow Secinfo, so that it can acquire the correct flavor(s). */ if (nfsvno_testexp(nd, exp) && nd->nd_procnum != NFSV4OP_SECINFO && nd->nd_procnum != NFSPROC_FSINFO) { if (nd->nd_flag & ND_NFSV4) error = NFSERR_WRONGSEC; else error = (NFSERR_AUTHERR | AUTH_TOOWEAK); goto out; } /* * Check to see if the file system is exported V4 only. */ if (NFSVNO_EXV4ONLY(exp) && !(nd->nd_flag & ND_NFSV4)) { error = NFSERR_PROGNOTV4; goto out; } /* * Now, map the user credentials. * (Note that ND_AUTHNONE will only be set for an NFSv3 * Fsinfo RPC. If set for anything else, this code might need * to change.) */ if (NFSVNO_EXPORTED(exp) && ((!(nd->nd_flag & ND_GSS) && nd->nd_cred->cr_uid == 0) || NFSVNO_EXPORTANON(exp) || (nd->nd_flag & ND_AUTHNONE))) { nd->nd_cred->cr_uid = credanon->cr_uid; nd->nd_cred->cr_gid = credanon->cr_gid; crsetgroups(nd->nd_cred, credanon->cr_ngroups, credanon->cr_groups); } out: NFSEXITCODE2(error, nd); return (error); } /* * Check exports. */ int nfsvno_checkexp(struct mount *mp, struct sockaddr *nam, struct nfsexstuff *exp, struct ucred **credp) { int i, error, *secflavors; error = VFS_CHECKEXP(mp, nam, &exp->nes_exflag, credp, &exp->nes_numsecflavor, &secflavors); if (error) { if (nfs_rootfhset) { exp->nes_exflag = 0; exp->nes_numsecflavor = 0; error = 0; } } else { /* Copy the security flavors. */ for (i = 0; i < exp->nes_numsecflavor; i++) exp->nes_secflavors[i] = secflavors[i]; } NFSEXITCODE(error); return (error); } /* * Get a vnode for a file handle and export stuff. */ int nfsvno_fhtovp(struct mount *mp, fhandle_t *fhp, struct sockaddr *nam, int lktype, struct vnode **vpp, struct nfsexstuff *exp, struct ucred **credp) { int i, error, *secflavors; *credp = NULL; exp->nes_numsecflavor = 0; error = VFS_FHTOVP(mp, &fhp->fh_fid, lktype, vpp); if (error != 0) /* Make sure the server replies ESTALE to the client. */ error = ESTALE; if (nam && !error) { error = VFS_CHECKEXP(mp, nam, &exp->nes_exflag, credp, &exp->nes_numsecflavor, &secflavors); if (error) { if (nfs_rootfhset) { exp->nes_exflag = 0; exp->nes_numsecflavor = 0; error = 0; } else { vput(*vpp); } } else { /* Copy the security flavors. */ for (i = 0; i < exp->nes_numsecflavor; i++) exp->nes_secflavors[i] = secflavors[i]; } } NFSEXITCODE(error); return (error); } /* * nfsd_fhtovp() - convert a fh to a vnode ptr * - look up fsid in mount list (if not found ret error) * - get vp and export rights by calling nfsvno_fhtovp() * - if cred->cr_uid == 0 or MNT_EXPORTANON set it to credanon * for AUTH_SYS * - if mpp != NULL, return the mount point so that it can * be used for vn_finished_write() by the caller */ void nfsd_fhtovp(struct nfsrv_descript *nd, struct nfsrvfh *nfp, int lktype, struct vnode **vpp, struct nfsexstuff *exp, struct mount **mpp, int startwrite, struct thread *p) { struct mount *mp; struct ucred *credanon; fhandle_t *fhp; fhp = (fhandle_t *)nfp->nfsrvfh_data; /* * Check for the special case of the nfsv4root_fh. */ mp = vfs_busyfs(&fhp->fh_fsid); if (mpp != NULL) *mpp = mp; if (mp == NULL) { *vpp = NULL; nd->nd_repstat = ESTALE; goto out; } if (startwrite) { vn_start_write(NULL, mpp, V_WAIT); if (lktype == LK_SHARED && !(MNT_SHARED_WRITES(mp))) lktype = LK_EXCLUSIVE; } nd->nd_repstat = nfsvno_fhtovp(mp, fhp, nd->nd_nam, lktype, vpp, exp, &credanon); vfs_unbusy(mp); /* * For NFSv4 without a pseudo root fs, unexported file handles * can be returned, so that Lookup works everywhere. */ if (!nd->nd_repstat && exp->nes_exflag == 0 && !(nd->nd_flag & ND_NFSV4)) { vput(*vpp); nd->nd_repstat = EACCES; } /* * Personally, I've never seen any point in requiring a * reserved port#, since only in the rare case where the * clients are all boxes with secure system priviledges, * does it provide any enhanced security, but... some people * believe it to be useful and keep putting this code back in. * (There is also some "security checker" out there that * complains if the nfs server doesn't enforce this.) * However, note the following: * RFC3530 (NFSv4) specifies that a reserved port# not be * required. * RFC2623 recommends that, if a reserved port# is checked for, * that there be a way to turn that off--> ifdef'd. */ #ifdef NFS_REQRSVPORT if (!nd->nd_repstat) { struct sockaddr_in *saddr; struct sockaddr_in6 *saddr6; saddr = NFSSOCKADDR(nd->nd_nam, struct sockaddr_in *); saddr6 = NFSSOCKADDR(nd->nd_nam, struct sockaddr_in6 *); if (!(nd->nd_flag & ND_NFSV4) && ((saddr->sin_family == AF_INET && ntohs(saddr->sin_port) >= IPPORT_RESERVED) || (saddr6->sin6_family == AF_INET6 && ntohs(saddr6->sin6_port) >= IPPORT_RESERVED))) { vput(*vpp); nd->nd_repstat = (NFSERR_AUTHERR | AUTH_TOOWEAK); } } #endif /* NFS_REQRSVPORT */ /* * Check/setup credentials. */ if (!nd->nd_repstat) { nd->nd_saveduid = nd->nd_cred->cr_uid; nd->nd_repstat = nfsd_excred(nd, exp, credanon); if (nd->nd_repstat) vput(*vpp); } if (credanon != NULL) crfree(credanon); if (nd->nd_repstat) { if (startwrite) vn_finished_write(mp); *vpp = NULL; if (mpp != NULL) *mpp = NULL; } out: NFSEXITCODE2(0, nd); } /* * glue for fp. */ static int fp_getfvp(struct thread *p, int fd, struct file **fpp, struct vnode **vpp) { struct filedesc *fdp; struct file *fp; int error = 0; fdp = p->td_proc->p_fd; if (fd < 0 || fd >= fdp->fd_nfiles || (fp = fdp->fd_ofiles[fd].fde_file) == NULL) { error = EBADF; goto out; } *fpp = fp; out: NFSEXITCODE(error); return (error); } /* * Called from nfssvc() to update the exports list. Just call * vfs_export(). This has to be done, since the v4 root fake fs isn't * in the mount list. */ int nfsrv_v4rootexport(void *argp, struct ucred *cred, struct thread *p) { struct nfsex_args *nfsexargp = (struct nfsex_args *)argp; int error = 0; struct nameidata nd; fhandle_t fh; error = vfs_export(&nfsv4root_mnt, &nfsexargp->export); if ((nfsexargp->export.ex_flags & MNT_DELEXPORT) != 0) nfs_rootfhset = 0; else if (error == 0) { if (nfsexargp->fspec == NULL) { error = EPERM; goto out; } /* * If fspec != NULL, this is the v4root path. */ NDINIT(&nd, LOOKUP, FOLLOW, UIO_USERSPACE, nfsexargp->fspec, p); if ((error = namei(&nd)) != 0) goto out; error = nfsvno_getfh(nd.ni_vp, &fh, p); vrele(nd.ni_vp); if (!error) { nfs_rootfh.nfsrvfh_len = NFSX_MYFH; NFSBCOPY((caddr_t)&fh, nfs_rootfh.nfsrvfh_data, sizeof (fhandle_t)); nfs_rootfhset = 1; } } out: NFSEXITCODE(error); return (error); } /* * This function needs to test to see if the system is near its limit * for memory allocation via malloc() or mget() and return True iff * either of these resources are near their limit. * XXX (For now, this is just a stub.) */ int nfsrv_testmalloclimit = 0; int nfsrv_mallocmget_limit(void) { static int printmesg = 0; static int testval = 1; if (nfsrv_testmalloclimit && (testval++ % 1000) == 0) { if ((printmesg++ % 100) == 0) printf("nfsd: malloc/mget near limit\n"); return (1); } return (0); } /* * BSD specific initialization of a mount point. */ void nfsd_mntinit(void) { static int inited = 0; if (inited) return; inited = 1; nfsv4root_mnt.mnt_flag = (MNT_RDONLY | MNT_EXPORTED); TAILQ_INIT(&nfsv4root_mnt.mnt_nvnodelist); TAILQ_INIT(&nfsv4root_mnt.mnt_activevnodelist); nfsv4root_mnt.mnt_export = NULL; TAILQ_INIT(&nfsv4root_opt); TAILQ_INIT(&nfsv4root_newopt); nfsv4root_mnt.mnt_opt = &nfsv4root_opt; nfsv4root_mnt.mnt_optnew = &nfsv4root_newopt; nfsv4root_mnt.mnt_nvnodelistsize = 0; nfsv4root_mnt.mnt_activevnodelistsize = 0; } /* * Get a vnode for a file handle, without checking exports, etc. */ struct vnode * nfsvno_getvp(fhandle_t *fhp) { struct mount *mp; struct vnode *vp; int error; mp = vfs_busyfs(&fhp->fh_fsid); if (mp == NULL) return (NULL); error = VFS_FHTOVP(mp, &fhp->fh_fid, LK_EXCLUSIVE, &vp); vfs_unbusy(mp); if (error) return (NULL); return (vp); } /* * Do a local VOP_ADVLOCK(). */ int nfsvno_advlock(struct vnode *vp, int ftype, u_int64_t first, u_int64_t end, struct thread *td) { int error = 0; struct flock fl; u_int64_t tlen; if (nfsrv_dolocallocks == 0) goto out; ASSERT_VOP_UNLOCKED(vp, "nfsvno_advlock: vp locked"); fl.l_whence = SEEK_SET; fl.l_type = ftype; fl.l_start = (off_t)first; if (end == NFS64BITSSET) { fl.l_len = 0; } else { tlen = end - first; fl.l_len = (off_t)tlen; } /* * For FreeBSD8, the l_pid and l_sysid must be set to the same * values for all calls, so that all locks will be held by the * nfsd server. (The nfsd server handles conflicts between the * various clients.) * Since an NFSv4 lockowner is a ClientID plus an array of up to 1024 * bytes, so it can't be put in l_sysid. */ if (nfsv4_sysid == 0) nfsv4_sysid = nlm_acquire_next_sysid(); fl.l_pid = (pid_t)0; fl.l_sysid = (int)nfsv4_sysid; if (ftype == F_UNLCK) error = VOP_ADVLOCK(vp, (caddr_t)td->td_proc, F_UNLCK, &fl, (F_POSIX | F_REMOTE)); else error = VOP_ADVLOCK(vp, (caddr_t)td->td_proc, F_SETLK, &fl, (F_POSIX | F_REMOTE)); out: NFSEXITCODE(error); return (error); } /* * Check the nfsv4 root exports. */ int nfsvno_v4rootexport(struct nfsrv_descript *nd) { struct ucred *credanon; int exflags, error = 0, numsecflavor, *secflavors, i; error = vfs_stdcheckexp(&nfsv4root_mnt, nd->nd_nam, &exflags, &credanon, &numsecflavor, &secflavors); if (error) { error = NFSERR_PROGUNAVAIL; goto out; } if (credanon != NULL) crfree(credanon); for (i = 0; i < numsecflavor; i++) { if (secflavors[i] == AUTH_SYS) nd->nd_flag |= ND_EXAUTHSYS; else if (secflavors[i] == RPCSEC_GSS_KRB5) nd->nd_flag |= ND_EXGSS; else if (secflavors[i] == RPCSEC_GSS_KRB5I) nd->nd_flag |= ND_EXGSSINTEGRITY; else if (secflavors[i] == RPCSEC_GSS_KRB5P) nd->nd_flag |= ND_EXGSSPRIVACY; } out: NFSEXITCODE(error); return (error); } /* * Nfs server psuedo system call for the nfsd's */ /* * MPSAFE */ static int nfssvc_nfsd(struct thread *td, struct nfssvc_args *uap) { struct file *fp; struct nfsd_addsock_args sockarg; struct nfsd_nfsd_args nfsdarg; cap_rights_t rights; int error; if (uap->flag & NFSSVC_NFSDADDSOCK) { error = copyin(uap->argp, (caddr_t)&sockarg, sizeof (sockarg)); if (error) goto out; /* * Since we don't know what rights might be required, * pretend that we need them all. It is better to be too * careful than too reckless. */ error = fget(td, sockarg.sock, cap_rights_init(&rights, CAP_SOCK_SERVER), &fp); if (error != 0) goto out; if (fp->f_type != DTYPE_SOCKET) { fdrop(fp, td); error = EPERM; goto out; } error = nfsrvd_addsock(fp); fdrop(fp, td); } else if (uap->flag & NFSSVC_NFSDNFSD) { if (uap->argp == NULL) { error = EINVAL; goto out; } error = copyin(uap->argp, (caddr_t)&nfsdarg, sizeof (nfsdarg)); if (error) goto out; error = nfsrvd_nfsd(td, &nfsdarg); } else { error = nfssvc_srvcall(td, uap, td->td_ucred); } out: NFSEXITCODE(error); return (error); } static int nfssvc_srvcall(struct thread *p, struct nfssvc_args *uap, struct ucred *cred) { struct nfsex_args export; struct file *fp = NULL; int stablefd, len; struct nfsd_clid adminrevoke; struct nfsd_dumplist dumplist; struct nfsd_dumpclients *dumpclients; struct nfsd_dumplocklist dumplocklist; struct nfsd_dumplocks *dumplocks; struct nameidata nd; vnode_t vp; int error = EINVAL, igotlock; struct proc *procp; static int suspend_nfsd = 0; if (uap->flag & NFSSVC_PUBLICFH) { NFSBZERO((caddr_t)&nfs_pubfh.nfsrvfh_data, sizeof (fhandle_t)); error = copyin(uap->argp, &nfs_pubfh.nfsrvfh_data, sizeof (fhandle_t)); if (!error) nfs_pubfhset = 1; } else if (uap->flag & NFSSVC_V4ROOTEXPORT) { error = copyin(uap->argp,(caddr_t)&export, sizeof (struct nfsex_args)); if (!error) error = nfsrv_v4rootexport(&export, cred, p); } else if (uap->flag & NFSSVC_NOPUBLICFH) { nfs_pubfhset = 0; error = 0; } else if (uap->flag & NFSSVC_STABLERESTART) { error = copyin(uap->argp, (caddr_t)&stablefd, sizeof (int)); if (!error) error = fp_getfvp(p, stablefd, &fp, &vp); if (!error && (NFSFPFLAG(fp) & (FREAD | FWRITE)) != (FREAD | FWRITE)) error = EBADF; if (!error && newnfs_numnfsd != 0) error = EPERM; if (!error) { nfsrv_stablefirst.nsf_fp = fp; nfsrv_setupstable(p); } } else if (uap->flag & NFSSVC_ADMINREVOKE) { error = copyin(uap->argp, (caddr_t)&adminrevoke, sizeof (struct nfsd_clid)); if (!error) error = nfsrv_adminrevoke(&adminrevoke, p); } else if (uap->flag & NFSSVC_DUMPCLIENTS) { error = copyin(uap->argp, (caddr_t)&dumplist, sizeof (struct nfsd_dumplist)); if (!error && (dumplist.ndl_size < 1 || dumplist.ndl_size > NFSRV_MAXDUMPLIST)) error = EPERM; if (!error) { len = sizeof (struct nfsd_dumpclients) * dumplist.ndl_size; dumpclients = (struct nfsd_dumpclients *)malloc(len, M_TEMP, M_WAITOK); nfsrv_dumpclients(dumpclients, dumplist.ndl_size); error = copyout(dumpclients, CAST_USER_ADDR_T(dumplist.ndl_list), len); free((caddr_t)dumpclients, M_TEMP); } } else if (uap->flag & NFSSVC_DUMPLOCKS) { error = copyin(uap->argp, (caddr_t)&dumplocklist, sizeof (struct nfsd_dumplocklist)); if (!error && (dumplocklist.ndllck_size < 1 || dumplocklist.ndllck_size > NFSRV_MAXDUMPLIST)) error = EPERM; if (!error) error = nfsrv_lookupfilename(&nd, dumplocklist.ndllck_fname, p); if (!error) { len = sizeof (struct nfsd_dumplocks) * dumplocklist.ndllck_size; dumplocks = (struct nfsd_dumplocks *)malloc(len, M_TEMP, M_WAITOK); nfsrv_dumplocks(nd.ni_vp, dumplocks, dumplocklist.ndllck_size, p); vput(nd.ni_vp); error = copyout(dumplocks, CAST_USER_ADDR_T(dumplocklist.ndllck_list), len); free((caddr_t)dumplocks, M_TEMP); } } else if (uap->flag & NFSSVC_BACKUPSTABLE) { procp = p->td_proc; PROC_LOCK(procp); nfsd_master_pid = procp->p_pid; bcopy(procp->p_comm, nfsd_master_comm, MAXCOMLEN + 1); nfsd_master_start = procp->p_stats->p_start; nfsd_master_proc = procp; PROC_UNLOCK(procp); } else if ((uap->flag & NFSSVC_SUSPENDNFSD) != 0) { NFSLOCKV4ROOTMUTEX(); if (suspend_nfsd == 0) { /* Lock out all nfsd threads */ do { igotlock = nfsv4_lock(&nfsd_suspend_lock, 1, NULL, NFSV4ROOTLOCKMUTEXPTR, NULL); } while (igotlock == 0 && suspend_nfsd == 0); suspend_nfsd = 1; } NFSUNLOCKV4ROOTMUTEX(); error = 0; } else if ((uap->flag & NFSSVC_RESUMENFSD) != 0) { NFSLOCKV4ROOTMUTEX(); if (suspend_nfsd != 0) { nfsv4_unlock(&nfsd_suspend_lock, 0); suspend_nfsd = 0; } NFSUNLOCKV4ROOTMUTEX(); error = 0; } NFSEXITCODE(error); return (error); } /* * Check exports. * Returns 0 if ok, 1 otherwise. */ int nfsvno_testexp(struct nfsrv_descript *nd, struct nfsexstuff *exp) { int i; /* * This seems odd, but allow the case where the security flavor * list is empty. This happens when NFSv4 is traversing non-exported * file systems. Exported file systems should always have a non-empty * security flavor list. */ if (exp->nes_numsecflavor == 0) return (0); for (i = 0; i < exp->nes_numsecflavor; i++) { /* * The tests for privacy and integrity must be first, * since ND_GSS is set for everything but AUTH_SYS. */ if (exp->nes_secflavors[i] == RPCSEC_GSS_KRB5P && (nd->nd_flag & ND_GSSPRIVACY)) return (0); if (exp->nes_secflavors[i] == RPCSEC_GSS_KRB5I && (nd->nd_flag & ND_GSSINTEGRITY)) return (0); if (exp->nes_secflavors[i] == RPCSEC_GSS_KRB5 && (nd->nd_flag & ND_GSS)) return (0); if (exp->nes_secflavors[i] == AUTH_SYS && (nd->nd_flag & ND_GSS) == 0) return (0); } return (1); } /* * Calculate a hash value for the fid in a file handle. */ uint32_t nfsrv_hashfh(fhandle_t *fhp) { uint32_t hashval; hashval = hash32_buf(&fhp->fh_fid, sizeof(struct fid), 0); return (hashval); } /* * Calculate a hash value for the sessionid. */ uint32_t nfsrv_hashsessionid(uint8_t *sessionid) { uint32_t hashval; hashval = hash32_buf(sessionid, NFSX_V4SESSIONID, 0); return (hashval); } /* * Signal the userland master nfsd to backup the stable restart file. */ void nfsrv_backupstable(void) { struct proc *procp; if (nfsd_master_proc != NULL) { procp = pfind(nfsd_master_pid); /* Try to make sure it is the correct process. */ if (procp == nfsd_master_proc && procp->p_stats->p_start.tv_sec == nfsd_master_start.tv_sec && procp->p_stats->p_start.tv_usec == nfsd_master_start.tv_usec && strcmp(procp->p_comm, nfsd_master_comm) == 0) kern_psignal(procp, SIGUSR2); else nfsd_master_proc = NULL; if (procp != NULL) PROC_UNLOCK(procp); } } extern int (*nfsd_call_nfsd)(struct thread *, struct nfssvc_args *); /* * Called once to initialize data structures... */ static int nfsd_modevent(module_t mod, int type, void *data) { int error = 0, i; static int loaded = 0; switch (type) { case MOD_LOAD: if (loaded) goto out; newnfs_portinit(); for (i = 0; i < NFSRVCACHE_HASHSIZE; i++) { mtx_init(&nfsrchash_table[i].mtx, "nfsrtc", NULL, MTX_DEF); mtx_init(&nfsrcahash_table[i].mtx, "nfsrtca", NULL, MTX_DEF); } mtx_init(&nfsrc_udpmtx, "nfsuc", NULL, MTX_DEF); mtx_init(&nfs_v4root_mutex, "nfs4rt", NULL, MTX_DEF); mtx_init(&nfsv4root_mnt.mnt_mtx, "nfs4mnt", NULL, MTX_DEF); for (i = 0; i < NFSSESSIONHASHSIZE; i++) mtx_init(&nfssessionhash[i].mtx, "nfssm", NULL, MTX_DEF); lockinit(&nfsv4root_mnt.mnt_explock, PVFS, "explock", 0, 0); nfsrvd_initcache(); nfsd_init(); NFSD_LOCK(); nfsrvd_init(0); NFSD_UNLOCK(); nfsd_mntinit(); #ifdef VV_DISABLEDELEG vn_deleg_ops.vndeleg_recall = nfsd_recalldelegation; vn_deleg_ops.vndeleg_disable = nfsd_disabledelegation; #endif nfsd_call_servertimer = nfsrv_servertimer; nfsd_call_nfsd = nfssvc_nfsd; loaded = 1; break; case MOD_UNLOAD: if (newnfs_numnfsd != 0) { error = EBUSY; break; } #ifdef VV_DISABLEDELEG vn_deleg_ops.vndeleg_recall = NULL; vn_deleg_ops.vndeleg_disable = NULL; #endif nfsd_call_servertimer = NULL; nfsd_call_nfsd = NULL; /* Clean out all NFSv4 state. */ nfsrv_throwawayallstate(curthread); /* Clean the NFS server reply cache */ nfsrvd_cleancache(); /* Free up the krpc server pool. */ if (nfsrvd_pool != NULL) svcpool_destroy(nfsrvd_pool); /* and get rid of the locks */ for (i = 0; i < NFSRVCACHE_HASHSIZE; i++) { mtx_destroy(&nfsrchash_table[i].mtx); mtx_destroy(&nfsrcahash_table[i].mtx); } mtx_destroy(&nfsrc_udpmtx); mtx_destroy(&nfs_v4root_mutex); mtx_destroy(&nfsv4root_mnt.mnt_mtx); for (i = 0; i < NFSSESSIONHASHSIZE; i++) mtx_destroy(&nfssessionhash[i].mtx); lockdestroy(&nfsv4root_mnt.mnt_explock); loaded = 0; break; default: error = EOPNOTSUPP; break; } out: NFSEXITCODE(error); return (error); } static moduledata_t nfsd_mod = { "nfsd", nfsd_modevent, NULL, }; DECLARE_MODULE(nfsd, nfsd_mod, SI_SUB_VFS, SI_ORDER_ANY); /* So that loader and kldload(2) can find us, wherever we are.. */ MODULE_VERSION(nfsd, 1); MODULE_DEPEND(nfsd, nfscommon, 1, 1, 1); MODULE_DEPEND(nfsd, nfslock, 1, 1, 1); MODULE_DEPEND(nfsd, nfslockd, 1, 1, 1); MODULE_DEPEND(nfsd, krpc, 1, 1, 1); MODULE_DEPEND(nfsd, nfssvc, 1, 1, 1); Index: user/ngie/more-tests/sys/fs/nullfs/null_vfsops.c =================================================================== --- user/ngie/more-tests/sys/fs/nullfs/null_vfsops.c (revision 281584) +++ user/ngie/more-tests/sys/fs/nullfs/null_vfsops.c (revision 281585) @@ -1,456 +1,456 @@ /*- * Copyright (c) 1992, 1993, 1995 * The Regents of the University of California. All rights reserved. * * This code is derived from software donated to Berkeley by * Jan-Simon Pendry. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 4. Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * @(#)null_vfsops.c 8.2 (Berkeley) 1/21/94 * * @(#)lofs_vfsops.c 1.2 (Berkeley) 6/18/92 * $FreeBSD$ */ /* * Null Layer * (See null_vnops.c for a description of what this does.) */ #include #include #include #include #include #include #include #include #include #include #include #include static MALLOC_DEFINE(M_NULLFSMNT, "nullfs_mount", "NULLFS mount structure"); static vfs_fhtovp_t nullfs_fhtovp; static vfs_mount_t nullfs_mount; static vfs_quotactl_t nullfs_quotactl; static vfs_root_t nullfs_root; static vfs_sync_t nullfs_sync; static vfs_statfs_t nullfs_statfs; static vfs_unmount_t nullfs_unmount; static vfs_vget_t nullfs_vget; static vfs_extattrctl_t nullfs_extattrctl; /* * Mount null layer */ static int nullfs_mount(struct mount *mp) { int error = 0; struct vnode *lowerrootvp, *vp; struct vnode *nullm_rootvp; struct null_mount *xmp; struct thread *td = curthread; char *target; int isvnunlocked = 0, len; struct nameidata nd, *ndp = &nd; NULLFSDEBUG("nullfs_mount(mp = %p)\n", (void *)mp); if (!prison_allow(td->td_ucred, PR_ALLOW_MOUNT_NULLFS)) return (EPERM); if (mp->mnt_flag & MNT_ROOTFS) return (EOPNOTSUPP); /* * Update is a no-op */ if (mp->mnt_flag & MNT_UPDATE) { /* * Only support update mounts for NFS export. */ if (vfs_flagopt(mp->mnt_optnew, "export", NULL, 0)) return (0); else return (EOPNOTSUPP); } /* * Get argument */ error = vfs_getopt(mp->mnt_optnew, "target", (void **)&target, &len); if (error || target[len - 1] != '\0') return (EINVAL); /* * Unlock lower node to avoid possible deadlock. */ if ((mp->mnt_vnodecovered->v_op == &null_vnodeops) && VOP_ISLOCKED(mp->mnt_vnodecovered) == LK_EXCLUSIVE) { VOP_UNLOCK(mp->mnt_vnodecovered, 0); isvnunlocked = 1; } /* * Find lower node */ NDINIT(ndp, LOOKUP, FOLLOW|LOCKLEAF, UIO_SYSSPACE, target, curthread); error = namei(ndp); /* * Re-lock vnode. * XXXKIB This is deadlock-prone as well. */ if (isvnunlocked) vn_lock(mp->mnt_vnodecovered, LK_EXCLUSIVE | LK_RETRY); if (error) return (error); NDFREE(ndp, NDF_ONLY_PNBUF); /* * Sanity check on lower vnode */ lowerrootvp = ndp->ni_vp; /* * Check multi null mount to avoid `lock against myself' panic. */ if (lowerrootvp == VTONULL(mp->mnt_vnodecovered)->null_lowervp) { NULLFSDEBUG("nullfs_mount: multi null mount?\n"); vput(lowerrootvp); return (EDEADLK); } xmp = (struct null_mount *) malloc(sizeof(struct null_mount), M_NULLFSMNT, M_WAITOK | M_ZERO); /* * Save reference to underlying FS */ xmp->nullm_vfs = lowerrootvp->v_mount; /* * Save reference. Each mount also holds * a reference on the root vnode. */ error = null_nodeget(mp, lowerrootvp, &vp); /* * Make sure the node alias worked */ if (error) { free(xmp, M_NULLFSMNT); return (error); } /* * Keep a held reference to the root vnode. * It is vrele'd in nullfs_unmount. */ nullm_rootvp = vp; nullm_rootvp->v_vflag |= VV_ROOT; xmp->nullm_rootvp = nullm_rootvp; /* * Unlock the node (either the lower or the alias) */ VOP_UNLOCK(vp, 0); if (NULLVPTOLOWERVP(nullm_rootvp)->v_mount->mnt_flag & MNT_LOCAL) { MNT_ILOCK(mp); mp->mnt_flag |= MNT_LOCAL; MNT_IUNLOCK(mp); } xmp->nullm_flags |= NULLM_CACHE; if (vfs_getopt(mp->mnt_optnew, "nocache", NULL, NULL) == 0) xmp->nullm_flags &= ~NULLM_CACHE; MNT_ILOCK(mp); if ((xmp->nullm_flags & NULLM_CACHE) != 0) { mp->mnt_kern_flag |= lowerrootvp->v_mount->mnt_kern_flag & (MNTK_SHARED_WRITES | MNTK_LOOKUP_SHARED | MNTK_EXTENDED_SHARED); } mp->mnt_kern_flag |= MNTK_LOOKUP_EXCL_DOTDOT; mp->mnt_kern_flag |= lowerrootvp->v_mount->mnt_kern_flag & - MNTK_SUSPENDABLE; + (MNTK_SUSPENDABLE | MNTK_USES_BCACHE); MNT_IUNLOCK(mp); mp->mnt_data = xmp; vfs_getnewfsid(mp); if ((xmp->nullm_flags & NULLM_CACHE) != 0) { MNT_ILOCK(xmp->nullm_vfs); TAILQ_INSERT_TAIL(&xmp->nullm_vfs->mnt_uppers, mp, mnt_upper_link); MNT_IUNLOCK(xmp->nullm_vfs); } vfs_mountedfrom(mp, target); NULLFSDEBUG("nullfs_mount: lower %s, alias at %s\n", mp->mnt_stat.f_mntfromname, mp->mnt_stat.f_mntonname); return (0); } /* * Free reference to null layer */ static int nullfs_unmount(mp, mntflags) struct mount *mp; int mntflags; { struct null_mount *mntdata; struct mount *ump; int error, flags; NULLFSDEBUG("nullfs_unmount: mp = %p\n", (void *)mp); if (mntflags & MNT_FORCE) flags = FORCECLOSE; else flags = 0; /* There is 1 extra root vnode reference (nullm_rootvp). */ error = vflush(mp, 1, flags, curthread); if (error) return (error); /* * Finally, throw away the null_mount structure */ mntdata = mp->mnt_data; ump = mntdata->nullm_vfs; if ((mntdata->nullm_flags & NULLM_CACHE) != 0) { MNT_ILOCK(ump); while ((ump->mnt_kern_flag & MNTK_VGONE_UPPER) != 0) { ump->mnt_kern_flag |= MNTK_VGONE_WAITER; msleep(&ump->mnt_uppers, &ump->mnt_mtx, 0, "vgnupw", 0); } TAILQ_REMOVE(&ump->mnt_uppers, mp, mnt_upper_link); MNT_IUNLOCK(ump); } mp->mnt_data = NULL; free(mntdata, M_NULLFSMNT); return (0); } static int nullfs_root(mp, flags, vpp) struct mount *mp; int flags; struct vnode **vpp; { struct vnode *vp; NULLFSDEBUG("nullfs_root(mp = %p, vp = %p->%p)\n", (void *)mp, (void *)MOUNTTONULLMOUNT(mp)->nullm_rootvp, (void *)NULLVPTOLOWERVP(MOUNTTONULLMOUNT(mp)->nullm_rootvp)); /* * Return locked reference to root. */ vp = MOUNTTONULLMOUNT(mp)->nullm_rootvp; VREF(vp); ASSERT_VOP_UNLOCKED(vp, "root vnode is locked"); vn_lock(vp, flags | LK_RETRY); *vpp = vp; return 0; } static int nullfs_quotactl(mp, cmd, uid, arg) struct mount *mp; int cmd; uid_t uid; void *arg; { return VFS_QUOTACTL(MOUNTTONULLMOUNT(mp)->nullm_vfs, cmd, uid, arg); } static int nullfs_statfs(mp, sbp) struct mount *mp; struct statfs *sbp; { int error; struct statfs mstat; NULLFSDEBUG("nullfs_statfs(mp = %p, vp = %p->%p)\n", (void *)mp, (void *)MOUNTTONULLMOUNT(mp)->nullm_rootvp, (void *)NULLVPTOLOWERVP(MOUNTTONULLMOUNT(mp)->nullm_rootvp)); bzero(&mstat, sizeof(mstat)); error = VFS_STATFS(MOUNTTONULLMOUNT(mp)->nullm_vfs, &mstat); if (error) return (error); /* now copy across the "interesting" information and fake the rest */ sbp->f_type = mstat.f_type; sbp->f_flags = (sbp->f_flags & (MNT_RDONLY | MNT_NOEXEC | MNT_NOSUID | MNT_UNION | MNT_NOSYMFOLLOW)) | (mstat.f_flags & ~MNT_ROOTFS); sbp->f_bsize = mstat.f_bsize; sbp->f_iosize = mstat.f_iosize; sbp->f_blocks = mstat.f_blocks; sbp->f_bfree = mstat.f_bfree; sbp->f_bavail = mstat.f_bavail; sbp->f_files = mstat.f_files; sbp->f_ffree = mstat.f_ffree; return (0); } static int nullfs_sync(mp, waitfor) struct mount *mp; int waitfor; { /* * XXX - Assumes no data cached at null layer. */ return (0); } static int nullfs_vget(mp, ino, flags, vpp) struct mount *mp; ino_t ino; int flags; struct vnode **vpp; { int error; KASSERT((flags & LK_TYPE_MASK) != 0, ("nullfs_vget: no lock requested")); error = VFS_VGET(MOUNTTONULLMOUNT(mp)->nullm_vfs, ino, flags, vpp); if (error != 0) return (error); return (null_nodeget(mp, *vpp, vpp)); } static int nullfs_fhtovp(mp, fidp, flags, vpp) struct mount *mp; struct fid *fidp; int flags; struct vnode **vpp; { int error; error = VFS_FHTOVP(MOUNTTONULLMOUNT(mp)->nullm_vfs, fidp, flags, vpp); if (error != 0) return (error); return (null_nodeget(mp, *vpp, vpp)); } static int nullfs_extattrctl(mp, cmd, filename_vp, namespace, attrname) struct mount *mp; int cmd; struct vnode *filename_vp; int namespace; const char *attrname; { return (VFS_EXTATTRCTL(MOUNTTONULLMOUNT(mp)->nullm_vfs, cmd, filename_vp, namespace, attrname)); } static void nullfs_reclaim_lowervp(struct mount *mp, struct vnode *lowervp) { struct vnode *vp; vp = null_hashget(mp, lowervp); if (vp == NULL) return; VTONULL(vp)->null_flags |= NULLV_NOUNLOCK; vgone(vp); vput(vp); } static void nullfs_unlink_lowervp(struct mount *mp, struct vnode *lowervp) { struct vnode *vp; struct null_node *xp; vp = null_hashget(mp, lowervp); if (vp == NULL) return; xp = VTONULL(vp); xp->null_flags |= NULLV_DROP | NULLV_NOUNLOCK; vhold(vp); vunref(vp); if (vp->v_usecount == 0) { /* * If vunref() dropped the last use reference on the * nullfs vnode, it must be reclaimed, and its lock * was split from the lower vnode lock. Need to do * extra unlock before allowing the final vdrop() to * free the vnode. */ KASSERT((vp->v_iflag & VI_DOOMED) != 0, ("not reclaimed nullfs vnode %p", vp)); VOP_UNLOCK(vp, 0); } else { /* * Otherwise, the nullfs vnode still shares the lock * with the lower vnode, and must not be unlocked. * Also clear the NULLV_NOUNLOCK, the flag is not * relevant for future reclamations. */ ASSERT_VOP_ELOCKED(vp, "unlink_lowervp"); KASSERT((vp->v_iflag & VI_DOOMED) == 0, ("reclaimed nullfs vnode %p", vp)); xp->null_flags &= ~NULLV_NOUNLOCK; } vdrop(vp); } static struct vfsops null_vfsops = { .vfs_extattrctl = nullfs_extattrctl, .vfs_fhtovp = nullfs_fhtovp, .vfs_init = nullfs_init, .vfs_mount = nullfs_mount, .vfs_quotactl = nullfs_quotactl, .vfs_root = nullfs_root, .vfs_statfs = nullfs_statfs, .vfs_sync = nullfs_sync, .vfs_uninit = nullfs_uninit, .vfs_unmount = nullfs_unmount, .vfs_vget = nullfs_vget, .vfs_reclaim_lowervp = nullfs_reclaim_lowervp, .vfs_unlink_lowervp = nullfs_unlink_lowervp, }; VFS_SET(null_vfsops, nullfs, VFCF_LOOPBACK | VFCF_JAIL); Index: user/ngie/more-tests/sys/kern/imgact_elf.c =================================================================== --- user/ngie/more-tests/sys/kern/imgact_elf.c (revision 281584) +++ user/ngie/more-tests/sys/kern/imgact_elf.c (revision 281585) @@ -1,2157 +1,2158 @@ /*- * Copyright (c) 2000 David O'Brien * Copyright (c) 1995-1996 Søren Schmidt * Copyright (c) 1996 Peter Wemm * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer * in this position and unchanged. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. The name of the author may not be used to endorse or promote products * derived from this software without specific prior written permission * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. */ #include __FBSDID("$FreeBSD$"); #include "opt_capsicum.h" #include "opt_compat.h" #include "opt_gzio.h" #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #define ELF_NOTE_ROUNDSIZE 4 #define OLD_EI_BRAND 8 static int __elfN(check_header)(const Elf_Ehdr *hdr); static Elf_Brandinfo *__elfN(get_brandinfo)(struct image_params *imgp, const char *interp, int interp_name_len, int32_t *osrel); static int __elfN(load_file)(struct proc *p, const char *file, u_long *addr, u_long *entry, size_t pagesize); static int __elfN(load_section)(struct image_params *imgp, vm_offset_t offset, caddr_t vmaddr, size_t memsz, size_t filsz, vm_prot_t prot, size_t pagesize); static int __CONCAT(exec_, __elfN(imgact))(struct image_params *imgp); static boolean_t __elfN(freebsd_trans_osrel)(const Elf_Note *note, int32_t *osrel); static boolean_t kfreebsd_trans_osrel(const Elf_Note *note, int32_t *osrel); static boolean_t __elfN(check_note)(struct image_params *imgp, Elf_Brandnote *checknote, int32_t *osrel); static vm_prot_t __elfN(trans_prot)(Elf_Word); static Elf_Word __elfN(untrans_prot)(vm_prot_t); SYSCTL_NODE(_kern, OID_AUTO, __CONCAT(elf, __ELF_WORD_SIZE), CTLFLAG_RW, 0, ""); #define CORE_BUF_SIZE (16 * 1024) int __elfN(fallback_brand) = -1; SYSCTL_INT(__CONCAT(_kern_elf, __ELF_WORD_SIZE), OID_AUTO, fallback_brand, CTLFLAG_RWTUN, &__elfN(fallback_brand), 0, __XSTRING(__CONCAT(ELF, __ELF_WORD_SIZE)) " brand of last resort"); static int elf_legacy_coredump = 0; SYSCTL_INT(_debug, OID_AUTO, __elfN(legacy_coredump), CTLFLAG_RW, &elf_legacy_coredump, 0, ""); int __elfN(nxstack) = #if defined(__amd64__) || defined(__powerpc64__) /* both 64 and 32 bit */ 1; #else 0; #endif SYSCTL_INT(__CONCAT(_kern_elf, __ELF_WORD_SIZE), OID_AUTO, nxstack, CTLFLAG_RW, &__elfN(nxstack), 0, __XSTRING(__CONCAT(ELF, __ELF_WORD_SIZE)) ": enable non-executable stack"); #if __ELF_WORD_SIZE == 32 #if defined(__amd64__) int i386_read_exec = 0; SYSCTL_INT(_kern_elf32, OID_AUTO, read_exec, CTLFLAG_RW, &i386_read_exec, 0, "enable execution from readable segments"); #endif #endif static Elf_Brandinfo *elf_brand_list[MAX_BRANDS]; #define trunc_page_ps(va, ps) ((va) & ~(ps - 1)) #define round_page_ps(va, ps) (((va) + (ps - 1)) & ~(ps - 1)) #define aligned(a, t) (trunc_page_ps((u_long)(a), sizeof(t)) == (u_long)(a)) static const char FREEBSD_ABI_VENDOR[] = "FreeBSD"; Elf_Brandnote __elfN(freebsd_brandnote) = { .hdr.n_namesz = sizeof(FREEBSD_ABI_VENDOR), .hdr.n_descsz = sizeof(int32_t), .hdr.n_type = 1, .vendor = FREEBSD_ABI_VENDOR, .flags = BN_TRANSLATE_OSREL, .trans_osrel = __elfN(freebsd_trans_osrel) }; static boolean_t __elfN(freebsd_trans_osrel)(const Elf_Note *note, int32_t *osrel) { uintptr_t p; p = (uintptr_t)(note + 1); p += roundup2(note->n_namesz, ELF_NOTE_ROUNDSIZE); *osrel = *(const int32_t *)(p); return (TRUE); } static const char GNU_ABI_VENDOR[] = "GNU"; static int GNU_KFREEBSD_ABI_DESC = 3; Elf_Brandnote __elfN(kfreebsd_brandnote) = { .hdr.n_namesz = sizeof(GNU_ABI_VENDOR), .hdr.n_descsz = 16, /* XXX at least 16 */ .hdr.n_type = 1, .vendor = GNU_ABI_VENDOR, .flags = BN_TRANSLATE_OSREL, .trans_osrel = kfreebsd_trans_osrel }; static boolean_t kfreebsd_trans_osrel(const Elf_Note *note, int32_t *osrel) { const Elf32_Word *desc; uintptr_t p; p = (uintptr_t)(note + 1); p += roundup2(note->n_namesz, ELF_NOTE_ROUNDSIZE); desc = (const Elf32_Word *)p; if (desc[0] != GNU_KFREEBSD_ABI_DESC) return (FALSE); /* * Debian GNU/kFreeBSD embed the earliest compatible kernel version * (__FreeBSD_version: Rxx) in the LSB way. */ *osrel = desc[1] * 100000 + desc[2] * 1000 + desc[3]; return (TRUE); } int __elfN(insert_brand_entry)(Elf_Brandinfo *entry) { int i; for (i = 0; i < MAX_BRANDS; i++) { if (elf_brand_list[i] == NULL) { elf_brand_list[i] = entry; break; } } if (i == MAX_BRANDS) { printf("WARNING: %s: could not insert brandinfo entry: %p\n", __func__, entry); return (-1); } return (0); } int __elfN(remove_brand_entry)(Elf_Brandinfo *entry) { int i; for (i = 0; i < MAX_BRANDS; i++) { if (elf_brand_list[i] == entry) { elf_brand_list[i] = NULL; break; } } if (i == MAX_BRANDS) return (-1); return (0); } int __elfN(brand_inuse)(Elf_Brandinfo *entry) { struct proc *p; int rval = FALSE; sx_slock(&allproc_lock); FOREACH_PROC_IN_SYSTEM(p) { if (p->p_sysent == entry->sysvec) { rval = TRUE; break; } } sx_sunlock(&allproc_lock); return (rval); } static Elf_Brandinfo * __elfN(get_brandinfo)(struct image_params *imgp, const char *interp, int interp_name_len, int32_t *osrel) { const Elf_Ehdr *hdr = (const Elf_Ehdr *)imgp->image_header; Elf_Brandinfo *bi; boolean_t ret; int i; /* * We support four types of branding -- (1) the ELF EI_OSABI field * that SCO added to the ELF spec, (2) FreeBSD 3.x's traditional string * branding w/in the ELF header, (3) path of the `interp_path' * field, and (4) the ".note.ABI-tag" ELF section. */ /* Look for an ".note.ABI-tag" ELF section */ for (i = 0; i < MAX_BRANDS; i++) { bi = elf_brand_list[i]; if (bi == NULL) continue; if (hdr->e_machine == bi->machine && (bi->flags & (BI_BRAND_NOTE|BI_BRAND_NOTE_MANDATORY)) != 0) { ret = __elfN(check_note)(imgp, bi->brand_note, osrel); if (ret) return (bi); } } /* If the executable has a brand, search for it in the brand list. */ for (i = 0; i < MAX_BRANDS; i++) { bi = elf_brand_list[i]; if (bi == NULL || bi->flags & BI_BRAND_NOTE_MANDATORY) continue; if (hdr->e_machine == bi->machine && (hdr->e_ident[EI_OSABI] == bi->brand || strncmp((const char *)&hdr->e_ident[OLD_EI_BRAND], bi->compat_3_brand, strlen(bi->compat_3_brand)) == 0)) return (bi); } /* No known brand, see if the header is recognized by any brand */ for (i = 0; i < MAX_BRANDS; i++) { bi = elf_brand_list[i]; if (bi == NULL || bi->flags & BI_BRAND_NOTE_MANDATORY || bi->header_supported == NULL) continue; if (hdr->e_machine == bi->machine) { ret = bi->header_supported(imgp); if (ret) return (bi); } } /* Lacking a known brand, search for a recognized interpreter. */ if (interp != NULL) { for (i = 0; i < MAX_BRANDS; i++) { bi = elf_brand_list[i]; if (bi == NULL || bi->flags & BI_BRAND_NOTE_MANDATORY) continue; if (hdr->e_machine == bi->machine && /* ELF image p_filesz includes terminating zero */ strlen(bi->interp_path) + 1 == interp_name_len && strncmp(interp, bi->interp_path, interp_name_len) == 0) return (bi); } } /* Lacking a recognized interpreter, try the default brand */ for (i = 0; i < MAX_BRANDS; i++) { bi = elf_brand_list[i]; if (bi == NULL || bi->flags & BI_BRAND_NOTE_MANDATORY) continue; if (hdr->e_machine == bi->machine && __elfN(fallback_brand) == bi->brand) return (bi); } return (NULL); } static int __elfN(check_header)(const Elf_Ehdr *hdr) { Elf_Brandinfo *bi; int i; if (!IS_ELF(*hdr) || hdr->e_ident[EI_CLASS] != ELF_TARG_CLASS || hdr->e_ident[EI_DATA] != ELF_TARG_DATA || hdr->e_ident[EI_VERSION] != EV_CURRENT || hdr->e_phentsize != sizeof(Elf_Phdr) || hdr->e_version != ELF_TARG_VER) return (ENOEXEC); /* * Make sure we have at least one brand for this machine. */ for (i = 0; i < MAX_BRANDS; i++) { bi = elf_brand_list[i]; if (bi != NULL && bi->machine == hdr->e_machine) break; } if (i == MAX_BRANDS) return (ENOEXEC); return (0); } static int __elfN(map_partial)(vm_map_t map, vm_object_t object, vm_ooffset_t offset, vm_offset_t start, vm_offset_t end, vm_prot_t prot) { struct sf_buf *sf; int error; vm_offset_t off; /* * Create the page if it doesn't exist yet. Ignore errors. */ vm_map_lock(map); vm_map_insert(map, NULL, 0, trunc_page(start), round_page(end), VM_PROT_ALL, VM_PROT_ALL, 0); vm_map_unlock(map); /* * Find the page from the underlying object. */ if (object) { sf = vm_imgact_map_page(object, offset); if (sf == NULL) return (KERN_FAILURE); off = offset - trunc_page(offset); error = copyout((caddr_t)sf_buf_kva(sf) + off, (caddr_t)start, end - start); vm_imgact_unmap_page(sf); if (error) { return (KERN_FAILURE); } } return (KERN_SUCCESS); } static int __elfN(map_insert)(vm_map_t map, vm_object_t object, vm_ooffset_t offset, vm_offset_t start, vm_offset_t end, vm_prot_t prot, int cow) { struct sf_buf *sf; vm_offset_t off; vm_size_t sz; int error, rv; if (start != trunc_page(start)) { rv = __elfN(map_partial)(map, object, offset, start, round_page(start), prot); if (rv) return (rv); offset += round_page(start) - start; start = round_page(start); } if (end != round_page(end)) { rv = __elfN(map_partial)(map, object, offset + trunc_page(end) - start, trunc_page(end), end, prot); if (rv) return (rv); end = trunc_page(end); } if (end > start) { if (offset & PAGE_MASK) { /* * The mapping is not page aligned. This means we have * to copy the data. Sigh. */ rv = vm_map_find(map, NULL, 0, &start, end - start, 0, VMFS_NO_SPACE, prot | VM_PROT_WRITE, VM_PROT_ALL, 0); if (rv) return (rv); if (object == NULL) return (KERN_SUCCESS); for (; start < end; start += sz) { sf = vm_imgact_map_page(object, offset); if (sf == NULL) return (KERN_FAILURE); off = offset - trunc_page(offset); sz = end - start; if (sz > PAGE_SIZE - off) sz = PAGE_SIZE - off; error = copyout((caddr_t)sf_buf_kva(sf) + off, (caddr_t)start, sz); vm_imgact_unmap_page(sf); if (error) { return (KERN_FAILURE); } offset += sz; } rv = KERN_SUCCESS; } else { vm_object_reference(object); vm_map_lock(map); rv = vm_map_insert(map, object, offset, start, end, prot, VM_PROT_ALL, cow); vm_map_unlock(map); if (rv != KERN_SUCCESS) vm_object_deallocate(object); } return (rv); } else { return (KERN_SUCCESS); } } static int __elfN(load_section)(struct image_params *imgp, vm_offset_t offset, caddr_t vmaddr, size_t memsz, size_t filsz, vm_prot_t prot, size_t pagesize) { struct sf_buf *sf; size_t map_len; vm_map_t map; vm_object_t object; vm_offset_t map_addr; int error, rv, cow; size_t copy_len; vm_offset_t file_addr; /* * It's necessary to fail if the filsz + offset taken from the * header is greater than the actual file pager object's size. * If we were to allow this, then the vm_map_find() below would * walk right off the end of the file object and into the ether. * * While I'm here, might as well check for something else that * is invalid: filsz cannot be greater than memsz. */ if ((off_t)filsz + offset > imgp->attr->va_size || filsz > memsz) { uprintf("elf_load_section: truncated ELF file\n"); return (ENOEXEC); } object = imgp->object; map = &imgp->proc->p_vmspace->vm_map; map_addr = trunc_page_ps((vm_offset_t)vmaddr, pagesize); file_addr = trunc_page_ps(offset, pagesize); /* * We have two choices. We can either clear the data in the last page * of an oversized mapping, or we can start the anon mapping a page * early and copy the initialized data into that first page. We * choose the second.. */ if (memsz > filsz) map_len = trunc_page_ps(offset + filsz, pagesize) - file_addr; else map_len = round_page_ps(offset + filsz, pagesize) - file_addr; if (map_len != 0) { /* cow flags: don't dump readonly sections in core */ cow = MAP_COPY_ON_WRITE | MAP_PREFAULT | (prot & VM_PROT_WRITE ? 0 : MAP_DISABLE_COREDUMP); rv = __elfN(map_insert)(map, object, file_addr, /* file offset */ map_addr, /* virtual start */ map_addr + map_len,/* virtual end */ prot, cow); if (rv != KERN_SUCCESS) return (EINVAL); /* we can stop now if we've covered it all */ if (memsz == filsz) { return (0); } } /* * We have to get the remaining bit of the file into the first part * of the oversized map segment. This is normally because the .data * segment in the file is extended to provide bss. It's a neat idea * to try and save a page, but it's a pain in the behind to implement. */ copy_len = (offset + filsz) - trunc_page_ps(offset + filsz, pagesize); map_addr = trunc_page_ps((vm_offset_t)vmaddr + filsz, pagesize); map_len = round_page_ps((vm_offset_t)vmaddr + memsz, pagesize) - map_addr; /* This had damn well better be true! */ if (map_len != 0) { rv = __elfN(map_insert)(map, NULL, 0, map_addr, map_addr + map_len, VM_PROT_ALL, 0); if (rv != KERN_SUCCESS) { return (EINVAL); } } if (copy_len != 0) { vm_offset_t off; sf = vm_imgact_map_page(object, offset + filsz); if (sf == NULL) return (EIO); /* send the page fragment to user space */ off = trunc_page_ps(offset + filsz, pagesize) - trunc_page(offset + filsz); error = copyout((caddr_t)sf_buf_kva(sf) + off, (caddr_t)map_addr, copy_len); vm_imgact_unmap_page(sf); if (error) { return (error); } } /* * set it to the specified protection. * XXX had better undo the damage from pasting over the cracks here! */ vm_map_protect(map, trunc_page(map_addr), round_page(map_addr + map_len), prot, FALSE); return (0); } /* * Load the file "file" into memory. It may be either a shared object * or an executable. * * The "addr" reference parameter is in/out. On entry, it specifies * the address where a shared object should be loaded. If the file is * an executable, this value is ignored. On exit, "addr" specifies * where the file was actually loaded. * * The "entry" reference parameter is out only. On exit, it specifies * the entry point for the loaded file. */ static int __elfN(load_file)(struct proc *p, const char *file, u_long *addr, u_long *entry, size_t pagesize) { struct { struct nameidata nd; struct vattr attr; struct image_params image_params; } *tempdata; const Elf_Ehdr *hdr = NULL; const Elf_Phdr *phdr = NULL; struct nameidata *nd; struct vattr *attr; struct image_params *imgp; vm_prot_t prot; u_long rbase; u_long base_addr = 0; int error, i, numsegs; #ifdef CAPABILITY_MODE /* * XXXJA: This check can go away once we are sufficiently confident * that the checks in namei() are correct. */ if (IN_CAPABILITY_MODE(curthread)) return (ECAPMODE); #endif tempdata = malloc(sizeof(*tempdata), M_TEMP, M_WAITOK); nd = &tempdata->nd; attr = &tempdata->attr; imgp = &tempdata->image_params; /* * Initialize part of the common data */ imgp->proc = p; imgp->attr = attr; imgp->firstpage = NULL; imgp->image_header = NULL; imgp->object = NULL; imgp->execlabel = NULL; NDINIT(nd, LOOKUP, LOCKLEAF | FOLLOW, UIO_SYSSPACE, file, curthread); if ((error = namei(nd)) != 0) { nd->ni_vp = NULL; goto fail; } NDFREE(nd, NDF_ONLY_PNBUF); imgp->vp = nd->ni_vp; /* * Check permissions, modes, uid, etc on the file, and "open" it. */ error = exec_check_permissions(imgp); if (error) goto fail; error = exec_map_first_page(imgp); if (error) goto fail; /* * Also make certain that the interpreter stays the same, so set * its VV_TEXT flag, too. */ VOP_SET_TEXT(nd->ni_vp); imgp->object = nd->ni_vp->v_object; hdr = (const Elf_Ehdr *)imgp->image_header; if ((error = __elfN(check_header)(hdr)) != 0) goto fail; if (hdr->e_type == ET_DYN) rbase = *addr; else if (hdr->e_type == ET_EXEC) rbase = 0; else { error = ENOEXEC; goto fail; } /* Only support headers that fit within first page for now */ if ((hdr->e_phoff > PAGE_SIZE) || (u_int)hdr->e_phentsize * hdr->e_phnum > PAGE_SIZE - hdr->e_phoff) { error = ENOEXEC; goto fail; } phdr = (const Elf_Phdr *)(imgp->image_header + hdr->e_phoff); if (!aligned(phdr, Elf_Addr)) { error = ENOEXEC; goto fail; } for (i = 0, numsegs = 0; i < hdr->e_phnum; i++) { if (phdr[i].p_type == PT_LOAD && phdr[i].p_memsz != 0) { /* Loadable segment */ prot = __elfN(trans_prot)(phdr[i].p_flags); error = __elfN(load_section)(imgp, phdr[i].p_offset, (caddr_t)(uintptr_t)phdr[i].p_vaddr + rbase, phdr[i].p_memsz, phdr[i].p_filesz, prot, pagesize); if (error != 0) goto fail; /* * Establish the base address if this is the * first segment. */ if (numsegs == 0) base_addr = trunc_page(phdr[i].p_vaddr + rbase); numsegs++; } } *addr = base_addr; *entry = (unsigned long)hdr->e_entry + rbase; fail: if (imgp->firstpage) exec_unmap_first_page(imgp); if (nd->ni_vp) vput(nd->ni_vp); free(tempdata, M_TEMP); return (error); } static int __CONCAT(exec_, __elfN(imgact))(struct image_params *imgp) { const Elf_Ehdr *hdr = (const Elf_Ehdr *)imgp->image_header; const Elf_Phdr *phdr; Elf_Auxargs *elf_auxargs; struct vmspace *vmspace; vm_prot_t prot; u_long text_size = 0, data_size = 0, total_size = 0; u_long text_addr = 0, data_addr = 0; u_long seg_size, seg_addr; u_long addr, baddr, et_dyn_addr, entry = 0, proghdr = 0; int32_t osrel = 0; int error = 0, i, n, interp_name_len = 0; const char *interp = NULL, *newinterp = NULL; Elf_Brandinfo *brand_info; char *path; struct sysentvec *sv; /* * Do we have a valid ELF header ? * * Only allow ET_EXEC & ET_DYN here, reject ET_DYN later * if particular brand doesn't support it. */ if (__elfN(check_header)(hdr) != 0 || (hdr->e_type != ET_EXEC && hdr->e_type != ET_DYN)) return (-1); /* * From here on down, we return an errno, not -1, as we've * detected an ELF file. */ if ((hdr->e_phoff > PAGE_SIZE) || (u_int)hdr->e_phentsize * hdr->e_phnum > PAGE_SIZE - hdr->e_phoff) { /* Only support headers in first page for now */ return (ENOEXEC); } phdr = (const Elf_Phdr *)(imgp->image_header + hdr->e_phoff); if (!aligned(phdr, Elf_Addr)) return (ENOEXEC); n = 0; baddr = 0; for (i = 0; i < hdr->e_phnum; i++) { switch (phdr[i].p_type) { case PT_LOAD: if (n == 0) baddr = phdr[i].p_vaddr; n++; break; case PT_INTERP: /* Path to interpreter */ if (phdr[i].p_filesz > MAXPATHLEN || phdr[i].p_offset > PAGE_SIZE || phdr[i].p_filesz > PAGE_SIZE - phdr[i].p_offset) return (ENOEXEC); interp = imgp->image_header + phdr[i].p_offset; interp_name_len = phdr[i].p_filesz; break; case PT_GNU_STACK: if (__elfN(nxstack)) imgp->stack_prot = __elfN(trans_prot)(phdr[i].p_flags); + imgp->stack_sz = phdr[i].p_memsz; break; } } brand_info = __elfN(get_brandinfo)(imgp, interp, interp_name_len, &osrel); if (brand_info == NULL) { uprintf("ELF binary type \"%u\" not known.\n", hdr->e_ident[EI_OSABI]); return (ENOEXEC); } if (hdr->e_type == ET_DYN) { if ((brand_info->flags & BI_CAN_EXEC_DYN) == 0) return (ENOEXEC); /* * Honour the base load address from the dso if it is * non-zero for some reason. */ if (baddr == 0) et_dyn_addr = ET_DYN_LOAD_ADDR; else et_dyn_addr = 0; } else et_dyn_addr = 0; sv = brand_info->sysvec; if (interp != NULL && brand_info->interp_newpath != NULL) newinterp = brand_info->interp_newpath; /* * Avoid a possible deadlock if the current address space is destroyed * and that address space maps the locked vnode. In the common case, * the locked vnode's v_usecount is decremented but remains greater * than zero. Consequently, the vnode lock is not needed by vrele(). * However, in cases where the vnode lock is external, such as nullfs, * v_usecount may become zero. * * The VV_TEXT flag prevents modifications to the executable while * the vnode is unlocked. */ VOP_UNLOCK(imgp->vp, 0); error = exec_new_vmspace(imgp, sv); imgp->proc->p_sysent = sv; vn_lock(imgp->vp, LK_EXCLUSIVE | LK_RETRY); if (error) return (error); for (i = 0; i < hdr->e_phnum; i++) { switch (phdr[i].p_type) { case PT_LOAD: /* Loadable segment */ if (phdr[i].p_memsz == 0) break; prot = __elfN(trans_prot)(phdr[i].p_flags); error = __elfN(load_section)(imgp, phdr[i].p_offset, (caddr_t)(uintptr_t)phdr[i].p_vaddr + et_dyn_addr, phdr[i].p_memsz, phdr[i].p_filesz, prot, sv->sv_pagesize); if (error != 0) return (error); /* * If this segment contains the program headers, * remember their virtual address for the AT_PHDR * aux entry. Static binaries don't usually include * a PT_PHDR entry. */ if (phdr[i].p_offset == 0 && hdr->e_phoff + hdr->e_phnum * hdr->e_phentsize <= phdr[i].p_filesz) proghdr = phdr[i].p_vaddr + hdr->e_phoff + et_dyn_addr; seg_addr = trunc_page(phdr[i].p_vaddr + et_dyn_addr); seg_size = round_page(phdr[i].p_memsz + phdr[i].p_vaddr + et_dyn_addr - seg_addr); /* * Make the largest executable segment the official * text segment and all others data. * * Note that obreak() assumes that data_addr + * data_size == end of data load area, and the ELF * file format expects segments to be sorted by * address. If multiple data segments exist, the * last one will be used. */ if (phdr[i].p_flags & PF_X && text_size < seg_size) { text_size = seg_size; text_addr = seg_addr; } else { data_size = seg_size; data_addr = seg_addr; } total_size += seg_size; break; case PT_PHDR: /* Program header table info */ proghdr = phdr[i].p_vaddr + et_dyn_addr; break; default: break; } } if (data_addr == 0 && data_size == 0) { data_addr = text_addr; data_size = text_size; } entry = (u_long)hdr->e_entry + et_dyn_addr; /* * Check limits. It should be safe to check the * limits after loading the segments since we do * not actually fault in all the segments pages. */ PROC_LOCK(imgp->proc); if (data_size > lim_cur(imgp->proc, RLIMIT_DATA) || text_size > maxtsiz || total_size > lim_cur(imgp->proc, RLIMIT_VMEM) || racct_set(imgp->proc, RACCT_DATA, data_size) != 0 || racct_set(imgp->proc, RACCT_VMEM, total_size) != 0) { PROC_UNLOCK(imgp->proc); return (ENOMEM); } vmspace = imgp->proc->p_vmspace; vmspace->vm_tsize = text_size >> PAGE_SHIFT; vmspace->vm_taddr = (caddr_t)(uintptr_t)text_addr; vmspace->vm_dsize = data_size >> PAGE_SHIFT; vmspace->vm_daddr = (caddr_t)(uintptr_t)data_addr; /* * We load the dynamic linker where a userland call * to mmap(0, ...) would put it. The rationale behind this * calculation is that it leaves room for the heap to grow to * its maximum allowed size. */ addr = round_page((vm_offset_t)vmspace->vm_daddr + lim_max(imgp->proc, RLIMIT_DATA)); PROC_UNLOCK(imgp->proc); imgp->entry_addr = entry; if (interp != NULL) { int have_interp = FALSE; VOP_UNLOCK(imgp->vp, 0); if (brand_info->emul_path != NULL && brand_info->emul_path[0] != '\0') { path = malloc(MAXPATHLEN, M_TEMP, M_WAITOK); snprintf(path, MAXPATHLEN, "%s%s", brand_info->emul_path, interp); error = __elfN(load_file)(imgp->proc, path, &addr, &imgp->entry_addr, sv->sv_pagesize); free(path, M_TEMP); if (error == 0) have_interp = TRUE; } if (!have_interp && newinterp != NULL) { error = __elfN(load_file)(imgp->proc, newinterp, &addr, &imgp->entry_addr, sv->sv_pagesize); if (error == 0) have_interp = TRUE; } if (!have_interp) { error = __elfN(load_file)(imgp->proc, interp, &addr, &imgp->entry_addr, sv->sv_pagesize); } vn_lock(imgp->vp, LK_EXCLUSIVE | LK_RETRY); if (error != 0) { uprintf("ELF interpreter %s not found\n", interp); return (error); } } else addr = et_dyn_addr; /* * Construct auxargs table (used by the fixup routine) */ elf_auxargs = malloc(sizeof(Elf_Auxargs), M_TEMP, M_WAITOK); elf_auxargs->execfd = -1; elf_auxargs->phdr = proghdr; elf_auxargs->phent = hdr->e_phentsize; elf_auxargs->phnum = hdr->e_phnum; elf_auxargs->pagesz = PAGE_SIZE; elf_auxargs->base = addr; elf_auxargs->flags = 0; elf_auxargs->entry = entry; imgp->auxargs = elf_auxargs; imgp->interpreted = 0; imgp->reloc_base = addr; imgp->proc->p_osrel = osrel; return (error); } #define suword __CONCAT(suword, __ELF_WORD_SIZE) int __elfN(freebsd_fixup)(register_t **stack_base, struct image_params *imgp) { Elf_Auxargs *args = (Elf_Auxargs *)imgp->auxargs; Elf_Addr *base; Elf_Addr *pos; base = (Elf_Addr *)*stack_base; pos = base + (imgp->args->argc + imgp->args->envc + 2); if (args->execfd != -1) AUXARGS_ENTRY(pos, AT_EXECFD, args->execfd); AUXARGS_ENTRY(pos, AT_PHDR, args->phdr); AUXARGS_ENTRY(pos, AT_PHENT, args->phent); AUXARGS_ENTRY(pos, AT_PHNUM, args->phnum); AUXARGS_ENTRY(pos, AT_PAGESZ, args->pagesz); AUXARGS_ENTRY(pos, AT_FLAGS, args->flags); AUXARGS_ENTRY(pos, AT_ENTRY, args->entry); AUXARGS_ENTRY(pos, AT_BASE, args->base); if (imgp->execpathp != 0) AUXARGS_ENTRY(pos, AT_EXECPATH, imgp->execpathp); AUXARGS_ENTRY(pos, AT_OSRELDATE, imgp->proc->p_ucred->cr_prison->pr_osreldate); if (imgp->canary != 0) { AUXARGS_ENTRY(pos, AT_CANARY, imgp->canary); AUXARGS_ENTRY(pos, AT_CANARYLEN, imgp->canarylen); } AUXARGS_ENTRY(pos, AT_NCPUS, mp_ncpus); if (imgp->pagesizes != 0) { AUXARGS_ENTRY(pos, AT_PAGESIZES, imgp->pagesizes); AUXARGS_ENTRY(pos, AT_PAGESIZESLEN, imgp->pagesizeslen); } if (imgp->sysent->sv_timekeep_base != 0) { AUXARGS_ENTRY(pos, AT_TIMEKEEP, imgp->sysent->sv_timekeep_base); } AUXARGS_ENTRY(pos, AT_STACKPROT, imgp->sysent->sv_shared_page_obj != NULL && imgp->stack_prot != 0 ? imgp->stack_prot : imgp->sysent->sv_stackprot); AUXARGS_ENTRY(pos, AT_NULL, 0); free(imgp->auxargs, M_TEMP); imgp->auxargs = NULL; base--; suword(base, (long)imgp->args->argc); *stack_base = (register_t *)base; return (0); } /* * Code for generating ELF core dumps. */ typedef void (*segment_callback)(vm_map_entry_t, void *); /* Closure for cb_put_phdr(). */ struct phdr_closure { Elf_Phdr *phdr; /* Program header to fill in */ Elf_Off offset; /* Offset of segment in core file */ }; /* Closure for cb_size_segment(). */ struct sseg_closure { int count; /* Count of writable segments. */ size_t size; /* Total size of all writable segments. */ }; typedef void (*outfunc_t)(void *, struct sbuf *, size_t *); struct note_info { int type; /* Note type. */ outfunc_t outfunc; /* Output function. */ void *outarg; /* Argument for the output function. */ size_t outsize; /* Output size. */ TAILQ_ENTRY(note_info) link; /* Link to the next note info. */ }; TAILQ_HEAD(note_info_list, note_info); /* Coredump output parameters. */ struct coredump_params { off_t offset; struct ucred *active_cred; struct ucred *file_cred; struct thread *td; struct vnode *vp; struct gzio_stream *gzs; }; static void cb_put_phdr(vm_map_entry_t, void *); static void cb_size_segment(vm_map_entry_t, void *); static int core_write(struct coredump_params *, void *, size_t, off_t, enum uio_seg); static void each_writable_segment(struct thread *, segment_callback, void *); static int __elfN(corehdr)(struct coredump_params *, int, void *, size_t, struct note_info_list *, size_t); static void __elfN(prepare_notes)(struct thread *, struct note_info_list *, size_t *); static void __elfN(puthdr)(struct thread *, void *, size_t, int, size_t); static void __elfN(putnote)(struct note_info *, struct sbuf *); static size_t register_note(struct note_info_list *, int, outfunc_t, void *); static int sbuf_drain_core_output(void *, const char *, int); static int sbuf_drain_count(void *arg, const char *data, int len); static void __elfN(note_fpregset)(void *, struct sbuf *, size_t *); static void __elfN(note_prpsinfo)(void *, struct sbuf *, size_t *); static void __elfN(note_prstatus)(void *, struct sbuf *, size_t *); static void __elfN(note_threadmd)(void *, struct sbuf *, size_t *); static void __elfN(note_thrmisc)(void *, struct sbuf *, size_t *); static void __elfN(note_procstat_auxv)(void *, struct sbuf *, size_t *); static void __elfN(note_procstat_proc)(void *, struct sbuf *, size_t *); static void __elfN(note_procstat_psstrings)(void *, struct sbuf *, size_t *); static void note_procstat_files(void *, struct sbuf *, size_t *); static void note_procstat_groups(void *, struct sbuf *, size_t *); static void note_procstat_osrel(void *, struct sbuf *, size_t *); static void note_procstat_rlimit(void *, struct sbuf *, size_t *); static void note_procstat_umask(void *, struct sbuf *, size_t *); static void note_procstat_vmmap(void *, struct sbuf *, size_t *); #ifdef GZIO extern int compress_user_cores_gzlevel; /* * Write out a core segment to the compression stream. */ static int compress_chunk(struct coredump_params *p, char *base, char *buf, u_int len) { u_int chunk_len; int error; while (len > 0) { chunk_len = MIN(len, CORE_BUF_SIZE); copyin(base, buf, chunk_len); error = gzio_write(p->gzs, buf, chunk_len); if (error != 0) break; base += chunk_len; len -= chunk_len; } return (error); } static int core_gz_write(void *base, size_t len, off_t offset, void *arg) { return (core_write((struct coredump_params *)arg, base, len, offset, UIO_SYSSPACE)); } #endif /* GZIO */ static int core_write(struct coredump_params *p, void *base, size_t len, off_t offset, enum uio_seg seg) { return (vn_rdwr_inchunks(UIO_WRITE, p->vp, base, len, offset, seg, IO_UNIT | IO_DIRECT | IO_RANGELOCKED, p->active_cred, p->file_cred, NULL, p->td)); } static int core_output(void *base, size_t len, off_t offset, struct coredump_params *p, void *tmpbuf) { #ifdef GZIO if (p->gzs != NULL) return (compress_chunk(p, base, tmpbuf, len)); #endif return (core_write(p, base, len, offset, UIO_USERSPACE)); } /* * Drain into a core file. */ static int sbuf_drain_core_output(void *arg, const char *data, int len) { struct coredump_params *p; int error, locked; p = (struct coredump_params *)arg; /* * Some kern_proc out routines that print to this sbuf may * call us with the process lock held. Draining with the * non-sleepable lock held is unsafe. The lock is needed for * those routines when dumping a live process. In our case we * can safely release the lock before draining and acquire * again after. */ locked = PROC_LOCKED(p->td->td_proc); if (locked) PROC_UNLOCK(p->td->td_proc); #ifdef GZIO if (p->gzs != NULL) error = gzio_write(p->gzs, __DECONST(char *, data), len); else #endif error = core_write(p, __DECONST(void *, data), len, p->offset, UIO_SYSSPACE); if (locked) PROC_LOCK(p->td->td_proc); if (error != 0) return (-error); p->offset += len; return (len); } /* * Drain into a counter. */ static int sbuf_drain_count(void *arg, const char *data __unused, int len) { size_t *sizep; sizep = (size_t *)arg; *sizep += len; return (len); } int __elfN(coredump)(struct thread *td, struct vnode *vp, off_t limit, int flags) { struct ucred *cred = td->td_ucred; int error = 0; struct sseg_closure seginfo; struct note_info_list notelst; struct coredump_params params; struct note_info *ninfo; void *hdr, *tmpbuf; size_t hdrsize, notesz, coresize; boolean_t compress; compress = (flags & IMGACT_CORE_COMPRESS) != 0; hdr = NULL; TAILQ_INIT(¬elst); /* Size the program segments. */ seginfo.count = 0; seginfo.size = 0; each_writable_segment(td, cb_size_segment, &seginfo); /* * Collect info about the core file header area. */ hdrsize = sizeof(Elf_Ehdr) + sizeof(Elf_Phdr) * (1 + seginfo.count); __elfN(prepare_notes)(td, ¬elst, ¬esz); coresize = round_page(hdrsize + notesz) + seginfo.size; #ifdef RACCT PROC_LOCK(td->td_proc); error = racct_add(td->td_proc, RACCT_CORE, coresize); PROC_UNLOCK(td->td_proc); if (error != 0) { error = EFAULT; goto done; } #endif if (coresize >= limit) { error = EFAULT; goto done; } /* Set up core dump parameters. */ params.offset = 0; params.active_cred = cred; params.file_cred = NOCRED; params.td = td; params.vp = vp; params.gzs = NULL; tmpbuf = NULL; #ifdef GZIO /* Create a compression stream if necessary. */ if (compress) { params.gzs = gzio_init(core_gz_write, GZIO_DEFLATE, CORE_BUF_SIZE, compress_user_cores_gzlevel, ¶ms); if (params.gzs == NULL) { error = EFAULT; goto done; } tmpbuf = malloc(CORE_BUF_SIZE, M_TEMP, M_WAITOK | M_ZERO); } #endif /* * Allocate memory for building the header, fill it up, * and write it out following the notes. */ hdr = malloc(hdrsize, M_TEMP, M_WAITOK); if (hdr == NULL) { error = EINVAL; goto done; } error = __elfN(corehdr)(¶ms, seginfo.count, hdr, hdrsize, ¬elst, notesz); /* Write the contents of all of the writable segments. */ if (error == 0) { Elf_Phdr *php; off_t offset; int i; php = (Elf_Phdr *)((char *)hdr + sizeof(Elf_Ehdr)) + 1; offset = round_page(hdrsize + notesz); for (i = 0; i < seginfo.count; i++) { error = core_output((caddr_t)(uintptr_t)php->p_vaddr, php->p_filesz, offset, ¶ms, tmpbuf); if (error != 0) break; offset += php->p_filesz; php++; } #ifdef GZIO if (error == 0 && compress) error = gzio_flush(params.gzs); #endif } if (error) { log(LOG_WARNING, "Failed to write core file for process %s (error %d)\n", curproc->p_comm, error); } done: #ifdef GZIO if (compress) { free(tmpbuf, M_TEMP); gzio_fini(params.gzs); } #endif while ((ninfo = TAILQ_FIRST(¬elst)) != NULL) { TAILQ_REMOVE(¬elst, ninfo, link); free(ninfo, M_TEMP); } if (hdr != NULL) free(hdr, M_TEMP); return (error); } /* * A callback for each_writable_segment() to write out the segment's * program header entry. */ static void cb_put_phdr(entry, closure) vm_map_entry_t entry; void *closure; { struct phdr_closure *phc = (struct phdr_closure *)closure; Elf_Phdr *phdr = phc->phdr; phc->offset = round_page(phc->offset); phdr->p_type = PT_LOAD; phdr->p_offset = phc->offset; phdr->p_vaddr = entry->start; phdr->p_paddr = 0; phdr->p_filesz = phdr->p_memsz = entry->end - entry->start; phdr->p_align = PAGE_SIZE; phdr->p_flags = __elfN(untrans_prot)(entry->protection); phc->offset += phdr->p_filesz; phc->phdr++; } /* * A callback for each_writable_segment() to gather information about * the number of segments and their total size. */ static void cb_size_segment(entry, closure) vm_map_entry_t entry; void *closure; { struct sseg_closure *ssc = (struct sseg_closure *)closure; ssc->count++; ssc->size += entry->end - entry->start; } /* * For each writable segment in the process's memory map, call the given * function with a pointer to the map entry and some arbitrary * caller-supplied data. */ static void each_writable_segment(td, func, closure) struct thread *td; segment_callback func; void *closure; { struct proc *p = td->td_proc; vm_map_t map = &p->p_vmspace->vm_map; vm_map_entry_t entry; vm_object_t backing_object, object; boolean_t ignore_entry; vm_map_lock_read(map); for (entry = map->header.next; entry != &map->header; entry = entry->next) { /* * Don't dump inaccessible mappings, deal with legacy * coredump mode. * * Note that read-only segments related to the elf binary * are marked MAP_ENTRY_NOCOREDUMP now so we no longer * need to arbitrarily ignore such segments. */ if (elf_legacy_coredump) { if ((entry->protection & VM_PROT_RW) != VM_PROT_RW) continue; } else { if ((entry->protection & VM_PROT_ALL) == 0) continue; } /* * Dont include memory segment in the coredump if * MAP_NOCORE is set in mmap(2) or MADV_NOCORE in * madvise(2). Do not dump submaps (i.e. parts of the * kernel map). */ if (entry->eflags & (MAP_ENTRY_NOCOREDUMP|MAP_ENTRY_IS_SUB_MAP)) continue; if ((object = entry->object.vm_object) == NULL) continue; /* Ignore memory-mapped devices and such things. */ VM_OBJECT_RLOCK(object); while ((backing_object = object->backing_object) != NULL) { VM_OBJECT_RLOCK(backing_object); VM_OBJECT_RUNLOCK(object); object = backing_object; } ignore_entry = object->type != OBJT_DEFAULT && object->type != OBJT_SWAP && object->type != OBJT_VNODE && object->type != OBJT_PHYS; VM_OBJECT_RUNLOCK(object); if (ignore_entry) continue; (*func)(entry, closure); } vm_map_unlock_read(map); } /* * Write the core file header to the file, including padding up to * the page boundary. */ static int __elfN(corehdr)(struct coredump_params *p, int numsegs, void *hdr, size_t hdrsize, struct note_info_list *notelst, size_t notesz) { struct note_info *ninfo; struct sbuf *sb; int error; /* Fill in the header. */ bzero(hdr, hdrsize); __elfN(puthdr)(p->td, hdr, hdrsize, numsegs, notesz); sb = sbuf_new(NULL, NULL, CORE_BUF_SIZE, SBUF_FIXEDLEN); sbuf_set_drain(sb, sbuf_drain_core_output, p); sbuf_start_section(sb, NULL); sbuf_bcat(sb, hdr, hdrsize); TAILQ_FOREACH(ninfo, notelst, link) __elfN(putnote)(ninfo, sb); /* Align up to a page boundary for the program segments. */ sbuf_end_section(sb, -1, PAGE_SIZE, 0); error = sbuf_finish(sb); sbuf_delete(sb); return (error); } static void __elfN(prepare_notes)(struct thread *td, struct note_info_list *list, size_t *sizep) { struct proc *p; struct thread *thr; size_t size; p = td->td_proc; size = 0; size += register_note(list, NT_PRPSINFO, __elfN(note_prpsinfo), p); /* * To have the debugger select the right thread (LWP) as the initial * thread, we dump the state of the thread passed to us in td first. * This is the thread that causes the core dump and thus likely to * be the right thread one wants to have selected in the debugger. */ thr = td; while (thr != NULL) { size += register_note(list, NT_PRSTATUS, __elfN(note_prstatus), thr); size += register_note(list, NT_FPREGSET, __elfN(note_fpregset), thr); size += register_note(list, NT_THRMISC, __elfN(note_thrmisc), thr); size += register_note(list, -1, __elfN(note_threadmd), thr); thr = (thr == td) ? TAILQ_FIRST(&p->p_threads) : TAILQ_NEXT(thr, td_plist); if (thr == td) thr = TAILQ_NEXT(thr, td_plist); } size += register_note(list, NT_PROCSTAT_PROC, __elfN(note_procstat_proc), p); size += register_note(list, NT_PROCSTAT_FILES, note_procstat_files, p); size += register_note(list, NT_PROCSTAT_VMMAP, note_procstat_vmmap, p); size += register_note(list, NT_PROCSTAT_GROUPS, note_procstat_groups, p); size += register_note(list, NT_PROCSTAT_UMASK, note_procstat_umask, p); size += register_note(list, NT_PROCSTAT_RLIMIT, note_procstat_rlimit, p); size += register_note(list, NT_PROCSTAT_OSREL, note_procstat_osrel, p); size += register_note(list, NT_PROCSTAT_PSSTRINGS, __elfN(note_procstat_psstrings), p); size += register_note(list, NT_PROCSTAT_AUXV, __elfN(note_procstat_auxv), p); *sizep = size; } static void __elfN(puthdr)(struct thread *td, void *hdr, size_t hdrsize, int numsegs, size_t notesz) { Elf_Ehdr *ehdr; Elf_Phdr *phdr; struct phdr_closure phc; ehdr = (Elf_Ehdr *)hdr; phdr = (Elf_Phdr *)((char *)hdr + sizeof(Elf_Ehdr)); ehdr->e_ident[EI_MAG0] = ELFMAG0; ehdr->e_ident[EI_MAG1] = ELFMAG1; ehdr->e_ident[EI_MAG2] = ELFMAG2; ehdr->e_ident[EI_MAG3] = ELFMAG3; ehdr->e_ident[EI_CLASS] = ELF_CLASS; ehdr->e_ident[EI_DATA] = ELF_DATA; ehdr->e_ident[EI_VERSION] = EV_CURRENT; ehdr->e_ident[EI_OSABI] = ELFOSABI_FREEBSD; ehdr->e_ident[EI_ABIVERSION] = 0; ehdr->e_ident[EI_PAD] = 0; ehdr->e_type = ET_CORE; #if defined(COMPAT_FREEBSD32) && __ELF_WORD_SIZE == 32 ehdr->e_machine = ELF_ARCH32; #else ehdr->e_machine = ELF_ARCH; #endif ehdr->e_version = EV_CURRENT; ehdr->e_entry = 0; ehdr->e_phoff = sizeof(Elf_Ehdr); ehdr->e_flags = 0; ehdr->e_ehsize = sizeof(Elf_Ehdr); ehdr->e_phentsize = sizeof(Elf_Phdr); ehdr->e_phnum = numsegs + 1; ehdr->e_shentsize = sizeof(Elf_Shdr); ehdr->e_shnum = 0; ehdr->e_shstrndx = SHN_UNDEF; /* * Fill in the program header entries. */ /* The note segement. */ phdr->p_type = PT_NOTE; phdr->p_offset = hdrsize; phdr->p_vaddr = 0; phdr->p_paddr = 0; phdr->p_filesz = notesz; phdr->p_memsz = 0; phdr->p_flags = PF_R; phdr->p_align = ELF_NOTE_ROUNDSIZE; phdr++; /* All the writable segments from the program. */ phc.phdr = phdr; phc.offset = round_page(hdrsize + notesz); each_writable_segment(td, cb_put_phdr, &phc); } static size_t register_note(struct note_info_list *list, int type, outfunc_t out, void *arg) { struct note_info *ninfo; size_t size, notesize; size = 0; out(arg, NULL, &size); ninfo = malloc(sizeof(*ninfo), M_TEMP, M_ZERO | M_WAITOK); ninfo->type = type; ninfo->outfunc = out; ninfo->outarg = arg; ninfo->outsize = size; TAILQ_INSERT_TAIL(list, ninfo, link); if (type == -1) return (size); notesize = sizeof(Elf_Note) + /* note header */ roundup2(sizeof(FREEBSD_ABI_VENDOR), ELF_NOTE_ROUNDSIZE) + /* note name */ roundup2(size, ELF_NOTE_ROUNDSIZE); /* note description */ return (notesize); } static size_t append_note_data(const void *src, void *dst, size_t len) { size_t padded_len; padded_len = roundup2(len, ELF_NOTE_ROUNDSIZE); if (dst != NULL) { bcopy(src, dst, len); bzero((char *)dst + len, padded_len - len); } return (padded_len); } size_t __elfN(populate_note)(int type, void *src, void *dst, size_t size, void **descp) { Elf_Note *note; char *buf; size_t notesize; buf = dst; if (buf != NULL) { note = (Elf_Note *)buf; note->n_namesz = sizeof(FREEBSD_ABI_VENDOR); note->n_descsz = size; note->n_type = type; buf += sizeof(*note); buf += append_note_data(FREEBSD_ABI_VENDOR, buf, sizeof(FREEBSD_ABI_VENDOR)); append_note_data(src, buf, size); if (descp != NULL) *descp = buf; } notesize = sizeof(Elf_Note) + /* note header */ roundup2(sizeof(FREEBSD_ABI_VENDOR), ELF_NOTE_ROUNDSIZE) + /* note name */ roundup2(size, ELF_NOTE_ROUNDSIZE); /* note description */ return (notesize); } static void __elfN(putnote)(struct note_info *ninfo, struct sbuf *sb) { Elf_Note note; ssize_t old_len; if (ninfo->type == -1) { ninfo->outfunc(ninfo->outarg, sb, &ninfo->outsize); return; } note.n_namesz = sizeof(FREEBSD_ABI_VENDOR); note.n_descsz = ninfo->outsize; note.n_type = ninfo->type; sbuf_bcat(sb, ¬e, sizeof(note)); sbuf_start_section(sb, &old_len); sbuf_bcat(sb, FREEBSD_ABI_VENDOR, sizeof(FREEBSD_ABI_VENDOR)); sbuf_end_section(sb, old_len, ELF_NOTE_ROUNDSIZE, 0); if (note.n_descsz == 0) return; sbuf_start_section(sb, &old_len); ninfo->outfunc(ninfo->outarg, sb, &ninfo->outsize); sbuf_end_section(sb, old_len, ELF_NOTE_ROUNDSIZE, 0); } /* * Miscellaneous note out functions. */ #if defined(COMPAT_FREEBSD32) && __ELF_WORD_SIZE == 32 #include typedef struct prstatus32 elf_prstatus_t; typedef struct prpsinfo32 elf_prpsinfo_t; typedef struct fpreg32 elf_prfpregset_t; typedef struct fpreg32 elf_fpregset_t; typedef struct reg32 elf_gregset_t; typedef struct thrmisc32 elf_thrmisc_t; #define ELF_KERN_PROC_MASK KERN_PROC_MASK32 typedef struct kinfo_proc32 elf_kinfo_proc_t; typedef uint32_t elf_ps_strings_t; #else typedef prstatus_t elf_prstatus_t; typedef prpsinfo_t elf_prpsinfo_t; typedef prfpregset_t elf_prfpregset_t; typedef prfpregset_t elf_fpregset_t; typedef gregset_t elf_gregset_t; typedef thrmisc_t elf_thrmisc_t; #define ELF_KERN_PROC_MASK 0 typedef struct kinfo_proc elf_kinfo_proc_t; typedef vm_offset_t elf_ps_strings_t; #endif static void __elfN(note_prpsinfo)(void *arg, struct sbuf *sb, size_t *sizep) { struct proc *p; elf_prpsinfo_t *psinfo; p = (struct proc *)arg; if (sb != NULL) { KASSERT(*sizep == sizeof(*psinfo), ("invalid size")); psinfo = malloc(sizeof(*psinfo), M_TEMP, M_ZERO | M_WAITOK); psinfo->pr_version = PRPSINFO_VERSION; psinfo->pr_psinfosz = sizeof(elf_prpsinfo_t); strlcpy(psinfo->pr_fname, p->p_comm, sizeof(psinfo->pr_fname)); /* * XXX - We don't fill in the command line arguments properly * yet. */ strlcpy(psinfo->pr_psargs, p->p_comm, sizeof(psinfo->pr_psargs)); sbuf_bcat(sb, psinfo, sizeof(*psinfo)); free(psinfo, M_TEMP); } *sizep = sizeof(*psinfo); } static void __elfN(note_prstatus)(void *arg, struct sbuf *sb, size_t *sizep) { struct thread *td; elf_prstatus_t *status; td = (struct thread *)arg; if (sb != NULL) { KASSERT(*sizep == sizeof(*status), ("invalid size")); status = malloc(sizeof(*status), M_TEMP, M_ZERO | M_WAITOK); status->pr_version = PRSTATUS_VERSION; status->pr_statussz = sizeof(elf_prstatus_t); status->pr_gregsetsz = sizeof(elf_gregset_t); status->pr_fpregsetsz = sizeof(elf_fpregset_t); status->pr_osreldate = osreldate; status->pr_cursig = td->td_proc->p_sig; status->pr_pid = td->td_tid; #if defined(COMPAT_FREEBSD32) && __ELF_WORD_SIZE == 32 fill_regs32(td, &status->pr_reg); #else fill_regs(td, &status->pr_reg); #endif sbuf_bcat(sb, status, sizeof(*status)); free(status, M_TEMP); } *sizep = sizeof(*status); } static void __elfN(note_fpregset)(void *arg, struct sbuf *sb, size_t *sizep) { struct thread *td; elf_prfpregset_t *fpregset; td = (struct thread *)arg; if (sb != NULL) { KASSERT(*sizep == sizeof(*fpregset), ("invalid size")); fpregset = malloc(sizeof(*fpregset), M_TEMP, M_ZERO | M_WAITOK); #if defined(COMPAT_FREEBSD32) && __ELF_WORD_SIZE == 32 fill_fpregs32(td, fpregset); #else fill_fpregs(td, fpregset); #endif sbuf_bcat(sb, fpregset, sizeof(*fpregset)); free(fpregset, M_TEMP); } *sizep = sizeof(*fpregset); } static void __elfN(note_thrmisc)(void *arg, struct sbuf *sb, size_t *sizep) { struct thread *td; elf_thrmisc_t thrmisc; td = (struct thread *)arg; if (sb != NULL) { KASSERT(*sizep == sizeof(thrmisc), ("invalid size")); bzero(&thrmisc._pad, sizeof(thrmisc._pad)); strcpy(thrmisc.pr_tname, td->td_name); sbuf_bcat(sb, &thrmisc, sizeof(thrmisc)); } *sizep = sizeof(thrmisc); } /* * Allow for MD specific notes, as well as any MD * specific preparations for writing MI notes. */ static void __elfN(note_threadmd)(void *arg, struct sbuf *sb, size_t *sizep) { struct thread *td; void *buf; size_t size; td = (struct thread *)arg; size = *sizep; if (size != 0 && sb != NULL) buf = malloc(size, M_TEMP, M_ZERO | M_WAITOK); else buf = NULL; size = 0; __elfN(dump_thread)(td, buf, &size); KASSERT(sb == NULL || *sizep == size, ("invalid size")); if (size != 0 && sb != NULL) sbuf_bcat(sb, buf, size); free(buf, M_TEMP); *sizep = size; } #ifdef KINFO_PROC_SIZE CTASSERT(sizeof(struct kinfo_proc) == KINFO_PROC_SIZE); #endif static void __elfN(note_procstat_proc)(void *arg, struct sbuf *sb, size_t *sizep) { struct proc *p; size_t size; int structsize; p = (struct proc *)arg; size = sizeof(structsize) + p->p_numthreads * sizeof(elf_kinfo_proc_t); if (sb != NULL) { KASSERT(*sizep == size, ("invalid size")); structsize = sizeof(elf_kinfo_proc_t); sbuf_bcat(sb, &structsize, sizeof(structsize)); sx_slock(&proctree_lock); PROC_LOCK(p); kern_proc_out(p, sb, ELF_KERN_PROC_MASK); sx_sunlock(&proctree_lock); } *sizep = size; } #ifdef KINFO_FILE_SIZE CTASSERT(sizeof(struct kinfo_file) == KINFO_FILE_SIZE); #endif static void note_procstat_files(void *arg, struct sbuf *sb, size_t *sizep) { struct proc *p; size_t size; int structsize; p = (struct proc *)arg; if (sb == NULL) { size = 0; sb = sbuf_new(NULL, NULL, 128, SBUF_FIXEDLEN); sbuf_set_drain(sb, sbuf_drain_count, &size); sbuf_bcat(sb, &structsize, sizeof(structsize)); PROC_LOCK(p); kern_proc_filedesc_out(p, sb, -1); sbuf_finish(sb); sbuf_delete(sb); *sizep = size; } else { structsize = sizeof(struct kinfo_file); sbuf_bcat(sb, &structsize, sizeof(structsize)); PROC_LOCK(p); kern_proc_filedesc_out(p, sb, -1); } } #ifdef KINFO_VMENTRY_SIZE CTASSERT(sizeof(struct kinfo_vmentry) == KINFO_VMENTRY_SIZE); #endif static void note_procstat_vmmap(void *arg, struct sbuf *sb, size_t *sizep) { struct proc *p; size_t size; int structsize; p = (struct proc *)arg; if (sb == NULL) { size = 0; sb = sbuf_new(NULL, NULL, 128, SBUF_FIXEDLEN); sbuf_set_drain(sb, sbuf_drain_count, &size); sbuf_bcat(sb, &structsize, sizeof(structsize)); PROC_LOCK(p); kern_proc_vmmap_out(p, sb); sbuf_finish(sb); sbuf_delete(sb); *sizep = size; } else { structsize = sizeof(struct kinfo_vmentry); sbuf_bcat(sb, &structsize, sizeof(structsize)); PROC_LOCK(p); kern_proc_vmmap_out(p, sb); } } static void note_procstat_groups(void *arg, struct sbuf *sb, size_t *sizep) { struct proc *p; size_t size; int structsize; p = (struct proc *)arg; size = sizeof(structsize) + p->p_ucred->cr_ngroups * sizeof(gid_t); if (sb != NULL) { KASSERT(*sizep == size, ("invalid size")); structsize = sizeof(gid_t); sbuf_bcat(sb, &structsize, sizeof(structsize)); sbuf_bcat(sb, p->p_ucred->cr_groups, p->p_ucred->cr_ngroups * sizeof(gid_t)); } *sizep = size; } static void note_procstat_umask(void *arg, struct sbuf *sb, size_t *sizep) { struct proc *p; size_t size; int structsize; p = (struct proc *)arg; size = sizeof(structsize) + sizeof(p->p_fd->fd_cmask); if (sb != NULL) { KASSERT(*sizep == size, ("invalid size")); structsize = sizeof(p->p_fd->fd_cmask); sbuf_bcat(sb, &structsize, sizeof(structsize)); sbuf_bcat(sb, &p->p_fd->fd_cmask, sizeof(p->p_fd->fd_cmask)); } *sizep = size; } static void note_procstat_rlimit(void *arg, struct sbuf *sb, size_t *sizep) { struct proc *p; struct rlimit rlim[RLIM_NLIMITS]; size_t size; int structsize, i; p = (struct proc *)arg; size = sizeof(structsize) + sizeof(rlim); if (sb != NULL) { KASSERT(*sizep == size, ("invalid size")); structsize = sizeof(rlim); sbuf_bcat(sb, &structsize, sizeof(structsize)); PROC_LOCK(p); for (i = 0; i < RLIM_NLIMITS; i++) lim_rlimit(p, i, &rlim[i]); PROC_UNLOCK(p); sbuf_bcat(sb, rlim, sizeof(rlim)); } *sizep = size; } static void note_procstat_osrel(void *arg, struct sbuf *sb, size_t *sizep) { struct proc *p; size_t size; int structsize; p = (struct proc *)arg; size = sizeof(structsize) + sizeof(p->p_osrel); if (sb != NULL) { KASSERT(*sizep == size, ("invalid size")); structsize = sizeof(p->p_osrel); sbuf_bcat(sb, &structsize, sizeof(structsize)); sbuf_bcat(sb, &p->p_osrel, sizeof(p->p_osrel)); } *sizep = size; } static void __elfN(note_procstat_psstrings)(void *arg, struct sbuf *sb, size_t *sizep) { struct proc *p; elf_ps_strings_t ps_strings; size_t size; int structsize; p = (struct proc *)arg; size = sizeof(structsize) + sizeof(ps_strings); if (sb != NULL) { KASSERT(*sizep == size, ("invalid size")); structsize = sizeof(ps_strings); #if defined(COMPAT_FREEBSD32) && __ELF_WORD_SIZE == 32 ps_strings = PTROUT(p->p_sysent->sv_psstrings); #else ps_strings = p->p_sysent->sv_psstrings; #endif sbuf_bcat(sb, &structsize, sizeof(structsize)); sbuf_bcat(sb, &ps_strings, sizeof(ps_strings)); } *sizep = size; } static void __elfN(note_procstat_auxv)(void *arg, struct sbuf *sb, size_t *sizep) { struct proc *p; size_t size; int structsize; p = (struct proc *)arg; if (sb == NULL) { size = 0; sb = sbuf_new(NULL, NULL, 128, SBUF_FIXEDLEN); sbuf_set_drain(sb, sbuf_drain_count, &size); sbuf_bcat(sb, &structsize, sizeof(structsize)); PHOLD(p); proc_getauxv(curthread, p, sb); PRELE(p); sbuf_finish(sb); sbuf_delete(sb); *sizep = size; } else { structsize = sizeof(Elf_Auxinfo); sbuf_bcat(sb, &structsize, sizeof(structsize)); PHOLD(p); proc_getauxv(curthread, p, sb); PRELE(p); } } static boolean_t __elfN(parse_notes)(struct image_params *imgp, Elf_Brandnote *checknote, int32_t *osrel, const Elf_Phdr *pnote) { const Elf_Note *note, *note0, *note_end; const char *note_name; int i; if (pnote == NULL || pnote->p_offset > PAGE_SIZE || pnote->p_filesz > PAGE_SIZE - pnote->p_offset) return (FALSE); note = note0 = (const Elf_Note *)(imgp->image_header + pnote->p_offset); note_end = (const Elf_Note *)(imgp->image_header + pnote->p_offset + pnote->p_filesz); for (i = 0; i < 100 && note >= note0 && note < note_end; i++) { if (!aligned(note, Elf32_Addr) || (const char *)note_end - (const char *)note < sizeof(Elf_Note)) return (FALSE); if (note->n_namesz != checknote->hdr.n_namesz || note->n_descsz != checknote->hdr.n_descsz || note->n_type != checknote->hdr.n_type) goto nextnote; note_name = (const char *)(note + 1); if (note_name + checknote->hdr.n_namesz >= (const char *)note_end || strncmp(checknote->vendor, note_name, checknote->hdr.n_namesz) != 0) goto nextnote; /* * Fetch the osreldate for binary * from the ELF OSABI-note if necessary. */ if ((checknote->flags & BN_TRANSLATE_OSREL) != 0 && checknote->trans_osrel != NULL) return (checknote->trans_osrel(note, osrel)); return (TRUE); nextnote: note = (const Elf_Note *)((const char *)(note + 1) + roundup2(note->n_namesz, ELF_NOTE_ROUNDSIZE) + roundup2(note->n_descsz, ELF_NOTE_ROUNDSIZE)); } return (FALSE); } /* * Try to find the appropriate ABI-note section for checknote, * fetch the osreldate for binary from the ELF OSABI-note. Only the * first page of the image is searched, the same as for headers. */ static boolean_t __elfN(check_note)(struct image_params *imgp, Elf_Brandnote *checknote, int32_t *osrel) { const Elf_Phdr *phdr; const Elf_Ehdr *hdr; int i; hdr = (const Elf_Ehdr *)imgp->image_header; phdr = (const Elf_Phdr *)(imgp->image_header + hdr->e_phoff); for (i = 0; i < hdr->e_phnum; i++) { if (phdr[i].p_type == PT_NOTE && __elfN(parse_notes)(imgp, checknote, osrel, &phdr[i])) return (TRUE); } return (FALSE); } /* * Tell kern_execve.c about it, with a little help from the linker. */ static struct execsw __elfN(execsw) = { __CONCAT(exec_, __elfN(imgact)), __XSTRING(__CONCAT(ELF, __ELF_WORD_SIZE)) }; EXEC_SET(__CONCAT(elf, __ELF_WORD_SIZE), __elfN(execsw)); static vm_prot_t __elfN(trans_prot)(Elf_Word flags) { vm_prot_t prot; prot = 0; if (flags & PF_X) prot |= VM_PROT_EXECUTE; if (flags & PF_W) prot |= VM_PROT_WRITE; if (flags & PF_R) prot |= VM_PROT_READ; #if __ELF_WORD_SIZE == 32 #if defined(__amd64__) if (i386_read_exec && (flags & PF_R)) prot |= VM_PROT_EXECUTE; #endif #endif return (prot); } static Elf_Word __elfN(untrans_prot)(vm_prot_t prot) { Elf_Word flags; flags = 0; if (prot & VM_PROT_EXECUTE) flags |= PF_X; if (prot & VM_PROT_READ) flags |= PF_R; if (prot & VM_PROT_WRITE) flags |= PF_W; return (flags); } Index: user/ngie/more-tests/sys/kern/kern_exec.c =================================================================== --- user/ngie/more-tests/sys/kern/kern_exec.c (revision 281584) +++ user/ngie/more-tests/sys/kern/kern_exec.c (revision 281585) @@ -1,1498 +1,1511 @@ /*- * Copyright (c) 1993, David Greenman * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ #include __FBSDID("$FreeBSD$"); #include "opt_capsicum.h" #include "opt_hwpmc_hooks.h" #include "opt_ktrace.h" #include "opt_vm.h" #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #ifdef KTRACE #include #endif #include #include #include #include #include #include #include #include #include #ifdef HWPMC_HOOKS #include #endif #include #include #include #ifdef KDTRACE_HOOKS #include dtrace_execexit_func_t dtrace_fasttrap_exec; #endif SDT_PROVIDER_DECLARE(proc); SDT_PROBE_DEFINE1(proc, kernel, , exec, "char *"); SDT_PROBE_DEFINE1(proc, kernel, , exec__failure, "int"); SDT_PROBE_DEFINE1(proc, kernel, , exec__success, "char *"); MALLOC_DEFINE(M_PARGS, "proc-args", "Process arguments"); static int sysctl_kern_ps_strings(SYSCTL_HANDLER_ARGS); static int sysctl_kern_usrstack(SYSCTL_HANDLER_ARGS); static int sysctl_kern_stackprot(SYSCTL_HANDLER_ARGS); static int do_execve(struct thread *td, struct image_args *args, struct mac *mac_p); /* XXX This should be vm_size_t. */ SYSCTL_PROC(_kern, KERN_PS_STRINGS, ps_strings, CTLTYPE_ULONG|CTLFLAG_RD, NULL, 0, sysctl_kern_ps_strings, "LU", ""); /* XXX This should be vm_size_t. */ SYSCTL_PROC(_kern, KERN_USRSTACK, usrstack, CTLTYPE_ULONG|CTLFLAG_RD| CTLFLAG_CAPRD, NULL, 0, sysctl_kern_usrstack, "LU", ""); SYSCTL_PROC(_kern, OID_AUTO, stackprot, CTLTYPE_INT|CTLFLAG_RD, NULL, 0, sysctl_kern_stackprot, "I", ""); u_long ps_arg_cache_limit = PAGE_SIZE / 16; SYSCTL_ULONG(_kern, OID_AUTO, ps_arg_cache_limit, CTLFLAG_RW, &ps_arg_cache_limit, 0, ""); static int disallow_high_osrel; SYSCTL_INT(_kern, OID_AUTO, disallow_high_osrel, CTLFLAG_RW, &disallow_high_osrel, 0, "Disallow execution of binaries built for higher version of the world"); static int map_at_zero = 0; SYSCTL_INT(_security_bsd, OID_AUTO, map_at_zero, CTLFLAG_RWTUN, &map_at_zero, 0, "Permit processes to map an object at virtual address 0."); static int sysctl_kern_ps_strings(SYSCTL_HANDLER_ARGS) { struct proc *p; int error; p = curproc; #ifdef SCTL_MASK32 if (req->flags & SCTL_MASK32) { unsigned int val; val = (unsigned int)p->p_sysent->sv_psstrings; error = SYSCTL_OUT(req, &val, sizeof(val)); } else #endif error = SYSCTL_OUT(req, &p->p_sysent->sv_psstrings, sizeof(p->p_sysent->sv_psstrings)); return error; } static int sysctl_kern_usrstack(SYSCTL_HANDLER_ARGS) { struct proc *p; int error; p = curproc; #ifdef SCTL_MASK32 if (req->flags & SCTL_MASK32) { unsigned int val; val = (unsigned int)p->p_sysent->sv_usrstack; error = SYSCTL_OUT(req, &val, sizeof(val)); } else #endif error = SYSCTL_OUT(req, &p->p_sysent->sv_usrstack, sizeof(p->p_sysent->sv_usrstack)); return error; } static int sysctl_kern_stackprot(SYSCTL_HANDLER_ARGS) { struct proc *p; p = curproc; return (SYSCTL_OUT(req, &p->p_sysent->sv_stackprot, sizeof(p->p_sysent->sv_stackprot))); } /* * Each of the items is a pointer to a `const struct execsw', hence the * double pointer here. */ static const struct execsw **execsw; #ifndef _SYS_SYSPROTO_H_ struct execve_args { char *fname; char **argv; char **envv; }; #endif int sys_execve(td, uap) struct thread *td; struct execve_args /* { char *fname; char **argv; char **envv; } */ *uap; { int error; struct image_args args; error = exec_copyin_args(&args, uap->fname, UIO_USERSPACE, uap->argv, uap->envv); if (error == 0) error = kern_execve(td, &args, NULL); return (error); } #ifndef _SYS_SYSPROTO_H_ struct fexecve_args { int fd; char **argv; char **envv; } #endif int sys_fexecve(struct thread *td, struct fexecve_args *uap) { int error; struct image_args args; error = exec_copyin_args(&args, NULL, UIO_SYSSPACE, uap->argv, uap->envv); if (error == 0) { args.fd = uap->fd; error = kern_execve(td, &args, NULL); } return (error); } #ifndef _SYS_SYSPROTO_H_ struct __mac_execve_args { char *fname; char **argv; char **envv; struct mac *mac_p; }; #endif int sys___mac_execve(td, uap) struct thread *td; struct __mac_execve_args /* { char *fname; char **argv; char **envv; struct mac *mac_p; } */ *uap; { #ifdef MAC int error; struct image_args args; error = exec_copyin_args(&args, uap->fname, UIO_USERSPACE, uap->argv, uap->envv); if (error == 0) error = kern_execve(td, &args, uap->mac_p); return (error); #else return (ENOSYS); #endif } /* * XXX: kern_execve has the astonishing property of not always returning to * the caller. If sufficiently bad things happen during the call to * do_execve(), it can end up calling exit1(); as a result, callers must * avoid doing anything which they might need to undo (e.g., allocating * memory). */ int kern_execve(td, args, mac_p) struct thread *td; struct image_args *args; struct mac *mac_p; { struct proc *p = td->td_proc; struct vmspace *oldvmspace; int error; AUDIT_ARG_ARGV(args->begin_argv, args->argc, args->begin_envv - args->begin_argv); AUDIT_ARG_ENVV(args->begin_envv, args->envc, args->endp - args->begin_envv); if (p->p_flag & P_HADTHREADS) { PROC_LOCK(p); if (thread_single(p, SINGLE_BOUNDARY)) { PROC_UNLOCK(p); exec_free_args(args); return (ERESTART); /* Try again later. */ } PROC_UNLOCK(p); } KASSERT((td->td_pflags & TDP_EXECVMSPC) == 0, ("nested execve")); oldvmspace = td->td_proc->p_vmspace; error = do_execve(td, args, mac_p); if (p->p_flag & P_HADTHREADS) { PROC_LOCK(p); /* * If success, we upgrade to SINGLE_EXIT state to * force other threads to suicide. */ if (error == 0) thread_single(p, SINGLE_EXIT); else thread_single_end(p, SINGLE_BOUNDARY); PROC_UNLOCK(p); } if ((td->td_pflags & TDP_EXECVMSPC) != 0) { KASSERT(td->td_proc->p_vmspace != oldvmspace, ("oldvmspace still used")); vmspace_free(oldvmspace); td->td_pflags &= ~TDP_EXECVMSPC; } return (error); } /* * In-kernel implementation of execve(). All arguments are assumed to be * userspace pointers from the passed thread. */ static int do_execve(td, args, mac_p) struct thread *td; struct image_args *args; struct mac *mac_p; { struct proc *p = td->td_proc; struct nameidata nd; struct ucred *newcred = NULL, *oldcred; struct uidinfo *euip = NULL; register_t *stack_base; int error, i; struct image_params image_params, *imgp; struct vattr attr; int (*img_first)(struct image_params *); struct pargs *oldargs = NULL, *newargs = NULL; struct sigacts *oldsigacts, *newsigacts; #ifdef KTRACE struct vnode *tracevp = NULL; struct ucred *tracecred = NULL; #endif struct vnode *textvp = NULL, *binvp; cap_rights_t rights; int credential_changing; int textset; #ifdef MAC struct label *interpvplabel = NULL; int will_transition; #endif #ifdef HWPMC_HOOKS struct pmckern_procexec pe; #endif static const char fexecv_proc_title[] = "(fexecv)"; imgp = &image_params; /* * Lock the process and set the P_INEXEC flag to indicate that * it should be left alone until we're done here. This is * necessary to avoid race conditions - e.g. in ptrace() - * that might allow a local user to illicitly obtain elevated * privileges. */ PROC_LOCK(p); KASSERT((p->p_flag & P_INEXEC) == 0, ("%s(): process already has P_INEXEC flag", __func__)); p->p_flag |= P_INEXEC; PROC_UNLOCK(p); /* * Initialize part of the common data */ bzero(imgp, sizeof(*imgp)); imgp->proc = p; imgp->attr = &attr; imgp->args = args; #ifdef MAC error = mac_execve_enter(imgp, mac_p); if (error) goto exec_fail; #endif /* * Translate the file name. namei() returns a vnode pointer * in ni_vp amoung other things. * * XXXAUDIT: It would be desirable to also audit the name of the * interpreter if this is an interpreted binary. */ if (args->fname != NULL) { NDINIT(&nd, LOOKUP, ISOPEN | LOCKLEAF | FOLLOW | SAVENAME | AUDITVNODE1, UIO_SYSSPACE, args->fname, td); } SDT_PROBE(proc, kernel, , exec, args->fname, 0, 0, 0, 0 ); interpret: if (args->fname != NULL) { #ifdef CAPABILITY_MODE /* * While capability mode can't reach this point via direct * path arguments to execve(), we also don't allow * interpreters to be used in capability mode (for now). * Catch indirect lookups and return a permissions error. */ if (IN_CAPABILITY_MODE(td)) { error = ECAPMODE; goto exec_fail; } #endif error = namei(&nd); if (error) goto exec_fail; binvp = nd.ni_vp; imgp->vp = binvp; } else { AUDIT_ARG_FD(args->fd); /* * Descriptors opened only with O_EXEC or O_RDONLY are allowed. */ error = fgetvp_exec(td, args->fd, cap_rights_init(&rights, CAP_FEXECVE), &binvp); if (error) goto exec_fail; vn_lock(binvp, LK_EXCLUSIVE | LK_RETRY); AUDIT_ARG_VNODE1(binvp); imgp->vp = binvp; } /* * Check file permissions (also 'opens' file) */ error = exec_check_permissions(imgp); if (error) goto exec_fail_dealloc; imgp->object = imgp->vp->v_object; if (imgp->object != NULL) vm_object_reference(imgp->object); /* * Set VV_TEXT now so no one can write to the executable while we're * activating it. * * Remember if this was set before and unset it in case this is not * actually an executable image. */ textset = VOP_IS_TEXT(imgp->vp); VOP_SET_TEXT(imgp->vp); error = exec_map_first_page(imgp); if (error) goto exec_fail_dealloc; imgp->proc->p_osrel = 0; /* * If the current process has a special image activator it * wants to try first, call it. For example, emulating shell * scripts differently. */ error = -1; if ((img_first = imgp->proc->p_sysent->sv_imgact_try) != NULL) error = img_first(imgp); /* * Loop through the list of image activators, calling each one. * An activator returns -1 if there is no match, 0 on success, * and an error otherwise. */ for (i = 0; error == -1 && execsw[i]; ++i) { if (execsw[i]->ex_imgact == NULL || execsw[i]->ex_imgact == img_first) { continue; } error = (*execsw[i]->ex_imgact)(imgp); } if (error) { if (error == -1) { if (textset == 0) VOP_UNSET_TEXT(imgp->vp); error = ENOEXEC; } goto exec_fail_dealloc; } /* * Special interpreter operation, cleanup and loop up to try to * activate the interpreter. */ if (imgp->interpreted) { exec_unmap_first_page(imgp); /* * VV_TEXT needs to be unset for scripts. There is a short * period before we determine that something is a script where * VV_TEXT will be set. The vnode lock is held over this * entire period so nothing should illegitimately be blocked. */ VOP_UNSET_TEXT(imgp->vp); /* free name buffer and old vnode */ if (args->fname != NULL) NDFREE(&nd, NDF_ONLY_PNBUF); #ifdef MAC mac_execve_interpreter_enter(binvp, &interpvplabel); #endif if (imgp->opened) { VOP_CLOSE(binvp, FREAD, td->td_ucred, td); imgp->opened = 0; } vput(binvp); vm_object_deallocate(imgp->object); imgp->object = NULL; /* set new name to that of the interpreter */ NDINIT(&nd, LOOKUP, LOCKLEAF | FOLLOW | SAVENAME, UIO_SYSSPACE, imgp->interpreter_name, td); args->fname = imgp->interpreter_name; goto interpret; } /* * NB: We unlock the vnode here because it is believed that none * of the sv_copyout_strings/sv_fixup operations require the vnode. */ VOP_UNLOCK(imgp->vp, 0); /* * Do the best to calculate the full path to the image file. */ if (imgp->auxargs != NULL && ((args->fname != NULL && args->fname[0] == '/') || vn_fullpath(td, imgp->vp, &imgp->execpath, &imgp->freepath) != 0)) imgp->execpath = args->fname; if (disallow_high_osrel && P_OSREL_MAJOR(p->p_osrel) > P_OSREL_MAJOR(__FreeBSD_version)) { error = ENOEXEC; uprintf("Osrel %d for image %s too high\n", p->p_osrel, imgp->execpath != NULL ? imgp->execpath : ""); vn_lock(imgp->vp, LK_SHARED | LK_RETRY); goto exec_fail_dealloc; } /* * Copy out strings (args and env) and initialize stack base */ if (p->p_sysent->sv_copyout_strings) stack_base = (*p->p_sysent->sv_copyout_strings)(imgp); else stack_base = exec_copyout_strings(imgp); /* * If custom stack fixup routine present for this process * let it do the stack setup. * Else stuff argument count as first item on stack */ if (p->p_sysent->sv_fixup != NULL) (*p->p_sysent->sv_fixup)(&stack_base, imgp); else suword(--stack_base, imgp->args->argc); /* * For security and other reasons, the file descriptor table cannot * be shared after an exec. */ fdunshare(td); /* close files on exec */ fdcloseexec(td); /* * Malloc things before we need locks. */ i = imgp->args->begin_envv - imgp->args->begin_argv; /* Cache arguments if they fit inside our allowance */ if (ps_arg_cache_limit >= i + sizeof(struct pargs)) { newargs = pargs_alloc(i); bcopy(imgp->args->begin_argv, newargs->ar_args, i); } vn_lock(imgp->vp, LK_SHARED | LK_RETRY); /* Get a reference to the vnode prior to locking the proc */ VREF(binvp); /* * For security and other reasons, signal handlers cannot * be shared after an exec. The new process gets a copy of the old * handlers. In execsigs(), the new process will have its signals * reset. */ if (sigacts_shared(p->p_sigacts)) { oldsigacts = p->p_sigacts; newsigacts = sigacts_alloc(); sigacts_copy(newsigacts, oldsigacts); } else { oldsigacts = NULL; newsigacts = NULL; /* satisfy gcc */ } PROC_LOCK(p); if (oldsigacts) p->p_sigacts = newsigacts; oldcred = p->p_ucred; /* Stop profiling */ stopprofclock(p); /* reset caught signals */ execsigs(p); /* name this process - nameiexec(p, ndp) */ bzero(p->p_comm, sizeof(p->p_comm)); if (args->fname) bcopy(nd.ni_cnd.cn_nameptr, p->p_comm, min(nd.ni_cnd.cn_namelen, MAXCOMLEN)); else if (vn_commname(binvp, p->p_comm, sizeof(p->p_comm)) != 0) bcopy(fexecv_proc_title, p->p_comm, sizeof(fexecv_proc_title)); bcopy(p->p_comm, td->td_name, sizeof(td->td_name)); #ifdef KTR sched_clear_tdname(td); #endif /* * mark as execed, wakeup the process that vforked (if any) and tell * it that it now has its own resources back */ p->p_flag |= P_EXEC; if ((p->p_flag2 & P2_NOTRACE_EXEC) == 0) p->p_flag2 &= ~P2_NOTRACE; if (p->p_flag & P_PPWAIT) { p->p_flag &= ~(P_PPWAIT | P_PPTRACE); cv_broadcast(&p->p_pwait); } /* * Implement image setuid/setgid. * * Don't honor setuid/setgid if the filesystem prohibits it or if * the process is being traced. * * We disable setuid/setgid/etc in compatibility mode on the basis * that most setugid applications are not written with that * environment in mind, and will therefore almost certainly operate * incorrectly. In principle there's no reason that setugid * applications might not be useful in capability mode, so we may want * to reconsider this conservative design choice in the future. * * XXXMAC: For the time being, use NOSUID to also prohibit * transitions on the file system. */ credential_changing = 0; credential_changing |= (attr.va_mode & S_ISUID) && oldcred->cr_uid != attr.va_uid; credential_changing |= (attr.va_mode & S_ISGID) && oldcred->cr_gid != attr.va_gid; #ifdef MAC will_transition = mac_vnode_execve_will_transition(oldcred, imgp->vp, interpvplabel, imgp); credential_changing |= will_transition; #endif if (credential_changing && #ifdef CAPABILITY_MODE ((oldcred->cr_flags & CRED_FLAG_CAPMODE) == 0) && #endif (imgp->vp->v_mount->mnt_flag & MNT_NOSUID) == 0 && (p->p_flag & P_TRACED) == 0) { /* * Turn off syscall tracing for set-id programs, except for * root. Record any set-id flags first to make sure that * we do not regain any tracing during a possible block. */ setsugid(p); #ifdef KTRACE if (p->p_tracecred != NULL && priv_check_cred(p->p_tracecred, PRIV_DEBUG_DIFFCRED, 0)) ktrprocexec(p, &tracecred, &tracevp); #endif /* * Close any file descriptors 0..2 that reference procfs, * then make sure file descriptors 0..2 are in use. * * Both fdsetugidsafety() and fdcheckstd() may call functions * taking sleepable locks, so temporarily drop our locks. */ PROC_UNLOCK(p); VOP_UNLOCK(imgp->vp, 0); fdsetugidsafety(td); error = fdcheckstd(td); if (error != 0) goto done1; newcred = crdup(oldcred); euip = uifind(attr.va_uid); vn_lock(imgp->vp, LK_SHARED | LK_RETRY); PROC_LOCK(p); /* * Set the new credentials. */ if (attr.va_mode & S_ISUID) change_euid(newcred, euip); if (attr.va_mode & S_ISGID) change_egid(newcred, attr.va_gid); #ifdef MAC if (will_transition) { mac_vnode_execve_transition(oldcred, newcred, imgp->vp, interpvplabel, imgp); } #endif /* * Implement correct POSIX saved-id behavior. * * XXXMAC: Note that the current logic will save the * uid and gid if a MAC domain transition occurs, even * though maybe it shouldn't. */ change_svuid(newcred, newcred->cr_uid); change_svgid(newcred, newcred->cr_gid); proc_set_cred(p, newcred); } else { if (oldcred->cr_uid == oldcred->cr_ruid && oldcred->cr_gid == oldcred->cr_rgid) p->p_flag &= ~P_SUGID; /* * Implement correct POSIX saved-id behavior. * * XXX: It's not clear that the existing behavior is * POSIX-compliant. A number of sources indicate that the * saved uid/gid should only be updated if the new ruid is * not equal to the old ruid, or the new euid is not equal * to the old euid and the new euid is not equal to the old * ruid. The FreeBSD code always updates the saved uid/gid. * Also, this code uses the new (replaced) euid and egid as * the source, which may or may not be the right ones to use. */ if (oldcred->cr_svuid != oldcred->cr_uid || oldcred->cr_svgid != oldcred->cr_gid) { PROC_UNLOCK(p); VOP_UNLOCK(imgp->vp, 0); newcred = crdup(oldcred); vn_lock(imgp->vp, LK_SHARED | LK_RETRY); PROC_LOCK(p); change_svuid(newcred, newcred->cr_uid); change_svgid(newcred, newcred->cr_gid); proc_set_cred(p, newcred); } } /* * Store the vp for use in procfs. This vnode was referenced prior * to locking the proc lock. */ textvp = p->p_textvp; p->p_textvp = binvp; #ifdef KDTRACE_HOOKS /* * Tell the DTrace fasttrap provider about the exec if it * has declared an interest. */ if (dtrace_fasttrap_exec) dtrace_fasttrap_exec(p); #endif /* * Notify others that we exec'd, and clear the P_INEXEC flag * as we're now a bona fide freshly-execed process. */ KNOTE_LOCKED(&p->p_klist, NOTE_EXEC); p->p_flag &= ~P_INEXEC; /* clear "fork but no exec" flag, as we _are_ execing */ p->p_acflag &= ~AFORK; /* * Free any previous argument cache and replace it with * the new argument cache, if any. */ oldargs = p->p_args; p->p_args = newargs; newargs = NULL; #ifdef HWPMC_HOOKS /* * Check if system-wide sampling is in effect or if the * current process is using PMCs. If so, do exec() time * processing. This processing needs to happen AFTER the * P_INEXEC flag is cleared. * * The proc lock needs to be released before taking the PMC * SX. */ if (PMC_SYSTEM_SAMPLING_ACTIVE() || PMC_PROC_IS_USING_PMCS(p)) { PROC_UNLOCK(p); VOP_UNLOCK(imgp->vp, 0); pe.pm_credentialschanged = credential_changing; pe.pm_entryaddr = imgp->entry_addr; PMC_CALL_HOOK_X(td, PMC_FN_PROCESS_EXEC, (void *) &pe); vn_lock(imgp->vp, LK_SHARED | LK_RETRY); } else PROC_UNLOCK(p); #else /* !HWPMC_HOOKS */ PROC_UNLOCK(p); #endif /* Set values passed into the program in registers. */ if (p->p_sysent->sv_setregs) (*p->p_sysent->sv_setregs)(td, imgp, (u_long)(uintptr_t)stack_base); else exec_setregs(td, imgp, (u_long)(uintptr_t)stack_base); vfs_mark_atime(imgp->vp, td->td_ucred); SDT_PROBE(proc, kernel, , exec__success, args->fname, 0, 0, 0, 0); VOP_UNLOCK(imgp->vp, 0); done1: /* * Free any resources malloc'd earlier that we didn't use. */ if (euip != NULL) uifree(euip); if (newcred != NULL) crfree(oldcred); /* * Handle deferred decrement of ref counts. */ if (textvp != NULL) vrele(textvp); if (error != 0) vrele(binvp); #ifdef KTRACE if (tracevp != NULL) vrele(tracevp); if (tracecred != NULL) crfree(tracecred); #endif vn_lock(imgp->vp, LK_SHARED | LK_RETRY); pargs_drop(oldargs); pargs_drop(newargs); if (oldsigacts != NULL) sigacts_free(oldsigacts); exec_fail_dealloc: /* * free various allocated resources */ if (imgp->firstpage != NULL) exec_unmap_first_page(imgp); if (imgp->vp != NULL) { if (args->fname) NDFREE(&nd, NDF_ONLY_PNBUF); if (imgp->opened) VOP_CLOSE(imgp->vp, FREAD, td->td_ucred, td); vput(imgp->vp); } if (imgp->object != NULL) vm_object_deallocate(imgp->object); free(imgp->freepath, M_TEMP); if (error == 0) { PROC_LOCK(p); td->td_dbgflags |= TDB_EXEC; PROC_UNLOCK(p); /* * Stop the process here if its stop event mask has * the S_EXEC bit set. */ STOPEVENT(p, S_EXEC, 0); goto done2; } exec_fail: /* we're done here, clear P_INEXEC */ PROC_LOCK(p); p->p_flag &= ~P_INEXEC; PROC_UNLOCK(p); SDT_PROBE(proc, kernel, , exec__failure, error, 0, 0, 0, 0); done2: #ifdef MAC mac_execve_exit(imgp); mac_execve_interpreter_exit(interpvplabel); #endif exec_free_args(args); if (error && imgp->vmspace_destroyed) { /* sorry, no more process anymore. exit gracefully */ exit1(td, W_EXITCODE(0, SIGABRT)); /* NOT REACHED */ } #ifdef KTRACE if (error == 0) ktrprocctor(p); #endif return (error); } int exec_map_first_page(imgp) struct image_params *imgp; { int rv, i; int initial_pagein; vm_page_t ma[VM_INITIAL_PAGEIN]; vm_object_t object; if (imgp->firstpage != NULL) exec_unmap_first_page(imgp); object = imgp->vp->v_object; if (object == NULL) return (EACCES); VM_OBJECT_WLOCK(object); #if VM_NRESERVLEVEL > 0 vm_object_color(object, 0); #endif ma[0] = vm_page_grab(object, 0, VM_ALLOC_NORMAL); if (ma[0]->valid != VM_PAGE_BITS_ALL) { initial_pagein = VM_INITIAL_PAGEIN; if (initial_pagein > object->size) initial_pagein = object->size; for (i = 1; i < initial_pagein; i++) { if ((ma[i] = vm_page_next(ma[i - 1])) != NULL) { if (ma[i]->valid) break; if (vm_page_tryxbusy(ma[i])) break; } else { ma[i] = vm_page_alloc(object, i, VM_ALLOC_NORMAL | VM_ALLOC_IFNOTCACHED); if (ma[i] == NULL) break; } } initial_pagein = i; rv = vm_pager_get_pages(object, ma, initial_pagein, 0); ma[0] = vm_page_lookup(object, 0); if ((rv != VM_PAGER_OK) || (ma[0] == NULL)) { if (ma[0] != NULL) { vm_page_lock(ma[0]); vm_page_free(ma[0]); vm_page_unlock(ma[0]); } VM_OBJECT_WUNLOCK(object); return (EIO); } } vm_page_xunbusy(ma[0]); vm_page_lock(ma[0]); vm_page_hold(ma[0]); vm_page_activate(ma[0]); vm_page_unlock(ma[0]); VM_OBJECT_WUNLOCK(object); imgp->firstpage = sf_buf_alloc(ma[0], 0); imgp->image_header = (char *)sf_buf_kva(imgp->firstpage); return (0); } void exec_unmap_first_page(imgp) struct image_params *imgp; { vm_page_t m; if (imgp->firstpage != NULL) { m = sf_buf_page(imgp->firstpage); sf_buf_free(imgp->firstpage); imgp->firstpage = NULL; vm_page_lock(m); vm_page_unhold(m); vm_page_unlock(m); } } /* * Destroy old address space, and allocate a new stack * The new stack is only SGROWSIZ large because it is grown * automatically in trap.c. */ int exec_new_vmspace(imgp, sv) struct image_params *imgp; struct sysentvec *sv; { int error; struct proc *p = imgp->proc; struct vmspace *vmspace = p->p_vmspace; vm_object_t obj; + struct rlimit rlim_stack; vm_offset_t sv_minuser, stack_addr; vm_map_t map; u_long ssiz; imgp->vmspace_destroyed = 1; imgp->sysent = sv; /* May be called with Giant held */ EVENTHANDLER_INVOKE(process_exec, p, imgp); /* * Blow away entire process VM, if address space not shared, * otherwise, create a new VM space so that other threads are * not disrupted */ map = &vmspace->vm_map; if (map_at_zero) sv_minuser = sv->sv_minuser; else sv_minuser = MAX(sv->sv_minuser, PAGE_SIZE); if (vmspace->vm_refcnt == 1 && vm_map_min(map) == sv_minuser && vm_map_max(map) == sv->sv_maxuser) { shmexit(vmspace); pmap_remove_pages(vmspace_pmap(vmspace)); vm_map_remove(map, vm_map_min(map), vm_map_max(map)); } else { error = vmspace_exec(p, sv_minuser, sv->sv_maxuser); if (error) return (error); vmspace = p->p_vmspace; map = &vmspace->vm_map; } /* Map a shared page */ obj = sv->sv_shared_page_obj; if (obj != NULL) { vm_object_reference(obj); error = vm_map_fixed(map, obj, 0, sv->sv_shared_page_base, sv->sv_shared_page_len, VM_PROT_READ | VM_PROT_EXECUTE, VM_PROT_READ | VM_PROT_EXECUTE, MAP_INHERIT_SHARE | MAP_ACC_NO_CHARGE); if (error) { vm_object_deallocate(obj); return (error); } } /* Allocate a new stack */ - if (sv->sv_maxssiz != NULL) + if (imgp->stack_sz != 0) { + ssiz = imgp->stack_sz; + PROC_LOCK(p); + lim_rlimit(p, RLIMIT_STACK, &rlim_stack); + PROC_UNLOCK(p); + if (ssiz > rlim_stack.rlim_max) + ssiz = rlim_stack.rlim_max; + if (ssiz > rlim_stack.rlim_cur) { + rlim_stack.rlim_cur = ssiz; + kern_setrlimit(curthread, RLIMIT_STACK, &rlim_stack); + } + } else if (sv->sv_maxssiz != NULL) { ssiz = *sv->sv_maxssiz; - else + } else { ssiz = maxssiz; + } stack_addr = sv->sv_usrstack - ssiz; error = vm_map_stack(map, stack_addr, (vm_size_t)ssiz, obj != NULL && imgp->stack_prot != 0 ? imgp->stack_prot : sv->sv_stackprot, VM_PROT_ALL, MAP_STACK_GROWS_DOWN); if (error) return (error); /* * vm_ssize and vm_maxsaddr are somewhat antiquated concepts, but they * are still used to enforce the stack rlimit on the process stack. */ vmspace->vm_ssize = sgrowsiz >> PAGE_SHIFT; vmspace->vm_maxsaddr = (char *)sv->sv_usrstack - ssiz; return (0); } /* * Copy out argument and environment strings from the old process address * space into the temporary string buffer. */ int exec_copyin_args(struct image_args *args, char *fname, enum uio_seg segflg, char **argv, char **envv) { u_long argp, envp; int error; size_t length; bzero(args, sizeof(*args)); if (argv == NULL) return (EFAULT); /* * Allocate demand-paged memory for the file name, argument, and * environment strings. */ error = exec_alloc_args(args); if (error != 0) return (error); /* * Copy the file name. */ if (fname != NULL) { args->fname = args->buf; error = (segflg == UIO_SYSSPACE) ? copystr(fname, args->fname, PATH_MAX, &length) : copyinstr(fname, args->fname, PATH_MAX, &length); if (error != 0) goto err_exit; } else length = 0; args->begin_argv = args->buf + length; args->endp = args->begin_argv; args->stringspace = ARG_MAX; /* * extract arguments first */ for (;;) { error = fueword(argv++, &argp); if (error == -1) { error = EFAULT; goto err_exit; } if (argp == 0) break; error = copyinstr((void *)(uintptr_t)argp, args->endp, args->stringspace, &length); if (error != 0) { if (error == ENAMETOOLONG) error = E2BIG; goto err_exit; } args->stringspace -= length; args->endp += length; args->argc++; } args->begin_envv = args->endp; /* * extract environment strings */ if (envv) { for (;;) { error = fueword(envv++, &envp); if (error == -1) { error = EFAULT; goto err_exit; } if (envp == 0) break; error = copyinstr((void *)(uintptr_t)envp, args->endp, args->stringspace, &length); if (error != 0) { if (error == ENAMETOOLONG) error = E2BIG; goto err_exit; } args->stringspace -= length; args->endp += length; args->envc++; } } return (0); err_exit: exec_free_args(args); return (error); } /* * Allocate temporary demand-paged, zero-filled memory for the file name, * argument, and environment strings. Returns zero if the allocation succeeds * and ENOMEM otherwise. */ int exec_alloc_args(struct image_args *args) { args->buf = (char *)kmap_alloc_wait(exec_map, PATH_MAX + ARG_MAX); return (args->buf != NULL ? 0 : ENOMEM); } void exec_free_args(struct image_args *args) { if (args->buf != NULL) { kmap_free_wakeup(exec_map, (vm_offset_t)args->buf, PATH_MAX + ARG_MAX); args->buf = NULL; } if (args->fname_buf != NULL) { free(args->fname_buf, M_TEMP); args->fname_buf = NULL; } } /* * Copy strings out to the new process address space, constructing new arg * and env vector tables. Return a pointer to the base so that it can be used * as the initial stack pointer. */ register_t * exec_copyout_strings(imgp) struct image_params *imgp; { int argc, envc; char **vectp; char *stringp; uintptr_t destp; register_t *stack_base; struct ps_strings *arginfo; struct proc *p; size_t execpath_len; int szsigcode, szps; char canary[sizeof(long) * 8]; szps = sizeof(pagesizes[0]) * MAXPAGESIZES; /* * Calculate string base and vector table pointers. * Also deal with signal trampoline code for this exec type. */ if (imgp->execpath != NULL && imgp->auxargs != NULL) execpath_len = strlen(imgp->execpath) + 1; else execpath_len = 0; p = imgp->proc; szsigcode = 0; arginfo = (struct ps_strings *)p->p_sysent->sv_psstrings; if (p->p_sysent->sv_sigcode_base == 0) { if (p->p_sysent->sv_szsigcode != NULL) szsigcode = *(p->p_sysent->sv_szsigcode); } destp = (uintptr_t)arginfo; /* * install sigcode */ if (szsigcode != 0) { destp -= szsigcode; destp = rounddown2(destp, sizeof(void *)); copyout(p->p_sysent->sv_sigcode, (void *)destp, szsigcode); } /* * Copy the image path for the rtld. */ if (execpath_len != 0) { destp -= execpath_len; imgp->execpathp = destp; copyout(imgp->execpath, (void *)destp, execpath_len); } /* * Prepare the canary for SSP. */ arc4rand(canary, sizeof(canary), 0); destp -= sizeof(canary); imgp->canary = destp; copyout(canary, (void *)destp, sizeof(canary)); imgp->canarylen = sizeof(canary); /* * Prepare the pagesizes array. */ destp -= szps; destp = rounddown2(destp, sizeof(void *)); imgp->pagesizes = destp; copyout(pagesizes, (void *)destp, szps); imgp->pagesizeslen = szps; destp -= ARG_MAX - imgp->args->stringspace; destp = rounddown2(destp, sizeof(void *)); /* * If we have a valid auxargs ptr, prepare some room * on the stack. */ if (imgp->auxargs) { /* * 'AT_COUNT*2' is size for the ELF Auxargs data. This is for * lower compatibility. */ imgp->auxarg_size = (imgp->auxarg_size) ? imgp->auxarg_size : (AT_COUNT * 2); /* * The '+ 2' is for the null pointers at the end of each of * the arg and env vector sets,and imgp->auxarg_size is room * for argument of Runtime loader. */ vectp = (char **)(destp - (imgp->args->argc + imgp->args->envc + 2 + imgp->auxarg_size) * sizeof(char *)); } else { /* * The '+ 2' is for the null pointers at the end of each of * the arg and env vector sets */ vectp = (char **)(destp - (imgp->args->argc + imgp->args->envc + 2) * sizeof(char *)); } /* * vectp also becomes our initial stack base */ stack_base = (register_t *)vectp; stringp = imgp->args->begin_argv; argc = imgp->args->argc; envc = imgp->args->envc; /* * Copy out strings - arguments and environment. */ copyout(stringp, (void *)destp, ARG_MAX - imgp->args->stringspace); /* * Fill in "ps_strings" struct for ps, w, etc. */ suword(&arginfo->ps_argvstr, (long)(intptr_t)vectp); suword32(&arginfo->ps_nargvstr, argc); /* * Fill in argument portion of vector table. */ for (; argc > 0; --argc) { suword(vectp++, (long)(intptr_t)destp); while (*stringp++ != 0) destp++; destp++; } /* a null vector table pointer separates the argp's from the envp's */ suword(vectp++, 0); suword(&arginfo->ps_envstr, (long)(intptr_t)vectp); suword32(&arginfo->ps_nenvstr, envc); /* * Fill in environment portion of vector table. */ for (; envc > 0; --envc) { suword(vectp++, (long)(intptr_t)destp); while (*stringp++ != 0) destp++; destp++; } /* end of vector table is a null pointer */ suword(vectp, 0); return (stack_base); } /* * Check permissions of file to execute. * Called with imgp->vp locked. * Return 0 for success or error code on failure. */ int exec_check_permissions(imgp) struct image_params *imgp; { struct vnode *vp = imgp->vp; struct vattr *attr = imgp->attr; struct thread *td; int error, writecount; td = curthread; /* Get file attributes */ error = VOP_GETATTR(vp, attr, td->td_ucred); if (error) return (error); #ifdef MAC error = mac_vnode_check_exec(td->td_ucred, imgp->vp, imgp); if (error) return (error); #endif /* * 1) Check if file execution is disabled for the filesystem that * this file resides on. * 2) Ensure that at least one execute bit is on. Otherwise, a * privileged user will always succeed, and we don't want this * to happen unless the file really is executable. * 3) Ensure that the file is a regular file. */ if ((vp->v_mount->mnt_flag & MNT_NOEXEC) || (attr->va_mode & (S_IXUSR | S_IXGRP | S_IXOTH)) == 0 || (attr->va_type != VREG)) return (EACCES); /* * Zero length files can't be exec'd */ if (attr->va_size == 0) return (ENOEXEC); /* * Check for execute permission to file based on current credentials. */ error = VOP_ACCESS(vp, VEXEC, td->td_ucred, td); if (error) return (error); /* * Check number of open-for-writes on the file and deny execution * if there are any. */ error = VOP_GET_WRITECOUNT(vp, &writecount); if (error != 0) return (error); if (writecount != 0) return (ETXTBSY); /* * Call filesystem specific open routine (which does nothing in the * general case). */ error = VOP_OPEN(vp, FREAD, td->td_ucred, td, NULL); if (error == 0) imgp->opened = 1; return (error); } /* * Exec handler registration */ int exec_register(execsw_arg) const struct execsw *execsw_arg; { const struct execsw **es, **xs, **newexecsw; int count = 2; /* New slot and trailing NULL */ if (execsw) for (es = execsw; *es; es++) count++; newexecsw = malloc(count * sizeof(*es), M_TEMP, M_WAITOK); if (newexecsw == NULL) return (ENOMEM); xs = newexecsw; if (execsw) for (es = execsw; *es; es++) *xs++ = *es; *xs++ = execsw_arg; *xs = NULL; if (execsw) free(execsw, M_TEMP); execsw = newexecsw; return (0); } int exec_unregister(execsw_arg) const struct execsw *execsw_arg; { const struct execsw **es, **xs, **newexecsw; int count = 1; if (execsw == NULL) panic("unregister with no handlers left?\n"); for (es = execsw; *es; es++) { if (*es == execsw_arg) break; } if (*es == NULL) return (ENOENT); for (es = execsw; *es; es++) if (*es != execsw_arg) count++; newexecsw = malloc(count * sizeof(*es), M_TEMP, M_WAITOK); if (newexecsw == NULL) return (ENOMEM); xs = newexecsw; for (es = execsw; *es; es++) if (*es != execsw_arg) *xs++ = *es; *xs = NULL; if (execsw) free(execsw, M_TEMP); execsw = newexecsw; return (0); } Index: user/ngie/more-tests/sys/kern/kern_poll.c =================================================================== --- user/ngie/more-tests/sys/kern/kern_poll.c (revision 281584) +++ user/ngie/more-tests/sys/kern/kern_poll.c (revision 281585) @@ -1,568 +1,574 @@ /*- * Copyright (c) 2001-2002 Luigi Rizzo * * Supported by: the Xorp Project (www.xorp.org) * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHORS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHORS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ #include __FBSDID("$FreeBSD$"); #include "opt_device_polling.h" #include #include #include #include #include #include #include #include /* needed by net/if.h */ #include #include #include #include #include #include /* for NETISR_POLL */ #include void hardclock_device_poll(void); /* hook from hardclock */ static struct mtx poll_mtx; /* * Polling support for [network] device drivers. * * Drivers which support this feature can register with the * polling code. * * If registration is successful, the driver must disable interrupts, * and further I/O is performed through the handler, which is invoked * (at least once per clock tick) with 3 arguments: the "arg" passed at * register time (a struct ifnet pointer), a command, and a "count" limit. * * The command can be one of the following: * POLL_ONLY: quick move of "count" packets from input/output queues. * POLL_AND_CHECK_STATUS: as above, plus check status registers or do * other more expensive operations. This command is issued periodically * but less frequently than POLL_ONLY. * * The count limit specifies how much work the handler can do during the * call -- typically this is the number of packets to be received, or * transmitted, etc. (drivers are free to interpret this number, as long * as the max time spent in the function grows roughly linearly with the * count). * * Polling is enabled and disabled via setting IFCAP_POLLING flag on * the interface. The driver ioctl handler should register interface * with polling and disable interrupts, if registration was successful. * * A second variable controls the sharing of CPU between polling/kernel * network processing, and other activities (typically userlevel tasks): * kern.polling.user_frac (between 0 and 100, default 50) sets the share * of CPU allocated to user tasks. CPU is allocated proportionally to the * shares, by dynamically adjusting the "count" (poll_burst). * * Other parameters can should be left to their default values. * The following constraints hold * * 1 <= poll_each_burst <= poll_burst <= poll_burst_max * MIN_POLL_BURST_MAX <= poll_burst_max <= MAX_POLL_BURST_MAX */ #define MIN_POLL_BURST_MAX 10 #define MAX_POLL_BURST_MAX 20000 static uint32_t poll_burst = 5; static uint32_t poll_burst_max = 150; /* good for 100Mbit net and HZ=1000 */ static uint32_t poll_each_burst = 5; static SYSCTL_NODE(_kern, OID_AUTO, polling, CTLFLAG_RW, 0, "Device polling parameters"); SYSCTL_UINT(_kern_polling, OID_AUTO, burst, CTLFLAG_RD, &poll_burst, 0, "Current polling burst size"); static int netisr_poll_scheduled; static int netisr_pollmore_scheduled; static int poll_shutting_down; static int poll_burst_max_sysctl(SYSCTL_HANDLER_ARGS) { uint32_t val = poll_burst_max; int error; error = sysctl_handle_int(oidp, &val, 0, req); if (error || !req->newptr ) return (error); if (val < MIN_POLL_BURST_MAX || val > MAX_POLL_BURST_MAX) return (EINVAL); mtx_lock(&poll_mtx); poll_burst_max = val; if (poll_burst > poll_burst_max) poll_burst = poll_burst_max; if (poll_each_burst > poll_burst_max) poll_each_burst = MIN_POLL_BURST_MAX; mtx_unlock(&poll_mtx); return (0); } SYSCTL_PROC(_kern_polling, OID_AUTO, burst_max, CTLTYPE_UINT | CTLFLAG_RW, 0, sizeof(uint32_t), poll_burst_max_sysctl, "I", "Max Polling burst size"); static int poll_each_burst_sysctl(SYSCTL_HANDLER_ARGS) { uint32_t val = poll_each_burst; int error; error = sysctl_handle_int(oidp, &val, 0, req); if (error || !req->newptr ) return (error); if (val < 1) return (EINVAL); mtx_lock(&poll_mtx); if (val > poll_burst_max) { mtx_unlock(&poll_mtx); return (EINVAL); } poll_each_burst = val; mtx_unlock(&poll_mtx); return (0); } SYSCTL_PROC(_kern_polling, OID_AUTO, each_burst, CTLTYPE_UINT | CTLFLAG_RW, 0, sizeof(uint32_t), poll_each_burst_sysctl, "I", "Max size of each burst"); static uint32_t poll_in_idle_loop=0; /* do we poll in idle loop ? */ SYSCTL_UINT(_kern_polling, OID_AUTO, idle_poll, CTLFLAG_RW, &poll_in_idle_loop, 0, "Enable device polling in idle loop"); static uint32_t user_frac = 50; static int user_frac_sysctl(SYSCTL_HANDLER_ARGS) { uint32_t val = user_frac; int error; error = sysctl_handle_int(oidp, &val, 0, req); if (error || !req->newptr ) return (error); if (val > 99) return (EINVAL); mtx_lock(&poll_mtx); user_frac = val; mtx_unlock(&poll_mtx); return (0); } SYSCTL_PROC(_kern_polling, OID_AUTO, user_frac, CTLTYPE_UINT | CTLFLAG_RW, 0, sizeof(uint32_t), user_frac_sysctl, "I", "Desired user fraction of cpu time"); static uint32_t reg_frac_count = 0; static uint32_t reg_frac = 20 ; static int reg_frac_sysctl(SYSCTL_HANDLER_ARGS) { uint32_t val = reg_frac; int error; error = sysctl_handle_int(oidp, &val, 0, req); if (error || !req->newptr ) return (error); if (val < 1 || val > hz) return (EINVAL); mtx_lock(&poll_mtx); reg_frac = val; if (reg_frac_count >= reg_frac) reg_frac_count = 0; mtx_unlock(&poll_mtx); return (0); } SYSCTL_PROC(_kern_polling, OID_AUTO, reg_frac, CTLTYPE_UINT | CTLFLAG_RW, 0, sizeof(uint32_t), reg_frac_sysctl, "I", "Every this many cycles check registers"); static uint32_t short_ticks; SYSCTL_UINT(_kern_polling, OID_AUTO, short_ticks, CTLFLAG_RD, &short_ticks, 0, "Hardclock ticks shorter than they should be"); static uint32_t lost_polls; SYSCTL_UINT(_kern_polling, OID_AUTO, lost_polls, CTLFLAG_RD, &lost_polls, 0, "How many times we would have lost a poll tick"); static uint32_t pending_polls; SYSCTL_UINT(_kern_polling, OID_AUTO, pending_polls, CTLFLAG_RD, &pending_polls, 0, "Do we need to poll again"); static int residual_burst = 0; SYSCTL_INT(_kern_polling, OID_AUTO, residual_burst, CTLFLAG_RD, &residual_burst, 0, "# of residual cycles in burst"); static uint32_t poll_handlers; /* next free entry in pr[]. */ SYSCTL_UINT(_kern_polling, OID_AUTO, handlers, CTLFLAG_RD, &poll_handlers, 0, "Number of registered poll handlers"); static uint32_t phase; SYSCTL_UINT(_kern_polling, OID_AUTO, phase, CTLFLAG_RD, &phase, 0, "Polling phase"); static uint32_t suspect; SYSCTL_UINT(_kern_polling, OID_AUTO, suspect, CTLFLAG_RD, &suspect, 0, "suspect event"); static uint32_t stalled; SYSCTL_UINT(_kern_polling, OID_AUTO, stalled, CTLFLAG_RD, &stalled, 0, "potential stalls"); static uint32_t idlepoll_sleeping; /* idlepoll is sleeping */ SYSCTL_UINT(_kern_polling, OID_AUTO, idlepoll_sleeping, CTLFLAG_RD, &idlepoll_sleeping, 0, "idlepoll is sleeping"); #define POLL_LIST_LEN 128 struct pollrec { poll_handler_t *handler; struct ifnet *ifp; }; static struct pollrec pr[POLL_LIST_LEN]; static void poll_shutdown(void *arg, int howto) { poll_shutting_down = 1; } static void init_device_poll(void) { mtx_init(&poll_mtx, "polling", NULL, MTX_DEF); EVENTHANDLER_REGISTER(shutdown_post_sync, poll_shutdown, NULL, SHUTDOWN_PRI_LAST); } SYSINIT(device_poll, SI_SUB_SOFTINTR, SI_ORDER_MIDDLE, init_device_poll, NULL); /* * Hook from hardclock. Tries to schedule a netisr, but keeps track * of lost ticks due to the previous handler taking too long. * Normally, this should not happen, because polling handler should * run for a short time. However, in some cases (e.g. when there are * changes in link status etc.) the drivers take a very long time * (even in the order of milliseconds) to reset and reconfigure the * device, causing apparent lost polls. * * The first part of the code is just for debugging purposes, and tries * to count how often hardclock ticks are shorter than they should, * meaning either stray interrupts or delayed events. */ void hardclock_device_poll(void) { static struct timeval prev_t, t; int delta; if (poll_handlers == 0 || poll_shutting_down) return; microuptime(&t); delta = (t.tv_usec - prev_t.tv_usec) + (t.tv_sec - prev_t.tv_sec)*1000000; if (delta * hz < 500000) short_ticks++; else prev_t = t; if (pending_polls > 100) { /* * Too much, assume it has stalled (not always true * see comment above). */ stalled++; pending_polls = 0; phase = 0; } if (phase <= 2) { if (phase != 0) suspect++; phase = 1; netisr_poll_scheduled = 1; netisr_pollmore_scheduled = 1; netisr_sched_poll(); phase = 2; } if (pending_polls++ > 0) lost_polls++; } /* * ether_poll is called from the idle loop. */ static void ether_poll(int count) { int i; mtx_lock(&poll_mtx); if (count > poll_each_burst) count = poll_each_burst; for (i = 0 ; i < poll_handlers ; i++) pr[i].handler(pr[i].ifp, POLL_ONLY, count); mtx_unlock(&poll_mtx); } /* * netisr_pollmore is called after other netisr's, possibly scheduling * another NETISR_POLL call, or adapting the burst size for the next cycle. * * It is very bad to fetch large bursts of packets from a single card at once, * because the burst could take a long time to be completely processed, or * could saturate the intermediate queue (ipintrq or similar) leading to * losses or unfairness. To reduce the problem, and also to account better for * time spent in network-related processing, we split the burst in smaller * chunks of fixed size, giving control to the other netisr's between chunks. * This helps in improving the fairness, reducing livelock (because we * emulate more closely the "process to completion" that we have with * fastforwarding) and accounting for the work performed in low level * handling and forwarding. */ static struct timeval poll_start_t; void netisr_pollmore() { struct timeval t; int kern_load; + if (poll_handlers == 0) + return; + mtx_lock(&poll_mtx); if (!netisr_pollmore_scheduled) { mtx_unlock(&poll_mtx); return; } netisr_pollmore_scheduled = 0; phase = 5; if (residual_burst > 0) { netisr_poll_scheduled = 1; netisr_pollmore_scheduled = 1; netisr_sched_poll(); mtx_unlock(&poll_mtx); /* will run immediately on return, followed by netisrs */ return; } /* here we can account time spent in netisr's in this tick */ microuptime(&t); kern_load = (t.tv_usec - poll_start_t.tv_usec) + (t.tv_sec - poll_start_t.tv_sec)*1000000; /* us */ kern_load = (kern_load * hz) / 10000; /* 0..100 */ if (kern_load > (100 - user_frac)) { /* try decrease ticks */ if (poll_burst > 1) poll_burst--; } else { if (poll_burst < poll_burst_max) poll_burst++; } pending_polls--; if (pending_polls == 0) /* we are done */ phase = 0; else { /* * Last cycle was long and caused us to miss one or more * hardclock ticks. Restart processing again, but slightly * reduce the burst size to prevent that this happens again. */ poll_burst -= (poll_burst / 8); if (poll_burst < 1) poll_burst = 1; netisr_poll_scheduled = 1; netisr_pollmore_scheduled = 1; netisr_sched_poll(); phase = 6; } mtx_unlock(&poll_mtx); } /* * netisr_poll is typically scheduled once per tick. */ void netisr_poll(void) { int i, cycles; enum poll_cmd arg = POLL_ONLY; + + if (poll_handlers == 0) + return; mtx_lock(&poll_mtx); if (!netisr_poll_scheduled) { mtx_unlock(&poll_mtx); return; } netisr_poll_scheduled = 0; phase = 3; if (residual_burst == 0) { /* first call in this tick */ microuptime(&poll_start_t); if (++reg_frac_count == reg_frac) { arg = POLL_AND_CHECK_STATUS; reg_frac_count = 0; } residual_burst = poll_burst; } cycles = (residual_burst < poll_each_burst) ? residual_burst : poll_each_burst; residual_burst -= cycles; for (i = 0 ; i < poll_handlers ; i++) pr[i].handler(pr[i].ifp, arg, cycles); phase = 4; mtx_unlock(&poll_mtx); } /* * Try to register routine for polling. Returns 0 if successful * (and polling should be enabled), error code otherwise. * A device is not supposed to register itself multiple times. * * This is called from within the *_ioctl() functions. */ int ether_poll_register(poll_handler_t *h, if_t ifp) { int i; KASSERT(h != NULL, ("%s: handler is NULL", __func__)); KASSERT(ifp != NULL, ("%s: ifp is NULL", __func__)); mtx_lock(&poll_mtx); if (poll_handlers >= POLL_LIST_LEN) { /* * List full, cannot register more entries. * This should never happen; if it does, it is probably a * broken driver trying to register multiple times. Checking * this at runtime is expensive, and won't solve the problem * anyways, so just report a few times and then give up. */ static int verbose = 10 ; if (verbose >0) { log(LOG_ERR, "poll handlers list full, " "maybe a broken driver ?\n"); verbose--; } mtx_unlock(&poll_mtx); return (ENOMEM); /* no polling for you */ } for (i = 0 ; i < poll_handlers ; i++) if (pr[i].ifp == ifp && pr[i].handler != NULL) { mtx_unlock(&poll_mtx); log(LOG_DEBUG, "ether_poll_register: %s: handler" " already registered\n", ifp->if_xname); return (EEXIST); } pr[poll_handlers].handler = h; pr[poll_handlers].ifp = ifp; poll_handlers++; mtx_unlock(&poll_mtx); if (idlepoll_sleeping) wakeup(&idlepoll_sleeping); return (0); } /* * Remove interface from the polling list. Called from *_ioctl(), too. */ int ether_poll_deregister(if_t ifp) { int i; KASSERT(ifp != NULL, ("%s: ifp is NULL", __func__)); mtx_lock(&poll_mtx); for (i = 0 ; i < poll_handlers ; i++) if (pr[i].ifp == ifp) /* found it */ break; if (i == poll_handlers) { log(LOG_DEBUG, "ether_poll_deregister: %s: not found!\n", ifp->if_xname); mtx_unlock(&poll_mtx); return (ENOENT); } poll_handlers--; if (i < poll_handlers) { /* Last entry replaces this one. */ pr[i].handler = pr[poll_handlers].handler; pr[i].ifp = pr[poll_handlers].ifp; } mtx_unlock(&poll_mtx); return (0); } static void poll_idle(void) { struct thread *td = curthread; struct rtprio rtp; rtp.prio = RTP_PRIO_MAX; /* lowest priority */ rtp.type = RTP_PRIO_IDLE; PROC_SLOCK(td->td_proc); rtp_to_pri(&rtp, td); PROC_SUNLOCK(td->td_proc); for (;;) { if (poll_in_idle_loop && poll_handlers > 0) { idlepoll_sleeping = 0; ether_poll(poll_each_burst); thread_lock(td); mi_switch(SW_VOL, NULL); thread_unlock(td); } else { idlepoll_sleeping = 1; tsleep(&idlepoll_sleeping, 0, "pollid", hz * 3); } } } static struct proc *idlepoll; static struct kproc_desc idlepoll_kp = { "idlepoll", poll_idle, &idlepoll }; SYSINIT(idlepoll, SI_SUB_KTHREAD_VM, SI_ORDER_ANY, kproc_start, &idlepoll_kp); Index: user/ngie/more-tests/sys/kern/kern_resource.c =================================================================== --- user/ngie/more-tests/sys/kern/kern_resource.c (revision 281584) +++ user/ngie/more-tests/sys/kern/kern_resource.c (revision 281585) @@ -1,1438 +1,1443 @@ /*- * Copyright (c) 1982, 1986, 1991, 1993 * The Regents of the University of California. All rights reserved. * (c) UNIX System Laboratories, Inc. * All or some portions of this file are derived from material licensed * to the University of California by American Telephone and Telegraph * Co. or Unix System Laboratories, Inc. and are reproduced herein with * the permission of UNIX System Laboratories, Inc. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 4. Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * @(#)kern_resource.c 8.5 (Berkeley) 1/21/94 */ #include __FBSDID("$FreeBSD$"); #include "opt_compat.h" #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include static MALLOC_DEFINE(M_PLIMIT, "plimit", "plimit structures"); static MALLOC_DEFINE(M_UIDINFO, "uidinfo", "uidinfo structures"); #define UIHASH(uid) (&uihashtbl[(uid) & uihash]) static struct rwlock uihashtbl_lock; static LIST_HEAD(uihashhead, uidinfo) *uihashtbl; static u_long uihash; /* size of hash table - 1 */ static void calcru1(struct proc *p, struct rusage_ext *ruxp, struct timeval *up, struct timeval *sp); static int donice(struct thread *td, struct proc *chgp, int n); static struct uidinfo *uilookup(uid_t uid); static void ruxagg_locked(struct rusage_ext *rux, struct thread *td); static __inline int lim_shared(struct plimit *limp); /* * Resource controls and accounting. */ #ifndef _SYS_SYSPROTO_H_ struct getpriority_args { int which; int who; }; #endif int sys_getpriority(struct thread *td, register struct getpriority_args *uap) { struct proc *p; struct pgrp *pg; int error, low; error = 0; low = PRIO_MAX + 1; switch (uap->which) { case PRIO_PROCESS: if (uap->who == 0) low = td->td_proc->p_nice; else { p = pfind(uap->who); if (p == NULL) break; if (p_cansee(td, p) == 0) low = p->p_nice; PROC_UNLOCK(p); } break; case PRIO_PGRP: sx_slock(&proctree_lock); if (uap->who == 0) { pg = td->td_proc->p_pgrp; PGRP_LOCK(pg); } else { pg = pgfind(uap->who); if (pg == NULL) { sx_sunlock(&proctree_lock); break; } } sx_sunlock(&proctree_lock); LIST_FOREACH(p, &pg->pg_members, p_pglist) { PROC_LOCK(p); if (p->p_state == PRS_NORMAL && p_cansee(td, p) == 0) { if (p->p_nice < low) low = p->p_nice; } PROC_UNLOCK(p); } PGRP_UNLOCK(pg); break; case PRIO_USER: if (uap->who == 0) uap->who = td->td_ucred->cr_uid; sx_slock(&allproc_lock); FOREACH_PROC_IN_SYSTEM(p) { PROC_LOCK(p); if (p->p_state == PRS_NORMAL && p_cansee(td, p) == 0 && p->p_ucred->cr_uid == uap->who) { if (p->p_nice < low) low = p->p_nice; } PROC_UNLOCK(p); } sx_sunlock(&allproc_lock); break; default: error = EINVAL; break; } if (low == PRIO_MAX + 1 && error == 0) error = ESRCH; td->td_retval[0] = low; return (error); } #ifndef _SYS_SYSPROTO_H_ struct setpriority_args { int which; int who; int prio; }; #endif int sys_setpriority(struct thread *td, struct setpriority_args *uap) { struct proc *curp, *p; struct pgrp *pg; int found = 0, error = 0; curp = td->td_proc; switch (uap->which) { case PRIO_PROCESS: if (uap->who == 0) { PROC_LOCK(curp); error = donice(td, curp, uap->prio); PROC_UNLOCK(curp); } else { p = pfind(uap->who); if (p == NULL) break; error = p_cansee(td, p); if (error == 0) error = donice(td, p, uap->prio); PROC_UNLOCK(p); } found++; break; case PRIO_PGRP: sx_slock(&proctree_lock); if (uap->who == 0) { pg = curp->p_pgrp; PGRP_LOCK(pg); } else { pg = pgfind(uap->who); if (pg == NULL) { sx_sunlock(&proctree_lock); break; } } sx_sunlock(&proctree_lock); LIST_FOREACH(p, &pg->pg_members, p_pglist) { PROC_LOCK(p); if (p->p_state == PRS_NORMAL && p_cansee(td, p) == 0) { error = donice(td, p, uap->prio); found++; } PROC_UNLOCK(p); } PGRP_UNLOCK(pg); break; case PRIO_USER: if (uap->who == 0) uap->who = td->td_ucred->cr_uid; sx_slock(&allproc_lock); FOREACH_PROC_IN_SYSTEM(p) { PROC_LOCK(p); if (p->p_state == PRS_NORMAL && p->p_ucred->cr_uid == uap->who && p_cansee(td, p) == 0) { error = donice(td, p, uap->prio); found++; } PROC_UNLOCK(p); } sx_sunlock(&allproc_lock); break; default: error = EINVAL; break; } if (found == 0 && error == 0) error = ESRCH; return (error); } /* * Set "nice" for a (whole) process. */ static int donice(struct thread *td, struct proc *p, int n) { int error; PROC_LOCK_ASSERT(p, MA_OWNED); if ((error = p_cansched(td, p))) return (error); if (n > PRIO_MAX) n = PRIO_MAX; if (n < PRIO_MIN) n = PRIO_MIN; if (n < p->p_nice && priv_check(td, PRIV_SCHED_SETPRIORITY) != 0) return (EACCES); sched_nice(p, n); return (0); } static int unprivileged_idprio; SYSCTL_INT(_security_bsd, OID_AUTO, unprivileged_idprio, CTLFLAG_RW, &unprivileged_idprio, 0, "Allow non-root users to set an idle priority"); /* * Set realtime priority for LWP. */ #ifndef _SYS_SYSPROTO_H_ struct rtprio_thread_args { int function; lwpid_t lwpid; struct rtprio *rtp; }; #endif int sys_rtprio_thread(struct thread *td, struct rtprio_thread_args *uap) { struct proc *p; struct rtprio rtp; struct thread *td1; int cierror, error; /* Perform copyin before acquiring locks if needed. */ if (uap->function == RTP_SET) cierror = copyin(uap->rtp, &rtp, sizeof(struct rtprio)); else cierror = 0; if (uap->lwpid == 0 || uap->lwpid == td->td_tid) { p = td->td_proc; td1 = td; PROC_LOCK(p); } else { /* Only look up thread in current process */ td1 = tdfind(uap->lwpid, curproc->p_pid); if (td1 == NULL) return (ESRCH); p = td1->td_proc; } switch (uap->function) { case RTP_LOOKUP: if ((error = p_cansee(td, p))) break; pri_to_rtp(td1, &rtp); PROC_UNLOCK(p); return (copyout(&rtp, uap->rtp, sizeof(struct rtprio))); case RTP_SET: if ((error = p_cansched(td, p)) || (error = cierror)) break; /* Disallow setting rtprio in most cases if not superuser. */ /* * Realtime priority has to be restricted for reasons which * should be obvious. However, for idleprio processes, there is * a potential for system deadlock if an idleprio process gains * a lock on a resource that other processes need (and the * idleprio process can't run due to a CPU-bound normal * process). Fix me! XXX * * This problem is not only related to idleprio process. * A user level program can obtain a file lock and hold it * indefinitely. Additionally, without idleprio processes it is * still conceivable that a program with low priority will never * get to run. In short, allowing this feature might make it * easier to lock a resource indefinitely, but it is not the * only thing that makes it possible. */ if (RTP_PRIO_BASE(rtp.type) == RTP_PRIO_REALTIME || (RTP_PRIO_BASE(rtp.type) == RTP_PRIO_IDLE && unprivileged_idprio == 0)) { error = priv_check(td, PRIV_SCHED_RTPRIO); if (error) break; } error = rtp_to_pri(&rtp, td1); break; default: error = EINVAL; break; } PROC_UNLOCK(p); return (error); } /* * Set realtime priority. */ #ifndef _SYS_SYSPROTO_H_ struct rtprio_args { int function; pid_t pid; struct rtprio *rtp; }; #endif int sys_rtprio(struct thread *td, register struct rtprio_args *uap) { struct proc *p; struct thread *tdp; struct rtprio rtp; int cierror, error; /* Perform copyin before acquiring locks if needed. */ if (uap->function == RTP_SET) cierror = copyin(uap->rtp, &rtp, sizeof(struct rtprio)); else cierror = 0; if (uap->pid == 0) { p = td->td_proc; PROC_LOCK(p); } else { p = pfind(uap->pid); if (p == NULL) return (ESRCH); } switch (uap->function) { case RTP_LOOKUP: if ((error = p_cansee(td, p))) break; /* * Return OUR priority if no pid specified, * or if one is, report the highest priority * in the process. There isn't much more you can do as * there is only room to return a single priority. * Note: specifying our own pid is not the same * as leaving it zero. */ if (uap->pid == 0) { pri_to_rtp(td, &rtp); } else { struct rtprio rtp2; rtp.type = RTP_PRIO_IDLE; rtp.prio = RTP_PRIO_MAX; FOREACH_THREAD_IN_PROC(p, tdp) { pri_to_rtp(tdp, &rtp2); if (rtp2.type < rtp.type || (rtp2.type == rtp.type && rtp2.prio < rtp.prio)) { rtp.type = rtp2.type; rtp.prio = rtp2.prio; } } } PROC_UNLOCK(p); return (copyout(&rtp, uap->rtp, sizeof(struct rtprio))); case RTP_SET: if ((error = p_cansched(td, p)) || (error = cierror)) break; /* * Disallow setting rtprio in most cases if not superuser. * See the comment in sys_rtprio_thread about idprio * threads holding a lock. */ if (RTP_PRIO_BASE(rtp.type) == RTP_PRIO_REALTIME || (RTP_PRIO_BASE(rtp.type) == RTP_PRIO_IDLE && !unprivileged_idprio)) { error = priv_check(td, PRIV_SCHED_RTPRIO); if (error) break; } /* * If we are setting our own priority, set just our * thread but if we are doing another process, * do all the threads on that process. If we * specify our own pid we do the latter. */ if (uap->pid == 0) { error = rtp_to_pri(&rtp, td); } else { FOREACH_THREAD_IN_PROC(p, td) { if ((error = rtp_to_pri(&rtp, td)) != 0) break; } } break; default: error = EINVAL; break; } PROC_UNLOCK(p); return (error); } int rtp_to_pri(struct rtprio *rtp, struct thread *td) { u_char newpri, oldclass, oldpri; switch (RTP_PRIO_BASE(rtp->type)) { case RTP_PRIO_REALTIME: if (rtp->prio > RTP_PRIO_MAX) return (EINVAL); newpri = PRI_MIN_REALTIME + rtp->prio; break; case RTP_PRIO_NORMAL: if (rtp->prio > (PRI_MAX_TIMESHARE - PRI_MIN_TIMESHARE)) return (EINVAL); newpri = PRI_MIN_TIMESHARE + rtp->prio; break; case RTP_PRIO_IDLE: if (rtp->prio > RTP_PRIO_MAX) return (EINVAL); newpri = PRI_MIN_IDLE + rtp->prio; break; default: return (EINVAL); } thread_lock(td); oldclass = td->td_pri_class; sched_class(td, rtp->type); /* XXX fix */ oldpri = td->td_user_pri; sched_user_prio(td, newpri); if (td->td_user_pri != oldpri && (oldclass != RTP_PRIO_NORMAL || td->td_pri_class != RTP_PRIO_NORMAL)) sched_prio(td, td->td_user_pri); if (TD_ON_UPILOCK(td) && oldpri != newpri) { critical_enter(); thread_unlock(td); umtx_pi_adjust(td, oldpri); critical_exit(); } else thread_unlock(td); return (0); } void pri_to_rtp(struct thread *td, struct rtprio *rtp) { thread_lock(td); switch (PRI_BASE(td->td_pri_class)) { case PRI_REALTIME: rtp->prio = td->td_base_user_pri - PRI_MIN_REALTIME; break; case PRI_TIMESHARE: rtp->prio = td->td_base_user_pri - PRI_MIN_TIMESHARE; break; case PRI_IDLE: rtp->prio = td->td_base_user_pri - PRI_MIN_IDLE; break; default: break; } rtp->type = td->td_pri_class; thread_unlock(td); } #if defined(COMPAT_43) #ifndef _SYS_SYSPROTO_H_ struct osetrlimit_args { u_int which; struct orlimit *rlp; }; #endif int osetrlimit(struct thread *td, register struct osetrlimit_args *uap) { struct orlimit olim; struct rlimit lim; int error; if ((error = copyin(uap->rlp, &olim, sizeof(struct orlimit)))) return (error); lim.rlim_cur = olim.rlim_cur; lim.rlim_max = olim.rlim_max; error = kern_setrlimit(td, uap->which, &lim); return (error); } #ifndef _SYS_SYSPROTO_H_ struct ogetrlimit_args { u_int which; struct orlimit *rlp; }; #endif int ogetrlimit(struct thread *td, register struct ogetrlimit_args *uap) { struct orlimit olim; struct rlimit rl; struct proc *p; int error; if (uap->which >= RLIM_NLIMITS) return (EINVAL); p = td->td_proc; PROC_LOCK(p); lim_rlimit(p, uap->which, &rl); PROC_UNLOCK(p); /* * XXX would be more correct to convert only RLIM_INFINITY to the * old RLIM_INFINITY and fail with EOVERFLOW for other larger * values. Most 64->32 and 32->16 conversions, including not * unimportant ones of uids are even more broken than what we * do here (they blindly truncate). We don't do this correctly * here since we have little experience with EOVERFLOW yet. * Elsewhere, getuid() can't fail... */ olim.rlim_cur = rl.rlim_cur > 0x7fffffff ? 0x7fffffff : rl.rlim_cur; olim.rlim_max = rl.rlim_max > 0x7fffffff ? 0x7fffffff : rl.rlim_max; error = copyout(&olim, uap->rlp, sizeof(olim)); return (error); } #endif /* COMPAT_43 */ #ifndef _SYS_SYSPROTO_H_ struct __setrlimit_args { u_int which; struct rlimit *rlp; }; #endif int sys_setrlimit(struct thread *td, register struct __setrlimit_args *uap) { struct rlimit alim; int error; if ((error = copyin(uap->rlp, &alim, sizeof(struct rlimit)))) return (error); error = kern_setrlimit(td, uap->which, &alim); return (error); } static void lim_cb(void *arg) { struct rlimit rlim; struct thread *td; struct proc *p; p = arg; PROC_LOCK_ASSERT(p, MA_OWNED); /* * Check if the process exceeds its cpu resource allocation. If * it reaches the max, arrange to kill the process in ast(). */ if (p->p_cpulimit == RLIM_INFINITY) return; PROC_STATLOCK(p); FOREACH_THREAD_IN_PROC(p, td) { ruxagg(p, td); } PROC_STATUNLOCK(p); if (p->p_rux.rux_runtime > p->p_cpulimit * cpu_tickrate()) { lim_rlimit(p, RLIMIT_CPU, &rlim); if (p->p_rux.rux_runtime >= rlim.rlim_max * cpu_tickrate()) { killproc(p, "exceeded maximum CPU limit"); } else { if (p->p_cpulimit < rlim.rlim_max) p->p_cpulimit += 5; kern_psignal(p, SIGXCPU); } } if ((p->p_flag & P_WEXIT) == 0) callout_reset_sbt(&p->p_limco, SBT_1S, 0, lim_cb, p, C_PREL(1)); } int kern_setrlimit(struct thread *td, u_int which, struct rlimit *limp) { return (kern_proc_setrlimit(td, td->td_proc, which, limp)); } int kern_proc_setrlimit(struct thread *td, struct proc *p, u_int which, struct rlimit *limp) { struct plimit *newlim, *oldlim; register struct rlimit *alimp; struct rlimit oldssiz; int error; if (which >= RLIM_NLIMITS) return (EINVAL); /* * Preserve historical bugs by treating negative limits as unsigned. */ if (limp->rlim_cur < 0) limp->rlim_cur = RLIM_INFINITY; if (limp->rlim_max < 0) limp->rlim_max = RLIM_INFINITY; oldssiz.rlim_cur = 0; newlim = NULL; PROC_LOCK(p); if (lim_shared(p->p_limit)) { PROC_UNLOCK(p); newlim = lim_alloc(); PROC_LOCK(p); } oldlim = p->p_limit; alimp = &oldlim->pl_rlimit[which]; if (limp->rlim_cur > alimp->rlim_max || limp->rlim_max > alimp->rlim_max) if ((error = priv_check(td, PRIV_PROC_SETRLIMIT))) { PROC_UNLOCK(p); if (newlim != NULL) lim_free(newlim); return (error); } if (limp->rlim_cur > limp->rlim_max) limp->rlim_cur = limp->rlim_max; if (newlim != NULL) { lim_copy(newlim, oldlim); alimp = &newlim->pl_rlimit[which]; } switch (which) { case RLIMIT_CPU: if (limp->rlim_cur != RLIM_INFINITY && p->p_cpulimit == RLIM_INFINITY) callout_reset_sbt(&p->p_limco, SBT_1S, 0, lim_cb, p, C_PREL(1)); p->p_cpulimit = limp->rlim_cur; break; case RLIMIT_DATA: if (limp->rlim_cur > maxdsiz) limp->rlim_cur = maxdsiz; if (limp->rlim_max > maxdsiz) limp->rlim_max = maxdsiz; break; case RLIMIT_STACK: if (limp->rlim_cur > maxssiz) limp->rlim_cur = maxssiz; if (limp->rlim_max > maxssiz) limp->rlim_max = maxssiz; oldssiz = *alimp; if (p->p_sysent->sv_fixlimit != NULL) p->p_sysent->sv_fixlimit(&oldssiz, RLIMIT_STACK); break; case RLIMIT_NOFILE: if (limp->rlim_cur > maxfilesperproc) limp->rlim_cur = maxfilesperproc; if (limp->rlim_max > maxfilesperproc) limp->rlim_max = maxfilesperproc; break; case RLIMIT_NPROC: if (limp->rlim_cur > maxprocperuid) limp->rlim_cur = maxprocperuid; if (limp->rlim_max > maxprocperuid) limp->rlim_max = maxprocperuid; if (limp->rlim_cur < 1) limp->rlim_cur = 1; if (limp->rlim_max < 1) limp->rlim_max = 1; break; } if (p->p_sysent->sv_fixlimit != NULL) p->p_sysent->sv_fixlimit(limp, which); *alimp = *limp; if (newlim != NULL) p->p_limit = newlim; PROC_UNLOCK(p); if (newlim != NULL) lim_free(oldlim); - if (which == RLIMIT_STACK) { + if (which == RLIMIT_STACK && + /* + * Skip calls from exec_new_vmspace(), done when stack is + * not mapped yet. + */ + (td != curthread || (p->p_flag & P_INEXEC) == 0)) { /* * Stack is allocated to the max at exec time with only * "rlim_cur" bytes accessible. If stack limit is going * up make more accessible, if going down make inaccessible. */ if (limp->rlim_cur != oldssiz.rlim_cur) { vm_offset_t addr; vm_size_t size; vm_prot_t prot; if (limp->rlim_cur > oldssiz.rlim_cur) { prot = p->p_sysent->sv_stackprot; size = limp->rlim_cur - oldssiz.rlim_cur; addr = p->p_sysent->sv_usrstack - limp->rlim_cur; } else { prot = VM_PROT_NONE; size = oldssiz.rlim_cur - limp->rlim_cur; addr = p->p_sysent->sv_usrstack - oldssiz.rlim_cur; } addr = trunc_page(addr); size = round_page(size); (void)vm_map_protect(&p->p_vmspace->vm_map, addr, addr + size, prot, FALSE); } } return (0); } #ifndef _SYS_SYSPROTO_H_ struct __getrlimit_args { u_int which; struct rlimit *rlp; }; #endif /* ARGSUSED */ int sys_getrlimit(struct thread *td, register struct __getrlimit_args *uap) { struct rlimit rlim; struct proc *p; int error; if (uap->which >= RLIM_NLIMITS) return (EINVAL); p = td->td_proc; PROC_LOCK(p); lim_rlimit(p, uap->which, &rlim); PROC_UNLOCK(p); error = copyout(&rlim, uap->rlp, sizeof(struct rlimit)); return (error); } /* * Transform the running time and tick information for children of proc p * into user and system time usage. */ void calccru(struct proc *p, struct timeval *up, struct timeval *sp) { PROC_LOCK_ASSERT(p, MA_OWNED); calcru1(p, &p->p_crux, up, sp); } /* * Transform the running time and tick information in proc p into user * and system time usage. If appropriate, include the current time slice * on this CPU. */ void calcru(struct proc *p, struct timeval *up, struct timeval *sp) { struct thread *td; uint64_t runtime, u; PROC_LOCK_ASSERT(p, MA_OWNED); PROC_STATLOCK_ASSERT(p, MA_OWNED); /* * If we are getting stats for the current process, then add in the * stats that this thread has accumulated in its current time slice. * We reset the thread and CPU state as if we had performed a context * switch right here. */ td = curthread; if (td->td_proc == p) { u = cpu_ticks(); runtime = u - PCPU_GET(switchtime); td->td_runtime += runtime; td->td_incruntime += runtime; PCPU_SET(switchtime, u); } /* Make sure the per-thread stats are current. */ FOREACH_THREAD_IN_PROC(p, td) { if (td->td_incruntime == 0) continue; ruxagg(p, td); } calcru1(p, &p->p_rux, up, sp); } /* Collect resource usage for a single thread. */ void rufetchtd(struct thread *td, struct rusage *ru) { struct proc *p; uint64_t runtime, u; p = td->td_proc; PROC_STATLOCK_ASSERT(p, MA_OWNED); THREAD_LOCK_ASSERT(td, MA_OWNED); /* * If we are getting stats for the current thread, then add in the * stats that this thread has accumulated in its current time slice. * We reset the thread and CPU state as if we had performed a context * switch right here. */ if (td == curthread) { u = cpu_ticks(); runtime = u - PCPU_GET(switchtime); td->td_runtime += runtime; td->td_incruntime += runtime; PCPU_SET(switchtime, u); } ruxagg(p, td); *ru = td->td_ru; calcru1(p, &td->td_rux, &ru->ru_utime, &ru->ru_stime); } static void calcru1(struct proc *p, struct rusage_ext *ruxp, struct timeval *up, struct timeval *sp) { /* {user, system, interrupt, total} {ticks, usec}: */ uint64_t ut, uu, st, su, it, tt, tu; ut = ruxp->rux_uticks; st = ruxp->rux_sticks; it = ruxp->rux_iticks; tt = ut + st + it; if (tt == 0) { /* Avoid divide by zero */ st = 1; tt = 1; } tu = cputick2usec(ruxp->rux_runtime); if ((int64_t)tu < 0) { /* XXX: this should be an assert /phk */ printf("calcru: negative runtime of %jd usec for pid %d (%s)\n", (intmax_t)tu, p->p_pid, p->p_comm); tu = ruxp->rux_tu; } if (tu >= ruxp->rux_tu) { /* * The normal case, time increased. * Enforce monotonicity of bucketed numbers. */ uu = (tu * ut) / tt; if (uu < ruxp->rux_uu) uu = ruxp->rux_uu; su = (tu * st) / tt; if (su < ruxp->rux_su) su = ruxp->rux_su; } else if (tu + 3 > ruxp->rux_tu || 101 * tu > 100 * ruxp->rux_tu) { /* * When we calibrate the cputicker, it is not uncommon to * see the presumably fixed frequency increase slightly over * time as a result of thermal stabilization and NTP * discipline (of the reference clock). We therefore ignore * a bit of backwards slop because we expect to catch up * shortly. We use a 3 microsecond limit to catch low * counts and a 1% limit for high counts. */ uu = ruxp->rux_uu; su = ruxp->rux_su; tu = ruxp->rux_tu; } else { /* tu < ruxp->rux_tu */ /* * What happened here was likely that a laptop, which ran at * a reduced clock frequency at boot, kicked into high gear. * The wisdom of spamming this message in that case is * dubious, but it might also be indicative of something * serious, so lets keep it and hope laptops can be made * more truthful about their CPU speed via ACPI. */ printf("calcru: runtime went backwards from %ju usec " "to %ju usec for pid %d (%s)\n", (uintmax_t)ruxp->rux_tu, (uintmax_t)tu, p->p_pid, p->p_comm); uu = (tu * ut) / tt; su = (tu * st) / tt; } ruxp->rux_uu = uu; ruxp->rux_su = su; ruxp->rux_tu = tu; up->tv_sec = uu / 1000000; up->tv_usec = uu % 1000000; sp->tv_sec = su / 1000000; sp->tv_usec = su % 1000000; } #ifndef _SYS_SYSPROTO_H_ struct getrusage_args { int who; struct rusage *rusage; }; #endif int sys_getrusage(register struct thread *td, register struct getrusage_args *uap) { struct rusage ru; int error; error = kern_getrusage(td, uap->who, &ru); if (error == 0) error = copyout(&ru, uap->rusage, sizeof(struct rusage)); return (error); } int kern_getrusage(struct thread *td, int who, struct rusage *rup) { struct proc *p; int error; error = 0; p = td->td_proc; PROC_LOCK(p); switch (who) { case RUSAGE_SELF: rufetchcalc(p, rup, &rup->ru_utime, &rup->ru_stime); break; case RUSAGE_CHILDREN: *rup = p->p_stats->p_cru; calccru(p, &rup->ru_utime, &rup->ru_stime); break; case RUSAGE_THREAD: PROC_STATLOCK(p); thread_lock(td); rufetchtd(td, rup); thread_unlock(td); PROC_STATUNLOCK(p); break; default: error = EINVAL; } PROC_UNLOCK(p); return (error); } void rucollect(struct rusage *ru, struct rusage *ru2) { long *ip, *ip2; int i; if (ru->ru_maxrss < ru2->ru_maxrss) ru->ru_maxrss = ru2->ru_maxrss; ip = &ru->ru_first; ip2 = &ru2->ru_first; for (i = &ru->ru_last - &ru->ru_first; i >= 0; i--) *ip++ += *ip2++; } void ruadd(struct rusage *ru, struct rusage_ext *rux, struct rusage *ru2, struct rusage_ext *rux2) { rux->rux_runtime += rux2->rux_runtime; rux->rux_uticks += rux2->rux_uticks; rux->rux_sticks += rux2->rux_sticks; rux->rux_iticks += rux2->rux_iticks; rux->rux_uu += rux2->rux_uu; rux->rux_su += rux2->rux_su; rux->rux_tu += rux2->rux_tu; rucollect(ru, ru2); } /* * Aggregate tick counts into the proc's rusage_ext. */ static void ruxagg_locked(struct rusage_ext *rux, struct thread *td) { THREAD_LOCK_ASSERT(td, MA_OWNED); PROC_STATLOCK_ASSERT(td->td_proc, MA_OWNED); rux->rux_runtime += td->td_incruntime; rux->rux_uticks += td->td_uticks; rux->rux_sticks += td->td_sticks; rux->rux_iticks += td->td_iticks; } void ruxagg(struct proc *p, struct thread *td) { thread_lock(td); ruxagg_locked(&p->p_rux, td); ruxagg_locked(&td->td_rux, td); td->td_incruntime = 0; td->td_uticks = 0; td->td_iticks = 0; td->td_sticks = 0; thread_unlock(td); } /* * Update the rusage_ext structure and fetch a valid aggregate rusage * for proc p if storage for one is supplied. */ void rufetch(struct proc *p, struct rusage *ru) { struct thread *td; PROC_STATLOCK_ASSERT(p, MA_OWNED); *ru = p->p_ru; if (p->p_numthreads > 0) { FOREACH_THREAD_IN_PROC(p, td) { ruxagg(p, td); rucollect(ru, &td->td_ru); } } } /* * Atomically perform a rufetch and a calcru together. * Consumers, can safely assume the calcru is executed only once * rufetch is completed. */ void rufetchcalc(struct proc *p, struct rusage *ru, struct timeval *up, struct timeval *sp) { PROC_STATLOCK(p); rufetch(p, ru); calcru(p, up, sp); PROC_STATUNLOCK(p); } /* * Allocate a new resource limits structure and initialize its * reference count and mutex pointer. */ struct plimit * lim_alloc() { struct plimit *limp; limp = malloc(sizeof(struct plimit), M_PLIMIT, M_WAITOK); refcount_init(&limp->pl_refcnt, 1); return (limp); } struct plimit * lim_hold(struct plimit *limp) { refcount_acquire(&limp->pl_refcnt); return (limp); } static __inline int lim_shared(struct plimit *limp) { return (limp->pl_refcnt > 1); } void lim_fork(struct proc *p1, struct proc *p2) { PROC_LOCK_ASSERT(p1, MA_OWNED); PROC_LOCK_ASSERT(p2, MA_OWNED); p2->p_limit = lim_hold(p1->p_limit); callout_init_mtx(&p2->p_limco, &p2->p_mtx, 0); if (p1->p_cpulimit != RLIM_INFINITY) callout_reset_sbt(&p2->p_limco, SBT_1S, 0, lim_cb, p2, C_PREL(1)); } void lim_free(struct plimit *limp) { if (refcount_release(&limp->pl_refcnt)) free((void *)limp, M_PLIMIT); } /* * Make a copy of the plimit structure. * We share these structures copy-on-write after fork. */ void lim_copy(struct plimit *dst, struct plimit *src) { KASSERT(!lim_shared(dst), ("lim_copy to shared limit")); bcopy(src->pl_rlimit, dst->pl_rlimit, sizeof(src->pl_rlimit)); } /* * Return the hard limit for a particular system resource. The * which parameter specifies the index into the rlimit array. */ rlim_t lim_max(struct proc *p, int which) { struct rlimit rl; lim_rlimit(p, which, &rl); return (rl.rlim_max); } /* * Return the current (soft) limit for a particular system resource. * The which parameter which specifies the index into the rlimit array */ rlim_t lim_cur(struct proc *p, int which) { struct rlimit rl; lim_rlimit(p, which, &rl); return (rl.rlim_cur); } /* * Return a copy of the entire rlimit structure for the system limit * specified by 'which' in the rlimit structure pointed to by 'rlp'. */ void lim_rlimit(struct proc *p, int which, struct rlimit *rlp) { PROC_LOCK_ASSERT(p, MA_OWNED); KASSERT(which >= 0 && which < RLIM_NLIMITS, ("request for invalid resource limit")); *rlp = p->p_limit->pl_rlimit[which]; if (p->p_sysent->sv_fixlimit != NULL) p->p_sysent->sv_fixlimit(rlp, which); } void uihashinit() { uihashtbl = hashinit(maxproc / 16, M_UIDINFO, &uihash); rw_init(&uihashtbl_lock, "uidinfo hash"); } /* * Look up a uidinfo struct for the parameter uid. * uihashtbl_lock must be locked. * Increase refcount on uidinfo struct returned. */ static struct uidinfo * uilookup(uid_t uid) { struct uihashhead *uipp; struct uidinfo *uip; rw_assert(&uihashtbl_lock, RA_LOCKED); uipp = UIHASH(uid); LIST_FOREACH(uip, uipp, ui_hash) if (uip->ui_uid == uid) { uihold(uip); break; } return (uip); } /* * Find or allocate a struct uidinfo for a particular uid. * Returns with uidinfo struct referenced. * uifree() should be called on a struct uidinfo when released. */ struct uidinfo * uifind(uid_t uid) { struct uidinfo *new_uip, *uip; rw_rlock(&uihashtbl_lock); uip = uilookup(uid); rw_runlock(&uihashtbl_lock); if (uip != NULL) return (uip); new_uip = malloc(sizeof(*new_uip), M_UIDINFO, M_WAITOK | M_ZERO); racct_create(&new_uip->ui_racct); refcount_init(&new_uip->ui_ref, 1); new_uip->ui_uid = uid; mtx_init(&new_uip->ui_vmsize_mtx, "ui_vmsize", NULL, MTX_DEF); rw_wlock(&uihashtbl_lock); /* * There's a chance someone created our uidinfo while we * were in malloc and not holding the lock, so we have to * make sure we don't insert a duplicate uidinfo. */ if ((uip = uilookup(uid)) == NULL) { LIST_INSERT_HEAD(UIHASH(uid), new_uip, ui_hash); rw_wunlock(&uihashtbl_lock); uip = new_uip; } else { rw_wunlock(&uihashtbl_lock); racct_destroy(&new_uip->ui_racct); mtx_destroy(&new_uip->ui_vmsize_mtx); free(new_uip, M_UIDINFO); } return (uip); } /* * Place another refcount on a uidinfo struct. */ void uihold(struct uidinfo *uip) { refcount_acquire(&uip->ui_ref); } /*- * Since uidinfo structs have a long lifetime, we use an * opportunistic refcounting scheme to avoid locking the lookup hash * for each release. * * If the refcount hits 0, we need to free the structure, * which means we need to lock the hash. * Optimal case: * After locking the struct and lowering the refcount, if we find * that we don't need to free, simply unlock and return. * Suboptimal case: * If refcount lowering results in need to free, bump the count * back up, lose the lock and acquire the locks in the proper * order to try again. */ void uifree(struct uidinfo *uip) { int old; /* Prepare for optimal case. */ old = uip->ui_ref; if (old > 1 && atomic_cmpset_int(&uip->ui_ref, old, old - 1)) return; /* Prepare for suboptimal case. */ rw_wlock(&uihashtbl_lock); if (refcount_release(&uip->ui_ref) == 0) { rw_wunlock(&uihashtbl_lock); return; } racct_destroy(&uip->ui_racct); LIST_REMOVE(uip, ui_hash); rw_wunlock(&uihashtbl_lock); if (uip->ui_sbsize != 0) printf("freeing uidinfo: uid = %d, sbsize = %ld\n", uip->ui_uid, uip->ui_sbsize); if (uip->ui_proccnt != 0) printf("freeing uidinfo: uid = %d, proccnt = %ld\n", uip->ui_uid, uip->ui_proccnt); if (uip->ui_vmsize != 0) printf("freeing uidinfo: uid = %d, swapuse = %lld\n", uip->ui_uid, (unsigned long long)uip->ui_vmsize); mtx_destroy(&uip->ui_vmsize_mtx); free(uip, M_UIDINFO); } #ifdef RACCT void ui_racct_foreach(void (*callback)(struct racct *racct, void *arg2, void *arg3), void *arg2, void *arg3) { struct uidinfo *uip; struct uihashhead *uih; rw_rlock(&uihashtbl_lock); for (uih = &uihashtbl[uihash]; uih >= uihashtbl; uih--) { LIST_FOREACH(uip, uih, ui_hash) { (callback)(uip->ui_racct, arg2, arg3); } } rw_runlock(&uihashtbl_lock); } #endif /* * Change the count associated with number of processes * a given user is using. When 'max' is 0, don't enforce a limit */ int chgproccnt(struct uidinfo *uip, int diff, rlim_t max) { /* Don't allow them to exceed max, but allow subtraction. */ if (diff > 0 && max != 0) { if (atomic_fetchadd_long(&uip->ui_proccnt, (long)diff) + diff > max) { atomic_subtract_long(&uip->ui_proccnt, (long)diff); return (0); } } else { atomic_add_long(&uip->ui_proccnt, (long)diff); if (uip->ui_proccnt < 0) printf("negative proccnt for uid = %d\n", uip->ui_uid); } return (1); } /* * Change the total socket buffer size a user has used. */ int chgsbsize(struct uidinfo *uip, u_int *hiwat, u_int to, rlim_t max) { int diff; diff = to - *hiwat; if (diff > 0) { if (atomic_fetchadd_long(&uip->ui_sbsize, (long)diff) + diff > max) { atomic_subtract_long(&uip->ui_sbsize, (long)diff); return (0); } } else { atomic_add_long(&uip->ui_sbsize, (long)diff); if (uip->ui_sbsize < 0) printf("negative sbsize for uid = %d\n", uip->ui_uid); } *hiwat = to; return (1); } /* * Change the count associated with number of pseudo-terminals * a given user is using. When 'max' is 0, don't enforce a limit */ int chgptscnt(struct uidinfo *uip, int diff, rlim_t max) { /* Don't allow them to exceed max, but allow subtraction. */ if (diff > 0 && max != 0) { if (atomic_fetchadd_long(&uip->ui_ptscnt, (long)diff) + diff > max) { atomic_subtract_long(&uip->ui_ptscnt, (long)diff); return (0); } } else { atomic_add_long(&uip->ui_ptscnt, (long)diff); if (uip->ui_ptscnt < 0) printf("negative ptscnt for uid = %d\n", uip->ui_uid); } return (1); } int chgkqcnt(struct uidinfo *uip, int diff, rlim_t max) { if (diff > 0 && max != 0) { if (atomic_fetchadd_long(&uip->ui_kqcnt, (long)diff) + diff > max) { atomic_subtract_long(&uip->ui_kqcnt, (long)diff); return (0); } } else { atomic_add_long(&uip->ui_kqcnt, (long)diff); if (uip->ui_kqcnt < 0) printf("negative kqcnt for uid = %d\n", uip->ui_uid); } return (1); } Index: user/ngie/more-tests/sys/kern/kern_timeout.c =================================================================== --- user/ngie/more-tests/sys/kern/kern_timeout.c (revision 281584) +++ user/ngie/more-tests/sys/kern/kern_timeout.c (revision 281585) @@ -1,1594 +1,1596 @@ /*- * Copyright (c) 1982, 1986, 1991, 1993 * The Regents of the University of California. All rights reserved. * (c) UNIX System Laboratories, Inc. * All or some portions of this file are derived from material licensed * to the University of California by American Telephone and Telegraph * Co. or Unix System Laboratories, Inc. and are reproduced herein with * the permission of UNIX System Laboratories, Inc. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 4. Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * From: @(#)kern_clock.c 8.5 (Berkeley) 1/21/94 */ #include __FBSDID("$FreeBSD$"); #include "opt_callout_profiling.h" #if defined(__arm__) #include "opt_timer.h" #endif #include "opt_rss.h" #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #ifdef SMP #include #endif #ifndef NO_EVENTTIMERS DPCPU_DECLARE(sbintime_t, hardclocktime); #endif SDT_PROVIDER_DEFINE(callout_execute); SDT_PROBE_DEFINE1(callout_execute, kernel, , callout__start, "struct callout *"); SDT_PROBE_DEFINE1(callout_execute, kernel, , callout__end, "struct callout *"); #ifdef CALLOUT_PROFILING static int avg_depth; SYSCTL_INT(_debug, OID_AUTO, to_avg_depth, CTLFLAG_RD, &avg_depth, 0, "Average number of items examined per softclock call. Units = 1/1000"); static int avg_gcalls; SYSCTL_INT(_debug, OID_AUTO, to_avg_gcalls, CTLFLAG_RD, &avg_gcalls, 0, "Average number of Giant callouts made per softclock call. Units = 1/1000"); static int avg_lockcalls; SYSCTL_INT(_debug, OID_AUTO, to_avg_lockcalls, CTLFLAG_RD, &avg_lockcalls, 0, "Average number of lock callouts made per softclock call. Units = 1/1000"); static int avg_mpcalls; SYSCTL_INT(_debug, OID_AUTO, to_avg_mpcalls, CTLFLAG_RD, &avg_mpcalls, 0, "Average number of MP callouts made per softclock call. Units = 1/1000"); static int avg_depth_dir; SYSCTL_INT(_debug, OID_AUTO, to_avg_depth_dir, CTLFLAG_RD, &avg_depth_dir, 0, "Average number of direct callouts examined per callout_process call. " "Units = 1/1000"); static int avg_lockcalls_dir; SYSCTL_INT(_debug, OID_AUTO, to_avg_lockcalls_dir, CTLFLAG_RD, &avg_lockcalls_dir, 0, "Average number of lock direct callouts made per " "callout_process call. Units = 1/1000"); static int avg_mpcalls_dir; SYSCTL_INT(_debug, OID_AUTO, to_avg_mpcalls_dir, CTLFLAG_RD, &avg_mpcalls_dir, 0, "Average number of MP direct callouts made per callout_process call. " "Units = 1/1000"); #endif static int ncallout; SYSCTL_INT(_kern, OID_AUTO, ncallout, CTLFLAG_RDTUN | CTLFLAG_NOFETCH, &ncallout, 0, "Number of entries in callwheel and size of timeout() preallocation"); #ifdef RSS static int pin_default_swi = 1; static int pin_pcpu_swi = 1; #else static int pin_default_swi = 0; static int pin_pcpu_swi = 0; #endif SYSCTL_INT(_kern, OID_AUTO, pin_default_swi, CTLFLAG_RDTUN | CTLFLAG_NOFETCH, &pin_default_swi, 0, "Pin the default (non-per-cpu) swi (shared with PCPU 0 swi)"); SYSCTL_INT(_kern, OID_AUTO, pin_pcpu_swi, CTLFLAG_RDTUN | CTLFLAG_NOFETCH, &pin_pcpu_swi, 0, "Pin the per-CPU swis (except PCPU 0, which is also default"); /* * TODO: * allocate more timeout table slots when table overflows. */ u_int callwheelsize, callwheelmask; /* * The callout cpu exec entities represent informations necessary for * describing the state of callouts currently running on the CPU and the ones * necessary for migrating callouts to the new callout cpu. In particular, * the first entry of the array cc_exec_entity holds informations for callout * running in SWI thread context, while the second one holds informations * for callout running directly from hardware interrupt context. * The cached informations are very important for deferring migration when * the migrating callout is already running. */ struct cc_exec { struct callout *cc_curr; #ifdef SMP void (*ce_migration_func)(void *); void *ce_migration_arg; int ce_migration_cpu; sbintime_t ce_migration_time; sbintime_t ce_migration_prec; #endif bool cc_cancel; bool cc_waiting; }; /* * There is one struct callout_cpu per cpu, holding all relevant * state for the callout processing thread on the individual CPU. */ struct callout_cpu { struct mtx_padalign cc_lock; struct cc_exec cc_exec_entity[2]; struct callout *cc_next; struct callout *cc_callout; struct callout_list *cc_callwheel; struct callout_tailq cc_expireq; struct callout_slist cc_callfree; sbintime_t cc_firstevent; sbintime_t cc_lastscan; void *cc_cookie; u_int cc_bucket; u_int cc_inited; char cc_ktr_event_name[20]; }; #define callout_migrating(c) ((c)->c_iflags & CALLOUT_DFRMIGRATION) #define cc_exec_curr(cc, dir) cc->cc_exec_entity[dir].cc_curr #define cc_exec_next(cc) cc->cc_next #define cc_exec_cancel(cc, dir) cc->cc_exec_entity[dir].cc_cancel #define cc_exec_waiting(cc, dir) cc->cc_exec_entity[dir].cc_waiting #ifdef SMP #define cc_migration_func(cc, dir) cc->cc_exec_entity[dir].ce_migration_func #define cc_migration_arg(cc, dir) cc->cc_exec_entity[dir].ce_migration_arg #define cc_migration_cpu(cc, dir) cc->cc_exec_entity[dir].ce_migration_cpu #define cc_migration_time(cc, dir) cc->cc_exec_entity[dir].ce_migration_time #define cc_migration_prec(cc, dir) cc->cc_exec_entity[dir].ce_migration_prec struct callout_cpu cc_cpu[MAXCPU]; #define CPUBLOCK MAXCPU #define CC_CPU(cpu) (&cc_cpu[(cpu)]) #define CC_SELF() CC_CPU(PCPU_GET(cpuid)) #else struct callout_cpu cc_cpu; #define CC_CPU(cpu) &cc_cpu #define CC_SELF() &cc_cpu #endif #define CC_LOCK(cc) mtx_lock_spin(&(cc)->cc_lock) #define CC_UNLOCK(cc) mtx_unlock_spin(&(cc)->cc_lock) #define CC_LOCK_ASSERT(cc) mtx_assert(&(cc)->cc_lock, MA_OWNED) static int timeout_cpu; static void callout_cpu_init(struct callout_cpu *cc, int cpu); static void softclock_call_cc(struct callout *c, struct callout_cpu *cc, #ifdef CALLOUT_PROFILING int *mpcalls, int *lockcalls, int *gcalls, #endif int direct); static MALLOC_DEFINE(M_CALLOUT, "callout", "Callout datastructures"); /** * Locked by cc_lock: * cc_curr - If a callout is in progress, it is cc_curr. * If cc_curr is non-NULL, threads waiting in * callout_drain() will be woken up as soon as the * relevant callout completes. * cc_cancel - Changing to 1 with both callout_lock and cc_lock held * guarantees that the current callout will not run. * The softclock() function sets this to 0 before it * drops callout_lock to acquire c_lock, and it calls * the handler only if curr_cancelled is still 0 after * cc_lock is successfully acquired. * cc_waiting - If a thread is waiting in callout_drain(), then * callout_wait is nonzero. Set only when * cc_curr is non-NULL. */ /* * Resets the execution entity tied to a specific callout cpu. */ static void cc_cce_cleanup(struct callout_cpu *cc, int direct) { cc_exec_curr(cc, direct) = NULL; cc_exec_cancel(cc, direct) = false; cc_exec_waiting(cc, direct) = false; #ifdef SMP cc_migration_cpu(cc, direct) = CPUBLOCK; cc_migration_time(cc, direct) = 0; cc_migration_prec(cc, direct) = 0; cc_migration_func(cc, direct) = NULL; cc_migration_arg(cc, direct) = NULL; #endif } /* * Checks if migration is requested by a specific callout cpu. */ static int cc_cce_migrating(struct callout_cpu *cc, int direct) { #ifdef SMP return (cc_migration_cpu(cc, direct) != CPUBLOCK); #else return (0); #endif } /* * Kernel low level callwheel initialization * called on cpu0 during kernel startup. */ static void callout_callwheel_init(void *dummy) { struct callout_cpu *cc; /* * Calculate the size of the callout wheel and the preallocated * timeout() structures. * XXX: Clip callout to result of previous function of maxusers * maximum 384. This is still huge, but acceptable. */ memset(CC_CPU(0), 0, sizeof(cc_cpu)); ncallout = imin(16 + maxproc + maxfiles, 18508); TUNABLE_INT_FETCH("kern.ncallout", &ncallout); /* * Calculate callout wheel size, should be next power of two higher * than 'ncallout'. */ callwheelsize = 1 << fls(ncallout); callwheelmask = callwheelsize - 1; /* * Fetch whether we're pinning the swi's or not. */ TUNABLE_INT_FETCH("kern.pin_default_swi", &pin_default_swi); TUNABLE_INT_FETCH("kern.pin_pcpu_swi", &pin_pcpu_swi); /* * Only cpu0 handles timeout(9) and receives a preallocation. * * XXX: Once all timeout(9) consumers are converted this can * be removed. */ timeout_cpu = PCPU_GET(cpuid); cc = CC_CPU(timeout_cpu); cc->cc_callout = malloc(ncallout * sizeof(struct callout), M_CALLOUT, M_WAITOK); callout_cpu_init(cc, timeout_cpu); } SYSINIT(callwheel_init, SI_SUB_CPU, SI_ORDER_ANY, callout_callwheel_init, NULL); /* * Initialize the per-cpu callout structures. */ static void callout_cpu_init(struct callout_cpu *cc, int cpu) { struct callout *c; int i; mtx_init(&cc->cc_lock, "callout", NULL, MTX_SPIN | MTX_RECURSE); SLIST_INIT(&cc->cc_callfree); cc->cc_inited = 1; cc->cc_callwheel = malloc(sizeof(struct callout_list) * callwheelsize, M_CALLOUT, M_WAITOK); for (i = 0; i < callwheelsize; i++) LIST_INIT(&cc->cc_callwheel[i]); TAILQ_INIT(&cc->cc_expireq); cc->cc_firstevent = SBT_MAX; for (i = 0; i < 2; i++) cc_cce_cleanup(cc, i); snprintf(cc->cc_ktr_event_name, sizeof(cc->cc_ktr_event_name), "callwheel cpu %d", cpu); if (cc->cc_callout == NULL) /* Only cpu0 handles timeout(9) */ return; for (i = 0; i < ncallout; i++) { c = &cc->cc_callout[i]; callout_init(c, 0); c->c_iflags = CALLOUT_LOCAL_ALLOC; SLIST_INSERT_HEAD(&cc->cc_callfree, c, c_links.sle); } } #ifdef SMP /* * Switches the cpu tied to a specific callout. * The function expects a locked incoming callout cpu and returns with * locked outcoming callout cpu. */ static struct callout_cpu * callout_cpu_switch(struct callout *c, struct callout_cpu *cc, int new_cpu) { struct callout_cpu *new_cc; MPASS(c != NULL && cc != NULL); CC_LOCK_ASSERT(cc); /* * Avoid interrupts and preemption firing after the callout cpu * is blocked in order to avoid deadlocks as the new thread * may be willing to acquire the callout cpu lock. */ c->c_cpu = CPUBLOCK; spinlock_enter(); CC_UNLOCK(cc); new_cc = CC_CPU(new_cpu); CC_LOCK(new_cc); spinlock_exit(); c->c_cpu = new_cpu; return (new_cc); } #endif /* * Start standard softclock thread. */ static void start_softclock(void *dummy) { struct callout_cpu *cc; char name[MAXCOMLEN]; #ifdef SMP int cpu; struct intr_event *ie; #endif cc = CC_CPU(timeout_cpu); snprintf(name, sizeof(name), "clock (%d)", timeout_cpu); if (swi_add(&clk_intr_event, name, softclock, cc, SWI_CLOCK, INTR_MPSAFE, &cc->cc_cookie)) panic("died while creating standard software ithreads"); if (pin_default_swi && (intr_event_bind(clk_intr_event, timeout_cpu) != 0)) { printf("%s: timeout clock couldn't be pinned to cpu %d\n", __func__, timeout_cpu); } #ifdef SMP CPU_FOREACH(cpu) { if (cpu == timeout_cpu) continue; cc = CC_CPU(cpu); cc->cc_callout = NULL; /* Only cpu0 handles timeout(9). */ callout_cpu_init(cc, cpu); snprintf(name, sizeof(name), "clock (%d)", cpu); ie = NULL; if (swi_add(&ie, name, softclock, cc, SWI_CLOCK, INTR_MPSAFE, &cc->cc_cookie)) panic("died while creating standard software ithreads"); if (pin_pcpu_swi && (intr_event_bind(ie, cpu) != 0)) { printf("%s: per-cpu clock couldn't be pinned to " "cpu %d\n", __func__, cpu); } } #endif } SYSINIT(start_softclock, SI_SUB_SOFTINTR, SI_ORDER_FIRST, start_softclock, NULL); #define CC_HASH_SHIFT 8 static inline u_int callout_hash(sbintime_t sbt) { return (sbt >> (32 - CC_HASH_SHIFT)); } static inline u_int callout_get_bucket(sbintime_t sbt) { return (callout_hash(sbt) & callwheelmask); } void callout_process(sbintime_t now) { struct callout *tmp, *tmpn; struct callout_cpu *cc; struct callout_list *sc; sbintime_t first, last, max, tmp_max; uint32_t lookahead; u_int firstb, lastb, nowb; #ifdef CALLOUT_PROFILING int depth_dir = 0, mpcalls_dir = 0, lockcalls_dir = 0; #endif cc = CC_SELF(); mtx_lock_spin_flags(&cc->cc_lock, MTX_QUIET); /* Compute the buckets of the last scan and present times. */ firstb = callout_hash(cc->cc_lastscan); cc->cc_lastscan = now; nowb = callout_hash(now); /* Compute the last bucket and minimum time of the bucket after it. */ if (nowb == firstb) lookahead = (SBT_1S / 16); else if (nowb - firstb == 1) lookahead = (SBT_1S / 8); else lookahead = (SBT_1S / 2); first = last = now; first += (lookahead / 2); last += lookahead; last &= (0xffffffffffffffffLLU << (32 - CC_HASH_SHIFT)); lastb = callout_hash(last) - 1; max = last; /* * Check if we wrapped around the entire wheel from the last scan. * In case, we need to scan entirely the wheel for pending callouts. */ if (lastb - firstb >= callwheelsize) { lastb = firstb + callwheelsize - 1; if (nowb - firstb >= callwheelsize) nowb = lastb; } /* Iterate callwheel from firstb to nowb and then up to lastb. */ do { sc = &cc->cc_callwheel[firstb & callwheelmask]; tmp = LIST_FIRST(sc); while (tmp != NULL) { /* Run the callout if present time within allowed. */ if (tmp->c_time <= now) { /* * Consumer told us the callout may be run * directly from hardware interrupt context. */ if (tmp->c_iflags & CALLOUT_DIRECT) { #ifdef CALLOUT_PROFILING ++depth_dir; #endif cc_exec_next(cc) = LIST_NEXT(tmp, c_links.le); cc->cc_bucket = firstb & callwheelmask; LIST_REMOVE(tmp, c_links.le); softclock_call_cc(tmp, cc, #ifdef CALLOUT_PROFILING &mpcalls_dir, &lockcalls_dir, NULL, #endif 1); tmp = cc_exec_next(cc); cc_exec_next(cc) = NULL; } else { tmpn = LIST_NEXT(tmp, c_links.le); LIST_REMOVE(tmp, c_links.le); TAILQ_INSERT_TAIL(&cc->cc_expireq, tmp, c_links.tqe); tmp->c_iflags |= CALLOUT_PROCESSED; tmp = tmpn; } continue; } /* Skip events from distant future. */ if (tmp->c_time >= max) goto next; /* * Event minimal time is bigger than present maximal * time, so it cannot be aggregated. */ if (tmp->c_time > last) { lastb = nowb; goto next; } /* Update first and last time, respecting this event. */ if (tmp->c_time < first) first = tmp->c_time; tmp_max = tmp->c_time + tmp->c_precision; if (tmp_max < last) last = tmp_max; next: tmp = LIST_NEXT(tmp, c_links.le); } /* Proceed with the next bucket. */ firstb++; /* * Stop if we looked after present time and found * some event we can't execute at now. * Stop if we looked far enough into the future. */ } while (((int)(firstb - lastb)) <= 0); cc->cc_firstevent = last; #ifndef NO_EVENTTIMERS cpu_new_callout(curcpu, last, first); #endif #ifdef CALLOUT_PROFILING avg_depth_dir += (depth_dir * 1000 - avg_depth_dir) >> 8; avg_mpcalls_dir += (mpcalls_dir * 1000 - avg_mpcalls_dir) >> 8; avg_lockcalls_dir += (lockcalls_dir * 1000 - avg_lockcalls_dir) >> 8; #endif mtx_unlock_spin_flags(&cc->cc_lock, MTX_QUIET); /* * swi_sched acquires the thread lock, so we don't want to call it * with cc_lock held; incorrect locking order. */ if (!TAILQ_EMPTY(&cc->cc_expireq)) swi_sched(cc->cc_cookie, 0); } static struct callout_cpu * callout_lock(struct callout *c) { struct callout_cpu *cc; int cpu; for (;;) { cpu = c->c_cpu; #ifdef SMP if (cpu == CPUBLOCK) { while (c->c_cpu == CPUBLOCK) cpu_spinwait(); continue; } #endif cc = CC_CPU(cpu); CC_LOCK(cc); if (cpu == c->c_cpu) break; CC_UNLOCK(cc); } return (cc); } static void callout_cc_add(struct callout *c, struct callout_cpu *cc, sbintime_t sbt, sbintime_t precision, void (*func)(void *), void *arg, int cpu, int flags) { int bucket; CC_LOCK_ASSERT(cc); if (sbt < cc->cc_lastscan) sbt = cc->cc_lastscan; c->c_arg = arg; c->c_iflags |= CALLOUT_PENDING; c->c_iflags &= ~CALLOUT_PROCESSED; c->c_flags |= CALLOUT_ACTIVE; + if (flags & C_DIRECT_EXEC) + c->c_iflags |= CALLOUT_DIRECT; c->c_func = func; c->c_time = sbt; c->c_precision = precision; bucket = callout_get_bucket(c->c_time); CTR3(KTR_CALLOUT, "precision set for %p: %d.%08x", c, (int)(c->c_precision >> 32), (u_int)(c->c_precision & 0xffffffff)); LIST_INSERT_HEAD(&cc->cc_callwheel[bucket], c, c_links.le); if (cc->cc_bucket == bucket) cc_exec_next(cc) = c; #ifndef NO_EVENTTIMERS /* * Inform the eventtimers(4) subsystem there's a new callout * that has been inserted, but only if really required. */ if (SBT_MAX - c->c_time < c->c_precision) c->c_precision = SBT_MAX - c->c_time; sbt = c->c_time + c->c_precision; if (sbt < cc->cc_firstevent) { cc->cc_firstevent = sbt; cpu_new_callout(cpu, sbt, c->c_time); } #endif } static void callout_cc_del(struct callout *c, struct callout_cpu *cc) { if ((c->c_iflags & CALLOUT_LOCAL_ALLOC) == 0) return; c->c_func = NULL; SLIST_INSERT_HEAD(&cc->cc_callfree, c, c_links.sle); } static void softclock_call_cc(struct callout *c, struct callout_cpu *cc, #ifdef CALLOUT_PROFILING int *mpcalls, int *lockcalls, int *gcalls, #endif int direct) { struct rm_priotracker tracker; void (*c_func)(void *); void *c_arg; struct lock_class *class; struct lock_object *c_lock; uintptr_t lock_status; int c_iflags; #ifdef SMP struct callout_cpu *new_cc; void (*new_func)(void *); void *new_arg; int flags, new_cpu; sbintime_t new_prec, new_time; #endif #if defined(DIAGNOSTIC) || defined(CALLOUT_PROFILING) sbintime_t sbt1, sbt2; struct timespec ts2; static sbintime_t maxdt = 2 * SBT_1MS; /* 2 msec */ static timeout_t *lastfunc; #endif KASSERT((c->c_iflags & CALLOUT_PENDING) == CALLOUT_PENDING, ("softclock_call_cc: pend %p %x", c, c->c_iflags)); KASSERT((c->c_flags & CALLOUT_ACTIVE) == CALLOUT_ACTIVE, ("softclock_call_cc: act %p %x", c, c->c_flags)); class = (c->c_lock != NULL) ? LOCK_CLASS(c->c_lock) : NULL; lock_status = 0; if (c->c_flags & CALLOUT_SHAREDLOCK) { if (class == &lock_class_rm) lock_status = (uintptr_t)&tracker; else lock_status = 1; } c_lock = c->c_lock; c_func = c->c_func; c_arg = c->c_arg; c_iflags = c->c_iflags; if (c->c_iflags & CALLOUT_LOCAL_ALLOC) c->c_iflags = CALLOUT_LOCAL_ALLOC; else c->c_iflags &= ~CALLOUT_PENDING; cc_exec_curr(cc, direct) = c; cc_exec_cancel(cc, direct) = false; CC_UNLOCK(cc); if (c_lock != NULL) { class->lc_lock(c_lock, lock_status); /* * The callout may have been cancelled * while we switched locks. */ if (cc_exec_cancel(cc, direct)) { class->lc_unlock(c_lock); goto skip; } /* The callout cannot be stopped now. */ cc_exec_cancel(cc, direct) = true; if (c_lock == &Giant.lock_object) { #ifdef CALLOUT_PROFILING (*gcalls)++; #endif CTR3(KTR_CALLOUT, "callout giant %p func %p arg %p", c, c_func, c_arg); } else { #ifdef CALLOUT_PROFILING (*lockcalls)++; #endif CTR3(KTR_CALLOUT, "callout lock %p func %p arg %p", c, c_func, c_arg); } } else { #ifdef CALLOUT_PROFILING (*mpcalls)++; #endif CTR3(KTR_CALLOUT, "callout %p func %p arg %p", c, c_func, c_arg); } KTR_STATE3(KTR_SCHED, "callout", cc->cc_ktr_event_name, "running", "func:%p", c_func, "arg:%p", c_arg, "direct:%d", direct); #if defined(DIAGNOSTIC) || defined(CALLOUT_PROFILING) sbt1 = sbinuptime(); #endif THREAD_NO_SLEEPING(); SDT_PROBE(callout_execute, kernel, , callout__start, c, 0, 0, 0, 0); c_func(c_arg); SDT_PROBE(callout_execute, kernel, , callout__end, c, 0, 0, 0, 0); THREAD_SLEEPING_OK(); #if defined(DIAGNOSTIC) || defined(CALLOUT_PROFILING) sbt2 = sbinuptime(); sbt2 -= sbt1; if (sbt2 > maxdt) { if (lastfunc != c_func || sbt2 > maxdt * 2) { ts2 = sbttots(sbt2); printf( "Expensive timeout(9) function: %p(%p) %jd.%09ld s\n", c_func, c_arg, (intmax_t)ts2.tv_sec, ts2.tv_nsec); } maxdt = sbt2; lastfunc = c_func; } #endif KTR_STATE0(KTR_SCHED, "callout", cc->cc_ktr_event_name, "idle"); CTR1(KTR_CALLOUT, "callout %p finished", c); if ((c_iflags & CALLOUT_RETURNUNLOCKED) == 0) class->lc_unlock(c_lock); skip: CC_LOCK(cc); KASSERT(cc_exec_curr(cc, direct) == c, ("mishandled cc_curr")); cc_exec_curr(cc, direct) = NULL; if (cc_exec_waiting(cc, direct)) { /* * There is someone waiting for the * callout to complete. * If the callout was scheduled for * migration just cancel it. */ if (cc_cce_migrating(cc, direct)) { cc_cce_cleanup(cc, direct); /* * It should be assert here that the callout is not * destroyed but that is not easy. */ c->c_iflags &= ~CALLOUT_DFRMIGRATION; } cc_exec_waiting(cc, direct) = false; CC_UNLOCK(cc); wakeup(&cc_exec_waiting(cc, direct)); CC_LOCK(cc); } else if (cc_cce_migrating(cc, direct)) { KASSERT((c_iflags & CALLOUT_LOCAL_ALLOC) == 0, ("Migrating legacy callout %p", c)); #ifdef SMP /* * If the callout was scheduled for * migration just perform it now. */ new_cpu = cc_migration_cpu(cc, direct); new_time = cc_migration_time(cc, direct); new_prec = cc_migration_prec(cc, direct); new_func = cc_migration_func(cc, direct); new_arg = cc_migration_arg(cc, direct); cc_cce_cleanup(cc, direct); /* * It should be assert here that the callout is not destroyed * but that is not easy. * * As first thing, handle deferred callout stops. */ if (!callout_migrating(c)) { CTR3(KTR_CALLOUT, "deferred cancelled %p func %p arg %p", c, new_func, new_arg); callout_cc_del(c, cc); return; } c->c_iflags &= ~CALLOUT_DFRMIGRATION; new_cc = callout_cpu_switch(c, cc, new_cpu); flags = (direct) ? C_DIRECT_EXEC : 0; callout_cc_add(c, new_cc, new_time, new_prec, new_func, new_arg, new_cpu, flags); CC_UNLOCK(new_cc); CC_LOCK(cc); #else panic("migration should not happen"); #endif } /* * If the current callout is locally allocated (from * timeout(9)) then put it on the freelist. * * Note: we need to check the cached copy of c_iflags because * if it was not local, then it's not safe to deref the * callout pointer. */ KASSERT((c_iflags & CALLOUT_LOCAL_ALLOC) == 0 || c->c_iflags == CALLOUT_LOCAL_ALLOC, ("corrupted callout")); if (c_iflags & CALLOUT_LOCAL_ALLOC) callout_cc_del(c, cc); } /* * The callout mechanism is based on the work of Adam M. Costello and * George Varghese, published in a technical report entitled "Redesigning * the BSD Callout and Timer Facilities" and modified slightly for inclusion * in FreeBSD by Justin T. Gibbs. The original work on the data structures * used in this implementation was published by G. Varghese and T. Lauck in * the paper "Hashed and Hierarchical Timing Wheels: Data Structures for * the Efficient Implementation of a Timer Facility" in the Proceedings of * the 11th ACM Annual Symposium on Operating Systems Principles, * Austin, Texas Nov 1987. */ /* * Software (low priority) clock interrupt. * Run periodic events from timeout queue. */ void softclock(void *arg) { struct callout_cpu *cc; struct callout *c; #ifdef CALLOUT_PROFILING int depth = 0, gcalls = 0, lockcalls = 0, mpcalls = 0; #endif cc = (struct callout_cpu *)arg; CC_LOCK(cc); while ((c = TAILQ_FIRST(&cc->cc_expireq)) != NULL) { TAILQ_REMOVE(&cc->cc_expireq, c, c_links.tqe); softclock_call_cc(c, cc, #ifdef CALLOUT_PROFILING &mpcalls, &lockcalls, &gcalls, #endif 0); #ifdef CALLOUT_PROFILING ++depth; #endif } #ifdef CALLOUT_PROFILING avg_depth += (depth * 1000 - avg_depth) >> 8; avg_mpcalls += (mpcalls * 1000 - avg_mpcalls) >> 8; avg_lockcalls += (lockcalls * 1000 - avg_lockcalls) >> 8; avg_gcalls += (gcalls * 1000 - avg_gcalls) >> 8; #endif CC_UNLOCK(cc); } /* * timeout -- * Execute a function after a specified length of time. * * untimeout -- * Cancel previous timeout function call. * * callout_handle_init -- * Initialize a handle so that using it with untimeout is benign. * * See AT&T BCI Driver Reference Manual for specification. This * implementation differs from that one in that although an * identification value is returned from timeout, the original * arguments to timeout as well as the identifier are used to * identify entries for untimeout. */ struct callout_handle timeout(timeout_t *ftn, void *arg, int to_ticks) { struct callout_cpu *cc; struct callout *new; struct callout_handle handle; cc = CC_CPU(timeout_cpu); CC_LOCK(cc); /* Fill in the next free callout structure. */ new = SLIST_FIRST(&cc->cc_callfree); if (new == NULL) /* XXX Attempt to malloc first */ panic("timeout table full"); SLIST_REMOVE_HEAD(&cc->cc_callfree, c_links.sle); callout_reset(new, to_ticks, ftn, arg); handle.callout = new; CC_UNLOCK(cc); return (handle); } void untimeout(timeout_t *ftn, void *arg, struct callout_handle handle) { struct callout_cpu *cc; /* * Check for a handle that was initialized * by callout_handle_init, but never used * for a real timeout. */ if (handle.callout == NULL) return; cc = callout_lock(handle.callout); if (handle.callout->c_func == ftn && handle.callout->c_arg == arg) callout_stop(handle.callout); CC_UNLOCK(cc); } void callout_handle_init(struct callout_handle *handle) { handle->callout = NULL; } /* * New interface; clients allocate their own callout structures. * * callout_reset() - establish or change a timeout * callout_stop() - disestablish a timeout * callout_init() - initialize a callout structure so that it can * safely be passed to callout_reset() and callout_stop() * * defines three convenience macros: * * callout_active() - returns truth if callout has not been stopped, * drained, or deactivated since the last time the callout was * reset. * callout_pending() - returns truth if callout is still waiting for timeout * callout_deactivate() - marks the callout as having been serviced */ int callout_reset_sbt_on(struct callout *c, sbintime_t sbt, sbintime_t precision, void (*ftn)(void *), void *arg, int cpu, int flags) { sbintime_t to_sbt, pr; struct callout_cpu *cc; int cancelled, direct; int ignore_cpu=0; cancelled = 0; if (cpu == -1) { ignore_cpu = 1; } else if ((cpu >= MAXCPU) || ((CC_CPU(cpu))->cc_inited == 0)) { /* Invalid CPU spec */ panic("Invalid CPU in callout %d", cpu); } if (flags & C_ABSOLUTE) { to_sbt = sbt; } else { if ((flags & C_HARDCLOCK) && (sbt < tick_sbt)) sbt = tick_sbt; if ((flags & C_HARDCLOCK) || #ifdef NO_EVENTTIMERS sbt >= sbt_timethreshold) { to_sbt = getsbinuptime(); /* Add safety belt for the case of hz > 1000. */ to_sbt += tc_tick_sbt - tick_sbt; #else sbt >= sbt_tickthreshold) { /* * Obtain the time of the last hardclock() call on * this CPU directly from the kern_clocksource.c. * This value is per-CPU, but it is equal for all * active ones. */ #ifdef __LP64__ to_sbt = DPCPU_GET(hardclocktime); #else spinlock_enter(); to_sbt = DPCPU_GET(hardclocktime); spinlock_exit(); #endif #endif if ((flags & C_HARDCLOCK) == 0) to_sbt += tick_sbt; } else to_sbt = sbinuptime(); if (SBT_MAX - to_sbt < sbt) to_sbt = SBT_MAX; else to_sbt += sbt; pr = ((C_PRELGET(flags) < 0) ? sbt >> tc_precexp : sbt >> C_PRELGET(flags)); if (pr > precision) precision = pr; } /* * This flag used to be added by callout_cc_add, but the * first time you call this we could end up with the * wrong direct flag if we don't do it before we add. */ if (flags & C_DIRECT_EXEC) { direct = 1; } else { direct = 0; } KASSERT(!direct || c->c_lock == NULL, ("%s: direct callout %p has lock", __func__, c)); cc = callout_lock(c); /* * Don't allow migration of pre-allocated callouts lest they * become unbalanced or handle the case where the user does * not care. */ if ((c->c_iflags & CALLOUT_LOCAL_ALLOC) || ignore_cpu) { cpu = c->c_cpu; } if (cc_exec_curr(cc, direct) == c) { /* * We're being asked to reschedule a callout which is * currently in progress. If there is a lock then we * can cancel the callout if it has not really started. */ if (c->c_lock != NULL && cc_exec_cancel(cc, direct)) cancelled = cc_exec_cancel(cc, direct) = true; if (cc_exec_waiting(cc, direct)) { /* * Someone has called callout_drain to kill this * callout. Don't reschedule. */ CTR4(KTR_CALLOUT, "%s %p func %p arg %p", cancelled ? "cancelled" : "failed to cancel", c, c->c_func, c->c_arg); CC_UNLOCK(cc); return (cancelled); } #ifdef SMP if (callout_migrating(c)) { /* * This only occurs when a second callout_reset_sbt_on * is made after a previous one moved it into * deferred migration (below). Note we do *not* change * the prev_cpu even though the previous target may * be different. */ cc_migration_cpu(cc, direct) = cpu; cc_migration_time(cc, direct) = to_sbt; cc_migration_prec(cc, direct) = precision; cc_migration_func(cc, direct) = ftn; cc_migration_arg(cc, direct) = arg; cancelled = 1; CC_UNLOCK(cc); return (cancelled); } #endif } if (c->c_iflags & CALLOUT_PENDING) { if ((c->c_iflags & CALLOUT_PROCESSED) == 0) { if (cc_exec_next(cc) == c) cc_exec_next(cc) = LIST_NEXT(c, c_links.le); LIST_REMOVE(c, c_links.le); } else { TAILQ_REMOVE(&cc->cc_expireq, c, c_links.tqe); } cancelled = 1; c->c_iflags &= ~ CALLOUT_PENDING; c->c_flags &= ~ CALLOUT_ACTIVE; } #ifdef SMP /* * If the callout must migrate try to perform it immediately. * If the callout is currently running, just defer the migration * to a more appropriate moment. */ if (c->c_cpu != cpu) { if (cc_exec_curr(cc, direct) == c) { /* * Pending will have been removed since we are * actually executing the callout on another * CPU. That callout should be waiting on the * lock the caller holds. If we set both * active/and/pending after we return and the * lock on the executing callout proceeds, it * will then see pending is true and return. * At the return from the actual callout execution * the migration will occur in softclock_call_cc * and this new callout will be placed on the * new CPU via a call to callout_cpu_switch() which * will get the lock on the right CPU followed * by a call callout_cc_add() which will add it there. * (see above in softclock_call_cc()). */ cc_migration_cpu(cc, direct) = cpu; cc_migration_time(cc, direct) = to_sbt; cc_migration_prec(cc, direct) = precision; cc_migration_func(cc, direct) = ftn; cc_migration_arg(cc, direct) = arg; c->c_iflags |= (CALLOUT_DFRMIGRATION | CALLOUT_PENDING); c->c_flags |= CALLOUT_ACTIVE; CTR6(KTR_CALLOUT, "migration of %p func %p arg %p in %d.%08x to %u deferred", c, c->c_func, c->c_arg, (int)(to_sbt >> 32), (u_int)(to_sbt & 0xffffffff), cpu); CC_UNLOCK(cc); return (cancelled); } cc = callout_cpu_switch(c, cc, cpu); } #endif callout_cc_add(c, cc, to_sbt, precision, ftn, arg, cpu, flags); CTR6(KTR_CALLOUT, "%sscheduled %p func %p arg %p in %d.%08x", cancelled ? "re" : "", c, c->c_func, c->c_arg, (int)(to_sbt >> 32), (u_int)(to_sbt & 0xffffffff)); CC_UNLOCK(cc); return (cancelled); } /* * Common idioms that can be optimized in the future. */ int callout_schedule_on(struct callout *c, int to_ticks, int cpu) { return callout_reset_on(c, to_ticks, c->c_func, c->c_arg, cpu); } int callout_schedule(struct callout *c, int to_ticks) { return callout_reset_on(c, to_ticks, c->c_func, c->c_arg, c->c_cpu); } int _callout_stop_safe(struct callout *c, int safe) { struct callout_cpu *cc, *old_cc; struct lock_class *class; int direct, sq_locked, use_lock; int not_on_a_list; if (safe) WITNESS_WARN(WARN_GIANTOK | WARN_SLEEPOK, c->c_lock, "calling %s", __func__); /* * Some old subsystems don't hold Giant while running a callout_stop(), * so just discard this check for the moment. */ if (!safe && c->c_lock != NULL) { if (c->c_lock == &Giant.lock_object) use_lock = mtx_owned(&Giant); else { use_lock = 1; class = LOCK_CLASS(c->c_lock); class->lc_assert(c->c_lock, LA_XLOCKED); } } else use_lock = 0; if (c->c_iflags & CALLOUT_DIRECT) { direct = 1; } else { direct = 0; } sq_locked = 0; old_cc = NULL; again: cc = callout_lock(c); if ((c->c_iflags & (CALLOUT_DFRMIGRATION | CALLOUT_PENDING)) == (CALLOUT_DFRMIGRATION | CALLOUT_PENDING) && ((c->c_flags & CALLOUT_ACTIVE) == CALLOUT_ACTIVE)) { /* * Special case where this slipped in while we * were migrating *as* the callout is about to * execute. The caller probably holds the lock * the callout wants. * * Get rid of the migration first. Then set * the flag that tells this code *not* to * try to remove it from any lists (its not * on one yet). When the callout wheel runs, * it will ignore this callout. */ c->c_iflags &= ~CALLOUT_PENDING; c->c_flags &= ~CALLOUT_ACTIVE; not_on_a_list = 1; } else { not_on_a_list = 0; } /* * If the callout was migrating while the callout cpu lock was * dropped, just drop the sleepqueue lock and check the states * again. */ if (sq_locked != 0 && cc != old_cc) { #ifdef SMP CC_UNLOCK(cc); sleepq_release(&cc_exec_waiting(old_cc, direct)); sq_locked = 0; old_cc = NULL; goto again; #else panic("migration should not happen"); #endif } /* * If the callout isn't pending, it's not on the queue, so * don't attempt to remove it from the queue. We can try to * stop it by other means however. */ if (!(c->c_iflags & CALLOUT_PENDING)) { c->c_flags &= ~CALLOUT_ACTIVE; /* * If it wasn't on the queue and it isn't the current * callout, then we can't stop it, so just bail. */ if (cc_exec_curr(cc, direct) != c) { CTR3(KTR_CALLOUT, "failed to stop %p func %p arg %p", c, c->c_func, c->c_arg); CC_UNLOCK(cc); if (sq_locked) sleepq_release(&cc_exec_waiting(cc, direct)); return (0); } if (safe) { /* * The current callout is running (or just * about to run) and blocking is allowed, so * just wait for the current invocation to * finish. */ while (cc_exec_curr(cc, direct) == c) { /* * Use direct calls to sleepqueue interface * instead of cv/msleep in order to avoid * a LOR between cc_lock and sleepqueue * chain spinlocks. This piece of code * emulates a msleep_spin() call actually. * * If we already have the sleepqueue chain * locked, then we can safely block. If we * don't already have it locked, however, * we have to drop the cc_lock to lock * it. This opens several races, so we * restart at the beginning once we have * both locks. If nothing has changed, then * we will end up back here with sq_locked * set. */ if (!sq_locked) { CC_UNLOCK(cc); sleepq_lock( &cc_exec_waiting(cc, direct)); sq_locked = 1; old_cc = cc; goto again; } /* * Migration could be cancelled here, but * as long as it is still not sure when it * will be packed up, just let softclock() * take care of it. */ cc_exec_waiting(cc, direct) = true; DROP_GIANT(); CC_UNLOCK(cc); sleepq_add( &cc_exec_waiting(cc, direct), &cc->cc_lock.lock_object, "codrain", SLEEPQ_SLEEP, 0); sleepq_wait( &cc_exec_waiting(cc, direct), 0); sq_locked = 0; old_cc = NULL; /* Reacquire locks previously released. */ PICKUP_GIANT(); CC_LOCK(cc); } } else if (use_lock && !cc_exec_cancel(cc, direct)) { /* * The current callout is waiting for its * lock which we hold. Cancel the callout * and return. After our caller drops the * lock, the callout will be skipped in * softclock(). */ cc_exec_cancel(cc, direct) = true; CTR3(KTR_CALLOUT, "cancelled %p func %p arg %p", c, c->c_func, c->c_arg); KASSERT(!cc_cce_migrating(cc, direct), ("callout wrongly scheduled for migration")); if (callout_migrating(c)) { c->c_iflags &= ~CALLOUT_DFRMIGRATION; #ifdef SMP cc_migration_cpu(cc, direct) = CPUBLOCK; cc_migration_time(cc, direct) = 0; cc_migration_prec(cc, direct) = 0; cc_migration_func(cc, direct) = NULL; cc_migration_arg(cc, direct) = NULL; #endif } CC_UNLOCK(cc); KASSERT(!sq_locked, ("sleepqueue chain locked")); return (1); } else if (callout_migrating(c)) { /* * The callout is currently being serviced * and the "next" callout is scheduled at * its completion with a migration. We remove * the migration flag so it *won't* get rescheduled, * but we can't stop the one thats running so * we return 0. */ c->c_iflags &= ~CALLOUT_DFRMIGRATION; #ifdef SMP /* * We can't call cc_cce_cleanup here since * if we do it will remove .ce_curr and * its still running. This will prevent a * reschedule of the callout when the * execution completes. */ cc_migration_cpu(cc, direct) = CPUBLOCK; cc_migration_time(cc, direct) = 0; cc_migration_prec(cc, direct) = 0; cc_migration_func(cc, direct) = NULL; cc_migration_arg(cc, direct) = NULL; #endif CTR3(KTR_CALLOUT, "postponing stop %p func %p arg %p", c, c->c_func, c->c_arg); CC_UNLOCK(cc); return (0); } CTR3(KTR_CALLOUT, "failed to stop %p func %p arg %p", c, c->c_func, c->c_arg); CC_UNLOCK(cc); KASSERT(!sq_locked, ("sleepqueue chain still locked")); return (0); } if (sq_locked) sleepq_release(&cc_exec_waiting(cc, direct)); c->c_iflags &= ~CALLOUT_PENDING; c->c_flags &= ~CALLOUT_ACTIVE; CTR3(KTR_CALLOUT, "cancelled %p func %p arg %p", c, c->c_func, c->c_arg); if (not_on_a_list == 0) { if ((c->c_iflags & CALLOUT_PROCESSED) == 0) { if (cc_exec_next(cc) == c) cc_exec_next(cc) = LIST_NEXT(c, c_links.le); LIST_REMOVE(c, c_links.le); } else { TAILQ_REMOVE(&cc->cc_expireq, c, c_links.tqe); } } callout_cc_del(c, cc); CC_UNLOCK(cc); return (1); } void callout_init(struct callout *c, int mpsafe) { bzero(c, sizeof *c); if (mpsafe) { c->c_lock = NULL; c->c_iflags = CALLOUT_RETURNUNLOCKED; } else { c->c_lock = &Giant.lock_object; c->c_iflags = 0; } c->c_cpu = timeout_cpu; } void _callout_init_lock(struct callout *c, struct lock_object *lock, int flags) { bzero(c, sizeof *c); c->c_lock = lock; KASSERT((flags & ~(CALLOUT_RETURNUNLOCKED | CALLOUT_SHAREDLOCK)) == 0, ("callout_init_lock: bad flags %d", flags)); KASSERT(lock != NULL || (flags & CALLOUT_RETURNUNLOCKED) == 0, ("callout_init_lock: CALLOUT_RETURNUNLOCKED with no lock")); KASSERT(lock == NULL || !(LOCK_CLASS(lock)->lc_flags & (LC_SPINLOCK | LC_SLEEPABLE)), ("%s: invalid lock class", __func__)); c->c_iflags = flags & (CALLOUT_RETURNUNLOCKED | CALLOUT_SHAREDLOCK); c->c_cpu = timeout_cpu; } #ifdef APM_FIXUP_CALLTODO /* * Adjust the kernel calltodo timeout list. This routine is used after * an APM resume to recalculate the calltodo timer list values with the * number of hz's we have been sleeping. The next hardclock() will detect * that there are fired timers and run softclock() to execute them. * * Please note, I have not done an exhaustive analysis of what code this * might break. I am motivated to have my select()'s and alarm()'s that * have expired during suspend firing upon resume so that the applications * which set the timer can do the maintanence the timer was for as close * as possible to the originally intended time. Testing this code for a * week showed that resuming from a suspend resulted in 22 to 25 timers * firing, which seemed independant on whether the suspend was 2 hours or * 2 days. Your milage may vary. - Ken Key */ void adjust_timeout_calltodo(struct timeval *time_change) { register struct callout *p; unsigned long delta_ticks; /* * How many ticks were we asleep? * (stolen from tvtohz()). */ /* Don't do anything */ if (time_change->tv_sec < 0) return; else if (time_change->tv_sec <= LONG_MAX / 1000000) delta_ticks = (time_change->tv_sec * 1000000 + time_change->tv_usec + (tick - 1)) / tick + 1; else if (time_change->tv_sec <= LONG_MAX / hz) delta_ticks = time_change->tv_sec * hz + (time_change->tv_usec + (tick - 1)) / tick + 1; else delta_ticks = LONG_MAX; if (delta_ticks > INT_MAX) delta_ticks = INT_MAX; /* * Now rip through the timer calltodo list looking for timers * to expire. */ /* don't collide with softclock() */ CC_LOCK(cc); for (p = calltodo.c_next; p != NULL; p = p->c_next) { p->c_time -= delta_ticks; /* Break if the timer had more time on it than delta_ticks */ if (p->c_time > 0) break; /* take back the ticks the timer didn't use (p->c_time <= 0) */ delta_ticks = -p->c_time; } CC_UNLOCK(cc); return; } #endif /* APM_FIXUP_CALLTODO */ static int flssbt(sbintime_t sbt) { sbt += (uint64_t)sbt >> 1; if (sizeof(long) >= sizeof(sbintime_t)) return (flsl(sbt)); if (sbt >= SBT_1S) return (flsl(((uint64_t)sbt) >> 32) + 32); return (flsl(sbt)); } /* * Dump immediate statistic snapshot of the scheduled callouts. */ static int sysctl_kern_callout_stat(SYSCTL_HANDLER_ARGS) { struct callout *tmp; struct callout_cpu *cc; struct callout_list *sc; sbintime_t maxpr, maxt, medpr, medt, now, spr, st, t; int ct[64], cpr[64], ccpbk[32]; int error, val, i, count, tcum, pcum, maxc, c, medc; #ifdef SMP int cpu; #endif val = 0; error = sysctl_handle_int(oidp, &val, 0, req); if (error != 0 || req->newptr == NULL) return (error); count = maxc = 0; st = spr = maxt = maxpr = 0; bzero(ccpbk, sizeof(ccpbk)); bzero(ct, sizeof(ct)); bzero(cpr, sizeof(cpr)); now = sbinuptime(); #ifdef SMP CPU_FOREACH(cpu) { cc = CC_CPU(cpu); #else cc = CC_CPU(timeout_cpu); #endif CC_LOCK(cc); for (i = 0; i < callwheelsize; i++) { sc = &cc->cc_callwheel[i]; c = 0; LIST_FOREACH(tmp, sc, c_links.le) { c++; t = tmp->c_time - now; if (t < 0) t = 0; st += t / SBT_1US; spr += tmp->c_precision / SBT_1US; if (t > maxt) maxt = t; if (tmp->c_precision > maxpr) maxpr = tmp->c_precision; ct[flssbt(t)]++; cpr[flssbt(tmp->c_precision)]++; } if (c > maxc) maxc = c; ccpbk[fls(c + c / 2)]++; count += c; } CC_UNLOCK(cc); #ifdef SMP } #endif for (i = 0, tcum = 0; i < 64 && tcum < count / 2; i++) tcum += ct[i]; medt = (i >= 2) ? (((sbintime_t)1) << (i - 2)) : 0; for (i = 0, pcum = 0; i < 64 && pcum < count / 2; i++) pcum += cpr[i]; medpr = (i >= 2) ? (((sbintime_t)1) << (i - 2)) : 0; for (i = 0, c = 0; i < 32 && c < count / 2; i++) c += ccpbk[i]; medc = (i >= 2) ? (1 << (i - 2)) : 0; printf("Scheduled callouts statistic snapshot:\n"); printf(" Callouts: %6d Buckets: %6d*%-3d Bucket size: 0.%06ds\n", count, callwheelsize, mp_ncpus, 1000000 >> CC_HASH_SHIFT); printf(" C/Bk: med %5d avg %6d.%06jd max %6d\n", medc, count / callwheelsize / mp_ncpus, (uint64_t)count * 1000000 / callwheelsize / mp_ncpus % 1000000, maxc); printf(" Time: med %5jd.%06jds avg %6jd.%06jds max %6jd.%06jds\n", medt / SBT_1S, (medt & 0xffffffff) * 1000000 >> 32, (st / count) / 1000000, (st / count) % 1000000, maxt / SBT_1S, (maxt & 0xffffffff) * 1000000 >> 32); printf(" Prec: med %5jd.%06jds avg %6jd.%06jds max %6jd.%06jds\n", medpr / SBT_1S, (medpr & 0xffffffff) * 1000000 >> 32, (spr / count) / 1000000, (spr / count) % 1000000, maxpr / SBT_1S, (maxpr & 0xffffffff) * 1000000 >> 32); printf(" Distribution: \tbuckets\t time\t tcum\t" " prec\t pcum\n"); for (i = 0, tcum = pcum = 0; i < 64; i++) { if (ct[i] == 0 && cpr[i] == 0) continue; t = (i != 0) ? (((sbintime_t)1) << (i - 1)) : 0; tcum += ct[i]; pcum += cpr[i]; printf(" %10jd.%06jds\t 2**%d\t%7d\t%7d\t%7d\t%7d\n", t / SBT_1S, (t & 0xffffffff) * 1000000 >> 32, i - 1 - (32 - CC_HASH_SHIFT), ct[i], tcum, cpr[i], pcum); } return (error); } SYSCTL_PROC(_kern, OID_AUTO, callout_stat, CTLTYPE_INT | CTLFLAG_RW | CTLFLAG_MPSAFE, 0, 0, sysctl_kern_callout_stat, "I", "Dump immediate statistic snapshot of the scheduled callouts"); Index: user/ngie/more-tests/sys/kern/subr_bus.c =================================================================== --- user/ngie/more-tests/sys/kern/subr_bus.c (revision 281584) +++ user/ngie/more-tests/sys/kern/subr_bus.c (revision 281585) @@ -1,5306 +1,5308 @@ /*- * Copyright (c) 1997,1998,2003 Doug Rabson * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ #include __FBSDID("$FreeBSD$"); #include "opt_bus.h" #include "opt_random.h" #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include SYSCTL_NODE(_hw, OID_AUTO, bus, CTLFLAG_RW, NULL, NULL); SYSCTL_ROOT_NODE(OID_AUTO, dev, CTLFLAG_RW, NULL, NULL); /* * Used to attach drivers to devclasses. */ typedef struct driverlink *driverlink_t; struct driverlink { kobj_class_t driver; TAILQ_ENTRY(driverlink) link; /* list of drivers in devclass */ int pass; TAILQ_ENTRY(driverlink) passlink; }; /* * Forward declarations */ typedef TAILQ_HEAD(devclass_list, devclass) devclass_list_t; typedef TAILQ_HEAD(driver_list, driverlink) driver_list_t; typedef TAILQ_HEAD(device_list, device) device_list_t; struct devclass { TAILQ_ENTRY(devclass) link; devclass_t parent; /* parent in devclass hierarchy */ driver_list_t drivers; /* bus devclasses store drivers for bus */ char *name; device_t *devices; /* array of devices indexed by unit */ int maxunit; /* size of devices array */ int flags; #define DC_HAS_CHILDREN 1 struct sysctl_ctx_list sysctl_ctx; struct sysctl_oid *sysctl_tree; }; /** * @brief Implementation of device. */ struct device { /* * A device is a kernel object. The first field must be the * current ops table for the object. */ KOBJ_FIELDS; /* * Device hierarchy. */ TAILQ_ENTRY(device) link; /**< list of devices in parent */ TAILQ_ENTRY(device) devlink; /**< global device list membership */ device_t parent; /**< parent of this device */ device_list_t children; /**< list of child devices */ /* * Details of this device. */ driver_t *driver; /**< current driver */ devclass_t devclass; /**< current device class */ int unit; /**< current unit number */ char* nameunit; /**< name+unit e.g. foodev0 */ char* desc; /**< driver specific description */ int busy; /**< count of calls to device_busy() */ device_state_t state; /**< current device state */ uint32_t devflags; /**< api level flags for device_get_flags() */ u_int flags; /**< internal device flags */ u_int order; /**< order from device_add_child_ordered() */ void *ivars; /**< instance variables */ void *softc; /**< current driver's variables */ struct sysctl_ctx_list sysctl_ctx; /**< state for sysctl variables */ struct sysctl_oid *sysctl_tree; /**< state for sysctl variables */ }; static MALLOC_DEFINE(M_BUS, "bus", "Bus data structures"); static MALLOC_DEFINE(M_BUS_SC, "bus-sc", "Bus data structures, softc"); static void devctl2_init(void); #ifdef BUS_DEBUG static int bus_debug = 1; SYSCTL_INT(_debug, OID_AUTO, bus_debug, CTLFLAG_RWTUN, &bus_debug, 0, "Bus debug level"); #define PDEBUG(a) if (bus_debug) {printf("%s:%d: ", __func__, __LINE__), printf a; printf("\n");} #define DEVICENAME(d) ((d)? device_get_name(d): "no device") #define DRIVERNAME(d) ((d)? d->name : "no driver") #define DEVCLANAME(d) ((d)? d->name : "no devclass") /** * Produce the indenting, indent*2 spaces plus a '.' ahead of that to * prevent syslog from deleting initial spaces */ #define indentprintf(p) do { int iJ; printf("."); for (iJ=0; iJparent ? dc->parent->name : ""; break; default: return (EINVAL); } return (SYSCTL_OUT_STR(req, value)); } static void devclass_sysctl_init(devclass_t dc) { if (dc->sysctl_tree != NULL) return; sysctl_ctx_init(&dc->sysctl_ctx); dc->sysctl_tree = SYSCTL_ADD_NODE(&dc->sysctl_ctx, SYSCTL_STATIC_CHILDREN(_dev), OID_AUTO, dc->name, CTLFLAG_RD, NULL, ""); SYSCTL_ADD_PROC(&dc->sysctl_ctx, SYSCTL_CHILDREN(dc->sysctl_tree), OID_AUTO, "%parent", CTLTYPE_STRING | CTLFLAG_RD, dc, DEVCLASS_SYSCTL_PARENT, devclass_sysctl_handler, "A", "parent class"); } enum { DEVICE_SYSCTL_DESC, DEVICE_SYSCTL_DRIVER, DEVICE_SYSCTL_LOCATION, DEVICE_SYSCTL_PNPINFO, DEVICE_SYSCTL_PARENT, }; static int device_sysctl_handler(SYSCTL_HANDLER_ARGS) { device_t dev = (device_t)arg1; const char *value; char *buf; int error; buf = NULL; switch (arg2) { case DEVICE_SYSCTL_DESC: value = dev->desc ? dev->desc : ""; break; case DEVICE_SYSCTL_DRIVER: value = dev->driver ? dev->driver->name : ""; break; case DEVICE_SYSCTL_LOCATION: value = buf = malloc(1024, M_BUS, M_WAITOK | M_ZERO); bus_child_location_str(dev, buf, 1024); break; case DEVICE_SYSCTL_PNPINFO: value = buf = malloc(1024, M_BUS, M_WAITOK | M_ZERO); bus_child_pnpinfo_str(dev, buf, 1024); break; case DEVICE_SYSCTL_PARENT: value = dev->parent ? dev->parent->nameunit : ""; break; default: return (EINVAL); } error = SYSCTL_OUT_STR(req, value); if (buf != NULL) free(buf, M_BUS); return (error); } static void device_sysctl_init(device_t dev) { devclass_t dc = dev->devclass; int domain; if (dev->sysctl_tree != NULL) return; devclass_sysctl_init(dc); sysctl_ctx_init(&dev->sysctl_ctx); dev->sysctl_tree = SYSCTL_ADD_NODE(&dev->sysctl_ctx, SYSCTL_CHILDREN(dc->sysctl_tree), OID_AUTO, dev->nameunit + strlen(dc->name), CTLFLAG_RD, NULL, ""); SYSCTL_ADD_PROC(&dev->sysctl_ctx, SYSCTL_CHILDREN(dev->sysctl_tree), OID_AUTO, "%desc", CTLTYPE_STRING | CTLFLAG_RD, dev, DEVICE_SYSCTL_DESC, device_sysctl_handler, "A", "device description"); SYSCTL_ADD_PROC(&dev->sysctl_ctx, SYSCTL_CHILDREN(dev->sysctl_tree), OID_AUTO, "%driver", CTLTYPE_STRING | CTLFLAG_RD, dev, DEVICE_SYSCTL_DRIVER, device_sysctl_handler, "A", "device driver name"); SYSCTL_ADD_PROC(&dev->sysctl_ctx, SYSCTL_CHILDREN(dev->sysctl_tree), OID_AUTO, "%location", CTLTYPE_STRING | CTLFLAG_RD, dev, DEVICE_SYSCTL_LOCATION, device_sysctl_handler, "A", "device location relative to parent"); SYSCTL_ADD_PROC(&dev->sysctl_ctx, SYSCTL_CHILDREN(dev->sysctl_tree), OID_AUTO, "%pnpinfo", CTLTYPE_STRING | CTLFLAG_RD, dev, DEVICE_SYSCTL_PNPINFO, device_sysctl_handler, "A", "device identification"); SYSCTL_ADD_PROC(&dev->sysctl_ctx, SYSCTL_CHILDREN(dev->sysctl_tree), OID_AUTO, "%parent", CTLTYPE_STRING | CTLFLAG_RD, dev, DEVICE_SYSCTL_PARENT, device_sysctl_handler, "A", "parent device"); if (bus_get_domain(dev, &domain) == 0) SYSCTL_ADD_INT(&dev->sysctl_ctx, SYSCTL_CHILDREN(dev->sysctl_tree), OID_AUTO, "%domain", CTLFLAG_RD, NULL, domain, "NUMA domain"); } static void device_sysctl_update(device_t dev) { devclass_t dc = dev->devclass; if (dev->sysctl_tree == NULL) return; sysctl_rename_oid(dev->sysctl_tree, dev->nameunit + strlen(dc->name)); } static void device_sysctl_fini(device_t dev) { if (dev->sysctl_tree == NULL) return; sysctl_ctx_free(&dev->sysctl_ctx); dev->sysctl_tree = NULL; } /* * /dev/devctl implementation */ /* * This design allows only one reader for /dev/devctl. This is not desirable * in the long run, but will get a lot of hair out of this implementation. * Maybe we should make this device a clonable device. * * Also note: we specifically do not attach a device to the device_t tree * to avoid potential chicken and egg problems. One could argue that all * of this belongs to the root node. One could also further argue that the * sysctl interface that we have not might more properly be an ioctl * interface, but at this stage of the game, I'm not inclined to rock that * boat. * * I'm also not sure that the SIGIO support is done correctly or not, as * I copied it from a driver that had SIGIO support that likely hasn't been * tested since 3.4 or 2.2.8! */ /* Deprecated way to adjust queue length */ static int sysctl_devctl_disable(SYSCTL_HANDLER_ARGS); SYSCTL_PROC(_hw_bus, OID_AUTO, devctl_disable, CTLTYPE_INT | CTLFLAG_RWTUN | CTLFLAG_MPSAFE, NULL, 0, sysctl_devctl_disable, "I", "devctl disable -- deprecated"); #define DEVCTL_DEFAULT_QUEUE_LEN 1000 static int sysctl_devctl_queue(SYSCTL_HANDLER_ARGS); static int devctl_queue_length = DEVCTL_DEFAULT_QUEUE_LEN; SYSCTL_PROC(_hw_bus, OID_AUTO, devctl_queue, CTLTYPE_INT | CTLFLAG_RWTUN | CTLFLAG_MPSAFE, NULL, 0, sysctl_devctl_queue, "I", "devctl queue length"); static d_open_t devopen; static d_close_t devclose; static d_read_t devread; static d_ioctl_t devioctl; static d_poll_t devpoll; static d_kqfilter_t devkqfilter; static struct cdevsw dev_cdevsw = { .d_version = D_VERSION, .d_open = devopen, .d_close = devclose, .d_read = devread, .d_ioctl = devioctl, .d_poll = devpoll, .d_kqfilter = devkqfilter, .d_name = "devctl", }; struct dev_event_info { char *dei_data; TAILQ_ENTRY(dev_event_info) dei_link; }; TAILQ_HEAD(devq, dev_event_info); static struct dev_softc { int inuse; int nonblock; int queued; int async; struct mtx mtx; struct cv cv; struct selinfo sel; struct devq devq; struct sigio *sigio; } devsoftc; static void filt_devctl_detach(struct knote *kn); static int filt_devctl_read(struct knote *kn, long hint); struct filterops devctl_rfiltops = { .f_isfd = 1, .f_detach = filt_devctl_detach, .f_event = filt_devctl_read, }; static struct cdev *devctl_dev; static void devinit(void) { devctl_dev = make_dev_credf(MAKEDEV_ETERNAL, &dev_cdevsw, 0, NULL, UID_ROOT, GID_WHEEL, 0600, "devctl"); mtx_init(&devsoftc.mtx, "dev mtx", "devd", MTX_DEF); cv_init(&devsoftc.cv, "dev cv"); TAILQ_INIT(&devsoftc.devq); knlist_init_mtx(&devsoftc.sel.si_note, &devsoftc.mtx); devctl2_init(); } static int devopen(struct cdev *dev, int oflags, int devtype, struct thread *td) { mtx_lock(&devsoftc.mtx); if (devsoftc.inuse) { mtx_unlock(&devsoftc.mtx); return (EBUSY); } /* move to init */ devsoftc.inuse = 1; mtx_unlock(&devsoftc.mtx); return (0); } static int devclose(struct cdev *dev, int fflag, int devtype, struct thread *td) { mtx_lock(&devsoftc.mtx); devsoftc.inuse = 0; devsoftc.nonblock = 0; devsoftc.async = 0; cv_broadcast(&devsoftc.cv); funsetown(&devsoftc.sigio); mtx_unlock(&devsoftc.mtx); return (0); } /* * The read channel for this device is used to report changes to * userland in realtime. We are required to free the data as well as * the n1 object because we allocate them separately. Also note that * we return one record at a time. If you try to read this device a * character at a time, you will lose the rest of the data. Listening * programs are expected to cope. */ static int devread(struct cdev *dev, struct uio *uio, int ioflag) { struct dev_event_info *n1; int rv; mtx_lock(&devsoftc.mtx); while (TAILQ_EMPTY(&devsoftc.devq)) { if (devsoftc.nonblock) { mtx_unlock(&devsoftc.mtx); return (EAGAIN); } rv = cv_wait_sig(&devsoftc.cv, &devsoftc.mtx); if (rv) { /* * Need to translate ERESTART to EINTR here? -- jake */ mtx_unlock(&devsoftc.mtx); return (rv); } } n1 = TAILQ_FIRST(&devsoftc.devq); TAILQ_REMOVE(&devsoftc.devq, n1, dei_link); devsoftc.queued--; mtx_unlock(&devsoftc.mtx); rv = uiomove(n1->dei_data, strlen(n1->dei_data), uio); free(n1->dei_data, M_BUS); free(n1, M_BUS); return (rv); } static int devioctl(struct cdev *dev, u_long cmd, caddr_t data, int fflag, struct thread *td) { switch (cmd) { case FIONBIO: if (*(int*)data) devsoftc.nonblock = 1; else devsoftc.nonblock = 0; return (0); case FIOASYNC: if (*(int*)data) devsoftc.async = 1; else devsoftc.async = 0; return (0); case FIOSETOWN: return fsetown(*(int *)data, &devsoftc.sigio); case FIOGETOWN: *(int *)data = fgetown(&devsoftc.sigio); return (0); /* (un)Support for other fcntl() calls. */ case FIOCLEX: case FIONCLEX: case FIONREAD: default: break; } return (ENOTTY); } static int devpoll(struct cdev *dev, int events, struct thread *td) { int revents = 0; mtx_lock(&devsoftc.mtx); if (events & (POLLIN | POLLRDNORM)) { if (!TAILQ_EMPTY(&devsoftc.devq)) revents = events & (POLLIN | POLLRDNORM); else selrecord(td, &devsoftc.sel); } mtx_unlock(&devsoftc.mtx); return (revents); } static int devkqfilter(struct cdev *dev, struct knote *kn) { int error; if (kn->kn_filter == EVFILT_READ) { kn->kn_fop = &devctl_rfiltops; knlist_add(&devsoftc.sel.si_note, kn, 0); error = 0; } else error = EINVAL; return (error); } static void filt_devctl_detach(struct knote *kn) { knlist_remove(&devsoftc.sel.si_note, kn, 0); } static int filt_devctl_read(struct knote *kn, long hint) { kn->kn_data = devsoftc.queued; return (kn->kn_data != 0); } /** * @brief Return whether the userland process is running */ boolean_t devctl_process_running(void) { return (devsoftc.inuse == 1); } /** * @brief Queue data to be read from the devctl device * * Generic interface to queue data to the devctl device. It is * assumed that @p data is properly formatted. It is further assumed * that @p data is allocated using the M_BUS malloc type. */ void devctl_queue_data_f(char *data, int flags) { struct dev_event_info *n1 = NULL, *n2 = NULL; if (strlen(data) == 0) goto out; if (devctl_queue_length == 0) goto out; n1 = malloc(sizeof(*n1), M_BUS, flags); if (n1 == NULL) goto out; n1->dei_data = data; mtx_lock(&devsoftc.mtx); if (devctl_queue_length == 0) { mtx_unlock(&devsoftc.mtx); free(n1->dei_data, M_BUS); free(n1, M_BUS); return; } /* Leave at least one spot in the queue... */ while (devsoftc.queued > devctl_queue_length - 1) { n2 = TAILQ_FIRST(&devsoftc.devq); TAILQ_REMOVE(&devsoftc.devq, n2, dei_link); free(n2->dei_data, M_BUS); free(n2, M_BUS); devsoftc.queued--; } TAILQ_INSERT_TAIL(&devsoftc.devq, n1, dei_link); devsoftc.queued++; cv_broadcast(&devsoftc.cv); KNOTE_LOCKED(&devsoftc.sel.si_note, 0); mtx_unlock(&devsoftc.mtx); selwakeup(&devsoftc.sel); if (devsoftc.async && devsoftc.sigio != NULL) pgsigio(&devsoftc.sigio, SIGIO, 0); return; out: /* * We have to free data on all error paths since the caller * assumes it will be free'd when this item is dequeued. */ free(data, M_BUS); return; } void devctl_queue_data(char *data) { devctl_queue_data_f(data, M_NOWAIT); } /** * @brief Send a 'notification' to userland, using standard ways */ void devctl_notify_f(const char *system, const char *subsystem, const char *type, const char *data, int flags) { int len = 0; char *msg; if (system == NULL) return; /* BOGUS! Must specify system. */ if (subsystem == NULL) return; /* BOGUS! Must specify subsystem. */ if (type == NULL) return; /* BOGUS! Must specify type. */ len += strlen(" system=") + strlen(system); len += strlen(" subsystem=") + strlen(subsystem); len += strlen(" type=") + strlen(type); /* add in the data message plus newline. */ if (data != NULL) len += strlen(data); len += 3; /* '!', '\n', and NUL */ msg = malloc(len, M_BUS, flags); if (msg == NULL) return; /* Drop it on the floor */ if (data != NULL) snprintf(msg, len, "!system=%s subsystem=%s type=%s %s\n", system, subsystem, type, data); else snprintf(msg, len, "!system=%s subsystem=%s type=%s\n", system, subsystem, type); devctl_queue_data_f(msg, flags); } void devctl_notify(const char *system, const char *subsystem, const char *type, const char *data) { devctl_notify_f(system, subsystem, type, data, M_NOWAIT); } /* * Common routine that tries to make sending messages as easy as possible. * We allocate memory for the data, copy strings into that, but do not * free it unless there's an error. The dequeue part of the driver should * free the data. We don't send data when the device is disabled. We do * send data, even when we have no listeners, because we wish to avoid * races relating to startup and restart of listening applications. * * devaddq is designed to string together the type of event, with the * object of that event, plus the plug and play info and location info * for that event. This is likely most useful for devices, but less * useful for other consumers of this interface. Those should use * the devctl_queue_data() interface instead. */ static void devaddq(const char *type, const char *what, device_t dev) { char *data = NULL; char *loc = NULL; char *pnp = NULL; const char *parstr; if (!devctl_queue_length)/* Rare race, but lost races safely discard */ return; data = malloc(1024, M_BUS, M_NOWAIT); if (data == NULL) goto bad; /* get the bus specific location of this device */ loc = malloc(1024, M_BUS, M_NOWAIT); if (loc == NULL) goto bad; *loc = '\0'; bus_child_location_str(dev, loc, 1024); /* Get the bus specific pnp info of this device */ pnp = malloc(1024, M_BUS, M_NOWAIT); if (pnp == NULL) goto bad; *pnp = '\0'; bus_child_pnpinfo_str(dev, pnp, 1024); /* Get the parent of this device, or / if high enough in the tree. */ if (device_get_parent(dev) == NULL) parstr = "."; /* Or '/' ? */ else parstr = device_get_nameunit(device_get_parent(dev)); /* String it all together. */ snprintf(data, 1024, "%s%s at %s %s on %s\n", type, what, loc, pnp, parstr); free(loc, M_BUS); free(pnp, M_BUS); devctl_queue_data(data); return; bad: free(pnp, M_BUS); free(loc, M_BUS); free(data, M_BUS); return; } /* * A device was added to the tree. We are called just after it successfully * attaches (that is, probe and attach success for this device). No call * is made if a device is merely parented into the tree. See devnomatch * if probe fails. If attach fails, no notification is sent (but maybe * we should have a different message for this). */ static void devadded(device_t dev) { devaddq("+", device_get_nameunit(dev), dev); } /* * A device was removed from the tree. We are called just before this * happens. */ static void devremoved(device_t dev) { devaddq("-", device_get_nameunit(dev), dev); } /* * Called when there's no match for this device. This is only called * the first time that no match happens, so we don't keep getting this * message. Should that prove to be undesirable, we can change it. * This is called when all drivers that can attach to a given bus * decline to accept this device. Other errors may not be detected. */ static void devnomatch(device_t dev) { devaddq("?", "", dev); } static int sysctl_devctl_disable(SYSCTL_HANDLER_ARGS) { struct dev_event_info *n1; int dis, error; dis = (devctl_queue_length == 0); error = sysctl_handle_int(oidp, &dis, 0, req); if (error || !req->newptr) return (error); if (mtx_initialized(&devsoftc.mtx)) mtx_lock(&devsoftc.mtx); if (dis) { while (!TAILQ_EMPTY(&devsoftc.devq)) { n1 = TAILQ_FIRST(&devsoftc.devq); TAILQ_REMOVE(&devsoftc.devq, n1, dei_link); free(n1->dei_data, M_BUS); free(n1, M_BUS); } devsoftc.queued = 0; devctl_queue_length = 0; } else { devctl_queue_length = DEVCTL_DEFAULT_QUEUE_LEN; } if (mtx_initialized(&devsoftc.mtx)) mtx_unlock(&devsoftc.mtx); return (0); } static int sysctl_devctl_queue(SYSCTL_HANDLER_ARGS) { struct dev_event_info *n1; int q, error; q = devctl_queue_length; error = sysctl_handle_int(oidp, &q, 0, req); if (error || !req->newptr) return (error); if (q < 0) return (EINVAL); if (mtx_initialized(&devsoftc.mtx)) mtx_lock(&devsoftc.mtx); devctl_queue_length = q; while (devsoftc.queued > devctl_queue_length) { n1 = TAILQ_FIRST(&devsoftc.devq); TAILQ_REMOVE(&devsoftc.devq, n1, dei_link); free(n1->dei_data, M_BUS); free(n1, M_BUS); devsoftc.queued--; } if (mtx_initialized(&devsoftc.mtx)) mtx_unlock(&devsoftc.mtx); return (0); } /* End of /dev/devctl code */ static TAILQ_HEAD(,device) bus_data_devices; static int bus_data_generation = 1; static kobj_method_t null_methods[] = { KOBJMETHOD_END }; DEFINE_CLASS(null, null_methods, 0); /* * Bus pass implementation */ static driver_list_t passes = TAILQ_HEAD_INITIALIZER(passes); int bus_current_pass = BUS_PASS_ROOT; /** * @internal * @brief Register the pass level of a new driver attachment * * Register a new driver attachment's pass level. If no driver * attachment with the same pass level has been added, then @p new * will be added to the global passes list. * * @param new the new driver attachment */ static void driver_register_pass(struct driverlink *new) { struct driverlink *dl; /* We only consider pass numbers during boot. */ if (bus_current_pass == BUS_PASS_DEFAULT) return; /* * Walk the passes list. If we already know about this pass * then there is nothing to do. If we don't, then insert this * driver link into the list. */ TAILQ_FOREACH(dl, &passes, passlink) { if (dl->pass < new->pass) continue; if (dl->pass == new->pass) return; TAILQ_INSERT_BEFORE(dl, new, passlink); return; } TAILQ_INSERT_TAIL(&passes, new, passlink); } /** * @brief Raise the current bus pass * * Raise the current bus pass level to @p pass. Call the BUS_NEW_PASS() * method on the root bus to kick off a new device tree scan for each * new pass level that has at least one driver. */ void bus_set_pass(int pass) { struct driverlink *dl; if (bus_current_pass > pass) panic("Attempt to lower bus pass level"); TAILQ_FOREACH(dl, &passes, passlink) { /* Skip pass values below the current pass level. */ if (dl->pass <= bus_current_pass) continue; /* * Bail once we hit a driver with a pass level that is * too high. */ if (dl->pass > pass) break; /* * Raise the pass level to the next level and rescan * the tree. */ bus_current_pass = dl->pass; BUS_NEW_PASS(root_bus); } /* * If there isn't a driver registered for the requested pass, * then bus_current_pass might still be less than 'pass'. Set * it to 'pass' in that case. */ if (bus_current_pass < pass) bus_current_pass = pass; KASSERT(bus_current_pass == pass, ("Failed to update bus pass level")); } /* * Devclass implementation */ static devclass_list_t devclasses = TAILQ_HEAD_INITIALIZER(devclasses); /** * @internal * @brief Find or create a device class * * If a device class with the name @p classname exists, return it, * otherwise if @p create is non-zero create and return a new device * class. * * If @p parentname is non-NULL, the parent of the devclass is set to * the devclass of that name. * * @param classname the devclass name to find or create * @param parentname the parent devclass name or @c NULL * @param create non-zero to create a devclass */ static devclass_t devclass_find_internal(const char *classname, const char *parentname, int create) { devclass_t dc; PDEBUG(("looking for %s", classname)); if (!classname) return (NULL); TAILQ_FOREACH(dc, &devclasses, link) { if (!strcmp(dc->name, classname)) break; } if (create && !dc) { PDEBUG(("creating %s", classname)); dc = malloc(sizeof(struct devclass) + strlen(classname) + 1, M_BUS, M_NOWAIT | M_ZERO); if (!dc) return (NULL); dc->parent = NULL; dc->name = (char*) (dc + 1); strcpy(dc->name, classname); TAILQ_INIT(&dc->drivers); TAILQ_INSERT_TAIL(&devclasses, dc, link); bus_data_generation_update(); } /* * If a parent class is specified, then set that as our parent so * that this devclass will support drivers for the parent class as * well. If the parent class has the same name don't do this though * as it creates a cycle that can trigger an infinite loop in * device_probe_child() if a device exists for which there is no * suitable driver. */ if (parentname && dc && !dc->parent && strcmp(classname, parentname) != 0) { dc->parent = devclass_find_internal(parentname, NULL, TRUE); dc->parent->flags |= DC_HAS_CHILDREN; } return (dc); } /** * @brief Create a device class * * If a device class with the name @p classname exists, return it, * otherwise create and return a new device class. * * @param classname the devclass name to find or create */ devclass_t devclass_create(const char *classname) { return (devclass_find_internal(classname, NULL, TRUE)); } /** * @brief Find a device class * * If a device class with the name @p classname exists, return it, * otherwise return @c NULL. * * @param classname the devclass name to find */ devclass_t devclass_find(const char *classname) { return (devclass_find_internal(classname, NULL, FALSE)); } /** * @brief Register that a device driver has been added to a devclass * * Register that a device driver has been added to a devclass. This * is called by devclass_add_driver to accomplish the recursive * notification of all the children classes of dc, as well as dc. * Each layer will have BUS_DRIVER_ADDED() called for all instances of * the devclass. * * We do a full search here of the devclass list at each iteration * level to save storing children-lists in the devclass structure. If * we ever move beyond a few dozen devices doing this, we may need to * reevaluate... * * @param dc the devclass to edit * @param driver the driver that was just added */ static void devclass_driver_added(devclass_t dc, driver_t *driver) { devclass_t parent; int i; /* * Call BUS_DRIVER_ADDED for any existing busses in this class. */ for (i = 0; i < dc->maxunit; i++) if (dc->devices[i] && device_is_attached(dc->devices[i])) BUS_DRIVER_ADDED(dc->devices[i], driver); /* * Walk through the children classes. Since we only keep a * single parent pointer around, we walk the entire list of * devclasses looking for children. We set the * DC_HAS_CHILDREN flag when a child devclass is created on * the parent, so we only walk the list for those devclasses * that have children. */ if (!(dc->flags & DC_HAS_CHILDREN)) return; parent = dc; TAILQ_FOREACH(dc, &devclasses, link) { if (dc->parent == parent) devclass_driver_added(dc, driver); } } /** * @brief Add a device driver to a device class * * Add a device driver to a devclass. This is normally called * automatically by DRIVER_MODULE(). The BUS_DRIVER_ADDED() method of * all devices in the devclass will be called to allow them to attempt * to re-probe any unmatched children. * * @param dc the devclass to edit * @param driver the driver to register */ int devclass_add_driver(devclass_t dc, driver_t *driver, int pass, devclass_t *dcp) { driverlink_t dl; const char *parentname; PDEBUG(("%s", DRIVERNAME(driver))); /* Don't allow invalid pass values. */ if (pass <= BUS_PASS_ROOT) return (EINVAL); dl = malloc(sizeof *dl, M_BUS, M_NOWAIT|M_ZERO); if (!dl) return (ENOMEM); /* * Compile the driver's methods. Also increase the reference count * so that the class doesn't get freed when the last instance * goes. This means we can safely use static methods and avoids a * double-free in devclass_delete_driver. */ kobj_class_compile((kobj_class_t) driver); /* * If the driver has any base classes, make the * devclass inherit from the devclass of the driver's * first base class. This will allow the system to * search for drivers in both devclasses for children * of a device using this driver. */ if (driver->baseclasses) parentname = driver->baseclasses[0]->name; else parentname = NULL; *dcp = devclass_find_internal(driver->name, parentname, TRUE); dl->driver = driver; TAILQ_INSERT_TAIL(&dc->drivers, dl, link); driver->refs++; /* XXX: kobj_mtx */ dl->pass = pass; driver_register_pass(dl); devclass_driver_added(dc, driver); bus_data_generation_update(); return (0); } /** * @brief Register that a device driver has been deleted from a devclass * * Register that a device driver has been removed from a devclass. * This is called by devclass_delete_driver to accomplish the * recursive notification of all the children classes of busclass, as * well as busclass. Each layer will attempt to detach the driver * from any devices that are children of the bus's devclass. The function * will return an error if a device fails to detach. * * We do a full search here of the devclass list at each iteration * level to save storing children-lists in the devclass structure. If * we ever move beyond a few dozen devices doing this, we may need to * reevaluate... * * @param busclass the devclass of the parent bus * @param dc the devclass of the driver being deleted * @param driver the driver being deleted */ static int devclass_driver_deleted(devclass_t busclass, devclass_t dc, driver_t *driver) { devclass_t parent; device_t dev; int error, i; /* * Disassociate from any devices. We iterate through all the * devices in the devclass of the driver and detach any which are * using the driver and which have a parent in the devclass which * we are deleting from. * * Note that since a driver can be in multiple devclasses, we * should not detach devices which are not children of devices in * the affected devclass. */ for (i = 0; i < dc->maxunit; i++) { if (dc->devices[i]) { dev = dc->devices[i]; if (dev->driver == driver && dev->parent && dev->parent->devclass == busclass) { if ((error = device_detach(dev)) != 0) return (error); BUS_PROBE_NOMATCH(dev->parent, dev); devnomatch(dev); dev->flags |= DF_DONENOMATCH; } } } /* * Walk through the children classes. Since we only keep a * single parent pointer around, we walk the entire list of * devclasses looking for children. We set the * DC_HAS_CHILDREN flag when a child devclass is created on * the parent, so we only walk the list for those devclasses * that have children. */ if (!(busclass->flags & DC_HAS_CHILDREN)) return (0); parent = busclass; TAILQ_FOREACH(busclass, &devclasses, link) { if (busclass->parent == parent) { error = devclass_driver_deleted(busclass, dc, driver); if (error) return (error); } } return (0); } /** * @brief Delete a device driver from a device class * * Delete a device driver from a devclass. This is normally called * automatically by DRIVER_MODULE(). * * If the driver is currently attached to any devices, * devclass_delete_driver() will first attempt to detach from each * device. If one of the detach calls fails, the driver will not be * deleted. * * @param dc the devclass to edit * @param driver the driver to unregister */ int devclass_delete_driver(devclass_t busclass, driver_t *driver) { devclass_t dc = devclass_find(driver->name); driverlink_t dl; int error; PDEBUG(("%s from devclass %s", driver->name, DEVCLANAME(busclass))); if (!dc) return (0); /* * Find the link structure in the bus' list of drivers. */ TAILQ_FOREACH(dl, &busclass->drivers, link) { if (dl->driver == driver) break; } if (!dl) { PDEBUG(("%s not found in %s list", driver->name, busclass->name)); return (ENOENT); } error = devclass_driver_deleted(busclass, dc, driver); if (error != 0) return (error); TAILQ_REMOVE(&busclass->drivers, dl, link); free(dl, M_BUS); /* XXX: kobj_mtx */ driver->refs--; if (driver->refs == 0) kobj_class_free((kobj_class_t) driver); bus_data_generation_update(); return (0); } /** * @brief Quiesces a set of device drivers from a device class * * Quiesce a device driver from a devclass. This is normally called * automatically by DRIVER_MODULE(). * * If the driver is currently attached to any devices, * devclass_quiesece_driver() will first attempt to quiesce each * device. * * @param dc the devclass to edit * @param driver the driver to unregister */ static int devclass_quiesce_driver(devclass_t busclass, driver_t *driver) { devclass_t dc = devclass_find(driver->name); driverlink_t dl; device_t dev; int i; int error; PDEBUG(("%s from devclass %s", driver->name, DEVCLANAME(busclass))); if (!dc) return (0); /* * Find the link structure in the bus' list of drivers. */ TAILQ_FOREACH(dl, &busclass->drivers, link) { if (dl->driver == driver) break; } if (!dl) { PDEBUG(("%s not found in %s list", driver->name, busclass->name)); return (ENOENT); } /* * Quiesce all devices. We iterate through all the devices in * the devclass of the driver and quiesce any which are using * the driver and which have a parent in the devclass which we * are quiescing. * * Note that since a driver can be in multiple devclasses, we * should not quiesce devices which are not children of * devices in the affected devclass. */ for (i = 0; i < dc->maxunit; i++) { if (dc->devices[i]) { dev = dc->devices[i]; if (dev->driver == driver && dev->parent && dev->parent->devclass == busclass) { if ((error = device_quiesce(dev)) != 0) return (error); } } } return (0); } /** * @internal */ static driverlink_t devclass_find_driver_internal(devclass_t dc, const char *classname) { driverlink_t dl; PDEBUG(("%s in devclass %s", classname, DEVCLANAME(dc))); TAILQ_FOREACH(dl, &dc->drivers, link) { if (!strcmp(dl->driver->name, classname)) return (dl); } PDEBUG(("not found")); return (NULL); } /** * @brief Return the name of the devclass */ const char * devclass_get_name(devclass_t dc) { return (dc->name); } /** * @brief Find a device given a unit number * * @param dc the devclass to search * @param unit the unit number to search for * * @returns the device with the given unit number or @c * NULL if there is no such device */ device_t devclass_get_device(devclass_t dc, int unit) { if (dc == NULL || unit < 0 || unit >= dc->maxunit) return (NULL); return (dc->devices[unit]); } /** * @brief Find the softc field of a device given a unit number * * @param dc the devclass to search * @param unit the unit number to search for * * @returns the softc field of the device with the given * unit number or @c NULL if there is no such * device */ void * devclass_get_softc(devclass_t dc, int unit) { device_t dev; dev = devclass_get_device(dc, unit); if (!dev) return (NULL); return (device_get_softc(dev)); } /** * @brief Get a list of devices in the devclass * * An array containing a list of all the devices in the given devclass * is allocated and returned in @p *devlistp. The number of devices * in the array is returned in @p *devcountp. The caller should free * the array using @c free(p, M_TEMP), even if @p *devcountp is 0. * * @param dc the devclass to examine * @param devlistp points at location for array pointer return * value * @param devcountp points at location for array size return value * * @retval 0 success * @retval ENOMEM the array allocation failed */ int devclass_get_devices(devclass_t dc, device_t **devlistp, int *devcountp) { int count, i; device_t *list; count = devclass_get_count(dc); list = malloc(count * sizeof(device_t), M_TEMP, M_NOWAIT|M_ZERO); if (!list) return (ENOMEM); count = 0; for (i = 0; i < dc->maxunit; i++) { if (dc->devices[i]) { list[count] = dc->devices[i]; count++; } } *devlistp = list; *devcountp = count; return (0); } /** * @brief Get a list of drivers in the devclass * * An array containing a list of pointers to all the drivers in the * given devclass is allocated and returned in @p *listp. The number * of drivers in the array is returned in @p *countp. The caller should * free the array using @c free(p, M_TEMP). * * @param dc the devclass to examine * @param listp gives location for array pointer return value * @param countp gives location for number of array elements * return value * * @retval 0 success * @retval ENOMEM the array allocation failed */ int devclass_get_drivers(devclass_t dc, driver_t ***listp, int *countp) { driverlink_t dl; driver_t **list; int count; count = 0; TAILQ_FOREACH(dl, &dc->drivers, link) count++; list = malloc(count * sizeof(driver_t *), M_TEMP, M_NOWAIT); if (list == NULL) return (ENOMEM); count = 0; TAILQ_FOREACH(dl, &dc->drivers, link) { list[count] = dl->driver; count++; } *listp = list; *countp = count; return (0); } /** * @brief Get the number of devices in a devclass * * @param dc the devclass to examine */ int devclass_get_count(devclass_t dc) { int count, i; count = 0; for (i = 0; i < dc->maxunit; i++) if (dc->devices[i]) count++; return (count); } /** * @brief Get the maximum unit number used in a devclass * * Note that this is one greater than the highest currently-allocated * unit. If a null devclass_t is passed in, -1 is returned to indicate * that not even the devclass has been allocated yet. * * @param dc the devclass to examine */ int devclass_get_maxunit(devclass_t dc) { if (dc == NULL) return (-1); return (dc->maxunit); } /** * @brief Find a free unit number in a devclass * * This function searches for the first unused unit number greater * that or equal to @p unit. * * @param dc the devclass to examine * @param unit the first unit number to check */ int devclass_find_free_unit(devclass_t dc, int unit) { if (dc == NULL) return (unit); while (unit < dc->maxunit && dc->devices[unit] != NULL) unit++; return (unit); } /** * @brief Set the parent of a devclass * * The parent class is normally initialised automatically by * DRIVER_MODULE(). * * @param dc the devclass to edit * @param pdc the new parent devclass */ void devclass_set_parent(devclass_t dc, devclass_t pdc) { dc->parent = pdc; } /** * @brief Get the parent of a devclass * * @param dc the devclass to examine */ devclass_t devclass_get_parent(devclass_t dc) { return (dc->parent); } struct sysctl_ctx_list * devclass_get_sysctl_ctx(devclass_t dc) { return (&dc->sysctl_ctx); } struct sysctl_oid * devclass_get_sysctl_tree(devclass_t dc) { return (dc->sysctl_tree); } /** * @internal * @brief Allocate a unit number * * On entry, @p *unitp is the desired unit number (or @c -1 if any * will do). The allocated unit number is returned in @p *unitp. * @param dc the devclass to allocate from * @param unitp points at the location for the allocated unit * number * * @retval 0 success * @retval EEXIST the requested unit number is already allocated * @retval ENOMEM memory allocation failure */ static int devclass_alloc_unit(devclass_t dc, device_t dev, int *unitp) { const char *s; int unit = *unitp; PDEBUG(("unit %d in devclass %s", unit, DEVCLANAME(dc))); /* Ask the parent bus if it wants to wire this device. */ if (unit == -1) BUS_HINT_DEVICE_UNIT(device_get_parent(dev), dev, dc->name, &unit); /* If we were given a wired unit number, check for existing device */ /* XXX imp XXX */ if (unit != -1) { if (unit >= 0 && unit < dc->maxunit && dc->devices[unit] != NULL) { if (bootverbose) printf("%s: %s%d already exists; skipping it\n", dc->name, dc->name, *unitp); return (EEXIST); } } else { /* Unwired device, find the next available slot for it */ unit = 0; for (unit = 0;; unit++) { /* If there is an "at" hint for a unit then skip it. */ if (resource_string_value(dc->name, unit, "at", &s) == 0) continue; /* If this device slot is already in use, skip it. */ if (unit < dc->maxunit && dc->devices[unit] != NULL) continue; break; } } /* * We've selected a unit beyond the length of the table, so let's * extend the table to make room for all units up to and including * this one. */ if (unit >= dc->maxunit) { device_t *newlist, *oldlist; int newsize; oldlist = dc->devices; newsize = roundup((unit + 1), MINALLOCSIZE / sizeof(device_t)); newlist = malloc(sizeof(device_t) * newsize, M_BUS, M_NOWAIT); if (!newlist) return (ENOMEM); if (oldlist != NULL) bcopy(oldlist, newlist, sizeof(device_t) * dc->maxunit); bzero(newlist + dc->maxunit, sizeof(device_t) * (newsize - dc->maxunit)); dc->devices = newlist; dc->maxunit = newsize; if (oldlist != NULL) free(oldlist, M_BUS); } PDEBUG(("now: unit %d in devclass %s", unit, DEVCLANAME(dc))); *unitp = unit; return (0); } /** * @internal * @brief Add a device to a devclass * * A unit number is allocated for the device (using the device's * preferred unit number if any) and the device is registered in the * devclass. This allows the device to be looked up by its unit * number, e.g. by decoding a dev_t minor number. * * @param dc the devclass to add to * @param dev the device to add * * @retval 0 success * @retval EEXIST the requested unit number is already allocated * @retval ENOMEM memory allocation failure */ static int devclass_add_device(devclass_t dc, device_t dev) { int buflen, error; PDEBUG(("%s in devclass %s", DEVICENAME(dev), DEVCLANAME(dc))); buflen = snprintf(NULL, 0, "%s%d$", dc->name, INT_MAX); if (buflen < 0) return (ENOMEM); dev->nameunit = malloc(buflen, M_BUS, M_NOWAIT|M_ZERO); if (!dev->nameunit) return (ENOMEM); if ((error = devclass_alloc_unit(dc, dev, &dev->unit)) != 0) { free(dev->nameunit, M_BUS); dev->nameunit = NULL; return (error); } dc->devices[dev->unit] = dev; dev->devclass = dc; snprintf(dev->nameunit, buflen, "%s%d", dc->name, dev->unit); return (0); } /** * @internal * @brief Delete a device from a devclass * * The device is removed from the devclass's device list and its unit * number is freed. * @param dc the devclass to delete from * @param dev the device to delete * * @retval 0 success */ static int devclass_delete_device(devclass_t dc, device_t dev) { if (!dc || !dev) return (0); PDEBUG(("%s in devclass %s", DEVICENAME(dev), DEVCLANAME(dc))); if (dev->devclass != dc || dc->devices[dev->unit] != dev) panic("devclass_delete_device: inconsistent device class"); dc->devices[dev->unit] = NULL; if (dev->flags & DF_WILDCARD) dev->unit = -1; dev->devclass = NULL; free(dev->nameunit, M_BUS); dev->nameunit = NULL; return (0); } /** * @internal * @brief Make a new device and add it as a child of @p parent * * @param parent the parent of the new device * @param name the devclass name of the new device or @c NULL * to leave the devclass unspecified * @parem unit the unit number of the new device of @c -1 to * leave the unit number unspecified * * @returns the new device */ static device_t make_device(device_t parent, const char *name, int unit) { device_t dev; devclass_t dc; PDEBUG(("%s at %s as unit %d", name, DEVICENAME(parent), unit)); if (name) { dc = devclass_find_internal(name, NULL, TRUE); if (!dc) { printf("make_device: can't find device class %s\n", name); return (NULL); } } else { dc = NULL; } dev = malloc(sizeof(struct device), M_BUS, M_NOWAIT|M_ZERO); if (!dev) return (NULL); dev->parent = parent; TAILQ_INIT(&dev->children); kobj_init((kobj_t) dev, &null_class); dev->driver = NULL; dev->devclass = NULL; dev->unit = unit; dev->nameunit = NULL; dev->desc = NULL; dev->busy = 0; dev->devflags = 0; dev->flags = DF_ENABLED; dev->order = 0; if (unit == -1) dev->flags |= DF_WILDCARD; if (name) { dev->flags |= DF_FIXEDCLASS; if (devclass_add_device(dc, dev)) { kobj_delete((kobj_t) dev, M_BUS); return (NULL); } } dev->ivars = NULL; dev->softc = NULL; dev->state = DS_NOTPRESENT; TAILQ_INSERT_TAIL(&bus_data_devices, dev, devlink); bus_data_generation_update(); return (dev); } /** * @internal * @brief Print a description of a device. */ static int device_print_child(device_t dev, device_t child) { int retval = 0; if (device_is_alive(child)) retval += BUS_PRINT_CHILD(dev, child); else retval += device_printf(child, " not found\n"); return (retval); } /** * @brief Create a new device * * This creates a new device and adds it as a child of an existing * parent device. The new device will be added after the last existing * child with order zero. * * @param dev the device which will be the parent of the * new child device * @param name devclass name for new device or @c NULL if not * specified * @param unit unit number for new device or @c -1 if not * specified * * @returns the new device */ device_t device_add_child(device_t dev, const char *name, int unit) { return (device_add_child_ordered(dev, 0, name, unit)); } /** * @brief Create a new device * * This creates a new device and adds it as a child of an existing * parent device. The new device will be added after the last existing * child with the same order. * * @param dev the device which will be the parent of the * new child device * @param order a value which is used to partially sort the * children of @p dev - devices created using * lower values of @p order appear first in @p * dev's list of children * @param name devclass name for new device or @c NULL if not * specified * @param unit unit number for new device or @c -1 if not * specified * * @returns the new device */ device_t device_add_child_ordered(device_t dev, u_int order, const char *name, int unit) { device_t child; device_t place; PDEBUG(("%s at %s with order %u as unit %d", name, DEVICENAME(dev), order, unit)); KASSERT(name != NULL || unit == -1, ("child device with wildcard name and specific unit number")); child = make_device(dev, name, unit); if (child == NULL) return (child); child->order = order; TAILQ_FOREACH(place, &dev->children, link) { if (place->order > order) break; } if (place) { /* * The device 'place' is the first device whose order is * greater than the new child. */ TAILQ_INSERT_BEFORE(place, child, link); } else { /* * The new child's order is greater or equal to the order of * any existing device. Add the child to the tail of the list. */ TAILQ_INSERT_TAIL(&dev->children, child, link); } bus_data_generation_update(); return (child); } /** * @brief Delete a device * * This function deletes a device along with all of its children. If * the device currently has a driver attached to it, the device is * detached first using device_detach(). * * @param dev the parent device * @param child the device to delete * * @retval 0 success * @retval non-zero a unit error code describing the error */ int device_delete_child(device_t dev, device_t child) { int error; device_t grandchild; PDEBUG(("%s from %s", DEVICENAME(child), DEVICENAME(dev))); /* remove children first */ while ((grandchild = TAILQ_FIRST(&child->children)) != NULL) { error = device_delete_child(child, grandchild); if (error) return (error); } if ((error = device_detach(child)) != 0) return (error); if (child->devclass) devclass_delete_device(child->devclass, child); if (child->parent) BUS_CHILD_DELETED(dev, child); TAILQ_REMOVE(&dev->children, child, link); TAILQ_REMOVE(&bus_data_devices, child, devlink); kobj_delete((kobj_t) child, M_BUS); bus_data_generation_update(); return (0); } /** * @brief Delete all children devices of the given device, if any. * * This function deletes all children devices of the given device, if * any, using the device_delete_child() function for each device it * finds. If a child device cannot be deleted, this function will * return an error code. * * @param dev the parent device * * @retval 0 success * @retval non-zero a device would not detach */ int device_delete_children(device_t dev) { device_t child; int error; PDEBUG(("Deleting all children of %s", DEVICENAME(dev))); error = 0; while ((child = TAILQ_FIRST(&dev->children)) != NULL) { error = device_delete_child(dev, child); if (error) { PDEBUG(("Failed deleting %s", DEVICENAME(child))); break; } } return (error); } /** * @brief Find a device given a unit number * * This is similar to devclass_get_devices() but only searches for * devices which have @p dev as a parent. * * @param dev the parent device to search * @param unit the unit number to search for. If the unit is -1, * return the first child of @p dev which has name * @p classname (that is, the one with the lowest unit.) * * @returns the device with the given unit number or @c * NULL if there is no such device */ device_t device_find_child(device_t dev, const char *classname, int unit) { devclass_t dc; device_t child; dc = devclass_find(classname); if (!dc) return (NULL); if (unit != -1) { child = devclass_get_device(dc, unit); if (child && child->parent == dev) return (child); } else { for (unit = 0; unit < devclass_get_maxunit(dc); unit++) { child = devclass_get_device(dc, unit); if (child && child->parent == dev) return (child); } } return (NULL); } /** * @internal */ static driverlink_t first_matching_driver(devclass_t dc, device_t dev) { if (dev->devclass) return (devclass_find_driver_internal(dc, dev->devclass->name)); return (TAILQ_FIRST(&dc->drivers)); } /** * @internal */ static driverlink_t next_matching_driver(devclass_t dc, device_t dev, driverlink_t last) { if (dev->devclass) { driverlink_t dl; for (dl = TAILQ_NEXT(last, link); dl; dl = TAILQ_NEXT(dl, link)) if (!strcmp(dev->devclass->name, dl->driver->name)) return (dl); return (NULL); } return (TAILQ_NEXT(last, link)); } /** * @internal */ int device_probe_child(device_t dev, device_t child) { devclass_t dc; driverlink_t best = NULL; driverlink_t dl; int result, pri = 0; int hasclass = (child->devclass != NULL); GIANT_REQUIRED; dc = dev->devclass; if (!dc) panic("device_probe_child: parent device has no devclass"); /* * If the state is already probed, then return. However, don't * return if we can rebid this object. */ if (child->state == DS_ALIVE && (child->flags & DF_REBID) == 0) return (0); for (; dc; dc = dc->parent) { for (dl = first_matching_driver(dc, child); dl; dl = next_matching_driver(dc, child, dl)) { /* If this driver's pass is too high, then ignore it. */ if (dl->pass > bus_current_pass) continue; PDEBUG(("Trying %s", DRIVERNAME(dl->driver))); result = device_set_driver(child, dl->driver); if (result == ENOMEM) return (result); else if (result != 0) continue; if (!hasclass) { if (device_set_devclass(child, dl->driver->name) != 0) { char const * devname = device_get_name(child); if (devname == NULL) devname = "(unknown)"; printf("driver bug: Unable to set " "devclass (class: %s " "devname: %s)\n", dl->driver->name, devname); (void)device_set_driver(child, NULL); continue; } } /* Fetch any flags for the device before probing. */ resource_int_value(dl->driver->name, child->unit, "flags", &child->devflags); result = DEVICE_PROBE(child); /* Reset flags and devclass before the next probe. */ child->devflags = 0; if (!hasclass) (void)device_set_devclass(child, NULL); /* * If the driver returns SUCCESS, there can be * no higher match for this device. */ if (result == 0) { best = dl; pri = 0; break; } /* + * Probes that return BUS_PROBE_NOWILDCARD or lower + * only match on devices whose driver was explicitly + * specified. + */ + if (result <= BUS_PROBE_NOWILDCARD && + !(child->flags & DF_FIXEDCLASS)) { + result = ENXIO; + } + + /* * The driver returned an error so it * certainly doesn't match. */ if (result > 0) { (void)device_set_driver(child, NULL); continue; } /* * A priority lower than SUCCESS, remember the * best matching driver. Initialise the value * of pri for the first match. */ if (best == NULL || result > pri) { - /* - * Probes that return BUS_PROBE_NOWILDCARD - * or lower only match on devices whose - * driver was explicitly specified. - */ - if (result <= BUS_PROBE_NOWILDCARD && - !(child->flags & DF_FIXEDCLASS)) - continue; best = dl; pri = result; continue; } } /* * If we have an unambiguous match in this devclass, * don't look in the parent. */ if (best && pri == 0) break; } /* * If we found a driver, change state and initialise the devclass. */ /* XXX What happens if we rebid and got no best? */ if (best) { /* * If this device was attached, and we were asked to * rescan, and it is a different driver, then we have * to detach the old driver and reattach this new one. * Note, we don't have to check for DF_REBID here * because if the state is > DS_ALIVE, we know it must * be. * * This assumes that all DF_REBID drivers can have * their probe routine called at any time and that * they are idempotent as well as completely benign in * normal operations. * * We also have to make sure that the detach * succeeded, otherwise we fail the operation (or * maybe it should just fail silently? I'm torn). */ if (child->state > DS_ALIVE && best->driver != child->driver) if ((result = device_detach(dev)) != 0) return (result); /* Set the winning driver, devclass, and flags. */ if (!child->devclass) { result = device_set_devclass(child, best->driver->name); if (result != 0) return (result); } result = device_set_driver(child, best->driver); if (result != 0) return (result); resource_int_value(best->driver->name, child->unit, "flags", &child->devflags); if (pri < 0) { /* * A bit bogus. Call the probe method again to make * sure that we have the right description. */ DEVICE_PROBE(child); #if 0 child->flags |= DF_REBID; #endif } else child->flags &= ~DF_REBID; child->state = DS_ALIVE; bus_data_generation_update(); return (0); } return (ENXIO); } /** * @brief Return the parent of a device */ device_t device_get_parent(device_t dev) { return (dev->parent); } /** * @brief Get a list of children of a device * * An array containing a list of all the children of the given device * is allocated and returned in @p *devlistp. The number of devices * in the array is returned in @p *devcountp. The caller should free * the array using @c free(p, M_TEMP). * * @param dev the device to examine * @param devlistp points at location for array pointer return * value * @param devcountp points at location for array size return value * * @retval 0 success * @retval ENOMEM the array allocation failed */ int device_get_children(device_t dev, device_t **devlistp, int *devcountp) { int count; device_t child; device_t *list; count = 0; TAILQ_FOREACH(child, &dev->children, link) { count++; } if (count == 0) { *devlistp = NULL; *devcountp = 0; return (0); } list = malloc(count * sizeof(device_t), M_TEMP, M_NOWAIT|M_ZERO); if (!list) return (ENOMEM); count = 0; TAILQ_FOREACH(child, &dev->children, link) { list[count] = child; count++; } *devlistp = list; *devcountp = count; return (0); } /** * @brief Return the current driver for the device or @c NULL if there * is no driver currently attached */ driver_t * device_get_driver(device_t dev) { return (dev->driver); } /** * @brief Return the current devclass for the device or @c NULL if * there is none. */ devclass_t device_get_devclass(device_t dev) { return (dev->devclass); } /** * @brief Return the name of the device's devclass or @c NULL if there * is none. */ const char * device_get_name(device_t dev) { if (dev != NULL && dev->devclass) return (devclass_get_name(dev->devclass)); return (NULL); } /** * @brief Return a string containing the device's devclass name * followed by an ascii representation of the device's unit number * (e.g. @c "foo2"). */ const char * device_get_nameunit(device_t dev) { return (dev->nameunit); } /** * @brief Return the device's unit number. */ int device_get_unit(device_t dev) { return (dev->unit); } /** * @brief Return the device's description string */ const char * device_get_desc(device_t dev) { return (dev->desc); } /** * @brief Return the device's flags */ uint32_t device_get_flags(device_t dev) { return (dev->devflags); } struct sysctl_ctx_list * device_get_sysctl_ctx(device_t dev) { return (&dev->sysctl_ctx); } struct sysctl_oid * device_get_sysctl_tree(device_t dev) { return (dev->sysctl_tree); } /** * @brief Print the name of the device followed by a colon and a space * * @returns the number of characters printed */ int device_print_prettyname(device_t dev) { const char *name = device_get_name(dev); if (name == NULL) return (printf("unknown: ")); return (printf("%s%d: ", name, device_get_unit(dev))); } /** * @brief Print the name of the device followed by a colon, a space * and the result of calling vprintf() with the value of @p fmt and * the following arguments. * * @returns the number of characters printed */ int device_printf(device_t dev, const char * fmt, ...) { va_list ap; int retval; retval = device_print_prettyname(dev); va_start(ap, fmt); retval += vprintf(fmt, ap); va_end(ap); return (retval); } /** * @internal */ static void device_set_desc_internal(device_t dev, const char* desc, int copy) { if (dev->desc && (dev->flags & DF_DESCMALLOCED)) { free(dev->desc, M_BUS); dev->flags &= ~DF_DESCMALLOCED; dev->desc = NULL; } if (copy && desc) { dev->desc = malloc(strlen(desc) + 1, M_BUS, M_NOWAIT); if (dev->desc) { strcpy(dev->desc, desc); dev->flags |= DF_DESCMALLOCED; } } else { /* Avoid a -Wcast-qual warning */ dev->desc = (char *)(uintptr_t) desc; } bus_data_generation_update(); } /** * @brief Set the device's description * * The value of @c desc should be a string constant that will not * change (at least until the description is changed in a subsequent * call to device_set_desc() or device_set_desc_copy()). */ void device_set_desc(device_t dev, const char* desc) { device_set_desc_internal(dev, desc, FALSE); } /** * @brief Set the device's description * * The string pointed to by @c desc is copied. Use this function if * the device description is generated, (e.g. with sprintf()). */ void device_set_desc_copy(device_t dev, const char* desc) { device_set_desc_internal(dev, desc, TRUE); } /** * @brief Set the device's flags */ void device_set_flags(device_t dev, uint32_t flags) { dev->devflags = flags; } /** * @brief Return the device's softc field * * The softc is allocated and zeroed when a driver is attached, based * on the size field of the driver. */ void * device_get_softc(device_t dev) { return (dev->softc); } /** * @brief Set the device's softc field * * Most drivers do not need to use this since the softc is allocated * automatically when the driver is attached. */ void device_set_softc(device_t dev, void *softc) { if (dev->softc && !(dev->flags & DF_EXTERNALSOFTC)) free(dev->softc, M_BUS_SC); dev->softc = softc; if (dev->softc) dev->flags |= DF_EXTERNALSOFTC; else dev->flags &= ~DF_EXTERNALSOFTC; } /** * @brief Free claimed softc * * Most drivers do not need to use this since the softc is freed * automatically when the driver is detached. */ void device_free_softc(void *softc) { free(softc, M_BUS_SC); } /** * @brief Claim softc * * This function can be used to let the driver free the automatically * allocated softc using "device_free_softc()". This function is * useful when the driver is refcounting the softc and the softc * cannot be freed when the "device_detach" method is called. */ void device_claim_softc(device_t dev) { if (dev->softc) dev->flags |= DF_EXTERNALSOFTC; else dev->flags &= ~DF_EXTERNALSOFTC; } /** * @brief Get the device's ivars field * * The ivars field is used by the parent device to store per-device * state (e.g. the physical location of the device or a list of * resources). */ void * device_get_ivars(device_t dev) { KASSERT(dev != NULL, ("device_get_ivars(NULL, ...)")); return (dev->ivars); } /** * @brief Set the device's ivars field */ void device_set_ivars(device_t dev, void * ivars) { KASSERT(dev != NULL, ("device_set_ivars(NULL, ...)")); dev->ivars = ivars; } /** * @brief Return the device's state */ device_state_t device_get_state(device_t dev) { return (dev->state); } /** * @brief Set the DF_ENABLED flag for the device */ void device_enable(device_t dev) { dev->flags |= DF_ENABLED; } /** * @brief Clear the DF_ENABLED flag for the device */ void device_disable(device_t dev) { dev->flags &= ~DF_ENABLED; } /** * @brief Increment the busy counter for the device */ void device_busy(device_t dev) { if (dev->state < DS_ATTACHING) panic("device_busy: called for unattached device"); if (dev->busy == 0 && dev->parent) device_busy(dev->parent); dev->busy++; if (dev->state == DS_ATTACHED) dev->state = DS_BUSY; } /** * @brief Decrement the busy counter for the device */ void device_unbusy(device_t dev) { if (dev->busy != 0 && dev->state != DS_BUSY && dev->state != DS_ATTACHING) panic("device_unbusy: called for non-busy device %s", device_get_nameunit(dev)); dev->busy--; if (dev->busy == 0) { if (dev->parent) device_unbusy(dev->parent); if (dev->state == DS_BUSY) dev->state = DS_ATTACHED; } } /** * @brief Set the DF_QUIET flag for the device */ void device_quiet(device_t dev) { dev->flags |= DF_QUIET; } /** * @brief Clear the DF_QUIET flag for the device */ void device_verbose(device_t dev) { dev->flags &= ~DF_QUIET; } /** * @brief Return non-zero if the DF_QUIET flag is set on the device */ int device_is_quiet(device_t dev) { return ((dev->flags & DF_QUIET) != 0); } /** * @brief Return non-zero if the DF_ENABLED flag is set on the device */ int device_is_enabled(device_t dev) { return ((dev->flags & DF_ENABLED) != 0); } /** * @brief Return non-zero if the device was successfully probed */ int device_is_alive(device_t dev) { return (dev->state >= DS_ALIVE); } /** * @brief Return non-zero if the device currently has a driver * attached to it */ int device_is_attached(device_t dev) { return (dev->state >= DS_ATTACHED); } /** * @brief Return non-zero if the device is currently suspended. */ int device_is_suspended(device_t dev) { return ((dev->flags & DF_SUSPENDED) != 0); } /** * @brief Set the devclass of a device * @see devclass_add_device(). */ int device_set_devclass(device_t dev, const char *classname) { devclass_t dc; int error; if (!classname) { if (dev->devclass) devclass_delete_device(dev->devclass, dev); return (0); } if (dev->devclass) { printf("device_set_devclass: device class already set\n"); return (EINVAL); } dc = devclass_find_internal(classname, NULL, TRUE); if (!dc) return (ENOMEM); error = devclass_add_device(dc, dev); bus_data_generation_update(); return (error); } /** * @brief Set the devclass of a device and mark the devclass fixed. * @see device_set_devclass() */ int device_set_devclass_fixed(device_t dev, const char *classname) { int error; if (classname == NULL) return (EINVAL); error = device_set_devclass(dev, classname); if (error) return (error); dev->flags |= DF_FIXEDCLASS; return (0); } /** * @brief Set the driver of a device * * @retval 0 success * @retval EBUSY the device already has a driver attached * @retval ENOMEM a memory allocation failure occurred */ int device_set_driver(device_t dev, driver_t *driver) { if (dev->state >= DS_ATTACHED) return (EBUSY); if (dev->driver == driver) return (0); if (dev->softc && !(dev->flags & DF_EXTERNALSOFTC)) { free(dev->softc, M_BUS_SC); dev->softc = NULL; } device_set_desc(dev, NULL); kobj_delete((kobj_t) dev, NULL); dev->driver = driver; if (driver) { kobj_init((kobj_t) dev, (kobj_class_t) driver); if (!(dev->flags & DF_EXTERNALSOFTC) && driver->size > 0) { dev->softc = malloc(driver->size, M_BUS_SC, M_NOWAIT | M_ZERO); if (!dev->softc) { kobj_delete((kobj_t) dev, NULL); kobj_init((kobj_t) dev, &null_class); dev->driver = NULL; return (ENOMEM); } } } else { kobj_init((kobj_t) dev, &null_class); } bus_data_generation_update(); return (0); } /** * @brief Probe a device, and return this status. * * This function is the core of the device autoconfiguration * system. Its purpose is to select a suitable driver for a device and * then call that driver to initialise the hardware appropriately. The * driver is selected by calling the DEVICE_PROBE() method of a set of * candidate drivers and then choosing the driver which returned the * best value. This driver is then attached to the device using * device_attach(). * * The set of suitable drivers is taken from the list of drivers in * the parent device's devclass. If the device was originally created * with a specific class name (see device_add_child()), only drivers * with that name are probed, otherwise all drivers in the devclass * are probed. If no drivers return successful probe values in the * parent devclass, the search continues in the parent of that * devclass (see devclass_get_parent()) if any. * * @param dev the device to initialise * * @retval 0 success * @retval ENXIO no driver was found * @retval ENOMEM memory allocation failure * @retval non-zero some other unix error code * @retval -1 Device already attached */ int device_probe(device_t dev) { int error; GIANT_REQUIRED; if (dev->state >= DS_ALIVE && (dev->flags & DF_REBID) == 0) return (-1); if (!(dev->flags & DF_ENABLED)) { if (bootverbose && device_get_name(dev) != NULL) { device_print_prettyname(dev); printf("not probed (disabled)\n"); } return (-1); } if ((error = device_probe_child(dev->parent, dev)) != 0) { if (bus_current_pass == BUS_PASS_DEFAULT && !(dev->flags & DF_DONENOMATCH)) { BUS_PROBE_NOMATCH(dev->parent, dev); devnomatch(dev); dev->flags |= DF_DONENOMATCH; } return (error); } return (0); } /** * @brief Probe a device and attach a driver if possible * * calls device_probe() and attaches if that was successful. */ int device_probe_and_attach(device_t dev) { int error; GIANT_REQUIRED; error = device_probe(dev); if (error == -1) return (0); else if (error != 0) return (error); CURVNET_SET_QUIET(vnet0); error = device_attach(dev); CURVNET_RESTORE(); return error; } /** * @brief Attach a device driver to a device * * This function is a wrapper around the DEVICE_ATTACH() driver * method. In addition to calling DEVICE_ATTACH(), it initialises the * device's sysctl tree, optionally prints a description of the device * and queues a notification event for user-based device management * services. * * Normally this function is only called internally from * device_probe_and_attach(). * * @param dev the device to initialise * * @retval 0 success * @retval ENXIO no driver was found * @retval ENOMEM memory allocation failure * @retval non-zero some other unix error code */ int device_attach(device_t dev) { uint64_t attachtime; int error; if (resource_disabled(dev->driver->name, dev->unit)) { device_disable(dev); if (bootverbose) device_printf(dev, "disabled via hints entry\n"); return (ENXIO); } device_sysctl_init(dev); if (!device_is_quiet(dev)) device_print_child(dev->parent, dev); attachtime = get_cyclecount(); dev->state = DS_ATTACHING; if ((error = DEVICE_ATTACH(dev)) != 0) { printf("device_attach: %s%d attach returned %d\n", dev->driver->name, dev->unit, error); if (!(dev->flags & DF_FIXEDCLASS)) devclass_delete_device(dev->devclass, dev); (void)device_set_driver(dev, NULL); device_sysctl_fini(dev); KASSERT(dev->busy == 0, ("attach failed but busy")); dev->state = DS_NOTPRESENT; return (error); } attachtime = get_cyclecount() - attachtime; /* * 4 bits per device is a reasonable value for desktop and server * hardware with good get_cyclecount() implementations, but may * need to be adjusted on other platforms. */ #ifdef RANDOM_DEBUG printf("random: %s(): feeding %d bit(s) of entropy from %s%d\n", __func__, 4, dev->driver->name, dev->unit); #endif random_harvest(&attachtime, sizeof(attachtime), 4, RANDOM_ATTACH); device_sysctl_update(dev); if (dev->busy) dev->state = DS_BUSY; else dev->state = DS_ATTACHED; dev->flags &= ~DF_DONENOMATCH; devadded(dev); return (0); } /** * @brief Detach a driver from a device * * This function is a wrapper around the DEVICE_DETACH() driver * method. If the call to DEVICE_DETACH() succeeds, it calls * BUS_CHILD_DETACHED() for the parent of @p dev, queues a * notification event for user-based device management services and * cleans up the device's sysctl tree. * * @param dev the device to un-initialise * * @retval 0 success * @retval ENXIO no driver was found * @retval ENOMEM memory allocation failure * @retval non-zero some other unix error code */ int device_detach(device_t dev) { int error; GIANT_REQUIRED; PDEBUG(("%s", DEVICENAME(dev))); if (dev->state == DS_BUSY) return (EBUSY); if (dev->state != DS_ATTACHED) return (0); if ((error = DEVICE_DETACH(dev)) != 0) return (error); devremoved(dev); if (!device_is_quiet(dev)) device_printf(dev, "detached\n"); if (dev->parent) BUS_CHILD_DETACHED(dev->parent, dev); if (!(dev->flags & DF_FIXEDCLASS)) devclass_delete_device(dev->devclass, dev); dev->state = DS_NOTPRESENT; (void)device_set_driver(dev, NULL); device_sysctl_fini(dev); return (0); } /** * @brief Tells a driver to quiesce itself. * * This function is a wrapper around the DEVICE_QUIESCE() driver * method. If the call to DEVICE_QUIESCE() succeeds. * * @param dev the device to quiesce * * @retval 0 success * @retval ENXIO no driver was found * @retval ENOMEM memory allocation failure * @retval non-zero some other unix error code */ int device_quiesce(device_t dev) { PDEBUG(("%s", DEVICENAME(dev))); if (dev->state == DS_BUSY) return (EBUSY); if (dev->state != DS_ATTACHED) return (0); return (DEVICE_QUIESCE(dev)); } /** * @brief Notify a device of system shutdown * * This function calls the DEVICE_SHUTDOWN() driver method if the * device currently has an attached driver. * * @returns the value returned by DEVICE_SHUTDOWN() */ int device_shutdown(device_t dev) { if (dev->state < DS_ATTACHED) return (0); return (DEVICE_SHUTDOWN(dev)); } /** * @brief Set the unit number of a device * * This function can be used to override the unit number used for a * device (e.g. to wire a device to a pre-configured unit number). */ int device_set_unit(device_t dev, int unit) { devclass_t dc; int err; dc = device_get_devclass(dev); if (unit < dc->maxunit && dc->devices[unit]) return (EBUSY); err = devclass_delete_device(dc, dev); if (err) return (err); dev->unit = unit; err = devclass_add_device(dc, dev); if (err) return (err); bus_data_generation_update(); return (0); } /*======================================*/ /* * Some useful method implementations to make life easier for bus drivers. */ /** * @brief Initialise a resource list. * * @param rl the resource list to initialise */ void resource_list_init(struct resource_list *rl) { STAILQ_INIT(rl); } /** * @brief Reclaim memory used by a resource list. * * This function frees the memory for all resource entries on the list * (if any). * * @param rl the resource list to free */ void resource_list_free(struct resource_list *rl) { struct resource_list_entry *rle; while ((rle = STAILQ_FIRST(rl)) != NULL) { if (rle->res) panic("resource_list_free: resource entry is busy"); STAILQ_REMOVE_HEAD(rl, link); free(rle, M_BUS); } } /** * @brief Add a resource entry. * * This function adds a resource entry using the given @p type, @p * start, @p end and @p count values. A rid value is chosen by * searching sequentially for the first unused rid starting at zero. * * @param rl the resource list to edit * @param type the resource entry type (e.g. SYS_RES_MEMORY) * @param start the start address of the resource * @param end the end address of the resource * @param count XXX end-start+1 */ int resource_list_add_next(struct resource_list *rl, int type, u_long start, u_long end, u_long count) { int rid; rid = 0; while (resource_list_find(rl, type, rid) != NULL) rid++; resource_list_add(rl, type, rid, start, end, count); return (rid); } /** * @brief Add or modify a resource entry. * * If an existing entry exists with the same type and rid, it will be * modified using the given values of @p start, @p end and @p * count. If no entry exists, a new one will be created using the * given values. The resource list entry that matches is then returned. * * @param rl the resource list to edit * @param type the resource entry type (e.g. SYS_RES_MEMORY) * @param rid the resource identifier * @param start the start address of the resource * @param end the end address of the resource * @param count XXX end-start+1 */ struct resource_list_entry * resource_list_add(struct resource_list *rl, int type, int rid, u_long start, u_long end, u_long count) { struct resource_list_entry *rle; rle = resource_list_find(rl, type, rid); if (!rle) { rle = malloc(sizeof(struct resource_list_entry), M_BUS, M_NOWAIT); if (!rle) panic("resource_list_add: can't record entry"); STAILQ_INSERT_TAIL(rl, rle, link); rle->type = type; rle->rid = rid; rle->res = NULL; rle->flags = 0; } if (rle->res) panic("resource_list_add: resource entry is busy"); rle->start = start; rle->end = end; rle->count = count; return (rle); } /** * @brief Determine if a resource entry is busy. * * Returns true if a resource entry is busy meaning that it has an * associated resource that is not an unallocated "reserved" resource. * * @param rl the resource list to search * @param type the resource entry type (e.g. SYS_RES_MEMORY) * @param rid the resource identifier * * @returns Non-zero if the entry is busy, zero otherwise. */ int resource_list_busy(struct resource_list *rl, int type, int rid) { struct resource_list_entry *rle; rle = resource_list_find(rl, type, rid); if (rle == NULL || rle->res == NULL) return (0); if ((rle->flags & (RLE_RESERVED | RLE_ALLOCATED)) == RLE_RESERVED) { KASSERT(!(rman_get_flags(rle->res) & RF_ACTIVE), ("reserved resource is active")); return (0); } return (1); } /** * @brief Determine if a resource entry is reserved. * * Returns true if a resource entry is reserved meaning that it has an * associated "reserved" resource. The resource can either be * allocated or unallocated. * * @param rl the resource list to search * @param type the resource entry type (e.g. SYS_RES_MEMORY) * @param rid the resource identifier * * @returns Non-zero if the entry is reserved, zero otherwise. */ int resource_list_reserved(struct resource_list *rl, int type, int rid) { struct resource_list_entry *rle; rle = resource_list_find(rl, type, rid); if (rle != NULL && rle->flags & RLE_RESERVED) return (1); return (0); } /** * @brief Find a resource entry by type and rid. * * @param rl the resource list to search * @param type the resource entry type (e.g. SYS_RES_MEMORY) * @param rid the resource identifier * * @returns the resource entry pointer or NULL if there is no such * entry. */ struct resource_list_entry * resource_list_find(struct resource_list *rl, int type, int rid) { struct resource_list_entry *rle; STAILQ_FOREACH(rle, rl, link) { if (rle->type == type && rle->rid == rid) return (rle); } return (NULL); } /** * @brief Delete a resource entry. * * @param rl the resource list to edit * @param type the resource entry type (e.g. SYS_RES_MEMORY) * @param rid the resource identifier */ void resource_list_delete(struct resource_list *rl, int type, int rid) { struct resource_list_entry *rle = resource_list_find(rl, type, rid); if (rle) { if (rle->res != NULL) panic("resource_list_delete: resource has not been released"); STAILQ_REMOVE(rl, rle, resource_list_entry, link); free(rle, M_BUS); } } /** * @brief Allocate a reserved resource * * This can be used by busses to force the allocation of resources * that are always active in the system even if they are not allocated * by a driver (e.g. PCI BARs). This function is usually called when * adding a new child to the bus. The resource is allocated from the * parent bus when it is reserved. The resource list entry is marked * with RLE_RESERVED to note that it is a reserved resource. * * Subsequent attempts to allocate the resource with * resource_list_alloc() will succeed the first time and will set * RLE_ALLOCATED to note that it has been allocated. When a reserved * resource that has been allocated is released with * resource_list_release() the resource RLE_ALLOCATED is cleared, but * the actual resource remains allocated. The resource can be released to * the parent bus by calling resource_list_unreserve(). * * @param rl the resource list to allocate from * @param bus the parent device of @p child * @param child the device for which the resource is being reserved * @param type the type of resource to allocate * @param rid a pointer to the resource identifier * @param start hint at the start of the resource range - pass * @c 0UL for any start address * @param end hint at the end of the resource range - pass * @c ~0UL for any end address * @param count hint at the size of range required - pass @c 1 * for any size * @param flags any extra flags to control the resource * allocation - see @c RF_XXX flags in * for details * * @returns the resource which was allocated or @c NULL if no * resource could be allocated */ struct resource * resource_list_reserve(struct resource_list *rl, device_t bus, device_t child, int type, int *rid, u_long start, u_long end, u_long count, u_int flags) { struct resource_list_entry *rle = NULL; int passthrough = (device_get_parent(child) != bus); struct resource *r; if (passthrough) panic( "resource_list_reserve() should only be called for direct children"); if (flags & RF_ACTIVE) panic( "resource_list_reserve() should only reserve inactive resources"); r = resource_list_alloc(rl, bus, child, type, rid, start, end, count, flags); if (r != NULL) { rle = resource_list_find(rl, type, *rid); rle->flags |= RLE_RESERVED; } return (r); } /** * @brief Helper function for implementing BUS_ALLOC_RESOURCE() * * Implement BUS_ALLOC_RESOURCE() by looking up a resource from the list * and passing the allocation up to the parent of @p bus. This assumes * that the first entry of @c device_get_ivars(child) is a struct * resource_list. This also handles 'passthrough' allocations where a * child is a remote descendant of bus by passing the allocation up to * the parent of bus. * * Typically, a bus driver would store a list of child resources * somewhere in the child device's ivars (see device_get_ivars()) and * its implementation of BUS_ALLOC_RESOURCE() would find that list and * then call resource_list_alloc() to perform the allocation. * * @param rl the resource list to allocate from * @param bus the parent device of @p child * @param child the device which is requesting an allocation * @param type the type of resource to allocate * @param rid a pointer to the resource identifier * @param start hint at the start of the resource range - pass * @c 0UL for any start address * @param end hint at the end of the resource range - pass * @c ~0UL for any end address * @param count hint at the size of range required - pass @c 1 * for any size * @param flags any extra flags to control the resource * allocation - see @c RF_XXX flags in * for details * * @returns the resource which was allocated or @c NULL if no * resource could be allocated */ struct resource * resource_list_alloc(struct resource_list *rl, device_t bus, device_t child, int type, int *rid, u_long start, u_long end, u_long count, u_int flags) { struct resource_list_entry *rle = NULL; int passthrough = (device_get_parent(child) != bus); int isdefault = (start == 0UL && end == ~0UL); if (passthrough) { return (BUS_ALLOC_RESOURCE(device_get_parent(bus), child, type, rid, start, end, count, flags)); } rle = resource_list_find(rl, type, *rid); if (!rle) return (NULL); /* no resource of that type/rid */ if (rle->res) { if (rle->flags & RLE_RESERVED) { if (rle->flags & RLE_ALLOCATED) return (NULL); if ((flags & RF_ACTIVE) && bus_activate_resource(child, type, *rid, rle->res) != 0) return (NULL); rle->flags |= RLE_ALLOCATED; return (rle->res); } device_printf(bus, "resource entry %#x type %d for child %s is busy\n", *rid, type, device_get_nameunit(child)); return (NULL); } if (isdefault) { start = rle->start; count = ulmax(count, rle->count); end = ulmax(rle->end, start + count - 1); } rle->res = BUS_ALLOC_RESOURCE(device_get_parent(bus), child, type, rid, start, end, count, flags); /* * Record the new range. */ if (rle->res) { rle->start = rman_get_start(rle->res); rle->end = rman_get_end(rle->res); rle->count = count; } return (rle->res); } /** * @brief Helper function for implementing BUS_RELEASE_RESOURCE() * * Implement BUS_RELEASE_RESOURCE() using a resource list. Normally * used with resource_list_alloc(). * * @param rl the resource list which was allocated from * @param bus the parent device of @p child * @param child the device which is requesting a release * @param type the type of resource to release * @param rid the resource identifier * @param res the resource to release * * @retval 0 success * @retval non-zero a standard unix error code indicating what * error condition prevented the operation */ int resource_list_release(struct resource_list *rl, device_t bus, device_t child, int type, int rid, struct resource *res) { struct resource_list_entry *rle = NULL; int passthrough = (device_get_parent(child) != bus); int error; if (passthrough) { return (BUS_RELEASE_RESOURCE(device_get_parent(bus), child, type, rid, res)); } rle = resource_list_find(rl, type, rid); if (!rle) panic("resource_list_release: can't find resource"); if (!rle->res) panic("resource_list_release: resource entry is not busy"); if (rle->flags & RLE_RESERVED) { if (rle->flags & RLE_ALLOCATED) { if (rman_get_flags(res) & RF_ACTIVE) { error = bus_deactivate_resource(child, type, rid, res); if (error) return (error); } rle->flags &= ~RLE_ALLOCATED; return (0); } return (EINVAL); } error = BUS_RELEASE_RESOURCE(device_get_parent(bus), child, type, rid, res); if (error) return (error); rle->res = NULL; return (0); } /** * @brief Release all active resources of a given type * * Release all active resources of a specified type. This is intended * to be used to cleanup resources leaked by a driver after detach or * a failed attach. * * @param rl the resource list which was allocated from * @param bus the parent device of @p child * @param child the device whose active resources are being released * @param type the type of resources to release * * @retval 0 success * @retval EBUSY at least one resource was active */ int resource_list_release_active(struct resource_list *rl, device_t bus, device_t child, int type) { struct resource_list_entry *rle; int error, retval; retval = 0; STAILQ_FOREACH(rle, rl, link) { if (rle->type != type) continue; if (rle->res == NULL) continue; if ((rle->flags & (RLE_RESERVED | RLE_ALLOCATED)) == RLE_RESERVED) continue; retval = EBUSY; error = resource_list_release(rl, bus, child, type, rman_get_rid(rle->res), rle->res); if (error != 0) device_printf(bus, "Failed to release active resource: %d\n", error); } return (retval); } /** * @brief Fully release a reserved resource * * Fully releases a resource reserved via resource_list_reserve(). * * @param rl the resource list which was allocated from * @param bus the parent device of @p child * @param child the device whose reserved resource is being released * @param type the type of resource to release * @param rid the resource identifier * @param res the resource to release * * @retval 0 success * @retval non-zero a standard unix error code indicating what * error condition prevented the operation */ int resource_list_unreserve(struct resource_list *rl, device_t bus, device_t child, int type, int rid) { struct resource_list_entry *rle = NULL; int passthrough = (device_get_parent(child) != bus); if (passthrough) panic( "resource_list_unreserve() should only be called for direct children"); rle = resource_list_find(rl, type, rid); if (!rle) panic("resource_list_unreserve: can't find resource"); if (!(rle->flags & RLE_RESERVED)) return (EINVAL); if (rle->flags & RLE_ALLOCATED) return (EBUSY); rle->flags &= ~RLE_RESERVED; return (resource_list_release(rl, bus, child, type, rid, rle->res)); } /** * @brief Print a description of resources in a resource list * * Print all resources of a specified type, for use in BUS_PRINT_CHILD(). * The name is printed if at least one resource of the given type is available. * The format is used to print resource start and end. * * @param rl the resource list to print * @param name the name of @p type, e.g. @c "memory" * @param type type type of resource entry to print * @param format printf(9) format string to print resource * start and end values * * @returns the number of characters printed */ int resource_list_print_type(struct resource_list *rl, const char *name, int type, const char *format) { struct resource_list_entry *rle; int printed, retval; printed = 0; retval = 0; /* Yes, this is kinda cheating */ STAILQ_FOREACH(rle, rl, link) { if (rle->type == type) { if (printed == 0) retval += printf(" %s ", name); else retval += printf(","); printed++; retval += printf(format, rle->start); if (rle->count > 1) { retval += printf("-"); retval += printf(format, rle->start + rle->count - 1); } } } return (retval); } /** * @brief Releases all the resources in a list. * * @param rl The resource list to purge. * * @returns nothing */ void resource_list_purge(struct resource_list *rl) { struct resource_list_entry *rle; while ((rle = STAILQ_FIRST(rl)) != NULL) { if (rle->res) bus_release_resource(rman_get_device(rle->res), rle->type, rle->rid, rle->res); STAILQ_REMOVE_HEAD(rl, link); free(rle, M_BUS); } } device_t bus_generic_add_child(device_t dev, u_int order, const char *name, int unit) { return (device_add_child_ordered(dev, order, name, unit)); } /** * @brief Helper function for implementing DEVICE_PROBE() * * This function can be used to help implement the DEVICE_PROBE() for * a bus (i.e. a device which has other devices attached to it). It * calls the DEVICE_IDENTIFY() method of each driver in the device's * devclass. */ int bus_generic_probe(device_t dev) { devclass_t dc = dev->devclass; driverlink_t dl; TAILQ_FOREACH(dl, &dc->drivers, link) { /* * If this driver's pass is too high, then ignore it. * For most drivers in the default pass, this will * never be true. For early-pass drivers they will * only call the identify routines of eligible drivers * when this routine is called. Drivers for later * passes should have their identify routines called * on early-pass busses during BUS_NEW_PASS(). */ if (dl->pass > bus_current_pass) continue; DEVICE_IDENTIFY(dl->driver, dev); } return (0); } /** * @brief Helper function for implementing DEVICE_ATTACH() * * This function can be used to help implement the DEVICE_ATTACH() for * a bus. It calls device_probe_and_attach() for each of the device's * children. */ int bus_generic_attach(device_t dev) { device_t child; TAILQ_FOREACH(child, &dev->children, link) { device_probe_and_attach(child); } return (0); } /** * @brief Helper function for implementing DEVICE_DETACH() * * This function can be used to help implement the DEVICE_DETACH() for * a bus. It calls device_detach() for each of the device's * children. */ int bus_generic_detach(device_t dev) { device_t child; int error; if (dev->state != DS_ATTACHED) return (EBUSY); TAILQ_FOREACH(child, &dev->children, link) { if ((error = device_detach(child)) != 0) return (error); } return (0); } /** * @brief Helper function for implementing DEVICE_SHUTDOWN() * * This function can be used to help implement the DEVICE_SHUTDOWN() * for a bus. It calls device_shutdown() for each of the device's * children. */ int bus_generic_shutdown(device_t dev) { device_t child; TAILQ_FOREACH(child, &dev->children, link) { device_shutdown(child); } return (0); } /** * @brief Default function for suspending a child device. * * This function is to be used by a bus's DEVICE_SUSPEND_CHILD(). */ int bus_generic_suspend_child(device_t dev, device_t child) { int error; error = DEVICE_SUSPEND(child); if (error == 0) child->flags |= DF_SUSPENDED; return (error); } /** * @brief Default function for resuming a child device. * * This function is to be used by a bus's DEVICE_RESUME_CHILD(). */ int bus_generic_resume_child(device_t dev, device_t child) { DEVICE_RESUME(child); child->flags &= ~DF_SUSPENDED; return (0); } /** * @brief Helper function for implementing DEVICE_SUSPEND() * * This function can be used to help implement the DEVICE_SUSPEND() * for a bus. It calls DEVICE_SUSPEND() for each of the device's * children. If any call to DEVICE_SUSPEND() fails, the suspend * operation is aborted and any devices which were suspended are * resumed immediately by calling their DEVICE_RESUME() methods. */ int bus_generic_suspend(device_t dev) { int error; device_t child, child2; TAILQ_FOREACH(child, &dev->children, link) { error = BUS_SUSPEND_CHILD(dev, child); if (error) { for (child2 = TAILQ_FIRST(&dev->children); child2 && child2 != child; child2 = TAILQ_NEXT(child2, link)) BUS_RESUME_CHILD(dev, child2); return (error); } } return (0); } /** * @brief Helper function for implementing DEVICE_RESUME() * * This function can be used to help implement the DEVICE_RESUME() for * a bus. It calls DEVICE_RESUME() on each of the device's children. */ int bus_generic_resume(device_t dev) { device_t child; TAILQ_FOREACH(child, &dev->children, link) { BUS_RESUME_CHILD(dev, child); /* if resume fails, there's nothing we can usefully do... */ } return (0); } /** * @brief Helper function for implementing BUS_PRINT_CHILD(). * * This function prints the first part of the ascii representation of * @p child, including its name, unit and description (if any - see * device_set_desc()). * * @returns the number of characters printed */ int bus_print_child_header(device_t dev, device_t child) { int retval = 0; if (device_get_desc(child)) { retval += device_printf(child, "<%s>", device_get_desc(child)); } else { retval += printf("%s", device_get_nameunit(child)); } return (retval); } /** * @brief Helper function for implementing BUS_PRINT_CHILD(). * * This function prints the last part of the ascii representation of * @p child, which consists of the string @c " on " followed by the * name and unit of the @p dev. * * @returns the number of characters printed */ int bus_print_child_footer(device_t dev, device_t child) { return (printf(" on %s\n", device_get_nameunit(dev))); } /** * @brief Helper function for implementing BUS_PRINT_CHILD(). * * This function prints out the VM domain for the given device. * * @returns the number of characters printed */ int bus_print_child_domain(device_t dev, device_t child) { int domain; /* No domain? Don't print anything */ if (BUS_GET_DOMAIN(dev, child, &domain) != 0) return (0); return (printf(" numa-domain %d", domain)); } /** * @brief Helper function for implementing BUS_PRINT_CHILD(). * * This function simply calls bus_print_child_header() followed by * bus_print_child_footer(). * * @returns the number of characters printed */ int bus_generic_print_child(device_t dev, device_t child) { int retval = 0; retval += bus_print_child_header(dev, child); retval += bus_print_child_domain(dev, child); retval += bus_print_child_footer(dev, child); return (retval); } /** * @brief Stub function for implementing BUS_READ_IVAR(). * * @returns ENOENT */ int bus_generic_read_ivar(device_t dev, device_t child, int index, uintptr_t * result) { return (ENOENT); } /** * @brief Stub function for implementing BUS_WRITE_IVAR(). * * @returns ENOENT */ int bus_generic_write_ivar(device_t dev, device_t child, int index, uintptr_t value) { return (ENOENT); } /** * @brief Stub function for implementing BUS_GET_RESOURCE_LIST(). * * @returns NULL */ struct resource_list * bus_generic_get_resource_list(device_t dev, device_t child) { return (NULL); } /** * @brief Helper function for implementing BUS_DRIVER_ADDED(). * * This implementation of BUS_DRIVER_ADDED() simply calls the driver's * DEVICE_IDENTIFY() method to allow it to add new children to the bus * and then calls device_probe_and_attach() for each unattached child. */ void bus_generic_driver_added(device_t dev, driver_t *driver) { device_t child; DEVICE_IDENTIFY(driver, dev); TAILQ_FOREACH(child, &dev->children, link) { if (child->state == DS_NOTPRESENT || (child->flags & DF_REBID)) device_probe_and_attach(child); } } /** * @brief Helper function for implementing BUS_NEW_PASS(). * * This implementing of BUS_NEW_PASS() first calls the identify * routines for any drivers that probe at the current pass. Then it * walks the list of devices for this bus. If a device is already * attached, then it calls BUS_NEW_PASS() on that device. If the * device is not already attached, it attempts to attach a driver to * it. */ void bus_generic_new_pass(device_t dev) { driverlink_t dl; devclass_t dc; device_t child; dc = dev->devclass; TAILQ_FOREACH(dl, &dc->drivers, link) { if (dl->pass == bus_current_pass) DEVICE_IDENTIFY(dl->driver, dev); } TAILQ_FOREACH(child, &dev->children, link) { if (child->state >= DS_ATTACHED) BUS_NEW_PASS(child); else if (child->state == DS_NOTPRESENT) device_probe_and_attach(child); } } /** * @brief Helper function for implementing BUS_SETUP_INTR(). * * This simple implementation of BUS_SETUP_INTR() simply calls the * BUS_SETUP_INTR() method of the parent of @p dev. */ int bus_generic_setup_intr(device_t dev, device_t child, struct resource *irq, int flags, driver_filter_t *filter, driver_intr_t *intr, void *arg, void **cookiep) { /* Propagate up the bus hierarchy until someone handles it. */ if (dev->parent) return (BUS_SETUP_INTR(dev->parent, child, irq, flags, filter, intr, arg, cookiep)); return (EINVAL); } /** * @brief Helper function for implementing BUS_TEARDOWN_INTR(). * * This simple implementation of BUS_TEARDOWN_INTR() simply calls the * BUS_TEARDOWN_INTR() method of the parent of @p dev. */ int bus_generic_teardown_intr(device_t dev, device_t child, struct resource *irq, void *cookie) { /* Propagate up the bus hierarchy until someone handles it. */ if (dev->parent) return (BUS_TEARDOWN_INTR(dev->parent, child, irq, cookie)); return (EINVAL); } /** * @brief Helper function for implementing BUS_ADJUST_RESOURCE(). * * This simple implementation of BUS_ADJUST_RESOURCE() simply calls the * BUS_ADJUST_RESOURCE() method of the parent of @p dev. */ int bus_generic_adjust_resource(device_t dev, device_t child, int type, struct resource *r, u_long start, u_long end) { /* Propagate up the bus hierarchy until someone handles it. */ if (dev->parent) return (BUS_ADJUST_RESOURCE(dev->parent, child, type, r, start, end)); return (EINVAL); } /** * @brief Helper function for implementing BUS_ALLOC_RESOURCE(). * * This simple implementation of BUS_ALLOC_RESOURCE() simply calls the * BUS_ALLOC_RESOURCE() method of the parent of @p dev. */ struct resource * bus_generic_alloc_resource(device_t dev, device_t child, int type, int *rid, u_long start, u_long end, u_long count, u_int flags) { /* Propagate up the bus hierarchy until someone handles it. */ if (dev->parent) return (BUS_ALLOC_RESOURCE(dev->parent, child, type, rid, start, end, count, flags)); return (NULL); } /** * @brief Helper function for implementing BUS_RELEASE_RESOURCE(). * * This simple implementation of BUS_RELEASE_RESOURCE() simply calls the * BUS_RELEASE_RESOURCE() method of the parent of @p dev. */ int bus_generic_release_resource(device_t dev, device_t child, int type, int rid, struct resource *r) { /* Propagate up the bus hierarchy until someone handles it. */ if (dev->parent) return (BUS_RELEASE_RESOURCE(dev->parent, child, type, rid, r)); return (EINVAL); } /** * @brief Helper function for implementing BUS_ACTIVATE_RESOURCE(). * * This simple implementation of BUS_ACTIVATE_RESOURCE() simply calls the * BUS_ACTIVATE_RESOURCE() method of the parent of @p dev. */ int bus_generic_activate_resource(device_t dev, device_t child, int type, int rid, struct resource *r) { /* Propagate up the bus hierarchy until someone handles it. */ if (dev->parent) return (BUS_ACTIVATE_RESOURCE(dev->parent, child, type, rid, r)); return (EINVAL); } /** * @brief Helper function for implementing BUS_DEACTIVATE_RESOURCE(). * * This simple implementation of BUS_DEACTIVATE_RESOURCE() simply calls the * BUS_DEACTIVATE_RESOURCE() method of the parent of @p dev. */ int bus_generic_deactivate_resource(device_t dev, device_t child, int type, int rid, struct resource *r) { /* Propagate up the bus hierarchy until someone handles it. */ if (dev->parent) return (BUS_DEACTIVATE_RESOURCE(dev->parent, child, type, rid, r)); return (EINVAL); } /** * @brief Helper function for implementing BUS_BIND_INTR(). * * This simple implementation of BUS_BIND_INTR() simply calls the * BUS_BIND_INTR() method of the parent of @p dev. */ int bus_generic_bind_intr(device_t dev, device_t child, struct resource *irq, int cpu) { /* Propagate up the bus hierarchy until someone handles it. */ if (dev->parent) return (BUS_BIND_INTR(dev->parent, child, irq, cpu)); return (EINVAL); } /** * @brief Helper function for implementing BUS_CONFIG_INTR(). * * This simple implementation of BUS_CONFIG_INTR() simply calls the * BUS_CONFIG_INTR() method of the parent of @p dev. */ int bus_generic_config_intr(device_t dev, int irq, enum intr_trigger trig, enum intr_polarity pol) { /* Propagate up the bus hierarchy until someone handles it. */ if (dev->parent) return (BUS_CONFIG_INTR(dev->parent, irq, trig, pol)); return (EINVAL); } /** * @brief Helper function for implementing BUS_DESCRIBE_INTR(). * * This simple implementation of BUS_DESCRIBE_INTR() simply calls the * BUS_DESCRIBE_INTR() method of the parent of @p dev. */ int bus_generic_describe_intr(device_t dev, device_t child, struct resource *irq, void *cookie, const char *descr) { /* Propagate up the bus hierarchy until someone handles it. */ if (dev->parent) return (BUS_DESCRIBE_INTR(dev->parent, child, irq, cookie, descr)); return (EINVAL); } /** * @brief Helper function for implementing BUS_GET_DMA_TAG(). * * This simple implementation of BUS_GET_DMA_TAG() simply calls the * BUS_GET_DMA_TAG() method of the parent of @p dev. */ bus_dma_tag_t bus_generic_get_dma_tag(device_t dev, device_t child) { /* Propagate up the bus hierarchy until someone handles it. */ if (dev->parent != NULL) return (BUS_GET_DMA_TAG(dev->parent, child)); return (NULL); } /** * @brief Helper function for implementing BUS_GET_RESOURCE(). * * This implementation of BUS_GET_RESOURCE() uses the * resource_list_find() function to do most of the work. It calls * BUS_GET_RESOURCE_LIST() to find a suitable resource list to * search. */ int bus_generic_rl_get_resource(device_t dev, device_t child, int type, int rid, u_long *startp, u_long *countp) { struct resource_list * rl = NULL; struct resource_list_entry * rle = NULL; rl = BUS_GET_RESOURCE_LIST(dev, child); if (!rl) return (EINVAL); rle = resource_list_find(rl, type, rid); if (!rle) return (ENOENT); if (startp) *startp = rle->start; if (countp) *countp = rle->count; return (0); } /** * @brief Helper function for implementing BUS_SET_RESOURCE(). * * This implementation of BUS_SET_RESOURCE() uses the * resource_list_add() function to do most of the work. It calls * BUS_GET_RESOURCE_LIST() to find a suitable resource list to * edit. */ int bus_generic_rl_set_resource(device_t dev, device_t child, int type, int rid, u_long start, u_long count) { struct resource_list * rl = NULL; rl = BUS_GET_RESOURCE_LIST(dev, child); if (!rl) return (EINVAL); resource_list_add(rl, type, rid, start, (start + count - 1), count); return (0); } /** * @brief Helper function for implementing BUS_DELETE_RESOURCE(). * * This implementation of BUS_DELETE_RESOURCE() uses the * resource_list_delete() function to do most of the work. It calls * BUS_GET_RESOURCE_LIST() to find a suitable resource list to * edit. */ void bus_generic_rl_delete_resource(device_t dev, device_t child, int type, int rid) { struct resource_list * rl = NULL; rl = BUS_GET_RESOURCE_LIST(dev, child); if (!rl) return; resource_list_delete(rl, type, rid); return; } /** * @brief Helper function for implementing BUS_RELEASE_RESOURCE(). * * This implementation of BUS_RELEASE_RESOURCE() uses the * resource_list_release() function to do most of the work. It calls * BUS_GET_RESOURCE_LIST() to find a suitable resource list. */ int bus_generic_rl_release_resource(device_t dev, device_t child, int type, int rid, struct resource *r) { struct resource_list * rl = NULL; if (device_get_parent(child) != dev) return (BUS_RELEASE_RESOURCE(device_get_parent(dev), child, type, rid, r)); rl = BUS_GET_RESOURCE_LIST(dev, child); if (!rl) return (EINVAL); return (resource_list_release(rl, dev, child, type, rid, r)); } /** * @brief Helper function for implementing BUS_ALLOC_RESOURCE(). * * This implementation of BUS_ALLOC_RESOURCE() uses the * resource_list_alloc() function to do most of the work. It calls * BUS_GET_RESOURCE_LIST() to find a suitable resource list. */ struct resource * bus_generic_rl_alloc_resource(device_t dev, device_t child, int type, int *rid, u_long start, u_long end, u_long count, u_int flags) { struct resource_list * rl = NULL; if (device_get_parent(child) != dev) return (BUS_ALLOC_RESOURCE(device_get_parent(dev), child, type, rid, start, end, count, flags)); rl = BUS_GET_RESOURCE_LIST(dev, child); if (!rl) return (NULL); return (resource_list_alloc(rl, dev, child, type, rid, start, end, count, flags)); } /** * @brief Helper function for implementing BUS_CHILD_PRESENT(). * * This simple implementation of BUS_CHILD_PRESENT() simply calls the * BUS_CHILD_PRESENT() method of the parent of @p dev. */ int bus_generic_child_present(device_t dev, device_t child) { return (BUS_CHILD_PRESENT(device_get_parent(dev), dev)); } int bus_generic_get_domain(device_t dev, device_t child, int *domain) { if (dev->parent) return (BUS_GET_DOMAIN(dev->parent, dev, domain)); return (ENOENT); } /* * Some convenience functions to make it easier for drivers to use the * resource-management functions. All these really do is hide the * indirection through the parent's method table, making for slightly * less-wordy code. In the future, it might make sense for this code * to maintain some sort of a list of resources allocated by each device. */ int bus_alloc_resources(device_t dev, struct resource_spec *rs, struct resource **res) { int i; for (i = 0; rs[i].type != -1; i++) res[i] = NULL; for (i = 0; rs[i].type != -1; i++) { res[i] = bus_alloc_resource_any(dev, rs[i].type, &rs[i].rid, rs[i].flags); if (res[i] == NULL && !(rs[i].flags & RF_OPTIONAL)) { bus_release_resources(dev, rs, res); return (ENXIO); } } return (0); } void bus_release_resources(device_t dev, const struct resource_spec *rs, struct resource **res) { int i; for (i = 0; rs[i].type != -1; i++) if (res[i] != NULL) { bus_release_resource( dev, rs[i].type, rs[i].rid, res[i]); res[i] = NULL; } } /** * @brief Wrapper function for BUS_ALLOC_RESOURCE(). * * This function simply calls the BUS_ALLOC_RESOURCE() method of the * parent of @p dev. */ struct resource * bus_alloc_resource(device_t dev, int type, int *rid, u_long start, u_long end, u_long count, u_int flags) { if (dev->parent == NULL) return (NULL); return (BUS_ALLOC_RESOURCE(dev->parent, dev, type, rid, start, end, count, flags)); } /** * @brief Wrapper function for BUS_ADJUST_RESOURCE(). * * This function simply calls the BUS_ADJUST_RESOURCE() method of the * parent of @p dev. */ int bus_adjust_resource(device_t dev, int type, struct resource *r, u_long start, u_long end) { if (dev->parent == NULL) return (EINVAL); return (BUS_ADJUST_RESOURCE(dev->parent, dev, type, r, start, end)); } /** * @brief Wrapper function for BUS_ACTIVATE_RESOURCE(). * * This function simply calls the BUS_ACTIVATE_RESOURCE() method of the * parent of @p dev. */ int bus_activate_resource(device_t dev, int type, int rid, struct resource *r) { if (dev->parent == NULL) return (EINVAL); return (BUS_ACTIVATE_RESOURCE(dev->parent, dev, type, rid, r)); } /** * @brief Wrapper function for BUS_DEACTIVATE_RESOURCE(). * * This function simply calls the BUS_DEACTIVATE_RESOURCE() method of the * parent of @p dev. */ int bus_deactivate_resource(device_t dev, int type, int rid, struct resource *r) { if (dev->parent == NULL) return (EINVAL); return (BUS_DEACTIVATE_RESOURCE(dev->parent, dev, type, rid, r)); } /** * @brief Wrapper function for BUS_RELEASE_RESOURCE(). * * This function simply calls the BUS_RELEASE_RESOURCE() method of the * parent of @p dev. */ int bus_release_resource(device_t dev, int type, int rid, struct resource *r) { if (dev->parent == NULL) return (EINVAL); return (BUS_RELEASE_RESOURCE(dev->parent, dev, type, rid, r)); } /** * @brief Wrapper function for BUS_SETUP_INTR(). * * This function simply calls the BUS_SETUP_INTR() method of the * parent of @p dev. */ int bus_setup_intr(device_t dev, struct resource *r, int flags, driver_filter_t filter, driver_intr_t handler, void *arg, void **cookiep) { int error; if (dev->parent == NULL) return (EINVAL); error = BUS_SETUP_INTR(dev->parent, dev, r, flags, filter, handler, arg, cookiep); if (error != 0) return (error); if (handler != NULL && !(flags & INTR_MPSAFE)) device_printf(dev, "[GIANT-LOCKED]\n"); return (0); } /** * @brief Wrapper function for BUS_TEARDOWN_INTR(). * * This function simply calls the BUS_TEARDOWN_INTR() method of the * parent of @p dev. */ int bus_teardown_intr(device_t dev, struct resource *r, void *cookie) { if (dev->parent == NULL) return (EINVAL); return (BUS_TEARDOWN_INTR(dev->parent, dev, r, cookie)); } /** * @brief Wrapper function for BUS_BIND_INTR(). * * This function simply calls the BUS_BIND_INTR() method of the * parent of @p dev. */ int bus_bind_intr(device_t dev, struct resource *r, int cpu) { if (dev->parent == NULL) return (EINVAL); return (BUS_BIND_INTR(dev->parent, dev, r, cpu)); } /** * @brief Wrapper function for BUS_DESCRIBE_INTR(). * * This function first formats the requested description into a * temporary buffer and then calls the BUS_DESCRIBE_INTR() method of * the parent of @p dev. */ int bus_describe_intr(device_t dev, struct resource *irq, void *cookie, const char *fmt, ...) { va_list ap; char descr[MAXCOMLEN + 1]; if (dev->parent == NULL) return (EINVAL); va_start(ap, fmt); vsnprintf(descr, sizeof(descr), fmt, ap); va_end(ap); return (BUS_DESCRIBE_INTR(dev->parent, dev, irq, cookie, descr)); } /** * @brief Wrapper function for BUS_SET_RESOURCE(). * * This function simply calls the BUS_SET_RESOURCE() method of the * parent of @p dev. */ int bus_set_resource(device_t dev, int type, int rid, u_long start, u_long count) { return (BUS_SET_RESOURCE(device_get_parent(dev), dev, type, rid, start, count)); } /** * @brief Wrapper function for BUS_GET_RESOURCE(). * * This function simply calls the BUS_GET_RESOURCE() method of the * parent of @p dev. */ int bus_get_resource(device_t dev, int type, int rid, u_long *startp, u_long *countp) { return (BUS_GET_RESOURCE(device_get_parent(dev), dev, type, rid, startp, countp)); } /** * @brief Wrapper function for BUS_GET_RESOURCE(). * * This function simply calls the BUS_GET_RESOURCE() method of the * parent of @p dev and returns the start value. */ u_long bus_get_resource_start(device_t dev, int type, int rid) { u_long start, count; int error; error = BUS_GET_RESOURCE(device_get_parent(dev), dev, type, rid, &start, &count); if (error) return (0); return (start); } /** * @brief Wrapper function for BUS_GET_RESOURCE(). * * This function simply calls the BUS_GET_RESOURCE() method of the * parent of @p dev and returns the count value. */ u_long bus_get_resource_count(device_t dev, int type, int rid) { u_long start, count; int error; error = BUS_GET_RESOURCE(device_get_parent(dev), dev, type, rid, &start, &count); if (error) return (0); return (count); } /** * @brief Wrapper function for BUS_DELETE_RESOURCE(). * * This function simply calls the BUS_DELETE_RESOURCE() method of the * parent of @p dev. */ void bus_delete_resource(device_t dev, int type, int rid) { BUS_DELETE_RESOURCE(device_get_parent(dev), dev, type, rid); } /** * @brief Wrapper function for BUS_CHILD_PRESENT(). * * This function simply calls the BUS_CHILD_PRESENT() method of the * parent of @p dev. */ int bus_child_present(device_t child) { return (BUS_CHILD_PRESENT(device_get_parent(child), child)); } /** * @brief Wrapper function for BUS_CHILD_PNPINFO_STR(). * * This function simply calls the BUS_CHILD_PNPINFO_STR() method of the * parent of @p dev. */ int bus_child_pnpinfo_str(device_t child, char *buf, size_t buflen) { device_t parent; parent = device_get_parent(child); if (parent == NULL) { *buf = '\0'; return (0); } return (BUS_CHILD_PNPINFO_STR(parent, child, buf, buflen)); } /** * @brief Wrapper function for BUS_CHILD_LOCATION_STR(). * * This function simply calls the BUS_CHILD_LOCATION_STR() method of the * parent of @p dev. */ int bus_child_location_str(device_t child, char *buf, size_t buflen) { device_t parent; parent = device_get_parent(child); if (parent == NULL) { *buf = '\0'; return (0); } return (BUS_CHILD_LOCATION_STR(parent, child, buf, buflen)); } /** * @brief Wrapper function for BUS_GET_DMA_TAG(). * * This function simply calls the BUS_GET_DMA_TAG() method of the * parent of @p dev. */ bus_dma_tag_t bus_get_dma_tag(device_t dev) { device_t parent; parent = device_get_parent(dev); if (parent == NULL) return (NULL); return (BUS_GET_DMA_TAG(parent, dev)); } /** * @brief Wrapper function for BUS_GET_DOMAIN(). * * This function simply calls the BUS_GET_DOMAIN() method of the * parent of @p dev. */ int bus_get_domain(device_t dev, int *domain) { return (BUS_GET_DOMAIN(device_get_parent(dev), dev, domain)); } /* Resume all devices and then notify userland that we're up again. */ static int root_resume(device_t dev) { int error; error = bus_generic_resume(dev); if (error == 0) devctl_notify("kern", "power", "resume", NULL); return (error); } static int root_print_child(device_t dev, device_t child) { int retval = 0; retval += bus_print_child_header(dev, child); retval += printf("\n"); return (retval); } static int root_setup_intr(device_t dev, device_t child, struct resource *irq, int flags, driver_filter_t *filter, driver_intr_t *intr, void *arg, void **cookiep) { /* * If an interrupt mapping gets to here something bad has happened. */ panic("root_setup_intr"); } /* * If we get here, assume that the device is permanant and really is * present in the system. Removable bus drivers are expected to intercept * this call long before it gets here. We return -1 so that drivers that * really care can check vs -1 or some ERRNO returned higher in the food * chain. */ static int root_child_present(device_t dev, device_t child) { return (-1); } static kobj_method_t root_methods[] = { /* Device interface */ KOBJMETHOD(device_shutdown, bus_generic_shutdown), KOBJMETHOD(device_suspend, bus_generic_suspend), KOBJMETHOD(device_resume, root_resume), /* Bus interface */ KOBJMETHOD(bus_print_child, root_print_child), KOBJMETHOD(bus_read_ivar, bus_generic_read_ivar), KOBJMETHOD(bus_write_ivar, bus_generic_write_ivar), KOBJMETHOD(bus_setup_intr, root_setup_intr), KOBJMETHOD(bus_child_present, root_child_present), KOBJMETHOD_END }; static driver_t root_driver = { "root", root_methods, 1, /* no softc */ }; device_t root_bus; devclass_t root_devclass; static int root_bus_module_handler(module_t mod, int what, void* arg) { switch (what) { case MOD_LOAD: TAILQ_INIT(&bus_data_devices); kobj_class_compile((kobj_class_t) &root_driver); root_bus = make_device(NULL, "root", 0); root_bus->desc = "System root bus"; kobj_init((kobj_t) root_bus, (kobj_class_t) &root_driver); root_bus->driver = &root_driver; root_bus->state = DS_ATTACHED; root_devclass = devclass_find_internal("root", NULL, FALSE); devinit(); return (0); case MOD_SHUTDOWN: device_shutdown(root_bus); return (0); default: return (EOPNOTSUPP); } return (0); } static moduledata_t root_bus_mod = { "rootbus", root_bus_module_handler, NULL }; DECLARE_MODULE(rootbus, root_bus_mod, SI_SUB_DRIVERS, SI_ORDER_FIRST); /** * @brief Automatically configure devices * * This function begins the autoconfiguration process by calling * device_probe_and_attach() for each child of the @c root0 device. */ void root_bus_configure(void) { PDEBUG((".")); /* Eventually this will be split up, but this is sufficient for now. */ bus_set_pass(BUS_PASS_DEFAULT); } /** * @brief Module handler for registering device drivers * * This module handler is used to automatically register device * drivers when modules are loaded. If @p what is MOD_LOAD, it calls * devclass_add_driver() for the driver described by the * driver_module_data structure pointed to by @p arg */ int driver_module_handler(module_t mod, int what, void *arg) { struct driver_module_data *dmd; devclass_t bus_devclass; kobj_class_t driver; int error, pass; dmd = (struct driver_module_data *)arg; bus_devclass = devclass_find_internal(dmd->dmd_busname, NULL, TRUE); error = 0; switch (what) { case MOD_LOAD: if (dmd->dmd_chainevh) error = dmd->dmd_chainevh(mod,what,dmd->dmd_chainarg); pass = dmd->dmd_pass; driver = dmd->dmd_driver; PDEBUG(("Loading module: driver %s on bus %s (pass %d)", DRIVERNAME(driver), dmd->dmd_busname, pass)); error = devclass_add_driver(bus_devclass, driver, pass, dmd->dmd_devclass); break; case MOD_UNLOAD: PDEBUG(("Unloading module: driver %s from bus %s", DRIVERNAME(dmd->dmd_driver), dmd->dmd_busname)); error = devclass_delete_driver(bus_devclass, dmd->dmd_driver); if (!error && dmd->dmd_chainevh) error = dmd->dmd_chainevh(mod,what,dmd->dmd_chainarg); break; case MOD_QUIESCE: PDEBUG(("Quiesce module: driver %s from bus %s", DRIVERNAME(dmd->dmd_driver), dmd->dmd_busname)); error = devclass_quiesce_driver(bus_devclass, dmd->dmd_driver); if (!error && dmd->dmd_chainevh) error = dmd->dmd_chainevh(mod,what,dmd->dmd_chainarg); break; default: error = EOPNOTSUPP; break; } return (error); } /** * @brief Enumerate all hinted devices for this bus. * * Walks through the hints for this bus and calls the bus_hinted_child * routine for each one it fines. It searches first for the specific * bus that's being probed for hinted children (eg isa0), and then for * generic children (eg isa). * * @param dev bus device to enumerate */ void bus_enumerate_hinted_children(device_t bus) { int i; const char *dname, *busname; int dunit; /* * enumerate all devices on the specific bus */ busname = device_get_nameunit(bus); i = 0; while (resource_find_match(&i, &dname, &dunit, "at", busname) == 0) BUS_HINTED_CHILD(bus, dname, dunit); /* * and all the generic ones. */ busname = device_get_name(bus); i = 0; while (resource_find_match(&i, &dname, &dunit, "at", busname) == 0) BUS_HINTED_CHILD(bus, dname, dunit); } #ifdef BUS_DEBUG /* the _short versions avoid iteration by not calling anything that prints * more than oneliners. I love oneliners. */ static void print_device_short(device_t dev, int indent) { if (!dev) return; indentprintf(("device %d: <%s> %sparent,%schildren,%s%s%s%s%s,%sivars,%ssoftc,busy=%d\n", dev->unit, dev->desc, (dev->parent? "":"no "), (TAILQ_EMPTY(&dev->children)? "no ":""), (dev->flags&DF_ENABLED? "enabled,":"disabled,"), (dev->flags&DF_FIXEDCLASS? "fixed,":""), (dev->flags&DF_WILDCARD? "wildcard,":""), (dev->flags&DF_DESCMALLOCED? "descmalloced,":""), (dev->flags&DF_REBID? "rebiddable,":""), (dev->ivars? "":"no "), (dev->softc? "":"no "), dev->busy)); } static void print_device(device_t dev, int indent) { if (!dev) return; print_device_short(dev, indent); indentprintf(("Parent:\n")); print_device_short(dev->parent, indent+1); indentprintf(("Driver:\n")); print_driver_short(dev->driver, indent+1); indentprintf(("Devclass:\n")); print_devclass_short(dev->devclass, indent+1); } void print_device_tree_short(device_t dev, int indent) /* print the device and all its children (indented) */ { device_t child; if (!dev) return; print_device_short(dev, indent); TAILQ_FOREACH(child, &dev->children, link) { print_device_tree_short(child, indent+1); } } void print_device_tree(device_t dev, int indent) /* print the device and all its children (indented) */ { device_t child; if (!dev) return; print_device(dev, indent); TAILQ_FOREACH(child, &dev->children, link) { print_device_tree(child, indent+1); } } static void print_driver_short(driver_t *driver, int indent) { if (!driver) return; indentprintf(("driver %s: softc size = %zd\n", driver->name, driver->size)); } static void print_driver(driver_t *driver, int indent) { if (!driver) return; print_driver_short(driver, indent); } static void print_driver_list(driver_list_t drivers, int indent) { driverlink_t driver; TAILQ_FOREACH(driver, &drivers, link) { print_driver(driver->driver, indent); } } static void print_devclass_short(devclass_t dc, int indent) { if ( !dc ) return; indentprintf(("devclass %s: max units = %d\n", dc->name, dc->maxunit)); } static void print_devclass(devclass_t dc, int indent) { int i; if ( !dc ) return; print_devclass_short(dc, indent); indentprintf(("Drivers:\n")); print_driver_list(dc->drivers, indent+1); indentprintf(("Devices:\n")); for (i = 0; i < dc->maxunit; i++) if (dc->devices[i]) print_device(dc->devices[i], indent+1); } void print_devclass_list_short(void) { devclass_t dc; printf("Short listing of devclasses, drivers & devices:\n"); TAILQ_FOREACH(dc, &devclasses, link) { print_devclass_short(dc, 0); } } void print_devclass_list(void) { devclass_t dc; printf("Full listing of devclasses, drivers & devices:\n"); TAILQ_FOREACH(dc, &devclasses, link) { print_devclass(dc, 0); } } #endif /* * User-space access to the device tree. * * We implement a small set of nodes: * * hw.bus Single integer read method to obtain the * current generation count. * hw.bus.devices Reads the entire device tree in flat space. * hw.bus.rman Resource manager interface * * We might like to add the ability to scan devclasses and/or drivers to * determine what else is currently loaded/available. */ static int sysctl_bus(SYSCTL_HANDLER_ARGS) { struct u_businfo ubus; ubus.ub_version = BUS_USER_VERSION; ubus.ub_generation = bus_data_generation; return (SYSCTL_OUT(req, &ubus, sizeof(ubus))); } SYSCTL_NODE(_hw_bus, OID_AUTO, info, CTLFLAG_RW, sysctl_bus, "bus-related data"); static int sysctl_devices(SYSCTL_HANDLER_ARGS) { int *name = (int *)arg1; u_int namelen = arg2; int index; struct device *dev; struct u_device udev; /* XXX this is a bit big */ int error; if (namelen != 2) return (EINVAL); if (bus_data_generation_check(name[0])) return (EINVAL); index = name[1]; /* * Scan the list of devices, looking for the requested index. */ TAILQ_FOREACH(dev, &bus_data_devices, devlink) { if (index-- == 0) break; } if (dev == NULL) return (ENOENT); /* * Populate the return array. */ bzero(&udev, sizeof(udev)); udev.dv_handle = (uintptr_t)dev; udev.dv_parent = (uintptr_t)dev->parent; if (dev->nameunit != NULL) strlcpy(udev.dv_name, dev->nameunit, sizeof(udev.dv_name)); if (dev->desc != NULL) strlcpy(udev.dv_desc, dev->desc, sizeof(udev.dv_desc)); if (dev->driver != NULL && dev->driver->name != NULL) strlcpy(udev.dv_drivername, dev->driver->name, sizeof(udev.dv_drivername)); bus_child_pnpinfo_str(dev, udev.dv_pnpinfo, sizeof(udev.dv_pnpinfo)); bus_child_location_str(dev, udev.dv_location, sizeof(udev.dv_location)); udev.dv_devflags = dev->devflags; udev.dv_flags = dev->flags; udev.dv_state = dev->state; error = SYSCTL_OUT(req, &udev, sizeof(udev)); return (error); } SYSCTL_NODE(_hw_bus, OID_AUTO, devices, CTLFLAG_RD, sysctl_devices, "system device tree"); int bus_data_generation_check(int generation) { if (generation != bus_data_generation) return (1); /* XXX generate optimised lists here? */ return (0); } void bus_data_generation_update(void) { bus_data_generation++; } int bus_free_resource(device_t dev, int type, struct resource *r) { if (r == NULL) return (0); return (bus_release_resource(dev, type, rman_get_rid(r), r)); } /* * /dev/devctl2 implementation. The existing /dev/devctl device has * implicit semantics on open, so it could not be reused for this. * Another option would be to call this /dev/bus? */ static int find_device(struct devreq *req, device_t *devp) { device_t dev; /* * First, ensure that the name is nul terminated. */ if (memchr(req->dr_name, '\0', sizeof(req->dr_name)) == NULL) return (EINVAL); /* * Second, try to find an attached device whose name matches * 'name'. */ TAILQ_FOREACH(dev, &bus_data_devices, devlink) { if (dev->nameunit != NULL && strcmp(dev->nameunit, req->dr_name) == 0) { *devp = dev; return (0); } } /* Finally, give device enumerators a chance. */ dev = NULL; EVENTHANDLER_INVOKE(dev_lookup, req->dr_name, &dev); if (dev == NULL) return (ENOENT); *devp = dev; return (0); } static bool driver_exists(struct device *bus, const char *driver) { devclass_t dc; for (dc = bus->devclass; dc != NULL; dc = dc->parent) { if (devclass_find_driver_internal(dc, driver) != NULL) return (true); } return (false); } static int devctl2_ioctl(struct cdev *cdev, u_long cmd, caddr_t data, int fflag, struct thread *td) { struct devreq *req; device_t dev; int error, old; /* Locate the device to control. */ mtx_lock(&Giant); req = (struct devreq *)data; switch (cmd) { case DEV_ATTACH: case DEV_DETACH: case DEV_ENABLE: case DEV_DISABLE: case DEV_SUSPEND: case DEV_RESUME: case DEV_SET_DRIVER: error = priv_check(td, PRIV_DRIVER); if (error == 0) error = find_device(req, &dev); break; default: error = ENOTTY; break; } if (error) { mtx_unlock(&Giant); return (error); } /* Perform the requested operation. */ switch (cmd) { case DEV_ATTACH: if (device_is_attached(dev) && (dev->flags & DF_REBID) == 0) error = EBUSY; else if (!device_is_enabled(dev)) error = ENXIO; else error = device_probe_and_attach(dev); break; case DEV_DETACH: if (!device_is_attached(dev)) { error = ENXIO; break; } if (!(req->dr_flags & DEVF_FORCE_DETACH)) { error = device_quiesce(dev); if (error) break; } error = device_detach(dev); break; case DEV_ENABLE: if (device_is_enabled(dev)) { error = EBUSY; break; } /* * If the device has been probed but not attached (e.g. * when it has been disabled by a loader hint), just * attach the device rather than doing a full probe. */ device_enable(dev); if (device_is_alive(dev)) { /* * If the device was disabled via a hint, clear * the hint. */ if (resource_disabled(dev->driver->name, dev->unit)) resource_unset_value(dev->driver->name, dev->unit, "disabled"); error = device_attach(dev); } else error = device_probe_and_attach(dev); break; case DEV_DISABLE: if (!device_is_enabled(dev)) { error = ENXIO; break; } if (!(req->dr_flags & DEVF_FORCE_DETACH)) { error = device_quiesce(dev); if (error) break; } /* * Force DF_FIXEDCLASS on around detach to preserve * the existing name. */ old = dev->flags; dev->flags |= DF_FIXEDCLASS; error = device_detach(dev); if (!(old & DF_FIXEDCLASS)) dev->flags &= ~DF_FIXEDCLASS; if (error == 0) device_disable(dev); break; case DEV_SUSPEND: if (device_is_suspended(dev)) { error = EBUSY; break; } if (device_get_parent(dev) == NULL) { error = EINVAL; break; } error = BUS_SUSPEND_CHILD(device_get_parent(dev), dev); break; case DEV_RESUME: if (!device_is_suspended(dev)) { error = EINVAL; break; } if (device_get_parent(dev) == NULL) { error = EINVAL; break; } error = BUS_RESUME_CHILD(device_get_parent(dev), dev); break; case DEV_SET_DRIVER: { devclass_t dc; char driver[128]; error = copyinstr(req->dr_data, driver, sizeof(driver), NULL); if (error) break; if (driver[0] == '\0') { error = EINVAL; break; } if (dev->devclass != NULL && strcmp(driver, dev->devclass->name) == 0) /* XXX: Could possibly force DF_FIXEDCLASS on? */ break; /* * Scan drivers for this device's bus looking for at * least one matching driver. */ if (dev->parent == NULL) { error = EINVAL; break; } if (!driver_exists(dev->parent, driver)) { error = ENOENT; break; } dc = devclass_create(driver); if (dc == NULL) { error = ENOMEM; break; } /* Detach device if necessary. */ if (device_is_attached(dev)) { if (req->dr_flags & DEVF_SET_DRIVER_DETACH) error = device_detach(dev); else error = EBUSY; if (error) break; } /* Clear any previously-fixed device class and unit. */ if (dev->flags & DF_FIXEDCLASS) devclass_delete_device(dev->devclass, dev); dev->flags |= DF_WILDCARD; dev->unit = -1; /* Force the new device class. */ error = devclass_add_device(dc, dev); if (error) break; dev->flags |= DF_FIXEDCLASS; error = device_probe_and_attach(dev); break; } } mtx_unlock(&Giant); return (error); } static struct cdevsw devctl2_cdevsw = { .d_version = D_VERSION, .d_ioctl = devctl2_ioctl, .d_name = "devctl2", }; static void devctl2_init(void) { make_dev_credf(MAKEDEV_ETERNAL, &devctl2_cdevsw, 0, NULL, UID_ROOT, GID_WHEEL, 0600, "devctl2"); } Index: user/ngie/more-tests/sys/kern/vfs_subr.c =================================================================== --- user/ngie/more-tests/sys/kern/vfs_subr.c (revision 281584) +++ user/ngie/more-tests/sys/kern/vfs_subr.c (revision 281585) @@ -1,4892 +1,4893 @@ /*- * Copyright (c) 1989, 1993 * The Regents of the University of California. All rights reserved. * (c) UNIX System Laboratories, Inc. * All or some portions of this file are derived from material licensed * to the University of California by American Telephone and Telegraph * Co. or Unix System Laboratories, Inc. and are reproduced herein with * the permission of UNIX System Laboratories, Inc. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 4. Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * @(#)vfs_subr.c 8.31 (Berkeley) 5/26/95 */ /* * External virtual filesystem routines */ #include __FBSDID("$FreeBSD$"); #include "opt_compat.h" #include "opt_ddb.h" #include "opt_watchdog.h" #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #ifdef DDB #include #endif static void delmntque(struct vnode *vp); static int flushbuflist(struct bufv *bufv, int flags, struct bufobj *bo, int slpflag, int slptimeo); static void syncer_shutdown(void *arg, int howto); static int vtryrecycle(struct vnode *vp); static void v_incr_usecount(struct vnode *); static void v_decr_usecount(struct vnode *); static void v_decr_useonly(struct vnode *); static void v_upgrade_usecount(struct vnode *); static void vnlru_free(int); static void vgonel(struct vnode *); static void vfs_knllock(void *arg); static void vfs_knlunlock(void *arg); static void vfs_knl_assert_locked(void *arg); static void vfs_knl_assert_unlocked(void *arg); static void destroy_vpollinfo(struct vpollinfo *vi); /* * Number of vnodes in existence. Increased whenever getnewvnode() * allocates a new vnode, decreased in vdropl() for VI_DOOMED vnode. */ static unsigned long numvnodes; SYSCTL_ULONG(_vfs, OID_AUTO, numvnodes, CTLFLAG_RD, &numvnodes, 0, "Number of vnodes in existence"); static u_long vnodes_created; SYSCTL_ULONG(_vfs, OID_AUTO, vnodes_created, CTLFLAG_RD, &vnodes_created, 0, "Number of vnodes created by getnewvnode"); /* * Conversion tables for conversion from vnode types to inode formats * and back. */ enum vtype iftovt_tab[16] = { VNON, VFIFO, VCHR, VNON, VDIR, VNON, VBLK, VNON, VREG, VNON, VLNK, VNON, VSOCK, VNON, VNON, VBAD, }; int vttoif_tab[10] = { 0, S_IFREG, S_IFDIR, S_IFBLK, S_IFCHR, S_IFLNK, S_IFSOCK, S_IFIFO, S_IFMT, S_IFMT }; /* * List of vnodes that are ready for recycling. */ static TAILQ_HEAD(freelst, vnode) vnode_free_list; /* * Free vnode target. Free vnodes may simply be files which have been stat'd * but not read. This is somewhat common, and a small cache of such files * should be kept to avoid recreation costs. */ static u_long wantfreevnodes; SYSCTL_ULONG(_vfs, OID_AUTO, wantfreevnodes, CTLFLAG_RW, &wantfreevnodes, 0, ""); /* Number of vnodes in the free list. */ static u_long freevnodes; SYSCTL_ULONG(_vfs, OID_AUTO, freevnodes, CTLFLAG_RD, &freevnodes, 0, "Number of vnodes in the free list"); static int vlru_allow_cache_src; SYSCTL_INT(_vfs, OID_AUTO, vlru_allow_cache_src, CTLFLAG_RW, &vlru_allow_cache_src, 0, "Allow vlru to reclaim source vnode"); static u_long recycles_count; SYSCTL_ULONG(_vfs, OID_AUTO, recycles, CTLFLAG_RD, &recycles_count, 0, "Number of vnodes recycled to avoid exceding kern.maxvnodes"); /* * Various variables used for debugging the new implementation of * reassignbuf(). * XXX these are probably of (very) limited utility now. */ static int reassignbufcalls; SYSCTL_INT(_vfs, OID_AUTO, reassignbufcalls, CTLFLAG_RW, &reassignbufcalls, 0, "Number of calls to reassignbuf"); /* * Cache for the mount type id assigned to NFS. This is used for * special checks in nfs/nfs_nqlease.c and vm/vnode_pager.c. */ int nfs_mount_type = -1; /* To keep more than one thread at a time from running vfs_getnewfsid */ static struct mtx mntid_mtx; /* * Lock for any access to the following: * vnode_free_list * numvnodes * freevnodes */ static struct mtx vnode_free_list_mtx; /* Publicly exported FS */ struct nfs_public nfs_pub; static uma_zone_t buf_trie_zone; /* Zone for allocation of new vnodes - used exclusively by getnewvnode() */ static uma_zone_t vnode_zone; static uma_zone_t vnodepoll_zone; /* * The workitem queue. * * It is useful to delay writes of file data and filesystem metadata * for tens of seconds so that quickly created and deleted files need * not waste disk bandwidth being created and removed. To realize this, * we append vnodes to a "workitem" queue. When running with a soft * updates implementation, most pending metadata dependencies should * not wait for more than a few seconds. Thus, mounted on block devices * are delayed only about a half the time that file data is delayed. * Similarly, directory updates are more critical, so are only delayed * about a third the time that file data is delayed. Thus, there are * SYNCER_MAXDELAY queues that are processed round-robin at a rate of * one each second (driven off the filesystem syncer process). The * syncer_delayno variable indicates the next queue that is to be processed. * Items that need to be processed soon are placed in this queue: * * syncer_workitem_pending[syncer_delayno] * * A delay of fifteen seconds is done by placing the request fifteen * entries later in the queue: * * syncer_workitem_pending[(syncer_delayno + 15) & syncer_mask] * */ static int syncer_delayno; static long syncer_mask; LIST_HEAD(synclist, bufobj); static struct synclist *syncer_workitem_pending; /* * The sync_mtx protects: * bo->bo_synclist * sync_vnode_count * syncer_delayno * syncer_state * syncer_workitem_pending * syncer_worklist_len * rushjob */ static struct mtx sync_mtx; static struct cv sync_wakeup; #define SYNCER_MAXDELAY 32 static int syncer_maxdelay = SYNCER_MAXDELAY; /* maximum delay time */ static int syncdelay = 30; /* max time to delay syncing data */ static int filedelay = 30; /* time to delay syncing files */ SYSCTL_INT(_kern, OID_AUTO, filedelay, CTLFLAG_RW, &filedelay, 0, "Time to delay syncing files (in seconds)"); static int dirdelay = 29; /* time to delay syncing directories */ SYSCTL_INT(_kern, OID_AUTO, dirdelay, CTLFLAG_RW, &dirdelay, 0, "Time to delay syncing directories (in seconds)"); static int metadelay = 28; /* time to delay syncing metadata */ SYSCTL_INT(_kern, OID_AUTO, metadelay, CTLFLAG_RW, &metadelay, 0, "Time to delay syncing metadata (in seconds)"); static int rushjob; /* number of slots to run ASAP */ static int stat_rush_requests; /* number of times I/O speeded up */ SYSCTL_INT(_debug, OID_AUTO, rush_requests, CTLFLAG_RW, &stat_rush_requests, 0, "Number of times I/O speeded up (rush requests)"); /* * When shutting down the syncer, run it at four times normal speed. */ #define SYNCER_SHUTDOWN_SPEEDUP 4 static int sync_vnode_count; static int syncer_worklist_len; static enum { SYNCER_RUNNING, SYNCER_SHUTTING_DOWN, SYNCER_FINAL_DELAY } syncer_state; /* * Number of vnodes we want to exist at any one time. This is mostly used * to size hash tables in vnode-related code. It is normally not used in * getnewvnode(), as wantfreevnodes is normally nonzero.) * * XXX desiredvnodes is historical cruft and should not exist. */ int desiredvnodes; SYSCTL_INT(_kern, KERN_MAXVNODES, maxvnodes, CTLFLAG_RW, &desiredvnodes, 0, "Maximum number of vnodes"); SYSCTL_ULONG(_kern, OID_AUTO, minvnodes, CTLFLAG_RW, &wantfreevnodes, 0, "Minimum number of vnodes (legacy)"); static int vnlru_nowhere; SYSCTL_INT(_debug, OID_AUTO, vnlru_nowhere, CTLFLAG_RW, &vnlru_nowhere, 0, "Number of times the vnlru process ran without success"); /* Shift count for (uintptr_t)vp to initialize vp->v_hash. */ static int vnsz2log; /* * Support for the bufobj clean & dirty pctrie. */ static void * buf_trie_alloc(struct pctrie *ptree) { return uma_zalloc(buf_trie_zone, M_NOWAIT); } static void buf_trie_free(struct pctrie *ptree, void *node) { uma_zfree(buf_trie_zone, node); } PCTRIE_DEFINE(BUF, buf, b_lblkno, buf_trie_alloc, buf_trie_free); /* * Initialize the vnode management data structures. * * Reevaluate the following cap on the number of vnodes after the physical * memory size exceeds 512GB. In the limit, as the physical memory size * grows, the ratio of physical pages to vnodes approaches sixteen to one. */ #ifndef MAXVNODES_MAX #define MAXVNODES_MAX (512 * (1024 * 1024 * 1024 / (int)PAGE_SIZE / 16)) #endif static void vntblinit(void *dummy __unused) { u_int i; int physvnodes, virtvnodes; /* * Desiredvnodes is a function of the physical memory size and the * kernel's heap size. Generally speaking, it scales with the * physical memory size. The ratio of desiredvnodes to physical pages * is one to four until desiredvnodes exceeds 98,304. Thereafter, the * marginal ratio of desiredvnodes to physical pages is one to * sixteen. However, desiredvnodes is limited by the kernel's heap * size. The memory required by desiredvnodes vnodes and vm objects * may not exceed one seventh of the kernel's heap size. */ physvnodes = maxproc + vm_cnt.v_page_count / 16 + 3 * min(98304 * 4, vm_cnt.v_page_count) / 16; virtvnodes = vm_kmem_size / (7 * (sizeof(struct vm_object) + sizeof(struct vnode))); desiredvnodes = min(physvnodes, virtvnodes); if (desiredvnodes > MAXVNODES_MAX) { if (bootverbose) printf("Reducing kern.maxvnodes %d -> %d\n", desiredvnodes, MAXVNODES_MAX); desiredvnodes = MAXVNODES_MAX; } wantfreevnodes = desiredvnodes / 4; mtx_init(&mntid_mtx, "mntid", NULL, MTX_DEF); TAILQ_INIT(&vnode_free_list); mtx_init(&vnode_free_list_mtx, "vnode_free_list", NULL, MTX_DEF); vnode_zone = uma_zcreate("VNODE", sizeof (struct vnode), NULL, NULL, NULL, NULL, UMA_ALIGN_PTR, 0); vnodepoll_zone = uma_zcreate("VNODEPOLL", sizeof (struct vpollinfo), NULL, NULL, NULL, NULL, UMA_ALIGN_PTR, 0); /* * Preallocate enough nodes to support one-per buf so that * we can not fail an insert. reassignbuf() callers can not * tolerate the insertion failure. */ buf_trie_zone = uma_zcreate("BUF TRIE", pctrie_node_size(), NULL, NULL, pctrie_zone_init, NULL, UMA_ALIGN_PTR, UMA_ZONE_NOFREE | UMA_ZONE_VM); uma_prealloc(buf_trie_zone, nbuf); /* * Initialize the filesystem syncer. */ syncer_workitem_pending = hashinit(syncer_maxdelay, M_VNODE, &syncer_mask); syncer_maxdelay = syncer_mask + 1; mtx_init(&sync_mtx, "Syncer mtx", NULL, MTX_DEF); cv_init(&sync_wakeup, "syncer"); for (i = 1; i <= sizeof(struct vnode); i <<= 1) vnsz2log++; vnsz2log--; } SYSINIT(vfs, SI_SUB_VFS, SI_ORDER_FIRST, vntblinit, NULL); /* * Mark a mount point as busy. Used to synchronize access and to delay * unmounting. Eventually, mountlist_mtx is not released on failure. * * vfs_busy() is a custom lock, it can block the caller. * vfs_busy() only sleeps if the unmount is active on the mount point. * For a mountpoint mp, vfs_busy-enforced lock is before lock of any * vnode belonging to mp. * * Lookup uses vfs_busy() to traverse mount points. * root fs var fs * / vnode lock A / vnode lock (/var) D * /var vnode lock B /log vnode lock(/var/log) E * vfs_busy lock C vfs_busy lock F * * Within each file system, the lock order is C->A->B and F->D->E. * * When traversing across mounts, the system follows that lock order: * * C->A->B * | * +->F->D->E * * The lookup() process for namei("/var") illustrates the process: * VOP_LOOKUP() obtains B while A is held * vfs_busy() obtains a shared lock on F while A and B are held * vput() releases lock on B * vput() releases lock on A * VFS_ROOT() obtains lock on D while shared lock on F is held * vfs_unbusy() releases shared lock on F * vn_lock() obtains lock on deadfs vnode vp_crossmp instead of A. * Attempt to lock A (instead of vp_crossmp) while D is held would * violate the global order, causing deadlocks. * * dounmount() locks B while F is drained. */ int vfs_busy(struct mount *mp, int flags) { MPASS((flags & ~MBF_MASK) == 0); CTR3(KTR_VFS, "%s: mp %p with flags %d", __func__, mp, flags); MNT_ILOCK(mp); MNT_REF(mp); /* * If mount point is currenly being unmounted, sleep until the * mount point fate is decided. If thread doing the unmounting fails, * it will clear MNTK_UNMOUNT flag before waking us up, indicating * that this mount point has survived the unmount attempt and vfs_busy * should retry. Otherwise the unmounter thread will set MNTK_REFEXPIRE * flag in addition to MNTK_UNMOUNT, indicating that mount point is * about to be really destroyed. vfs_busy needs to release its * reference on the mount point in this case and return with ENOENT, * telling the caller that mount mount it tried to busy is no longer * valid. */ while (mp->mnt_kern_flag & MNTK_UNMOUNT) { if (flags & MBF_NOWAIT || mp->mnt_kern_flag & MNTK_REFEXPIRE) { MNT_REL(mp); MNT_IUNLOCK(mp); CTR1(KTR_VFS, "%s: failed busying before sleeping", __func__); return (ENOENT); } if (flags & MBF_MNTLSTLOCK) mtx_unlock(&mountlist_mtx); mp->mnt_kern_flag |= MNTK_MWAIT; msleep(mp, MNT_MTX(mp), PVFS | PDROP, "vfs_busy", 0); if (flags & MBF_MNTLSTLOCK) mtx_lock(&mountlist_mtx); MNT_ILOCK(mp); } if (flags & MBF_MNTLSTLOCK) mtx_unlock(&mountlist_mtx); mp->mnt_lockref++; MNT_IUNLOCK(mp); return (0); } /* * Free a busy filesystem. */ void vfs_unbusy(struct mount *mp) { CTR2(KTR_VFS, "%s: mp %p", __func__, mp); MNT_ILOCK(mp); MNT_REL(mp); KASSERT(mp->mnt_lockref > 0, ("negative mnt_lockref")); mp->mnt_lockref--; if (mp->mnt_lockref == 0 && (mp->mnt_kern_flag & MNTK_DRAINING) != 0) { MPASS(mp->mnt_kern_flag & MNTK_UNMOUNT); CTR1(KTR_VFS, "%s: waking up waiters", __func__); mp->mnt_kern_flag &= ~MNTK_DRAINING; wakeup(&mp->mnt_lockref); } MNT_IUNLOCK(mp); } /* * Lookup a mount point by filesystem identifier. */ struct mount * vfs_getvfs(fsid_t *fsid) { struct mount *mp; CTR2(KTR_VFS, "%s: fsid %p", __func__, fsid); mtx_lock(&mountlist_mtx); TAILQ_FOREACH(mp, &mountlist, mnt_list) { if (mp->mnt_stat.f_fsid.val[0] == fsid->val[0] && mp->mnt_stat.f_fsid.val[1] == fsid->val[1]) { vfs_ref(mp); mtx_unlock(&mountlist_mtx); return (mp); } } mtx_unlock(&mountlist_mtx); CTR2(KTR_VFS, "%s: lookup failed for %p id", __func__, fsid); return ((struct mount *) 0); } /* * Lookup a mount point by filesystem identifier, busying it before * returning. * * To avoid congestion on mountlist_mtx, implement simple direct-mapped * cache for popular filesystem identifiers. The cache is lockess, using * the fact that struct mount's are never freed. In worst case we may * get pointer to unmounted or even different filesystem, so we have to * check what we got, and go slow way if so. */ struct mount * vfs_busyfs(fsid_t *fsid) { #define FSID_CACHE_SIZE 256 typedef struct mount * volatile vmp_t; static vmp_t cache[FSID_CACHE_SIZE]; struct mount *mp; int error; uint32_t hash; CTR2(KTR_VFS, "%s: fsid %p", __func__, fsid); hash = fsid->val[0] ^ fsid->val[1]; hash = (hash >> 16 ^ hash) & (FSID_CACHE_SIZE - 1); mp = cache[hash]; if (mp == NULL || mp->mnt_stat.f_fsid.val[0] != fsid->val[0] || mp->mnt_stat.f_fsid.val[1] != fsid->val[1]) goto slow; if (vfs_busy(mp, 0) != 0) { cache[hash] = NULL; goto slow; } if (mp->mnt_stat.f_fsid.val[0] == fsid->val[0] && mp->mnt_stat.f_fsid.val[1] == fsid->val[1]) return (mp); else vfs_unbusy(mp); slow: mtx_lock(&mountlist_mtx); TAILQ_FOREACH(mp, &mountlist, mnt_list) { if (mp->mnt_stat.f_fsid.val[0] == fsid->val[0] && mp->mnt_stat.f_fsid.val[1] == fsid->val[1]) { error = vfs_busy(mp, MBF_MNTLSTLOCK); if (error) { cache[hash] = NULL; mtx_unlock(&mountlist_mtx); return (NULL); } cache[hash] = mp; return (mp); } } CTR2(KTR_VFS, "%s: lookup failed for %p id", __func__, fsid); mtx_unlock(&mountlist_mtx); return ((struct mount *) 0); } /* * Check if a user can access privileged mount options. */ int vfs_suser(struct mount *mp, struct thread *td) { int error; /* * If the thread is jailed, but this is not a jail-friendly file * system, deny immediately. */ if (!(mp->mnt_vfc->vfc_flags & VFCF_JAIL) && jailed(td->td_ucred)) return (EPERM); /* * If the file system was mounted outside the jail of the calling * thread, deny immediately. */ if (prison_check(td->td_ucred, mp->mnt_cred) != 0) return (EPERM); /* * If file system supports delegated administration, we don't check * for the PRIV_VFS_MOUNT_OWNER privilege - it will be better verified * by the file system itself. * If this is not the user that did original mount, we check for * the PRIV_VFS_MOUNT_OWNER privilege. */ if (!(mp->mnt_vfc->vfc_flags & VFCF_DELEGADMIN) && mp->mnt_cred->cr_uid != td->td_ucred->cr_uid) { if ((error = priv_check(td, PRIV_VFS_MOUNT_OWNER)) != 0) return (error); } return (0); } /* * Get a new unique fsid. Try to make its val[0] unique, since this value * will be used to create fake device numbers for stat(). Also try (but * not so hard) make its val[0] unique mod 2^16, since some emulators only * support 16-bit device numbers. We end up with unique val[0]'s for the * first 2^16 calls and unique val[0]'s mod 2^16 for the first 2^8 calls. * * Keep in mind that several mounts may be running in parallel. Starting * the search one past where the previous search terminated is both a * micro-optimization and a defense against returning the same fsid to * different mounts. */ void vfs_getnewfsid(struct mount *mp) { static uint16_t mntid_base; struct mount *nmp; fsid_t tfsid; int mtype; CTR2(KTR_VFS, "%s: mp %p", __func__, mp); mtx_lock(&mntid_mtx); mtype = mp->mnt_vfc->vfc_typenum; tfsid.val[1] = mtype; mtype = (mtype & 0xFF) << 24; for (;;) { tfsid.val[0] = makedev(255, mtype | ((mntid_base & 0xFF00) << 8) | (mntid_base & 0xFF)); mntid_base++; if ((nmp = vfs_getvfs(&tfsid)) == NULL) break; vfs_rel(nmp); } mp->mnt_stat.f_fsid.val[0] = tfsid.val[0]; mp->mnt_stat.f_fsid.val[1] = tfsid.val[1]; mtx_unlock(&mntid_mtx); } /* * Knob to control the precision of file timestamps: * * 0 = seconds only; nanoseconds zeroed. * 1 = seconds and nanoseconds, accurate within 1/HZ. * 2 = seconds and nanoseconds, truncated to microseconds. * >=3 = seconds and nanoseconds, maximum precision. */ enum { TSP_SEC, TSP_HZ, TSP_USEC, TSP_NSEC }; static int timestamp_precision = TSP_USEC; SYSCTL_INT(_vfs, OID_AUTO, timestamp_precision, CTLFLAG_RW, ×tamp_precision, 0, "File timestamp precision (0: seconds, " "1: sec + ns accurate to 1/HZ, 2: sec + ns truncated to ms, " "3+: sec + ns (max. precision))"); /* * Get a current timestamp. */ void vfs_timestamp(struct timespec *tsp) { struct timeval tv; switch (timestamp_precision) { case TSP_SEC: tsp->tv_sec = time_second; tsp->tv_nsec = 0; break; case TSP_HZ: getnanotime(tsp); break; case TSP_USEC: microtime(&tv); TIMEVAL_TO_TIMESPEC(&tv, tsp); break; case TSP_NSEC: default: nanotime(tsp); break; } } /* * Set vnode attributes to VNOVAL */ void vattr_null(struct vattr *vap) { vap->va_type = VNON; vap->va_size = VNOVAL; vap->va_bytes = VNOVAL; vap->va_mode = VNOVAL; vap->va_nlink = VNOVAL; vap->va_uid = VNOVAL; vap->va_gid = VNOVAL; vap->va_fsid = VNOVAL; vap->va_fileid = VNOVAL; vap->va_blocksize = VNOVAL; vap->va_rdev = VNOVAL; vap->va_atime.tv_sec = VNOVAL; vap->va_atime.tv_nsec = VNOVAL; vap->va_mtime.tv_sec = VNOVAL; vap->va_mtime.tv_nsec = VNOVAL; vap->va_ctime.tv_sec = VNOVAL; vap->va_ctime.tv_nsec = VNOVAL; vap->va_birthtime.tv_sec = VNOVAL; vap->va_birthtime.tv_nsec = VNOVAL; vap->va_flags = VNOVAL; vap->va_gen = VNOVAL; vap->va_vaflags = 0; } /* * This routine is called when we have too many vnodes. It attempts * to free vnodes and will potentially free vnodes that still * have VM backing store (VM backing store is typically the cause * of a vnode blowout so we want to do this). Therefore, this operation * is not considered cheap. * * A number of conditions may prevent a vnode from being reclaimed. * the buffer cache may have references on the vnode, a directory * vnode may still have references due to the namei cache representing * underlying files, or the vnode may be in active use. It is not * desireable to reuse such vnodes. These conditions may cause the * number of vnodes to reach some minimum value regardless of what * you set kern.maxvnodes to. Do not set kern.maxvnodes too low. */ static int vlrureclaim(struct mount *mp) { struct vnode *vp; int done; int trigger; int usevnodes; int count; /* * Calculate the trigger point, don't allow user * screwups to blow us up. This prevents us from * recycling vnodes with lots of resident pages. We * aren't trying to free memory, we are trying to * free vnodes. */ usevnodes = desiredvnodes; if (usevnodes <= 0) usevnodes = 1; trigger = vm_cnt.v_page_count * 2 / usevnodes; done = 0; vn_start_write(NULL, &mp, V_WAIT); MNT_ILOCK(mp); count = mp->mnt_nvnodelistsize / 10 + 1; while (count != 0) { vp = TAILQ_FIRST(&mp->mnt_nvnodelist); while (vp != NULL && vp->v_type == VMARKER) vp = TAILQ_NEXT(vp, v_nmntvnodes); if (vp == NULL) break; TAILQ_REMOVE(&mp->mnt_nvnodelist, vp, v_nmntvnodes); TAILQ_INSERT_TAIL(&mp->mnt_nvnodelist, vp, v_nmntvnodes); --count; if (!VI_TRYLOCK(vp)) goto next_iter; /* * If it's been deconstructed already, it's still * referenced, or it exceeds the trigger, skip it. */ if (vp->v_usecount || (!vlru_allow_cache_src && !LIST_EMPTY(&(vp)->v_cache_src)) || (vp->v_iflag & VI_DOOMED) != 0 || (vp->v_object != NULL && vp->v_object->resident_page_count > trigger)) { VI_UNLOCK(vp); goto next_iter; } MNT_IUNLOCK(mp); vholdl(vp); if (VOP_LOCK(vp, LK_INTERLOCK|LK_EXCLUSIVE|LK_NOWAIT)) { vdrop(vp); goto next_iter_mntunlocked; } VI_LOCK(vp); /* * v_usecount may have been bumped after VOP_LOCK() dropped * the vnode interlock and before it was locked again. * * It is not necessary to recheck VI_DOOMED because it can * only be set by another thread that holds both the vnode * lock and vnode interlock. If another thread has the * vnode lock before we get to VOP_LOCK() and obtains the * vnode interlock after VOP_LOCK() drops the vnode * interlock, the other thread will be unable to drop the * vnode lock before our VOP_LOCK() call fails. */ if (vp->v_usecount || (!vlru_allow_cache_src && !LIST_EMPTY(&(vp)->v_cache_src)) || (vp->v_object != NULL && vp->v_object->resident_page_count > trigger)) { VOP_UNLOCK(vp, LK_INTERLOCK); vdrop(vp); goto next_iter_mntunlocked; } KASSERT((vp->v_iflag & VI_DOOMED) == 0, ("VI_DOOMED unexpectedly detected in vlrureclaim()")); atomic_add_long(&recycles_count, 1); vgonel(vp); VOP_UNLOCK(vp, 0); vdropl(vp); done++; next_iter_mntunlocked: if (!should_yield()) goto relock_mnt; goto yield; next_iter: if (!should_yield()) continue; MNT_IUNLOCK(mp); yield: kern_yield(PRI_USER); relock_mnt: MNT_ILOCK(mp); } MNT_IUNLOCK(mp); vn_finished_write(mp); return done; } /* * Attempt to keep the free list at wantfreevnodes length. */ static void vnlru_free(int count) { struct vnode *vp; mtx_assert(&vnode_free_list_mtx, MA_OWNED); for (; count > 0; count--) { vp = TAILQ_FIRST(&vnode_free_list); /* * The list can be modified while the free_list_mtx * has been dropped and vp could be NULL here. */ if (!vp) break; VNASSERT(vp->v_op != NULL, vp, ("vnlru_free: vnode already reclaimed.")); KASSERT((vp->v_iflag & VI_FREE) != 0, ("Removing vnode not on freelist")); KASSERT((vp->v_iflag & VI_ACTIVE) == 0, ("Mangling active vnode")); TAILQ_REMOVE(&vnode_free_list, vp, v_actfreelist); /* * Don't recycle if we can't get the interlock. */ if (!VI_TRYLOCK(vp)) { TAILQ_INSERT_TAIL(&vnode_free_list, vp, v_actfreelist); continue; } VNASSERT((vp->v_iflag & VI_FREE) != 0 && vp->v_holdcnt == 0, vp, ("vp inconsistent on freelist")); /* * The clear of VI_FREE prevents activation of the * vnode. There is no sense in putting the vnode on * the mount point active list, only to remove it * later during recycling. Inline the relevant part * of vholdl(), to avoid triggering assertions or * activating. */ freevnodes--; vp->v_iflag &= ~VI_FREE; vp->v_holdcnt++; mtx_unlock(&vnode_free_list_mtx); VI_UNLOCK(vp); vtryrecycle(vp); /* * If the recycled succeeded this vdrop will actually free * the vnode. If not it will simply place it back on * the free list. */ vdrop(vp); mtx_lock(&vnode_free_list_mtx); } } /* * Attempt to recycle vnodes in a context that is always safe to block. * Calling vlrurecycle() from the bowels of filesystem code has some * interesting deadlock problems. */ static struct proc *vnlruproc; static int vnlruproc_sig; static void vnlru_proc(void) { struct mount *mp, *nmp; int done; struct proc *p = vnlruproc; EVENTHANDLER_REGISTER(shutdown_pre_sync, kproc_shutdown, p, SHUTDOWN_PRI_FIRST); for (;;) { kproc_suspend_check(p); mtx_lock(&vnode_free_list_mtx); if (freevnodes > wantfreevnodes) vnlru_free(freevnodes - wantfreevnodes); if (numvnodes <= desiredvnodes * 9 / 10) { vnlruproc_sig = 0; wakeup(&vnlruproc_sig); msleep(vnlruproc, &vnode_free_list_mtx, PVFS|PDROP, "vlruwt", hz); continue; } mtx_unlock(&vnode_free_list_mtx); done = 0; mtx_lock(&mountlist_mtx); for (mp = TAILQ_FIRST(&mountlist); mp != NULL; mp = nmp) { if (vfs_busy(mp, MBF_NOWAIT | MBF_MNTLSTLOCK)) { nmp = TAILQ_NEXT(mp, mnt_list); continue; } done += vlrureclaim(mp); mtx_lock(&mountlist_mtx); nmp = TAILQ_NEXT(mp, mnt_list); vfs_unbusy(mp); } mtx_unlock(&mountlist_mtx); if (done == 0) { #if 0 /* These messages are temporary debugging aids */ if (vnlru_nowhere < 5) printf("vnlru process getting nowhere..\n"); else if (vnlru_nowhere == 5) printf("vnlru process messages stopped.\n"); #endif vnlru_nowhere++; tsleep(vnlruproc, PPAUSE, "vlrup", hz * 3); } else kern_yield(PRI_USER); } } static struct kproc_desc vnlru_kp = { "vnlru", vnlru_proc, &vnlruproc }; SYSINIT(vnlru, SI_SUB_KTHREAD_UPDATE, SI_ORDER_FIRST, kproc_start, &vnlru_kp); /* * Routines having to do with the management of the vnode table. */ /* * Try to recycle a freed vnode. We abort if anyone picks up a reference * before we actually vgone(). This function must be called with the vnode * held to prevent the vnode from being returned to the free list midway * through vgone(). */ static int vtryrecycle(struct vnode *vp) { struct mount *vnmp; CTR2(KTR_VFS, "%s: vp %p", __func__, vp); VNASSERT(vp->v_holdcnt, vp, ("vtryrecycle: Recycling vp %p without a reference.", vp)); /* * This vnode may found and locked via some other list, if so we * can't recycle it yet. */ if (VOP_LOCK(vp, LK_EXCLUSIVE | LK_NOWAIT) != 0) { CTR2(KTR_VFS, "%s: impossible to recycle, vp %p lock is already held", __func__, vp); return (EWOULDBLOCK); } /* * Don't recycle if its filesystem is being suspended. */ if (vn_start_write(vp, &vnmp, V_NOWAIT) != 0) { VOP_UNLOCK(vp, 0); CTR2(KTR_VFS, "%s: impossible to recycle, cannot start the write for %p", __func__, vp); return (EBUSY); } /* * If we got this far, we need to acquire the interlock and see if * anyone picked up this vnode from another list. If not, we will * mark it with DOOMED via vgonel() so that anyone who does find it * will skip over it. */ VI_LOCK(vp); if (vp->v_usecount) { VOP_UNLOCK(vp, LK_INTERLOCK); vn_finished_write(vnmp); CTR2(KTR_VFS, "%s: impossible to recycle, %p is already referenced", __func__, vp); return (EBUSY); } if ((vp->v_iflag & VI_DOOMED) == 0) { atomic_add_long(&recycles_count, 1); vgonel(vp); } VOP_UNLOCK(vp, LK_INTERLOCK); vn_finished_write(vnmp); return (0); } /* * Wait for available vnodes. */ static int getnewvnode_wait(int suspended) { mtx_assert(&vnode_free_list_mtx, MA_OWNED); if (numvnodes > desiredvnodes) { if (suspended) { /* * File system is beeing suspended, we cannot risk a * deadlock here, so allocate new vnode anyway. */ if (freevnodes > wantfreevnodes) vnlru_free(freevnodes - wantfreevnodes); return (0); } if (vnlruproc_sig == 0) { vnlruproc_sig = 1; /* avoid unnecessary wakeups */ wakeup(vnlruproc); } msleep(&vnlruproc_sig, &vnode_free_list_mtx, PVFS, "vlruwk", hz); } return (numvnodes > desiredvnodes ? ENFILE : 0); } void getnewvnode_reserve(u_int count) { struct thread *td; td = curthread; /* First try to be quick and racy. */ if (atomic_fetchadd_long(&numvnodes, count) + count <= desiredvnodes) { td->td_vp_reserv += count; return; } else atomic_subtract_long(&numvnodes, count); mtx_lock(&vnode_free_list_mtx); while (count > 0) { if (getnewvnode_wait(0) == 0) { count--; td->td_vp_reserv++; atomic_add_long(&numvnodes, 1); } } mtx_unlock(&vnode_free_list_mtx); } void getnewvnode_drop_reserve(void) { struct thread *td; td = curthread; atomic_subtract_long(&numvnodes, td->td_vp_reserv); td->td_vp_reserv = 0; } /* * Return the next vnode from the free list. */ int getnewvnode(const char *tag, struct mount *mp, struct vop_vector *vops, struct vnode **vpp) { struct vnode *vp; struct bufobj *bo; struct thread *td; int error; CTR3(KTR_VFS, "%s: mp %p with tag %s", __func__, mp, tag); vp = NULL; td = curthread; if (td->td_vp_reserv > 0) { td->td_vp_reserv -= 1; goto alloc; } mtx_lock(&vnode_free_list_mtx); /* * Lend our context to reclaim vnodes if they've exceeded the max. */ if (freevnodes > wantfreevnodes) vnlru_free(1); error = getnewvnode_wait(mp != NULL && (mp->mnt_kern_flag & MNTK_SUSPEND)); #if 0 /* XXX Not all VFS_VGET/ffs_vget callers check returns. */ if (error != 0) { mtx_unlock(&vnode_free_list_mtx); return (error); } #endif atomic_add_long(&numvnodes, 1); mtx_unlock(&vnode_free_list_mtx); alloc: atomic_add_long(&vnodes_created, 1); vp = (struct vnode *) uma_zalloc(vnode_zone, M_WAITOK|M_ZERO); /* * Setup locks. */ vp->v_vnlock = &vp->v_lock; mtx_init(&vp->v_interlock, "vnode interlock", NULL, MTX_DEF); /* * By default, don't allow shared locks unless filesystems * opt-in. */ lockinit(vp->v_vnlock, PVFS, tag, VLKTIMEOUT, LK_NOSHARE | LK_IS_VNODE); /* * Initialize bufobj. */ bo = &vp->v_bufobj; bo->__bo_vnode = vp; rw_init(BO_LOCKPTR(bo), "bufobj interlock"); bo->bo_ops = &buf_ops_bio; bo->bo_private = vp; TAILQ_INIT(&bo->bo_clean.bv_hd); TAILQ_INIT(&bo->bo_dirty.bv_hd); /* * Initialize namecache. */ LIST_INIT(&vp->v_cache_src); TAILQ_INIT(&vp->v_cache_dst); /* * Finalize various vnode identity bits. */ vp->v_type = VNON; vp->v_tag = tag; vp->v_op = vops; v_incr_usecount(vp); vp->v_data = NULL; #ifdef MAC mac_vnode_init(vp); if (mp != NULL && (mp->mnt_flag & MNT_MULTILABEL) == 0) mac_vnode_associate_singlelabel(mp, vp); else if (mp == NULL && vops != &dead_vnodeops) printf("NULL mp in getnewvnode()\n"); #endif if (mp != NULL) { bo->bo_bsize = mp->mnt_stat.f_iosize; if ((mp->mnt_kern_flag & MNTK_NOKNOTE) != 0) vp->v_vflag |= VV_NOKNOTE; } rangelock_init(&vp->v_rl); /* * For the filesystems which do not use vfs_hash_insert(), * still initialize v_hash to have vfs_hash_index() useful. * E.g., nullfs uses vfs_hash_index() on the lower vnode for * its own hashing. */ vp->v_hash = (uintptr_t)vp >> vnsz2log; *vpp = vp; return (0); } /* * Delete from old mount point vnode list, if on one. */ static void delmntque(struct vnode *vp) { struct mount *mp; int active; mp = vp->v_mount; if (mp == NULL) return; MNT_ILOCK(mp); VI_LOCK(vp); KASSERT(mp->mnt_activevnodelistsize <= mp->mnt_nvnodelistsize, ("Active vnode list size %d > Vnode list size %d", mp->mnt_activevnodelistsize, mp->mnt_nvnodelistsize)); active = vp->v_iflag & VI_ACTIVE; vp->v_iflag &= ~VI_ACTIVE; if (active) { mtx_lock(&vnode_free_list_mtx); TAILQ_REMOVE(&mp->mnt_activevnodelist, vp, v_actfreelist); mp->mnt_activevnodelistsize--; mtx_unlock(&vnode_free_list_mtx); } vp->v_mount = NULL; VI_UNLOCK(vp); VNASSERT(mp->mnt_nvnodelistsize > 0, vp, ("bad mount point vnode list size")); TAILQ_REMOVE(&mp->mnt_nvnodelist, vp, v_nmntvnodes); mp->mnt_nvnodelistsize--; MNT_REL(mp); MNT_IUNLOCK(mp); } static void insmntque_stddtr(struct vnode *vp, void *dtr_arg) { vp->v_data = NULL; vp->v_op = &dead_vnodeops; vgone(vp); vput(vp); } /* * Insert into list of vnodes for the new mount point, if available. */ int insmntque1(struct vnode *vp, struct mount *mp, void (*dtr)(struct vnode *, void *), void *dtr_arg) { KASSERT(vp->v_mount == NULL, ("insmntque: vnode already on per mount vnode list")); VNASSERT(mp != NULL, vp, ("Don't call insmntque(foo, NULL)")); ASSERT_VOP_ELOCKED(vp, "insmntque: non-locked vp"); /* * We acquire the vnode interlock early to ensure that the * vnode cannot be recycled by another process releasing a * holdcnt on it before we get it on both the vnode list * and the active vnode list. The mount mutex protects only * manipulation of the vnode list and the vnode freelist * mutex protects only manipulation of the active vnode list. * Hence the need to hold the vnode interlock throughout. */ MNT_ILOCK(mp); VI_LOCK(vp); if (((mp->mnt_kern_flag & MNTK_NOINSMNTQ) != 0 && ((mp->mnt_kern_flag & MNTK_UNMOUNTF) != 0 || mp->mnt_nvnodelistsize == 0)) && (vp->v_vflag & VV_FORCEINSMQ) == 0) { VI_UNLOCK(vp); MNT_IUNLOCK(mp); if (dtr != NULL) dtr(vp, dtr_arg); return (EBUSY); } vp->v_mount = mp; MNT_REF(mp); TAILQ_INSERT_TAIL(&mp->mnt_nvnodelist, vp, v_nmntvnodes); VNASSERT(mp->mnt_nvnodelistsize >= 0, vp, ("neg mount point vnode list size")); mp->mnt_nvnodelistsize++; KASSERT((vp->v_iflag & VI_ACTIVE) == 0, ("Activating already active vnode")); vp->v_iflag |= VI_ACTIVE; mtx_lock(&vnode_free_list_mtx); TAILQ_INSERT_HEAD(&mp->mnt_activevnodelist, vp, v_actfreelist); mp->mnt_activevnodelistsize++; mtx_unlock(&vnode_free_list_mtx); VI_UNLOCK(vp); MNT_IUNLOCK(mp); return (0); } int insmntque(struct vnode *vp, struct mount *mp) { return (insmntque1(vp, mp, insmntque_stddtr, NULL)); } /* * Flush out and invalidate all buffers associated with a bufobj * Called with the underlying object locked. */ int bufobj_invalbuf(struct bufobj *bo, int flags, int slpflag, int slptimeo) { int error; BO_LOCK(bo); if (flags & V_SAVE) { error = bufobj_wwait(bo, slpflag, slptimeo); if (error) { BO_UNLOCK(bo); return (error); } if (bo->bo_dirty.bv_cnt > 0) { BO_UNLOCK(bo); if ((error = BO_SYNC(bo, MNT_WAIT)) != 0) return (error); /* * XXX We could save a lock/unlock if this was only * enabled under INVARIANTS */ BO_LOCK(bo); if (bo->bo_numoutput > 0 || bo->bo_dirty.bv_cnt > 0) panic("vinvalbuf: dirty bufs"); } } /* * If you alter this loop please notice that interlock is dropped and * reacquired in flushbuflist. Special care is needed to ensure that * no race conditions occur from this. */ do { error = flushbuflist(&bo->bo_clean, flags, bo, slpflag, slptimeo); if (error == 0 && !(flags & V_CLEANONLY)) error = flushbuflist(&bo->bo_dirty, flags, bo, slpflag, slptimeo); if (error != 0 && error != EAGAIN) { BO_UNLOCK(bo); return (error); } } while (error != 0); /* * Wait for I/O to complete. XXX needs cleaning up. The vnode can * have write I/O in-progress but if there is a VM object then the * VM object can also have read-I/O in-progress. */ do { bufobj_wwait(bo, 0, 0); BO_UNLOCK(bo); if (bo->bo_object != NULL) { VM_OBJECT_WLOCK(bo->bo_object); vm_object_pip_wait(bo->bo_object, "bovlbx"); VM_OBJECT_WUNLOCK(bo->bo_object); } BO_LOCK(bo); } while (bo->bo_numoutput > 0); BO_UNLOCK(bo); /* * Destroy the copy in the VM cache, too. */ if (bo->bo_object != NULL && (flags & (V_ALT | V_NORMAL | V_CLEANONLY)) == 0) { VM_OBJECT_WLOCK(bo->bo_object); vm_object_page_remove(bo->bo_object, 0, 0, (flags & V_SAVE) ? OBJPR_CLEANONLY : 0); VM_OBJECT_WUNLOCK(bo->bo_object); } #ifdef INVARIANTS BO_LOCK(bo); if ((flags & (V_ALT | V_NORMAL | V_CLEANONLY)) == 0 && (bo->bo_dirty.bv_cnt > 0 || bo->bo_clean.bv_cnt > 0)) panic("vinvalbuf: flush failed"); BO_UNLOCK(bo); #endif return (0); } /* * Flush out and invalidate all buffers associated with a vnode. * Called with the underlying object locked. */ int vinvalbuf(struct vnode *vp, int flags, int slpflag, int slptimeo) { CTR3(KTR_VFS, "%s: vp %p with flags %d", __func__, vp, flags); ASSERT_VOP_LOCKED(vp, "vinvalbuf"); if (vp->v_object != NULL && vp->v_object->handle != vp) return (0); return (bufobj_invalbuf(&vp->v_bufobj, flags, slpflag, slptimeo)); } /* * Flush out buffers on the specified list. * */ static int flushbuflist(struct bufv *bufv, int flags, struct bufobj *bo, int slpflag, int slptimeo) { struct buf *bp, *nbp; int retval, error; daddr_t lblkno; b_xflags_t xflags; ASSERT_BO_WLOCKED(bo); retval = 0; TAILQ_FOREACH_SAFE(bp, &bufv->bv_hd, b_bobufs, nbp) { if (((flags & V_NORMAL) && (bp->b_xflags & BX_ALTDATA)) || ((flags & V_ALT) && (bp->b_xflags & BX_ALTDATA) == 0)) { continue; } lblkno = 0; xflags = 0; if (nbp != NULL) { lblkno = nbp->b_lblkno; xflags = nbp->b_xflags & (BX_VNDIRTY | BX_VNCLEAN); } retval = EAGAIN; error = BUF_TIMELOCK(bp, LK_EXCLUSIVE | LK_SLEEPFAIL | LK_INTERLOCK, BO_LOCKPTR(bo), "flushbuf", slpflag, slptimeo); if (error) { BO_LOCK(bo); return (error != ENOLCK ? error : EAGAIN); } KASSERT(bp->b_bufobj == bo, ("bp %p wrong b_bufobj %p should be %p", bp, bp->b_bufobj, bo)); if (bp->b_bufobj != bo) { /* XXX: necessary ? */ BUF_UNLOCK(bp); BO_LOCK(bo); return (EAGAIN); } /* * XXX Since there are no node locks for NFS, I * believe there is a slight chance that a delayed * write will occur while sleeping just above, so * check for it. */ if (((bp->b_flags & (B_DELWRI | B_INVAL)) == B_DELWRI) && (flags & V_SAVE)) { bremfree(bp); bp->b_flags |= B_ASYNC; bwrite(bp); BO_LOCK(bo); return (EAGAIN); /* XXX: why not loop ? */ } bremfree(bp); bp->b_flags |= (B_INVAL | B_RELBUF); bp->b_flags &= ~B_ASYNC; brelse(bp); BO_LOCK(bo); if (nbp != NULL && (nbp->b_bufobj != bo || nbp->b_lblkno != lblkno || (nbp->b_xflags & (BX_VNDIRTY | BX_VNCLEAN)) != xflags)) break; /* nbp invalid */ } return (retval); } /* * Truncate a file's buffer and pages to a specified length. This * is in lieu of the old vinvalbuf mechanism, which performed unneeded * sync activity. */ int vtruncbuf(struct vnode *vp, struct ucred *cred, off_t length, int blksize) { struct buf *bp, *nbp; int anyfreed; int trunclbn; struct bufobj *bo; CTR5(KTR_VFS, "%s: vp %p with cred %p and block %d:%ju", __func__, vp, cred, blksize, (uintmax_t)length); /* * Round up to the *next* lbn. */ trunclbn = (length + blksize - 1) / blksize; ASSERT_VOP_LOCKED(vp, "vtruncbuf"); restart: bo = &vp->v_bufobj; BO_LOCK(bo); anyfreed = 1; for (;anyfreed;) { anyfreed = 0; TAILQ_FOREACH_SAFE(bp, &bo->bo_clean.bv_hd, b_bobufs, nbp) { if (bp->b_lblkno < trunclbn) continue; if (BUF_LOCK(bp, LK_EXCLUSIVE | LK_SLEEPFAIL | LK_INTERLOCK, BO_LOCKPTR(bo)) == ENOLCK) goto restart; bremfree(bp); bp->b_flags |= (B_INVAL | B_RELBUF); bp->b_flags &= ~B_ASYNC; brelse(bp); anyfreed = 1; BO_LOCK(bo); if (nbp != NULL && (((nbp->b_xflags & BX_VNCLEAN) == 0) || (nbp->b_vp != vp) || (nbp->b_flags & B_DELWRI))) { BO_UNLOCK(bo); goto restart; } } TAILQ_FOREACH_SAFE(bp, &bo->bo_dirty.bv_hd, b_bobufs, nbp) { if (bp->b_lblkno < trunclbn) continue; if (BUF_LOCK(bp, LK_EXCLUSIVE | LK_SLEEPFAIL | LK_INTERLOCK, BO_LOCKPTR(bo)) == ENOLCK) goto restart; bremfree(bp); bp->b_flags |= (B_INVAL | B_RELBUF); bp->b_flags &= ~B_ASYNC; brelse(bp); anyfreed = 1; BO_LOCK(bo); if (nbp != NULL && (((nbp->b_xflags & BX_VNDIRTY) == 0) || (nbp->b_vp != vp) || (nbp->b_flags & B_DELWRI) == 0)) { BO_UNLOCK(bo); goto restart; } } } if (length > 0) { restartsync: TAILQ_FOREACH_SAFE(bp, &bo->bo_dirty.bv_hd, b_bobufs, nbp) { if (bp->b_lblkno > 0) continue; /* * Since we hold the vnode lock this should only * fail if we're racing with the buf daemon. */ if (BUF_LOCK(bp, LK_EXCLUSIVE | LK_SLEEPFAIL | LK_INTERLOCK, BO_LOCKPTR(bo)) == ENOLCK) { goto restart; } VNASSERT((bp->b_flags & B_DELWRI), vp, ("buf(%p) on dirty queue without DELWRI", bp)); bremfree(bp); bawrite(bp); BO_LOCK(bo); goto restartsync; } } bufobj_wwait(bo, 0, 0); BO_UNLOCK(bo); vnode_pager_setsize(vp, length); return (0); } static void buf_vlist_remove(struct buf *bp) { struct bufv *bv; KASSERT(bp->b_bufobj != NULL, ("No b_bufobj %p", bp)); ASSERT_BO_WLOCKED(bp->b_bufobj); KASSERT((bp->b_xflags & (BX_VNDIRTY|BX_VNCLEAN)) != (BX_VNDIRTY|BX_VNCLEAN), ("buf_vlist_remove: Buf %p is on two lists", bp)); if (bp->b_xflags & BX_VNDIRTY) bv = &bp->b_bufobj->bo_dirty; else bv = &bp->b_bufobj->bo_clean; BUF_PCTRIE_REMOVE(&bv->bv_root, bp->b_lblkno); TAILQ_REMOVE(&bv->bv_hd, bp, b_bobufs); bv->bv_cnt--; bp->b_xflags &= ~(BX_VNDIRTY | BX_VNCLEAN); } /* * Add the buffer to the sorted clean or dirty block list. * * NOTE: xflags is passed as a constant, optimizing this inline function! */ static void buf_vlist_add(struct buf *bp, struct bufobj *bo, b_xflags_t xflags) { struct bufv *bv; struct buf *n; int error; ASSERT_BO_WLOCKED(bo); KASSERT((bo->bo_flag & BO_DEAD) == 0, ("dead bo %p", bo)); KASSERT((bp->b_xflags & (BX_VNDIRTY|BX_VNCLEAN)) == 0, ("buf_vlist_add: Buf %p has existing xflags %d", bp, bp->b_xflags)); bp->b_xflags |= xflags; if (xflags & BX_VNDIRTY) bv = &bo->bo_dirty; else bv = &bo->bo_clean; /* * Keep the list ordered. Optimize empty list insertion. Assume * we tend to grow at the tail so lookup_le should usually be cheaper * than _ge. */ if (bv->bv_cnt == 0 || bp->b_lblkno > TAILQ_LAST(&bv->bv_hd, buflists)->b_lblkno) TAILQ_INSERT_TAIL(&bv->bv_hd, bp, b_bobufs); else if ((n = BUF_PCTRIE_LOOKUP_LE(&bv->bv_root, bp->b_lblkno)) == NULL) TAILQ_INSERT_HEAD(&bv->bv_hd, bp, b_bobufs); else TAILQ_INSERT_AFTER(&bv->bv_hd, n, bp, b_bobufs); error = BUF_PCTRIE_INSERT(&bv->bv_root, bp); if (error) panic("buf_vlist_add: Preallocated nodes insufficient."); bv->bv_cnt++; } /* * Lookup a buffer using the splay tree. Note that we specifically avoid * shadow buffers used in background bitmap writes. * * This code isn't quite efficient as it could be because we are maintaining * two sorted lists and do not know which list the block resides in. * * During a "make buildworld" the desired buffer is found at one of * the roots more than 60% of the time. Thus, checking both roots * before performing either splay eliminates unnecessary splays on the * first tree splayed. */ struct buf * gbincore(struct bufobj *bo, daddr_t lblkno) { struct buf *bp; ASSERT_BO_LOCKED(bo); bp = BUF_PCTRIE_LOOKUP(&bo->bo_clean.bv_root, lblkno); if (bp != NULL) return (bp); return BUF_PCTRIE_LOOKUP(&bo->bo_dirty.bv_root, lblkno); } /* * Associate a buffer with a vnode. */ void bgetvp(struct vnode *vp, struct buf *bp) { struct bufobj *bo; bo = &vp->v_bufobj; ASSERT_BO_WLOCKED(bo); VNASSERT(bp->b_vp == NULL, bp->b_vp, ("bgetvp: not free")); CTR3(KTR_BUF, "bgetvp(%p) vp %p flags %X", bp, vp, bp->b_flags); VNASSERT((bp->b_xflags & (BX_VNDIRTY|BX_VNCLEAN)) == 0, vp, ("bgetvp: bp already attached! %p", bp)); vhold(vp); bp->b_vp = vp; bp->b_bufobj = bo; /* * Insert onto list for new vnode. */ buf_vlist_add(bp, bo, BX_VNCLEAN); } /* * Disassociate a buffer from a vnode. */ void brelvp(struct buf *bp) { struct bufobj *bo; struct vnode *vp; CTR3(KTR_BUF, "brelvp(%p) vp %p flags %X", bp, bp->b_vp, bp->b_flags); KASSERT(bp->b_vp != NULL, ("brelvp: NULL")); /* * Delete from old vnode list, if on one. */ vp = bp->b_vp; /* XXX */ bo = bp->b_bufobj; BO_LOCK(bo); if (bp->b_xflags & (BX_VNDIRTY | BX_VNCLEAN)) buf_vlist_remove(bp); else panic("brelvp: Buffer %p not on queue.", bp); if ((bo->bo_flag & BO_ONWORKLST) && bo->bo_dirty.bv_cnt == 0) { bo->bo_flag &= ~BO_ONWORKLST; mtx_lock(&sync_mtx); LIST_REMOVE(bo, bo_synclist); syncer_worklist_len--; mtx_unlock(&sync_mtx); } bp->b_vp = NULL; bp->b_bufobj = NULL; BO_UNLOCK(bo); vdrop(vp); } /* * Add an item to the syncer work queue. */ static void vn_syncer_add_to_worklist(struct bufobj *bo, int delay) { int slot; ASSERT_BO_WLOCKED(bo); mtx_lock(&sync_mtx); if (bo->bo_flag & BO_ONWORKLST) LIST_REMOVE(bo, bo_synclist); else { bo->bo_flag |= BO_ONWORKLST; syncer_worklist_len++; } if (delay > syncer_maxdelay - 2) delay = syncer_maxdelay - 2; slot = (syncer_delayno + delay) & syncer_mask; LIST_INSERT_HEAD(&syncer_workitem_pending[slot], bo, bo_synclist); mtx_unlock(&sync_mtx); } static int sysctl_vfs_worklist_len(SYSCTL_HANDLER_ARGS) { int error, len; mtx_lock(&sync_mtx); len = syncer_worklist_len - sync_vnode_count; mtx_unlock(&sync_mtx); error = SYSCTL_OUT(req, &len, sizeof(len)); return (error); } SYSCTL_PROC(_vfs, OID_AUTO, worklist_len, CTLTYPE_INT | CTLFLAG_RD, NULL, 0, sysctl_vfs_worklist_len, "I", "Syncer thread worklist length"); static struct proc *updateproc; static void sched_sync(void); static struct kproc_desc up_kp = { "syncer", sched_sync, &updateproc }; SYSINIT(syncer, SI_SUB_KTHREAD_UPDATE, SI_ORDER_FIRST, kproc_start, &up_kp); static int sync_vnode(struct synclist *slp, struct bufobj **bo, struct thread *td) { struct vnode *vp; struct mount *mp; *bo = LIST_FIRST(slp); if (*bo == NULL) return (0); vp = (*bo)->__bo_vnode; /* XXX */ if (VOP_ISLOCKED(vp) != 0 || VI_TRYLOCK(vp) == 0) return (1); /* * We use vhold in case the vnode does not * successfully sync. vhold prevents the vnode from * going away when we unlock the sync_mtx so that * we can acquire the vnode interlock. */ vholdl(vp); mtx_unlock(&sync_mtx); VI_UNLOCK(vp); if (vn_start_write(vp, &mp, V_NOWAIT) != 0) { vdrop(vp); mtx_lock(&sync_mtx); return (*bo == LIST_FIRST(slp)); } vn_lock(vp, LK_EXCLUSIVE | LK_RETRY); (void) VOP_FSYNC(vp, MNT_LAZY, td); VOP_UNLOCK(vp, 0); vn_finished_write(mp); BO_LOCK(*bo); if (((*bo)->bo_flag & BO_ONWORKLST) != 0) { /* * Put us back on the worklist. The worklist * routine will remove us from our current * position and then add us back in at a later * position. */ vn_syncer_add_to_worklist(*bo, syncdelay); } BO_UNLOCK(*bo); vdrop(vp); mtx_lock(&sync_mtx); return (0); } static int first_printf = 1; /* * System filesystem synchronizer daemon. */ static void sched_sync(void) { struct synclist *next, *slp; struct bufobj *bo; long starttime; struct thread *td = curthread; int last_work_seen; int net_worklist_len; int syncer_final_iter; int error; last_work_seen = 0; syncer_final_iter = 0; syncer_state = SYNCER_RUNNING; starttime = time_uptime; td->td_pflags |= TDP_NORUNNINGBUF; EVENTHANDLER_REGISTER(shutdown_pre_sync, syncer_shutdown, td->td_proc, SHUTDOWN_PRI_LAST); mtx_lock(&sync_mtx); for (;;) { if (syncer_state == SYNCER_FINAL_DELAY && syncer_final_iter == 0) { mtx_unlock(&sync_mtx); kproc_suspend_check(td->td_proc); mtx_lock(&sync_mtx); } net_worklist_len = syncer_worklist_len - sync_vnode_count; if (syncer_state != SYNCER_RUNNING && starttime != time_uptime) { if (first_printf) { printf("\nSyncing disks, vnodes remaining..."); first_printf = 0; } printf("%d ", net_worklist_len); } starttime = time_uptime; /* * Push files whose dirty time has expired. Be careful * of interrupt race on slp queue. * * Skip over empty worklist slots when shutting down. */ do { slp = &syncer_workitem_pending[syncer_delayno]; syncer_delayno += 1; if (syncer_delayno == syncer_maxdelay) syncer_delayno = 0; next = &syncer_workitem_pending[syncer_delayno]; /* * If the worklist has wrapped since the * it was emptied of all but syncer vnodes, * switch to the FINAL_DELAY state and run * for one more second. */ if (syncer_state == SYNCER_SHUTTING_DOWN && net_worklist_len == 0 && last_work_seen == syncer_delayno) { syncer_state = SYNCER_FINAL_DELAY; syncer_final_iter = SYNCER_SHUTDOWN_SPEEDUP; } } while (syncer_state != SYNCER_RUNNING && LIST_EMPTY(slp) && syncer_worklist_len > 0); /* * Keep track of the last time there was anything * on the worklist other than syncer vnodes. * Return to the SHUTTING_DOWN state if any * new work appears. */ if (net_worklist_len > 0 || syncer_state == SYNCER_RUNNING) last_work_seen = syncer_delayno; if (net_worklist_len > 0 && syncer_state == SYNCER_FINAL_DELAY) syncer_state = SYNCER_SHUTTING_DOWN; while (!LIST_EMPTY(slp)) { error = sync_vnode(slp, &bo, td); if (error == 1) { LIST_REMOVE(bo, bo_synclist); LIST_INSERT_HEAD(next, bo, bo_synclist); continue; } if (first_printf == 0) { /* * Drop the sync mutex, because some watchdog * drivers need to sleep while patting */ mtx_unlock(&sync_mtx); wdog_kern_pat(WD_LASTVAL); mtx_lock(&sync_mtx); } } if (syncer_state == SYNCER_FINAL_DELAY && syncer_final_iter > 0) syncer_final_iter--; /* * The variable rushjob allows the kernel to speed up the * processing of the filesystem syncer process. A rushjob * value of N tells the filesystem syncer to process the next * N seconds worth of work on its queue ASAP. Currently rushjob * is used by the soft update code to speed up the filesystem * syncer process when the incore state is getting so far * ahead of the disk that the kernel memory pool is being * threatened with exhaustion. */ if (rushjob > 0) { rushjob -= 1; continue; } /* * Just sleep for a short period of time between * iterations when shutting down to allow some I/O * to happen. * * If it has taken us less than a second to process the * current work, then wait. Otherwise start right over * again. We can still lose time if any single round * takes more than two seconds, but it does not really * matter as we are just trying to generally pace the * filesystem activity. */ if (syncer_state != SYNCER_RUNNING || time_uptime == starttime) { thread_lock(td); sched_prio(td, PPAUSE); thread_unlock(td); } if (syncer_state != SYNCER_RUNNING) cv_timedwait(&sync_wakeup, &sync_mtx, hz / SYNCER_SHUTDOWN_SPEEDUP); else if (time_uptime == starttime) cv_timedwait(&sync_wakeup, &sync_mtx, hz); } } /* * Request the syncer daemon to speed up its work. * We never push it to speed up more than half of its * normal turn time, otherwise it could take over the cpu. */ int speedup_syncer(void) { int ret = 0; mtx_lock(&sync_mtx); if (rushjob < syncdelay / 2) { rushjob += 1; stat_rush_requests += 1; ret = 1; } mtx_unlock(&sync_mtx); cv_broadcast(&sync_wakeup); return (ret); } /* * Tell the syncer to speed up its work and run though its work * list several times, then tell it to shut down. */ static void syncer_shutdown(void *arg, int howto) { if (howto & RB_NOSYNC) return; mtx_lock(&sync_mtx); syncer_state = SYNCER_SHUTTING_DOWN; rushjob = 0; mtx_unlock(&sync_mtx); cv_broadcast(&sync_wakeup); kproc_shutdown(arg, howto); } void syncer_suspend(void) { syncer_shutdown(updateproc, 0); } void syncer_resume(void) { mtx_lock(&sync_mtx); first_printf = 1; syncer_state = SYNCER_RUNNING; mtx_unlock(&sync_mtx); cv_broadcast(&sync_wakeup); kproc_resume(updateproc); } /* * Reassign a buffer from one vnode to another. * Used to assign file specific control information * (indirect blocks) to the vnode to which they belong. */ void reassignbuf(struct buf *bp) { struct vnode *vp; struct bufobj *bo; int delay; #ifdef INVARIANTS struct bufv *bv; #endif vp = bp->b_vp; bo = bp->b_bufobj; ++reassignbufcalls; CTR3(KTR_BUF, "reassignbuf(%p) vp %p flags %X", bp, bp->b_vp, bp->b_flags); /* * B_PAGING flagged buffers cannot be reassigned because their vp * is not fully linked in. */ if (bp->b_flags & B_PAGING) panic("cannot reassign paging buffer"); /* * Delete from old vnode list, if on one. */ BO_LOCK(bo); if (bp->b_xflags & (BX_VNDIRTY | BX_VNCLEAN)) buf_vlist_remove(bp); else panic("reassignbuf: Buffer %p not on queue.", bp); /* * If dirty, put on list of dirty buffers; otherwise insert onto list * of clean buffers. */ if (bp->b_flags & B_DELWRI) { if ((bo->bo_flag & BO_ONWORKLST) == 0) { switch (vp->v_type) { case VDIR: delay = dirdelay; break; case VCHR: delay = metadelay; break; default: delay = filedelay; } vn_syncer_add_to_worklist(bo, delay); } buf_vlist_add(bp, bo, BX_VNDIRTY); } else { buf_vlist_add(bp, bo, BX_VNCLEAN); if ((bo->bo_flag & BO_ONWORKLST) && bo->bo_dirty.bv_cnt == 0) { mtx_lock(&sync_mtx); LIST_REMOVE(bo, bo_synclist); syncer_worklist_len--; mtx_unlock(&sync_mtx); bo->bo_flag &= ~BO_ONWORKLST; } } #ifdef INVARIANTS bv = &bo->bo_clean; bp = TAILQ_FIRST(&bv->bv_hd); KASSERT(bp == NULL || bp->b_bufobj == bo, ("bp %p wrong b_bufobj %p should be %p", bp, bp->b_bufobj, bo)); bp = TAILQ_LAST(&bv->bv_hd, buflists); KASSERT(bp == NULL || bp->b_bufobj == bo, ("bp %p wrong b_bufobj %p should be %p", bp, bp->b_bufobj, bo)); bv = &bo->bo_dirty; bp = TAILQ_FIRST(&bv->bv_hd); KASSERT(bp == NULL || bp->b_bufobj == bo, ("bp %p wrong b_bufobj %p should be %p", bp, bp->b_bufobj, bo)); bp = TAILQ_LAST(&bv->bv_hd, buflists); KASSERT(bp == NULL || bp->b_bufobj == bo, ("bp %p wrong b_bufobj %p should be %p", bp, bp->b_bufobj, bo)); #endif BO_UNLOCK(bo); } /* * Increment the use and hold counts on the vnode, taking care to reference * the driver's usecount if this is a chardev. The vholdl() will remove * the vnode from the free list if it is presently free. Requires the * vnode interlock and returns with it held. */ static void v_incr_usecount(struct vnode *vp) { CTR2(KTR_VFS, "%s: vp %p", __func__, vp); vholdl(vp); vp->v_usecount++; if (vp->v_type == VCHR && vp->v_rdev != NULL) { dev_lock(); vp->v_rdev->si_usecount++; dev_unlock(); } } /* * Turn a holdcnt into a use+holdcnt such that only one call to * v_decr_usecount is needed. */ static void v_upgrade_usecount(struct vnode *vp) { CTR2(KTR_VFS, "%s: vp %p", __func__, vp); vp->v_usecount++; if (vp->v_type == VCHR && vp->v_rdev != NULL) { dev_lock(); vp->v_rdev->si_usecount++; dev_unlock(); } } /* * Decrement the vnode use and hold count along with the driver's usecount * if this is a chardev. The vdropl() below releases the vnode interlock * as it may free the vnode. */ static void v_decr_usecount(struct vnode *vp) { ASSERT_VI_LOCKED(vp, __FUNCTION__); VNASSERT(vp->v_usecount > 0, vp, ("v_decr_usecount: negative usecount")); CTR2(KTR_VFS, "%s: vp %p", __func__, vp); vp->v_usecount--; if (vp->v_type == VCHR && vp->v_rdev != NULL) { dev_lock(); vp->v_rdev->si_usecount--; dev_unlock(); } vdropl(vp); } /* * Decrement only the use count and driver use count. This is intended to * be paired with a follow on vdropl() to release the remaining hold count. * In this way we may vgone() a vnode with a 0 usecount without risk of * having it end up on a free list because the hold count is kept above 0. */ static void v_decr_useonly(struct vnode *vp) { ASSERT_VI_LOCKED(vp, __FUNCTION__); VNASSERT(vp->v_usecount > 0, vp, ("v_decr_useonly: negative usecount")); CTR2(KTR_VFS, "%s: vp %p", __func__, vp); vp->v_usecount--; if (vp->v_type == VCHR && vp->v_rdev != NULL) { dev_lock(); vp->v_rdev->si_usecount--; dev_unlock(); } } /* * Grab a particular vnode from the free list, increment its * reference count and lock it. VI_DOOMED is set if the vnode * is being destroyed. Only callers who specify LK_RETRY will * see doomed vnodes. If inactive processing was delayed in * vput try to do it here. */ int vget(struct vnode *vp, int flags, struct thread *td) { int error; error = 0; VNASSERT((flags & LK_TYPE_MASK) != 0, vp, ("vget: invalid lock operation")); CTR3(KTR_VFS, "%s: vp %p with flags %d", __func__, vp, flags); if ((flags & LK_INTERLOCK) == 0) VI_LOCK(vp); vholdl(vp); if ((error = vn_lock(vp, flags | LK_INTERLOCK)) != 0) { vdrop(vp); CTR2(KTR_VFS, "%s: impossible to lock vnode %p", __func__, vp); return (error); } if (vp->v_iflag & VI_DOOMED && (flags & LK_RETRY) == 0) panic("vget: vn_lock failed to return ENOENT\n"); VI_LOCK(vp); /* Upgrade our holdcnt to a usecount. */ v_upgrade_usecount(vp); /* * We don't guarantee that any particular close will * trigger inactive processing so just make a best effort * here at preventing a reference to a removed file. If * we don't succeed no harm is done. */ if (vp->v_iflag & VI_OWEINACT) { if (VOP_ISLOCKED(vp) == LK_EXCLUSIVE && (flags & LK_NOWAIT) == 0) vinactive(vp, td); vp->v_iflag &= ~VI_OWEINACT; } VI_UNLOCK(vp); return (0); } /* * Increase the reference count of a vnode. */ void vref(struct vnode *vp) { CTR2(KTR_VFS, "%s: vp %p", __func__, vp); VI_LOCK(vp); v_incr_usecount(vp); VI_UNLOCK(vp); } /* * Return reference count of a vnode. * * The results of this call are only guaranteed when some mechanism other * than the VI lock is used to stop other processes from gaining references * to the vnode. This may be the case if the caller holds the only reference. * This is also useful when stale data is acceptable as race conditions may * be accounted for by some other means. */ int vrefcnt(struct vnode *vp) { int usecnt; VI_LOCK(vp); usecnt = vp->v_usecount; VI_UNLOCK(vp); return (usecnt); } #define VPUTX_VRELE 1 #define VPUTX_VPUT 2 #define VPUTX_VUNREF 3 static void vputx(struct vnode *vp, int func) { int error; KASSERT(vp != NULL, ("vputx: null vp")); if (func == VPUTX_VUNREF) ASSERT_VOP_LOCKED(vp, "vunref"); else if (func == VPUTX_VPUT) ASSERT_VOP_LOCKED(vp, "vput"); else KASSERT(func == VPUTX_VRELE, ("vputx: wrong func")); CTR2(KTR_VFS, "%s: vp %p", __func__, vp); VI_LOCK(vp); /* Skip this v_writecount check if we're going to panic below. */ VNASSERT(vp->v_writecount < vp->v_usecount || vp->v_usecount < 1, vp, ("vputx: missed vn_close")); error = 0; if (vp->v_usecount > 1 || ((vp->v_iflag & VI_DOINGINACT) && vp->v_usecount == 1)) { if (func == VPUTX_VPUT) VOP_UNLOCK(vp, 0); v_decr_usecount(vp); return; } if (vp->v_usecount != 1) { vprint("vputx: negative ref count", vp); panic("vputx: negative ref cnt"); } CTR2(KTR_VFS, "%s: return vnode %p to the freelist", __func__, vp); /* * We want to hold the vnode until the inactive finishes to * prevent vgone() races. We drop the use count here and the * hold count below when we're done. */ v_decr_useonly(vp); /* * We must call VOP_INACTIVE with the node locked. Mark * as VI_DOINGINACT to avoid recursion. */ vp->v_iflag |= VI_OWEINACT; switch (func) { case VPUTX_VRELE: error = vn_lock(vp, LK_EXCLUSIVE | LK_INTERLOCK); VI_LOCK(vp); break; case VPUTX_VPUT: if (VOP_ISLOCKED(vp) != LK_EXCLUSIVE) { error = VOP_LOCK(vp, LK_UPGRADE | LK_INTERLOCK | LK_NOWAIT); VI_LOCK(vp); } break; case VPUTX_VUNREF: if (VOP_ISLOCKED(vp) != LK_EXCLUSIVE) { error = VOP_LOCK(vp, LK_TRYUPGRADE | LK_INTERLOCK); VI_LOCK(vp); } break; } if (vp->v_usecount > 0) vp->v_iflag &= ~VI_OWEINACT; if (error == 0) { if (vp->v_iflag & VI_OWEINACT) vinactive(vp, curthread); if (func != VPUTX_VUNREF) VOP_UNLOCK(vp, 0); } vdropl(vp); } /* * Vnode put/release. * If count drops to zero, call inactive routine and return to freelist. */ void vrele(struct vnode *vp) { vputx(vp, VPUTX_VRELE); } /* * Release an already locked vnode. This give the same effects as * unlock+vrele(), but takes less time and avoids releasing and * re-aquiring the lock (as vrele() acquires the lock internally.) */ void vput(struct vnode *vp) { vputx(vp, VPUTX_VPUT); } /* * Release an exclusively locked vnode. Do not unlock the vnode lock. */ void vunref(struct vnode *vp) { vputx(vp, VPUTX_VUNREF); } /* * Somebody doesn't want the vnode recycled. */ void vhold(struct vnode *vp) { VI_LOCK(vp); vholdl(vp); VI_UNLOCK(vp); } /* * Increase the hold count and activate if this is the first reference. */ void vholdl(struct vnode *vp) { struct mount *mp; CTR2(KTR_VFS, "%s: vp %p", __func__, vp); #ifdef INVARIANTS /* getnewvnode() calls v_incr_usecount() without holding interlock. */ if (vp->v_type != VNON || vp->v_data != NULL) { ASSERT_VI_LOCKED(vp, "vholdl"); VNASSERT(vp->v_holdcnt > 0 || (vp->v_iflag & VI_FREE) != 0, vp, ("vholdl: free vnode is held")); } #endif vp->v_holdcnt++; if ((vp->v_iflag & VI_FREE) == 0) return; VNASSERT(vp->v_holdcnt == 1, vp, ("vholdl: wrong hold count")); VNASSERT(vp->v_op != NULL, vp, ("vholdl: vnode already reclaimed.")); /* * Remove a vnode from the free list, mark it as in use, * and put it on the active list. */ mtx_lock(&vnode_free_list_mtx); TAILQ_REMOVE(&vnode_free_list, vp, v_actfreelist); freevnodes--; vp->v_iflag &= ~(VI_FREE|VI_AGE); KASSERT((vp->v_iflag & VI_ACTIVE) == 0, ("Activating already active vnode")); vp->v_iflag |= VI_ACTIVE; mp = vp->v_mount; TAILQ_INSERT_HEAD(&mp->mnt_activevnodelist, vp, v_actfreelist); mp->mnt_activevnodelistsize++; mtx_unlock(&vnode_free_list_mtx); } /* * Note that there is one less who cares about this vnode. * vdrop() is the opposite of vhold(). */ void vdrop(struct vnode *vp) { VI_LOCK(vp); vdropl(vp); } /* * Drop the hold count of the vnode. If this is the last reference to * the vnode we place it on the free list unless it has been vgone'd * (marked VI_DOOMED) in which case we will free it. */ void vdropl(struct vnode *vp) { struct bufobj *bo; struct mount *mp; int active; ASSERT_VI_LOCKED(vp, "vdropl"); CTR2(KTR_VFS, "%s: vp %p", __func__, vp); if (vp->v_holdcnt <= 0) panic("vdrop: holdcnt %d", vp->v_holdcnt); vp->v_holdcnt--; VNASSERT(vp->v_holdcnt >= vp->v_usecount, vp, ("hold count less than use count")); if (vp->v_holdcnt > 0) { VI_UNLOCK(vp); return; } if ((vp->v_iflag & VI_DOOMED) == 0) { /* * Mark a vnode as free: remove it from its active list * and put it up for recycling on the freelist. */ VNASSERT(vp->v_op != NULL, vp, ("vdropl: vnode already reclaimed.")); VNASSERT((vp->v_iflag & VI_FREE) == 0, vp, ("vnode already free")); VNASSERT(vp->v_holdcnt == 0, vp, ("vdropl: freeing when we shouldn't")); active = vp->v_iflag & VI_ACTIVE; vp->v_iflag &= ~VI_ACTIVE; mp = vp->v_mount; mtx_lock(&vnode_free_list_mtx); if (active) { TAILQ_REMOVE(&mp->mnt_activevnodelist, vp, v_actfreelist); mp->mnt_activevnodelistsize--; } if (vp->v_iflag & VI_AGE) { TAILQ_INSERT_HEAD(&vnode_free_list, vp, v_actfreelist); } else { TAILQ_INSERT_TAIL(&vnode_free_list, vp, v_actfreelist); } freevnodes++; vp->v_iflag &= ~VI_AGE; vp->v_iflag |= VI_FREE; mtx_unlock(&vnode_free_list_mtx); VI_UNLOCK(vp); return; } /* * The vnode has been marked for destruction, so free it. */ CTR2(KTR_VFS, "%s: destroying the vnode %p", __func__, vp); atomic_subtract_long(&numvnodes, 1); bo = &vp->v_bufobj; VNASSERT((vp->v_iflag & VI_FREE) == 0, vp, ("cleaned vnode still on the free list.")); VNASSERT(vp->v_data == NULL, vp, ("cleaned vnode isn't")); VNASSERT(vp->v_holdcnt == 0, vp, ("Non-zero hold count")); VNASSERT(vp->v_usecount == 0, vp, ("Non-zero use count")); VNASSERT(vp->v_writecount == 0, vp, ("Non-zero write count")); VNASSERT(bo->bo_numoutput == 0, vp, ("Clean vnode has pending I/O's")); VNASSERT(bo->bo_clean.bv_cnt == 0, vp, ("cleanbufcnt not 0")); VNASSERT(pctrie_is_empty(&bo->bo_clean.bv_root), vp, ("clean blk trie not empty")); VNASSERT(bo->bo_dirty.bv_cnt == 0, vp, ("dirtybufcnt not 0")); VNASSERT(pctrie_is_empty(&bo->bo_dirty.bv_root), vp, ("dirty blk trie not empty")); VNASSERT(TAILQ_EMPTY(&vp->v_cache_dst), vp, ("vp has namecache dst")); VNASSERT(LIST_EMPTY(&vp->v_cache_src), vp, ("vp has namecache src")); VNASSERT(vp->v_cache_dd == NULL, vp, ("vp has namecache for ..")); VI_UNLOCK(vp); #ifdef MAC mac_vnode_destroy(vp); #endif if (vp->v_pollinfo != NULL) destroy_vpollinfo(vp->v_pollinfo); #ifdef INVARIANTS /* XXX Elsewhere we detect an already freed vnode via NULL v_op. */ vp->v_op = NULL; #endif rangelock_destroy(&vp->v_rl); lockdestroy(vp->v_vnlock); mtx_destroy(&vp->v_interlock); rw_destroy(BO_LOCKPTR(bo)); uma_zfree(vnode_zone, vp); } /* * Call VOP_INACTIVE on the vnode and manage the DOINGINACT and OWEINACT * flags. DOINGINACT prevents us from recursing in calls to vinactive. * OWEINACT tracks whether a vnode missed a call to inactive due to a * failed lock upgrade. */ void vinactive(struct vnode *vp, struct thread *td) { struct vm_object *obj; ASSERT_VOP_ELOCKED(vp, "vinactive"); ASSERT_VI_LOCKED(vp, "vinactive"); VNASSERT((vp->v_iflag & VI_DOINGINACT) == 0, vp, ("vinactive: recursed on VI_DOINGINACT")); CTR2(KTR_VFS, "%s: vp %p", __func__, vp); vp->v_iflag |= VI_DOINGINACT; vp->v_iflag &= ~VI_OWEINACT; VI_UNLOCK(vp); /* * Before moving off the active list, we must be sure that any * modified pages are on the vnode's dirty list since these will * no longer be checked once the vnode is on the inactive list. * Because the vnode vm object keeps a hold reference on the vnode * if there is at least one resident non-cached page, the vnode * cannot leave the active list without the page cleanup done. */ obj = vp->v_object; if (obj != NULL && (obj->flags & OBJ_MIGHTBEDIRTY) != 0) { VM_OBJECT_WLOCK(obj); vm_object_page_clean(obj, 0, 0, OBJPC_NOSYNC); VM_OBJECT_WUNLOCK(obj); } VOP_INACTIVE(vp, td); VI_LOCK(vp); VNASSERT(vp->v_iflag & VI_DOINGINACT, vp, ("vinactive: lost VI_DOINGINACT")); vp->v_iflag &= ~VI_DOINGINACT; } /* * Remove any vnodes in the vnode table belonging to mount point mp. * * If FORCECLOSE is not specified, there should not be any active ones, * return error if any are found (nb: this is a user error, not a * system error). If FORCECLOSE is specified, detach any active vnodes * that are found. * * If WRITECLOSE is set, only flush out regular file vnodes open for * writing. * * SKIPSYSTEM causes any vnodes marked VV_SYSTEM to be skipped. * * `rootrefs' specifies the base reference count for the root vnode * of this filesystem. The root vnode is considered busy if its * v_usecount exceeds this value. On a successful return, vflush(, td) * will call vrele() on the root vnode exactly rootrefs times. * If the SKIPSYSTEM or WRITECLOSE flags are specified, rootrefs must * be zero. */ #ifdef DIAGNOSTIC static int busyprt = 0; /* print out busy vnodes */ SYSCTL_INT(_debug, OID_AUTO, busyprt, CTLFLAG_RW, &busyprt, 0, "Print out busy vnodes"); #endif int vflush(struct mount *mp, int rootrefs, int flags, struct thread *td) { struct vnode *vp, *mvp, *rootvp = NULL; struct vattr vattr; int busy = 0, error; CTR4(KTR_VFS, "%s: mp %p with rootrefs %d and flags %d", __func__, mp, rootrefs, flags); if (rootrefs > 0) { KASSERT((flags & (SKIPSYSTEM | WRITECLOSE)) == 0, ("vflush: bad args")); /* * Get the filesystem root vnode. We can vput() it * immediately, since with rootrefs > 0, it won't go away. */ if ((error = VFS_ROOT(mp, LK_EXCLUSIVE, &rootvp)) != 0) { CTR2(KTR_VFS, "%s: vfs_root lookup failed with %d", __func__, error); return (error); } vput(rootvp); } loop: MNT_VNODE_FOREACH_ALL(vp, mp, mvp) { vholdl(vp); error = vn_lock(vp, LK_INTERLOCK | LK_EXCLUSIVE); if (error) { vdrop(vp); MNT_VNODE_FOREACH_ALL_ABORT(mp, mvp); goto loop; } /* * Skip over a vnodes marked VV_SYSTEM. */ if ((flags & SKIPSYSTEM) && (vp->v_vflag & VV_SYSTEM)) { VOP_UNLOCK(vp, 0); vdrop(vp); continue; } /* * If WRITECLOSE is set, flush out unlinked but still open * files (even if open only for reading) and regular file * vnodes open for writing. */ if (flags & WRITECLOSE) { if (vp->v_object != NULL) { VM_OBJECT_WLOCK(vp->v_object); vm_object_page_clean(vp->v_object, 0, 0, 0); VM_OBJECT_WUNLOCK(vp->v_object); } error = VOP_FSYNC(vp, MNT_WAIT, td); if (error != 0) { VOP_UNLOCK(vp, 0); vdrop(vp); MNT_VNODE_FOREACH_ALL_ABORT(mp, mvp); return (error); } error = VOP_GETATTR(vp, &vattr, td->td_ucred); VI_LOCK(vp); if ((vp->v_type == VNON || (error == 0 && vattr.va_nlink > 0)) && (vp->v_writecount == 0 || vp->v_type != VREG)) { VOP_UNLOCK(vp, 0); vdropl(vp); continue; } } else VI_LOCK(vp); /* * With v_usecount == 0, all we need to do is clear out the * vnode data structures and we are done. * * If FORCECLOSE is set, forcibly close the vnode. */ if (vp->v_usecount == 0 || (flags & FORCECLOSE)) { VNASSERT(vp->v_usecount == 0 || vp->v_op != &devfs_specops || (vp->v_type != VCHR && vp->v_type != VBLK), vp, ("device VNODE %p is FORCECLOSED", vp)); vgonel(vp); } else { busy++; #ifdef DIAGNOSTIC if (busyprt) vprint("vflush: busy vnode", vp); #endif } VOP_UNLOCK(vp, 0); vdropl(vp); } if (rootrefs > 0 && (flags & FORCECLOSE) == 0) { /* * If just the root vnode is busy, and if its refcount * is equal to `rootrefs', then go ahead and kill it. */ VI_LOCK(rootvp); KASSERT(busy > 0, ("vflush: not busy")); VNASSERT(rootvp->v_usecount >= rootrefs, rootvp, ("vflush: usecount %d < rootrefs %d", rootvp->v_usecount, rootrefs)); if (busy == 1 && rootvp->v_usecount == rootrefs) { VOP_LOCK(rootvp, LK_EXCLUSIVE|LK_INTERLOCK); vgone(rootvp); VOP_UNLOCK(rootvp, 0); busy = 0; } else VI_UNLOCK(rootvp); } if (busy) { CTR2(KTR_VFS, "%s: failing as %d vnodes are busy", __func__, busy); return (EBUSY); } for (; rootrefs > 0; rootrefs--) vrele(rootvp); return (0); } /* * Recycle an unused vnode to the front of the free list. */ int vrecycle(struct vnode *vp) { int recycled; ASSERT_VOP_ELOCKED(vp, "vrecycle"); CTR2(KTR_VFS, "%s: vp %p", __func__, vp); recycled = 0; VI_LOCK(vp); if (vp->v_usecount == 0) { recycled = 1; vgonel(vp); } VI_UNLOCK(vp); return (recycled); } /* * Eliminate all activity associated with a vnode * in preparation for reuse. */ void vgone(struct vnode *vp) { VI_LOCK(vp); vgonel(vp); VI_UNLOCK(vp); } static void notify_lowervp_vfs_dummy(struct mount *mp __unused, struct vnode *lowervp __unused) { } /* * Notify upper mounts about reclaimed or unlinked vnode. */ void vfs_notify_upper(struct vnode *vp, int event) { static struct vfsops vgonel_vfsops = { .vfs_reclaim_lowervp = notify_lowervp_vfs_dummy, .vfs_unlink_lowervp = notify_lowervp_vfs_dummy, }; struct mount *mp, *ump, *mmp; mp = vp->v_mount; if (mp == NULL) return; MNT_ILOCK(mp); if (TAILQ_EMPTY(&mp->mnt_uppers)) goto unlock; MNT_IUNLOCK(mp); mmp = malloc(sizeof(struct mount), M_TEMP, M_WAITOK | M_ZERO); mmp->mnt_op = &vgonel_vfsops; mmp->mnt_kern_flag |= MNTK_MARKER; MNT_ILOCK(mp); mp->mnt_kern_flag |= MNTK_VGONE_UPPER; for (ump = TAILQ_FIRST(&mp->mnt_uppers); ump != NULL;) { if ((ump->mnt_kern_flag & MNTK_MARKER) != 0) { ump = TAILQ_NEXT(ump, mnt_upper_link); continue; } TAILQ_INSERT_AFTER(&mp->mnt_uppers, ump, mmp, mnt_upper_link); MNT_IUNLOCK(mp); switch (event) { case VFS_NOTIFY_UPPER_RECLAIM: VFS_RECLAIM_LOWERVP(ump, vp); break; case VFS_NOTIFY_UPPER_UNLINK: VFS_UNLINK_LOWERVP(ump, vp); break; default: KASSERT(0, ("invalid event %d", event)); break; } MNT_ILOCK(mp); ump = TAILQ_NEXT(mmp, mnt_upper_link); TAILQ_REMOVE(&mp->mnt_uppers, mmp, mnt_upper_link); } free(mmp, M_TEMP); mp->mnt_kern_flag &= ~MNTK_VGONE_UPPER; if ((mp->mnt_kern_flag & MNTK_VGONE_WAITER) != 0) { mp->mnt_kern_flag &= ~MNTK_VGONE_WAITER; wakeup(&mp->mnt_uppers); } unlock: MNT_IUNLOCK(mp); } /* * vgone, with the vp interlock held. */ void vgonel(struct vnode *vp) { struct thread *td; int oweinact; int active; struct mount *mp; ASSERT_VOP_ELOCKED(vp, "vgonel"); ASSERT_VI_LOCKED(vp, "vgonel"); VNASSERT(vp->v_holdcnt, vp, ("vgonel: vp %p has no reference.", vp)); CTR2(KTR_VFS, "%s: vp %p", __func__, vp); td = curthread; /* * Don't vgonel if we're already doomed. */ if (vp->v_iflag & VI_DOOMED) return; vp->v_iflag |= VI_DOOMED; /* * Check to see if the vnode is in use. If so, we have to call * VOP_CLOSE() and VOP_INACTIVE(). */ active = vp->v_usecount; oweinact = (vp->v_iflag & VI_OWEINACT); VI_UNLOCK(vp); vfs_notify_upper(vp, VFS_NOTIFY_UPPER_RECLAIM); /* * If purging an active vnode, it must be closed and * deactivated before being reclaimed. */ if (active) VOP_CLOSE(vp, FNONBLOCK, NOCRED, td); if (oweinact || active) { VI_LOCK(vp); if ((vp->v_iflag & VI_DOINGINACT) == 0) vinactive(vp, td); VI_UNLOCK(vp); } if (vp->v_type == VSOCK) vfs_unp_reclaim(vp); /* * Clean out any buffers associated with the vnode. * If the flush fails, just toss the buffers. */ mp = NULL; if (!TAILQ_EMPTY(&vp->v_bufobj.bo_dirty.bv_hd)) (void) vn_start_secondary_write(vp, &mp, V_WAIT); if (vinvalbuf(vp, V_SAVE, 0, 0) != 0) { while (vinvalbuf(vp, 0, 0, 0) != 0) ; } #ifdef INVARIANTS BO_LOCK(&vp->v_bufobj); KASSERT(TAILQ_EMPTY(&vp->v_bufobj.bo_dirty.bv_hd) && vp->v_bufobj.bo_dirty.bv_cnt == 0 && TAILQ_EMPTY(&vp->v_bufobj.bo_clean.bv_hd) && vp->v_bufobj.bo_clean.bv_cnt == 0, ("vp %p bufobj not invalidated", vp)); vp->v_bufobj.bo_flag |= BO_DEAD; BO_UNLOCK(&vp->v_bufobj); #endif /* * Reclaim the vnode. */ if (VOP_RECLAIM(vp, td)) panic("vgone: cannot reclaim"); if (mp != NULL) vn_finished_secondary_write(mp); VNASSERT(vp->v_object == NULL, vp, ("vop_reclaim left v_object vp=%p, tag=%s", vp, vp->v_tag)); /* * Clear the advisory locks and wake up waiting threads. */ (void)VOP_ADVLOCKPURGE(vp); /* * Delete from old mount point vnode list. */ delmntque(vp); cache_purge(vp); /* * Done with purge, reset to the standard lock and invalidate * the vnode. */ VI_LOCK(vp); vp->v_vnlock = &vp->v_lock; vp->v_op = &dead_vnodeops; vp->v_tag = "none"; vp->v_type = VBAD; } /* * Calculate the total number of references to a special device. */ int vcount(struct vnode *vp) { int count; dev_lock(); count = vp->v_rdev->si_usecount; dev_unlock(); return (count); } /* * Same as above, but using the struct cdev *as argument */ int count_dev(struct cdev *dev) { int count; dev_lock(); count = dev->si_usecount; dev_unlock(); return(count); } /* * Print out a description of a vnode. */ static char *typename[] = {"VNON", "VREG", "VDIR", "VBLK", "VCHR", "VLNK", "VSOCK", "VFIFO", "VBAD", "VMARKER"}; void vn_printf(struct vnode *vp, const char *fmt, ...) { va_list ap; char buf[256], buf2[16]; u_long flags; va_start(ap, fmt); vprintf(fmt, ap); va_end(ap); printf("%p: ", (void *)vp); printf("tag %s, type %s\n", vp->v_tag, typename[vp->v_type]); printf(" usecount %d, writecount %d, refcount %d mountedhere %p\n", vp->v_usecount, vp->v_writecount, vp->v_holdcnt, vp->v_mountedhere); buf[0] = '\0'; buf[1] = '\0'; if (vp->v_vflag & VV_ROOT) strlcat(buf, "|VV_ROOT", sizeof(buf)); if (vp->v_vflag & VV_ISTTY) strlcat(buf, "|VV_ISTTY", sizeof(buf)); if (vp->v_vflag & VV_NOSYNC) strlcat(buf, "|VV_NOSYNC", sizeof(buf)); if (vp->v_vflag & VV_ETERNALDEV) strlcat(buf, "|VV_ETERNALDEV", sizeof(buf)); if (vp->v_vflag & VV_CACHEDLABEL) strlcat(buf, "|VV_CACHEDLABEL", sizeof(buf)); if (vp->v_vflag & VV_TEXT) strlcat(buf, "|VV_TEXT", sizeof(buf)); if (vp->v_vflag & VV_COPYONWRITE) strlcat(buf, "|VV_COPYONWRITE", sizeof(buf)); if (vp->v_vflag & VV_SYSTEM) strlcat(buf, "|VV_SYSTEM", sizeof(buf)); if (vp->v_vflag & VV_PROCDEP) strlcat(buf, "|VV_PROCDEP", sizeof(buf)); if (vp->v_vflag & VV_NOKNOTE) strlcat(buf, "|VV_NOKNOTE", sizeof(buf)); if (vp->v_vflag & VV_DELETED) strlcat(buf, "|VV_DELETED", sizeof(buf)); if (vp->v_vflag & VV_MD) strlcat(buf, "|VV_MD", sizeof(buf)); if (vp->v_vflag & VV_FORCEINSMQ) strlcat(buf, "|VV_FORCEINSMQ", sizeof(buf)); flags = vp->v_vflag & ~(VV_ROOT | VV_ISTTY | VV_NOSYNC | VV_ETERNALDEV | VV_CACHEDLABEL | VV_TEXT | VV_COPYONWRITE | VV_SYSTEM | VV_PROCDEP | VV_NOKNOTE | VV_DELETED | VV_MD | VV_FORCEINSMQ); if (flags != 0) { snprintf(buf2, sizeof(buf2), "|VV(0x%lx)", flags); strlcat(buf, buf2, sizeof(buf)); } if (vp->v_iflag & VI_MOUNT) strlcat(buf, "|VI_MOUNT", sizeof(buf)); if (vp->v_iflag & VI_AGE) strlcat(buf, "|VI_AGE", sizeof(buf)); if (vp->v_iflag & VI_DOOMED) strlcat(buf, "|VI_DOOMED", sizeof(buf)); if (vp->v_iflag & VI_FREE) strlcat(buf, "|VI_FREE", sizeof(buf)); if (vp->v_iflag & VI_ACTIVE) strlcat(buf, "|VI_ACTIVE", sizeof(buf)); if (vp->v_iflag & VI_DOINGINACT) strlcat(buf, "|VI_DOINGINACT", sizeof(buf)); if (vp->v_iflag & VI_OWEINACT) strlcat(buf, "|VI_OWEINACT", sizeof(buf)); flags = vp->v_iflag & ~(VI_MOUNT | VI_AGE | VI_DOOMED | VI_FREE | VI_ACTIVE | VI_DOINGINACT | VI_OWEINACT); if (flags != 0) { snprintf(buf2, sizeof(buf2), "|VI(0x%lx)", flags); strlcat(buf, buf2, sizeof(buf)); } printf(" flags (%s)\n", buf + 1); if (mtx_owned(VI_MTX(vp))) printf(" VI_LOCKed"); if (vp->v_object != NULL) printf(" v_object %p ref %d pages %d " "cleanbuf %d dirtybuf %d\n", vp->v_object, vp->v_object->ref_count, vp->v_object->resident_page_count, vp->v_bufobj.bo_dirty.bv_cnt, vp->v_bufobj.bo_clean.bv_cnt); printf(" "); lockmgr_printinfo(vp->v_vnlock); if (vp->v_data != NULL) VOP_PRINT(vp); } #ifdef DDB /* * List all of the locked vnodes in the system. * Called when debugging the kernel. */ DB_SHOW_COMMAND(lockedvnods, lockedvnodes) { struct mount *mp; struct vnode *vp; /* * Note: because this is DDB, we can't obey the locking semantics * for these structures, which means we could catch an inconsistent * state and dereference a nasty pointer. Not much to be done * about that. */ db_printf("Locked vnodes\n"); TAILQ_FOREACH(mp, &mountlist, mnt_list) { TAILQ_FOREACH(vp, &mp->mnt_nvnodelist, v_nmntvnodes) { if (vp->v_type != VMARKER && VOP_ISLOCKED(vp)) vprint("", vp); } } } /* * Show details about the given vnode. */ DB_SHOW_COMMAND(vnode, db_show_vnode) { struct vnode *vp; if (!have_addr) return; vp = (struct vnode *)addr; vn_printf(vp, "vnode "); } /* * Show details about the given mount point. */ DB_SHOW_COMMAND(mount, db_show_mount) { struct mount *mp; struct vfsopt *opt; struct statfs *sp; struct vnode *vp; char buf[512]; uint64_t mflags; u_int flags; if (!have_addr) { /* No address given, print short info about all mount points. */ TAILQ_FOREACH(mp, &mountlist, mnt_list) { db_printf("%p %s on %s (%s)\n", mp, mp->mnt_stat.f_mntfromname, mp->mnt_stat.f_mntonname, mp->mnt_stat.f_fstypename); if (db_pager_quit) break; } db_printf("\nMore info: show mount \n"); return; } mp = (struct mount *)addr; db_printf("%p %s on %s (%s)\n", mp, mp->mnt_stat.f_mntfromname, mp->mnt_stat.f_mntonname, mp->mnt_stat.f_fstypename); buf[0] = '\0'; mflags = mp->mnt_flag; #define MNT_FLAG(flag) do { \ if (mflags & (flag)) { \ if (buf[0] != '\0') \ strlcat(buf, ", ", sizeof(buf)); \ strlcat(buf, (#flag) + 4, sizeof(buf)); \ mflags &= ~(flag); \ } \ } while (0) MNT_FLAG(MNT_RDONLY); MNT_FLAG(MNT_SYNCHRONOUS); MNT_FLAG(MNT_NOEXEC); MNT_FLAG(MNT_NOSUID); MNT_FLAG(MNT_NFS4ACLS); MNT_FLAG(MNT_UNION); MNT_FLAG(MNT_ASYNC); MNT_FLAG(MNT_SUIDDIR); MNT_FLAG(MNT_SOFTDEP); MNT_FLAG(MNT_NOSYMFOLLOW); MNT_FLAG(MNT_GJOURNAL); MNT_FLAG(MNT_MULTILABEL); MNT_FLAG(MNT_ACLS); MNT_FLAG(MNT_NOATIME); MNT_FLAG(MNT_NOCLUSTERR); MNT_FLAG(MNT_NOCLUSTERW); MNT_FLAG(MNT_SUJ); MNT_FLAG(MNT_EXRDONLY); MNT_FLAG(MNT_EXPORTED); MNT_FLAG(MNT_DEFEXPORTED); MNT_FLAG(MNT_EXPORTANON); MNT_FLAG(MNT_EXKERB); MNT_FLAG(MNT_EXPUBLIC); MNT_FLAG(MNT_LOCAL); MNT_FLAG(MNT_QUOTA); MNT_FLAG(MNT_ROOTFS); MNT_FLAG(MNT_USER); MNT_FLAG(MNT_IGNORE); MNT_FLAG(MNT_UPDATE); MNT_FLAG(MNT_DELEXPORT); MNT_FLAG(MNT_RELOAD); MNT_FLAG(MNT_FORCE); MNT_FLAG(MNT_SNAPSHOT); MNT_FLAG(MNT_BYFSID); #undef MNT_FLAG if (mflags != 0) { if (buf[0] != '\0') strlcat(buf, ", ", sizeof(buf)); snprintf(buf + strlen(buf), sizeof(buf) - strlen(buf), "0x%016jx", mflags); } db_printf(" mnt_flag = %s\n", buf); buf[0] = '\0'; flags = mp->mnt_kern_flag; #define MNT_KERN_FLAG(flag) do { \ if (flags & (flag)) { \ if (buf[0] != '\0') \ strlcat(buf, ", ", sizeof(buf)); \ strlcat(buf, (#flag) + 5, sizeof(buf)); \ flags &= ~(flag); \ } \ } while (0) MNT_KERN_FLAG(MNTK_UNMOUNTF); MNT_KERN_FLAG(MNTK_ASYNC); MNT_KERN_FLAG(MNTK_SOFTDEP); MNT_KERN_FLAG(MNTK_NOINSMNTQ); MNT_KERN_FLAG(MNTK_DRAINING); MNT_KERN_FLAG(MNTK_REFEXPIRE); MNT_KERN_FLAG(MNTK_EXTENDED_SHARED); MNT_KERN_FLAG(MNTK_SHARED_WRITES); MNT_KERN_FLAG(MNTK_NO_IOPF); MNT_KERN_FLAG(MNTK_VGONE_UPPER); MNT_KERN_FLAG(MNTK_VGONE_WAITER); MNT_KERN_FLAG(MNTK_LOOKUP_EXCL_DOTDOT); MNT_KERN_FLAG(MNTK_MARKER); + MNT_KERN_FLAG(MNTK_USES_BCACHE); MNT_KERN_FLAG(MNTK_NOASYNC); MNT_KERN_FLAG(MNTK_UNMOUNT); MNT_KERN_FLAG(MNTK_MWAIT); MNT_KERN_FLAG(MNTK_SUSPEND); MNT_KERN_FLAG(MNTK_SUSPEND2); MNT_KERN_FLAG(MNTK_SUSPENDED); MNT_KERN_FLAG(MNTK_LOOKUP_SHARED); MNT_KERN_FLAG(MNTK_NOKNOTE); #undef MNT_KERN_FLAG if (flags != 0) { if (buf[0] != '\0') strlcat(buf, ", ", sizeof(buf)); snprintf(buf + strlen(buf), sizeof(buf) - strlen(buf), "0x%08x", flags); } db_printf(" mnt_kern_flag = %s\n", buf); db_printf(" mnt_opt = "); opt = TAILQ_FIRST(mp->mnt_opt); if (opt != NULL) { db_printf("%s", opt->name); opt = TAILQ_NEXT(opt, link); while (opt != NULL) { db_printf(", %s", opt->name); opt = TAILQ_NEXT(opt, link); } } db_printf("\n"); sp = &mp->mnt_stat; db_printf(" mnt_stat = { version=%u type=%u flags=0x%016jx " "bsize=%ju iosize=%ju blocks=%ju bfree=%ju bavail=%jd files=%ju " "ffree=%jd syncwrites=%ju asyncwrites=%ju syncreads=%ju " "asyncreads=%ju namemax=%u owner=%u fsid=[%d, %d] }\n", (u_int)sp->f_version, (u_int)sp->f_type, (uintmax_t)sp->f_flags, (uintmax_t)sp->f_bsize, (uintmax_t)sp->f_iosize, (uintmax_t)sp->f_blocks, (uintmax_t)sp->f_bfree, (intmax_t)sp->f_bavail, (uintmax_t)sp->f_files, (intmax_t)sp->f_ffree, (uintmax_t)sp->f_syncwrites, (uintmax_t)sp->f_asyncwrites, (uintmax_t)sp->f_syncreads, (uintmax_t)sp->f_asyncreads, (u_int)sp->f_namemax, (u_int)sp->f_owner, (int)sp->f_fsid.val[0], (int)sp->f_fsid.val[1]); db_printf(" mnt_cred = { uid=%u ruid=%u", (u_int)mp->mnt_cred->cr_uid, (u_int)mp->mnt_cred->cr_ruid); if (jailed(mp->mnt_cred)) db_printf(", jail=%d", mp->mnt_cred->cr_prison->pr_id); db_printf(" }\n"); db_printf(" mnt_ref = %d\n", mp->mnt_ref); db_printf(" mnt_gen = %d\n", mp->mnt_gen); db_printf(" mnt_nvnodelistsize = %d\n", mp->mnt_nvnodelistsize); db_printf(" mnt_activevnodelistsize = %d\n", mp->mnt_activevnodelistsize); db_printf(" mnt_writeopcount = %d\n", mp->mnt_writeopcount); db_printf(" mnt_maxsymlinklen = %d\n", mp->mnt_maxsymlinklen); db_printf(" mnt_iosize_max = %d\n", mp->mnt_iosize_max); db_printf(" mnt_hashseed = %u\n", mp->mnt_hashseed); db_printf(" mnt_lockref = %d\n", mp->mnt_lockref); db_printf(" mnt_secondary_writes = %d\n", mp->mnt_secondary_writes); db_printf(" mnt_secondary_accwrites = %d\n", mp->mnt_secondary_accwrites); db_printf(" mnt_gjprovider = %s\n", mp->mnt_gjprovider != NULL ? mp->mnt_gjprovider : "NULL"); db_printf("\n\nList of active vnodes\n"); TAILQ_FOREACH(vp, &mp->mnt_activevnodelist, v_actfreelist) { if (vp->v_type != VMARKER) { vn_printf(vp, "vnode "); if (db_pager_quit) break; } } db_printf("\n\nList of inactive vnodes\n"); TAILQ_FOREACH(vp, &mp->mnt_nvnodelist, v_nmntvnodes) { if (vp->v_type != VMARKER && (vp->v_iflag & VI_ACTIVE) == 0) { vn_printf(vp, "vnode "); if (db_pager_quit) break; } } } #endif /* DDB */ /* * Fill in a struct xvfsconf based on a struct vfsconf. */ static int vfsconf2x(struct sysctl_req *req, struct vfsconf *vfsp) { struct xvfsconf xvfsp; bzero(&xvfsp, sizeof(xvfsp)); strcpy(xvfsp.vfc_name, vfsp->vfc_name); xvfsp.vfc_typenum = vfsp->vfc_typenum; xvfsp.vfc_refcount = vfsp->vfc_refcount; xvfsp.vfc_flags = vfsp->vfc_flags; /* * These are unused in userland, we keep them * to not break binary compatibility. */ xvfsp.vfc_vfsops = NULL; xvfsp.vfc_next = NULL; return (SYSCTL_OUT(req, &xvfsp, sizeof(xvfsp))); } #ifdef COMPAT_FREEBSD32 struct xvfsconf32 { uint32_t vfc_vfsops; char vfc_name[MFSNAMELEN]; int32_t vfc_typenum; int32_t vfc_refcount; int32_t vfc_flags; uint32_t vfc_next; }; static int vfsconf2x32(struct sysctl_req *req, struct vfsconf *vfsp) { struct xvfsconf32 xvfsp; strcpy(xvfsp.vfc_name, vfsp->vfc_name); xvfsp.vfc_typenum = vfsp->vfc_typenum; xvfsp.vfc_refcount = vfsp->vfc_refcount; xvfsp.vfc_flags = vfsp->vfc_flags; xvfsp.vfc_vfsops = 0; xvfsp.vfc_next = 0; return (SYSCTL_OUT(req, &xvfsp, sizeof(xvfsp))); } #endif /* * Top level filesystem related information gathering. */ static int sysctl_vfs_conflist(SYSCTL_HANDLER_ARGS) { struct vfsconf *vfsp; int error; error = 0; vfsconf_slock(); TAILQ_FOREACH(vfsp, &vfsconf, vfc_list) { #ifdef COMPAT_FREEBSD32 if (req->flags & SCTL_MASK32) error = vfsconf2x32(req, vfsp); else #endif error = vfsconf2x(req, vfsp); if (error) break; } vfsconf_sunlock(); return (error); } SYSCTL_PROC(_vfs, OID_AUTO, conflist, CTLTYPE_OPAQUE | CTLFLAG_RD | CTLFLAG_MPSAFE, NULL, 0, sysctl_vfs_conflist, "S,xvfsconf", "List of all configured filesystems"); #ifndef BURN_BRIDGES static int sysctl_ovfs_conf(SYSCTL_HANDLER_ARGS); static int vfs_sysctl(SYSCTL_HANDLER_ARGS) { int *name = (int *)arg1 - 1; /* XXX */ u_int namelen = arg2 + 1; /* XXX */ struct vfsconf *vfsp; log(LOG_WARNING, "userland calling deprecated sysctl, " "please rebuild world\n"); #if 1 || defined(COMPAT_PRELITE2) /* Resolve ambiguity between VFS_VFSCONF and VFS_GENERIC. */ if (namelen == 1) return (sysctl_ovfs_conf(oidp, arg1, arg2, req)); #endif switch (name[1]) { case VFS_MAXTYPENUM: if (namelen != 2) return (ENOTDIR); return (SYSCTL_OUT(req, &maxvfsconf, sizeof(int))); case VFS_CONF: if (namelen != 3) return (ENOTDIR); /* overloaded */ vfsconf_slock(); TAILQ_FOREACH(vfsp, &vfsconf, vfc_list) { if (vfsp->vfc_typenum == name[2]) break; } vfsconf_sunlock(); if (vfsp == NULL) return (EOPNOTSUPP); #ifdef COMPAT_FREEBSD32 if (req->flags & SCTL_MASK32) return (vfsconf2x32(req, vfsp)); else #endif return (vfsconf2x(req, vfsp)); } return (EOPNOTSUPP); } static SYSCTL_NODE(_vfs, VFS_GENERIC, generic, CTLFLAG_RD | CTLFLAG_SKIP | CTLFLAG_MPSAFE, vfs_sysctl, "Generic filesystem"); #if 1 || defined(COMPAT_PRELITE2) static int sysctl_ovfs_conf(SYSCTL_HANDLER_ARGS) { int error; struct vfsconf *vfsp; struct ovfsconf ovfs; vfsconf_slock(); TAILQ_FOREACH(vfsp, &vfsconf, vfc_list) { bzero(&ovfs, sizeof(ovfs)); ovfs.vfc_vfsops = vfsp->vfc_vfsops; /* XXX used as flag */ strcpy(ovfs.vfc_name, vfsp->vfc_name); ovfs.vfc_index = vfsp->vfc_typenum; ovfs.vfc_refcount = vfsp->vfc_refcount; ovfs.vfc_flags = vfsp->vfc_flags; error = SYSCTL_OUT(req, &ovfs, sizeof ovfs); if (error != 0) { vfsconf_sunlock(); return (error); } } vfsconf_sunlock(); return (0); } #endif /* 1 || COMPAT_PRELITE2 */ #endif /* !BURN_BRIDGES */ #define KINFO_VNODESLOP 10 #ifdef notyet /* * Dump vnode list (via sysctl). */ /* ARGSUSED */ static int sysctl_vnode(SYSCTL_HANDLER_ARGS) { struct xvnode *xvn; struct mount *mp; struct vnode *vp; int error, len, n; /* * Stale numvnodes access is not fatal here. */ req->lock = 0; len = (numvnodes + KINFO_VNODESLOP) * sizeof *xvn; if (!req->oldptr) /* Make an estimate */ return (SYSCTL_OUT(req, 0, len)); error = sysctl_wire_old_buffer(req, 0); if (error != 0) return (error); xvn = malloc(len, M_TEMP, M_ZERO | M_WAITOK); n = 0; mtx_lock(&mountlist_mtx); TAILQ_FOREACH(mp, &mountlist, mnt_list) { if (vfs_busy(mp, MBF_NOWAIT | MBF_MNTLSTLOCK)) continue; MNT_ILOCK(mp); TAILQ_FOREACH(vp, &mp->mnt_nvnodelist, v_nmntvnodes) { if (n == len) break; vref(vp); xvn[n].xv_size = sizeof *xvn; xvn[n].xv_vnode = vp; xvn[n].xv_id = 0; /* XXX compat */ #define XV_COPY(field) xvn[n].xv_##field = vp->v_##field XV_COPY(usecount); XV_COPY(writecount); XV_COPY(holdcnt); XV_COPY(mount); XV_COPY(numoutput); XV_COPY(type); #undef XV_COPY xvn[n].xv_flag = vp->v_vflag; switch (vp->v_type) { case VREG: case VDIR: case VLNK: break; case VBLK: case VCHR: if (vp->v_rdev == NULL) { vrele(vp); continue; } xvn[n].xv_dev = dev2udev(vp->v_rdev); break; case VSOCK: xvn[n].xv_socket = vp->v_socket; break; case VFIFO: xvn[n].xv_fifo = vp->v_fifoinfo; break; case VNON: case VBAD: default: /* shouldn't happen? */ vrele(vp); continue; } vrele(vp); ++n; } MNT_IUNLOCK(mp); mtx_lock(&mountlist_mtx); vfs_unbusy(mp); if (n == len) break; } mtx_unlock(&mountlist_mtx); error = SYSCTL_OUT(req, xvn, n * sizeof *xvn); free(xvn, M_TEMP); return (error); } SYSCTL_PROC(_kern, KERN_VNODE, vnode, CTLTYPE_OPAQUE | CTLFLAG_RD | CTLFLAG_MPSAFE, 0, 0, sysctl_vnode, "S,xvnode", ""); #endif /* * Unmount all filesystems. The list is traversed in reverse order * of mounting to avoid dependencies. */ void vfs_unmountall(void) { struct mount *mp; struct thread *td; int error; CTR1(KTR_VFS, "%s: unmounting all filesystems", __func__); td = curthread; /* * Since this only runs when rebooting, it is not interlocked. */ while(!TAILQ_EMPTY(&mountlist)) { mp = TAILQ_LAST(&mountlist, mntlist); error = dounmount(mp, MNT_FORCE, td); if (error) { TAILQ_REMOVE(&mountlist, mp, mnt_list); /* * XXX: Due to the way in which we mount the root * file system off of devfs, devfs will generate a * "busy" warning when we try to unmount it before * the root. Don't print a warning as a result in * order to avoid false positive errors that may * cause needless upset. */ if (strcmp(mp->mnt_vfc->vfc_name, "devfs") != 0) { printf("unmount of %s failed (", mp->mnt_stat.f_mntonname); if (error == EBUSY) printf("BUSY)\n"); else printf("%d)\n", error); } } else { /* The unmount has removed mp from the mountlist */ } } } /* * perform msync on all vnodes under a mount point * the mount point must be locked. */ void vfs_msync(struct mount *mp, int flags) { struct vnode *vp, *mvp; struct vm_object *obj; CTR2(KTR_VFS, "%s: mp %p", __func__, mp); MNT_VNODE_FOREACH_ACTIVE(vp, mp, mvp) { obj = vp->v_object; if (obj != NULL && (obj->flags & OBJ_MIGHTBEDIRTY) != 0 && (flags == MNT_WAIT || VOP_ISLOCKED(vp) == 0)) { if (!vget(vp, LK_EXCLUSIVE | LK_RETRY | LK_INTERLOCK, curthread)) { if (vp->v_vflag & VV_NOSYNC) { /* unlinked */ vput(vp); continue; } obj = vp->v_object; if (obj != NULL) { VM_OBJECT_WLOCK(obj); vm_object_page_clean(obj, 0, 0, flags == MNT_WAIT ? OBJPC_SYNC : OBJPC_NOSYNC); VM_OBJECT_WUNLOCK(obj); } vput(vp); } } else VI_UNLOCK(vp); } } static void destroy_vpollinfo_free(struct vpollinfo *vi) { knlist_destroy(&vi->vpi_selinfo.si_note); mtx_destroy(&vi->vpi_lock); uma_zfree(vnodepoll_zone, vi); } static void destroy_vpollinfo(struct vpollinfo *vi) { knlist_clear(&vi->vpi_selinfo.si_note, 1); seldrain(&vi->vpi_selinfo); destroy_vpollinfo_free(vi); } /* * Initalize per-vnode helper structure to hold poll-related state. */ void v_addpollinfo(struct vnode *vp) { struct vpollinfo *vi; if (vp->v_pollinfo != NULL) return; vi = uma_zalloc(vnodepoll_zone, M_WAITOK); mtx_init(&vi->vpi_lock, "vnode pollinfo", NULL, MTX_DEF); knlist_init(&vi->vpi_selinfo.si_note, vp, vfs_knllock, vfs_knlunlock, vfs_knl_assert_locked, vfs_knl_assert_unlocked); VI_LOCK(vp); if (vp->v_pollinfo != NULL) { VI_UNLOCK(vp); destroy_vpollinfo_free(vi); return; } vp->v_pollinfo = vi; VI_UNLOCK(vp); } /* * Record a process's interest in events which might happen to * a vnode. Because poll uses the historic select-style interface * internally, this routine serves as both the ``check for any * pending events'' and the ``record my interest in future events'' * functions. (These are done together, while the lock is held, * to avoid race conditions.) */ int vn_pollrecord(struct vnode *vp, struct thread *td, int events) { v_addpollinfo(vp); mtx_lock(&vp->v_pollinfo->vpi_lock); if (vp->v_pollinfo->vpi_revents & events) { /* * This leaves events we are not interested * in available for the other process which * which presumably had requested them * (otherwise they would never have been * recorded). */ events &= vp->v_pollinfo->vpi_revents; vp->v_pollinfo->vpi_revents &= ~events; mtx_unlock(&vp->v_pollinfo->vpi_lock); return (events); } vp->v_pollinfo->vpi_events |= events; selrecord(td, &vp->v_pollinfo->vpi_selinfo); mtx_unlock(&vp->v_pollinfo->vpi_lock); return (0); } /* * Routine to create and manage a filesystem syncer vnode. */ #define sync_close ((int (*)(struct vop_close_args *))nullop) static int sync_fsync(struct vop_fsync_args *); static int sync_inactive(struct vop_inactive_args *); static int sync_reclaim(struct vop_reclaim_args *); static struct vop_vector sync_vnodeops = { .vop_bypass = VOP_EOPNOTSUPP, .vop_close = sync_close, /* close */ .vop_fsync = sync_fsync, /* fsync */ .vop_inactive = sync_inactive, /* inactive */ .vop_reclaim = sync_reclaim, /* reclaim */ .vop_lock1 = vop_stdlock, /* lock */ .vop_unlock = vop_stdunlock, /* unlock */ .vop_islocked = vop_stdislocked, /* islocked */ }; /* * Create a new filesystem syncer vnode for the specified mount point. */ void vfs_allocate_syncvnode(struct mount *mp) { struct vnode *vp; struct bufobj *bo; static long start, incr, next; int error; /* Allocate a new vnode */ error = getnewvnode("syncer", mp, &sync_vnodeops, &vp); if (error != 0) panic("vfs_allocate_syncvnode: getnewvnode() failed"); vp->v_type = VNON; vn_lock(vp, LK_EXCLUSIVE | LK_RETRY); vp->v_vflag |= VV_FORCEINSMQ; error = insmntque(vp, mp); if (error != 0) panic("vfs_allocate_syncvnode: insmntque() failed"); vp->v_vflag &= ~VV_FORCEINSMQ; VOP_UNLOCK(vp, 0); /* * Place the vnode onto the syncer worklist. We attempt to * scatter them about on the list so that they will go off * at evenly distributed times even if all the filesystems * are mounted at once. */ next += incr; if (next == 0 || next > syncer_maxdelay) { start /= 2; incr /= 2; if (start == 0) { start = syncer_maxdelay / 2; incr = syncer_maxdelay; } next = start; } bo = &vp->v_bufobj; BO_LOCK(bo); vn_syncer_add_to_worklist(bo, syncdelay > 0 ? next % syncdelay : 0); /* XXX - vn_syncer_add_to_worklist() also grabs and drops sync_mtx. */ mtx_lock(&sync_mtx); sync_vnode_count++; if (mp->mnt_syncer == NULL) { mp->mnt_syncer = vp; vp = NULL; } mtx_unlock(&sync_mtx); BO_UNLOCK(bo); if (vp != NULL) { vn_lock(vp, LK_EXCLUSIVE | LK_RETRY); vgone(vp); vput(vp); } } void vfs_deallocate_syncvnode(struct mount *mp) { struct vnode *vp; mtx_lock(&sync_mtx); vp = mp->mnt_syncer; if (vp != NULL) mp->mnt_syncer = NULL; mtx_unlock(&sync_mtx); if (vp != NULL) vrele(vp); } /* * Do a lazy sync of the filesystem. */ static int sync_fsync(struct vop_fsync_args *ap) { struct vnode *syncvp = ap->a_vp; struct mount *mp = syncvp->v_mount; int error, save; struct bufobj *bo; /* * We only need to do something if this is a lazy evaluation. */ if (ap->a_waitfor != MNT_LAZY) return (0); /* * Move ourselves to the back of the sync list. */ bo = &syncvp->v_bufobj; BO_LOCK(bo); vn_syncer_add_to_worklist(bo, syncdelay); BO_UNLOCK(bo); /* * Walk the list of vnodes pushing all that are dirty and * not already on the sync list. */ if (vfs_busy(mp, MBF_NOWAIT) != 0) return (0); if (vn_start_write(NULL, &mp, V_NOWAIT) != 0) { vfs_unbusy(mp); return (0); } save = curthread_pflags_set(TDP_SYNCIO); vfs_msync(mp, MNT_NOWAIT); error = VFS_SYNC(mp, MNT_LAZY); curthread_pflags_restore(save); vn_finished_write(mp); vfs_unbusy(mp); return (error); } /* * The syncer vnode is no referenced. */ static int sync_inactive(struct vop_inactive_args *ap) { vgone(ap->a_vp); return (0); } /* * The syncer vnode is no longer needed and is being decommissioned. * * Modifications to the worklist must be protected by sync_mtx. */ static int sync_reclaim(struct vop_reclaim_args *ap) { struct vnode *vp = ap->a_vp; struct bufobj *bo; bo = &vp->v_bufobj; BO_LOCK(bo); mtx_lock(&sync_mtx); if (vp->v_mount->mnt_syncer == vp) vp->v_mount->mnt_syncer = NULL; if (bo->bo_flag & BO_ONWORKLST) { LIST_REMOVE(bo, bo_synclist); syncer_worklist_len--; sync_vnode_count--; bo->bo_flag &= ~BO_ONWORKLST; } mtx_unlock(&sync_mtx); BO_UNLOCK(bo); return (0); } /* * Check if vnode represents a disk device */ int vn_isdisk(struct vnode *vp, int *errp) { int error; if (vp->v_type != VCHR) { error = ENOTBLK; goto out; } error = 0; dev_lock(); if (vp->v_rdev == NULL) error = ENXIO; else if (vp->v_rdev->si_devsw == NULL) error = ENXIO; else if (!(vp->v_rdev->si_devsw->d_flags & D_DISK)) error = ENOTBLK; dev_unlock(); out: if (errp != NULL) *errp = error; return (error == 0); } /* * Common filesystem object access control check routine. Accepts a * vnode's type, "mode", uid and gid, requested access mode, credentials, * and optional call-by-reference privused argument allowing vaccess() * to indicate to the caller whether privilege was used to satisfy the * request (obsoleted). Returns 0 on success, or an errno on failure. */ int vaccess(enum vtype type, mode_t file_mode, uid_t file_uid, gid_t file_gid, accmode_t accmode, struct ucred *cred, int *privused) { accmode_t dac_granted; accmode_t priv_granted; KASSERT((accmode & ~(VEXEC | VWRITE | VREAD | VADMIN | VAPPEND)) == 0, ("invalid bit in accmode")); KASSERT((accmode & VAPPEND) == 0 || (accmode & VWRITE), ("VAPPEND without VWRITE")); /* * Look for a normal, non-privileged way to access the file/directory * as requested. If it exists, go with that. */ if (privused != NULL) *privused = 0; dac_granted = 0; /* Check the owner. */ if (cred->cr_uid == file_uid) { dac_granted |= VADMIN; if (file_mode & S_IXUSR) dac_granted |= VEXEC; if (file_mode & S_IRUSR) dac_granted |= VREAD; if (file_mode & S_IWUSR) dac_granted |= (VWRITE | VAPPEND); if ((accmode & dac_granted) == accmode) return (0); goto privcheck; } /* Otherwise, check the groups (first match) */ if (groupmember(file_gid, cred)) { if (file_mode & S_IXGRP) dac_granted |= VEXEC; if (file_mode & S_IRGRP) dac_granted |= VREAD; if (file_mode & S_IWGRP) dac_granted |= (VWRITE | VAPPEND); if ((accmode & dac_granted) == accmode) return (0); goto privcheck; } /* Otherwise, check everyone else. */ if (file_mode & S_IXOTH) dac_granted |= VEXEC; if (file_mode & S_IROTH) dac_granted |= VREAD; if (file_mode & S_IWOTH) dac_granted |= (VWRITE | VAPPEND); if ((accmode & dac_granted) == accmode) return (0); privcheck: /* * Build a privilege mask to determine if the set of privileges * satisfies the requirements when combined with the granted mask * from above. For each privilege, if the privilege is required, * bitwise or the request type onto the priv_granted mask. */ priv_granted = 0; if (type == VDIR) { /* * For directories, use PRIV_VFS_LOOKUP to satisfy VEXEC * requests, instead of PRIV_VFS_EXEC. */ if ((accmode & VEXEC) && ((dac_granted & VEXEC) == 0) && !priv_check_cred(cred, PRIV_VFS_LOOKUP, 0)) priv_granted |= VEXEC; } else { /* * Ensure that at least one execute bit is on. Otherwise, * a privileged user will always succeed, and we don't want * this to happen unless the file really is executable. */ if ((accmode & VEXEC) && ((dac_granted & VEXEC) == 0) && (file_mode & (S_IXUSR | S_IXGRP | S_IXOTH)) != 0 && !priv_check_cred(cred, PRIV_VFS_EXEC, 0)) priv_granted |= VEXEC; } if ((accmode & VREAD) && ((dac_granted & VREAD) == 0) && !priv_check_cred(cred, PRIV_VFS_READ, 0)) priv_granted |= VREAD; if ((accmode & VWRITE) && ((dac_granted & VWRITE) == 0) && !priv_check_cred(cred, PRIV_VFS_WRITE, 0)) priv_granted |= (VWRITE | VAPPEND); if ((accmode & VADMIN) && ((dac_granted & VADMIN) == 0) && !priv_check_cred(cred, PRIV_VFS_ADMIN, 0)) priv_granted |= VADMIN; if ((accmode & (priv_granted | dac_granted)) == accmode) { /* XXX audit: privilege used */ if (privused != NULL) *privused = 1; return (0); } return ((accmode & VADMIN) ? EPERM : EACCES); } /* * Credential check based on process requesting service, and per-attribute * permissions. */ int extattr_check_cred(struct vnode *vp, int attrnamespace, struct ucred *cred, struct thread *td, accmode_t accmode) { /* * Kernel-invoked always succeeds. */ if (cred == NOCRED) return (0); /* * Do not allow privileged processes in jail to directly manipulate * system attributes. */ switch (attrnamespace) { case EXTATTR_NAMESPACE_SYSTEM: /* Potentially should be: return (EPERM); */ return (priv_check_cred(cred, PRIV_VFS_EXTATTR_SYSTEM, 0)); case EXTATTR_NAMESPACE_USER: return (VOP_ACCESS(vp, accmode, cred, td)); default: return (EPERM); } } #ifdef DEBUG_VFS_LOCKS /* * This only exists to supress warnings from unlocked specfs accesses. It is * no longer ok to have an unlocked VFS. */ #define IGNORE_LOCK(vp) (panicstr != NULL || (vp) == NULL || \ (vp)->v_type == VCHR || (vp)->v_type == VBAD) int vfs_badlock_ddb = 1; /* Drop into debugger on violation. */ SYSCTL_INT(_debug, OID_AUTO, vfs_badlock_ddb, CTLFLAG_RW, &vfs_badlock_ddb, 0, "Drop into debugger on lock violation"); int vfs_badlock_mutex = 1; /* Check for interlock across VOPs. */ SYSCTL_INT(_debug, OID_AUTO, vfs_badlock_mutex, CTLFLAG_RW, &vfs_badlock_mutex, 0, "Check for interlock across VOPs"); int vfs_badlock_print = 1; /* Print lock violations. */ SYSCTL_INT(_debug, OID_AUTO, vfs_badlock_print, CTLFLAG_RW, &vfs_badlock_print, 0, "Print lock violations"); #ifdef KDB int vfs_badlock_backtrace = 1; /* Print backtrace at lock violations. */ SYSCTL_INT(_debug, OID_AUTO, vfs_badlock_backtrace, CTLFLAG_RW, &vfs_badlock_backtrace, 0, "Print backtrace at lock violations"); #endif static void vfs_badlock(const char *msg, const char *str, struct vnode *vp) { #ifdef KDB if (vfs_badlock_backtrace) kdb_backtrace(); #endif if (vfs_badlock_print) printf("%s: %p %s\n", str, (void *)vp, msg); if (vfs_badlock_ddb) kdb_enter(KDB_WHY_VFSLOCK, "lock violation"); } void assert_vi_locked(struct vnode *vp, const char *str) { if (vfs_badlock_mutex && !mtx_owned(VI_MTX(vp))) vfs_badlock("interlock is not locked but should be", str, vp); } void assert_vi_unlocked(struct vnode *vp, const char *str) { if (vfs_badlock_mutex && mtx_owned(VI_MTX(vp))) vfs_badlock("interlock is locked but should not be", str, vp); } void assert_vop_locked(struct vnode *vp, const char *str) { int locked; if (!IGNORE_LOCK(vp)) { locked = VOP_ISLOCKED(vp); if (locked == 0 || locked == LK_EXCLOTHER) vfs_badlock("is not locked but should be", str, vp); } } void assert_vop_unlocked(struct vnode *vp, const char *str) { if (!IGNORE_LOCK(vp) && VOP_ISLOCKED(vp) == LK_EXCLUSIVE) vfs_badlock("is locked but should not be", str, vp); } void assert_vop_elocked(struct vnode *vp, const char *str) { if (!IGNORE_LOCK(vp) && VOP_ISLOCKED(vp) != LK_EXCLUSIVE) vfs_badlock("is not exclusive locked but should be", str, vp); } #if 0 void assert_vop_elocked_other(struct vnode *vp, const char *str) { if (!IGNORE_LOCK(vp) && VOP_ISLOCKED(vp) != LK_EXCLOTHER) vfs_badlock("is not exclusive locked by another thread", str, vp); } void assert_vop_slocked(struct vnode *vp, const char *str) { if (!IGNORE_LOCK(vp) && VOP_ISLOCKED(vp) != LK_SHARED) vfs_badlock("is not locked shared but should be", str, vp); } #endif /* 0 */ #endif /* DEBUG_VFS_LOCKS */ void vop_rename_fail(struct vop_rename_args *ap) { if (ap->a_tvp != NULL) vput(ap->a_tvp); if (ap->a_tdvp == ap->a_tvp) vrele(ap->a_tdvp); else vput(ap->a_tdvp); vrele(ap->a_fdvp); vrele(ap->a_fvp); } void vop_rename_pre(void *ap) { struct vop_rename_args *a = ap; #ifdef DEBUG_VFS_LOCKS if (a->a_tvp) ASSERT_VI_UNLOCKED(a->a_tvp, "VOP_RENAME"); ASSERT_VI_UNLOCKED(a->a_tdvp, "VOP_RENAME"); ASSERT_VI_UNLOCKED(a->a_fvp, "VOP_RENAME"); ASSERT_VI_UNLOCKED(a->a_fdvp, "VOP_RENAME"); /* Check the source (from). */ if (a->a_tdvp->v_vnlock != a->a_fdvp->v_vnlock && (a->a_tvp == NULL || a->a_tvp->v_vnlock != a->a_fdvp->v_vnlock)) ASSERT_VOP_UNLOCKED(a->a_fdvp, "vop_rename: fdvp locked"); if (a->a_tvp == NULL || a->a_tvp->v_vnlock != a->a_fvp->v_vnlock) ASSERT_VOP_UNLOCKED(a->a_fvp, "vop_rename: fvp locked"); /* Check the target. */ if (a->a_tvp) ASSERT_VOP_LOCKED(a->a_tvp, "vop_rename: tvp not locked"); ASSERT_VOP_LOCKED(a->a_tdvp, "vop_rename: tdvp not locked"); #endif if (a->a_tdvp != a->a_fdvp) vhold(a->a_fdvp); if (a->a_tvp != a->a_fvp) vhold(a->a_fvp); vhold(a->a_tdvp); if (a->a_tvp) vhold(a->a_tvp); } void vop_strategy_pre(void *ap) { #ifdef DEBUG_VFS_LOCKS struct vop_strategy_args *a; struct buf *bp; a = ap; bp = a->a_bp; /* * Cluster ops lock their component buffers but not the IO container. */ if ((bp->b_flags & B_CLUSTER) != 0) return; if (panicstr == NULL && !BUF_ISLOCKED(bp)) { if (vfs_badlock_print) printf( "VOP_STRATEGY: bp is not locked but should be\n"); if (vfs_badlock_ddb) kdb_enter(KDB_WHY_VFSLOCK, "lock violation"); } #endif } void vop_lock_pre(void *ap) { #ifdef DEBUG_VFS_LOCKS struct vop_lock1_args *a = ap; if ((a->a_flags & LK_INTERLOCK) == 0) ASSERT_VI_UNLOCKED(a->a_vp, "VOP_LOCK"); else ASSERT_VI_LOCKED(a->a_vp, "VOP_LOCK"); #endif } void vop_lock_post(void *ap, int rc) { #ifdef DEBUG_VFS_LOCKS struct vop_lock1_args *a = ap; ASSERT_VI_UNLOCKED(a->a_vp, "VOP_LOCK"); if (rc == 0 && (a->a_flags & LK_EXCLOTHER) == 0) ASSERT_VOP_LOCKED(a->a_vp, "VOP_LOCK"); #endif } void vop_unlock_pre(void *ap) { #ifdef DEBUG_VFS_LOCKS struct vop_unlock_args *a = ap; if (a->a_flags & LK_INTERLOCK) ASSERT_VI_LOCKED(a->a_vp, "VOP_UNLOCK"); ASSERT_VOP_LOCKED(a->a_vp, "VOP_UNLOCK"); #endif } void vop_unlock_post(void *ap, int rc) { #ifdef DEBUG_VFS_LOCKS struct vop_unlock_args *a = ap; if (a->a_flags & LK_INTERLOCK) ASSERT_VI_UNLOCKED(a->a_vp, "VOP_UNLOCK"); #endif } void vop_create_post(void *ap, int rc) { struct vop_create_args *a = ap; if (!rc) VFS_KNOTE_LOCKED(a->a_dvp, NOTE_WRITE); } void vop_deleteextattr_post(void *ap, int rc) { struct vop_deleteextattr_args *a = ap; if (!rc) VFS_KNOTE_LOCKED(a->a_vp, NOTE_ATTRIB); } void vop_link_post(void *ap, int rc) { struct vop_link_args *a = ap; if (!rc) { VFS_KNOTE_LOCKED(a->a_vp, NOTE_LINK); VFS_KNOTE_LOCKED(a->a_tdvp, NOTE_WRITE); } } void vop_mkdir_post(void *ap, int rc) { struct vop_mkdir_args *a = ap; if (!rc) VFS_KNOTE_LOCKED(a->a_dvp, NOTE_WRITE | NOTE_LINK); } void vop_mknod_post(void *ap, int rc) { struct vop_mknod_args *a = ap; if (!rc) VFS_KNOTE_LOCKED(a->a_dvp, NOTE_WRITE); } void vop_remove_post(void *ap, int rc) { struct vop_remove_args *a = ap; if (!rc) { VFS_KNOTE_LOCKED(a->a_dvp, NOTE_WRITE); VFS_KNOTE_LOCKED(a->a_vp, NOTE_DELETE); } } void vop_rename_post(void *ap, int rc) { struct vop_rename_args *a = ap; if (!rc) { VFS_KNOTE_UNLOCKED(a->a_fdvp, NOTE_WRITE); VFS_KNOTE_UNLOCKED(a->a_tdvp, NOTE_WRITE); VFS_KNOTE_UNLOCKED(a->a_fvp, NOTE_RENAME); if (a->a_tvp) VFS_KNOTE_UNLOCKED(a->a_tvp, NOTE_DELETE); } if (a->a_tdvp != a->a_fdvp) vdrop(a->a_fdvp); if (a->a_tvp != a->a_fvp) vdrop(a->a_fvp); vdrop(a->a_tdvp); if (a->a_tvp) vdrop(a->a_tvp); } void vop_rmdir_post(void *ap, int rc) { struct vop_rmdir_args *a = ap; if (!rc) { VFS_KNOTE_LOCKED(a->a_dvp, NOTE_WRITE | NOTE_LINK); VFS_KNOTE_LOCKED(a->a_vp, NOTE_DELETE); } } void vop_setattr_post(void *ap, int rc) { struct vop_setattr_args *a = ap; if (!rc) VFS_KNOTE_LOCKED(a->a_vp, NOTE_ATTRIB); } void vop_setextattr_post(void *ap, int rc) { struct vop_setextattr_args *a = ap; if (!rc) VFS_KNOTE_LOCKED(a->a_vp, NOTE_ATTRIB); } void vop_symlink_post(void *ap, int rc) { struct vop_symlink_args *a = ap; if (!rc) VFS_KNOTE_LOCKED(a->a_dvp, NOTE_WRITE); } static struct knlist fs_knlist; static void vfs_event_init(void *arg) { knlist_init_mtx(&fs_knlist, NULL); } /* XXX - correct order? */ SYSINIT(vfs_knlist, SI_SUB_VFS, SI_ORDER_ANY, vfs_event_init, NULL); void vfs_event_signal(fsid_t *fsid, uint32_t event, intptr_t data __unused) { KNOTE_UNLOCKED(&fs_knlist, event); } static int filt_fsattach(struct knote *kn); static void filt_fsdetach(struct knote *kn); static int filt_fsevent(struct knote *kn, long hint); struct filterops fs_filtops = { .f_isfd = 0, .f_attach = filt_fsattach, .f_detach = filt_fsdetach, .f_event = filt_fsevent }; static int filt_fsattach(struct knote *kn) { kn->kn_flags |= EV_CLEAR; knlist_add(&fs_knlist, kn, 0); return (0); } static void filt_fsdetach(struct knote *kn) { knlist_remove(&fs_knlist, kn, 0); } static int filt_fsevent(struct knote *kn, long hint) { kn->kn_fflags |= hint; return (kn->kn_fflags != 0); } static int sysctl_vfs_ctl(SYSCTL_HANDLER_ARGS) { struct vfsidctl vc; int error; struct mount *mp; error = SYSCTL_IN(req, &vc, sizeof(vc)); if (error) return (error); if (vc.vc_vers != VFS_CTL_VERS1) return (EINVAL); mp = vfs_getvfs(&vc.vc_fsid); if (mp == NULL) return (ENOENT); /* ensure that a specific sysctl goes to the right filesystem. */ if (strcmp(vc.vc_fstypename, "*") != 0 && strcmp(vc.vc_fstypename, mp->mnt_vfc->vfc_name) != 0) { vfs_rel(mp); return (EINVAL); } VCTLTOREQ(&vc, req); error = VFS_SYSCTL(mp, vc.vc_op, req); vfs_rel(mp); return (error); } SYSCTL_PROC(_vfs, OID_AUTO, ctl, CTLTYPE_OPAQUE | CTLFLAG_WR, NULL, 0, sysctl_vfs_ctl, "", "Sysctl by fsid"); /* * Function to initialize a va_filerev field sensibly. * XXX: Wouldn't a random number make a lot more sense ?? */ u_quad_t init_va_filerev(void) { struct bintime bt; getbinuptime(&bt); return (((u_quad_t)bt.sec << 32LL) | (bt.frac >> 32LL)); } static int filt_vfsread(struct knote *kn, long hint); static int filt_vfswrite(struct knote *kn, long hint); static int filt_vfsvnode(struct knote *kn, long hint); static void filt_vfsdetach(struct knote *kn); static struct filterops vfsread_filtops = { .f_isfd = 1, .f_detach = filt_vfsdetach, .f_event = filt_vfsread }; static struct filterops vfswrite_filtops = { .f_isfd = 1, .f_detach = filt_vfsdetach, .f_event = filt_vfswrite }; static struct filterops vfsvnode_filtops = { .f_isfd = 1, .f_detach = filt_vfsdetach, .f_event = filt_vfsvnode }; static void vfs_knllock(void *arg) { struct vnode *vp = arg; vn_lock(vp, LK_EXCLUSIVE | LK_RETRY); } static void vfs_knlunlock(void *arg) { struct vnode *vp = arg; VOP_UNLOCK(vp, 0); } static void vfs_knl_assert_locked(void *arg) { #ifdef DEBUG_VFS_LOCKS struct vnode *vp = arg; ASSERT_VOP_LOCKED(vp, "vfs_knl_assert_locked"); #endif } static void vfs_knl_assert_unlocked(void *arg) { #ifdef DEBUG_VFS_LOCKS struct vnode *vp = arg; ASSERT_VOP_UNLOCKED(vp, "vfs_knl_assert_unlocked"); #endif } int vfs_kqfilter(struct vop_kqfilter_args *ap) { struct vnode *vp = ap->a_vp; struct knote *kn = ap->a_kn; struct knlist *knl; switch (kn->kn_filter) { case EVFILT_READ: kn->kn_fop = &vfsread_filtops; break; case EVFILT_WRITE: kn->kn_fop = &vfswrite_filtops; break; case EVFILT_VNODE: kn->kn_fop = &vfsvnode_filtops; break; default: return (EINVAL); } kn->kn_hook = (caddr_t)vp; v_addpollinfo(vp); if (vp->v_pollinfo == NULL) return (ENOMEM); knl = &vp->v_pollinfo->vpi_selinfo.si_note; vhold(vp); knlist_add(knl, kn, 0); return (0); } /* * Detach knote from vnode */ static void filt_vfsdetach(struct knote *kn) { struct vnode *vp = (struct vnode *)kn->kn_hook; KASSERT(vp->v_pollinfo != NULL, ("Missing v_pollinfo")); knlist_remove(&vp->v_pollinfo->vpi_selinfo.si_note, kn, 0); vdrop(vp); } /*ARGSUSED*/ static int filt_vfsread(struct knote *kn, long hint) { struct vnode *vp = (struct vnode *)kn->kn_hook; struct vattr va; int res; /* * filesystem is gone, so set the EOF flag and schedule * the knote for deletion. */ if (hint == NOTE_REVOKE) { VI_LOCK(vp); kn->kn_flags |= (EV_EOF | EV_ONESHOT); VI_UNLOCK(vp); return (1); } if (VOP_GETATTR(vp, &va, curthread->td_ucred)) return (0); VI_LOCK(vp); kn->kn_data = va.va_size - kn->kn_fp->f_offset; res = (kn->kn_data != 0); VI_UNLOCK(vp); return (res); } /*ARGSUSED*/ static int filt_vfswrite(struct knote *kn, long hint) { struct vnode *vp = (struct vnode *)kn->kn_hook; VI_LOCK(vp); /* * filesystem is gone, so set the EOF flag and schedule * the knote for deletion. */ if (hint == NOTE_REVOKE) kn->kn_flags |= (EV_EOF | EV_ONESHOT); kn->kn_data = 0; VI_UNLOCK(vp); return (1); } static int filt_vfsvnode(struct knote *kn, long hint) { struct vnode *vp = (struct vnode *)kn->kn_hook; int res; VI_LOCK(vp); if (kn->kn_sfflags & hint) kn->kn_fflags |= hint; if (hint == NOTE_REVOKE) { kn->kn_flags |= EV_EOF; VI_UNLOCK(vp); return (1); } res = (kn->kn_fflags != 0); VI_UNLOCK(vp); return (res); } int vfs_read_dirent(struct vop_readdir_args *ap, struct dirent *dp, off_t off) { int error; if (dp->d_reclen > ap->a_uio->uio_resid) return (ENAMETOOLONG); error = uiomove(dp, dp->d_reclen, ap->a_uio); if (error) { if (ap->a_ncookies != NULL) { if (ap->a_cookies != NULL) free(ap->a_cookies, M_TEMP); ap->a_cookies = NULL; *ap->a_ncookies = 0; } return (error); } if (ap->a_ncookies == NULL) return (0); KASSERT(ap->a_cookies, ("NULL ap->a_cookies value with non-NULL ap->a_ncookies!")); *ap->a_cookies = realloc(*ap->a_cookies, (*ap->a_ncookies + 1) * sizeof(u_long), M_TEMP, M_WAITOK | M_ZERO); (*ap->a_cookies)[*ap->a_ncookies] = off; return (0); } /* * Mark for update the access time of the file if the filesystem * supports VOP_MARKATIME. This functionality is used by execve and * mmap, so we want to avoid the I/O implied by directly setting * va_atime for the sake of efficiency. */ void vfs_mark_atime(struct vnode *vp, struct ucred *cred) { struct mount *mp; mp = vp->v_mount; ASSERT_VOP_LOCKED(vp, "vfs_mark_atime"); if (mp != NULL && (mp->mnt_flag & (MNT_NOATIME | MNT_RDONLY)) == 0) (void)VOP_MARKATIME(vp); } /* * The purpose of this routine is to remove granularity from accmode_t, * reducing it into standard unix access bits - VEXEC, VREAD, VWRITE, * VADMIN and VAPPEND. * * If it returns 0, the caller is supposed to continue with the usual * access checks using 'accmode' as modified by this routine. If it * returns nonzero value, the caller is supposed to return that value * as errno. * * Note that after this routine runs, accmode may be zero. */ int vfs_unixify_accmode(accmode_t *accmode) { /* * There is no way to specify explicit "deny" rule using * file mode or POSIX.1e ACLs. */ if (*accmode & VEXPLICIT_DENY) { *accmode = 0; return (0); } /* * None of these can be translated into usual access bits. * Also, the common case for NFSv4 ACLs is to not contain * either of these bits. Caller should check for VWRITE * on the containing directory instead. */ if (*accmode & (VDELETE_CHILD | VDELETE)) return (EPERM); if (*accmode & VADMIN_PERMS) { *accmode &= ~VADMIN_PERMS; *accmode |= VADMIN; } /* * There is no way to deny VREAD_ATTRIBUTES, VREAD_ACL * or VSYNCHRONIZE using file mode or POSIX.1e ACL. */ *accmode &= ~(VSTAT_PERMS | VSYNCHRONIZE); return (0); } /* * These are helper functions for filesystems to traverse all * their vnodes. See MNT_VNODE_FOREACH_ALL() in sys/mount.h. * * This interface replaces MNT_VNODE_FOREACH. */ MALLOC_DEFINE(M_VNODE_MARKER, "vnodemarker", "vnode marker"); struct vnode * __mnt_vnode_next_all(struct vnode **mvp, struct mount *mp) { struct vnode *vp; if (should_yield()) kern_yield(PRI_USER); MNT_ILOCK(mp); KASSERT((*mvp)->v_mount == mp, ("marker vnode mount list mismatch")); vp = TAILQ_NEXT(*mvp, v_nmntvnodes); while (vp != NULL && (vp->v_type == VMARKER || (vp->v_iflag & VI_DOOMED) != 0)) vp = TAILQ_NEXT(vp, v_nmntvnodes); /* Check if we are done */ if (vp == NULL) { __mnt_vnode_markerfree_all(mvp, mp); /* MNT_IUNLOCK(mp); -- done in above function */ mtx_assert(MNT_MTX(mp), MA_NOTOWNED); return (NULL); } TAILQ_REMOVE(&mp->mnt_nvnodelist, *mvp, v_nmntvnodes); TAILQ_INSERT_AFTER(&mp->mnt_nvnodelist, vp, *mvp, v_nmntvnodes); VI_LOCK(vp); MNT_IUNLOCK(mp); return (vp); } struct vnode * __mnt_vnode_first_all(struct vnode **mvp, struct mount *mp) { struct vnode *vp; *mvp = malloc(sizeof(struct vnode), M_VNODE_MARKER, M_WAITOK | M_ZERO); MNT_ILOCK(mp); MNT_REF(mp); (*mvp)->v_type = VMARKER; vp = TAILQ_FIRST(&mp->mnt_nvnodelist); while (vp != NULL && (vp->v_type == VMARKER || (vp->v_iflag & VI_DOOMED) != 0)) vp = TAILQ_NEXT(vp, v_nmntvnodes); /* Check if we are done */ if (vp == NULL) { MNT_REL(mp); MNT_IUNLOCK(mp); free(*mvp, M_VNODE_MARKER); *mvp = NULL; return (NULL); } (*mvp)->v_mount = mp; TAILQ_INSERT_AFTER(&mp->mnt_nvnodelist, vp, *mvp, v_nmntvnodes); VI_LOCK(vp); MNT_IUNLOCK(mp); return (vp); } void __mnt_vnode_markerfree_all(struct vnode **mvp, struct mount *mp) { if (*mvp == NULL) { MNT_IUNLOCK(mp); return; } mtx_assert(MNT_MTX(mp), MA_OWNED); KASSERT((*mvp)->v_mount == mp, ("marker vnode mount list mismatch")); TAILQ_REMOVE(&mp->mnt_nvnodelist, *mvp, v_nmntvnodes); MNT_REL(mp); MNT_IUNLOCK(mp); free(*mvp, M_VNODE_MARKER); *mvp = NULL; } /* * These are helper functions for filesystems to traverse their * active vnodes. See MNT_VNODE_FOREACH_ACTIVE() in sys/mount.h */ static void mnt_vnode_markerfree_active(struct vnode **mvp, struct mount *mp) { KASSERT((*mvp)->v_mount == mp, ("marker vnode mount list mismatch")); MNT_ILOCK(mp); MNT_REL(mp); MNT_IUNLOCK(mp); free(*mvp, M_VNODE_MARKER); *mvp = NULL; } static struct vnode * mnt_vnode_next_active(struct vnode **mvp, struct mount *mp) { struct vnode *vp, *nvp; mtx_assert(&vnode_free_list_mtx, MA_OWNED); KASSERT((*mvp)->v_mount == mp, ("marker vnode mount list mismatch")); restart: vp = TAILQ_NEXT(*mvp, v_actfreelist); TAILQ_REMOVE(&mp->mnt_activevnodelist, *mvp, v_actfreelist); while (vp != NULL) { if (vp->v_type == VMARKER) { vp = TAILQ_NEXT(vp, v_actfreelist); continue; } if (!VI_TRYLOCK(vp)) { if (mp_ncpus == 1 || should_yield()) { TAILQ_INSERT_BEFORE(vp, *mvp, v_actfreelist); mtx_unlock(&vnode_free_list_mtx); pause("vnacti", 1); mtx_lock(&vnode_free_list_mtx); goto restart; } continue; } KASSERT(vp->v_type != VMARKER, ("locked marker %p", vp)); KASSERT(vp->v_mount == mp || vp->v_mount == NULL, ("alien vnode on the active list %p %p", vp, mp)); if (vp->v_mount == mp && (vp->v_iflag & VI_DOOMED) == 0) break; nvp = TAILQ_NEXT(vp, v_actfreelist); VI_UNLOCK(vp); vp = nvp; } /* Check if we are done */ if (vp == NULL) { mtx_unlock(&vnode_free_list_mtx); mnt_vnode_markerfree_active(mvp, mp); return (NULL); } TAILQ_INSERT_AFTER(&mp->mnt_activevnodelist, vp, *mvp, v_actfreelist); mtx_unlock(&vnode_free_list_mtx); ASSERT_VI_LOCKED(vp, "active iter"); KASSERT((vp->v_iflag & VI_ACTIVE) != 0, ("Non-active vp %p", vp)); return (vp); } struct vnode * __mnt_vnode_next_active(struct vnode **mvp, struct mount *mp) { if (should_yield()) kern_yield(PRI_USER); mtx_lock(&vnode_free_list_mtx); return (mnt_vnode_next_active(mvp, mp)); } struct vnode * __mnt_vnode_first_active(struct vnode **mvp, struct mount *mp) { struct vnode *vp; *mvp = malloc(sizeof(struct vnode), M_VNODE_MARKER, M_WAITOK | M_ZERO); MNT_ILOCK(mp); MNT_REF(mp); MNT_IUNLOCK(mp); (*mvp)->v_type = VMARKER; (*mvp)->v_mount = mp; mtx_lock(&vnode_free_list_mtx); vp = TAILQ_FIRST(&mp->mnt_activevnodelist); if (vp == NULL) { mtx_unlock(&vnode_free_list_mtx); mnt_vnode_markerfree_active(mvp, mp); return (NULL); } TAILQ_INSERT_BEFORE(vp, *mvp, v_actfreelist); return (mnt_vnode_next_active(mvp, mp)); } void __mnt_vnode_markerfree_active(struct vnode **mvp, struct mount *mp) { if (*mvp == NULL) return; mtx_lock(&vnode_free_list_mtx); TAILQ_REMOVE(&mp->mnt_activevnodelist, *mvp, v_actfreelist); mtx_unlock(&vnode_free_list_mtx); mnt_vnode_markerfree_active(mvp, mp); } Index: user/ngie/more-tests/sys/kern/vfs_syscalls.c =================================================================== --- user/ngie/more-tests/sys/kern/vfs_syscalls.c (revision 281584) +++ user/ngie/more-tests/sys/kern/vfs_syscalls.c (revision 281585) @@ -1,4753 +1,4760 @@ /*- * Copyright (c) 1989, 1993 * The Regents of the University of California. All rights reserved. * (c) UNIX System Laboratories, Inc. * All or some portions of this file are derived from material licensed * to the University of California by American Telephone and Telegraph * Co. or Unix System Laboratories, Inc. and are reproduced herein with * the permission of UNIX System Laboratories, Inc. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 4. Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * @(#)vfs_syscalls.c 8.13 (Berkeley) 4/15/94 */ #include __FBSDID("$FreeBSD$"); #include "opt_capsicum.h" #include "opt_compat.h" #include "opt_ktrace.h" #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #ifdef KTRACE #include #endif #include #include #include #include #include #include #include #include MALLOC_DEFINE(M_FADVISE, "fadvise", "posix_fadvise(2) information"); SDT_PROVIDER_DEFINE(vfs); SDT_PROBE_DEFINE2(vfs, , stat, mode, "char *", "int"); SDT_PROBE_DEFINE2(vfs, , stat, reg, "char *", "int"); static int chroot_refuse_vdir_fds(struct filedesc *fdp); static int kern_chflagsat(struct thread *td, int fd, const char *path, enum uio_seg pathseg, u_long flags, int atflag); static int setfflags(struct thread *td, struct vnode *, u_long); static int getutimes(const struct timeval *, enum uio_seg, struct timespec *); static int getutimens(const struct timespec *, enum uio_seg, struct timespec *, int *); static int setutimes(struct thread *td, struct vnode *, const struct timespec *, int, int); static int vn_access(struct vnode *vp, int user_flags, struct ucred *cred, struct thread *td); /* * The module initialization routine for POSIX asynchronous I/O will * set this to the version of AIO that it implements. (Zero means * that it is not implemented.) This value is used here by pathconf() * and in kern_descrip.c by fpathconf(). */ int async_io_version; /* * Sync each mounted filesystem. */ #ifndef _SYS_SYSPROTO_H_ struct sync_args { int dummy; }; #endif /* ARGSUSED */ int sys_sync(td, uap) struct thread *td; struct sync_args *uap; { struct mount *mp, *nmp; int save; mtx_lock(&mountlist_mtx); for (mp = TAILQ_FIRST(&mountlist); mp != NULL; mp = nmp) { if (vfs_busy(mp, MBF_NOWAIT | MBF_MNTLSTLOCK)) { nmp = TAILQ_NEXT(mp, mnt_list); continue; } if ((mp->mnt_flag & MNT_RDONLY) == 0 && vn_start_write(NULL, &mp, V_NOWAIT) == 0) { save = curthread_pflags_set(TDP_SYNCIO); vfs_msync(mp, MNT_NOWAIT); VFS_SYNC(mp, MNT_NOWAIT); curthread_pflags_restore(save); vn_finished_write(mp); } mtx_lock(&mountlist_mtx); nmp = TAILQ_NEXT(mp, mnt_list); vfs_unbusy(mp); } mtx_unlock(&mountlist_mtx); return (0); } /* * Change filesystem quotas. */ #ifndef _SYS_SYSPROTO_H_ struct quotactl_args { char *path; int cmd; int uid; caddr_t arg; }; #endif int sys_quotactl(td, uap) struct thread *td; register struct quotactl_args /* { char *path; int cmd; int uid; caddr_t arg; } */ *uap; { struct mount *mp; struct nameidata nd; int error; AUDIT_ARG_CMD(uap->cmd); AUDIT_ARG_UID(uap->uid); if (!prison_allow(td->td_ucred, PR_ALLOW_QUOTAS)) return (EPERM); NDINIT(&nd, LOOKUP, FOLLOW | LOCKLEAF | AUDITVNODE1, UIO_USERSPACE, uap->path, td); if ((error = namei(&nd)) != 0) return (error); NDFREE(&nd, NDF_ONLY_PNBUF); mp = nd.ni_vp->v_mount; vfs_ref(mp); vput(nd.ni_vp); error = vfs_busy(mp, 0); vfs_rel(mp); if (error != 0) return (error); error = VFS_QUOTACTL(mp, uap->cmd, uap->uid, uap->arg); /* * Since quota on operation typically needs to open quota * file, the Q_QUOTAON handler needs to unbusy the mount point * before calling into namei. Otherwise, unmount might be * started between two vfs_busy() invocations (first is our, * second is from mount point cross-walk code in lookup()), * causing deadlock. * * Require that Q_QUOTAON handles the vfs_busy() reference on * its own, always returning with ubusied mount point. */ if ((uap->cmd >> SUBCMDSHIFT) != Q_QUOTAON) vfs_unbusy(mp); return (error); } /* * Used by statfs conversion routines to scale the block size up if * necessary so that all of the block counts are <= 'max_size'. Note * that 'max_size' should be a bitmask, i.e. 2^n - 1 for some non-zero * value of 'n'. */ void statfs_scale_blocks(struct statfs *sf, long max_size) { uint64_t count; int shift; KASSERT(powerof2(max_size + 1), ("%s: invalid max_size", __func__)); /* * Attempt to scale the block counts to give a more accurate * overview to userland of the ratio of free space to used * space. To do this, find the largest block count and compute * a divisor that lets it fit into a signed integer <= max_size. */ if (sf->f_bavail < 0) count = -sf->f_bavail; else count = sf->f_bavail; count = MAX(sf->f_blocks, MAX(sf->f_bfree, count)); if (count <= max_size) return; count >>= flsl(max_size); shift = 0; while (count > 0) { shift++; count >>=1; } sf->f_bsize <<= shift; sf->f_blocks >>= shift; sf->f_bfree >>= shift; sf->f_bavail >>= shift; } /* * Get filesystem statistics. */ #ifndef _SYS_SYSPROTO_H_ struct statfs_args { char *path; struct statfs *buf; }; #endif int sys_statfs(td, uap) struct thread *td; register struct statfs_args /* { char *path; struct statfs *buf; } */ *uap; { struct statfs sf; int error; error = kern_statfs(td, uap->path, UIO_USERSPACE, &sf); if (error == 0) error = copyout(&sf, uap->buf, sizeof(sf)); return (error); } int kern_statfs(struct thread *td, char *path, enum uio_seg pathseg, struct statfs *buf) { struct mount *mp; struct statfs *sp, sb; struct nameidata nd; int error; NDINIT(&nd, LOOKUP, FOLLOW | LOCKSHARED | LOCKLEAF | AUDITVNODE1, pathseg, path, td); error = namei(&nd); if (error != 0) return (error); mp = nd.ni_vp->v_mount; vfs_ref(mp); NDFREE(&nd, NDF_ONLY_PNBUF); vput(nd.ni_vp); error = vfs_busy(mp, 0); vfs_rel(mp); if (error != 0) return (error); #ifdef MAC error = mac_mount_check_stat(td->td_ucred, mp); if (error != 0) goto out; #endif /* * Set these in case the underlying filesystem fails to do so. */ sp = &mp->mnt_stat; sp->f_version = STATFS_VERSION; sp->f_namemax = NAME_MAX; sp->f_flags = mp->mnt_flag & MNT_VISFLAGMASK; error = VFS_STATFS(mp, sp); if (error != 0) goto out; if (priv_check(td, PRIV_VFS_GENERATION)) { bcopy(sp, &sb, sizeof(sb)); sb.f_fsid.val[0] = sb.f_fsid.val[1] = 0; prison_enforce_statfs(td->td_ucred, mp, &sb); sp = &sb; } *buf = *sp; out: vfs_unbusy(mp); return (error); } /* * Get filesystem statistics. */ #ifndef _SYS_SYSPROTO_H_ struct fstatfs_args { int fd; struct statfs *buf; }; #endif int sys_fstatfs(td, uap) struct thread *td; register struct fstatfs_args /* { int fd; struct statfs *buf; } */ *uap; { struct statfs sf; int error; error = kern_fstatfs(td, uap->fd, &sf); if (error == 0) error = copyout(&sf, uap->buf, sizeof(sf)); return (error); } int kern_fstatfs(struct thread *td, int fd, struct statfs *buf) { struct file *fp; struct mount *mp; struct statfs *sp, sb; struct vnode *vp; cap_rights_t rights; int error; AUDIT_ARG_FD(fd); error = getvnode(td->td_proc->p_fd, fd, cap_rights_init(&rights, CAP_FSTATFS), &fp); if (error != 0) return (error); vp = fp->f_vnode; vn_lock(vp, LK_SHARED | LK_RETRY); #ifdef AUDIT AUDIT_ARG_VNODE1(vp); #endif mp = vp->v_mount; if (mp) vfs_ref(mp); VOP_UNLOCK(vp, 0); fdrop(fp, td); if (mp == NULL) { error = EBADF; goto out; } error = vfs_busy(mp, 0); vfs_rel(mp); if (error != 0) return (error); #ifdef MAC error = mac_mount_check_stat(td->td_ucred, mp); if (error != 0) goto out; #endif /* * Set these in case the underlying filesystem fails to do so. */ sp = &mp->mnt_stat; sp->f_version = STATFS_VERSION; sp->f_namemax = NAME_MAX; sp->f_flags = mp->mnt_flag & MNT_VISFLAGMASK; error = VFS_STATFS(mp, sp); if (error != 0) goto out; if (priv_check(td, PRIV_VFS_GENERATION)) { bcopy(sp, &sb, sizeof(sb)); sb.f_fsid.val[0] = sb.f_fsid.val[1] = 0; prison_enforce_statfs(td->td_ucred, mp, &sb); sp = &sb; } *buf = *sp; out: if (mp) vfs_unbusy(mp); return (error); } /* * Get statistics on all filesystems. */ #ifndef _SYS_SYSPROTO_H_ struct getfsstat_args { struct statfs *buf; long bufsize; int flags; }; #endif int sys_getfsstat(td, uap) struct thread *td; register struct getfsstat_args /* { struct statfs *buf; long bufsize; int flags; } */ *uap; { + size_t count; + int error; - return (kern_getfsstat(td, &uap->buf, uap->bufsize, UIO_USERSPACE, - uap->flags)); + error = kern_getfsstat(td, &uap->buf, uap->bufsize, &count, + UIO_USERSPACE, uap->flags); + if (error == 0) + td->td_retval[0] = count; + return (error); } /* * If (bufsize > 0 && bufseg == UIO_SYSSPACE) * The caller is responsible for freeing memory which will be allocated * in '*buf'. */ int kern_getfsstat(struct thread *td, struct statfs **buf, size_t bufsize, - enum uio_seg bufseg, int flags) + size_t *countp, enum uio_seg bufseg, int flags) { struct mount *mp, *nmp; struct statfs *sfsp, *sp, sb; size_t count, maxcount; int error; maxcount = bufsize / sizeof(struct statfs); if (bufsize == 0) sfsp = NULL; else if (bufseg == UIO_USERSPACE) sfsp = *buf; else /* if (bufseg == UIO_SYSSPACE) */ { count = 0; mtx_lock(&mountlist_mtx); TAILQ_FOREACH(mp, &mountlist, mnt_list) { count++; } mtx_unlock(&mountlist_mtx); if (maxcount > count) maxcount = count; sfsp = *buf = malloc(maxcount * sizeof(struct statfs), M_TEMP, M_WAITOK); } count = 0; mtx_lock(&mountlist_mtx); for (mp = TAILQ_FIRST(&mountlist); mp != NULL; mp = nmp) { if (prison_canseemount(td->td_ucred, mp) != 0) { nmp = TAILQ_NEXT(mp, mnt_list); continue; } #ifdef MAC if (mac_mount_check_stat(td->td_ucred, mp) != 0) { nmp = TAILQ_NEXT(mp, mnt_list); continue; } #endif if (vfs_busy(mp, MBF_NOWAIT | MBF_MNTLSTLOCK)) { nmp = TAILQ_NEXT(mp, mnt_list); continue; } if (sfsp && count < maxcount) { sp = &mp->mnt_stat; /* * Set these in case the underlying filesystem * fails to do so. */ sp->f_version = STATFS_VERSION; sp->f_namemax = NAME_MAX; sp->f_flags = mp->mnt_flag & MNT_VISFLAGMASK; /* * If MNT_NOWAIT or MNT_LAZY is specified, do not * refresh the fsstat cache. MNT_NOWAIT or MNT_LAZY * overrides MNT_WAIT. */ if (((flags & (MNT_LAZY|MNT_NOWAIT)) == 0 || (flags & MNT_WAIT)) && (error = VFS_STATFS(mp, sp))) { mtx_lock(&mountlist_mtx); nmp = TAILQ_NEXT(mp, mnt_list); vfs_unbusy(mp); continue; } if (priv_check(td, PRIV_VFS_GENERATION)) { bcopy(sp, &sb, sizeof(sb)); sb.f_fsid.val[0] = sb.f_fsid.val[1] = 0; prison_enforce_statfs(td->td_ucred, mp, &sb); sp = &sb; } if (bufseg == UIO_SYSSPACE) bcopy(sp, sfsp, sizeof(*sp)); else /* if (bufseg == UIO_USERSPACE) */ { error = copyout(sp, sfsp, sizeof(*sp)); if (error != 0) { vfs_unbusy(mp); return (error); } } sfsp++; } count++; mtx_lock(&mountlist_mtx); nmp = TAILQ_NEXT(mp, mnt_list); vfs_unbusy(mp); } mtx_unlock(&mountlist_mtx); if (sfsp && count > maxcount) - td->td_retval[0] = maxcount; + *countp = maxcount; else - td->td_retval[0] = count; + *countp = count; return (0); } #ifdef COMPAT_FREEBSD4 /* * Get old format filesystem statistics. */ static void cvtstatfs(struct statfs *, struct ostatfs *); #ifndef _SYS_SYSPROTO_H_ struct freebsd4_statfs_args { char *path; struct ostatfs *buf; }; #endif int freebsd4_statfs(td, uap) struct thread *td; struct freebsd4_statfs_args /* { char *path; struct ostatfs *buf; } */ *uap; { struct ostatfs osb; struct statfs sf; int error; error = kern_statfs(td, uap->path, UIO_USERSPACE, &sf); if (error != 0) return (error); cvtstatfs(&sf, &osb); return (copyout(&osb, uap->buf, sizeof(osb))); } /* * Get filesystem statistics. */ #ifndef _SYS_SYSPROTO_H_ struct freebsd4_fstatfs_args { int fd; struct ostatfs *buf; }; #endif int freebsd4_fstatfs(td, uap) struct thread *td; struct freebsd4_fstatfs_args /* { int fd; struct ostatfs *buf; } */ *uap; { struct ostatfs osb; struct statfs sf; int error; error = kern_fstatfs(td, uap->fd, &sf); if (error != 0) return (error); cvtstatfs(&sf, &osb); return (copyout(&osb, uap->buf, sizeof(osb))); } /* * Get statistics on all filesystems. */ #ifndef _SYS_SYSPROTO_H_ struct freebsd4_getfsstat_args { struct ostatfs *buf; long bufsize; int flags; }; #endif int freebsd4_getfsstat(td, uap) struct thread *td; register struct freebsd4_getfsstat_args /* { struct ostatfs *buf; long bufsize; int flags; } */ *uap; { struct statfs *buf, *sp; struct ostatfs osb; size_t count, size; int error; count = uap->bufsize / sizeof(struct ostatfs); size = count * sizeof(struct statfs); - error = kern_getfsstat(td, &buf, size, UIO_SYSSPACE, uap->flags); + error = kern_getfsstat(td, &buf, size, &count, UIO_SYSSPACE, + uap->flags); if (size > 0) { - count = td->td_retval[0]; sp = buf; while (count > 0 && error == 0) { cvtstatfs(sp, &osb); error = copyout(&osb, uap->buf, sizeof(osb)); sp++; uap->buf++; count--; } free(buf, M_TEMP); } + if (error == 0) + td->td_retval[0] = count; return (error); } /* * Implement fstatfs() for (NFS) file handles. */ #ifndef _SYS_SYSPROTO_H_ struct freebsd4_fhstatfs_args { struct fhandle *u_fhp; struct ostatfs *buf; }; #endif int freebsd4_fhstatfs(td, uap) struct thread *td; struct freebsd4_fhstatfs_args /* { struct fhandle *u_fhp; struct ostatfs *buf; } */ *uap; { struct ostatfs osb; struct statfs sf; fhandle_t fh; int error; error = copyin(uap->u_fhp, &fh, sizeof(fhandle_t)); if (error != 0) return (error); error = kern_fhstatfs(td, fh, &sf); if (error != 0) return (error); cvtstatfs(&sf, &osb); return (copyout(&osb, uap->buf, sizeof(osb))); } /* * Convert a new format statfs structure to an old format statfs structure. */ static void cvtstatfs(nsp, osp) struct statfs *nsp; struct ostatfs *osp; { statfs_scale_blocks(nsp, LONG_MAX); bzero(osp, sizeof(*osp)); osp->f_bsize = nsp->f_bsize; osp->f_iosize = MIN(nsp->f_iosize, LONG_MAX); osp->f_blocks = nsp->f_blocks; osp->f_bfree = nsp->f_bfree; osp->f_bavail = nsp->f_bavail; osp->f_files = MIN(nsp->f_files, LONG_MAX); osp->f_ffree = MIN(nsp->f_ffree, LONG_MAX); osp->f_owner = nsp->f_owner; osp->f_type = nsp->f_type; osp->f_flags = nsp->f_flags; osp->f_syncwrites = MIN(nsp->f_syncwrites, LONG_MAX); osp->f_asyncwrites = MIN(nsp->f_asyncwrites, LONG_MAX); osp->f_syncreads = MIN(nsp->f_syncreads, LONG_MAX); osp->f_asyncreads = MIN(nsp->f_asyncreads, LONG_MAX); strlcpy(osp->f_fstypename, nsp->f_fstypename, MIN(MFSNAMELEN, OMFSNAMELEN)); strlcpy(osp->f_mntonname, nsp->f_mntonname, MIN(MNAMELEN, OMNAMELEN)); strlcpy(osp->f_mntfromname, nsp->f_mntfromname, MIN(MNAMELEN, OMNAMELEN)); osp->f_fsid = nsp->f_fsid; } #endif /* COMPAT_FREEBSD4 */ /* * Change current working directory to a given file descriptor. */ #ifndef _SYS_SYSPROTO_H_ struct fchdir_args { int fd; }; #endif int sys_fchdir(td, uap) struct thread *td; struct fchdir_args /* { int fd; } */ *uap; { register struct filedesc *fdp = td->td_proc->p_fd; struct vnode *vp, *tdp, *vpold; struct mount *mp; struct file *fp; cap_rights_t rights; int error; AUDIT_ARG_FD(uap->fd); error = getvnode(fdp, uap->fd, cap_rights_init(&rights, CAP_FCHDIR), &fp); if (error != 0) return (error); vp = fp->f_vnode; VREF(vp); fdrop(fp, td); vn_lock(vp, LK_SHARED | LK_RETRY); AUDIT_ARG_VNODE1(vp); error = change_dir(vp, td); while (!error && (mp = vp->v_mountedhere) != NULL) { if (vfs_busy(mp, 0)) continue; error = VFS_ROOT(mp, LK_SHARED, &tdp); vfs_unbusy(mp); if (error != 0) break; vput(vp); vp = tdp; } if (error != 0) { vput(vp); return (error); } VOP_UNLOCK(vp, 0); FILEDESC_XLOCK(fdp); vpold = fdp->fd_cdir; fdp->fd_cdir = vp; FILEDESC_XUNLOCK(fdp); vrele(vpold); return (0); } /* * Change current working directory (``.''). */ #ifndef _SYS_SYSPROTO_H_ struct chdir_args { char *path; }; #endif int sys_chdir(td, uap) struct thread *td; struct chdir_args /* { char *path; } */ *uap; { return (kern_chdir(td, uap->path, UIO_USERSPACE)); } int kern_chdir(struct thread *td, char *path, enum uio_seg pathseg) { register struct filedesc *fdp = td->td_proc->p_fd; struct nameidata nd; struct vnode *vp; int error; NDINIT(&nd, LOOKUP, FOLLOW | LOCKSHARED | LOCKLEAF | AUDITVNODE1, pathseg, path, td); if ((error = namei(&nd)) != 0) return (error); if ((error = change_dir(nd.ni_vp, td)) != 0) { vput(nd.ni_vp); NDFREE(&nd, NDF_ONLY_PNBUF); return (error); } VOP_UNLOCK(nd.ni_vp, 0); NDFREE(&nd, NDF_ONLY_PNBUF); FILEDESC_XLOCK(fdp); vp = fdp->fd_cdir; fdp->fd_cdir = nd.ni_vp; FILEDESC_XUNLOCK(fdp); vrele(vp); return (0); } /* * Helper function for raised chroot(2) security function: Refuse if * any filedescriptors are open directories. */ static int chroot_refuse_vdir_fds(fdp) struct filedesc *fdp; { struct vnode *vp; struct file *fp; int fd; FILEDESC_LOCK_ASSERT(fdp); for (fd = 0; fd <= fdp->fd_lastfile; fd++) { fp = fget_locked(fdp, fd); if (fp == NULL) continue; if (fp->f_type == DTYPE_VNODE) { vp = fp->f_vnode; if (vp->v_type == VDIR) return (EPERM); } } return (0); } /* * This sysctl determines if we will allow a process to chroot(2) if it * has a directory open: * 0: disallowed for all processes. * 1: allowed for processes that were not already chroot(2)'ed. * 2: allowed for all processes. */ static int chroot_allow_open_directories = 1; SYSCTL_INT(_kern, OID_AUTO, chroot_allow_open_directories, CTLFLAG_RW, &chroot_allow_open_directories, 0, "Allow a process to chroot(2) if it has a directory open"); /* * Change notion of root (``/'') directory. */ #ifndef _SYS_SYSPROTO_H_ struct chroot_args { char *path; }; #endif int sys_chroot(td, uap) struct thread *td; struct chroot_args /* { char *path; } */ *uap; { struct nameidata nd; int error; error = priv_check(td, PRIV_VFS_CHROOT); if (error != 0) return (error); NDINIT(&nd, LOOKUP, FOLLOW | LOCKSHARED | LOCKLEAF | AUDITVNODE1, UIO_USERSPACE, uap->path, td); error = namei(&nd); if (error != 0) goto error; error = change_dir(nd.ni_vp, td); if (error != 0) goto e_vunlock; #ifdef MAC error = mac_vnode_check_chroot(td->td_ucred, nd.ni_vp); if (error != 0) goto e_vunlock; #endif VOP_UNLOCK(nd.ni_vp, 0); error = change_root(nd.ni_vp, td); vrele(nd.ni_vp); NDFREE(&nd, NDF_ONLY_PNBUF); return (error); e_vunlock: vput(nd.ni_vp); error: NDFREE(&nd, NDF_ONLY_PNBUF); return (error); } /* * Common routine for chroot and chdir. Callers must provide a locked vnode * instance. */ int change_dir(vp, td) struct vnode *vp; struct thread *td; { #ifdef MAC int error; #endif ASSERT_VOP_LOCKED(vp, "change_dir(): vp not locked"); if (vp->v_type != VDIR) return (ENOTDIR); #ifdef MAC error = mac_vnode_check_chdir(td->td_ucred, vp); if (error != 0) return (error); #endif return (VOP_ACCESS(vp, VEXEC, td->td_ucred, td)); } /* * Common routine for kern_chroot() and jail_attach(). The caller is * responsible for invoking priv_check() and mac_vnode_check_chroot() to * authorize this operation. */ int change_root(vp, td) struct vnode *vp; struct thread *td; { struct filedesc *fdp; struct vnode *oldvp; int error; fdp = td->td_proc->p_fd; FILEDESC_XLOCK(fdp); if (chroot_allow_open_directories == 0 || (chroot_allow_open_directories == 1 && fdp->fd_rdir != rootvnode)) { error = chroot_refuse_vdir_fds(fdp); if (error != 0) { FILEDESC_XUNLOCK(fdp); return (error); } } oldvp = fdp->fd_rdir; fdp->fd_rdir = vp; VREF(fdp->fd_rdir); if (!fdp->fd_jdir) { fdp->fd_jdir = vp; VREF(fdp->fd_jdir); } FILEDESC_XUNLOCK(fdp); vrele(oldvp); return (0); } static __inline void flags_to_rights(int flags, cap_rights_t *rightsp) { if (flags & O_EXEC) { cap_rights_set(rightsp, CAP_FEXECVE); } else { switch ((flags & O_ACCMODE)) { case O_RDONLY: cap_rights_set(rightsp, CAP_READ); break; case O_RDWR: cap_rights_set(rightsp, CAP_READ); /* FALLTHROUGH */ case O_WRONLY: cap_rights_set(rightsp, CAP_WRITE); if (!(flags & (O_APPEND | O_TRUNC))) cap_rights_set(rightsp, CAP_SEEK); break; } } if (flags & O_CREAT) cap_rights_set(rightsp, CAP_CREATE); if (flags & O_TRUNC) cap_rights_set(rightsp, CAP_FTRUNCATE); if (flags & (O_SYNC | O_FSYNC)) cap_rights_set(rightsp, CAP_FSYNC); if (flags & (O_EXLOCK | O_SHLOCK)) cap_rights_set(rightsp, CAP_FLOCK); } /* * Check permissions, allocate an open file structure, and call the device * open routine if any. */ #ifndef _SYS_SYSPROTO_H_ struct open_args { char *path; int flags; int mode; }; #endif int sys_open(td, uap) struct thread *td; register struct open_args /* { char *path; int flags; int mode; } */ *uap; { return (kern_openat(td, AT_FDCWD, uap->path, UIO_USERSPACE, uap->flags, uap->mode)); } #ifndef _SYS_SYSPROTO_H_ struct openat_args { int fd; char *path; int flag; int mode; }; #endif int sys_openat(struct thread *td, struct openat_args *uap) { return (kern_openat(td, uap->fd, uap->path, UIO_USERSPACE, uap->flag, uap->mode)); } int kern_openat(struct thread *td, int fd, char *path, enum uio_seg pathseg, int flags, int mode) { struct proc *p = td->td_proc; struct filedesc *fdp = p->p_fd; struct file *fp; struct vnode *vp; struct nameidata nd; cap_rights_t rights; int cmode, error, indx; indx = -1; AUDIT_ARG_FFLAGS(flags); AUDIT_ARG_MODE(mode); /* XXX: audit dirfd */ cap_rights_init(&rights, CAP_LOOKUP); flags_to_rights(flags, &rights); /* * Only one of the O_EXEC, O_RDONLY, O_WRONLY and O_RDWR flags * may be specified. */ if (flags & O_EXEC) { if (flags & O_ACCMODE) return (EINVAL); } else if ((flags & O_ACCMODE) == O_ACCMODE) { return (EINVAL); } else { flags = FFLAGS(flags); } /* * Allocate the file descriptor, but don't install a descriptor yet. */ error = falloc_noinstall(td, &fp); if (error != 0) return (error); /* * An extra reference on `fp' has been held for us by * falloc_noinstall(). */ /* Set the flags early so the finit in devfs can pick them up. */ fp->f_flag = flags & FMASK; cmode = ((mode & ~fdp->fd_cmask) & ALLPERMS) & ~S_ISTXT; NDINIT_ATRIGHTS(&nd, LOOKUP, FOLLOW | AUDITVNODE1, pathseg, path, fd, &rights, td); td->td_dupfd = -1; /* XXX check for fdopen */ error = vn_open(&nd, &flags, cmode, fp); if (error != 0) { /* * If the vn_open replaced the method vector, something * wonderous happened deep below and we just pass it up * pretending we know what we do. */ if (error == ENXIO && fp->f_ops != &badfileops) goto success; /* * Handle special fdopen() case. bleh. * * Don't do this for relative (capability) lookups; we don't * understand exactly what would happen, and we don't think * that it ever should. */ if (nd.ni_strictrelative == 0 && (error == ENODEV || error == ENXIO) && td->td_dupfd >= 0) { error = dupfdopen(td, fdp, td->td_dupfd, flags, error, &indx); if (error == 0) goto success; } goto bad; } td->td_dupfd = 0; NDFREE(&nd, NDF_ONLY_PNBUF); vp = nd.ni_vp; /* * Store the vnode, for any f_type. Typically, the vnode use * count is decremented by direct call to vn_closefile() for * files that switched type in the cdevsw fdopen() method. */ fp->f_vnode = vp; /* * If the file wasn't claimed by devfs bind it to the normal * vnode operations here. */ if (fp->f_ops == &badfileops) { KASSERT(vp->v_type != VFIFO, ("Unexpected fifo.")); fp->f_seqcount = 1; finit(fp, (flags & FMASK) | (fp->f_flag & FHASLOCK), DTYPE_VNODE, vp, &vnops); } VOP_UNLOCK(vp, 0); if (flags & O_TRUNC) { error = fo_truncate(fp, 0, td->td_ucred, td); if (error != 0) goto bad; } success: /* * If we haven't already installed the FD (for dupfdopen), do so now. */ if (indx == -1) { struct filecaps *fcaps; #ifdef CAPABILITIES if (nd.ni_strictrelative == 1) fcaps = &nd.ni_filecaps; else #endif fcaps = NULL; error = finstall(td, fp, &indx, flags, fcaps); /* On success finstall() consumes fcaps. */ if (error != 0) { filecaps_free(&nd.ni_filecaps); goto bad; } } else { filecaps_free(&nd.ni_filecaps); } /* * Release our private reference, leaving the one associated with * the descriptor table intact. */ fdrop(fp, td); td->td_retval[0] = indx; return (0); bad: KASSERT(indx == -1, ("indx=%d, should be -1", indx)); fdrop(fp, td); return (error); } #ifdef COMPAT_43 /* * Create a file. */ #ifndef _SYS_SYSPROTO_H_ struct ocreat_args { char *path; int mode; }; #endif int ocreat(td, uap) struct thread *td; register struct ocreat_args /* { char *path; int mode; } */ *uap; { return (kern_openat(td, AT_FDCWD, uap->path, UIO_USERSPACE, O_WRONLY | O_CREAT | O_TRUNC, uap->mode)); } #endif /* COMPAT_43 */ /* * Create a special file. */ #ifndef _SYS_SYSPROTO_H_ struct mknod_args { char *path; int mode; int dev; }; #endif int sys_mknod(td, uap) struct thread *td; register struct mknod_args /* { char *path; int mode; int dev; } */ *uap; { return (kern_mknodat(td, AT_FDCWD, uap->path, UIO_USERSPACE, uap->mode, uap->dev)); } #ifndef _SYS_SYSPROTO_H_ struct mknodat_args { int fd; char *path; mode_t mode; dev_t dev; }; #endif int sys_mknodat(struct thread *td, struct mknodat_args *uap) { return (kern_mknodat(td, uap->fd, uap->path, UIO_USERSPACE, uap->mode, uap->dev)); } int kern_mknodat(struct thread *td, int fd, char *path, enum uio_seg pathseg, int mode, int dev) { struct vnode *vp; struct mount *mp; struct vattr vattr; struct nameidata nd; cap_rights_t rights; int error, whiteout = 0; AUDIT_ARG_MODE(mode); AUDIT_ARG_DEV(dev); switch (mode & S_IFMT) { case S_IFCHR: case S_IFBLK: error = priv_check(td, PRIV_VFS_MKNOD_DEV); break; case S_IFMT: error = priv_check(td, PRIV_VFS_MKNOD_BAD); break; case S_IFWHT: error = priv_check(td, PRIV_VFS_MKNOD_WHT); break; case S_IFIFO: if (dev == 0) return (kern_mkfifoat(td, fd, path, pathseg, mode)); /* FALLTHROUGH */ default: error = EINVAL; break; } if (error != 0) return (error); restart: bwillwrite(); NDINIT_ATRIGHTS(&nd, CREATE, LOCKPARENT | SAVENAME | AUDITVNODE1 | NOCACHE, pathseg, path, fd, cap_rights_init(&rights, CAP_MKNODAT), td); if ((error = namei(&nd)) != 0) return (error); vp = nd.ni_vp; if (vp != NULL) { NDFREE(&nd, NDF_ONLY_PNBUF); if (vp == nd.ni_dvp) vrele(nd.ni_dvp); else vput(nd.ni_dvp); vrele(vp); return (EEXIST); } else { VATTR_NULL(&vattr); vattr.va_mode = (mode & ALLPERMS) & ~td->td_proc->p_fd->fd_cmask; vattr.va_rdev = dev; whiteout = 0; switch (mode & S_IFMT) { case S_IFMT: /* used by badsect to flag bad sectors */ vattr.va_type = VBAD; break; case S_IFCHR: vattr.va_type = VCHR; break; case S_IFBLK: vattr.va_type = VBLK; break; case S_IFWHT: whiteout = 1; break; default: panic("kern_mknod: invalid mode"); } } if (vn_start_write(nd.ni_dvp, &mp, V_NOWAIT) != 0) { NDFREE(&nd, NDF_ONLY_PNBUF); vput(nd.ni_dvp); if ((error = vn_start_write(NULL, &mp, V_XSLEEP | PCATCH)) != 0) return (error); goto restart; } #ifdef MAC if (error == 0 && !whiteout) error = mac_vnode_check_create(td->td_ucred, nd.ni_dvp, &nd.ni_cnd, &vattr); #endif if (error == 0) { if (whiteout) error = VOP_WHITEOUT(nd.ni_dvp, &nd.ni_cnd, CREATE); else { error = VOP_MKNOD(nd.ni_dvp, &nd.ni_vp, &nd.ni_cnd, &vattr); if (error == 0) vput(nd.ni_vp); } } NDFREE(&nd, NDF_ONLY_PNBUF); vput(nd.ni_dvp); vn_finished_write(mp); return (error); } /* * Create a named pipe. */ #ifndef _SYS_SYSPROTO_H_ struct mkfifo_args { char *path; int mode; }; #endif int sys_mkfifo(td, uap) struct thread *td; register struct mkfifo_args /* { char *path; int mode; } */ *uap; { return (kern_mkfifoat(td, AT_FDCWD, uap->path, UIO_USERSPACE, uap->mode)); } #ifndef _SYS_SYSPROTO_H_ struct mkfifoat_args { int fd; char *path; mode_t mode; }; #endif int sys_mkfifoat(struct thread *td, struct mkfifoat_args *uap) { return (kern_mkfifoat(td, uap->fd, uap->path, UIO_USERSPACE, uap->mode)); } int kern_mkfifoat(struct thread *td, int fd, char *path, enum uio_seg pathseg, int mode) { struct mount *mp; struct vattr vattr; struct nameidata nd; cap_rights_t rights; int error; AUDIT_ARG_MODE(mode); restart: bwillwrite(); NDINIT_ATRIGHTS(&nd, CREATE, LOCKPARENT | SAVENAME | AUDITVNODE1 | NOCACHE, pathseg, path, fd, cap_rights_init(&rights, CAP_MKFIFOAT), td); if ((error = namei(&nd)) != 0) return (error); if (nd.ni_vp != NULL) { NDFREE(&nd, NDF_ONLY_PNBUF); if (nd.ni_vp == nd.ni_dvp) vrele(nd.ni_dvp); else vput(nd.ni_dvp); vrele(nd.ni_vp); return (EEXIST); } if (vn_start_write(nd.ni_dvp, &mp, V_NOWAIT) != 0) { NDFREE(&nd, NDF_ONLY_PNBUF); vput(nd.ni_dvp); if ((error = vn_start_write(NULL, &mp, V_XSLEEP | PCATCH)) != 0) return (error); goto restart; } VATTR_NULL(&vattr); vattr.va_type = VFIFO; vattr.va_mode = (mode & ALLPERMS) & ~td->td_proc->p_fd->fd_cmask; #ifdef MAC error = mac_vnode_check_create(td->td_ucred, nd.ni_dvp, &nd.ni_cnd, &vattr); if (error != 0) goto out; #endif error = VOP_MKNOD(nd.ni_dvp, &nd.ni_vp, &nd.ni_cnd, &vattr); if (error == 0) vput(nd.ni_vp); #ifdef MAC out: #endif vput(nd.ni_dvp); vn_finished_write(mp); NDFREE(&nd, NDF_ONLY_PNBUF); return (error); } /* * Make a hard file link. */ #ifndef _SYS_SYSPROTO_H_ struct link_args { char *path; char *link; }; #endif int sys_link(td, uap) struct thread *td; register struct link_args /* { char *path; char *link; } */ *uap; { return (kern_linkat(td, AT_FDCWD, AT_FDCWD, uap->path, uap->link, UIO_USERSPACE, FOLLOW)); } #ifndef _SYS_SYSPROTO_H_ struct linkat_args { int fd1; char *path1; int fd2; char *path2; int flag; }; #endif int sys_linkat(struct thread *td, struct linkat_args *uap) { int flag; flag = uap->flag; if (flag & ~AT_SYMLINK_FOLLOW) return (EINVAL); return (kern_linkat(td, uap->fd1, uap->fd2, uap->path1, uap->path2, UIO_USERSPACE, (flag & AT_SYMLINK_FOLLOW) ? FOLLOW : NOFOLLOW)); } int hardlink_check_uid = 0; SYSCTL_INT(_security_bsd, OID_AUTO, hardlink_check_uid, CTLFLAG_RW, &hardlink_check_uid, 0, "Unprivileged processes cannot create hard links to files owned by other " "users"); static int hardlink_check_gid = 0; SYSCTL_INT(_security_bsd, OID_AUTO, hardlink_check_gid, CTLFLAG_RW, &hardlink_check_gid, 0, "Unprivileged processes cannot create hard links to files owned by other " "groups"); static int can_hardlink(struct vnode *vp, struct ucred *cred) { struct vattr va; int error; if (!hardlink_check_uid && !hardlink_check_gid) return (0); error = VOP_GETATTR(vp, &va, cred); if (error != 0) return (error); if (hardlink_check_uid && cred->cr_uid != va.va_uid) { error = priv_check_cred(cred, PRIV_VFS_LINK, 0); if (error != 0) return (error); } if (hardlink_check_gid && !groupmember(va.va_gid, cred)) { error = priv_check_cred(cred, PRIV_VFS_LINK, 0); if (error != 0) return (error); } return (0); } int kern_linkat(struct thread *td, int fd1, int fd2, char *path1, char *path2, enum uio_seg segflg, int follow) { struct vnode *vp; struct mount *mp; struct nameidata nd; cap_rights_t rights; int error; again: bwillwrite(); NDINIT_AT(&nd, LOOKUP, follow | AUDITVNODE1, segflg, path1, fd1, td); if ((error = namei(&nd)) != 0) return (error); NDFREE(&nd, NDF_ONLY_PNBUF); vp = nd.ni_vp; if (vp->v_type == VDIR) { vrele(vp); return (EPERM); /* POSIX */ } NDINIT_ATRIGHTS(&nd, CREATE, LOCKPARENT | SAVENAME | AUDITVNODE2 | NOCACHE, segflg, path2, fd2, cap_rights_init(&rights, CAP_LINKAT), td); if ((error = namei(&nd)) == 0) { if (nd.ni_vp != NULL) { NDFREE(&nd, NDF_ONLY_PNBUF); if (nd.ni_dvp == nd.ni_vp) vrele(nd.ni_dvp); else vput(nd.ni_dvp); vrele(nd.ni_vp); vrele(vp); return (EEXIST); } else if (nd.ni_dvp->v_mount != vp->v_mount) { /* * Cross-device link. No need to recheck * vp->v_type, since it cannot change, except * to VBAD. */ NDFREE(&nd, NDF_ONLY_PNBUF); vput(nd.ni_dvp); vrele(vp); return (EXDEV); } else if ((error = vn_lock(vp, LK_EXCLUSIVE)) == 0) { error = can_hardlink(vp, td->td_ucred); #ifdef MAC if (error == 0) error = mac_vnode_check_link(td->td_ucred, nd.ni_dvp, vp, &nd.ni_cnd); #endif if (error != 0) { vput(vp); vput(nd.ni_dvp); NDFREE(&nd, NDF_ONLY_PNBUF); return (error); } error = vn_start_write(vp, &mp, V_NOWAIT); if (error != 0) { vput(vp); vput(nd.ni_dvp); NDFREE(&nd, NDF_ONLY_PNBUF); error = vn_start_write(NULL, &mp, V_XSLEEP | PCATCH); if (error != 0) return (error); goto again; } error = VOP_LINK(nd.ni_dvp, vp, &nd.ni_cnd); VOP_UNLOCK(vp, 0); vput(nd.ni_dvp); vn_finished_write(mp); NDFREE(&nd, NDF_ONLY_PNBUF); } else { vput(nd.ni_dvp); NDFREE(&nd, NDF_ONLY_PNBUF); vrele(vp); goto again; } } vrele(vp); return (error); } /* * Make a symbolic link. */ #ifndef _SYS_SYSPROTO_H_ struct symlink_args { char *path; char *link; }; #endif int sys_symlink(td, uap) struct thread *td; register struct symlink_args /* { char *path; char *link; } */ *uap; { return (kern_symlinkat(td, uap->path, AT_FDCWD, uap->link, UIO_USERSPACE)); } #ifndef _SYS_SYSPROTO_H_ struct symlinkat_args { char *path; int fd; char *path2; }; #endif int sys_symlinkat(struct thread *td, struct symlinkat_args *uap) { return (kern_symlinkat(td, uap->path1, uap->fd, uap->path2, UIO_USERSPACE)); } int kern_symlinkat(struct thread *td, char *path1, int fd, char *path2, enum uio_seg segflg) { struct mount *mp; struct vattr vattr; char *syspath; struct nameidata nd; int error; cap_rights_t rights; if (segflg == UIO_SYSSPACE) { syspath = path1; } else { syspath = uma_zalloc(namei_zone, M_WAITOK); if ((error = copyinstr(path1, syspath, MAXPATHLEN, NULL)) != 0) goto out; } AUDIT_ARG_TEXT(syspath); restart: bwillwrite(); NDINIT_ATRIGHTS(&nd, CREATE, LOCKPARENT | SAVENAME | AUDITVNODE1 | NOCACHE, segflg, path2, fd, cap_rights_init(&rights, CAP_SYMLINKAT), td); if ((error = namei(&nd)) != 0) goto out; if (nd.ni_vp) { NDFREE(&nd, NDF_ONLY_PNBUF); if (nd.ni_vp == nd.ni_dvp) vrele(nd.ni_dvp); else vput(nd.ni_dvp); vrele(nd.ni_vp); error = EEXIST; goto out; } if (vn_start_write(nd.ni_dvp, &mp, V_NOWAIT) != 0) { NDFREE(&nd, NDF_ONLY_PNBUF); vput(nd.ni_dvp); if ((error = vn_start_write(NULL, &mp, V_XSLEEP | PCATCH)) != 0) goto out; goto restart; } VATTR_NULL(&vattr); vattr.va_mode = ACCESSPERMS &~ td->td_proc->p_fd->fd_cmask; #ifdef MAC vattr.va_type = VLNK; error = mac_vnode_check_create(td->td_ucred, nd.ni_dvp, &nd.ni_cnd, &vattr); if (error != 0) goto out2; #endif error = VOP_SYMLINK(nd.ni_dvp, &nd.ni_vp, &nd.ni_cnd, &vattr, syspath); if (error == 0) vput(nd.ni_vp); #ifdef MAC out2: #endif NDFREE(&nd, NDF_ONLY_PNBUF); vput(nd.ni_dvp); vn_finished_write(mp); out: if (segflg != UIO_SYSSPACE) uma_zfree(namei_zone, syspath); return (error); } /* * Delete a whiteout from the filesystem. */ int sys_undelete(td, uap) struct thread *td; register struct undelete_args /* { char *path; } */ *uap; { struct mount *mp; struct nameidata nd; int error; restart: bwillwrite(); NDINIT(&nd, DELETE, LOCKPARENT | DOWHITEOUT | AUDITVNODE1, UIO_USERSPACE, uap->path, td); error = namei(&nd); if (error != 0) return (error); if (nd.ni_vp != NULLVP || !(nd.ni_cnd.cn_flags & ISWHITEOUT)) { NDFREE(&nd, NDF_ONLY_PNBUF); if (nd.ni_vp == nd.ni_dvp) vrele(nd.ni_dvp); else vput(nd.ni_dvp); if (nd.ni_vp) vrele(nd.ni_vp); return (EEXIST); } if (vn_start_write(nd.ni_dvp, &mp, V_NOWAIT) != 0) { NDFREE(&nd, NDF_ONLY_PNBUF); vput(nd.ni_dvp); if ((error = vn_start_write(NULL, &mp, V_XSLEEP | PCATCH)) != 0) return (error); goto restart; } error = VOP_WHITEOUT(nd.ni_dvp, &nd.ni_cnd, DELETE); NDFREE(&nd, NDF_ONLY_PNBUF); vput(nd.ni_dvp); vn_finished_write(mp); return (error); } /* * Delete a name from the filesystem. */ #ifndef _SYS_SYSPROTO_H_ struct unlink_args { char *path; }; #endif int sys_unlink(td, uap) struct thread *td; struct unlink_args /* { char *path; } */ *uap; { return (kern_unlinkat(td, AT_FDCWD, uap->path, UIO_USERSPACE, 0)); } #ifndef _SYS_SYSPROTO_H_ struct unlinkat_args { int fd; char *path; int flag; }; #endif int sys_unlinkat(struct thread *td, struct unlinkat_args *uap) { int flag = uap->flag; int fd = uap->fd; char *path = uap->path; if (flag & ~AT_REMOVEDIR) return (EINVAL); if (flag & AT_REMOVEDIR) return (kern_rmdirat(td, fd, path, UIO_USERSPACE)); else return (kern_unlinkat(td, fd, path, UIO_USERSPACE, 0)); } int kern_unlinkat(struct thread *td, int fd, char *path, enum uio_seg pathseg, ino_t oldinum) { struct mount *mp; struct vnode *vp; struct nameidata nd; struct stat sb; cap_rights_t rights; int error; restart: bwillwrite(); NDINIT_ATRIGHTS(&nd, DELETE, LOCKPARENT | LOCKLEAF | AUDITVNODE1, pathseg, path, fd, cap_rights_init(&rights, CAP_UNLINKAT), td); if ((error = namei(&nd)) != 0) return (error == EINVAL ? EPERM : error); vp = nd.ni_vp; if (vp->v_type == VDIR && oldinum == 0) { error = EPERM; /* POSIX */ } else if (oldinum != 0 && ((error = vn_stat(vp, &sb, td->td_ucred, NOCRED, td)) == 0) && sb.st_ino != oldinum) { error = EIDRM; /* Identifier removed */ } else { /* * The root of a mounted filesystem cannot be deleted. * * XXX: can this only be a VDIR case? */ if (vp->v_vflag & VV_ROOT) error = EBUSY; } if (error == 0) { if (vn_start_write(nd.ni_dvp, &mp, V_NOWAIT) != 0) { NDFREE(&nd, NDF_ONLY_PNBUF); vput(nd.ni_dvp); if (vp == nd.ni_dvp) vrele(vp); else vput(vp); if ((error = vn_start_write(NULL, &mp, V_XSLEEP | PCATCH)) != 0) return (error); goto restart; } #ifdef MAC error = mac_vnode_check_unlink(td->td_ucred, nd.ni_dvp, vp, &nd.ni_cnd); if (error != 0) goto out; #endif vfs_notify_upper(vp, VFS_NOTIFY_UPPER_UNLINK); error = VOP_REMOVE(nd.ni_dvp, vp, &nd.ni_cnd); #ifdef MAC out: #endif vn_finished_write(mp); } NDFREE(&nd, NDF_ONLY_PNBUF); vput(nd.ni_dvp); if (vp == nd.ni_dvp) vrele(vp); else vput(vp); return (error); } /* * Reposition read/write file offset. */ #ifndef _SYS_SYSPROTO_H_ struct lseek_args { int fd; int pad; off_t offset; int whence; }; #endif int sys_lseek(td, uap) struct thread *td; register struct lseek_args /* { int fd; int pad; off_t offset; int whence; } */ *uap; { struct file *fp; cap_rights_t rights; int error; AUDIT_ARG_FD(uap->fd); error = fget(td, uap->fd, cap_rights_init(&rights, CAP_SEEK), &fp); if (error != 0) return (error); error = (fp->f_ops->fo_flags & DFLAG_SEEKABLE) != 0 ? fo_seek(fp, uap->offset, uap->whence, td) : ESPIPE; fdrop(fp, td); return (error); } #if defined(COMPAT_43) /* * Reposition read/write file offset. */ #ifndef _SYS_SYSPROTO_H_ struct olseek_args { int fd; long offset; int whence; }; #endif int olseek(td, uap) struct thread *td; register struct olseek_args /* { int fd; long offset; int whence; } */ *uap; { struct lseek_args /* { int fd; int pad; off_t offset; int whence; } */ nuap; nuap.fd = uap->fd; nuap.offset = uap->offset; nuap.whence = uap->whence; return (sys_lseek(td, &nuap)); } #endif /* COMPAT_43 */ /* Version with the 'pad' argument */ int freebsd6_lseek(td, uap) struct thread *td; register struct freebsd6_lseek_args *uap; { struct lseek_args ouap; ouap.fd = uap->fd; ouap.offset = uap->offset; ouap.whence = uap->whence; return (sys_lseek(td, &ouap)); } /* * Check access permissions using passed credentials. */ static int vn_access(vp, user_flags, cred, td) struct vnode *vp; int user_flags; struct ucred *cred; struct thread *td; { accmode_t accmode; int error; /* Flags == 0 means only check for existence. */ if (user_flags == 0) return (0); accmode = 0; if (user_flags & R_OK) accmode |= VREAD; if (user_flags & W_OK) accmode |= VWRITE; if (user_flags & X_OK) accmode |= VEXEC; #ifdef MAC error = mac_vnode_check_access(cred, vp, accmode); if (error != 0) return (error); #endif if ((accmode & VWRITE) == 0 || (error = vn_writechk(vp)) == 0) error = VOP_ACCESS(vp, accmode, cred, td); return (error); } /* * Check access permissions using "real" credentials. */ #ifndef _SYS_SYSPROTO_H_ struct access_args { char *path; int amode; }; #endif int sys_access(td, uap) struct thread *td; register struct access_args /* { char *path; int amode; } */ *uap; { return (kern_accessat(td, AT_FDCWD, uap->path, UIO_USERSPACE, 0, uap->amode)); } #ifndef _SYS_SYSPROTO_H_ struct faccessat_args { int dirfd; char *path; int amode; int flag; } #endif int sys_faccessat(struct thread *td, struct faccessat_args *uap) { return (kern_accessat(td, uap->fd, uap->path, UIO_USERSPACE, uap->flag, uap->amode)); } int kern_accessat(struct thread *td, int fd, char *path, enum uio_seg pathseg, int flag, int amode) { struct ucred *cred, *usecred; struct vnode *vp; struct nameidata nd; cap_rights_t rights; int error; if (flag & ~AT_EACCESS) return (EINVAL); if (amode != F_OK && (amode & ~(R_OK | W_OK | X_OK)) != 0) return (EINVAL); /* * Create and modify a temporary credential instead of one that * is potentially shared (if we need one). */ cred = td->td_ucred; if ((flag & AT_EACCESS) == 0 && ((cred->cr_uid != cred->cr_ruid || cred->cr_rgid != cred->cr_groups[0]))) { usecred = crdup(cred); usecred->cr_uid = cred->cr_ruid; usecred->cr_groups[0] = cred->cr_rgid; td->td_ucred = usecred; } else usecred = cred; AUDIT_ARG_VALUE(amode); NDINIT_ATRIGHTS(&nd, LOOKUP, FOLLOW | LOCKSHARED | LOCKLEAF | AUDITVNODE1, pathseg, path, fd, cap_rights_init(&rights, CAP_FSTAT), td); if ((error = namei(&nd)) != 0) goto out; vp = nd.ni_vp; error = vn_access(vp, amode, usecred, td); NDFREE(&nd, NDF_ONLY_PNBUF); vput(vp); out: if (usecred != cred) { td->td_ucred = cred; crfree(usecred); } return (error); } /* * Check access permissions using "effective" credentials. */ #ifndef _SYS_SYSPROTO_H_ struct eaccess_args { char *path; int amode; }; #endif int sys_eaccess(td, uap) struct thread *td; register struct eaccess_args /* { char *path; int amode; } */ *uap; { return (kern_accessat(td, AT_FDCWD, uap->path, UIO_USERSPACE, AT_EACCESS, uap->amode)); } #if defined(COMPAT_43) /* * Get file status; this version follows links. */ #ifndef _SYS_SYSPROTO_H_ struct ostat_args { char *path; struct ostat *ub; }; #endif int ostat(td, uap) struct thread *td; register struct ostat_args /* { char *path; struct ostat *ub; } */ *uap; { struct stat sb; struct ostat osb; int error; error = kern_statat(td, 0, AT_FDCWD, uap->path, UIO_USERSPACE, &sb, NULL); if (error != 0) return (error); cvtstat(&sb, &osb); return (copyout(&osb, uap->ub, sizeof (osb))); } /* * Get file status; this version does not follow links. */ #ifndef _SYS_SYSPROTO_H_ struct olstat_args { char *path; struct ostat *ub; }; #endif int olstat(td, uap) struct thread *td; register struct olstat_args /* { char *path; struct ostat *ub; } */ *uap; { struct stat sb; struct ostat osb; int error; error = kern_statat(td, AT_SYMLINK_NOFOLLOW, AT_FDCWD, uap->path, UIO_USERSPACE, &sb, NULL); if (error != 0) return (error); cvtstat(&sb, &osb); return (copyout(&osb, uap->ub, sizeof (osb))); } /* * Convert from an old to a new stat structure. */ void cvtstat(st, ost) struct stat *st; struct ostat *ost; { ost->st_dev = st->st_dev; ost->st_ino = st->st_ino; ost->st_mode = st->st_mode; ost->st_nlink = st->st_nlink; ost->st_uid = st->st_uid; ost->st_gid = st->st_gid; ost->st_rdev = st->st_rdev; if (st->st_size < (quad_t)1 << 32) ost->st_size = st->st_size; else ost->st_size = -2; ost->st_atim = st->st_atim; ost->st_mtim = st->st_mtim; ost->st_ctim = st->st_ctim; ost->st_blksize = st->st_blksize; ost->st_blocks = st->st_blocks; ost->st_flags = st->st_flags; ost->st_gen = st->st_gen; } #endif /* COMPAT_43 */ /* * Get file status; this version follows links. */ #ifndef _SYS_SYSPROTO_H_ struct stat_args { char *path; struct stat *ub; }; #endif int sys_stat(td, uap) struct thread *td; register struct stat_args /* { char *path; struct stat *ub; } */ *uap; { struct stat sb; int error; error = kern_statat(td, 0, AT_FDCWD, uap->path, UIO_USERSPACE, &sb, NULL); if (error == 0) error = copyout(&sb, uap->ub, sizeof (sb)); return (error); } #ifndef _SYS_SYSPROTO_H_ struct fstatat_args { int fd; char *path; struct stat *buf; int flag; } #endif int sys_fstatat(struct thread *td, struct fstatat_args *uap) { struct stat sb; int error; error = kern_statat(td, uap->flag, uap->fd, uap->path, UIO_USERSPACE, &sb, NULL); if (error == 0) error = copyout(&sb, uap->buf, sizeof (sb)); return (error); } int kern_statat(struct thread *td, int flag, int fd, char *path, enum uio_seg pathseg, struct stat *sbp, void (*hook)(struct vnode *vp, struct stat *sbp)) { struct nameidata nd; struct stat sb; cap_rights_t rights; int error; if (flag & ~AT_SYMLINK_NOFOLLOW) return (EINVAL); NDINIT_ATRIGHTS(&nd, LOOKUP, ((flag & AT_SYMLINK_NOFOLLOW) ? NOFOLLOW : FOLLOW) | LOCKSHARED | LOCKLEAF | AUDITVNODE1, pathseg, path, fd, cap_rights_init(&rights, CAP_FSTAT), td); if ((error = namei(&nd)) != 0) return (error); error = vn_stat(nd.ni_vp, &sb, td->td_ucred, NOCRED, td); if (error == 0) { SDT_PROBE(vfs, , stat, mode, path, sb.st_mode, 0, 0, 0); if (S_ISREG(sb.st_mode)) SDT_PROBE(vfs, , stat, reg, path, pathseg, 0, 0, 0); if (__predict_false(hook != NULL)) hook(nd.ni_vp, &sb); } NDFREE(&nd, NDF_ONLY_PNBUF); vput(nd.ni_vp); if (error != 0) return (error); *sbp = sb; #ifdef KTRACE if (KTRPOINT(td, KTR_STRUCT)) ktrstat(&sb); #endif return (0); } /* * Get file status; this version does not follow links. */ #ifndef _SYS_SYSPROTO_H_ struct lstat_args { char *path; struct stat *ub; }; #endif int sys_lstat(td, uap) struct thread *td; register struct lstat_args /* { char *path; struct stat *ub; } */ *uap; { struct stat sb; int error; error = kern_statat(td, AT_SYMLINK_NOFOLLOW, AT_FDCWD, uap->path, UIO_USERSPACE, &sb, NULL); if (error == 0) error = copyout(&sb, uap->ub, sizeof (sb)); return (error); } /* * Implementation of the NetBSD [l]stat() functions. */ void cvtnstat(sb, nsb) struct stat *sb; struct nstat *nsb; { bzero(nsb, sizeof *nsb); nsb->st_dev = sb->st_dev; nsb->st_ino = sb->st_ino; nsb->st_mode = sb->st_mode; nsb->st_nlink = sb->st_nlink; nsb->st_uid = sb->st_uid; nsb->st_gid = sb->st_gid; nsb->st_rdev = sb->st_rdev; nsb->st_atim = sb->st_atim; nsb->st_mtim = sb->st_mtim; nsb->st_ctim = sb->st_ctim; nsb->st_size = sb->st_size; nsb->st_blocks = sb->st_blocks; nsb->st_blksize = sb->st_blksize; nsb->st_flags = sb->st_flags; nsb->st_gen = sb->st_gen; nsb->st_birthtim = sb->st_birthtim; } #ifndef _SYS_SYSPROTO_H_ struct nstat_args { char *path; struct nstat *ub; }; #endif int sys_nstat(td, uap) struct thread *td; register struct nstat_args /* { char *path; struct nstat *ub; } */ *uap; { struct stat sb; struct nstat nsb; int error; error = kern_statat(td, 0, AT_FDCWD, uap->path, UIO_USERSPACE, &sb, NULL); if (error != 0) return (error); cvtnstat(&sb, &nsb); return (copyout(&nsb, uap->ub, sizeof (nsb))); } /* * NetBSD lstat. Get file status; this version does not follow links. */ #ifndef _SYS_SYSPROTO_H_ struct lstat_args { char *path; struct stat *ub; }; #endif int sys_nlstat(td, uap) struct thread *td; register struct nlstat_args /* { char *path; struct nstat *ub; } */ *uap; { struct stat sb; struct nstat nsb; int error; error = kern_statat(td, AT_SYMLINK_NOFOLLOW, AT_FDCWD, uap->path, UIO_USERSPACE, &sb, NULL); if (error != 0) return (error); cvtnstat(&sb, &nsb); return (copyout(&nsb, uap->ub, sizeof (nsb))); } /* * Get configurable pathname variables. */ #ifndef _SYS_SYSPROTO_H_ struct pathconf_args { char *path; int name; }; #endif int sys_pathconf(td, uap) struct thread *td; register struct pathconf_args /* { char *path; int name; } */ *uap; { return (kern_pathconf(td, uap->path, UIO_USERSPACE, uap->name, FOLLOW)); } #ifndef _SYS_SYSPROTO_H_ struct lpathconf_args { char *path; int name; }; #endif int sys_lpathconf(td, uap) struct thread *td; register struct lpathconf_args /* { char *path; int name; } */ *uap; { return (kern_pathconf(td, uap->path, UIO_USERSPACE, uap->name, NOFOLLOW)); } int kern_pathconf(struct thread *td, char *path, enum uio_seg pathseg, int name, u_long flags) { struct nameidata nd; int error; NDINIT(&nd, LOOKUP, LOCKSHARED | LOCKLEAF | AUDITVNODE1 | flags, pathseg, path, td); if ((error = namei(&nd)) != 0) return (error); NDFREE(&nd, NDF_ONLY_PNBUF); /* If asynchronous I/O is available, it works for all files. */ if (name == _PC_ASYNC_IO) td->td_retval[0] = async_io_version; else error = VOP_PATHCONF(nd.ni_vp, name, td->td_retval); vput(nd.ni_vp); return (error); } /* * Return target name of a symbolic link. */ #ifndef _SYS_SYSPROTO_H_ struct readlink_args { char *path; char *buf; size_t count; }; #endif int sys_readlink(td, uap) struct thread *td; register struct readlink_args /* { char *path; char *buf; size_t count; } */ *uap; { return (kern_readlinkat(td, AT_FDCWD, uap->path, UIO_USERSPACE, uap->buf, UIO_USERSPACE, uap->count)); } #ifndef _SYS_SYSPROTO_H_ struct readlinkat_args { int fd; char *path; char *buf; size_t bufsize; }; #endif int sys_readlinkat(struct thread *td, struct readlinkat_args *uap) { return (kern_readlinkat(td, uap->fd, uap->path, UIO_USERSPACE, uap->buf, UIO_USERSPACE, uap->bufsize)); } int kern_readlinkat(struct thread *td, int fd, char *path, enum uio_seg pathseg, char *buf, enum uio_seg bufseg, size_t count) { struct vnode *vp; struct iovec aiov; struct uio auio; struct nameidata nd; int error; if (count > IOSIZE_MAX) return (EINVAL); NDINIT_AT(&nd, LOOKUP, NOFOLLOW | LOCKSHARED | LOCKLEAF | AUDITVNODE1, pathseg, path, fd, td); if ((error = namei(&nd)) != 0) return (error); NDFREE(&nd, NDF_ONLY_PNBUF); vp = nd.ni_vp; #ifdef MAC error = mac_vnode_check_readlink(td->td_ucred, vp); if (error != 0) { vput(vp); return (error); } #endif if (vp->v_type != VLNK) error = EINVAL; else { aiov.iov_base = buf; aiov.iov_len = count; auio.uio_iov = &aiov; auio.uio_iovcnt = 1; auio.uio_offset = 0; auio.uio_rw = UIO_READ; auio.uio_segflg = bufseg; auio.uio_td = td; auio.uio_resid = count; error = VOP_READLINK(vp, &auio, td->td_ucred); td->td_retval[0] = count - auio.uio_resid; } vput(vp); return (error); } /* * Common implementation code for chflags() and fchflags(). */ static int setfflags(td, vp, flags) struct thread *td; struct vnode *vp; u_long flags; { struct mount *mp; struct vattr vattr; int error; /* We can't support the value matching VNOVAL. */ if (flags == VNOVAL) return (EOPNOTSUPP); /* * Prevent non-root users from setting flags on devices. When * a device is reused, users can retain ownership of the device * if they are allowed to set flags and programs assume that * chown can't fail when done as root. */ if (vp->v_type == VCHR || vp->v_type == VBLK) { error = priv_check(td, PRIV_VFS_CHFLAGS_DEV); if (error != 0) return (error); } if ((error = vn_start_write(vp, &mp, V_WAIT | PCATCH)) != 0) return (error); VATTR_NULL(&vattr); vattr.va_flags = flags; vn_lock(vp, LK_EXCLUSIVE | LK_RETRY); #ifdef MAC error = mac_vnode_check_setflags(td->td_ucred, vp, vattr.va_flags); if (error == 0) #endif error = VOP_SETATTR(vp, &vattr, td->td_ucred); VOP_UNLOCK(vp, 0); vn_finished_write(mp); return (error); } /* * Change flags of a file given a path name. */ #ifndef _SYS_SYSPROTO_H_ struct chflags_args { const char *path; u_long flags; }; #endif int sys_chflags(td, uap) struct thread *td; register struct chflags_args /* { const char *path; u_long flags; } */ *uap; { return (kern_chflagsat(td, AT_FDCWD, uap->path, UIO_USERSPACE, uap->flags, 0)); } #ifndef _SYS_SYSPROTO_H_ struct chflagsat_args { int fd; const char *path; u_long flags; int atflag; } #endif int sys_chflagsat(struct thread *td, struct chflagsat_args *uap) { int fd = uap->fd; const char *path = uap->path; u_long flags = uap->flags; int atflag = uap->atflag; if (atflag & ~AT_SYMLINK_NOFOLLOW) return (EINVAL); return (kern_chflagsat(td, fd, path, UIO_USERSPACE, flags, atflag)); } /* * Same as chflags() but doesn't follow symlinks. */ int sys_lchflags(td, uap) struct thread *td; register struct lchflags_args /* { const char *path; u_long flags; } */ *uap; { return (kern_chflagsat(td, AT_FDCWD, uap->path, UIO_USERSPACE, uap->flags, AT_SYMLINK_NOFOLLOW)); } static int kern_chflagsat(struct thread *td, int fd, const char *path, enum uio_seg pathseg, u_long flags, int atflag) { struct nameidata nd; cap_rights_t rights; int error, follow; AUDIT_ARG_FFLAGS(flags); follow = (atflag & AT_SYMLINK_NOFOLLOW) ? NOFOLLOW : FOLLOW; NDINIT_ATRIGHTS(&nd, LOOKUP, follow | AUDITVNODE1, pathseg, path, fd, cap_rights_init(&rights, CAP_FCHFLAGS), td); if ((error = namei(&nd)) != 0) return (error); NDFREE(&nd, NDF_ONLY_PNBUF); error = setfflags(td, nd.ni_vp, flags); vrele(nd.ni_vp); return (error); } /* * Change flags of a file given a file descriptor. */ #ifndef _SYS_SYSPROTO_H_ struct fchflags_args { int fd; u_long flags; }; #endif int sys_fchflags(td, uap) struct thread *td; register struct fchflags_args /* { int fd; u_long flags; } */ *uap; { struct file *fp; cap_rights_t rights; int error; AUDIT_ARG_FD(uap->fd); AUDIT_ARG_FFLAGS(uap->flags); error = getvnode(td->td_proc->p_fd, uap->fd, cap_rights_init(&rights, CAP_FCHFLAGS), &fp); if (error != 0) return (error); #ifdef AUDIT vn_lock(fp->f_vnode, LK_SHARED | LK_RETRY); AUDIT_ARG_VNODE1(fp->f_vnode); VOP_UNLOCK(fp->f_vnode, 0); #endif error = setfflags(td, fp->f_vnode, uap->flags); fdrop(fp, td); return (error); } /* * Common implementation code for chmod(), lchmod() and fchmod(). */ int setfmode(td, cred, vp, mode) struct thread *td; struct ucred *cred; struct vnode *vp; int mode; { struct mount *mp; struct vattr vattr; int error; if ((error = vn_start_write(vp, &mp, V_WAIT | PCATCH)) != 0) return (error); vn_lock(vp, LK_EXCLUSIVE | LK_RETRY); VATTR_NULL(&vattr); vattr.va_mode = mode & ALLPERMS; #ifdef MAC error = mac_vnode_check_setmode(cred, vp, vattr.va_mode); if (error == 0) #endif error = VOP_SETATTR(vp, &vattr, cred); VOP_UNLOCK(vp, 0); vn_finished_write(mp); return (error); } /* * Change mode of a file given path name. */ #ifndef _SYS_SYSPROTO_H_ struct chmod_args { char *path; int mode; }; #endif int sys_chmod(td, uap) struct thread *td; register struct chmod_args /* { char *path; int mode; } */ *uap; { return (kern_fchmodat(td, AT_FDCWD, uap->path, UIO_USERSPACE, uap->mode, 0)); } #ifndef _SYS_SYSPROTO_H_ struct fchmodat_args { int dirfd; char *path; mode_t mode; int flag; } #endif int sys_fchmodat(struct thread *td, struct fchmodat_args *uap) { int flag = uap->flag; int fd = uap->fd; char *path = uap->path; mode_t mode = uap->mode; if (flag & ~AT_SYMLINK_NOFOLLOW) return (EINVAL); return (kern_fchmodat(td, fd, path, UIO_USERSPACE, mode, flag)); } /* * Change mode of a file given path name (don't follow links.) */ #ifndef _SYS_SYSPROTO_H_ struct lchmod_args { char *path; int mode; }; #endif int sys_lchmod(td, uap) struct thread *td; register struct lchmod_args /* { char *path; int mode; } */ *uap; { return (kern_fchmodat(td, AT_FDCWD, uap->path, UIO_USERSPACE, uap->mode, AT_SYMLINK_NOFOLLOW)); } int kern_fchmodat(struct thread *td, int fd, char *path, enum uio_seg pathseg, mode_t mode, int flag) { struct nameidata nd; cap_rights_t rights; int error, follow; AUDIT_ARG_MODE(mode); follow = (flag & AT_SYMLINK_NOFOLLOW) ? NOFOLLOW : FOLLOW; NDINIT_ATRIGHTS(&nd, LOOKUP, follow | AUDITVNODE1, pathseg, path, fd, cap_rights_init(&rights, CAP_FCHMOD), td); if ((error = namei(&nd)) != 0) return (error); NDFREE(&nd, NDF_ONLY_PNBUF); error = setfmode(td, td->td_ucred, nd.ni_vp, mode); vrele(nd.ni_vp); return (error); } /* * Change mode of a file given a file descriptor. */ #ifndef _SYS_SYSPROTO_H_ struct fchmod_args { int fd; int mode; }; #endif int sys_fchmod(struct thread *td, struct fchmod_args *uap) { struct file *fp; cap_rights_t rights; int error; AUDIT_ARG_FD(uap->fd); AUDIT_ARG_MODE(uap->mode); error = fget(td, uap->fd, cap_rights_init(&rights, CAP_FCHMOD), &fp); if (error != 0) return (error); error = fo_chmod(fp, uap->mode, td->td_ucred, td); fdrop(fp, td); return (error); } /* * Common implementation for chown(), lchown(), and fchown() */ int setfown(td, cred, vp, uid, gid) struct thread *td; struct ucred *cred; struct vnode *vp; uid_t uid; gid_t gid; { struct mount *mp; struct vattr vattr; int error; if ((error = vn_start_write(vp, &mp, V_WAIT | PCATCH)) != 0) return (error); vn_lock(vp, LK_EXCLUSIVE | LK_RETRY); VATTR_NULL(&vattr); vattr.va_uid = uid; vattr.va_gid = gid; #ifdef MAC error = mac_vnode_check_setowner(cred, vp, vattr.va_uid, vattr.va_gid); if (error == 0) #endif error = VOP_SETATTR(vp, &vattr, cred); VOP_UNLOCK(vp, 0); vn_finished_write(mp); return (error); } /* * Set ownership given a path name. */ #ifndef _SYS_SYSPROTO_H_ struct chown_args { char *path; int uid; int gid; }; #endif int sys_chown(td, uap) struct thread *td; register struct chown_args /* { char *path; int uid; int gid; } */ *uap; { return (kern_fchownat(td, AT_FDCWD, uap->path, UIO_USERSPACE, uap->uid, uap->gid, 0)); } #ifndef _SYS_SYSPROTO_H_ struct fchownat_args { int fd; const char * path; uid_t uid; gid_t gid; int flag; }; #endif int sys_fchownat(struct thread *td, struct fchownat_args *uap) { int flag; flag = uap->flag; if (flag & ~AT_SYMLINK_NOFOLLOW) return (EINVAL); return (kern_fchownat(td, uap->fd, uap->path, UIO_USERSPACE, uap->uid, uap->gid, uap->flag)); } int kern_fchownat(struct thread *td, int fd, char *path, enum uio_seg pathseg, int uid, int gid, int flag) { struct nameidata nd; cap_rights_t rights; int error, follow; AUDIT_ARG_OWNER(uid, gid); follow = (flag & AT_SYMLINK_NOFOLLOW) ? NOFOLLOW : FOLLOW; NDINIT_ATRIGHTS(&nd, LOOKUP, follow | AUDITVNODE1, pathseg, path, fd, cap_rights_init(&rights, CAP_FCHOWN), td); if ((error = namei(&nd)) != 0) return (error); NDFREE(&nd, NDF_ONLY_PNBUF); error = setfown(td, td->td_ucred, nd.ni_vp, uid, gid); vrele(nd.ni_vp); return (error); } /* * Set ownership given a path name, do not cross symlinks. */ #ifndef _SYS_SYSPROTO_H_ struct lchown_args { char *path; int uid; int gid; }; #endif int sys_lchown(td, uap) struct thread *td; register struct lchown_args /* { char *path; int uid; int gid; } */ *uap; { return (kern_fchownat(td, AT_FDCWD, uap->path, UIO_USERSPACE, uap->uid, uap->gid, AT_SYMLINK_NOFOLLOW)); } /* * Set ownership given a file descriptor. */ #ifndef _SYS_SYSPROTO_H_ struct fchown_args { int fd; int uid; int gid; }; #endif int sys_fchown(td, uap) struct thread *td; register struct fchown_args /* { int fd; int uid; int gid; } */ *uap; { struct file *fp; cap_rights_t rights; int error; AUDIT_ARG_FD(uap->fd); AUDIT_ARG_OWNER(uap->uid, uap->gid); error = fget(td, uap->fd, cap_rights_init(&rights, CAP_FCHOWN), &fp); if (error != 0) return (error); error = fo_chown(fp, uap->uid, uap->gid, td->td_ucred, td); fdrop(fp, td); return (error); } /* * Common implementation code for utimes(), lutimes(), and futimes(). */ static int getutimes(usrtvp, tvpseg, tsp) const struct timeval *usrtvp; enum uio_seg tvpseg; struct timespec *tsp; { struct timeval tv[2]; const struct timeval *tvp; int error; if (usrtvp == NULL) { vfs_timestamp(&tsp[0]); tsp[1] = tsp[0]; } else { if (tvpseg == UIO_SYSSPACE) { tvp = usrtvp; } else { if ((error = copyin(usrtvp, tv, sizeof(tv))) != 0) return (error); tvp = tv; } if (tvp[0].tv_usec < 0 || tvp[0].tv_usec >= 1000000 || tvp[1].tv_usec < 0 || tvp[1].tv_usec >= 1000000) return (EINVAL); TIMEVAL_TO_TIMESPEC(&tvp[0], &tsp[0]); TIMEVAL_TO_TIMESPEC(&tvp[1], &tsp[1]); } return (0); } /* * Common implementation code for futimens(), utimensat(). */ #define UTIMENS_NULL 0x1 #define UTIMENS_EXIT 0x2 static int getutimens(const struct timespec *usrtsp, enum uio_seg tspseg, struct timespec *tsp, int *retflags) { struct timespec tsnow; int error; vfs_timestamp(&tsnow); *retflags = 0; if (usrtsp == NULL) { tsp[0] = tsnow; tsp[1] = tsnow; *retflags |= UTIMENS_NULL; return (0); } if (tspseg == UIO_SYSSPACE) { tsp[0] = usrtsp[0]; tsp[1] = usrtsp[1]; } else if ((error = copyin(usrtsp, tsp, sizeof(*tsp) * 2)) != 0) return (error); if (tsp[0].tv_nsec == UTIME_OMIT && tsp[1].tv_nsec == UTIME_OMIT) *retflags |= UTIMENS_EXIT; if (tsp[0].tv_nsec == UTIME_NOW && tsp[1].tv_nsec == UTIME_NOW) *retflags |= UTIMENS_NULL; if (tsp[0].tv_nsec == UTIME_OMIT) tsp[0].tv_sec = VNOVAL; else if (tsp[0].tv_nsec == UTIME_NOW) tsp[0] = tsnow; else if (tsp[0].tv_nsec < 0 || tsp[0].tv_nsec >= 1000000000L) return (EINVAL); if (tsp[1].tv_nsec == UTIME_OMIT) tsp[1].tv_sec = VNOVAL; else if (tsp[1].tv_nsec == UTIME_NOW) tsp[1] = tsnow; else if (tsp[1].tv_nsec < 0 || tsp[1].tv_nsec >= 1000000000L) return (EINVAL); return (0); } /* * Common implementation code for utimes(), lutimes(), futimes(), futimens(), * and utimensat(). */ static int setutimes(td, vp, ts, numtimes, nullflag) struct thread *td; struct vnode *vp; const struct timespec *ts; int numtimes; int nullflag; { struct mount *mp; struct vattr vattr; int error, setbirthtime; if ((error = vn_start_write(vp, &mp, V_WAIT | PCATCH)) != 0) return (error); vn_lock(vp, LK_EXCLUSIVE | LK_RETRY); setbirthtime = 0; if (numtimes < 3 && !VOP_GETATTR(vp, &vattr, td->td_ucred) && timespeccmp(&ts[1], &vattr.va_birthtime, < )) setbirthtime = 1; VATTR_NULL(&vattr); vattr.va_atime = ts[0]; vattr.va_mtime = ts[1]; if (setbirthtime) vattr.va_birthtime = ts[1]; if (numtimes > 2) vattr.va_birthtime = ts[2]; if (nullflag) vattr.va_vaflags |= VA_UTIMES_NULL; #ifdef MAC error = mac_vnode_check_setutimes(td->td_ucred, vp, vattr.va_atime, vattr.va_mtime); #endif if (error == 0) error = VOP_SETATTR(vp, &vattr, td->td_ucred); VOP_UNLOCK(vp, 0); vn_finished_write(mp); return (error); } /* * Set the access and modification times of a file. */ #ifndef _SYS_SYSPROTO_H_ struct utimes_args { char *path; struct timeval *tptr; }; #endif int sys_utimes(td, uap) struct thread *td; register struct utimes_args /* { char *path; struct timeval *tptr; } */ *uap; { return (kern_utimesat(td, AT_FDCWD, uap->path, UIO_USERSPACE, uap->tptr, UIO_USERSPACE)); } #ifndef _SYS_SYSPROTO_H_ struct futimesat_args { int fd; const char * path; const struct timeval * times; }; #endif int sys_futimesat(struct thread *td, struct futimesat_args *uap) { return (kern_utimesat(td, uap->fd, uap->path, UIO_USERSPACE, uap->times, UIO_USERSPACE)); } int kern_utimesat(struct thread *td, int fd, char *path, enum uio_seg pathseg, struct timeval *tptr, enum uio_seg tptrseg) { struct nameidata nd; struct timespec ts[2]; cap_rights_t rights; int error; if ((error = getutimes(tptr, tptrseg, ts)) != 0) return (error); NDINIT_ATRIGHTS(&nd, LOOKUP, FOLLOW | AUDITVNODE1, pathseg, path, fd, cap_rights_init(&rights, CAP_FUTIMES), td); if ((error = namei(&nd)) != 0) return (error); NDFREE(&nd, NDF_ONLY_PNBUF); error = setutimes(td, nd.ni_vp, ts, 2, tptr == NULL); vrele(nd.ni_vp); return (error); } /* * Set the access and modification times of a file. */ #ifndef _SYS_SYSPROTO_H_ struct lutimes_args { char *path; struct timeval *tptr; }; #endif int sys_lutimes(td, uap) struct thread *td; register struct lutimes_args /* { char *path; struct timeval *tptr; } */ *uap; { return (kern_lutimes(td, uap->path, UIO_USERSPACE, uap->tptr, UIO_USERSPACE)); } int kern_lutimes(struct thread *td, char *path, enum uio_seg pathseg, struct timeval *tptr, enum uio_seg tptrseg) { struct timespec ts[2]; struct nameidata nd; int error; if ((error = getutimes(tptr, tptrseg, ts)) != 0) return (error); NDINIT(&nd, LOOKUP, NOFOLLOW | AUDITVNODE1, pathseg, path, td); if ((error = namei(&nd)) != 0) return (error); NDFREE(&nd, NDF_ONLY_PNBUF); error = setutimes(td, nd.ni_vp, ts, 2, tptr == NULL); vrele(nd.ni_vp); return (error); } /* * Set the access and modification times of a file. */ #ifndef _SYS_SYSPROTO_H_ struct futimes_args { int fd; struct timeval *tptr; }; #endif int sys_futimes(td, uap) struct thread *td; register struct futimes_args /* { int fd; struct timeval *tptr; } */ *uap; { return (kern_futimes(td, uap->fd, uap->tptr, UIO_USERSPACE)); } int kern_futimes(struct thread *td, int fd, struct timeval *tptr, enum uio_seg tptrseg) { struct timespec ts[2]; struct file *fp; cap_rights_t rights; int error; AUDIT_ARG_FD(fd); error = getutimes(tptr, tptrseg, ts); if (error != 0) return (error); error = getvnode(td->td_proc->p_fd, fd, cap_rights_init(&rights, CAP_FUTIMES), &fp); if (error != 0) return (error); #ifdef AUDIT vn_lock(fp->f_vnode, LK_SHARED | LK_RETRY); AUDIT_ARG_VNODE1(fp->f_vnode); VOP_UNLOCK(fp->f_vnode, 0); #endif error = setutimes(td, fp->f_vnode, ts, 2, tptr == NULL); fdrop(fp, td); return (error); } int sys_futimens(struct thread *td, struct futimens_args *uap) { return (kern_futimens(td, uap->fd, uap->times, UIO_USERSPACE)); } int kern_futimens(struct thread *td, int fd, struct timespec *tptr, enum uio_seg tptrseg) { struct timespec ts[2]; struct file *fp; cap_rights_t rights; int error, flags; AUDIT_ARG_FD(fd); error = getutimens(tptr, tptrseg, ts, &flags); if (error != 0) return (error); if (flags & UTIMENS_EXIT) return (0); error = getvnode(td->td_proc->p_fd, fd, cap_rights_init(&rights, CAP_FUTIMES), &fp); if (error != 0) return (error); #ifdef AUDIT vn_lock(fp->f_vnode, LK_SHARED | LK_RETRY); AUDIT_ARG_VNODE1(fp->f_vnode); VOP_UNLOCK(fp->f_vnode, 0); #endif error = setutimes(td, fp->f_vnode, ts, 2, flags & UTIMENS_NULL); fdrop(fp, td); return (error); } int sys_utimensat(struct thread *td, struct utimensat_args *uap) { return (kern_utimensat(td, uap->fd, uap->path, UIO_USERSPACE, uap->times, UIO_USERSPACE, uap->flag)); } int kern_utimensat(struct thread *td, int fd, char *path, enum uio_seg pathseg, struct timespec *tptr, enum uio_seg tptrseg, int flag) { struct nameidata nd; struct timespec ts[2]; cap_rights_t rights; int error, flags; if (flag & ~AT_SYMLINK_NOFOLLOW) return (EINVAL); if ((error = getutimens(tptr, tptrseg, ts, &flags)) != 0) return (error); NDINIT_ATRIGHTS(&nd, LOOKUP, ((flag & AT_SYMLINK_NOFOLLOW) ? NOFOLLOW : FOLLOW) | AUDITVNODE1, pathseg, path, fd, cap_rights_init(&rights, CAP_FUTIMES), td); if ((error = namei(&nd)) != 0) return (error); /* * We are allowed to call namei() regardless of 2xUTIME_OMIT. * POSIX states: * "If both tv_nsec fields are UTIME_OMIT... EACCESS may be detected." * "Search permission is denied by a component of the path prefix." */ NDFREE(&nd, NDF_ONLY_PNBUF); if ((flags & UTIMENS_EXIT) == 0) error = setutimes(td, nd.ni_vp, ts, 2, flags & UTIMENS_NULL); vrele(nd.ni_vp); return (error); } /* * Truncate a file given its path name. */ #ifndef _SYS_SYSPROTO_H_ struct truncate_args { char *path; int pad; off_t length; }; #endif int sys_truncate(td, uap) struct thread *td; register struct truncate_args /* { char *path; int pad; off_t length; } */ *uap; { return (kern_truncate(td, uap->path, UIO_USERSPACE, uap->length)); } int kern_truncate(struct thread *td, char *path, enum uio_seg pathseg, off_t length) { struct mount *mp; struct vnode *vp; void *rl_cookie; struct vattr vattr; struct nameidata nd; int error; if (length < 0) return(EINVAL); NDINIT(&nd, LOOKUP, FOLLOW | AUDITVNODE1, pathseg, path, td); if ((error = namei(&nd)) != 0) return (error); vp = nd.ni_vp; rl_cookie = vn_rangelock_wlock(vp, 0, OFF_MAX); if ((error = vn_start_write(vp, &mp, V_WAIT | PCATCH)) != 0) { vn_rangelock_unlock(vp, rl_cookie); vrele(vp); return (error); } NDFREE(&nd, NDF_ONLY_PNBUF); vn_lock(vp, LK_EXCLUSIVE | LK_RETRY); if (vp->v_type == VDIR) error = EISDIR; #ifdef MAC else if ((error = mac_vnode_check_write(td->td_ucred, NOCRED, vp))) { } #endif else if ((error = vn_writechk(vp)) == 0 && (error = VOP_ACCESS(vp, VWRITE, td->td_ucred, td)) == 0) { VATTR_NULL(&vattr); vattr.va_size = length; error = VOP_SETATTR(vp, &vattr, td->td_ucred); } VOP_UNLOCK(vp, 0); vn_finished_write(mp); vn_rangelock_unlock(vp, rl_cookie); vrele(vp); return (error); } #if defined(COMPAT_43) /* * Truncate a file given its path name. */ #ifndef _SYS_SYSPROTO_H_ struct otruncate_args { char *path; long length; }; #endif int otruncate(td, uap) struct thread *td; register struct otruncate_args /* { char *path; long length; } */ *uap; { struct truncate_args /* { char *path; int pad; off_t length; } */ nuap; nuap.path = uap->path; nuap.length = uap->length; return (sys_truncate(td, &nuap)); } #endif /* COMPAT_43 */ /* Versions with the pad argument */ int freebsd6_truncate(struct thread *td, struct freebsd6_truncate_args *uap) { struct truncate_args ouap; ouap.path = uap->path; ouap.length = uap->length; return (sys_truncate(td, &ouap)); } int freebsd6_ftruncate(struct thread *td, struct freebsd6_ftruncate_args *uap) { struct ftruncate_args ouap; ouap.fd = uap->fd; ouap.length = uap->length; return (sys_ftruncate(td, &ouap)); } /* * Sync an open file. */ #ifndef _SYS_SYSPROTO_H_ struct fsync_args { int fd; }; #endif int sys_fsync(td, uap) struct thread *td; struct fsync_args /* { int fd; } */ *uap; { struct vnode *vp; struct mount *mp; struct file *fp; cap_rights_t rights; int error, lock_flags; AUDIT_ARG_FD(uap->fd); error = getvnode(td->td_proc->p_fd, uap->fd, cap_rights_init(&rights, CAP_FSYNC), &fp); if (error != 0) return (error); vp = fp->f_vnode; error = vn_start_write(vp, &mp, V_WAIT | PCATCH); if (error != 0) goto drop; if (MNT_SHARED_WRITES(mp) || ((mp == NULL) && MNT_SHARED_WRITES(vp->v_mount))) { lock_flags = LK_SHARED; } else { lock_flags = LK_EXCLUSIVE; } vn_lock(vp, lock_flags | LK_RETRY); AUDIT_ARG_VNODE1(vp); if (vp->v_object != NULL) { VM_OBJECT_WLOCK(vp->v_object); vm_object_page_clean(vp->v_object, 0, 0, 0); VM_OBJECT_WUNLOCK(vp->v_object); } error = VOP_FSYNC(vp, MNT_WAIT, td); VOP_UNLOCK(vp, 0); vn_finished_write(mp); drop: fdrop(fp, td); return (error); } /* * Rename files. Source and destination must either both be directories, or * both not be directories. If target is a directory, it must be empty. */ #ifndef _SYS_SYSPROTO_H_ struct rename_args { char *from; char *to; }; #endif int sys_rename(td, uap) struct thread *td; register struct rename_args /* { char *from; char *to; } */ *uap; { return (kern_renameat(td, AT_FDCWD, uap->from, AT_FDCWD, uap->to, UIO_USERSPACE)); } #ifndef _SYS_SYSPROTO_H_ struct renameat_args { int oldfd; char *old; int newfd; char *new; }; #endif int sys_renameat(struct thread *td, struct renameat_args *uap) { return (kern_renameat(td, uap->oldfd, uap->old, uap->newfd, uap->new, UIO_USERSPACE)); } int kern_renameat(struct thread *td, int oldfd, char *old, int newfd, char *new, enum uio_seg pathseg) { struct mount *mp = NULL; struct vnode *tvp, *fvp, *tdvp; struct nameidata fromnd, tond; cap_rights_t rights; int error; again: bwillwrite(); #ifdef MAC NDINIT_ATRIGHTS(&fromnd, DELETE, LOCKPARENT | LOCKLEAF | SAVESTART | AUDITVNODE1, pathseg, old, oldfd, cap_rights_init(&rights, CAP_RENAMEAT), td); #else NDINIT_ATRIGHTS(&fromnd, DELETE, WANTPARENT | SAVESTART | AUDITVNODE1, pathseg, old, oldfd, cap_rights_init(&rights, CAP_RENAMEAT), td); #endif if ((error = namei(&fromnd)) != 0) return (error); #ifdef MAC error = mac_vnode_check_rename_from(td->td_ucred, fromnd.ni_dvp, fromnd.ni_vp, &fromnd.ni_cnd); VOP_UNLOCK(fromnd.ni_dvp, 0); if (fromnd.ni_dvp != fromnd.ni_vp) VOP_UNLOCK(fromnd.ni_vp, 0); #endif fvp = fromnd.ni_vp; NDINIT_ATRIGHTS(&tond, RENAME, LOCKPARENT | LOCKLEAF | NOCACHE | SAVESTART | AUDITVNODE2, pathseg, new, newfd, cap_rights_init(&rights, CAP_LINKAT), td); if (fromnd.ni_vp->v_type == VDIR) tond.ni_cnd.cn_flags |= WILLBEDIR; if ((error = namei(&tond)) != 0) { /* Translate error code for rename("dir1", "dir2/."). */ if (error == EISDIR && fvp->v_type == VDIR) error = EINVAL; NDFREE(&fromnd, NDF_ONLY_PNBUF); vrele(fromnd.ni_dvp); vrele(fvp); goto out1; } tdvp = tond.ni_dvp; tvp = tond.ni_vp; error = vn_start_write(fvp, &mp, V_NOWAIT); if (error != 0) { NDFREE(&fromnd, NDF_ONLY_PNBUF); NDFREE(&tond, NDF_ONLY_PNBUF); if (tvp != NULL) vput(tvp); if (tdvp == tvp) vrele(tdvp); else vput(tdvp); vrele(fromnd.ni_dvp); vrele(fvp); vrele(tond.ni_startdir); if (fromnd.ni_startdir != NULL) vrele(fromnd.ni_startdir); error = vn_start_write(NULL, &mp, V_XSLEEP | PCATCH); if (error != 0) return (error); goto again; } if (tvp != NULL) { if (fvp->v_type == VDIR && tvp->v_type != VDIR) { error = ENOTDIR; goto out; } else if (fvp->v_type != VDIR && tvp->v_type == VDIR) { error = EISDIR; goto out; } #ifdef CAPABILITIES if (newfd != AT_FDCWD) { /* * If the target already exists we require CAP_UNLINKAT * from 'newfd'. */ error = cap_check(&tond.ni_filecaps.fc_rights, cap_rights_init(&rights, CAP_UNLINKAT)); if (error != 0) goto out; } #endif } if (fvp == tdvp) { error = EINVAL; goto out; } /* * If the source is the same as the destination (that is, if they * are links to the same vnode), then there is nothing to do. */ if (fvp == tvp) error = -1; #ifdef MAC else error = mac_vnode_check_rename_to(td->td_ucred, tdvp, tond.ni_vp, fromnd.ni_dvp == tdvp, &tond.ni_cnd); #endif out: if (error == 0) { error = VOP_RENAME(fromnd.ni_dvp, fromnd.ni_vp, &fromnd.ni_cnd, tond.ni_dvp, tond.ni_vp, &tond.ni_cnd); NDFREE(&fromnd, NDF_ONLY_PNBUF); NDFREE(&tond, NDF_ONLY_PNBUF); } else { NDFREE(&fromnd, NDF_ONLY_PNBUF); NDFREE(&tond, NDF_ONLY_PNBUF); if (tvp != NULL) vput(tvp); if (tdvp == tvp) vrele(tdvp); else vput(tdvp); vrele(fromnd.ni_dvp); vrele(fvp); } vrele(tond.ni_startdir); vn_finished_write(mp); out1: if (fromnd.ni_startdir) vrele(fromnd.ni_startdir); if (error == -1) return (0); return (error); } /* * Make a directory file. */ #ifndef _SYS_SYSPROTO_H_ struct mkdir_args { char *path; int mode; }; #endif int sys_mkdir(td, uap) struct thread *td; register struct mkdir_args /* { char *path; int mode; } */ *uap; { return (kern_mkdirat(td, AT_FDCWD, uap->path, UIO_USERSPACE, uap->mode)); } #ifndef _SYS_SYSPROTO_H_ struct mkdirat_args { int fd; char *path; mode_t mode; }; #endif int sys_mkdirat(struct thread *td, struct mkdirat_args *uap) { return (kern_mkdirat(td, uap->fd, uap->path, UIO_USERSPACE, uap->mode)); } int kern_mkdirat(struct thread *td, int fd, char *path, enum uio_seg segflg, int mode) { struct mount *mp; struct vnode *vp; struct vattr vattr; struct nameidata nd; cap_rights_t rights; int error; AUDIT_ARG_MODE(mode); restart: bwillwrite(); NDINIT_ATRIGHTS(&nd, CREATE, LOCKPARENT | SAVENAME | AUDITVNODE1 | NOCACHE, segflg, path, fd, cap_rights_init(&rights, CAP_MKDIRAT), td); nd.ni_cnd.cn_flags |= WILLBEDIR; if ((error = namei(&nd)) != 0) return (error); vp = nd.ni_vp; if (vp != NULL) { NDFREE(&nd, NDF_ONLY_PNBUF); /* * XXX namei called with LOCKPARENT but not LOCKLEAF has * the strange behaviour of leaving the vnode unlocked * if the target is the same vnode as the parent. */ if (vp == nd.ni_dvp) vrele(nd.ni_dvp); else vput(nd.ni_dvp); vrele(vp); return (EEXIST); } if (vn_start_write(nd.ni_dvp, &mp, V_NOWAIT) != 0) { NDFREE(&nd, NDF_ONLY_PNBUF); vput(nd.ni_dvp); if ((error = vn_start_write(NULL, &mp, V_XSLEEP | PCATCH)) != 0) return (error); goto restart; } VATTR_NULL(&vattr); vattr.va_type = VDIR; vattr.va_mode = (mode & ACCESSPERMS) &~ td->td_proc->p_fd->fd_cmask; #ifdef MAC error = mac_vnode_check_create(td->td_ucred, nd.ni_dvp, &nd.ni_cnd, &vattr); if (error != 0) goto out; #endif error = VOP_MKDIR(nd.ni_dvp, &nd.ni_vp, &nd.ni_cnd, &vattr); #ifdef MAC out: #endif NDFREE(&nd, NDF_ONLY_PNBUF); vput(nd.ni_dvp); if (error == 0) vput(nd.ni_vp); vn_finished_write(mp); return (error); } /* * Remove a directory file. */ #ifndef _SYS_SYSPROTO_H_ struct rmdir_args { char *path; }; #endif int sys_rmdir(td, uap) struct thread *td; struct rmdir_args /* { char *path; } */ *uap; { return (kern_rmdirat(td, AT_FDCWD, uap->path, UIO_USERSPACE)); } int kern_rmdirat(struct thread *td, int fd, char *path, enum uio_seg pathseg) { struct mount *mp; struct vnode *vp; struct nameidata nd; cap_rights_t rights; int error; restart: bwillwrite(); NDINIT_ATRIGHTS(&nd, DELETE, LOCKPARENT | LOCKLEAF | AUDITVNODE1, pathseg, path, fd, cap_rights_init(&rights, CAP_UNLINKAT), td); if ((error = namei(&nd)) != 0) return (error); vp = nd.ni_vp; if (vp->v_type != VDIR) { error = ENOTDIR; goto out; } /* * No rmdir "." please. */ if (nd.ni_dvp == vp) { error = EINVAL; goto out; } /* * The root of a mounted filesystem cannot be deleted. */ if (vp->v_vflag & VV_ROOT) { error = EBUSY; goto out; } #ifdef MAC error = mac_vnode_check_unlink(td->td_ucred, nd.ni_dvp, vp, &nd.ni_cnd); if (error != 0) goto out; #endif if (vn_start_write(nd.ni_dvp, &mp, V_NOWAIT) != 0) { NDFREE(&nd, NDF_ONLY_PNBUF); vput(vp); if (nd.ni_dvp == vp) vrele(nd.ni_dvp); else vput(nd.ni_dvp); if ((error = vn_start_write(NULL, &mp, V_XSLEEP | PCATCH)) != 0) return (error); goto restart; } vfs_notify_upper(vp, VFS_NOTIFY_UPPER_UNLINK); error = VOP_RMDIR(nd.ni_dvp, nd.ni_vp, &nd.ni_cnd); vn_finished_write(mp); out: NDFREE(&nd, NDF_ONLY_PNBUF); vput(vp); if (nd.ni_dvp == vp) vrele(nd.ni_dvp); else vput(nd.ni_dvp); return (error); } #ifdef COMPAT_43 /* * Read a block of directory entries in a filesystem independent format. */ #ifndef _SYS_SYSPROTO_H_ struct ogetdirentries_args { int fd; char *buf; u_int count; long *basep; }; #endif int ogetdirentries(struct thread *td, struct ogetdirentries_args *uap) { long loff; int error; error = kern_ogetdirentries(td, uap, &loff); if (error == 0) error = copyout(&loff, uap->basep, sizeof(long)); return (error); } int kern_ogetdirentries(struct thread *td, struct ogetdirentries_args *uap, long *ploff) { struct vnode *vp; struct file *fp; struct uio auio, kuio; struct iovec aiov, kiov; struct dirent *dp, *edp; cap_rights_t rights; caddr_t dirbuf; int error, eofflag, readcnt; long loff; off_t foffset; /* XXX arbitrary sanity limit on `count'. */ if (uap->count > 64 * 1024) return (EINVAL); error = getvnode(td->td_proc->p_fd, uap->fd, cap_rights_init(&rights, CAP_READ), &fp); if (error != 0) return (error); if ((fp->f_flag & FREAD) == 0) { fdrop(fp, td); return (EBADF); } vp = fp->f_vnode; foffset = foffset_lock(fp, 0); unionread: if (vp->v_type != VDIR) { foffset_unlock(fp, foffset, 0); fdrop(fp, td); return (EINVAL); } aiov.iov_base = uap->buf; aiov.iov_len = uap->count; auio.uio_iov = &aiov; auio.uio_iovcnt = 1; auio.uio_rw = UIO_READ; auio.uio_segflg = UIO_USERSPACE; auio.uio_td = td; auio.uio_resid = uap->count; vn_lock(vp, LK_SHARED | LK_RETRY); loff = auio.uio_offset = foffset; #ifdef MAC error = mac_vnode_check_readdir(td->td_ucred, vp); if (error != 0) { VOP_UNLOCK(vp, 0); foffset_unlock(fp, foffset, FOF_NOUPDATE); fdrop(fp, td); return (error); } #endif # if (BYTE_ORDER != LITTLE_ENDIAN) if (vp->v_mount->mnt_maxsymlinklen <= 0) { error = VOP_READDIR(vp, &auio, fp->f_cred, &eofflag, NULL, NULL); foffset = auio.uio_offset; } else # endif { kuio = auio; kuio.uio_iov = &kiov; kuio.uio_segflg = UIO_SYSSPACE; kiov.iov_len = uap->count; dirbuf = malloc(uap->count, M_TEMP, M_WAITOK); kiov.iov_base = dirbuf; error = VOP_READDIR(vp, &kuio, fp->f_cred, &eofflag, NULL, NULL); foffset = kuio.uio_offset; if (error == 0) { readcnt = uap->count - kuio.uio_resid; edp = (struct dirent *)&dirbuf[readcnt]; for (dp = (struct dirent *)dirbuf; dp < edp; ) { # if (BYTE_ORDER == LITTLE_ENDIAN) /* * The expected low byte of * dp->d_namlen is our dp->d_type. * The high MBZ byte of dp->d_namlen * is our dp->d_namlen. */ dp->d_type = dp->d_namlen; dp->d_namlen = 0; # else /* * The dp->d_type is the high byte * of the expected dp->d_namlen, * so must be zero'ed. */ dp->d_type = 0; # endif if (dp->d_reclen > 0) { dp = (struct dirent *) ((char *)dp + dp->d_reclen); } else { error = EIO; break; } } if (dp >= edp) error = uiomove(dirbuf, readcnt, &auio); } free(dirbuf, M_TEMP); } if (error != 0) { VOP_UNLOCK(vp, 0); foffset_unlock(fp, foffset, 0); fdrop(fp, td); return (error); } if (uap->count == auio.uio_resid && (vp->v_vflag & VV_ROOT) && (vp->v_mount->mnt_flag & MNT_UNION)) { struct vnode *tvp = vp; vp = vp->v_mount->mnt_vnodecovered; VREF(vp); fp->f_vnode = vp; fp->f_data = vp; foffset = 0; vput(tvp); goto unionread; } VOP_UNLOCK(vp, 0); foffset_unlock(fp, foffset, 0); fdrop(fp, td); td->td_retval[0] = uap->count - auio.uio_resid; if (error == 0) *ploff = loff; return (error); } #endif /* COMPAT_43 */ /* * Read a block of directory entries in a filesystem independent format. */ #ifndef _SYS_SYSPROTO_H_ struct getdirentries_args { int fd; char *buf; u_int count; long *basep; }; #endif int sys_getdirentries(td, uap) struct thread *td; register struct getdirentries_args /* { int fd; char *buf; u_int count; long *basep; } */ *uap; { long base; int error; error = kern_getdirentries(td, uap->fd, uap->buf, uap->count, &base, NULL, UIO_USERSPACE); if (error != 0) return (error); if (uap->basep != NULL) error = copyout(&base, uap->basep, sizeof(long)); return (error); } int kern_getdirentries(struct thread *td, int fd, char *buf, u_int count, long *basep, ssize_t *residp, enum uio_seg bufseg) { struct vnode *vp; struct file *fp; struct uio auio; struct iovec aiov; cap_rights_t rights; long loff; int error, eofflag; off_t foffset; AUDIT_ARG_FD(fd); if (count > IOSIZE_MAX) return (EINVAL); auio.uio_resid = count; error = getvnode(td->td_proc->p_fd, fd, cap_rights_init(&rights, CAP_READ), &fp); if (error != 0) return (error); if ((fp->f_flag & FREAD) == 0) { fdrop(fp, td); return (EBADF); } vp = fp->f_vnode; foffset = foffset_lock(fp, 0); unionread: if (vp->v_type != VDIR) { error = EINVAL; goto fail; } aiov.iov_base = buf; aiov.iov_len = count; auio.uio_iov = &aiov; auio.uio_iovcnt = 1; auio.uio_rw = UIO_READ; auio.uio_segflg = bufseg; auio.uio_td = td; vn_lock(vp, LK_SHARED | LK_RETRY); AUDIT_ARG_VNODE1(vp); loff = auio.uio_offset = foffset; #ifdef MAC error = mac_vnode_check_readdir(td->td_ucred, vp); if (error == 0) #endif error = VOP_READDIR(vp, &auio, fp->f_cred, &eofflag, NULL, NULL); foffset = auio.uio_offset; if (error != 0) { VOP_UNLOCK(vp, 0); goto fail; } if (count == auio.uio_resid && (vp->v_vflag & VV_ROOT) && (vp->v_mount->mnt_flag & MNT_UNION)) { struct vnode *tvp = vp; vp = vp->v_mount->mnt_vnodecovered; VREF(vp); fp->f_vnode = vp; fp->f_data = vp; foffset = 0; vput(tvp); goto unionread; } VOP_UNLOCK(vp, 0); *basep = loff; if (residp != NULL) *residp = auio.uio_resid; td->td_retval[0] = count - auio.uio_resid; fail: foffset_unlock(fp, foffset, 0); fdrop(fp, td); return (error); } #ifndef _SYS_SYSPROTO_H_ struct getdents_args { int fd; char *buf; size_t count; }; #endif int sys_getdents(td, uap) struct thread *td; register struct getdents_args /* { int fd; char *buf; u_int count; } */ *uap; { struct getdirentries_args ap; ap.fd = uap->fd; ap.buf = uap->buf; ap.count = uap->count; ap.basep = NULL; return (sys_getdirentries(td, &ap)); } /* * Set the mode mask for creation of filesystem nodes. */ #ifndef _SYS_SYSPROTO_H_ struct umask_args { int newmask; }; #endif int sys_umask(td, uap) struct thread *td; struct umask_args /* { int newmask; } */ *uap; { register struct filedesc *fdp; FILEDESC_XLOCK(td->td_proc->p_fd); fdp = td->td_proc->p_fd; td->td_retval[0] = fdp->fd_cmask; fdp->fd_cmask = uap->newmask & ALLPERMS; FILEDESC_XUNLOCK(td->td_proc->p_fd); return (0); } /* * Void all references to file by ripping underlying filesystem away from * vnode. */ #ifndef _SYS_SYSPROTO_H_ struct revoke_args { char *path; }; #endif int sys_revoke(td, uap) struct thread *td; register struct revoke_args /* { char *path; } */ *uap; { struct vnode *vp; struct vattr vattr; struct nameidata nd; int error; NDINIT(&nd, LOOKUP, FOLLOW | LOCKLEAF | AUDITVNODE1, UIO_USERSPACE, uap->path, td); if ((error = namei(&nd)) != 0) return (error); vp = nd.ni_vp; NDFREE(&nd, NDF_ONLY_PNBUF); if (vp->v_type != VCHR || vp->v_rdev == NULL) { error = EINVAL; goto out; } #ifdef MAC error = mac_vnode_check_revoke(td->td_ucred, vp); if (error != 0) goto out; #endif error = VOP_GETATTR(vp, &vattr, td->td_ucred); if (error != 0) goto out; if (td->td_ucred->cr_uid != vattr.va_uid) { error = priv_check(td, PRIV_VFS_ADMIN); if (error != 0) goto out; } if (vcount(vp) > 1) VOP_REVOKE(vp, REVOKEALL); out: vput(vp); return (error); } /* * Convert a user file descriptor to a kernel file entry and check that, if it * is a capability, the correct rights are present. A reference on the file * entry is held upon returning. */ int getvnode(struct filedesc *fdp, int fd, cap_rights_t *rightsp, struct file **fpp) { struct file *fp; int error; error = fget_unlocked(fdp, fd, rightsp, &fp, NULL); if (error != 0) return (error); /* * The file could be not of the vnode type, or it may be not * yet fully initialized, in which case the f_vnode pointer * may be set, but f_ops is still badfileops. E.g., * devfs_open() transiently create such situation to * facilitate csw d_fdopen(). * * Dupfdopen() handling in kern_openat() installs the * half-baked file into the process descriptor table, allowing * other thread to dereference it. Guard against the race by * checking f_ops. */ if (fp->f_vnode == NULL || fp->f_ops == &badfileops) { fdrop(fp, curthread); return (EINVAL); } *fpp = fp; return (0); } /* * Get an (NFS) file handle. */ #ifndef _SYS_SYSPROTO_H_ struct lgetfh_args { char *fname; fhandle_t *fhp; }; #endif int sys_lgetfh(td, uap) struct thread *td; register struct lgetfh_args *uap; { struct nameidata nd; fhandle_t fh; register struct vnode *vp; int error; error = priv_check(td, PRIV_VFS_GETFH); if (error != 0) return (error); NDINIT(&nd, LOOKUP, NOFOLLOW | LOCKLEAF | AUDITVNODE1, UIO_USERSPACE, uap->fname, td); error = namei(&nd); if (error != 0) return (error); NDFREE(&nd, NDF_ONLY_PNBUF); vp = nd.ni_vp; bzero(&fh, sizeof(fh)); fh.fh_fsid = vp->v_mount->mnt_stat.f_fsid; error = VOP_VPTOFH(vp, &fh.fh_fid); vput(vp); if (error == 0) error = copyout(&fh, uap->fhp, sizeof (fh)); return (error); } #ifndef _SYS_SYSPROTO_H_ struct getfh_args { char *fname; fhandle_t *fhp; }; #endif int sys_getfh(td, uap) struct thread *td; register struct getfh_args *uap; { struct nameidata nd; fhandle_t fh; register struct vnode *vp; int error; error = priv_check(td, PRIV_VFS_GETFH); if (error != 0) return (error); NDINIT(&nd, LOOKUP, FOLLOW | LOCKLEAF | AUDITVNODE1, UIO_USERSPACE, uap->fname, td); error = namei(&nd); if (error != 0) return (error); NDFREE(&nd, NDF_ONLY_PNBUF); vp = nd.ni_vp; bzero(&fh, sizeof(fh)); fh.fh_fsid = vp->v_mount->mnt_stat.f_fsid; error = VOP_VPTOFH(vp, &fh.fh_fid); vput(vp); if (error == 0) error = copyout(&fh, uap->fhp, sizeof (fh)); return (error); } /* * syscall for the rpc.lockd to use to translate a NFS file handle into an * open descriptor. * * warning: do not remove the priv_check() call or this becomes one giant * security hole. */ #ifndef _SYS_SYSPROTO_H_ struct fhopen_args { const struct fhandle *u_fhp; int flags; }; #endif int sys_fhopen(td, uap) struct thread *td; struct fhopen_args /* { const struct fhandle *u_fhp; int flags; } */ *uap; { struct mount *mp; struct vnode *vp; struct fhandle fhp; struct file *fp; int fmode, error; int indx; error = priv_check(td, PRIV_VFS_FHOPEN); if (error != 0) return (error); indx = -1; fmode = FFLAGS(uap->flags); /* why not allow a non-read/write open for our lockd? */ if (((fmode & (FREAD | FWRITE)) == 0) || (fmode & O_CREAT)) return (EINVAL); error = copyin(uap->u_fhp, &fhp, sizeof(fhp)); if (error != 0) return(error); /* find the mount point */ mp = vfs_busyfs(&fhp.fh_fsid); if (mp == NULL) return (ESTALE); /* now give me my vnode, it gets returned to me locked */ error = VFS_FHTOVP(mp, &fhp.fh_fid, LK_EXCLUSIVE, &vp); vfs_unbusy(mp); if (error != 0) return (error); error = falloc_noinstall(td, &fp); if (error != 0) { vput(vp); return (error); } /* * An extra reference on `fp' has been held for us by * falloc_noinstall(). */ #ifdef INVARIANTS td->td_dupfd = -1; #endif error = vn_open_vnode(vp, fmode, td->td_ucred, td, fp); if (error != 0) { KASSERT(fp->f_ops == &badfileops, ("VOP_OPEN in fhopen() set f_ops")); KASSERT(td->td_dupfd < 0, ("fhopen() encountered fdopen()")); vput(vp); goto bad; } #ifdef INVARIANTS td->td_dupfd = 0; #endif fp->f_vnode = vp; fp->f_seqcount = 1; finit(fp, (fmode & FMASK) | (fp->f_flag & FHASLOCK), DTYPE_VNODE, vp, &vnops); VOP_UNLOCK(vp, 0); if ((fmode & O_TRUNC) != 0) { error = fo_truncate(fp, 0, td->td_ucred, td); if (error != 0) goto bad; } error = finstall(td, fp, &indx, fmode, NULL); bad: fdrop(fp, td); td->td_retval[0] = indx; return (error); } /* * Stat an (NFS) file handle. */ #ifndef _SYS_SYSPROTO_H_ struct fhstat_args { struct fhandle *u_fhp; struct stat *sb; }; #endif int sys_fhstat(td, uap) struct thread *td; register struct fhstat_args /* { struct fhandle *u_fhp; struct stat *sb; } */ *uap; { struct stat sb; struct fhandle fh; int error; error = copyin(uap->u_fhp, &fh, sizeof(fh)); if (error != 0) return (error); error = kern_fhstat(td, fh, &sb); if (error == 0) error = copyout(&sb, uap->sb, sizeof(sb)); return (error); } int kern_fhstat(struct thread *td, struct fhandle fh, struct stat *sb) { struct mount *mp; struct vnode *vp; int error; error = priv_check(td, PRIV_VFS_FHSTAT); if (error != 0) return (error); if ((mp = vfs_busyfs(&fh.fh_fsid)) == NULL) return (ESTALE); error = VFS_FHTOVP(mp, &fh.fh_fid, LK_EXCLUSIVE, &vp); vfs_unbusy(mp); if (error != 0) return (error); error = vn_stat(vp, sb, td->td_ucred, NOCRED, td); vput(vp); return (error); } /* * Implement fstatfs() for (NFS) file handles. */ #ifndef _SYS_SYSPROTO_H_ struct fhstatfs_args { struct fhandle *u_fhp; struct statfs *buf; }; #endif int sys_fhstatfs(td, uap) struct thread *td; struct fhstatfs_args /* { struct fhandle *u_fhp; struct statfs *buf; } */ *uap; { struct statfs sf; fhandle_t fh; int error; error = copyin(uap->u_fhp, &fh, sizeof(fhandle_t)); if (error != 0) return (error); error = kern_fhstatfs(td, fh, &sf); if (error != 0) return (error); return (copyout(&sf, uap->buf, sizeof(sf))); } int kern_fhstatfs(struct thread *td, fhandle_t fh, struct statfs *buf) { struct statfs *sp; struct mount *mp; struct vnode *vp; int error; error = priv_check(td, PRIV_VFS_FHSTATFS); if (error != 0) return (error); if ((mp = vfs_busyfs(&fh.fh_fsid)) == NULL) return (ESTALE); error = VFS_FHTOVP(mp, &fh.fh_fid, LK_EXCLUSIVE, &vp); if (error != 0) { vfs_unbusy(mp); return (error); } vput(vp); error = prison_canseemount(td->td_ucred, mp); if (error != 0) goto out; #ifdef MAC error = mac_mount_check_stat(td->td_ucred, mp); if (error != 0) goto out; #endif /* * Set these in case the underlying filesystem fails to do so. */ sp = &mp->mnt_stat; sp->f_version = STATFS_VERSION; sp->f_namemax = NAME_MAX; sp->f_flags = mp->mnt_flag & MNT_VISFLAGMASK; error = VFS_STATFS(mp, sp); if (error == 0) *buf = *sp; out: vfs_unbusy(mp); return (error); } int kern_posix_fallocate(struct thread *td, int fd, off_t offset, off_t len) { struct file *fp; struct mount *mp; struct vnode *vp; cap_rights_t rights; off_t olen, ooffset; int error; if (offset < 0 || len <= 0) return (EINVAL); /* Check for wrap. */ if (offset > OFF_MAX - len) return (EFBIG); error = fget(td, fd, cap_rights_init(&rights, CAP_WRITE), &fp); if (error != 0) return (error); if ((fp->f_ops->fo_flags & DFLAG_SEEKABLE) == 0) { error = ESPIPE; goto out; } if ((fp->f_flag & FWRITE) == 0) { error = EBADF; goto out; } if (fp->f_type != DTYPE_VNODE) { error = ENODEV; goto out; } vp = fp->f_vnode; if (vp->v_type != VREG) { error = ENODEV; goto out; } /* Allocating blocks may take a long time, so iterate. */ for (;;) { olen = len; ooffset = offset; bwillwrite(); mp = NULL; error = vn_start_write(vp, &mp, V_WAIT | PCATCH); if (error != 0) break; error = vn_lock(vp, LK_EXCLUSIVE); if (error != 0) { vn_finished_write(mp); break; } #ifdef MAC error = mac_vnode_check_write(td->td_ucred, fp->f_cred, vp); if (error == 0) #endif error = VOP_ALLOCATE(vp, &offset, &len); VOP_UNLOCK(vp, 0); vn_finished_write(mp); if (olen + ooffset != offset + len) { panic("offset + len changed from %jx/%jx to %jx/%jx", ooffset, olen, offset, len); } if (error != 0 || len == 0) break; KASSERT(olen > len, ("Iteration did not make progress?")); maybe_yield(); } out: fdrop(fp, td); return (error); } int sys_posix_fallocate(struct thread *td, struct posix_fallocate_args *uap) { td->td_retval[0] = kern_posix_fallocate(td, uap->fd, uap->offset, uap->len); return (0); } /* * Unlike madvise(2), we do not make a best effort to remember every * possible caching hint. Instead, we remember the last setting with * the exception that we will allow POSIX_FADV_NORMAL to adjust the * region of any current setting. */ int kern_posix_fadvise(struct thread *td, int fd, off_t offset, off_t len, int advice) { struct fadvise_info *fa, *new; struct file *fp; struct vnode *vp; cap_rights_t rights; off_t end; int error; if (offset < 0 || len < 0 || offset > OFF_MAX - len) return (EINVAL); switch (advice) { case POSIX_FADV_SEQUENTIAL: case POSIX_FADV_RANDOM: case POSIX_FADV_NOREUSE: new = malloc(sizeof(*fa), M_FADVISE, M_WAITOK); break; case POSIX_FADV_NORMAL: case POSIX_FADV_WILLNEED: case POSIX_FADV_DONTNEED: new = NULL; break; default: return (EINVAL); } /* XXX: CAP_POSIX_FADVISE? */ error = fget(td, fd, cap_rights_init(&rights), &fp); if (error != 0) goto out; if ((fp->f_ops->fo_flags & DFLAG_SEEKABLE) == 0) { error = ESPIPE; goto out; } if (fp->f_type != DTYPE_VNODE) { error = ENODEV; goto out; } vp = fp->f_vnode; if (vp->v_type != VREG) { error = ENODEV; goto out; } if (len == 0) end = OFF_MAX; else end = offset + len - 1; switch (advice) { case POSIX_FADV_SEQUENTIAL: case POSIX_FADV_RANDOM: case POSIX_FADV_NOREUSE: /* * Try to merge any existing non-standard region with * this new region if possible, otherwise create a new * non-standard region for this request. */ mtx_pool_lock(mtxpool_sleep, fp); fa = fp->f_advice; if (fa != NULL && fa->fa_advice == advice && ((fa->fa_start <= end && fa->fa_end >= offset) || (end != OFF_MAX && fa->fa_start == end + 1) || (fa->fa_end != OFF_MAX && fa->fa_end + 1 == offset))) { if (offset < fa->fa_start) fa->fa_start = offset; if (end > fa->fa_end) fa->fa_end = end; } else { new->fa_advice = advice; new->fa_start = offset; new->fa_end = end; new->fa_prevstart = 0; new->fa_prevend = 0; fp->f_advice = new; new = fa; } mtx_pool_unlock(mtxpool_sleep, fp); break; case POSIX_FADV_NORMAL: /* * If a the "normal" region overlaps with an existing * non-standard region, trim or remove the * non-standard region. */ mtx_pool_lock(mtxpool_sleep, fp); fa = fp->f_advice; if (fa != NULL) { if (offset <= fa->fa_start && end >= fa->fa_end) { new = fa; fp->f_advice = NULL; } else if (offset <= fa->fa_start && end >= fa->fa_start) fa->fa_start = end + 1; else if (offset <= fa->fa_end && end >= fa->fa_end) fa->fa_end = offset - 1; else if (offset >= fa->fa_start && end <= fa->fa_end) { /* * If the "normal" region is a middle * portion of the existing * non-standard region, just remove * the whole thing rather than picking * one side or the other to * preserve. */ new = fa; fp->f_advice = NULL; } } mtx_pool_unlock(mtxpool_sleep, fp); break; case POSIX_FADV_WILLNEED: case POSIX_FADV_DONTNEED: error = VOP_ADVISE(vp, offset, end, advice); break; } out: if (fp != NULL) fdrop(fp, td); free(new, M_FADVISE); return (error); } int sys_posix_fadvise(struct thread *td, struct posix_fadvise_args *uap) { td->td_retval[0] = kern_posix_fadvise(td, uap->fd, uap->offset, uap->len, uap->advice); return (0); } Index: user/ngie/more-tests/sys/net/if_types.h =================================================================== --- user/ngie/more-tests/sys/net/if_types.h (revision 281584) +++ user/ngie/more-tests/sys/net/if_types.h (revision 281585) @@ -1,252 +1,252 @@ /*- * Copyright (c) 1989, 1993, 1994 * The Regents of the University of California. All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 4. Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * @(#)if_types.h 8.3 (Berkeley) 4/28/95 * $FreeBSD$ * $NetBSD: if_types.h,v 1.16 2000/04/19 06:30:53 itojun Exp $ */ #ifndef _NET_IF_TYPES_H_ #define _NET_IF_TYPES_H_ /* * Interface types for benefit of parsing media address headers. * This list is derived from the SNMP list of ifTypes, originally * documented in RFC1573, now maintained as: * * http://www.iana.org/assignments/smi-numbers */ #define IFT_OTHER 0x1 /* none of the following */ #define IFT_1822 0x2 /* old-style arpanet imp */ #define IFT_HDH1822 0x3 /* HDH arpanet imp */ #define IFT_X25DDN 0x4 /* x25 to imp */ #define IFT_X25 0x5 /* PDN X25 interface (RFC877) */ #define IFT_ETHER 0x6 /* Ethernet CSMA/CD */ #define IFT_ISO88023 0x7 /* CMSA/CD */ #define IFT_ISO88024 0x8 /* Token Bus */ #define IFT_ISO88025 0x9 /* Token Ring */ #define IFT_ISO88026 0xa /* MAN */ #define IFT_STARLAN 0xb #define IFT_P10 0xc /* Proteon 10MBit ring */ #define IFT_P80 0xd /* Proteon 80MBit ring */ #define IFT_HY 0xe /* Hyperchannel */ #define IFT_FDDI 0xf #define IFT_LAPB 0x10 #define IFT_SDLC 0x11 #define IFT_T1 0x12 #define IFT_CEPT 0x13 /* E1 - european T1 */ #define IFT_ISDNBASIC 0x14 #define IFT_ISDNPRIMARY 0x15 #define IFT_PTPSERIAL 0x16 /* Proprietary PTP serial */ #define IFT_PPP 0x17 /* RFC 1331 */ #define IFT_LOOP 0x18 /* loopback */ #define IFT_EON 0x19 /* ISO over IP */ #define IFT_XETHER 0x1a /* obsolete 3MB experimental ethernet */ #define IFT_NSIP 0x1b /* XNS over IP */ #define IFT_SLIP 0x1c /* IP over generic TTY */ #define IFT_ULTRA 0x1d /* Ultra Technologies */ #define IFT_DS3 0x1e /* Generic T3 */ #define IFT_SIP 0x1f /* SMDS */ #define IFT_FRELAY 0x20 /* Frame Relay DTE only */ #define IFT_RS232 0x21 #define IFT_PARA 0x22 /* parallel-port */ #define IFT_ARCNET 0x23 #define IFT_ARCNETPLUS 0x24 #define IFT_ATM 0x25 /* ATM cells */ #define IFT_MIOX25 0x26 #define IFT_SONET 0x27 /* SONET or SDH */ #define IFT_X25PLE 0x28 #define IFT_ISO88022LLC 0x29 #define IFT_LOCALTALK 0x2a #define IFT_SMDSDXI 0x2b #define IFT_FRELAYDCE 0x2c /* Frame Relay DCE */ #define IFT_V35 0x2d #define IFT_HSSI 0x2e #define IFT_HIPPI 0x2f #define IFT_MODEM 0x30 /* Generic Modem */ #define IFT_AAL5 0x31 /* AAL5 over ATM */ #define IFT_SONETPATH 0x32 #define IFT_SONETVT 0x33 #define IFT_SMDSICIP 0x34 /* SMDS InterCarrier Interface */ #define IFT_PROPVIRTUAL 0x35 /* Proprietary Virtual/internal */ #define IFT_PROPMUX 0x36 /* Proprietary Multiplexing */ #define IFT_IEEE80212 0x37 /* 100BaseVG */ #define IFT_FIBRECHANNEL 0x38 /* Fibre Channel */ #define IFT_HIPPIINTERFACE 0x39 /* HIPPI interfaces */ #define IFT_FRAMERELAYINTERCONNECT 0x3a /* Obsolete, use either 0x20 or 0x2c */ #define IFT_AFLANE8023 0x3b /* ATM Emulated LAN for 802.3 */ #define IFT_AFLANE8025 0x3c /* ATM Emulated LAN for 802.5 */ #define IFT_CCTEMUL 0x3d /* ATM Emulated circuit */ #define IFT_FASTETHER 0x3e /* Fast Ethernet (100BaseT) */ #define IFT_ISDN 0x3f /* ISDN and X.25 */ #define IFT_V11 0x40 /* CCITT V.11/X.21 */ #define IFT_V36 0x41 /* CCITT V.36 */ #define IFT_G703AT64K 0x42 /* CCITT G703 at 64Kbps */ #define IFT_G703AT2MB 0x43 /* Obsolete see DS1-MIB */ #define IFT_QLLC 0x44 /* SNA QLLC */ #define IFT_FASTETHERFX 0x45 /* Fast Ethernet (100BaseFX) */ #define IFT_CHANNEL 0x46 /* channel */ #define IFT_IEEE80211 0x47 /* radio spread spectrum */ #define IFT_IBM370PARCHAN 0x48 /* IBM System 360/370 OEMI Channel */ #define IFT_ESCON 0x49 /* IBM Enterprise Systems Connection */ #define IFT_DLSW 0x4a /* Data Link Switching */ #define IFT_ISDNS 0x4b /* ISDN S/T interface */ #define IFT_ISDNU 0x4c /* ISDN U interface */ #define IFT_LAPD 0x4d /* Link Access Protocol D */ #define IFT_IPSWITCH 0x4e /* IP Switching Objects */ #define IFT_RSRB 0x4f /* Remote Source Route Bridging */ #define IFT_ATMLOGICAL 0x50 /* ATM Logical Port */ #define IFT_DS0 0x51 /* Digital Signal Level 0 */ #define IFT_DS0BUNDLE 0x52 /* group of ds0s on the same ds1 */ #define IFT_BSC 0x53 /* Bisynchronous Protocol */ #define IFT_ASYNC 0x54 /* Asynchronous Protocol */ #define IFT_CNR 0x55 /* Combat Net Radio */ #define IFT_ISO88025DTR 0x56 /* ISO 802.5r DTR */ #define IFT_EPLRS 0x57 /* Ext Pos Loc Report Sys */ #define IFT_ARAP 0x58 /* Appletalk Remote Access Protocol */ #define IFT_PROPCNLS 0x59 /* Proprietary Connectionless Protocol*/ #define IFT_HOSTPAD 0x5a /* CCITT-ITU X.29 PAD Protocol */ #define IFT_TERMPAD 0x5b /* CCITT-ITU X.3 PAD Facility */ #define IFT_FRAMERELAYMPI 0x5c /* Multiproto Interconnect over FR */ #define IFT_X213 0x5d /* CCITT-ITU X213 */ #define IFT_ADSL 0x5e /* Asymmetric Digital Subscriber Loop */ #define IFT_RADSL 0x5f /* Rate-Adapt. Digital Subscriber Loop*/ #define IFT_SDSL 0x60 /* Symmetric Digital Subscriber Loop */ #define IFT_VDSL 0x61 /* Very H-Speed Digital Subscrib. Loop*/ #define IFT_ISO88025CRFPINT 0x62 /* ISO 802.5 CRFP */ #define IFT_MYRINET 0x63 /* Myricom Myrinet */ #define IFT_VOICEEM 0x64 /* voice recEive and transMit */ #define IFT_VOICEFXO 0x65 /* voice Foreign Exchange Office */ #define IFT_VOICEFXS 0x66 /* voice Foreign Exchange Station */ #define IFT_VOICEENCAP 0x67 /* voice encapsulation */ #define IFT_VOICEOVERIP 0x68 /* voice over IP encapsulation */ #define IFT_ATMDXI 0x69 /* ATM DXI */ #define IFT_ATMFUNI 0x6a /* ATM FUNI */ #define IFT_ATMIMA 0x6b /* ATM IMA */ #define IFT_PPPMULTILINKBUNDLE 0x6c /* PPP Multilink Bundle */ #define IFT_IPOVERCDLC 0x6d /* IBM ipOverCdlc */ #define IFT_IPOVERCLAW 0x6e /* IBM Common Link Access to Workstn */ #define IFT_STACKTOSTACK 0x6f /* IBM stackToStack */ #define IFT_VIRTUALIPADDRESS 0x70 /* IBM VIPA */ #define IFT_MPC 0x71 /* IBM multi-protocol channel support */ #define IFT_IPOVERATM 0x72 /* IBM ipOverAtm */ #define IFT_ISO88025FIBER 0x73 /* ISO 802.5j Fiber Token Ring */ #define IFT_TDLC 0x74 /* IBM twinaxial data link control */ #define IFT_GIGABITETHERNET 0x75 /* Gigabit Ethernet */ #define IFT_HDLC 0x76 /* HDLC */ #define IFT_LAPF 0x77 /* LAP F */ #define IFT_V37 0x78 /* V.37 */ #define IFT_X25MLP 0x79 /* Multi-Link Protocol */ #define IFT_X25HUNTGROUP 0x7a /* X25 Hunt Group */ #define IFT_TRANSPHDLC 0x7b /* Transp HDLC */ #define IFT_INTERLEAVE 0x7c /* Interleave channel */ #define IFT_FAST 0x7d /* Fast channel */ #define IFT_IP 0x7e /* IP (for APPN HPR in IP networks) */ #define IFT_DOCSCABLEMACLAYER 0x7f /* CATV Mac Layer */ #define IFT_DOCSCABLEDOWNSTREAM 0x80 /* CATV Downstream interface */ #define IFT_DOCSCABLEUPSTREAM 0x81 /* CATV Upstream interface */ #define IFT_A12MPPSWITCH 0x82 /* Avalon Parallel Processor */ #define IFT_TUNNEL 0x83 /* Encapsulation interface */ #define IFT_COFFEE 0x84 /* coffee pot */ #define IFT_CES 0x85 /* Circiut Emulation Service */ #define IFT_ATMSUBINTERFACE 0x86 /* (x) ATM Sub Interface */ #define IFT_L2VLAN 0x87 /* Layer 2 Virtual LAN using 802.1Q */ #define IFT_L3IPVLAN 0x88 /* Layer 3 Virtual LAN - IP Protocol */ #define IFT_L3IPXVLAN 0x89 /* Layer 3 Virtual LAN - IPX Prot. */ #define IFT_DIGITALPOWERLINE 0x8a /* IP over Power Lines */ #define IFT_MEDIAMAILOVERIP 0x8b /* (xxx) Multimedia Mail over IP */ #define IFT_DTM 0x8c /* Dynamic synchronous Transfer Mode */ #define IFT_DCN 0x8d /* Data Communications Network */ #define IFT_IPFORWARD 0x8e /* IP Forwarding Interface */ #define IFT_MSDSL 0x8f /* Multi-rate Symmetric DSL */ #define IFT_IEEE1394 0x90 /* IEEE1394 High Performance SerialBus*/ #define IFT_IFGSN 0x91 /* HIPPI-6400 */ #define IFT_DVBRCCMACLAYER 0x92 /* DVB-RCC MAC Layer */ #define IFT_DVBRCCDOWNSTREAM 0x93 /* DVB-RCC Downstream Channel */ #define IFT_DVBRCCUPSTREAM 0x94 /* DVB-RCC Upstream Channel */ #define IFT_ATMVIRTUAL 0x95 /* ATM Virtual Interface */ #define IFT_MPLSTUNNEL 0x96 /* MPLS Tunnel Virtual Interface */ #define IFT_SRP 0x97 /* Spatial Reuse Protocol */ #define IFT_VOICEOVERATM 0x98 /* Voice over ATM */ #define IFT_VOICEOVERFRAMERELAY 0x99 /* Voice Over Frame Relay */ #define IFT_IDSL 0x9a /* Digital Subscriber Loop over ISDN */ #define IFT_COMPOSITELINK 0x9b /* Avici Composite Link Interface */ #define IFT_SS7SIGLINK 0x9c /* SS7 Signaling Link */ #define IFT_PROPWIRELESSP2P 0x9d /* Prop. P2P wireless interface */ #define IFT_FRFORWARD 0x9e /* Frame forward Interface */ #define IFT_RFC1483 0x9f /* Multiprotocol over ATM AAL5 */ #define IFT_USB 0xa0 /* USB Interface */ #define IFT_IEEE8023ADLAG 0xa1 /* IEEE 802.3ad Link Aggregate*/ #define IFT_BGPPOLICYACCOUNTING 0xa2 /* BGP Policy Accounting */ #define IFT_FRF16MFRBUNDLE 0xa3 /* FRF.16 Multilik Frame Relay*/ #define IFT_H323GATEKEEPER 0xa4 /* H323 Gatekeeper */ #define IFT_H323PROXY 0xa5 /* H323 Voice and Video Proxy */ #define IFT_MPLS 0xa6 /* MPLS */ #define IFT_MFSIGLINK 0xa7 /* Multi-frequency signaling link */ #define IFT_HDSL2 0xa8 /* High Bit-Rate DSL, 2nd gen. */ #define IFT_SHDSL 0xa9 /* Multirate HDSL2 */ #define IFT_DS1FDL 0xaa /* Facility Data Link (4Kbps) on a DS1*/ #define IFT_POS 0xab /* Packet over SONET/SDH Interface */ #define IFT_DVBASILN 0xac /* DVB-ASI Input */ #define IFT_DVBASIOUT 0xad /* DVB-ASI Output */ #define IFT_PLC 0xae /* Power Line Communications */ #define IFT_NFAS 0xaf /* Non-Facility Associated Signaling */ #define IFT_TR008 0xb0 /* TROO8 */ #define IFT_GR303RDT 0xb1 /* Remote Digital Terminal */ #define IFT_GR303IDT 0xb2 /* Integrated Digital Terminal */ #define IFT_ISUP 0xb3 /* ISUP */ #define IFT_PROPDOCSWIRELESSMACLAYER 0xb4 /* prop/Wireless MAC Layer */ #define IFT_PROPDOCSWIRELESSDOWNSTREAM 0xb5 /* prop/Wireless Downstream */ #define IFT_PROPDOCSWIRELESSUPSTREAM 0xb6 /* prop/Wireless Upstream */ #define IFT_HIPERLAN2 0xb7 /* HIPERLAN Type 2 Radio Interface */ #define IFT_PROPBWAP2MP 0xb8 /* PropBroadbandWirelessAccess P2MP*/ #define IFT_SONETOVERHEADCHANNEL 0xb9 /* SONET Overhead Channel */ #define IFT_DIGITALWRAPPEROVERHEADCHANNEL 0xba /* Digital Wrapper Overhead */ #define IFT_AAL2 0xbb /* ATM adaptation layer 2 */ #define IFT_RADIOMAC 0xbc /* MAC layer over radio links */ #define IFT_ATMRADIO 0xbd /* ATM over radio links */ #define IFT_IMT 0xbe /* Inter-Machine Trunks */ #define IFT_MVL 0xbf /* Multiple Virtual Lines DSL */ #define IFT_REACHDSL 0xc0 /* Long Reach DSL */ #define IFT_FRDLCIENDPT 0xc1 /* Frame Relay DLCI End Point */ #define IFT_ATMVCIENDPT 0xc2 /* ATM VCI End Point */ #define IFT_OPTICALCHANNEL 0xc3 /* Optical Channel */ #define IFT_OPTICALTRANSPORT 0xc4 /* Optical Transport */ #define IFT_INFINIBAND 0xc7 /* Infiniband */ #define IFT_BRIDGE 0xd1 /* Transparent bridge interface */ #define IFT_STF 0xd7 /* 6to4 interface */ -/* not based on IANA assignments */ -#define IFT_GIF 0xf0 -#define IFT_PVC 0xf1 -#define IFT_ENC 0xf4 -#define IFT_PFLOG 0xf6 -#define IFT_PFSYNC 0xf7 +/* FreeBSD specific, not based on IANA assignments */ +#define IFT_GIF 0xf0 /* Generic tunnel interface */ +#define IFT_PVC 0xf1 /* Unused */ +#define IFT_ENC 0xf4 /* Encapsulating interface */ +#define IFT_PFLOG 0xf6 /* PF packet filter logging */ +#define IFT_PFSYNC 0xf7 /* PF packet filter synchronization */ #endif /* !_NET_IF_TYPES_H_ */ Index: user/ngie/more-tests/sys/net/pfvar.h =================================================================== --- user/ngie/more-tests/sys/net/pfvar.h (revision 281584) +++ user/ngie/more-tests/sys/net/pfvar.h (revision 281585) @@ -1,1744 +1,1743 @@ /* * Copyright (c) 2001 Daniel Hartmeier * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * * - Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials provided * with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS * FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE * COPYRIGHT HOLDERS OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, * BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER * CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN * ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE * POSSIBILITY OF SUCH DAMAGE. * * $OpenBSD: pfvar.h,v 1.282 2009/01/29 15:12:28 pyr Exp $ * $FreeBSD$ */ #ifndef _NET_PFVAR_H_ #define _NET_PFVAR_H_ #include #include #include #include #include #include #include #include #include #include struct pf_addr { union { struct in_addr v4; struct in6_addr v6; u_int8_t addr8[16]; u_int16_t addr16[8]; u_int32_t addr32[4]; } pfa; /* 128-bit address */ #define v4 pfa.v4 #define v6 pfa.v6 #define addr8 pfa.addr8 #define addr16 pfa.addr16 #define addr32 pfa.addr32 }; #define PFI_AFLAG_NETWORK 0x01 #define PFI_AFLAG_BROADCAST 0x02 #define PFI_AFLAG_PEER 0x04 #define PFI_AFLAG_MODEMASK 0x07 #define PFI_AFLAG_NOALIAS 0x08 struct pf_addr_wrap { union { struct { struct pf_addr addr; struct pf_addr mask; } a; char ifname[IFNAMSIZ]; char tblname[PF_TABLE_NAME_SIZE]; } v; union { struct pfi_dynaddr *dyn; struct pfr_ktable *tbl; int dyncnt; int tblcnt; } p; u_int8_t type; /* PF_ADDR_* */ u_int8_t iflags; /* PFI_AFLAG_* */ }; #ifdef _KERNEL struct pfi_dynaddr { TAILQ_ENTRY(pfi_dynaddr) entry; struct pf_addr pfid_addr4; struct pf_addr pfid_mask4; struct pf_addr pfid_addr6; struct pf_addr pfid_mask6; struct pfr_ktable *pfid_kt; struct pfi_kif *pfid_kif; int pfid_net; /* mask or 128 */ int pfid_acnt4; /* address count IPv4 */ int pfid_acnt6; /* address count IPv6 */ sa_family_t pfid_af; /* rule af */ u_int8_t pfid_iflags; /* PFI_AFLAG_* */ }; /* * Address manipulation macros */ #define HTONL(x) (x) = htonl((__uint32_t)(x)) #define HTONS(x) (x) = htons((__uint16_t)(x)) #define NTOHL(x) (x) = ntohl((__uint32_t)(x)) #define NTOHS(x) (x) = ntohs((__uint16_t)(x)) #define PF_NAME "pf" #define PF_HASHROW_ASSERT(h) mtx_assert(&(h)->lock, MA_OWNED) #define PF_HASHROW_LOCK(h) mtx_lock(&(h)->lock) #define PF_HASHROW_UNLOCK(h) mtx_unlock(&(h)->lock) #define PF_STATE_LOCK(s) \ do { \ struct pf_idhash *_ih = &V_pf_idhash[PF_IDHASH(s)]; \ PF_HASHROW_LOCK(_ih); \ } while (0) #define PF_STATE_UNLOCK(s) \ do { \ struct pf_idhash *_ih = &V_pf_idhash[PF_IDHASH((s))]; \ PF_HASHROW_UNLOCK(_ih); \ } while (0) #ifdef INVARIANTS #define PF_STATE_LOCK_ASSERT(s) \ do { \ struct pf_idhash *_ih = &V_pf_idhash[PF_IDHASH(s)]; \ PF_HASHROW_ASSERT(_ih); \ } while (0) #else /* !INVARIANTS */ #define PF_STATE_LOCK_ASSERT(s) do {} while (0) #endif /* INVARIANTS */ extern struct mtx pf_unlnkdrules_mtx; #define PF_UNLNKDRULES_LOCK() mtx_lock(&pf_unlnkdrules_mtx) #define PF_UNLNKDRULES_UNLOCK() mtx_unlock(&pf_unlnkdrules_mtx) extern struct rwlock pf_rules_lock; #define PF_RULES_RLOCK() rw_rlock(&pf_rules_lock) #define PF_RULES_RUNLOCK() rw_runlock(&pf_rules_lock) #define PF_RULES_WLOCK() rw_wlock(&pf_rules_lock) #define PF_RULES_WUNLOCK() rw_wunlock(&pf_rules_lock) #define PF_RULES_ASSERT() rw_assert(&pf_rules_lock, RA_LOCKED) #define PF_RULES_RASSERT() rw_assert(&pf_rules_lock, RA_RLOCKED) #define PF_RULES_WASSERT() rw_assert(&pf_rules_lock, RA_WLOCKED) #define PF_MODVER 1 #define PFLOG_MODVER 1 #define PFSYNC_MODVER 1 #define PFLOG_MINVER 1 #define PFLOG_PREFVER PFLOG_MODVER #define PFLOG_MAXVER 1 #define PFSYNC_MINVER 1 #define PFSYNC_PREFVER PFSYNC_MODVER #define PFSYNC_MAXVER 1 #ifdef INET #ifndef INET6 #define PF_INET_ONLY #endif /* ! INET6 */ #endif /* INET */ #ifdef INET6 #ifndef INET #define PF_INET6_ONLY #endif /* ! INET */ #endif /* INET6 */ #ifdef INET #ifdef INET6 #define PF_INET_INET6 #endif /* INET6 */ #endif /* INET */ #else #define PF_INET_INET6 #endif /* _KERNEL */ /* Both IPv4 and IPv6 */ #ifdef PF_INET_INET6 #define PF_AEQ(a, b, c) \ ((c == AF_INET && (a)->addr32[0] == (b)->addr32[0]) || \ - ((a)->addr32[3] == (b)->addr32[3] && \ + (c == AF_INET6 && (a)->addr32[3] == (b)->addr32[3] && \ (a)->addr32[2] == (b)->addr32[2] && \ (a)->addr32[1] == (b)->addr32[1] && \ (a)->addr32[0] == (b)->addr32[0])) \ #define PF_ANEQ(a, b, c) \ - ((c == AF_INET && (a)->addr32[0] != (b)->addr32[0]) || \ - ((a)->addr32[3] != (b)->addr32[3] || \ - (a)->addr32[2] != (b)->addr32[2] || \ + ((a)->addr32[0] != (b)->addr32[0] || \ (a)->addr32[1] != (b)->addr32[1] || \ - (a)->addr32[0] != (b)->addr32[0])) \ + (a)->addr32[2] != (b)->addr32[2] || \ + (a)->addr32[3] != (b)->addr32[3]) \ #define PF_AZERO(a, c) \ ((c == AF_INET && !(a)->addr32[0]) || \ - (!(a)->addr32[0] && !(a)->addr32[1] && \ + (c == AF_INET6 && !(a)->addr32[0] && !(a)->addr32[1] && \ !(a)->addr32[2] && !(a)->addr32[3] )) \ #define PF_MATCHA(n, a, m, b, f) \ pf_match_addr(n, a, m, b, f) #define PF_ACPY(a, b, f) \ pf_addrcpy(a, b, f) #define PF_AINC(a, f) \ pf_addr_inc(a, f) #define PF_POOLMASK(a, b, c, d, f) \ pf_poolmask(a, b, c, d, f) #else /* Just IPv6 */ #ifdef PF_INET6_ONLY #define PF_AEQ(a, b, c) \ ((a)->addr32[3] == (b)->addr32[3] && \ (a)->addr32[2] == (b)->addr32[2] && \ (a)->addr32[1] == (b)->addr32[1] && \ (a)->addr32[0] == (b)->addr32[0]) \ #define PF_ANEQ(a, b, c) \ ((a)->addr32[3] != (b)->addr32[3] || \ (a)->addr32[2] != (b)->addr32[2] || \ (a)->addr32[1] != (b)->addr32[1] || \ (a)->addr32[0] != (b)->addr32[0]) \ #define PF_AZERO(a, c) \ (!(a)->addr32[0] && \ !(a)->addr32[1] && \ !(a)->addr32[2] && \ !(a)->addr32[3] ) \ #define PF_MATCHA(n, a, m, b, f) \ pf_match_addr(n, a, m, b, f) #define PF_ACPY(a, b, f) \ pf_addrcpy(a, b, f) #define PF_AINC(a, f) \ pf_addr_inc(a, f) #define PF_POOLMASK(a, b, c, d, f) \ pf_poolmask(a, b, c, d, f) #else /* Just IPv4 */ #ifdef PF_INET_ONLY #define PF_AEQ(a, b, c) \ ((a)->addr32[0] == (b)->addr32[0]) #define PF_ANEQ(a, b, c) \ ((a)->addr32[0] != (b)->addr32[0]) #define PF_AZERO(a, c) \ (!(a)->addr32[0]) #define PF_MATCHA(n, a, m, b, f) \ pf_match_addr(n, a, m, b, f) #define PF_ACPY(a, b, f) \ (a)->v4.s_addr = (b)->v4.s_addr #define PF_AINC(a, f) \ do { \ (a)->addr32[0] = htonl(ntohl((a)->addr32[0]) + 1); \ } while (0) #define PF_POOLMASK(a, b, c, d, f) \ do { \ (a)->addr32[0] = ((b)->addr32[0] & (c)->addr32[0]) | \ (((c)->addr32[0] ^ 0xffffffff ) & (d)->addr32[0]); \ } while (0) #endif /* PF_INET_ONLY */ #endif /* PF_INET6_ONLY */ #endif /* PF_INET_INET6 */ /* * XXX callers not FIB-aware in our version of pf yet. * OpenBSD fixed it later it seems, 2010/05/07 13:33:16 claudio. */ #define PF_MISMATCHAW(aw, x, af, neg, ifp, rtid) \ ( \ (((aw)->type == PF_ADDR_NOROUTE && \ pf_routable((x), (af), NULL, (rtid))) || \ (((aw)->type == PF_ADDR_URPFFAILED && (ifp) != NULL && \ pf_routable((x), (af), (ifp), (rtid))) || \ ((aw)->type == PF_ADDR_TABLE && \ !pfr_match_addr((aw)->p.tbl, (x), (af))) || \ ((aw)->type == PF_ADDR_DYNIFTL && \ !pfi_match_addr((aw)->p.dyn, (x), (af))) || \ ((aw)->type == PF_ADDR_RANGE && \ !pf_match_addr_range(&(aw)->v.a.addr, \ &(aw)->v.a.mask, (x), (af))) || \ ((aw)->type == PF_ADDR_ADDRMASK && \ !PF_AZERO(&(aw)->v.a.mask, (af)) && \ !PF_MATCHA(0, &(aw)->v.a.addr, \ &(aw)->v.a.mask, (x), (af))))) != \ (neg) \ ) struct pf_rule_uid { uid_t uid[2]; u_int8_t op; }; struct pf_rule_gid { uid_t gid[2]; u_int8_t op; }; struct pf_rule_addr { struct pf_addr_wrap addr; u_int16_t port[2]; u_int8_t neg; u_int8_t port_op; }; struct pf_pooladdr { struct pf_addr_wrap addr; TAILQ_ENTRY(pf_pooladdr) entries; char ifname[IFNAMSIZ]; struct pfi_kif *kif; }; TAILQ_HEAD(pf_palist, pf_pooladdr); struct pf_poolhashkey { union { u_int8_t key8[16]; u_int16_t key16[8]; u_int32_t key32[4]; } pfk; /* 128-bit hash key */ #define key8 pfk.key8 #define key16 pfk.key16 #define key32 pfk.key32 }; struct pf_pool { struct pf_palist list; struct pf_pooladdr *cur; struct pf_poolhashkey key; struct pf_addr counter; int tblidx; u_int16_t proxy_port[2]; u_int8_t opts; }; /* A packed Operating System description for fingerprinting */ typedef u_int32_t pf_osfp_t; #define PF_OSFP_ANY ((pf_osfp_t)0) #define PF_OSFP_UNKNOWN ((pf_osfp_t)-1) #define PF_OSFP_NOMATCH ((pf_osfp_t)-2) struct pf_osfp_entry { SLIST_ENTRY(pf_osfp_entry) fp_entry; pf_osfp_t fp_os; int fp_enflags; #define PF_OSFP_EXPANDED 0x001 /* expanded entry */ #define PF_OSFP_GENERIC 0x002 /* generic signature */ #define PF_OSFP_NODETAIL 0x004 /* no p0f details */ #define PF_OSFP_LEN 32 char fp_class_nm[PF_OSFP_LEN]; char fp_version_nm[PF_OSFP_LEN]; char fp_subtype_nm[PF_OSFP_LEN]; }; #define PF_OSFP_ENTRY_EQ(a, b) \ ((a)->fp_os == (b)->fp_os && \ memcmp((a)->fp_class_nm, (b)->fp_class_nm, PF_OSFP_LEN) == 0 && \ memcmp((a)->fp_version_nm, (b)->fp_version_nm, PF_OSFP_LEN) == 0 && \ memcmp((a)->fp_subtype_nm, (b)->fp_subtype_nm, PF_OSFP_LEN) == 0) /* handle pf_osfp_t packing */ #define _FP_RESERVED_BIT 1 /* For the special negative #defines */ #define _FP_UNUSED_BITS 1 #define _FP_CLASS_BITS 10 /* OS Class (Windows, Linux) */ #define _FP_VERSION_BITS 10 /* OS version (95, 98, NT, 2.4.54, 3.2) */ #define _FP_SUBTYPE_BITS 10 /* patch level (NT SP4, SP3, ECN patch) */ #define PF_OSFP_UNPACK(osfp, class, version, subtype) do { \ (class) = ((osfp) >> (_FP_VERSION_BITS+_FP_SUBTYPE_BITS)) & \ ((1 << _FP_CLASS_BITS) - 1); \ (version) = ((osfp) >> _FP_SUBTYPE_BITS) & \ ((1 << _FP_VERSION_BITS) - 1);\ (subtype) = (osfp) & ((1 << _FP_SUBTYPE_BITS) - 1); \ } while(0) #define PF_OSFP_PACK(osfp, class, version, subtype) do { \ (osfp) = ((class) & ((1 << _FP_CLASS_BITS) - 1)) << (_FP_VERSION_BITS \ + _FP_SUBTYPE_BITS); \ (osfp) |= ((version) & ((1 << _FP_VERSION_BITS) - 1)) << \ _FP_SUBTYPE_BITS; \ (osfp) |= (subtype) & ((1 << _FP_SUBTYPE_BITS) - 1); \ } while(0) /* the fingerprint of an OSes TCP SYN packet */ typedef u_int64_t pf_tcpopts_t; struct pf_os_fingerprint { SLIST_HEAD(pf_osfp_enlist, pf_osfp_entry) fp_oses; /* list of matches */ pf_tcpopts_t fp_tcpopts; /* packed TCP options */ u_int16_t fp_wsize; /* TCP window size */ u_int16_t fp_psize; /* ip->ip_len */ u_int16_t fp_mss; /* TCP MSS */ u_int16_t fp_flags; #define PF_OSFP_WSIZE_MOD 0x0001 /* Window modulus */ #define PF_OSFP_WSIZE_DC 0x0002 /* Window don't care */ #define PF_OSFP_WSIZE_MSS 0x0004 /* Window multiple of MSS */ #define PF_OSFP_WSIZE_MTU 0x0008 /* Window multiple of MTU */ #define PF_OSFP_PSIZE_MOD 0x0010 /* packet size modulus */ #define PF_OSFP_PSIZE_DC 0x0020 /* packet size don't care */ #define PF_OSFP_WSCALE 0x0040 /* TCP window scaling */ #define PF_OSFP_WSCALE_MOD 0x0080 /* TCP window scale modulus */ #define PF_OSFP_WSCALE_DC 0x0100 /* TCP window scale dont-care */ #define PF_OSFP_MSS 0x0200 /* TCP MSS */ #define PF_OSFP_MSS_MOD 0x0400 /* TCP MSS modulus */ #define PF_OSFP_MSS_DC 0x0800 /* TCP MSS dont-care */ #define PF_OSFP_DF 0x1000 /* IPv4 don't fragment bit */ #define PF_OSFP_TS0 0x2000 /* Zero timestamp */ #define PF_OSFP_INET6 0x4000 /* IPv6 */ u_int8_t fp_optcnt; /* TCP option count */ u_int8_t fp_wscale; /* TCP window scaling */ u_int8_t fp_ttl; /* IPv4 TTL */ #define PF_OSFP_MAXTTL_OFFSET 40 /* TCP options packing */ #define PF_OSFP_TCPOPT_NOP 0x0 /* TCP NOP option */ #define PF_OSFP_TCPOPT_WSCALE 0x1 /* TCP window scaling option */ #define PF_OSFP_TCPOPT_MSS 0x2 /* TCP max segment size opt */ #define PF_OSFP_TCPOPT_SACK 0x3 /* TCP SACK OK option */ #define PF_OSFP_TCPOPT_TS 0x4 /* TCP timestamp option */ #define PF_OSFP_TCPOPT_BITS 3 /* bits used by each option */ #define PF_OSFP_MAX_OPTS \ (sizeof(((struct pf_os_fingerprint *)0)->fp_tcpopts) * 8) \ / PF_OSFP_TCPOPT_BITS SLIST_ENTRY(pf_os_fingerprint) fp_next; }; struct pf_osfp_ioctl { struct pf_osfp_entry fp_os; pf_tcpopts_t fp_tcpopts; /* packed TCP options */ u_int16_t fp_wsize; /* TCP window size */ u_int16_t fp_psize; /* ip->ip_len */ u_int16_t fp_mss; /* TCP MSS */ u_int16_t fp_flags; u_int8_t fp_optcnt; /* TCP option count */ u_int8_t fp_wscale; /* TCP window scaling */ u_int8_t fp_ttl; /* IPv4 TTL */ int fp_getnum; /* DIOCOSFPGET number */ }; union pf_rule_ptr { struct pf_rule *ptr; u_int32_t nr; }; #define PF_ANCHOR_NAME_SIZE 64 struct pf_rule { struct pf_rule_addr src; struct pf_rule_addr dst; #define PF_SKIP_IFP 0 #define PF_SKIP_DIR 1 #define PF_SKIP_AF 2 #define PF_SKIP_PROTO 3 #define PF_SKIP_SRC_ADDR 4 #define PF_SKIP_SRC_PORT 5 #define PF_SKIP_DST_ADDR 6 #define PF_SKIP_DST_PORT 7 #define PF_SKIP_COUNT 8 union pf_rule_ptr skip[PF_SKIP_COUNT]; #define PF_RULE_LABEL_SIZE 64 char label[PF_RULE_LABEL_SIZE]; char ifname[IFNAMSIZ]; char qname[PF_QNAME_SIZE]; char pqname[PF_QNAME_SIZE]; #define PF_TAG_NAME_SIZE 64 char tagname[PF_TAG_NAME_SIZE]; char match_tagname[PF_TAG_NAME_SIZE]; char overload_tblname[PF_TABLE_NAME_SIZE]; TAILQ_ENTRY(pf_rule) entries; struct pf_pool rpool; u_int64_t evaluations; u_int64_t packets[2]; u_int64_t bytes[2]; struct pfi_kif *kif; struct pf_anchor *anchor; struct pfr_ktable *overload_tbl; pf_osfp_t os_fingerprint; int rtableid; u_int32_t timeout[PFTM_MAX]; u_int32_t max_states; u_int32_t max_src_nodes; u_int32_t max_src_states; u_int32_t max_src_conn; struct { u_int32_t limit; u_int32_t seconds; } max_src_conn_rate; u_int32_t qid; u_int32_t pqid; u_int32_t rt_listid; u_int32_t nr; u_int32_t prob; uid_t cuid; pid_t cpid; counter_u64_t states_cur; counter_u64_t states_tot; counter_u64_t src_nodes; u_int16_t return_icmp; u_int16_t return_icmp6; u_int16_t max_mss; u_int16_t tag; u_int16_t match_tag; u_int16_t spare2; /* netgraph */ struct pf_rule_uid uid; struct pf_rule_gid gid; u_int32_t rule_flag; u_int8_t action; u_int8_t direction; u_int8_t log; u_int8_t logif; u_int8_t quick; u_int8_t ifnot; u_int8_t match_tag_not; u_int8_t natpass; #define PF_STATE_NORMAL 0x1 #define PF_STATE_MODULATE 0x2 #define PF_STATE_SYNPROXY 0x3 u_int8_t keep_state; sa_family_t af; u_int8_t proto; u_int8_t type; u_int8_t code; u_int8_t flags; u_int8_t flagset; u_int8_t min_ttl; u_int8_t allow_opts; u_int8_t rt; u_int8_t return_ttl; u_int8_t tos; u_int8_t set_tos; u_int8_t anchor_relative; u_int8_t anchor_wildcard; #define PF_FLUSH 0x01 #define PF_FLUSH_GLOBAL 0x02 u_int8_t flush; struct { struct pf_addr addr; u_int16_t port; } divert; uint64_t u_states_cur; uint64_t u_states_tot; uint64_t u_src_nodes; }; /* rule flags */ #define PFRULE_DROP 0x0000 #define PFRULE_RETURNRST 0x0001 #define PFRULE_FRAGMENT 0x0002 #define PFRULE_RETURNICMP 0x0004 #define PFRULE_RETURN 0x0008 #define PFRULE_NOSYNC 0x0010 #define PFRULE_SRCTRACK 0x0020 /* track source states */ #define PFRULE_RULESRCTRACK 0x0040 /* per rule */ #define PFRULE_REFS 0x0080 /* rule has references */ /* scrub flags */ #define PFRULE_NODF 0x0100 #define PFRULE_FRAGCROP 0x0200 /* non-buffering frag cache */ #define PFRULE_FRAGDROP 0x0400 /* drop funny fragments */ #define PFRULE_RANDOMID 0x0800 #define PFRULE_REASSEMBLE_TCP 0x1000 #define PFRULE_SET_TOS 0x2000 /* rule flags again */ #define PFRULE_IFBOUND 0x00010000 /* if-bound */ #define PFRULE_STATESLOPPY 0x00020000 /* sloppy state tracking */ #define PFSTATE_HIWAT 10000 /* default state table size */ #define PFSTATE_ADAPT_START 6000 /* default adaptive timeout start */ #define PFSTATE_ADAPT_END 12000 /* default adaptive timeout end */ struct pf_threshold { u_int32_t limit; #define PF_THRESHOLD_MULT 1000 #define PF_THRESHOLD_MAX 0xffffffff / PF_THRESHOLD_MULT u_int32_t seconds; u_int32_t count; u_int32_t last; }; struct pf_src_node { LIST_ENTRY(pf_src_node) entry; struct pf_addr addr; struct pf_addr raddr; union pf_rule_ptr rule; struct pfi_kif *kif; u_int64_t bytes[2]; u_int64_t packets[2]; u_int32_t states; u_int32_t conn; struct pf_threshold conn_rate; u_int32_t creation; u_int32_t expire; sa_family_t af; u_int8_t ruletype; }; #define PFSNODE_HIWAT 10000 /* default source node table size */ struct pf_state_scrub { struct timeval pfss_last; /* time received last packet */ u_int32_t pfss_tsecr; /* last echoed timestamp */ u_int32_t pfss_tsval; /* largest timestamp */ u_int32_t pfss_tsval0; /* original timestamp */ u_int16_t pfss_flags; #define PFSS_TIMESTAMP 0x0001 /* modulate timestamp */ #define PFSS_PAWS 0x0010 /* stricter PAWS checks */ #define PFSS_PAWS_IDLED 0x0020 /* was idle too long. no PAWS */ #define PFSS_DATA_TS 0x0040 /* timestamp on data packets */ #define PFSS_DATA_NOTS 0x0080 /* no timestamp on data packets */ u_int8_t pfss_ttl; /* stashed TTL */ u_int8_t pad; u_int32_t pfss_ts_mod; /* timestamp modulation */ }; struct pf_state_host { struct pf_addr addr; u_int16_t port; u_int16_t pad; }; struct pf_state_peer { struct pf_state_scrub *scrub; /* state is scrubbed */ u_int32_t seqlo; /* Max sequence number sent */ u_int32_t seqhi; /* Max the other end ACKd + win */ u_int32_t seqdiff; /* Sequence number modulator */ u_int16_t max_win; /* largest window (pre scaling) */ u_int16_t mss; /* Maximum segment size option */ u_int8_t state; /* active state level */ u_int8_t wscale; /* window scaling factor */ u_int8_t tcp_est; /* Did we reach TCPS_ESTABLISHED */ u_int8_t pad[1]; }; /* Keep synced with struct pf_state_key. */ struct pf_state_key_cmp { struct pf_addr addr[2]; u_int16_t port[2]; sa_family_t af; u_int8_t proto; u_int8_t pad[2]; }; struct pf_state_key { struct pf_addr addr[2]; u_int16_t port[2]; sa_family_t af; u_int8_t proto; u_int8_t pad[2]; LIST_ENTRY(pf_state_key) entry; TAILQ_HEAD(, pf_state) states[2]; }; /* Keep synced with struct pf_state. */ struct pf_state_cmp { u_int64_t id; u_int32_t creatorid; u_int8_t direction; u_int8_t pad[3]; }; struct pf_state { u_int64_t id; u_int32_t creatorid; u_int8_t direction; u_int8_t pad[3]; u_int refs; TAILQ_ENTRY(pf_state) sync_list; TAILQ_ENTRY(pf_state) key_list[2]; LIST_ENTRY(pf_state) entry; struct pf_state_peer src; struct pf_state_peer dst; union pf_rule_ptr rule; union pf_rule_ptr anchor; union pf_rule_ptr nat_rule; struct pf_addr rt_addr; struct pf_state_key *key[2]; /* addresses stack and wire */ struct pfi_kif *kif; struct pfi_kif *rt_kif; struct pf_src_node *src_node; struct pf_src_node *nat_src_node; u_int64_t packets[2]; u_int64_t bytes[2]; u_int32_t creation; u_int32_t expire; u_int32_t pfsync_time; u_int16_t tag; u_int8_t log; u_int8_t state_flags; #define PFSTATE_ALLOWOPTS 0x01 #define PFSTATE_SLOPPY 0x02 /* was PFSTATE_PFLOW 0x04 */ #define PFSTATE_NOSYNC 0x08 #define PFSTATE_ACK 0x10 u_int8_t timeout; u_int8_t sync_state; /* PFSYNC_S_x */ /* XXX */ u_int8_t sync_updates; u_int8_t _tail[3]; }; /* * Unified state structures for pulling states out of the kernel * used by pfsync(4) and the pf(4) ioctl. */ struct pfsync_state_scrub { u_int16_t pfss_flags; u_int8_t pfss_ttl; /* stashed TTL */ #define PFSYNC_SCRUB_FLAG_VALID 0x01 u_int8_t scrub_flag; u_int32_t pfss_ts_mod; /* timestamp modulation */ } __packed; struct pfsync_state_peer { struct pfsync_state_scrub scrub; /* state is scrubbed */ u_int32_t seqlo; /* Max sequence number sent */ u_int32_t seqhi; /* Max the other end ACKd + win */ u_int32_t seqdiff; /* Sequence number modulator */ u_int16_t max_win; /* largest window (pre scaling) */ u_int16_t mss; /* Maximum segment size option */ u_int8_t state; /* active state level */ u_int8_t wscale; /* window scaling factor */ u_int8_t pad[6]; } __packed; struct pfsync_state_key { struct pf_addr addr[2]; u_int16_t port[2]; }; struct pfsync_state { u_int64_t id; char ifname[IFNAMSIZ]; struct pfsync_state_key key[2]; struct pfsync_state_peer src; struct pfsync_state_peer dst; struct pf_addr rt_addr; u_int32_t rule; u_int32_t anchor; u_int32_t nat_rule; u_int32_t creation; u_int32_t expire; u_int32_t packets[2][2]; u_int32_t bytes[2][2]; u_int32_t creatorid; sa_family_t af; u_int8_t proto; u_int8_t direction; u_int8_t __spare[2]; u_int8_t log; u_int8_t state_flags; u_int8_t timeout; u_int8_t sync_flags; u_int8_t updates; } __packed; #ifdef _KERNEL /* pfsync */ typedef int pfsync_state_import_t(struct pfsync_state *, u_int8_t); typedef void pfsync_insert_state_t(struct pf_state *); typedef void pfsync_update_state_t(struct pf_state *); typedef void pfsync_delete_state_t(struct pf_state *); typedef void pfsync_clear_states_t(u_int32_t, const char *); typedef int pfsync_defer_t(struct pf_state *, struct mbuf *); extern pfsync_state_import_t *pfsync_state_import_ptr; extern pfsync_insert_state_t *pfsync_insert_state_ptr; extern pfsync_update_state_t *pfsync_update_state_ptr; extern pfsync_delete_state_t *pfsync_delete_state_ptr; extern pfsync_clear_states_t *pfsync_clear_states_ptr; extern pfsync_defer_t *pfsync_defer_ptr; void pfsync_state_export(struct pfsync_state *, struct pf_state *); /* pflog */ struct pf_ruleset; struct pf_pdesc; typedef int pflog_packet_t(struct pfi_kif *, struct mbuf *, sa_family_t, u_int8_t, u_int8_t, struct pf_rule *, struct pf_rule *, struct pf_ruleset *, struct pf_pdesc *, int); extern pflog_packet_t *pflog_packet_ptr; #define V_pf_end_threads VNET(pf_end_threads) #endif /* _KERNEL */ #define PFSYNC_FLAG_SRCNODE 0x04 #define PFSYNC_FLAG_NATSRCNODE 0x08 /* for copies to/from network byte order */ /* ioctl interface also uses network byte order */ #define pf_state_peer_hton(s,d) do { \ (d)->seqlo = htonl((s)->seqlo); \ (d)->seqhi = htonl((s)->seqhi); \ (d)->seqdiff = htonl((s)->seqdiff); \ (d)->max_win = htons((s)->max_win); \ (d)->mss = htons((s)->mss); \ (d)->state = (s)->state; \ (d)->wscale = (s)->wscale; \ if ((s)->scrub) { \ (d)->scrub.pfss_flags = \ htons((s)->scrub->pfss_flags & PFSS_TIMESTAMP); \ (d)->scrub.pfss_ttl = (s)->scrub->pfss_ttl; \ (d)->scrub.pfss_ts_mod = htonl((s)->scrub->pfss_ts_mod);\ (d)->scrub.scrub_flag = PFSYNC_SCRUB_FLAG_VALID; \ } \ } while (0) #define pf_state_peer_ntoh(s,d) do { \ (d)->seqlo = ntohl((s)->seqlo); \ (d)->seqhi = ntohl((s)->seqhi); \ (d)->seqdiff = ntohl((s)->seqdiff); \ (d)->max_win = ntohs((s)->max_win); \ (d)->mss = ntohs((s)->mss); \ (d)->state = (s)->state; \ (d)->wscale = (s)->wscale; \ if ((s)->scrub.scrub_flag == PFSYNC_SCRUB_FLAG_VALID && \ (d)->scrub != NULL) { \ (d)->scrub->pfss_flags = \ ntohs((s)->scrub.pfss_flags) & PFSS_TIMESTAMP; \ (d)->scrub->pfss_ttl = (s)->scrub.pfss_ttl; \ (d)->scrub->pfss_ts_mod = ntohl((s)->scrub.pfss_ts_mod);\ } \ } while (0) #define pf_state_counter_hton(s,d) do { \ d[0] = htonl((s>>32)&0xffffffff); \ d[1] = htonl(s&0xffffffff); \ } while (0) #define pf_state_counter_from_pfsync(s) \ (((u_int64_t)(s[0])<<32) | (u_int64_t)(s[1])) #define pf_state_counter_ntoh(s,d) do { \ d = ntohl(s[0]); \ d = d<<32; \ d += ntohl(s[1]); \ } while (0) TAILQ_HEAD(pf_rulequeue, pf_rule); struct pf_anchor; struct pf_ruleset { struct { struct pf_rulequeue queues[2]; struct { struct pf_rulequeue *ptr; struct pf_rule **ptr_array; u_int32_t rcount; u_int32_t ticket; int open; } active, inactive; } rules[PF_RULESET_MAX]; struct pf_anchor *anchor; u_int32_t tticket; int tables; int topen; }; RB_HEAD(pf_anchor_global, pf_anchor); RB_HEAD(pf_anchor_node, pf_anchor); struct pf_anchor { RB_ENTRY(pf_anchor) entry_global; RB_ENTRY(pf_anchor) entry_node; struct pf_anchor *parent; struct pf_anchor_node children; char name[PF_ANCHOR_NAME_SIZE]; char path[MAXPATHLEN]; struct pf_ruleset ruleset; int refcnt; /* anchor rules */ int match; /* XXX: used for pfctl black magic */ }; RB_PROTOTYPE(pf_anchor_global, pf_anchor, entry_global, pf_anchor_compare); RB_PROTOTYPE(pf_anchor_node, pf_anchor, entry_node, pf_anchor_compare); #define PF_RESERVED_ANCHOR "_pf" #define PFR_TFLAG_PERSIST 0x00000001 #define PFR_TFLAG_CONST 0x00000002 #define PFR_TFLAG_ACTIVE 0x00000004 #define PFR_TFLAG_INACTIVE 0x00000008 #define PFR_TFLAG_REFERENCED 0x00000010 #define PFR_TFLAG_REFDANCHOR 0x00000020 #define PFR_TFLAG_COUNTERS 0x00000040 /* Adjust masks below when adding flags. */ #define PFR_TFLAG_USRMASK (PFR_TFLAG_PERSIST | \ PFR_TFLAG_CONST | \ PFR_TFLAG_COUNTERS) #define PFR_TFLAG_SETMASK (PFR_TFLAG_ACTIVE | \ PFR_TFLAG_INACTIVE | \ PFR_TFLAG_REFERENCED | \ PFR_TFLAG_REFDANCHOR) #define PFR_TFLAG_ALLMASK (PFR_TFLAG_PERSIST | \ PFR_TFLAG_CONST | \ PFR_TFLAG_ACTIVE | \ PFR_TFLAG_INACTIVE | \ PFR_TFLAG_REFERENCED | \ PFR_TFLAG_REFDANCHOR | \ PFR_TFLAG_COUNTERS) struct pf_anchor_stackframe; struct pfr_table { char pfrt_anchor[MAXPATHLEN]; char pfrt_name[PF_TABLE_NAME_SIZE]; u_int32_t pfrt_flags; u_int8_t pfrt_fback; }; enum { PFR_FB_NONE, PFR_FB_MATCH, PFR_FB_ADDED, PFR_FB_DELETED, PFR_FB_CHANGED, PFR_FB_CLEARED, PFR_FB_DUPLICATE, PFR_FB_NOTMATCH, PFR_FB_CONFLICT, PFR_FB_NOCOUNT, PFR_FB_MAX }; struct pfr_addr { union { struct in_addr _pfra_ip4addr; struct in6_addr _pfra_ip6addr; } pfra_u; u_int8_t pfra_af; u_int8_t pfra_net; u_int8_t pfra_not; u_int8_t pfra_fback; }; #define pfra_ip4addr pfra_u._pfra_ip4addr #define pfra_ip6addr pfra_u._pfra_ip6addr enum { PFR_DIR_IN, PFR_DIR_OUT, PFR_DIR_MAX }; enum { PFR_OP_BLOCK, PFR_OP_PASS, PFR_OP_ADDR_MAX, PFR_OP_TABLE_MAX }; #define PFR_OP_XPASS PFR_OP_ADDR_MAX struct pfr_astats { struct pfr_addr pfras_a; u_int64_t pfras_packets[PFR_DIR_MAX][PFR_OP_ADDR_MAX]; u_int64_t pfras_bytes[PFR_DIR_MAX][PFR_OP_ADDR_MAX]; long pfras_tzero; }; enum { PFR_REFCNT_RULE, PFR_REFCNT_ANCHOR, PFR_REFCNT_MAX }; struct pfr_tstats { struct pfr_table pfrts_t; u_int64_t pfrts_packets[PFR_DIR_MAX][PFR_OP_TABLE_MAX]; u_int64_t pfrts_bytes[PFR_DIR_MAX][PFR_OP_TABLE_MAX]; u_int64_t pfrts_match; u_int64_t pfrts_nomatch; long pfrts_tzero; int pfrts_cnt; int pfrts_refcnt[PFR_REFCNT_MAX]; }; #define pfrts_name pfrts_t.pfrt_name #define pfrts_flags pfrts_t.pfrt_flags #ifndef _SOCKADDR_UNION_DEFINED #define _SOCKADDR_UNION_DEFINED union sockaddr_union { struct sockaddr sa; struct sockaddr_in sin; struct sockaddr_in6 sin6; }; #endif /* _SOCKADDR_UNION_DEFINED */ struct pfr_kcounters { u_int64_t pfrkc_packets[PFR_DIR_MAX][PFR_OP_ADDR_MAX]; u_int64_t pfrkc_bytes[PFR_DIR_MAX][PFR_OP_ADDR_MAX]; }; SLIST_HEAD(pfr_kentryworkq, pfr_kentry); struct pfr_kentry { struct radix_node pfrke_node[2]; union sockaddr_union pfrke_sa; SLIST_ENTRY(pfr_kentry) pfrke_workq; struct pfr_kcounters *pfrke_counters; long pfrke_tzero; u_int8_t pfrke_af; u_int8_t pfrke_net; u_int8_t pfrke_not; u_int8_t pfrke_mark; }; SLIST_HEAD(pfr_ktableworkq, pfr_ktable); RB_HEAD(pfr_ktablehead, pfr_ktable); struct pfr_ktable { struct pfr_tstats pfrkt_ts; RB_ENTRY(pfr_ktable) pfrkt_tree; SLIST_ENTRY(pfr_ktable) pfrkt_workq; struct radix_node_head *pfrkt_ip4; struct radix_node_head *pfrkt_ip6; struct pfr_ktable *pfrkt_shadow; struct pfr_ktable *pfrkt_root; struct pf_ruleset *pfrkt_rs; long pfrkt_larg; int pfrkt_nflags; }; #define pfrkt_t pfrkt_ts.pfrts_t #define pfrkt_name pfrkt_t.pfrt_name #define pfrkt_anchor pfrkt_t.pfrt_anchor #define pfrkt_ruleset pfrkt_t.pfrt_ruleset #define pfrkt_flags pfrkt_t.pfrt_flags #define pfrkt_cnt pfrkt_ts.pfrts_cnt #define pfrkt_refcnt pfrkt_ts.pfrts_refcnt #define pfrkt_packets pfrkt_ts.pfrts_packets #define pfrkt_bytes pfrkt_ts.pfrts_bytes #define pfrkt_match pfrkt_ts.pfrts_match #define pfrkt_nomatch pfrkt_ts.pfrts_nomatch #define pfrkt_tzero pfrkt_ts.pfrts_tzero /* keep synced with pfi_kif, used in RB_FIND */ struct pfi_kif_cmp { char pfik_name[IFNAMSIZ]; }; struct pfi_kif { char pfik_name[IFNAMSIZ]; union { RB_ENTRY(pfi_kif) _pfik_tree; LIST_ENTRY(pfi_kif) _pfik_list; } _pfik_glue; #define pfik_tree _pfik_glue._pfik_tree #define pfik_list _pfik_glue._pfik_list u_int64_t pfik_packets[2][2][2]; u_int64_t pfik_bytes[2][2][2]; u_int32_t pfik_tzero; u_int pfik_flags; struct ifnet *pfik_ifp; struct ifg_group *pfik_group; u_int pfik_rulerefs; TAILQ_HEAD(, pfi_dynaddr) pfik_dynaddrs; }; #define PFI_IFLAG_REFS 0x0001 /* has state references */ #define PFI_IFLAG_SKIP 0x0100 /* skip filtering on interface */ struct pf_pdesc { struct { int done; uid_t uid; gid_t gid; } lookup; u_int64_t tot_len; /* Make Mickey money */ union { struct tcphdr *tcp; struct udphdr *udp; struct icmp *icmp; #ifdef INET6 struct icmp6_hdr *icmp6; #endif /* INET6 */ void *any; } hdr; struct pf_rule *nat_rule; /* nat/rdr rule applied to packet */ struct pf_addr *src; /* src address */ struct pf_addr *dst; /* dst address */ u_int16_t *sport; u_int16_t *dport; struct pf_mtag *pf_mtag; u_int32_t p_len; /* total length of payload */ u_int16_t *ip_sum; u_int16_t *proto_sum; u_int16_t flags; /* Let SCRUB trigger behavior in * state code. Easier than tags */ #define PFDESC_TCP_NORM 0x0001 /* TCP shall be statefully scrubbed */ #define PFDESC_IP_REAS 0x0002 /* IP frags would've been reassembled */ sa_family_t af; u_int8_t proto; u_int8_t tos; u_int8_t dir; /* direction */ u_int8_t sidx; /* key index for source */ u_int8_t didx; /* key index for destination */ }; /* flags for RDR options */ #define PF_DPORT_RANGE 0x01 /* Dest port uses range */ #define PF_RPORT_RANGE 0x02 /* RDR'ed port uses range */ /* UDP state enumeration */ #define PFUDPS_NO_TRAFFIC 0 #define PFUDPS_SINGLE 1 #define PFUDPS_MULTIPLE 2 #define PFUDPS_NSTATES 3 /* number of state levels */ #define PFUDPS_NAMES { \ "NO_TRAFFIC", \ "SINGLE", \ "MULTIPLE", \ NULL \ } /* Other protocol state enumeration */ #define PFOTHERS_NO_TRAFFIC 0 #define PFOTHERS_SINGLE 1 #define PFOTHERS_MULTIPLE 2 #define PFOTHERS_NSTATES 3 /* number of state levels */ #define PFOTHERS_NAMES { \ "NO_TRAFFIC", \ "SINGLE", \ "MULTIPLE", \ NULL \ } #define ACTION_SET(a, x) \ do { \ if ((a) != NULL) \ *(a) = (x); \ } while (0) #define REASON_SET(a, x) \ do { \ if ((a) != NULL) \ *(a) = (x); \ if (x < PFRES_MAX) \ counter_u64_add(V_pf_status.counters[x], 1); \ } while (0) struct pf_kstatus { counter_u64_t counters[PFRES_MAX]; /* reason for passing/dropping */ counter_u64_t lcounters[LCNT_MAX]; /* limit counters */ counter_u64_t fcounters[FCNT_MAX]; /* state operation counters */ counter_u64_t scounters[SCNT_MAX]; /* src_node operation counters */ uint32_t states; uint32_t src_nodes; uint32_t running; uint32_t since; uint32_t debug; uint32_t hostid; char ifname[IFNAMSIZ]; uint8_t pf_chksum[PF_MD5_DIGEST_LENGTH]; }; struct pf_divert { union { struct in_addr ipv4; struct in6_addr ipv6; } addr; u_int16_t port; }; #define PFFRAG_FRENT_HIWAT 5000 /* Number of fragment entries */ #define PFR_KENTRY_HIWAT 200000 /* Number of table entries */ /* * ioctl parameter structures */ struct pfioc_pooladdr { u_int32_t action; u_int32_t ticket; u_int32_t nr; u_int32_t r_num; u_int8_t r_action; u_int8_t r_last; u_int8_t af; char anchor[MAXPATHLEN]; struct pf_pooladdr addr; }; struct pfioc_rule { u_int32_t action; u_int32_t ticket; u_int32_t pool_ticket; u_int32_t nr; char anchor[MAXPATHLEN]; char anchor_call[MAXPATHLEN]; struct pf_rule rule; }; struct pfioc_natlook { struct pf_addr saddr; struct pf_addr daddr; struct pf_addr rsaddr; struct pf_addr rdaddr; u_int16_t sport; u_int16_t dport; u_int16_t rsport; u_int16_t rdport; sa_family_t af; u_int8_t proto; u_int8_t direction; }; struct pfioc_state { struct pfsync_state state; }; struct pfioc_src_node_kill { sa_family_t psnk_af; struct pf_rule_addr psnk_src; struct pf_rule_addr psnk_dst; u_int psnk_killed; }; struct pfioc_state_kill { struct pf_state_cmp psk_pfcmp; sa_family_t psk_af; int psk_proto; struct pf_rule_addr psk_src; struct pf_rule_addr psk_dst; char psk_ifname[IFNAMSIZ]; char psk_label[PF_RULE_LABEL_SIZE]; u_int psk_killed; }; struct pfioc_states { int ps_len; union { caddr_t psu_buf; struct pfsync_state *psu_states; } ps_u; #define ps_buf ps_u.psu_buf #define ps_states ps_u.psu_states }; struct pfioc_src_nodes { int psn_len; union { caddr_t psu_buf; struct pf_src_node *psu_src_nodes; } psn_u; #define psn_buf psn_u.psu_buf #define psn_src_nodes psn_u.psu_src_nodes }; struct pfioc_if { char ifname[IFNAMSIZ]; }; struct pfioc_tm { int timeout; int seconds; }; struct pfioc_limit { int index; unsigned limit; }; struct pfioc_altq { u_int32_t action; u_int32_t ticket; u_int32_t nr; struct pf_altq altq; }; struct pfioc_qstats { u_int32_t ticket; u_int32_t nr; void *buf; int nbytes; u_int8_t scheduler; }; struct pfioc_ruleset { u_int32_t nr; char path[MAXPATHLEN]; char name[PF_ANCHOR_NAME_SIZE]; }; #define PF_RULESET_ALTQ (PF_RULESET_MAX) #define PF_RULESET_TABLE (PF_RULESET_MAX+1) struct pfioc_trans { int size; /* number of elements */ int esize; /* size of each element in bytes */ struct pfioc_trans_e { int rs_num; char anchor[MAXPATHLEN]; u_int32_t ticket; } *array; }; #define PFR_FLAG_ATOMIC 0x00000001 /* unused */ #define PFR_FLAG_DUMMY 0x00000002 #define PFR_FLAG_FEEDBACK 0x00000004 #define PFR_FLAG_CLSTATS 0x00000008 #define PFR_FLAG_ADDRSTOO 0x00000010 #define PFR_FLAG_REPLACE 0x00000020 #define PFR_FLAG_ALLRSETS 0x00000040 #define PFR_FLAG_ALLMASK 0x0000007F #ifdef _KERNEL #define PFR_FLAG_USERIOCTL 0x10000000 #endif struct pfioc_table { struct pfr_table pfrio_table; void *pfrio_buffer; int pfrio_esize; int pfrio_size; int pfrio_size2; int pfrio_nadd; int pfrio_ndel; int pfrio_nchange; int pfrio_flags; u_int32_t pfrio_ticket; }; #define pfrio_exists pfrio_nadd #define pfrio_nzero pfrio_nadd #define pfrio_nmatch pfrio_nadd #define pfrio_naddr pfrio_size2 #define pfrio_setflag pfrio_size2 #define pfrio_clrflag pfrio_nadd struct pfioc_iface { char pfiio_name[IFNAMSIZ]; void *pfiio_buffer; int pfiio_esize; int pfiio_size; int pfiio_nzero; int pfiio_flags; }; /* * ioctl operations */ #define DIOCSTART _IO ('D', 1) #define DIOCSTOP _IO ('D', 2) #define DIOCADDRULE _IOWR('D', 4, struct pfioc_rule) #define DIOCGETRULES _IOWR('D', 6, struct pfioc_rule) #define DIOCGETRULE _IOWR('D', 7, struct pfioc_rule) /* XXX cut 8 - 17 */ #define DIOCCLRSTATES _IOWR('D', 18, struct pfioc_state_kill) #define DIOCGETSTATE _IOWR('D', 19, struct pfioc_state) #define DIOCSETSTATUSIF _IOWR('D', 20, struct pfioc_if) #define DIOCGETSTATUS _IOWR('D', 21, struct pf_status) #define DIOCCLRSTATUS _IO ('D', 22) #define DIOCNATLOOK _IOWR('D', 23, struct pfioc_natlook) #define DIOCSETDEBUG _IOWR('D', 24, u_int32_t) #define DIOCGETSTATES _IOWR('D', 25, struct pfioc_states) #define DIOCCHANGERULE _IOWR('D', 26, struct pfioc_rule) /* XXX cut 26 - 28 */ #define DIOCSETTIMEOUT _IOWR('D', 29, struct pfioc_tm) #define DIOCGETTIMEOUT _IOWR('D', 30, struct pfioc_tm) #define DIOCADDSTATE _IOWR('D', 37, struct pfioc_state) #define DIOCCLRRULECTRS _IO ('D', 38) #define DIOCGETLIMIT _IOWR('D', 39, struct pfioc_limit) #define DIOCSETLIMIT _IOWR('D', 40, struct pfioc_limit) #define DIOCKILLSTATES _IOWR('D', 41, struct pfioc_state_kill) #define DIOCSTARTALTQ _IO ('D', 42) #define DIOCSTOPALTQ _IO ('D', 43) #define DIOCADDALTQ _IOWR('D', 45, struct pfioc_altq) #define DIOCGETALTQS _IOWR('D', 47, struct pfioc_altq) #define DIOCGETALTQ _IOWR('D', 48, struct pfioc_altq) #define DIOCCHANGEALTQ _IOWR('D', 49, struct pfioc_altq) #define DIOCGETQSTATS _IOWR('D', 50, struct pfioc_qstats) #define DIOCBEGINADDRS _IOWR('D', 51, struct pfioc_pooladdr) #define DIOCADDADDR _IOWR('D', 52, struct pfioc_pooladdr) #define DIOCGETADDRS _IOWR('D', 53, struct pfioc_pooladdr) #define DIOCGETADDR _IOWR('D', 54, struct pfioc_pooladdr) #define DIOCCHANGEADDR _IOWR('D', 55, struct pfioc_pooladdr) /* XXX cut 55 - 57 */ #define DIOCGETRULESETS _IOWR('D', 58, struct pfioc_ruleset) #define DIOCGETRULESET _IOWR('D', 59, struct pfioc_ruleset) #define DIOCRCLRTABLES _IOWR('D', 60, struct pfioc_table) #define DIOCRADDTABLES _IOWR('D', 61, struct pfioc_table) #define DIOCRDELTABLES _IOWR('D', 62, struct pfioc_table) #define DIOCRGETTABLES _IOWR('D', 63, struct pfioc_table) #define DIOCRGETTSTATS _IOWR('D', 64, struct pfioc_table) #define DIOCRCLRTSTATS _IOWR('D', 65, struct pfioc_table) #define DIOCRCLRADDRS _IOWR('D', 66, struct pfioc_table) #define DIOCRADDADDRS _IOWR('D', 67, struct pfioc_table) #define DIOCRDELADDRS _IOWR('D', 68, struct pfioc_table) #define DIOCRSETADDRS _IOWR('D', 69, struct pfioc_table) #define DIOCRGETADDRS _IOWR('D', 70, struct pfioc_table) #define DIOCRGETASTATS _IOWR('D', 71, struct pfioc_table) #define DIOCRCLRASTATS _IOWR('D', 72, struct pfioc_table) #define DIOCRTSTADDRS _IOWR('D', 73, struct pfioc_table) #define DIOCRSETTFLAGS _IOWR('D', 74, struct pfioc_table) #define DIOCRINADEFINE _IOWR('D', 77, struct pfioc_table) #define DIOCOSFPFLUSH _IO('D', 78) #define DIOCOSFPADD _IOWR('D', 79, struct pf_osfp_ioctl) #define DIOCOSFPGET _IOWR('D', 80, struct pf_osfp_ioctl) #define DIOCXBEGIN _IOWR('D', 81, struct pfioc_trans) #define DIOCXCOMMIT _IOWR('D', 82, struct pfioc_trans) #define DIOCXROLLBACK _IOWR('D', 83, struct pfioc_trans) #define DIOCGETSRCNODES _IOWR('D', 84, struct pfioc_src_nodes) #define DIOCCLRSRCNODES _IO('D', 85) #define DIOCSETHOSTID _IOWR('D', 86, u_int32_t) #define DIOCIGETIFACES _IOWR('D', 87, struct pfioc_iface) #define DIOCSETIFFLAG _IOWR('D', 89, struct pfioc_iface) #define DIOCCLRIFFLAG _IOWR('D', 90, struct pfioc_iface) #define DIOCKILLSRCNODES _IOWR('D', 91, struct pfioc_src_node_kill) struct pf_ifspeed { char ifname[IFNAMSIZ]; u_int32_t baudrate; }; #define DIOCGIFSPEED _IOWR('D', 92, struct pf_ifspeed) #ifdef _KERNEL LIST_HEAD(pf_src_node_list, pf_src_node); struct pf_srchash { struct pf_src_node_list nodes; struct mtx lock; }; struct pf_keyhash { LIST_HEAD(, pf_state_key) keys; struct mtx lock; }; struct pf_idhash { LIST_HEAD(, pf_state) states; struct mtx lock; }; extern u_long pf_hashmask; extern u_long pf_srchashmask; #define PF_HASHSIZ (32768) VNET_DECLARE(struct pf_keyhash *, pf_keyhash); VNET_DECLARE(struct pf_idhash *, pf_idhash); #define V_pf_keyhash VNET(pf_keyhash) #define V_pf_idhash VNET(pf_idhash) VNET_DECLARE(struct pf_srchash *, pf_srchash); #define V_pf_srchash VNET(pf_srchash) #define PF_IDHASH(s) (be64toh((s)->id) % (pf_hashmask + 1)) VNET_DECLARE(void *, pf_swi_cookie); #define V_pf_swi_cookie VNET(pf_swi_cookie) VNET_DECLARE(uint64_t, pf_stateid[MAXCPU]); #define V_pf_stateid VNET(pf_stateid) TAILQ_HEAD(pf_altqqueue, pf_altq); VNET_DECLARE(struct pf_altqqueue, pf_altqs[2]); #define V_pf_altqs VNET(pf_altqs) VNET_DECLARE(struct pf_palist, pf_pabuf); #define V_pf_pabuf VNET(pf_pabuf) VNET_DECLARE(u_int32_t, ticket_altqs_active); #define V_ticket_altqs_active VNET(ticket_altqs_active) VNET_DECLARE(u_int32_t, ticket_altqs_inactive); #define V_ticket_altqs_inactive VNET(ticket_altqs_inactive) VNET_DECLARE(int, altqs_inactive_open); #define V_altqs_inactive_open VNET(altqs_inactive_open) VNET_DECLARE(u_int32_t, ticket_pabuf); #define V_ticket_pabuf VNET(ticket_pabuf) VNET_DECLARE(struct pf_altqqueue *, pf_altqs_active); #define V_pf_altqs_active VNET(pf_altqs_active) VNET_DECLARE(struct pf_altqqueue *, pf_altqs_inactive); #define V_pf_altqs_inactive VNET(pf_altqs_inactive) VNET_DECLARE(struct pf_rulequeue, pf_unlinked_rules); #define V_pf_unlinked_rules VNET(pf_unlinked_rules) void pf_initialize(void); void pf_mtag_initialize(void); void pf_mtag_cleanup(void); void pf_cleanup(void); struct pf_mtag *pf_get_mtag(struct mbuf *); extern void pf_calc_skip_steps(struct pf_rulequeue *); #ifdef ALTQ extern void pf_altq_ifnet_event(struct ifnet *, int); #endif VNET_DECLARE(uma_zone_t, pf_state_z); #define V_pf_state_z VNET(pf_state_z) VNET_DECLARE(uma_zone_t, pf_state_key_z); #define V_pf_state_key_z VNET(pf_state_key_z) VNET_DECLARE(uma_zone_t, pf_state_scrub_z); #define V_pf_state_scrub_z VNET(pf_state_scrub_z) extern void pf_purge_thread(void *); extern void pf_intr(void *); extern void pf_purge_expired_src_nodes(void); extern int pf_unlink_state(struct pf_state *, u_int); #define PF_ENTER_LOCKED 0x00000001 #define PF_RETURN_LOCKED 0x00000002 extern int pf_state_insert(struct pfi_kif *, struct pf_state_key *, struct pf_state_key *, struct pf_state *); extern void pf_free_state(struct pf_state *); static __inline void pf_ref_state(struct pf_state *s) { refcount_acquire(&s->refs); } static __inline int pf_release_state(struct pf_state *s) { if (refcount_release(&s->refs)) { pf_free_state(s); return (1); } else return (0); } extern struct pf_state *pf_find_state_byid(uint64_t, uint32_t); extern struct pf_state *pf_find_state_all(struct pf_state_key_cmp *, u_int, int *); extern struct pf_src_node *pf_find_src_node(struct pf_addr *, struct pf_rule *, sa_family_t, int); extern void pf_unlink_src_node(struct pf_src_node *); extern u_int pf_free_src_nodes(struct pf_src_node_list *); extern void pf_print_state(struct pf_state *); extern void pf_print_flags(u_int8_t); extern u_int16_t pf_cksum_fixup(u_int16_t, u_int16_t, u_int16_t, u_int8_t); VNET_DECLARE(struct ifnet *, sync_ifp); #define V_sync_ifp VNET(sync_ifp); VNET_DECLARE(struct pf_rule, pf_default_rule); #define V_pf_default_rule VNET(pf_default_rule) extern void pf_addrcpy(struct pf_addr *, struct pf_addr *, u_int8_t); void pf_free_rule(struct pf_rule *); #ifdef INET int pf_test(int, struct ifnet *, struct mbuf **, struct inpcb *); int pf_normalize_ip(struct mbuf **, int, struct pfi_kif *, u_short *, struct pf_pdesc *); #endif /* INET */ #ifdef INET6 int pf_test6(int, struct ifnet *, struct mbuf **, struct inpcb *); int pf_normalize_ip6(struct mbuf **, int, struct pfi_kif *, u_short *, struct pf_pdesc *); void pf_poolmask(struct pf_addr *, struct pf_addr*, struct pf_addr *, struct pf_addr *, u_int8_t); void pf_addr_inc(struct pf_addr *, sa_family_t); int pf_refragment6(struct ifnet *, struct mbuf **, struct m_tag *); #endif /* INET6 */ u_int32_t pf_new_isn(struct pf_state *); void *pf_pull_hdr(struct mbuf *, int, void *, int, u_short *, u_short *, sa_family_t); void pf_change_a(void *, u_int16_t *, u_int32_t, u_int8_t); void pf_send_deferred_syn(struct pf_state *); int pf_match_addr(u_int8_t, struct pf_addr *, struct pf_addr *, struct pf_addr *, sa_family_t); int pf_match_addr_range(struct pf_addr *, struct pf_addr *, struct pf_addr *, sa_family_t); int pf_match_port(u_int8_t, u_int16_t, u_int16_t, u_int16_t); void pf_normalize_init(void); void pf_normalize_cleanup(void); int pf_normalize_tcp(int, struct pfi_kif *, struct mbuf *, int, int, void *, struct pf_pdesc *); void pf_normalize_tcp_cleanup(struct pf_state *); int pf_normalize_tcp_init(struct mbuf *, int, struct pf_pdesc *, struct tcphdr *, struct pf_state_peer *, struct pf_state_peer *); int pf_normalize_tcp_stateful(struct mbuf *, int, struct pf_pdesc *, u_short *, struct tcphdr *, struct pf_state *, struct pf_state_peer *, struct pf_state_peer *, int *); u_int32_t pf_state_expires(const struct pf_state *); void pf_purge_expired_fragments(void); int pf_routable(struct pf_addr *addr, sa_family_t af, struct pfi_kif *, int); int pf_socket_lookup(int, struct pf_pdesc *, struct mbuf *); struct pf_state_key *pf_alloc_state_key(int); void pfr_initialize(void); void pfr_cleanup(void); int pfr_match_addr(struct pfr_ktable *, struct pf_addr *, sa_family_t); void pfr_update_stats(struct pfr_ktable *, struct pf_addr *, sa_family_t, u_int64_t, int, int, int); int pfr_pool_get(struct pfr_ktable *, int *, struct pf_addr *, sa_family_t); void pfr_dynaddr_update(struct pfr_ktable *, struct pfi_dynaddr *); struct pfr_ktable * pfr_attach_table(struct pf_ruleset *, char *); void pfr_detach_table(struct pfr_ktable *); int pfr_clr_tables(struct pfr_table *, int *, int); int pfr_add_tables(struct pfr_table *, int, int *, int); int pfr_del_tables(struct pfr_table *, int, int *, int); int pfr_get_tables(struct pfr_table *, struct pfr_table *, int *, int); int pfr_get_tstats(struct pfr_table *, struct pfr_tstats *, int *, int); int pfr_clr_tstats(struct pfr_table *, int, int *, int); int pfr_set_tflags(struct pfr_table *, int, int, int, int *, int *, int); int pfr_clr_addrs(struct pfr_table *, int *, int); int pfr_insert_kentry(struct pfr_ktable *, struct pfr_addr *, long); int pfr_add_addrs(struct pfr_table *, struct pfr_addr *, int, int *, int); int pfr_del_addrs(struct pfr_table *, struct pfr_addr *, int, int *, int); int pfr_set_addrs(struct pfr_table *, struct pfr_addr *, int, int *, int *, int *, int *, int, u_int32_t); int pfr_get_addrs(struct pfr_table *, struct pfr_addr *, int *, int); int pfr_get_astats(struct pfr_table *, struct pfr_astats *, int *, int); int pfr_clr_astats(struct pfr_table *, struct pfr_addr *, int, int *, int); int pfr_tst_addrs(struct pfr_table *, struct pfr_addr *, int, int *, int); int pfr_ina_begin(struct pfr_table *, u_int32_t *, int *, int); int pfr_ina_rollback(struct pfr_table *, u_int32_t, int *, int); int pfr_ina_commit(struct pfr_table *, u_int32_t, int *, int *, int); int pfr_ina_define(struct pfr_table *, struct pfr_addr *, int, int *, int *, u_int32_t, int); MALLOC_DECLARE(PFI_MTYPE); VNET_DECLARE(struct pfi_kif *, pfi_all); #define V_pfi_all VNET(pfi_all) void pfi_initialize(void); void pfi_cleanup(void); void pfi_kif_ref(struct pfi_kif *); void pfi_kif_unref(struct pfi_kif *); struct pfi_kif *pfi_kif_find(const char *); struct pfi_kif *pfi_kif_attach(struct pfi_kif *, const char *); int pfi_kif_match(struct pfi_kif *, struct pfi_kif *); void pfi_kif_purge(void); int pfi_match_addr(struct pfi_dynaddr *, struct pf_addr *, sa_family_t); int pfi_dynaddr_setup(struct pf_addr_wrap *, sa_family_t); void pfi_dynaddr_remove(struct pfi_dynaddr *); void pfi_dynaddr_copyout(struct pf_addr_wrap *); void pfi_update_status(const char *, struct pf_status *); void pfi_get_ifaces(const char *, struct pfi_kif *, int *); int pfi_set_flags(const char *, int); int pfi_clear_flags(const char *, int); int pf_match_tag(struct mbuf *, struct pf_rule *, int *, int); int pf_tag_packet(struct mbuf *, struct pf_pdesc *, int); int pf_addr_cmp(struct pf_addr *, struct pf_addr *, sa_family_t); void pf_qid2qname(u_int32_t, char *); VNET_DECLARE(struct pf_kstatus, pf_status); #define V_pf_status VNET(pf_status) struct pf_limit { uma_zone_t zone; u_int limit; }; VNET_DECLARE(struct pf_limit, pf_limits[PF_LIMIT_MAX]); #define V_pf_limits VNET(pf_limits) #endif /* _KERNEL */ #ifdef _KERNEL VNET_DECLARE(struct pf_anchor_global, pf_anchors); #define V_pf_anchors VNET(pf_anchors) VNET_DECLARE(struct pf_anchor, pf_main_anchor); #define V_pf_main_anchor VNET(pf_main_anchor) #define pf_main_ruleset V_pf_main_anchor.ruleset #endif /* these ruleset functions can be linked into userland programs (pfctl) */ int pf_get_ruleset_number(u_int8_t); void pf_init_ruleset(struct pf_ruleset *); int pf_anchor_setup(struct pf_rule *, const struct pf_ruleset *, const char *); int pf_anchor_copyout(const struct pf_ruleset *, const struct pf_rule *, struct pfioc_rule *); void pf_anchor_remove(struct pf_rule *); void pf_remove_if_empty_ruleset(struct pf_ruleset *); struct pf_ruleset *pf_find_ruleset(const char *); struct pf_ruleset *pf_find_or_create_ruleset(const char *); void pf_rs_initialize(void); /* The fingerprint functions can be linked into userland programs (tcpdump) */ int pf_osfp_add(struct pf_osfp_ioctl *); #ifdef _KERNEL struct pf_osfp_enlist * pf_osfp_fingerprint(struct pf_pdesc *, struct mbuf *, int, const struct tcphdr *); #endif /* _KERNEL */ void pf_osfp_flush(void); int pf_osfp_get(struct pf_osfp_ioctl *); int pf_osfp_match(struct pf_osfp_enlist *, pf_osfp_t); #ifdef _KERNEL void pf_print_host(struct pf_addr *, u_int16_t, u_int8_t); void pf_step_into_anchor(struct pf_anchor_stackframe *, int *, struct pf_ruleset **, int, struct pf_rule **, struct pf_rule **, int *); int pf_step_out_of_anchor(struct pf_anchor_stackframe *, int *, struct pf_ruleset **, int, struct pf_rule **, struct pf_rule **, int *); int pf_map_addr(u_int8_t, struct pf_rule *, struct pf_addr *, struct pf_addr *, struct pf_addr *, struct pf_src_node **); struct pf_rule *pf_get_translation(struct pf_pdesc *, struct mbuf *, int, int, struct pfi_kif *, struct pf_src_node **, struct pf_state_key **, struct pf_state_key **, struct pf_addr *, struct pf_addr *, uint16_t, uint16_t, struct pf_anchor_stackframe *); struct pf_state_key *pf_state_key_setup(struct pf_pdesc *, struct pf_addr *, struct pf_addr *, u_int16_t, u_int16_t); struct pf_state_key *pf_state_key_clone(struct pf_state_key *); #endif /* _KERNEL */ #endif /* _NET_PFVAR_H_ */ Index: user/ngie/more-tests/sys/net/route.c =================================================================== --- user/ngie/more-tests/sys/net/route.c (revision 281584) +++ user/ngie/more-tests/sys/net/route.c (revision 281585) @@ -1,2024 +1,2023 @@ /*- * Copyright (c) 1980, 1986, 1991, 1993 * The Regents of the University of California. All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 4. Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * @(#)route.c 8.3.1.1 (Berkeley) 2/23/95 * $FreeBSD$ */ /************************************************************************ * Note: In this file a 'fib' is a "forwarding information base" * * Which is the new name for an in kernel routing (next hop) table. * ***********************************************************************/ #include "opt_inet.h" #include "opt_inet6.h" #include "opt_route.h" #include "opt_sctp.h" #include "opt_mrouting.h" #include "opt_mpath.h" #include #include -#include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #ifdef RADIX_MPATH #include #endif #include #include #include #define RT_MAXFIBS UINT16_MAX /* Kernel config default option. */ #ifdef ROUTETABLES #if ROUTETABLES <= 0 #error "ROUTETABLES defined too low" #endif #if ROUTETABLES > RT_MAXFIBS #error "ROUTETABLES defined too big" #endif #define RT_NUMFIBS ROUTETABLES #endif /* ROUTETABLES */ /* Initialize to default if not otherwise set. */ #ifndef RT_NUMFIBS #define RT_NUMFIBS 1 #endif #if defined(INET) || defined(INET6) #ifdef SCTP extern void sctp_addr_change(struct ifaddr *ifa, int cmd); #endif /* SCTP */ #endif /* This is read-only.. */ u_int rt_numfibs = RT_NUMFIBS; SYSCTL_UINT(_net, OID_AUTO, fibs, CTLFLAG_RDTUN, &rt_numfibs, 0, ""); /* * By default add routes to all fibs for new interfaces. * Once this is set to 0 then only allocate routes on interface * changes for the FIB of the caller when adding a new set of addresses * to an interface. XXX this is a shotgun aproach to a problem that needs * a more fine grained solution.. that will come. * XXX also has the problems getting the FIB from curthread which will not * always work given the fib can be overridden and prefixes can be added * from the network stack context. */ VNET_DEFINE(u_int, rt_add_addr_allfibs) = 1; SYSCTL_UINT(_net, OID_AUTO, add_addr_allfibs, CTLFLAG_RWTUN | CTLFLAG_VNET, &VNET_NAME(rt_add_addr_allfibs), 0, ""); VNET_DEFINE(struct rtstat, rtstat); #define V_rtstat VNET(rtstat) VNET_DEFINE(struct radix_node_head *, rt_tables); #define V_rt_tables VNET(rt_tables) VNET_DEFINE(int, rttrash); /* routes not in table but not freed */ #define V_rttrash VNET(rttrash) /* * Convert a 'struct radix_node *' to a 'struct rtentry *'. * The operation can be done safely (in this code) because a * 'struct rtentry' starts with two 'struct radix_node''s, the first * one representing leaf nodes in the routing tree, which is * what the code in radix.c passes us as a 'struct radix_node'. * * But because there are a lot of assumptions in this conversion, * do not cast explicitly, but always use the macro below. */ #define RNTORT(p) ((struct rtentry *)(p)) static VNET_DEFINE(uma_zone_t, rtzone); /* Routing table UMA zone. */ #define V_rtzone VNET(rtzone) static int rtrequest1_fib_change(struct radix_node_head *, struct rt_addrinfo *, struct rtentry **, u_int); static void rt_setmetrics(const struct rt_addrinfo *, struct rtentry *); struct if_mtuinfo { struct ifnet *ifp; int mtu; }; static int if_updatemtu_cb(struct radix_node *, void *); /* * handler for net.my_fibnum */ static int sysctl_my_fibnum(SYSCTL_HANDLER_ARGS) { int fibnum; int error; fibnum = curthread->td_proc->p_fibnum; error = sysctl_handle_int(oidp, &fibnum, 0, req); return (error); } SYSCTL_PROC(_net, OID_AUTO, my_fibnum, CTLTYPE_INT|CTLFLAG_RD, NULL, 0, &sysctl_my_fibnum, "I", "default FIB of caller"); static __inline struct radix_node_head ** rt_tables_get_rnh_ptr(int table, int fam) { struct radix_node_head **rnh; KASSERT(table >= 0 && table < rt_numfibs, ("%s: table out of bounds.", __func__)); KASSERT(fam >= 0 && fam < (AF_MAX+1), ("%s: fam out of bounds.", __func__)); /* rnh is [fib=0][af=0]. */ rnh = (struct radix_node_head **)V_rt_tables; /* Get the offset to the requested table and fam. */ rnh += table * (AF_MAX+1) + fam; return (rnh); } struct radix_node_head * rt_tables_get_rnh(int table, int fam) { return (*rt_tables_get_rnh_ptr(table, fam)); } /* * route initialization must occur before ip6_init2(), which happenas at * SI_ORDER_MIDDLE. */ static void route_init(void) { /* whack the tunable ints into line. */ if (rt_numfibs > RT_MAXFIBS) rt_numfibs = RT_MAXFIBS; if (rt_numfibs == 0) rt_numfibs = 1; } SYSINIT(route_init, SI_SUB_PROTO_DOMAIN, SI_ORDER_THIRD, route_init, 0); static int rtentry_zinit(void *mem, int size, int how) { struct rtentry *rt = mem; rt->rt_pksent = counter_u64_alloc(how); if (rt->rt_pksent == NULL) return (ENOMEM); RT_LOCK_INIT(rt); return (0); } static void rtentry_zfini(void *mem, int size) { struct rtentry *rt = mem; RT_LOCK_DESTROY(rt); counter_u64_free(rt->rt_pksent); } static int rtentry_ctor(void *mem, int size, void *arg, int how) { struct rtentry *rt = mem; bzero(rt, offsetof(struct rtentry, rt_endzero)); counter_u64_zero(rt->rt_pksent); return (0); } static void rtentry_dtor(void *mem, int size, void *arg) { struct rtentry *rt = mem; RT_UNLOCK_COND(rt); } static void vnet_route_init(const void *unused __unused) { struct domain *dom; struct radix_node_head **rnh; int table; int fam; V_rt_tables = malloc(rt_numfibs * (AF_MAX+1) * sizeof(struct radix_node_head *), M_RTABLE, M_WAITOK|M_ZERO); V_rtzone = uma_zcreate("rtentry", sizeof(struct rtentry), rtentry_ctor, rtentry_dtor, rtentry_zinit, rtentry_zfini, UMA_ALIGN_PTR, 0); for (dom = domains; dom; dom = dom->dom_next) { if (dom->dom_rtattach == NULL) continue; for (table = 0; table < rt_numfibs; table++) { fam = dom->dom_family; if (table != 0 && fam != AF_INET6 && fam != AF_INET) break; rnh = rt_tables_get_rnh_ptr(table, fam); if (rnh == NULL) panic("%s: rnh NULL", __func__); dom->dom_rtattach((void **)rnh, 0); } } } VNET_SYSINIT(vnet_route_init, SI_SUB_PROTO_DOMAIN, SI_ORDER_FOURTH, vnet_route_init, 0); #ifdef VIMAGE static void vnet_route_uninit(const void *unused __unused) { int table; int fam; struct domain *dom; struct radix_node_head **rnh; for (dom = domains; dom; dom = dom->dom_next) { if (dom->dom_rtdetach == NULL) continue; for (table = 0; table < rt_numfibs; table++) { fam = dom->dom_family; if (table != 0 && fam != AF_INET6 && fam != AF_INET) break; rnh = rt_tables_get_rnh_ptr(table, fam); if (rnh == NULL) panic("%s: rnh NULL", __func__); dom->dom_rtdetach((void **)rnh, 0); } } free(V_rt_tables, M_RTABLE); uma_zdestroy(V_rtzone); } VNET_SYSUNINIT(vnet_route_uninit, SI_SUB_PROTO_DOMAIN, SI_ORDER_THIRD, vnet_route_uninit, 0); #endif #ifndef _SYS_SYSPROTO_H_ struct setfib_args { int fibnum; }; #endif int sys_setfib(struct thread *td, struct setfib_args *uap) { if (uap->fibnum < 0 || uap->fibnum >= rt_numfibs) return EINVAL; td->td_proc->p_fibnum = uap->fibnum; return (0); } /* * Packet routing routines. */ void rtalloc(struct route *ro) { rtalloc_ign_fib(ro, 0UL, RT_DEFAULT_FIB); } void rtalloc_fib(struct route *ro, u_int fibnum) { rtalloc_ign_fib(ro, 0UL, fibnum); } void rtalloc_ign(struct route *ro, u_long ignore) { struct rtentry *rt; if ((rt = ro->ro_rt) != NULL) { if (rt->rt_ifp != NULL && rt->rt_flags & RTF_UP) return; RTFREE(rt); ro->ro_rt = NULL; } ro->ro_rt = rtalloc1_fib(&ro->ro_dst, 1, ignore, RT_DEFAULT_FIB); if (ro->ro_rt) RT_UNLOCK(ro->ro_rt); } void rtalloc_ign_fib(struct route *ro, u_long ignore, u_int fibnum) { struct rtentry *rt; if ((rt = ro->ro_rt) != NULL) { if (rt->rt_ifp != NULL && rt->rt_flags & RTF_UP) return; RTFREE(rt); ro->ro_rt = NULL; } ro->ro_rt = rtalloc1_fib(&ro->ro_dst, 1, ignore, fibnum); if (ro->ro_rt) RT_UNLOCK(ro->ro_rt); } /* * Look up the route that matches the address given * Or, at least try.. Create a cloned route if needed. * * The returned route, if any, is locked. */ struct rtentry * rtalloc1(struct sockaddr *dst, int report, u_long ignflags) { return (rtalloc1_fib(dst, report, ignflags, RT_DEFAULT_FIB)); } struct rtentry * rtalloc1_fib(struct sockaddr *dst, int report, u_long ignflags, u_int fibnum) { struct radix_node_head *rnh; struct radix_node *rn; struct rtentry *newrt; struct rt_addrinfo info; int err = 0, msgtype = RTM_MISS; int needlock; KASSERT((fibnum < rt_numfibs), ("rtalloc1_fib: bad fibnum")); rnh = rt_tables_get_rnh(fibnum, dst->sa_family); newrt = NULL; if (rnh == NULL) goto miss; /* * Look up the address in the table for that Address Family */ needlock = !(ignflags & RTF_RNH_LOCKED); if (needlock) RADIX_NODE_HEAD_RLOCK(rnh); #ifdef INVARIANTS else RADIX_NODE_HEAD_LOCK_ASSERT(rnh); #endif rn = rnh->rnh_matchaddr(dst, rnh); if (rn && ((rn->rn_flags & RNF_ROOT) == 0)) { newrt = RNTORT(rn); RT_LOCK(newrt); RT_ADDREF(newrt); if (needlock) RADIX_NODE_HEAD_RUNLOCK(rnh); goto done; } else if (needlock) RADIX_NODE_HEAD_RUNLOCK(rnh); /* * Either we hit the root or couldn't find any match, * Which basically means * "caint get there frm here" */ miss: V_rtstat.rts_unreach++; if (report) { /* * If required, report the failure to the supervising * Authorities. * For a delete, this is not an error. (report == 0) */ bzero(&info, sizeof(info)); info.rti_info[RTAX_DST] = dst; rt_missmsg_fib(msgtype, &info, 0, err, fibnum); } done: if (newrt) RT_LOCK_ASSERT(newrt); return (newrt); } /* * Remove a reference count from an rtentry. * If the count gets low enough, take it out of the routing table */ void rtfree(struct rtentry *rt) { struct radix_node_head *rnh; KASSERT(rt != NULL,("%s: NULL rt", __func__)); rnh = rt_tables_get_rnh(rt->rt_fibnum, rt_key(rt)->sa_family); KASSERT(rnh != NULL,("%s: NULL rnh", __func__)); RT_LOCK_ASSERT(rt); /* * The callers should use RTFREE_LOCKED() or RTFREE(), so * we should come here exactly with the last reference. */ RT_REMREF(rt); if (rt->rt_refcnt > 0) { log(LOG_DEBUG, "%s: %p has %d refs\n", __func__, rt, rt->rt_refcnt); goto done; } /* * On last reference give the "close method" a chance * to cleanup private state. This also permits (for * IPv4 and IPv6) a chance to decide if the routing table * entry should be purged immediately or at a later time. * When an immediate purge is to happen the close routine * typically calls rtexpunge which clears the RTF_UP flag * on the entry so that the code below reclaims the storage. */ if (rt->rt_refcnt == 0 && rnh->rnh_close) rnh->rnh_close((struct radix_node *)rt, rnh); /* * If we are no longer "up" (and ref == 0) * then we can free the resources associated * with the route. */ if ((rt->rt_flags & RTF_UP) == 0) { if (rt->rt_nodes->rn_flags & (RNF_ACTIVE | RNF_ROOT)) panic("rtfree 2"); /* * the rtentry must have been removed from the routing table * so it is represented in rttrash.. remove that now. */ V_rttrash--; #ifdef DIAGNOSTIC if (rt->rt_refcnt < 0) { printf("rtfree: %p not freed (neg refs)\n", rt); goto done; } #endif /* * release references on items we hold them on.. * e.g other routes and ifaddrs. */ if (rt->rt_ifa) ifa_free(rt->rt_ifa); /* * The key is separatly alloc'd so free it (see rt_setgate()). * This also frees the gateway, as they are always malloc'd * together. */ Free(rt_key(rt)); /* * and the rtentry itself of course */ uma_zfree(V_rtzone, rt); return; } done: RT_UNLOCK(rt); } /* * Force a routing table entry to the specified * destination to go through the given gateway. * Normally called as a result of a routing redirect * message from the network layer. */ void rtredirect(struct sockaddr *dst, struct sockaddr *gateway, struct sockaddr *netmask, int flags, struct sockaddr *src) { rtredirect_fib(dst, gateway, netmask, flags, src, RT_DEFAULT_FIB); } void rtredirect_fib(struct sockaddr *dst, struct sockaddr *gateway, struct sockaddr *netmask, int flags, struct sockaddr *src, u_int fibnum) { struct rtentry *rt, *rt0 = NULL; int error = 0; short *stat = NULL; struct rt_addrinfo info; struct ifaddr *ifa; struct radix_node_head *rnh; ifa = NULL; rnh = rt_tables_get_rnh(fibnum, dst->sa_family); if (rnh == NULL) { error = EAFNOSUPPORT; goto out; } /* verify the gateway is directly reachable */ if ((ifa = ifa_ifwithnet(gateway, 0, fibnum)) == NULL) { error = ENETUNREACH; goto out; } rt = rtalloc1_fib(dst, 0, 0UL, fibnum); /* NB: rt is locked */ /* * If the redirect isn't from our current router for this dst, * it's either old or wrong. If it redirects us to ourselves, * we have a routing loop, perhaps as a result of an interface * going down recently. */ if (!(flags & RTF_DONE) && rt && (!sa_equal(src, rt->rt_gateway) || rt->rt_ifa != ifa)) error = EINVAL; else if (ifa_ifwithaddr_check(gateway)) error = EHOSTUNREACH; if (error) goto done; /* * Create a new entry if we just got back a wildcard entry * or the lookup failed. This is necessary for hosts * which use routing redirects generated by smart gateways * to dynamically build the routing tables. */ if (rt == NULL || (rt_mask(rt) && rt_mask(rt)->sa_len < 2)) goto create; /* * Don't listen to the redirect if it's * for a route to an interface. */ if (rt->rt_flags & RTF_GATEWAY) { if (((rt->rt_flags & RTF_HOST) == 0) && (flags & RTF_HOST)) { /* * Changing from route to net => route to host. * Create new route, rather than smashing route to net. */ create: rt0 = rt; rt = NULL; flags |= RTF_GATEWAY | RTF_DYNAMIC; bzero((caddr_t)&info, sizeof(info)); info.rti_info[RTAX_DST] = dst; info.rti_info[RTAX_GATEWAY] = gateway; info.rti_info[RTAX_NETMASK] = netmask; info.rti_ifa = ifa; info.rti_flags = flags; if (rt0 != NULL) RT_UNLOCK(rt0); /* drop lock to avoid LOR with RNH */ error = rtrequest1_fib(RTM_ADD, &info, &rt, fibnum); if (rt != NULL) { RT_LOCK(rt); if (rt0 != NULL) EVENTHANDLER_INVOKE(route_redirect_event, rt0, rt, dst); flags = rt->rt_flags; } if (rt0 != NULL) RTFREE(rt0); stat = &V_rtstat.rts_dynamic; } else { struct rtentry *gwrt; /* * Smash the current notion of the gateway to * this destination. Should check about netmask!!! */ rt->rt_flags |= RTF_MODIFIED; flags |= RTF_MODIFIED; stat = &V_rtstat.rts_newgateway; /* * add the key and gateway (in one malloc'd chunk). */ RT_UNLOCK(rt); RADIX_NODE_HEAD_LOCK(rnh); RT_LOCK(rt); rt_setgate(rt, rt_key(rt), gateway); gwrt = rtalloc1(gateway, 1, RTF_RNH_LOCKED); RADIX_NODE_HEAD_UNLOCK(rnh); EVENTHANDLER_INVOKE(route_redirect_event, rt, gwrt, dst); RTFREE_LOCKED(gwrt); } } else error = EHOSTUNREACH; done: if (rt) RTFREE_LOCKED(rt); out: if (error) V_rtstat.rts_badredirect++; else if (stat != NULL) (*stat)++; bzero((caddr_t)&info, sizeof(info)); info.rti_info[RTAX_DST] = dst; info.rti_info[RTAX_GATEWAY] = gateway; info.rti_info[RTAX_NETMASK] = netmask; info.rti_info[RTAX_AUTHOR] = src; rt_missmsg_fib(RTM_REDIRECT, &info, flags, error, fibnum); if (ifa != NULL) ifa_free(ifa); } int rtioctl(u_long req, caddr_t data) { return (rtioctl_fib(req, data, RT_DEFAULT_FIB)); } /* * Routing table ioctl interface. */ int rtioctl_fib(u_long req, caddr_t data, u_int fibnum) { /* * If more ioctl commands are added here, make sure the proper * super-user checks are being performed because it is possible for * prison-root to make it this far if raw sockets have been enabled * in jails. */ #ifdef INET /* Multicast goop, grrr... */ return mrt_ioctl ? mrt_ioctl(req, data, fibnum) : EOPNOTSUPP; #else /* INET */ return ENXIO; #endif /* INET */ } struct ifaddr * ifa_ifwithroute(int flags, struct sockaddr *dst, struct sockaddr *gateway, u_int fibnum) { struct ifaddr *ifa; int not_found = 0; if ((flags & RTF_GATEWAY) == 0) { /* * If we are adding a route to an interface, * and the interface is a pt to pt link * we should search for the destination * as our clue to the interface. Otherwise * we can use the local address. */ ifa = NULL; if (flags & RTF_HOST) ifa = ifa_ifwithdstaddr(dst, fibnum); if (ifa == NULL) ifa = ifa_ifwithaddr(gateway); } else { /* * If we are adding a route to a remote net * or host, the gateway may still be on the * other end of a pt to pt link. */ ifa = ifa_ifwithdstaddr(gateway, fibnum); } if (ifa == NULL) ifa = ifa_ifwithnet(gateway, 0, fibnum); if (ifa == NULL) { struct rtentry *rt = rtalloc1_fib(gateway, 0, RTF_RNH_LOCKED, fibnum); if (rt == NULL) return (NULL); /* * dismiss a gateway that is reachable only * through the default router */ switch (gateway->sa_family) { case AF_INET: if (satosin(rt_key(rt))->sin_addr.s_addr == INADDR_ANY) not_found = 1; break; case AF_INET6: if (IN6_IS_ADDR_UNSPECIFIED(&satosin6(rt_key(rt))->sin6_addr)) not_found = 1; break; default: break; } if (!not_found && rt->rt_ifa != NULL) { ifa = rt->rt_ifa; ifa_ref(ifa); } RT_REMREF(rt); RT_UNLOCK(rt); if (not_found || ifa == NULL) return (NULL); } if (ifa->ifa_addr->sa_family != dst->sa_family) { struct ifaddr *oifa = ifa; ifa = ifaof_ifpforaddr(dst, ifa->ifa_ifp); if (ifa == NULL) ifa = oifa; else ifa_free(oifa); } return (ifa); } /* * Do appropriate manipulations of a routing tree given * all the bits of info needed */ int rtrequest(int req, struct sockaddr *dst, struct sockaddr *gateway, struct sockaddr *netmask, int flags, struct rtentry **ret_nrt) { return (rtrequest_fib(req, dst, gateway, netmask, flags, ret_nrt, RT_DEFAULT_FIB)); } int rtrequest_fib(int req, struct sockaddr *dst, struct sockaddr *gateway, struct sockaddr *netmask, int flags, struct rtentry **ret_nrt, u_int fibnum) { struct rt_addrinfo info; if (dst->sa_len == 0) return(EINVAL); bzero((caddr_t)&info, sizeof(info)); info.rti_flags = flags; info.rti_info[RTAX_DST] = dst; info.rti_info[RTAX_GATEWAY] = gateway; info.rti_info[RTAX_NETMASK] = netmask; return rtrequest1_fib(req, &info, ret_nrt, fibnum); } /* * These (questionable) definitions of apparent local variables apply * to the next two functions. XXXXXX!!! */ #define dst info->rti_info[RTAX_DST] #define gateway info->rti_info[RTAX_GATEWAY] #define netmask info->rti_info[RTAX_NETMASK] #define ifaaddr info->rti_info[RTAX_IFA] #define ifpaddr info->rti_info[RTAX_IFP] #define flags info->rti_flags int rt_getifa(struct rt_addrinfo *info) { return (rt_getifa_fib(info, RT_DEFAULT_FIB)); } /* * Look up rt_addrinfo for a specific fib. Note that if rti_ifa is defined, * it will be referenced so the caller must free it. */ int rt_getifa_fib(struct rt_addrinfo *info, u_int fibnum) { struct ifaddr *ifa; int error = 0; /* * ifp may be specified by sockaddr_dl * when protocol address is ambiguous. */ if (info->rti_ifp == NULL && ifpaddr != NULL && ifpaddr->sa_family == AF_LINK && (ifa = ifa_ifwithnet(ifpaddr, 0, fibnum)) != NULL) { info->rti_ifp = ifa->ifa_ifp; ifa_free(ifa); } if (info->rti_ifa == NULL && ifaaddr != NULL) info->rti_ifa = ifa_ifwithaddr(ifaaddr); if (info->rti_ifa == NULL) { struct sockaddr *sa; sa = ifaaddr != NULL ? ifaaddr : (gateway != NULL ? gateway : dst); if (sa != NULL && info->rti_ifp != NULL) info->rti_ifa = ifaof_ifpforaddr(sa, info->rti_ifp); else if (dst != NULL && gateway != NULL) info->rti_ifa = ifa_ifwithroute(flags, dst, gateway, fibnum); else if (sa != NULL) info->rti_ifa = ifa_ifwithroute(flags, sa, sa, fibnum); } if ((ifa = info->rti_ifa) != NULL) { if (info->rti_ifp == NULL) info->rti_ifp = ifa->ifa_ifp; } else error = ENETUNREACH; return (error); } /* * Expunges references to a route that's about to be reclaimed. * The route must be locked. */ int rt_expunge(struct radix_node_head *rnh, struct rtentry *rt) { #if !defined(RADIX_MPATH) struct radix_node *rn; #else struct rt_addrinfo info; int fib; struct rtentry *rt0; #endif struct ifaddr *ifa; int error = 0; RT_LOCK_ASSERT(rt); RADIX_NODE_HEAD_LOCK_ASSERT(rnh); #ifdef RADIX_MPATH fib = rt->rt_fibnum; bzero(&info, sizeof(info)); info.rti_ifp = rt->rt_ifp; info.rti_flags = RTF_RNH_LOCKED; info.rti_info[RTAX_DST] = rt_key(rt); info.rti_info[RTAX_GATEWAY] = rt->rt_ifa->ifa_addr; RT_UNLOCK(rt); error = rtrequest1_fib(RTM_DELETE, &info, &rt0, fib); if (error == 0 && rt0 != NULL) { rt = rt0; RT_LOCK(rt); } else if (error != 0) { RT_LOCK(rt); return (error); } #else /* * Remove the item from the tree; it should be there, * but when callers invoke us blindly it may not (sigh). */ rn = rnh->rnh_deladdr(rt_key(rt), rt_mask(rt), rnh); if (rn == NULL) { error = ESRCH; goto bad; } KASSERT((rn->rn_flags & (RNF_ACTIVE | RNF_ROOT)) == 0, ("unexpected flags 0x%x", rn->rn_flags)); KASSERT(rt == RNTORT(rn), ("lookup mismatch, rt %p rn %p", rt, rn)); #endif /* RADIX_MPATH */ rt->rt_flags &= ~RTF_UP; /* * Give the protocol a chance to keep things in sync. */ if ((ifa = rt->rt_ifa) && ifa->ifa_rtrequest) { struct rt_addrinfo info; bzero((caddr_t)&info, sizeof(info)); info.rti_flags = rt->rt_flags; info.rti_info[RTAX_DST] = rt_key(rt); info.rti_info[RTAX_GATEWAY] = rt->rt_gateway; info.rti_info[RTAX_NETMASK] = rt_mask(rt); ifa->ifa_rtrequest(RTM_DELETE, rt, &info); } /* * one more rtentry floating around that is not * linked to the routing table. */ V_rttrash++; #if !defined(RADIX_MPATH) bad: #endif return (error); } static int if_updatemtu_cb(struct radix_node *rn, void *arg) { struct rtentry *rt; struct if_mtuinfo *ifmtu; rt = (struct rtentry *)rn; ifmtu = (struct if_mtuinfo *)arg; if (rt->rt_ifp != ifmtu->ifp) return (0); if (rt->rt_mtu >= ifmtu->mtu) { /* We have to decrease mtu regardless of flags */ rt->rt_mtu = ifmtu->mtu; return (0); } /* * New MTU is bigger. Check if are allowed to alter it */ if ((rt->rt_flags & (RTF_FIXEDMTU | RTF_GATEWAY | RTF_HOST)) != 0) { /* * Skip routes with user-supplied MTU and * non-interface routes */ return (0); } /* We are safe to update route MTU */ rt->rt_mtu = ifmtu->mtu; return (0); } void rt_updatemtu(struct ifnet *ifp) { struct if_mtuinfo ifmtu; struct radix_node_head *rnh; int i, j; ifmtu.ifp = ifp; /* * Try to update rt_mtu for all routes using this interface * Unfortunately the only way to do this is to traverse all * routing tables in all fibs/domains. */ for (i = 1; i <= AF_MAX; i++) { ifmtu.mtu = if_getmtu_family(ifp, i); for (j = 0; j < rt_numfibs; j++) { rnh = rt_tables_get_rnh(j, i); if (rnh == NULL) continue; RADIX_NODE_HEAD_LOCK(rnh); rnh->rnh_walktree(rnh, if_updatemtu_cb, &ifmtu); RADIX_NODE_HEAD_UNLOCK(rnh); } } } #if 0 int p_sockaddr(char *buf, int buflen, struct sockaddr *s); int rt_print(char *buf, int buflen, struct rtentry *rt); int p_sockaddr(char *buf, int buflen, struct sockaddr *s) { void *paddr = NULL; switch (s->sa_family) { case AF_INET: paddr = &((struct sockaddr_in *)s)->sin_addr; break; case AF_INET6: paddr = &((struct sockaddr_in6 *)s)->sin6_addr; break; } if (paddr == NULL) return (0); if (inet_ntop(s->sa_family, paddr, buf, buflen) == NULL) return (0); return (strlen(buf)); } int rt_print(char *buf, int buflen, struct rtentry *rt) { struct sockaddr *addr, *mask; int i = 0; addr = rt_key(rt); mask = rt_mask(rt); i = p_sockaddr(buf, buflen, addr); if (!(rt->rt_flags & RTF_HOST)) { buf[i++] = '/'; i += p_sockaddr(buf + i, buflen - i, mask); } if (rt->rt_flags & RTF_GATEWAY) { buf[i++] = '>'; i += p_sockaddr(buf + i, buflen - i, rt->rt_gateway); } return (i); } #endif #ifdef RADIX_MPATH static int rn_mpath_update(int req, struct rt_addrinfo *info, struct radix_node_head *rnh, struct rtentry **ret_nrt) { /* * if we got multipath routes, we require users to specify * a matching RTAX_GATEWAY. */ struct rtentry *rt, *rto = NULL; struct radix_node *rn; int error = 0; rn = rnh->rnh_lookup(dst, netmask, rnh); if (rn == NULL) return (ESRCH); rto = rt = RNTORT(rn); rt = rt_mpath_matchgate(rt, gateway); if (rt == NULL) return (ESRCH); /* * this is the first entry in the chain */ if (rto == rt) { rn = rn_mpath_next((struct radix_node *)rt); /* * there is another entry, now it's active */ if (rn) { rto = RNTORT(rn); RT_LOCK(rto); rto->rt_flags |= RTF_UP; RT_UNLOCK(rto); } else if (rt->rt_flags & RTF_GATEWAY) { /* * For gateway routes, we need to * make sure that we we are deleting * the correct gateway. * rt_mpath_matchgate() does not * check the case when there is only * one route in the chain. */ if (gateway && (rt->rt_gateway->sa_len != gateway->sa_len || memcmp(rt->rt_gateway, gateway, gateway->sa_len))) error = ESRCH; else { /* * remove from tree before returning it * to the caller */ rn = rnh->rnh_deladdr(dst, netmask, rnh); KASSERT(rt == RNTORT(rn), ("radix node disappeared")); goto gwdelete; } } /* * use the normal delete code to remove * the first entry */ if (req != RTM_DELETE) goto nondelete; error = ENOENT; goto done; } /* * if the entry is 2nd and on up */ if ((req == RTM_DELETE) && !rt_mpath_deldup(rto, rt)) panic ("rtrequest1: rt_mpath_deldup"); gwdelete: RT_LOCK(rt); RT_ADDREF(rt); if (req == RTM_DELETE) { rt->rt_flags &= ~RTF_UP; /* * One more rtentry floating around that is not * linked to the routing table. rttrash will be decremented * when RTFREE(rt) is eventually called. */ V_rttrash++; } nondelete: if (req != RTM_DELETE) panic("unrecognized request %d", req); /* * If the caller wants it, then it can have it, * but it's up to it to free the rtentry as we won't be * doing it. */ if (ret_nrt) { *ret_nrt = rt; RT_UNLOCK(rt); } else RTFREE_LOCKED(rt); done: return (error); } #endif int rtrequest1_fib(int req, struct rt_addrinfo *info, struct rtentry **ret_nrt, u_int fibnum) { int error = 0, needlock = 0; struct rtentry *rt; #ifdef FLOWTABLE struct rtentry *rt0; #endif struct radix_node *rn; struct radix_node_head *rnh; struct ifaddr *ifa; struct sockaddr *ndst; struct sockaddr_storage mdst; #define senderr(x) { error = x ; goto bad; } KASSERT((fibnum < rt_numfibs), ("rtrequest1_fib: bad fibnum")); switch (dst->sa_family) { case AF_INET6: case AF_INET: /* We support multiple FIBs. */ break; default: fibnum = RT_DEFAULT_FIB; break; } /* * Find the correct routing tree to use for this Address Family */ rnh = rt_tables_get_rnh(fibnum, dst->sa_family); if (rnh == NULL) return (EAFNOSUPPORT); needlock = ((flags & RTF_RNH_LOCKED) == 0); flags &= ~RTF_RNH_LOCKED; if (needlock) RADIX_NODE_HEAD_LOCK(rnh); else RADIX_NODE_HEAD_LOCK_ASSERT(rnh); /* * If we are adding a host route then we don't want to put * a netmask in the tree, nor do we want to clone it. */ if (flags & RTF_HOST) netmask = NULL; switch (req) { case RTM_DELETE: if (netmask) { rt_maskedcopy(dst, (struct sockaddr *)&mdst, netmask); dst = (struct sockaddr *)&mdst; } #ifdef RADIX_MPATH if (rn_mpath_capable(rnh)) { error = rn_mpath_update(req, info, rnh, ret_nrt); /* * "bad" holds true for the success case * as well */ if (error != ENOENT) goto bad; error = 0; } #endif if ((flags & RTF_PINNED) == 0) { /* Check if target route can be deleted */ rt = (struct rtentry *)rnh->rnh_lookup(dst, netmask, rnh); if ((rt != NULL) && (rt->rt_flags & RTF_PINNED)) senderr(EADDRINUSE); } /* * Remove the item from the tree and return it. * Complain if it is not there and do no more processing. */ rn = rnh->rnh_deladdr(dst, netmask, rnh); if (rn == NULL) senderr(ESRCH); if (rn->rn_flags & (RNF_ACTIVE | RNF_ROOT)) panic ("rtrequest delete"); rt = RNTORT(rn); RT_LOCK(rt); RT_ADDREF(rt); rt->rt_flags &= ~RTF_UP; /* * give the protocol a chance to keep things in sync. */ if ((ifa = rt->rt_ifa) && ifa->ifa_rtrequest) ifa->ifa_rtrequest(RTM_DELETE, rt, info); /* * One more rtentry floating around that is not * linked to the routing table. rttrash will be decremented * when RTFREE(rt) is eventually called. */ V_rttrash++; /* * If the caller wants it, then it can have it, * but it's up to it to free the rtentry as we won't be * doing it. */ if (ret_nrt) { *ret_nrt = rt; RT_UNLOCK(rt); } else RTFREE_LOCKED(rt); break; case RTM_RESOLVE: /* * resolve was only used for route cloning * here for compat */ break; case RTM_ADD: if ((flags & RTF_GATEWAY) && !gateway) senderr(EINVAL); if (dst && gateway && (dst->sa_family != gateway->sa_family) && (gateway->sa_family != AF_UNSPEC) && (gateway->sa_family != AF_LINK)) senderr(EINVAL); if (info->rti_ifa == NULL) { error = rt_getifa_fib(info, fibnum); if (error) senderr(error); } else ifa_ref(info->rti_ifa); ifa = info->rti_ifa; rt = uma_zalloc(V_rtzone, M_NOWAIT); if (rt == NULL) { ifa_free(ifa); senderr(ENOBUFS); } rt->rt_flags = RTF_UP | flags; rt->rt_fibnum = fibnum; /* * Add the gateway. Possibly re-malloc-ing the storage for it. */ RT_LOCK(rt); if ((error = rt_setgate(rt, dst, gateway)) != 0) { ifa_free(ifa); uma_zfree(V_rtzone, rt); senderr(error); } /* * point to the (possibly newly malloc'd) dest address. */ ndst = (struct sockaddr *)rt_key(rt); /* * make sure it contains the value we want (masked if needed). */ if (netmask) { rt_maskedcopy(dst, ndst, netmask); } else bcopy(dst, ndst, dst->sa_len); /* * We use the ifa reference returned by rt_getifa_fib(). * This moved from below so that rnh->rnh_addaddr() can * examine the ifa and ifa->ifa_ifp if it so desires. */ rt->rt_ifa = ifa; rt->rt_ifp = ifa->ifa_ifp; rt->rt_weight = 1; rt_setmetrics(info, rt); #ifdef RADIX_MPATH /* do not permit exactly the same dst/mask/gw pair */ if (rn_mpath_capable(rnh) && rt_mpath_conflict(rnh, rt, netmask)) { ifa_free(rt->rt_ifa); Free(rt_key(rt)); uma_zfree(V_rtzone, rt); senderr(EEXIST); } #endif #ifdef FLOWTABLE rt0 = NULL; /* "flow-table" only supports IPv6 and IPv4 at the moment. */ switch (dst->sa_family) { #ifdef INET6 case AF_INET6: #endif #ifdef INET case AF_INET: #endif #if defined(INET6) || defined(INET) rn = rnh->rnh_matchaddr(dst, rnh); if (rn && ((rn->rn_flags & RNF_ROOT) == 0)) { struct sockaddr *mask; u_char *m, *n; int len; /* * compare mask to see if the new route is * more specific than the existing one */ rt0 = RNTORT(rn); RT_LOCK(rt0); RT_ADDREF(rt0); RT_UNLOCK(rt0); /* * A host route is already present, so * leave the flow-table entries as is. */ if (rt0->rt_flags & RTF_HOST) { RTFREE(rt0); rt0 = NULL; } else if (!(flags & RTF_HOST) && netmask) { mask = rt_mask(rt0); len = mask->sa_len; m = (u_char *)mask; n = (u_char *)netmask; while (len-- > 0) { if (*n != *m) break; n++; m++; } if (len == 0 || (*n < *m)) { RTFREE(rt0); rt0 = NULL; } } } #endif/* INET6 || INET */ } #endif /* FLOWTABLE */ /* XXX mtu manipulation will be done in rnh_addaddr -- itojun */ rn = rnh->rnh_addaddr(ndst, netmask, rnh, rt->rt_nodes); /* * If it still failed to go into the tree, * then un-make it (this should be a function) */ if (rn == NULL) { ifa_free(rt->rt_ifa); Free(rt_key(rt)); uma_zfree(V_rtzone, rt); #ifdef FLOWTABLE if (rt0 != NULL) RTFREE(rt0); #endif senderr(EEXIST); } #ifdef FLOWTABLE else if (rt0 != NULL) { flowtable_route_flush(dst->sa_family, rt0); RTFREE(rt0); } #endif /* * If this protocol has something to add to this then * allow it to do that as well. */ if (ifa->ifa_rtrequest) ifa->ifa_rtrequest(req, rt, info); /* * actually return a resultant rtentry and * give the caller a single reference. */ if (ret_nrt) { *ret_nrt = rt; RT_ADDREF(rt); } RT_UNLOCK(rt); break; case RTM_CHANGE: error = rtrequest1_fib_change(rnh, info, ret_nrt, fibnum); break; default: error = EOPNOTSUPP; } bad: if (needlock) RADIX_NODE_HEAD_UNLOCK(rnh); return (error); #undef senderr } #undef dst #undef gateway #undef netmask #undef ifaaddr #undef ifpaddr #undef flags static int rtrequest1_fib_change(struct radix_node_head *rnh, struct rt_addrinfo *info, struct rtentry **ret_nrt, u_int fibnum) { struct rtentry *rt = NULL; int error = 0; int free_ifa = 0; int family, mtu; struct if_mtuinfo ifmtu; rt = (struct rtentry *)rnh->rnh_lookup(info->rti_info[RTAX_DST], info->rti_info[RTAX_NETMASK], rnh); if (rt == NULL) return (ESRCH); #ifdef RADIX_MPATH /* * If we got multipath routes, * we require users to specify a matching RTAX_GATEWAY. */ if (rn_mpath_capable(rnh)) { rt = rt_mpath_matchgate(rt, info->rti_info[RTAX_GATEWAY]); if (rt == NULL) return (ESRCH); } #endif RT_LOCK(rt); rt_setmetrics(info, rt); /* * New gateway could require new ifaddr, ifp; * flags may also be different; ifp may be specified * by ll sockaddr when protocol address is ambiguous */ if (((rt->rt_flags & RTF_GATEWAY) && info->rti_info[RTAX_GATEWAY] != NULL) || info->rti_info[RTAX_IFP] != NULL || (info->rti_info[RTAX_IFA] != NULL && !sa_equal(info->rti_info[RTAX_IFA], rt->rt_ifa->ifa_addr))) { error = rt_getifa_fib(info, fibnum); if (info->rti_ifa != NULL) free_ifa = 1; if (error != 0) goto bad; } /* Check if outgoing interface has changed */ if (info->rti_ifa != NULL && info->rti_ifa != rt->rt_ifa && rt->rt_ifa != NULL && rt->rt_ifa->ifa_rtrequest != NULL) { rt->rt_ifa->ifa_rtrequest(RTM_DELETE, rt, info); ifa_free(rt->rt_ifa); } /* Update gateway address */ if (info->rti_info[RTAX_GATEWAY] != NULL) { error = rt_setgate(rt, rt_key(rt), info->rti_info[RTAX_GATEWAY]); if (error != 0) goto bad; rt->rt_flags &= ~RTF_GATEWAY; rt->rt_flags |= (RTF_GATEWAY & info->rti_flags); } if (info->rti_ifa != NULL && info->rti_ifa != rt->rt_ifa) { ifa_ref(info->rti_ifa); rt->rt_ifa = info->rti_ifa; rt->rt_ifp = info->rti_ifp; } /* Allow some flags to be toggled on change. */ rt->rt_flags &= ~RTF_FMASK; rt->rt_flags |= info->rti_flags & RTF_FMASK; if (rt->rt_ifa && rt->rt_ifa->ifa_rtrequest != NULL) rt->rt_ifa->ifa_rtrequest(RTM_ADD, rt, info); /* Alter route MTU if necessary */ if (rt->rt_ifp != NULL) { family = info->rti_info[RTAX_DST]->sa_family; mtu = if_getmtu_family(rt->rt_ifp, family); /* Set default MTU */ if (rt->rt_mtu == 0) rt->rt_mtu = mtu; if (rt->rt_mtu != mtu) { /* Check if we really need to update */ ifmtu.ifp = rt->rt_ifp; ifmtu.mtu = mtu; if_updatemtu_cb(rt->rt_nodes, &ifmtu); } } if (ret_nrt) { *ret_nrt = rt; RT_ADDREF(rt); } bad: RT_UNLOCK(rt); if (free_ifa != 0) ifa_free(info->rti_ifa); return (error); } static void rt_setmetrics(const struct rt_addrinfo *info, struct rtentry *rt) { if (info->rti_mflags & RTV_MTU) { if (info->rti_rmx->rmx_mtu != 0) { /* * MTU was explicitly provided by user. * Keep it. */ rt->rt_flags |= RTF_FIXEDMTU; } else { /* * User explicitly sets MTU to 0. * Assume rollback to default. */ rt->rt_flags &= ~RTF_FIXEDMTU; } rt->rt_mtu = info->rti_rmx->rmx_mtu; } if (info->rti_mflags & RTV_WEIGHT) rt->rt_weight = info->rti_rmx->rmx_weight; /* Kernel -> userland timebase conversion. */ if (info->rti_mflags & RTV_EXPIRE) rt->rt_expire = info->rti_rmx->rmx_expire ? info->rti_rmx->rmx_expire - time_second + time_uptime : 0; } int rt_setgate(struct rtentry *rt, struct sockaddr *dst, struct sockaddr *gate) { /* XXX dst may be overwritten, can we move this to below */ int dlen = SA_SIZE(dst), glen = SA_SIZE(gate); #ifdef INVARIANTS struct radix_node_head *rnh; rnh = rt_tables_get_rnh(rt->rt_fibnum, dst->sa_family); #endif RT_LOCK_ASSERT(rt); RADIX_NODE_HEAD_LOCK_ASSERT(rnh); /* * Prepare to store the gateway in rt->rt_gateway. * Both dst and gateway are stored one after the other in the same * malloc'd chunk. If we have room, we can reuse the old buffer, * rt_gateway already points to the right place. * Otherwise, malloc a new block and update the 'dst' address. */ if (rt->rt_gateway == NULL || glen > SA_SIZE(rt->rt_gateway)) { caddr_t new; R_Malloc(new, caddr_t, dlen + glen); if (new == NULL) return ENOBUFS; /* * XXX note, we copy from *dst and not *rt_key(rt) because * rt_setgate() can be called to initialize a newly * allocated route entry, in which case rt_key(rt) == NULL * (and also rt->rt_gateway == NULL). * Free()/free() handle a NULL argument just fine. */ bcopy(dst, new, dlen); Free(rt_key(rt)); /* free old block, if any */ rt_key(rt) = (struct sockaddr *)new; rt->rt_gateway = (struct sockaddr *)(new + dlen); } /* * Copy the new gateway value into the memory chunk. */ bcopy(gate, rt->rt_gateway, glen); return (0); } void rt_maskedcopy(struct sockaddr *src, struct sockaddr *dst, struct sockaddr *netmask) { u_char *cp1 = (u_char *)src; u_char *cp2 = (u_char *)dst; u_char *cp3 = (u_char *)netmask; u_char *cplim = cp2 + *cp3; u_char *cplim2 = cp2 + *cp1; *cp2++ = *cp1++; *cp2++ = *cp1++; /* copies sa_len & sa_family */ cp3 += 2; if (cplim > cplim2) cplim = cplim2; while (cp2 < cplim) *cp2++ = *cp1++ & *cp3++; if (cp2 < cplim2) bzero((caddr_t)cp2, (unsigned)(cplim2 - cp2)); } /* * Set up a routing table entry, normally * for an interface. */ #define _SOCKADDR_TMPSIZE 128 /* Not too big.. kernel stack size is limited */ static inline int rtinit1(struct ifaddr *ifa, int cmd, int flags, int fibnum) { struct sockaddr *dst; struct sockaddr *netmask; struct rtentry *rt = NULL; struct rt_addrinfo info; int error = 0; int startfib, endfib; char tempbuf[_SOCKADDR_TMPSIZE]; int didwork = 0; int a_failure = 0; static struct sockaddr_dl null_sdl = {sizeof(null_sdl), AF_LINK}; struct radix_node_head *rnh; if (flags & RTF_HOST) { dst = ifa->ifa_dstaddr; netmask = NULL; } else { dst = ifa->ifa_addr; netmask = ifa->ifa_netmask; } if (dst->sa_len == 0) return(EINVAL); switch (dst->sa_family) { case AF_INET6: case AF_INET: /* We support multiple FIBs. */ break; default: fibnum = RT_DEFAULT_FIB; break; } if (fibnum == RT_ALL_FIBS) { if (V_rt_add_addr_allfibs == 0 && cmd == (int)RTM_ADD) startfib = endfib = ifa->ifa_ifp->if_fib; else { startfib = 0; endfib = rt_numfibs - 1; } } else { KASSERT((fibnum < rt_numfibs), ("rtinit1: bad fibnum")); startfib = fibnum; endfib = fibnum; } /* * If it's a delete, check that if it exists, * it's on the correct interface or we might scrub * a route to another ifa which would * be confusing at best and possibly worse. */ if (cmd == RTM_DELETE) { /* * It's a delete, so it should already exist.. * If it's a net, mask off the host bits * (Assuming we have a mask) * XXX this is kinda inet specific.. */ if (netmask != NULL) { rt_maskedcopy(dst, (struct sockaddr *)tempbuf, netmask); dst = (struct sockaddr *)tempbuf; } } /* * Now go through all the requested tables (fibs) and do the * requested action. Realistically, this will either be fib 0 * for protocols that don't do multiple tables or all the * tables for those that do. */ for ( fibnum = startfib; fibnum <= endfib; fibnum++) { if (cmd == RTM_DELETE) { struct radix_node *rn; /* * Look up an rtentry that is in the routing tree and * contains the correct info. */ rnh = rt_tables_get_rnh(fibnum, dst->sa_family); if (rnh == NULL) /* this table doesn't exist but others might */ continue; RADIX_NODE_HEAD_RLOCK(rnh); rn = rnh->rnh_lookup(dst, netmask, rnh); #ifdef RADIX_MPATH if (rn_mpath_capable(rnh)) { if (rn == NULL) error = ESRCH; else { rt = RNTORT(rn); /* * for interface route the * rt->rt_gateway is sockaddr_intf * for cloning ARP entries, so * rt_mpath_matchgate must use the * interface address */ rt = rt_mpath_matchgate(rt, ifa->ifa_addr); if (rt == NULL) error = ESRCH; } } #endif error = (rn == NULL || (rn->rn_flags & RNF_ROOT) || RNTORT(rn)->rt_ifa != ifa); RADIX_NODE_HEAD_RUNLOCK(rnh); if (error) { /* this is only an error if bad on ALL tables */ continue; } } /* * Do the actual request */ bzero((caddr_t)&info, sizeof(info)); info.rti_ifa = ifa; info.rti_flags = flags | (ifa->ifa_flags & ~IFA_RTSELF) | RTF_PINNED; info.rti_info[RTAX_DST] = dst; /* * doing this for compatibility reasons */ if (cmd == RTM_ADD) info.rti_info[RTAX_GATEWAY] = (struct sockaddr *)&null_sdl; else info.rti_info[RTAX_GATEWAY] = ifa->ifa_addr; info.rti_info[RTAX_NETMASK] = netmask; error = rtrequest1_fib(cmd, &info, &rt, fibnum); if ((error == EEXIST) && (cmd == RTM_ADD)) { /* * Interface route addition failed. * Atomically delete current prefix generating * RTM_DELETE message, and retry adding * interface prefix. */ rnh = rt_tables_get_rnh(fibnum, dst->sa_family); RADIX_NODE_HEAD_LOCK(rnh); /* Delete old prefix */ info.rti_ifa = NULL; info.rti_flags = RTF_RNH_LOCKED; error = rtrequest1_fib(RTM_DELETE, &info, NULL, fibnum); if (error == 0) { info.rti_ifa = ifa; info.rti_flags = flags | RTF_RNH_LOCKED | (ifa->ifa_flags & ~IFA_RTSELF) | RTF_PINNED; error = rtrequest1_fib(cmd, &info, &rt, fibnum); } RADIX_NODE_HEAD_UNLOCK(rnh); } if (error == 0 && rt != NULL) { /* * notify any listening routing agents of the change */ RT_LOCK(rt); #ifdef RADIX_MPATH /* * in case address alias finds the first address * e.g. ifconfig bge0 192.0.2.246/24 * e.g. ifconfig bge0 192.0.2.247/24 * the address set in the route is 192.0.2.246 * so we need to replace it with 192.0.2.247 */ if (memcmp(rt->rt_ifa->ifa_addr, ifa->ifa_addr, ifa->ifa_addr->sa_len)) { ifa_free(rt->rt_ifa); ifa_ref(ifa); rt->rt_ifp = ifa->ifa_ifp; rt->rt_ifa = ifa; } #endif /* * doing this for compatibility reasons */ if (cmd == RTM_ADD) { ((struct sockaddr_dl *)rt->rt_gateway)->sdl_type = rt->rt_ifp->if_type; ((struct sockaddr_dl *)rt->rt_gateway)->sdl_index = rt->rt_ifp->if_index; } RT_ADDREF(rt); RT_UNLOCK(rt); rt_newaddrmsg_fib(cmd, ifa, error, rt, fibnum); RT_LOCK(rt); RT_REMREF(rt); if (cmd == RTM_DELETE) { /* * If we are deleting, and we found an entry, * then it's been removed from the tree.. * now throw it away. */ RTFREE_LOCKED(rt); } else { if (cmd == RTM_ADD) { /* * We just wanted to add it.. * we don't actually need a reference. */ RT_REMREF(rt); } RT_UNLOCK(rt); } didwork = 1; } if (error) a_failure = error; } if (cmd == RTM_DELETE) { if (didwork) { error = 0; } else { /* we only give an error if it wasn't in any table */ error = ((flags & RTF_HOST) ? EHOSTUNREACH : ENETUNREACH); } } else { if (a_failure) { /* return an error if any of them failed */ error = a_failure; } } return (error); } /* * Set up a routing table entry, normally * for an interface. */ int rtinit(struct ifaddr *ifa, int cmd, int flags) { struct sockaddr *dst; int fib = RT_DEFAULT_FIB; if (flags & RTF_HOST) { dst = ifa->ifa_dstaddr; } else { dst = ifa->ifa_addr; } switch (dst->sa_family) { case AF_INET6: case AF_INET: /* We do support multiple FIBs. */ fib = RT_ALL_FIBS; break; } return (rtinit1(ifa, cmd, flags, fib)); } /* * Announce interface address arrival/withdraw * Returns 0 on success. */ int rt_addrmsg(int cmd, struct ifaddr *ifa, int fibnum) { KASSERT(cmd == RTM_ADD || cmd == RTM_DELETE, ("unexpected cmd %d", cmd)); KASSERT(fibnum == RT_ALL_FIBS || (fibnum >= 0 && fibnum < rt_numfibs), ("%s: fib out of range 0 <=%d<%d", __func__, fibnum, rt_numfibs)); #if defined(INET) || defined(INET6) #ifdef SCTP /* * notify the SCTP stack * this will only get called when an address is added/deleted * XXX pass the ifaddr struct instead if ifa->ifa_addr... */ sctp_addr_change(ifa, cmd); #endif /* SCTP */ #endif return (rtsock_addrmsg(cmd, ifa, fibnum)); } /* * Announce route addition/removal. * Users of this function MUST validate input data BEFORE calling. * However we have to be able to handle invalid data: * if some userland app sends us "invalid" route message (invalid mask, * no dst, wrong address families, etc...) we need to pass it back * to app (and any other rtsock consumers) with rtm_errno field set to * non-zero value. * Returns 0 on success. */ int rt_routemsg(int cmd, struct ifnet *ifp, int error, struct rtentry *rt, int fibnum) { KASSERT(cmd == RTM_ADD || cmd == RTM_DELETE, ("unexpected cmd %d", cmd)); KASSERT(fibnum == RT_ALL_FIBS || (fibnum >= 0 && fibnum < rt_numfibs), ("%s: fib out of range 0 <=%d<%d", __func__, fibnum, rt_numfibs)); KASSERT(rt_key(rt) != NULL, (":%s: rt_key must be supplied", __func__)); return (rtsock_routemsg(cmd, ifp, error, rt, fibnum)); } void rt_newaddrmsg(int cmd, struct ifaddr *ifa, int error, struct rtentry *rt) { rt_newaddrmsg_fib(cmd, ifa, error, rt, RT_ALL_FIBS); } /* * This is called to generate messages from the routing socket * indicating a network interface has had addresses associated with it. */ void rt_newaddrmsg_fib(int cmd, struct ifaddr *ifa, int error, struct rtentry *rt, int fibnum) { KASSERT(cmd == RTM_ADD || cmd == RTM_DELETE, ("unexpected cmd %u", cmd)); KASSERT(fibnum == RT_ALL_FIBS || (fibnum >= 0 && fibnum < rt_numfibs), ("%s: fib out of range 0 <=%d<%d", __func__, fibnum, rt_numfibs)); if (cmd == RTM_ADD) { rt_addrmsg(cmd, ifa, fibnum); if (rt != NULL) rt_routemsg(cmd, ifa->ifa_ifp, error, rt, fibnum); } else { if (rt != NULL) rt_routemsg(cmd, ifa->ifa_ifp, error, rt, fibnum); rt_addrmsg(cmd, ifa, fibnum); } } Index: user/ngie/more-tests/sys/netinet/ip_reass.c =================================================================== --- user/ngie/more-tests/sys/netinet/ip_reass.c (revision 281584) +++ user/ngie/more-tests/sys/netinet/ip_reass.c (revision 281585) @@ -1,657 +1,658 @@ /*- * Copyright (c) 2015 Gleb Smirnoff * Copyright (c) 2015 Adrian Chadd * Copyright (c) 1982, 1986, 1988, 1993 * The Regents of the University of California. All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 4. Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * @(#)ip_input.c 8.2 (Berkeley) 1/4/94 */ #include __FBSDID("$FreeBSD$"); #include "opt_rss.h" #include #include #include #include #include #include #include #include #include #include +#include #include #include #include #include #include #ifdef MAC #include #endif SYSCTL_DECL(_net_inet_ip); /* * Reassembly headers are stored in hash buckets. */ #define IPREASS_NHASH_LOG2 6 #define IPREASS_NHASH (1 << IPREASS_NHASH_LOG2) #define IPREASS_HMASK (IPREASS_NHASH - 1) struct ipqbucket { TAILQ_HEAD(ipqhead, ipq) head; struct mtx lock; }; static VNET_DEFINE(struct ipqbucket, ipq[IPREASS_NHASH]); #define V_ipq VNET(ipq) static VNET_DEFINE(uint32_t, ipq_hashseed); #define V_ipq_hashseed VNET(ipq_hashseed) #define IPQ_LOCK(i) mtx_lock(&V_ipq[i].lock) #define IPQ_TRYLOCK(i) mtx_trylock(&V_ipq[i].lock) #define IPQ_UNLOCK(i) mtx_unlock(&V_ipq[i].lock) #define IPQ_LOCK_ASSERT(i) mtx_assert(&V_ipq[i].lock, MA_OWNED) void ipreass_init(void); void ipreass_drain(void); void ipreass_slowtimo(void); #ifdef VIMAGE void ipreass_destroy(void); #endif static int sysctl_maxfragpackets(SYSCTL_HANDLER_ARGS); static void ipreass_zone_change(void *); static void ipreass_drain_tomax(void); static void ipq_free(struct ipqhead *, struct ipq *); static struct ipq * ipq_reuse(int); static inline void ipq_timeout(struct ipqhead *head, struct ipq *fp) { IPSTAT_ADD(ips_fragtimeout, fp->ipq_nfrags); ipq_free(head, fp); } static inline void ipq_drop(struct ipqhead *head, struct ipq *fp) { IPSTAT_ADD(ips_fragdropped, fp->ipq_nfrags); ipq_free(head, fp); } static VNET_DEFINE(uma_zone_t, ipq_zone); #define V_ipq_zone VNET(ipq_zone) SYSCTL_PROC(_net_inet_ip, OID_AUTO, maxfragpackets, CTLFLAG_VNET | CTLTYPE_INT | CTLFLAG_RW, NULL, 0, sysctl_maxfragpackets, "I", "Maximum number of IPv4 fragment reassembly queue entries"); SYSCTL_UMA_CUR(_net_inet_ip, OID_AUTO, fragpackets, CTLFLAG_VNET, &VNET_NAME(ipq_zone), "Current number of IPv4 fragment reassembly queue entries"); static VNET_DEFINE(int, noreass); #define V_noreass VNET(noreass) static VNET_DEFINE(int, maxfragsperpacket); #define V_maxfragsperpacket VNET(maxfragsperpacket) SYSCTL_INT(_net_inet_ip, OID_AUTO, maxfragsperpacket, CTLFLAG_VNET | CTLFLAG_RW, &VNET_NAME(maxfragsperpacket), 0, "Maximum number of IPv4 fragments allowed per packet"); /* * Take incoming datagram fragment and try to reassemble it into * whole datagram. If the argument is the first fragment or one * in between the function will return NULL and store the mbuf * in the fragment chain. If the argument is the last fragment * the packet will be reassembled and the pointer to the new * mbuf returned for further processing. Only m_tags attached * to the first packet/fragment are preserved. * The IP header is *NOT* adjusted out of iplen. */ #define M_IP_FRAG M_PROTO9 struct mbuf * ip_reass(struct mbuf *m) { struct ip *ip; struct mbuf *p, *q, *nq, *t; struct ipq *fp; struct ipqhead *head; int i, hlen, next; u_int8_t ecn, ecn0; uint32_t hash; #ifdef RSS uint32_t rss_hash, rss_type; #endif /* * If no reassembling or maxfragsperpacket are 0, * never accept fragments. */ if (V_noreass == 1 || V_maxfragsperpacket == 0) { IPSTAT_INC(ips_fragments); IPSTAT_INC(ips_fragdropped); m_freem(m); return (NULL); } ip = mtod(m, struct ip *); hlen = ip->ip_hl << 2; /* * Adjust ip_len to not reflect header, * convert offset of this to bytes. */ ip->ip_len = htons(ntohs(ip->ip_len) - hlen); if (ip->ip_off & htons(IP_MF)) { /* * Make sure that fragments have a data length * that's a non-zero multiple of 8 bytes. */ if (ip->ip_len == htons(0) || (ntohs(ip->ip_len) & 0x7) != 0) { IPSTAT_INC(ips_toosmall); /* XXX */ IPSTAT_INC(ips_fragdropped); m_freem(m); return (NULL); } m->m_flags |= M_IP_FRAG; } else m->m_flags &= ~M_IP_FRAG; ip->ip_off = htons(ntohs(ip->ip_off) << 3); /* * Attempt reassembly; if it succeeds, proceed. * ip_reass() will return a different mbuf. */ IPSTAT_INC(ips_fragments); m->m_pkthdr.PH_loc.ptr = ip; /* * Presence of header sizes in mbufs * would confuse code below. */ m->m_data += hlen; m->m_len -= hlen; hash = ip->ip_src.s_addr ^ ip->ip_id; hash = jenkins_hash32(&hash, 1, V_ipq_hashseed) & IPREASS_HMASK; head = &V_ipq[hash].head; IPQ_LOCK(hash); /* * Look for queue of fragments * of this datagram. */ TAILQ_FOREACH(fp, head, ipq_list) if (ip->ip_id == fp->ipq_id && ip->ip_src.s_addr == fp->ipq_src.s_addr && ip->ip_dst.s_addr == fp->ipq_dst.s_addr && #ifdef MAC mac_ipq_match(m, fp) && #endif ip->ip_p == fp->ipq_p) break; /* * If first fragment to arrive, create a reassembly queue. */ if (fp == NULL) { fp = uma_zalloc(V_ipq_zone, M_NOWAIT); if (fp == NULL) fp = ipq_reuse(hash); #ifdef MAC if (mac_ipq_init(fp, M_NOWAIT) != 0) { uma_zfree(V_ipq_zone, fp); fp = NULL; goto dropfrag; } mac_ipq_create(m, fp); #endif TAILQ_INSERT_HEAD(head, fp, ipq_list); fp->ipq_nfrags = 1; fp->ipq_ttl = IPFRAGTTL; fp->ipq_p = ip->ip_p; fp->ipq_id = ip->ip_id; fp->ipq_src = ip->ip_src; fp->ipq_dst = ip->ip_dst; fp->ipq_frags = m; m->m_nextpkt = NULL; goto done; } else { fp->ipq_nfrags++; #ifdef MAC mac_ipq_update(m, fp); #endif } #define GETIP(m) ((struct ip*)((m)->m_pkthdr.PH_loc.ptr)) /* * Handle ECN by comparing this segment with the first one; * if CE is set, do not lose CE. * drop if CE and not-ECT are mixed for the same packet. */ ecn = ip->ip_tos & IPTOS_ECN_MASK; ecn0 = GETIP(fp->ipq_frags)->ip_tos & IPTOS_ECN_MASK; if (ecn == IPTOS_ECN_CE) { if (ecn0 == IPTOS_ECN_NOTECT) goto dropfrag; if (ecn0 != IPTOS_ECN_CE) GETIP(fp->ipq_frags)->ip_tos |= IPTOS_ECN_CE; } if (ecn == IPTOS_ECN_NOTECT && ecn0 != IPTOS_ECN_NOTECT) goto dropfrag; /* * Find a segment which begins after this one does. */ for (p = NULL, q = fp->ipq_frags; q; p = q, q = q->m_nextpkt) if (ntohs(GETIP(q)->ip_off) > ntohs(ip->ip_off)) break; /* * If there is a preceding segment, it may provide some of * our data already. If so, drop the data from the incoming * segment. If it provides all of our data, drop us, otherwise * stick new segment in the proper place. * * If some of the data is dropped from the preceding * segment, then it's checksum is invalidated. */ if (p) { i = ntohs(GETIP(p)->ip_off) + ntohs(GETIP(p)->ip_len) - ntohs(ip->ip_off); if (i > 0) { if (i >= ntohs(ip->ip_len)) goto dropfrag; m_adj(m, i); m->m_pkthdr.csum_flags = 0; ip->ip_off = htons(ntohs(ip->ip_off) + i); ip->ip_len = htons(ntohs(ip->ip_len) - i); } m->m_nextpkt = p->m_nextpkt; p->m_nextpkt = m; } else { m->m_nextpkt = fp->ipq_frags; fp->ipq_frags = m; } /* * While we overlap succeeding segments trim them or, * if they are completely covered, dequeue them. */ for (; q != NULL && ntohs(ip->ip_off) + ntohs(ip->ip_len) > ntohs(GETIP(q)->ip_off); q = nq) { i = (ntohs(ip->ip_off) + ntohs(ip->ip_len)) - ntohs(GETIP(q)->ip_off); if (i < ntohs(GETIP(q)->ip_len)) { GETIP(q)->ip_len = htons(ntohs(GETIP(q)->ip_len) - i); GETIP(q)->ip_off = htons(ntohs(GETIP(q)->ip_off) + i); m_adj(q, i); q->m_pkthdr.csum_flags = 0; break; } nq = q->m_nextpkt; m->m_nextpkt = nq; IPSTAT_INC(ips_fragdropped); fp->ipq_nfrags--; m_freem(q); } /* * Check for complete reassembly and perform frag per packet * limiting. * * Frag limiting is performed here so that the nth frag has * a chance to complete the packet before we drop the packet. * As a result, n+1 frags are actually allowed per packet, but * only n will ever be stored. (n = maxfragsperpacket.) * */ next = 0; for (p = NULL, q = fp->ipq_frags; q; p = q, q = q->m_nextpkt) { if (ntohs(GETIP(q)->ip_off) != next) { if (fp->ipq_nfrags > V_maxfragsperpacket) ipq_drop(head, fp); goto done; } next += ntohs(GETIP(q)->ip_len); } /* Make sure the last packet didn't have the IP_MF flag */ if (p->m_flags & M_IP_FRAG) { if (fp->ipq_nfrags > V_maxfragsperpacket) ipq_drop(head, fp); goto done; } /* * Reassembly is complete. Make sure the packet is a sane size. */ q = fp->ipq_frags; ip = GETIP(q); if (next + (ip->ip_hl << 2) > IP_MAXPACKET) { IPSTAT_INC(ips_toolong); ipq_drop(head, fp); goto done; } /* * Concatenate fragments. */ m = q; t = m->m_next; m->m_next = NULL; m_cat(m, t); nq = q->m_nextpkt; q->m_nextpkt = NULL; for (q = nq; q != NULL; q = nq) { nq = q->m_nextpkt; q->m_nextpkt = NULL; m->m_pkthdr.csum_flags &= q->m_pkthdr.csum_flags; m->m_pkthdr.csum_data += q->m_pkthdr.csum_data; m_cat(m, q); } /* * In order to do checksumming faster we do 'end-around carry' here * (and not in for{} loop), though it implies we are not going to * reassemble more than 64k fragments. */ while (m->m_pkthdr.csum_data & 0xffff0000) m->m_pkthdr.csum_data = (m->m_pkthdr.csum_data & 0xffff) + (m->m_pkthdr.csum_data >> 16); #ifdef MAC mac_ipq_reassemble(fp, m); mac_ipq_destroy(fp); #endif /* * Create header for new ip packet by modifying header of first * packet; dequeue and discard fragment reassembly header. * Make header visible. */ ip->ip_len = htons((ip->ip_hl << 2) + next); ip->ip_src = fp->ipq_src; ip->ip_dst = fp->ipq_dst; TAILQ_REMOVE(head, fp, ipq_list); uma_zfree(V_ipq_zone, fp); m->m_len += (ip->ip_hl << 2); m->m_data -= (ip->ip_hl << 2); /* some debugging cruft by sklower, below, will go away soon */ if (m->m_flags & M_PKTHDR) /* XXX this should be done elsewhere */ m_fixhdr(m); IPSTAT_INC(ips_reassembled); IPQ_UNLOCK(hash); #ifdef RSS /* * Query the RSS layer for the flowid / flowtype for the * mbuf payload. * * For now, just assume we have to calculate a new one. * Later on we should check to see if the assigned flowid matches * what RSS wants for the given IP protocol and if so, just keep it. * * We then queue into the relevant netisr so it can be dispatched * to the correct CPU. * * Note - this may return 1, which means the flowid in the mbuf * is correct for the configured RSS hash types and can be used. */ if (rss_mbuf_software_hash_v4(m, 0, &rss_hash, &rss_type) == 0) { m->m_pkthdr.flowid = rss_hash; M_HASHTYPE_SET(m, rss_type); } /* * Queue/dispatch for reprocessing. * * Note: this is much slower than just handling the frame in the * current receive context. It's likely worth investigating * why this is. */ netisr_dispatch(NETISR_IP_DIRECT, m); return (NULL); #endif /* Handle in-line */ return (m); dropfrag: IPSTAT_INC(ips_fragdropped); if (fp != NULL) fp->ipq_nfrags--; m_freem(m); done: IPQ_UNLOCK(hash); return (NULL); #undef GETIP } /* * Initialize IP reassembly structures. */ void ipreass_init(void) { for (int i = 0; i < IPREASS_NHASH; i++) { TAILQ_INIT(&V_ipq[i].head); mtx_init(&V_ipq[i].lock, "IP reassembly", NULL, MTX_DEF | MTX_DUPOK); } V_ipq_hashseed = arc4random(); V_maxfragsperpacket = 16; V_ipq_zone = uma_zcreate("ipq", sizeof(struct ipq), NULL, NULL, NULL, NULL, UMA_ALIGN_PTR, 0); uma_zone_set_max(V_ipq_zone, nmbclusters / 32); if (IS_DEFAULT_VNET(curvnet)) EVENTHANDLER_REGISTER(nmbclusters_change, ipreass_zone_change, NULL, EVENTHANDLER_PRI_ANY); } /* * If a timer expires on a reassembly queue, discard it. */ void ipreass_slowtimo(void) { struct ipq *fp, *tmp; for (int i = 0; i < IPREASS_NHASH; i++) { IPQ_LOCK(i); TAILQ_FOREACH_SAFE(fp, &V_ipq[i].head, ipq_list, tmp) if (--fp->ipq_ttl == 0) ipq_timeout(&V_ipq[i].head, fp); IPQ_UNLOCK(i); } } /* * Drain off all datagram fragments. */ void ipreass_drain(void) { for (int i = 0; i < IPREASS_NHASH; i++) { IPQ_LOCK(i); while(!TAILQ_EMPTY(&V_ipq[i].head)) ipq_drop(&V_ipq[i].head, TAILQ_FIRST(&V_ipq[i].head)); IPQ_UNLOCK(i); } } #ifdef VIMAGE /* * Destroy IP reassembly structures. */ void ipreass_destroy(void) { ipreass_drain(); uma_zdestroy(V_ipq_zone); for (int i = 0; i < IPREASS_NHASH; i++) mtx_destroy(&V_ipq[i].lock); } #endif /* * After maxnipq has been updated, propagate the change to UMA. The UMA zone * max has slightly different semantics than the sysctl, for historical * reasons. */ static void ipreass_drain_tomax(void) { int target; /* * If we are over the maximum number of fragments, * drain off enough to get down to the new limit, * stripping off last elements on queues. Every * run we strip the oldest element from each bucket. */ target = uma_zone_get_max(V_ipq_zone); while (uma_zone_get_cur(V_ipq_zone) > target) { struct ipq *fp; for (int i = 0; i < IPREASS_NHASH; i++) { IPQ_LOCK(i); fp = TAILQ_LAST(&V_ipq[i].head, ipqhead); if (fp != NULL) ipq_timeout(&V_ipq[i].head, fp); IPQ_UNLOCK(i); } } } static void ipreass_zone_change(void *tag) { uma_zone_set_max(V_ipq_zone, nmbclusters / 32); ipreass_drain_tomax(); } /* * Change the limit on the UMA zone, or disable the fragment allocation * at all. Since 0 and -1 is a special values here, we need our own handler, * instead of sysctl_handle_uma_zone_max(). */ static int sysctl_maxfragpackets(SYSCTL_HANDLER_ARGS) { int error, max; if (V_noreass == 0) { max = uma_zone_get_max(V_ipq_zone); if (max == 0) max = -1; } else max = 0; error = sysctl_handle_int(oidp, &max, 0, req); if (error || !req->newptr) return (error); if (max > 0) { /* * XXXRW: Might be a good idea to sanity check the argument * and place an extreme upper bound. */ max = uma_zone_set_max(V_ipq_zone, max); ipreass_drain_tomax(); V_noreass = 0; } else if (max == 0) { V_noreass = 1; ipreass_drain(); } else if (max == -1) { V_noreass = 0; uma_zone_set_max(V_ipq_zone, 0); } else return (EINVAL); return (0); } /* * Seek for old fragment queue header that can be reused. Try to * reuse a header from currently locked hash bucket. */ static struct ipq * ipq_reuse(int start) { struct ipq *fp; int i; IPQ_LOCK_ASSERT(start); for (i = start;; i++) { if (i == IPREASS_NHASH) i = 0; if (i != start && IPQ_TRYLOCK(i) == 0) continue; fp = TAILQ_LAST(&V_ipq[i].head, ipqhead); if (fp) { struct mbuf *m; IPSTAT_ADD(ips_fragtimeout, fp->ipq_nfrags); while (fp->ipq_frags) { m = fp->ipq_frags; fp->ipq_frags = m->m_nextpkt; m_freem(m); } TAILQ_REMOVE(&V_ipq[i].head, fp, ipq_list); if (i != start) IPQ_UNLOCK(i); IPQ_LOCK_ASSERT(start); return (fp); } if (i != start) IPQ_UNLOCK(i); } } /* * Free a fragment reassembly header and all associated datagrams. */ static void ipq_free(struct ipqhead *fhp, struct ipq *fp) { struct mbuf *q; while (fp->ipq_frags) { q = fp->ipq_frags; fp->ipq_frags = q->m_nextpkt; m_freem(q); } TAILQ_REMOVE(fhp, fp, ipq_list); uma_zfree(V_ipq_zone, fp); } Index: user/ngie/more-tests/sys/netpfil/pf/pf.c =================================================================== --- user/ngie/more-tests/sys/netpfil/pf/pf.c (revision 281584) +++ user/ngie/more-tests/sys/netpfil/pf/pf.c (revision 281585) @@ -1,6444 +1,6444 @@ /*- * Copyright (c) 2001 Daniel Hartmeier * Copyright (c) 2002 - 2008 Henning Brauer * Copyright (c) 2012 Gleb Smirnoff * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * * - Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials provided * with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS * FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE * COPYRIGHT HOLDERS OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, * BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER * CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN * ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE * POSSIBILITY OF SUCH DAMAGE. * * Effort sponsored in part by the Defense Advanced Research Projects * Agency (DARPA) and Air Force Research Laboratory, Air Force * Materiel Command, USAF, under agreement number F30602-01-2-0537. * * $OpenBSD: pf.c,v 1.634 2009/02/27 12:37:45 henning Exp $ */ #include __FBSDID("$FreeBSD$"); #include "opt_inet.h" #include "opt_inet6.h" #include "opt_bpf.h" #include "opt_pf.h" #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include /* XXX: only for DIR_IN/DIR_OUT */ #ifdef INET6 #include #include #include #include #include #endif /* INET6 */ #include #include #define DPFPRINTF(n, x) if (V_pf_status.debug >= (n)) printf x /* * Global variables */ /* state tables */ VNET_DEFINE(struct pf_altqqueue, pf_altqs[2]); VNET_DEFINE(struct pf_palist, pf_pabuf); VNET_DEFINE(struct pf_altqqueue *, pf_altqs_active); VNET_DEFINE(struct pf_altqqueue *, pf_altqs_inactive); VNET_DEFINE(struct pf_kstatus, pf_status); VNET_DEFINE(u_int32_t, ticket_altqs_active); VNET_DEFINE(u_int32_t, ticket_altqs_inactive); VNET_DEFINE(int, altqs_inactive_open); VNET_DEFINE(u_int32_t, ticket_pabuf); VNET_DEFINE(MD5_CTX, pf_tcp_secret_ctx); #define V_pf_tcp_secret_ctx VNET(pf_tcp_secret_ctx) VNET_DEFINE(u_char, pf_tcp_secret[16]); #define V_pf_tcp_secret VNET(pf_tcp_secret) VNET_DEFINE(int, pf_tcp_secret_init); #define V_pf_tcp_secret_init VNET(pf_tcp_secret_init) VNET_DEFINE(int, pf_tcp_iss_off); #define V_pf_tcp_iss_off VNET(pf_tcp_iss_off) /* * Queue for pf_intr() sends. */ static MALLOC_DEFINE(M_PFTEMP, "pf_temp", "pf(4) temporary allocations"); struct pf_send_entry { STAILQ_ENTRY(pf_send_entry) pfse_next; struct mbuf *pfse_m; enum { PFSE_IP, PFSE_IP6, PFSE_ICMP, PFSE_ICMP6, } pfse_type; struct { int type; int code; int mtu; } icmpopts; }; STAILQ_HEAD(pf_send_head, pf_send_entry); static VNET_DEFINE(struct pf_send_head, pf_sendqueue); #define V_pf_sendqueue VNET(pf_sendqueue) static struct mtx pf_sendqueue_mtx; #define PF_SENDQ_LOCK() mtx_lock(&pf_sendqueue_mtx) #define PF_SENDQ_UNLOCK() mtx_unlock(&pf_sendqueue_mtx) /* * Queue for pf_overload_task() tasks. */ struct pf_overload_entry { SLIST_ENTRY(pf_overload_entry) next; struct pf_addr addr; sa_family_t af; uint8_t dir; struct pf_rule *rule; }; SLIST_HEAD(pf_overload_head, pf_overload_entry); static VNET_DEFINE(struct pf_overload_head, pf_overloadqueue); #define V_pf_overloadqueue VNET(pf_overloadqueue) static VNET_DEFINE(struct task, pf_overloadtask); #define V_pf_overloadtask VNET(pf_overloadtask) static struct mtx pf_overloadqueue_mtx; #define PF_OVERLOADQ_LOCK() mtx_lock(&pf_overloadqueue_mtx) #define PF_OVERLOADQ_UNLOCK() mtx_unlock(&pf_overloadqueue_mtx) VNET_DEFINE(struct pf_rulequeue, pf_unlinked_rules); struct mtx pf_unlnkdrules_mtx; static VNET_DEFINE(uma_zone_t, pf_sources_z); #define V_pf_sources_z VNET(pf_sources_z) uma_zone_t pf_mtag_z; VNET_DEFINE(uma_zone_t, pf_state_z); VNET_DEFINE(uma_zone_t, pf_state_key_z); VNET_DEFINE(uint64_t, pf_stateid[MAXCPU]); #define PFID_CPUBITS 8 #define PFID_CPUSHIFT (sizeof(uint64_t) * NBBY - PFID_CPUBITS) #define PFID_CPUMASK ((uint64_t)((1 << PFID_CPUBITS) - 1) << PFID_CPUSHIFT) #define PFID_MAXID (~PFID_CPUMASK) CTASSERT((1 << PFID_CPUBITS) >= MAXCPU); static void pf_src_tree_remove_state(struct pf_state *); static void pf_init_threshold(struct pf_threshold *, u_int32_t, u_int32_t); static void pf_add_threshold(struct pf_threshold *); static int pf_check_threshold(struct pf_threshold *); static void pf_change_ap(struct pf_addr *, u_int16_t *, u_int16_t *, u_int16_t *, struct pf_addr *, u_int16_t, u_int8_t, sa_family_t); static int pf_modulate_sack(struct mbuf *, int, struct pf_pdesc *, struct tcphdr *, struct pf_state_peer *); static void pf_change_icmp(struct pf_addr *, u_int16_t *, struct pf_addr *, struct pf_addr *, u_int16_t, u_int16_t *, u_int16_t *, u_int16_t *, u_int16_t *, u_int8_t, sa_family_t); static void pf_send_tcp(struct mbuf *, const struct pf_rule *, sa_family_t, const struct pf_addr *, const struct pf_addr *, u_int16_t, u_int16_t, u_int32_t, u_int32_t, u_int8_t, u_int16_t, u_int16_t, u_int8_t, int, u_int16_t, struct ifnet *); static void pf_send_icmp(struct mbuf *, u_int8_t, u_int8_t, sa_family_t, struct pf_rule *); static void pf_detach_state(struct pf_state *); static int pf_state_key_attach(struct pf_state_key *, struct pf_state_key *, struct pf_state *); static void pf_state_key_detach(struct pf_state *, int); static int pf_state_key_ctor(void *, int, void *, int); static u_int32_t pf_tcp_iss(struct pf_pdesc *); static int pf_test_rule(struct pf_rule **, struct pf_state **, int, struct pfi_kif *, struct mbuf *, int, struct pf_pdesc *, struct pf_rule **, struct pf_ruleset **, struct inpcb *); static int pf_create_state(struct pf_rule *, struct pf_rule *, struct pf_rule *, struct pf_pdesc *, struct pf_src_node *, struct pf_state_key *, struct pf_state_key *, struct mbuf *, int, u_int16_t, u_int16_t, int *, struct pfi_kif *, struct pf_state **, int, u_int16_t, u_int16_t, int); static int pf_test_fragment(struct pf_rule **, int, struct pfi_kif *, struct mbuf *, void *, struct pf_pdesc *, struct pf_rule **, struct pf_ruleset **); static int pf_tcp_track_full(struct pf_state_peer *, struct pf_state_peer *, struct pf_state **, struct pfi_kif *, struct mbuf *, int, struct pf_pdesc *, u_short *, int *); static int pf_tcp_track_sloppy(struct pf_state_peer *, struct pf_state_peer *, struct pf_state **, struct pf_pdesc *, u_short *); static int pf_test_state_tcp(struct pf_state **, int, struct pfi_kif *, struct mbuf *, int, void *, struct pf_pdesc *, u_short *); static int pf_test_state_udp(struct pf_state **, int, struct pfi_kif *, struct mbuf *, int, void *, struct pf_pdesc *); static int pf_test_state_icmp(struct pf_state **, int, struct pfi_kif *, struct mbuf *, int, void *, struct pf_pdesc *, u_short *); static int pf_test_state_other(struct pf_state **, int, struct pfi_kif *, struct mbuf *, struct pf_pdesc *); static u_int8_t pf_get_wscale(struct mbuf *, int, u_int16_t, sa_family_t); static u_int16_t pf_get_mss(struct mbuf *, int, u_int16_t, sa_family_t); static u_int16_t pf_calc_mss(struct pf_addr *, sa_family_t, int, u_int16_t); static int pf_check_proto_cksum(struct mbuf *, int, int, u_int8_t, sa_family_t); static void pf_print_state_parts(struct pf_state *, struct pf_state_key *, struct pf_state_key *); static int pf_addr_wrap_neq(struct pf_addr_wrap *, struct pf_addr_wrap *); static struct pf_state *pf_find_state(struct pfi_kif *, struct pf_state_key_cmp *, u_int); static int pf_src_connlimit(struct pf_state **); static void pf_overload_task(void *v, int pending); static int pf_insert_src_node(struct pf_src_node **, struct pf_rule *, struct pf_addr *, sa_family_t); static u_int pf_purge_expired_states(u_int, int); static void pf_purge_unlinked_rules(void); static int pf_mtag_uminit(void *, int, int); static void pf_mtag_free(struct m_tag *); #ifdef INET static void pf_route(struct mbuf **, struct pf_rule *, int, struct ifnet *, struct pf_state *, struct pf_pdesc *); #endif /* INET */ #ifdef INET6 static void pf_change_a6(struct pf_addr *, u_int16_t *, struct pf_addr *, u_int8_t); static void pf_route6(struct mbuf **, struct pf_rule *, int, struct ifnet *, struct pf_state *, struct pf_pdesc *); #endif /* INET6 */ int in4_cksum(struct mbuf *m, u_int8_t nxt, int off, int len); VNET_DECLARE(int, pf_end_threads); VNET_DEFINE(struct pf_limit, pf_limits[PF_LIMIT_MAX]); #define PACKET_LOOPED(pd) ((pd)->pf_mtag && \ (pd)->pf_mtag->flags & PF_PACKET_LOOPED) #define STATE_LOOKUP(i, k, d, s, pd) \ do { \ (s) = pf_find_state((i), (k), (d)); \ if ((s) == NULL) \ return (PF_DROP); \ if (PACKET_LOOPED(pd)) \ return (PF_PASS); \ if ((d) == PF_OUT && \ (((s)->rule.ptr->rt == PF_ROUTETO && \ (s)->rule.ptr->direction == PF_OUT) || \ ((s)->rule.ptr->rt == PF_REPLYTO && \ (s)->rule.ptr->direction == PF_IN)) && \ (s)->rt_kif != NULL && \ (s)->rt_kif != (i)) \ return (PF_PASS); \ } while (0) #define BOUND_IFACE(r, k) \ ((r)->rule_flag & PFRULE_IFBOUND) ? (k) : V_pfi_all #define STATE_INC_COUNTERS(s) \ do { \ counter_u64_add(s->rule.ptr->states_cur, 1); \ counter_u64_add(s->rule.ptr->states_tot, 1); \ if (s->anchor.ptr != NULL) { \ counter_u64_add(s->anchor.ptr->states_cur, 1); \ counter_u64_add(s->anchor.ptr->states_tot, 1); \ } \ if (s->nat_rule.ptr != NULL) { \ counter_u64_add(s->nat_rule.ptr->states_cur, 1);\ counter_u64_add(s->nat_rule.ptr->states_tot, 1);\ } \ } while (0) #define STATE_DEC_COUNTERS(s) \ do { \ if (s->nat_rule.ptr != NULL) \ counter_u64_add(s->nat_rule.ptr->states_cur, -1);\ if (s->anchor.ptr != NULL) \ counter_u64_add(s->anchor.ptr->states_cur, -1); \ counter_u64_add(s->rule.ptr->states_cur, -1); \ } while (0) static MALLOC_DEFINE(M_PFHASH, "pf_hash", "pf(4) hash header structures"); VNET_DEFINE(struct pf_keyhash *, pf_keyhash); VNET_DEFINE(struct pf_idhash *, pf_idhash); VNET_DEFINE(struct pf_srchash *, pf_srchash); SYSCTL_NODE(_net, OID_AUTO, pf, CTLFLAG_RW, 0, "pf(4)"); u_long pf_hashmask; u_long pf_srchashmask; static u_long pf_hashsize; static u_long pf_srchashsize; SYSCTL_ULONG(_net_pf, OID_AUTO, states_hashsize, CTLFLAG_RDTUN, &pf_hashsize, 0, "Size of pf(4) states hashtable"); SYSCTL_ULONG(_net_pf, OID_AUTO, source_nodes_hashsize, CTLFLAG_RDTUN, &pf_srchashsize, 0, "Size of pf(4) source nodes hashtable"); VNET_DEFINE(void *, pf_swi_cookie); VNET_DEFINE(uint32_t, pf_hashseed); #define V_pf_hashseed VNET(pf_hashseed) int pf_addr_cmp(struct pf_addr *a, struct pf_addr *b, sa_family_t af) { switch (af) { #ifdef INET case AF_INET: if (a->addr32[0] > b->addr32[0]) return (1); if (a->addr32[0] < b->addr32[0]) return (-1); break; #endif /* INET */ #ifdef INET6 case AF_INET6: if (a->addr32[3] > b->addr32[3]) return (1); if (a->addr32[3] < b->addr32[3]) return (-1); if (a->addr32[2] > b->addr32[2]) return (1); if (a->addr32[2] < b->addr32[2]) return (-1); if (a->addr32[1] > b->addr32[1]) return (1); if (a->addr32[1] < b->addr32[1]) return (-1); if (a->addr32[0] > b->addr32[0]) return (1); if (a->addr32[0] < b->addr32[0]) return (-1); break; #endif /* INET6 */ default: panic("%s: unknown address family %u", __func__, af); } return (0); } static __inline uint32_t pf_hashkey(struct pf_state_key *sk) { uint32_t h; h = murmur3_32_hash32((uint32_t *)sk, sizeof(struct pf_state_key_cmp)/sizeof(uint32_t), V_pf_hashseed); return (h & pf_hashmask); } static __inline uint32_t pf_hashsrc(struct pf_addr *addr, sa_family_t af) { uint32_t h; switch (af) { case AF_INET: h = murmur3_32_hash32((uint32_t *)&addr->v4, sizeof(addr->v4)/sizeof(uint32_t), V_pf_hashseed); break; case AF_INET6: h = murmur3_32_hash32((uint32_t *)&addr->v6, sizeof(addr->v6)/sizeof(uint32_t), V_pf_hashseed); break; default: panic("%s: unknown address family %u", __func__, af); } return (h & pf_srchashmask); } #ifdef INET6 void pf_addrcpy(struct pf_addr *dst, struct pf_addr *src, sa_family_t af) { switch (af) { #ifdef INET case AF_INET: dst->addr32[0] = src->addr32[0]; break; #endif /* INET */ case AF_INET6: dst->addr32[0] = src->addr32[0]; dst->addr32[1] = src->addr32[1]; dst->addr32[2] = src->addr32[2]; dst->addr32[3] = src->addr32[3]; break; } } #endif /* INET6 */ static void pf_init_threshold(struct pf_threshold *threshold, u_int32_t limit, u_int32_t seconds) { threshold->limit = limit * PF_THRESHOLD_MULT; threshold->seconds = seconds; threshold->count = 0; threshold->last = time_uptime; } static void pf_add_threshold(struct pf_threshold *threshold) { u_int32_t t = time_uptime, diff = t - threshold->last; if (diff >= threshold->seconds) threshold->count = 0; else threshold->count -= threshold->count * diff / threshold->seconds; threshold->count += PF_THRESHOLD_MULT; threshold->last = t; } static int pf_check_threshold(struct pf_threshold *threshold) { return (threshold->count > threshold->limit); } static int pf_src_connlimit(struct pf_state **state) { struct pf_overload_entry *pfoe; int bad = 0; PF_STATE_LOCK_ASSERT(*state); (*state)->src_node->conn++; (*state)->src.tcp_est = 1; pf_add_threshold(&(*state)->src_node->conn_rate); if ((*state)->rule.ptr->max_src_conn && (*state)->rule.ptr->max_src_conn < (*state)->src_node->conn) { counter_u64_add(V_pf_status.lcounters[LCNT_SRCCONN], 1); bad++; } if ((*state)->rule.ptr->max_src_conn_rate.limit && pf_check_threshold(&(*state)->src_node->conn_rate)) { counter_u64_add(V_pf_status.lcounters[LCNT_SRCCONNRATE], 1); bad++; } if (!bad) return (0); /* Kill this state. */ (*state)->timeout = PFTM_PURGE; (*state)->src.state = (*state)->dst.state = TCPS_CLOSED; if ((*state)->rule.ptr->overload_tbl == NULL) return (1); /* Schedule overloading and flushing task. */ pfoe = malloc(sizeof(*pfoe), M_PFTEMP, M_NOWAIT); if (pfoe == NULL) return (1); /* too bad :( */ bcopy(&(*state)->src_node->addr, &pfoe->addr, sizeof(pfoe->addr)); pfoe->af = (*state)->key[PF_SK_WIRE]->af; pfoe->rule = (*state)->rule.ptr; pfoe->dir = (*state)->direction; PF_OVERLOADQ_LOCK(); SLIST_INSERT_HEAD(&V_pf_overloadqueue, pfoe, next); PF_OVERLOADQ_UNLOCK(); taskqueue_enqueue(taskqueue_swi, &V_pf_overloadtask); return (1); } static void pf_overload_task(void *v, int pending) { struct pf_overload_head queue; struct pfr_addr p; struct pf_overload_entry *pfoe, *pfoe1; uint32_t killed = 0; CURVNET_SET((struct vnet *)v); PF_OVERLOADQ_LOCK(); queue = V_pf_overloadqueue; SLIST_INIT(&V_pf_overloadqueue); PF_OVERLOADQ_UNLOCK(); bzero(&p, sizeof(p)); SLIST_FOREACH(pfoe, &queue, next) { counter_u64_add(V_pf_status.lcounters[LCNT_OVERLOAD_TABLE], 1); if (V_pf_status.debug >= PF_DEBUG_MISC) { printf("%s: blocking address ", __func__); pf_print_host(&pfoe->addr, 0, pfoe->af); printf("\n"); } p.pfra_af = pfoe->af; switch (pfoe->af) { #ifdef INET case AF_INET: p.pfra_net = 32; p.pfra_ip4addr = pfoe->addr.v4; break; #endif #ifdef INET6 case AF_INET6: p.pfra_net = 128; p.pfra_ip6addr = pfoe->addr.v6; break; #endif } PF_RULES_WLOCK(); pfr_insert_kentry(pfoe->rule->overload_tbl, &p, time_second); PF_RULES_WUNLOCK(); } /* * Remove those entries, that don't need flushing. */ SLIST_FOREACH_SAFE(pfoe, &queue, next, pfoe1) if (pfoe->rule->flush == 0) { SLIST_REMOVE(&queue, pfoe, pf_overload_entry, next); free(pfoe, M_PFTEMP); } else counter_u64_add( V_pf_status.lcounters[LCNT_OVERLOAD_FLUSH], 1); /* If nothing to flush, return. */ if (SLIST_EMPTY(&queue)) { CURVNET_RESTORE(); return; } for (int i = 0; i <= pf_hashmask; i++) { struct pf_idhash *ih = &V_pf_idhash[i]; struct pf_state_key *sk; struct pf_state *s; PF_HASHROW_LOCK(ih); LIST_FOREACH(s, &ih->states, entry) { sk = s->key[PF_SK_WIRE]; SLIST_FOREACH(pfoe, &queue, next) if (sk->af == pfoe->af && ((pfoe->rule->flush & PF_FLUSH_GLOBAL) || pfoe->rule == s->rule.ptr) && ((pfoe->dir == PF_OUT && PF_AEQ(&pfoe->addr, &sk->addr[1], sk->af)) || (pfoe->dir == PF_IN && PF_AEQ(&pfoe->addr, &sk->addr[0], sk->af)))) { s->timeout = PFTM_PURGE; s->src.state = s->dst.state = TCPS_CLOSED; killed++; } } PF_HASHROW_UNLOCK(ih); } SLIST_FOREACH_SAFE(pfoe, &queue, next, pfoe1) free(pfoe, M_PFTEMP); if (V_pf_status.debug >= PF_DEBUG_MISC) printf("%s: %u states killed", __func__, killed); CURVNET_RESTORE(); } /* * Can return locked on failure, so that we can consistently * allocate and insert a new one. */ struct pf_src_node * pf_find_src_node(struct pf_addr *src, struct pf_rule *rule, sa_family_t af, int returnlocked) { struct pf_srchash *sh; struct pf_src_node *n; counter_u64_add(V_pf_status.scounters[SCNT_SRC_NODE_SEARCH], 1); sh = &V_pf_srchash[pf_hashsrc(src, af)]; PF_HASHROW_LOCK(sh); LIST_FOREACH(n, &sh->nodes, entry) if (n->rule.ptr == rule && n->af == af && ((af == AF_INET && n->addr.v4.s_addr == src->v4.s_addr) || (af == AF_INET6 && bcmp(&n->addr, src, sizeof(*src)) == 0))) break; if (n != NULL) { n->states++; PF_HASHROW_UNLOCK(sh); } else if (returnlocked == 0) PF_HASHROW_UNLOCK(sh); return (n); } static int pf_insert_src_node(struct pf_src_node **sn, struct pf_rule *rule, struct pf_addr *src, sa_family_t af) { KASSERT((rule->rule_flag & PFRULE_RULESRCTRACK || rule->rpool.opts & PF_POOL_STICKYADDR), ("%s for non-tracking rule %p", __func__, rule)); if (*sn == NULL) *sn = pf_find_src_node(src, rule, af, 1); if (*sn == NULL) { struct pf_srchash *sh = &V_pf_srchash[pf_hashsrc(src, af)]; PF_HASHROW_ASSERT(sh); if (!rule->max_src_nodes || counter_u64_fetch(rule->src_nodes) < rule->max_src_nodes) (*sn) = uma_zalloc(V_pf_sources_z, M_NOWAIT | M_ZERO); else counter_u64_add(V_pf_status.lcounters[LCNT_SRCNODES], 1); if ((*sn) == NULL) { PF_HASHROW_UNLOCK(sh); return (-1); } pf_init_threshold(&(*sn)->conn_rate, rule->max_src_conn_rate.limit, rule->max_src_conn_rate.seconds); (*sn)->af = af; (*sn)->rule.ptr = rule; PF_ACPY(&(*sn)->addr, src, af); LIST_INSERT_HEAD(&sh->nodes, *sn, entry); (*sn)->creation = time_uptime; (*sn)->ruletype = rule->action; (*sn)->states = 1; if ((*sn)->rule.ptr != NULL) counter_u64_add((*sn)->rule.ptr->src_nodes, 1); PF_HASHROW_UNLOCK(sh); counter_u64_add(V_pf_status.scounters[SCNT_SRC_NODE_INSERT], 1); } else { if (rule->max_src_states && (*sn)->states >= rule->max_src_states) { counter_u64_add(V_pf_status.lcounters[LCNT_SRCSTATES], 1); return (-1); } } return (0); } void pf_unlink_src_node(struct pf_src_node *src) { PF_HASHROW_ASSERT(&V_pf_srchash[pf_hashsrc(&src->addr, src->af)]); LIST_REMOVE(src, entry); if (src->rule.ptr) counter_u64_add(src->rule.ptr->src_nodes, -1); } u_int pf_free_src_nodes(struct pf_src_node_list *head) { struct pf_src_node *sn, *tmp; u_int count = 0; LIST_FOREACH_SAFE(sn, head, entry, tmp) { uma_zfree(V_pf_sources_z, sn); count++; } counter_u64_add(V_pf_status.scounters[SCNT_SRC_NODE_REMOVALS], count); return (count); } void pf_mtag_initialize() { pf_mtag_z = uma_zcreate("pf mtags", sizeof(struct m_tag) + sizeof(struct pf_mtag), NULL, NULL, pf_mtag_uminit, NULL, UMA_ALIGN_PTR, 0); } /* Per-vnet data storage structures initialization. */ void pf_initialize() { struct pf_keyhash *kh; struct pf_idhash *ih; struct pf_srchash *sh; u_int i; if (pf_hashsize == 0 || !powerof2(pf_hashsize)) pf_hashsize = PF_HASHSIZ; if (pf_srchashsize == 0 || !powerof2(pf_srchashsize)) pf_srchashsize = PF_HASHSIZ / 4; V_pf_hashseed = arc4random(); /* States and state keys storage. */ V_pf_state_z = uma_zcreate("pf states", sizeof(struct pf_state), NULL, NULL, NULL, NULL, UMA_ALIGN_PTR, 0); V_pf_limits[PF_LIMIT_STATES].zone = V_pf_state_z; uma_zone_set_max(V_pf_state_z, PFSTATE_HIWAT); uma_zone_set_warning(V_pf_state_z, "PF states limit reached"); V_pf_state_key_z = uma_zcreate("pf state keys", sizeof(struct pf_state_key), pf_state_key_ctor, NULL, NULL, NULL, UMA_ALIGN_PTR, 0); V_pf_keyhash = malloc(pf_hashsize * sizeof(struct pf_keyhash), M_PFHASH, M_WAITOK | M_ZERO); V_pf_idhash = malloc(pf_hashsize * sizeof(struct pf_idhash), M_PFHASH, M_WAITOK | M_ZERO); pf_hashmask = pf_hashsize - 1; for (i = 0, kh = V_pf_keyhash, ih = V_pf_idhash; i <= pf_hashmask; i++, kh++, ih++) { mtx_init(&kh->lock, "pf_keyhash", NULL, MTX_DEF | MTX_DUPOK); mtx_init(&ih->lock, "pf_idhash", NULL, MTX_DEF); } /* Source nodes. */ V_pf_sources_z = uma_zcreate("pf source nodes", sizeof(struct pf_src_node), NULL, NULL, NULL, NULL, UMA_ALIGN_PTR, 0); V_pf_limits[PF_LIMIT_SRC_NODES].zone = V_pf_sources_z; uma_zone_set_max(V_pf_sources_z, PFSNODE_HIWAT); uma_zone_set_warning(V_pf_sources_z, "PF source nodes limit reached"); V_pf_srchash = malloc(pf_srchashsize * sizeof(struct pf_srchash), M_PFHASH, M_WAITOK|M_ZERO); pf_srchashmask = pf_srchashsize - 1; for (i = 0, sh = V_pf_srchash; i <= pf_srchashmask; i++, sh++) mtx_init(&sh->lock, "pf_srchash", NULL, MTX_DEF); /* ALTQ */ TAILQ_INIT(&V_pf_altqs[0]); TAILQ_INIT(&V_pf_altqs[1]); TAILQ_INIT(&V_pf_pabuf); V_pf_altqs_active = &V_pf_altqs[0]; V_pf_altqs_inactive = &V_pf_altqs[1]; /* Send & overload+flush queues. */ STAILQ_INIT(&V_pf_sendqueue); SLIST_INIT(&V_pf_overloadqueue); TASK_INIT(&V_pf_overloadtask, 0, pf_overload_task, curvnet); mtx_init(&pf_sendqueue_mtx, "pf send queue", NULL, MTX_DEF); mtx_init(&pf_overloadqueue_mtx, "pf overload/flush queue", NULL, MTX_DEF); /* Unlinked, but may be referenced rules. */ TAILQ_INIT(&V_pf_unlinked_rules); mtx_init(&pf_unlnkdrules_mtx, "pf unlinked rules", NULL, MTX_DEF); } void pf_mtag_cleanup() { uma_zdestroy(pf_mtag_z); } void pf_cleanup() { struct pf_keyhash *kh; struct pf_idhash *ih; struct pf_srchash *sh; struct pf_send_entry *pfse, *next; u_int i; for (i = 0, kh = V_pf_keyhash, ih = V_pf_idhash; i <= pf_hashmask; i++, kh++, ih++) { KASSERT(LIST_EMPTY(&kh->keys), ("%s: key hash not empty", __func__)); KASSERT(LIST_EMPTY(&ih->states), ("%s: id hash not empty", __func__)); mtx_destroy(&kh->lock); mtx_destroy(&ih->lock); } free(V_pf_keyhash, M_PFHASH); free(V_pf_idhash, M_PFHASH); for (i = 0, sh = V_pf_srchash; i <= pf_srchashmask; i++, sh++) { KASSERT(LIST_EMPTY(&sh->nodes), ("%s: source node hash not empty", __func__)); mtx_destroy(&sh->lock); } free(V_pf_srchash, M_PFHASH); STAILQ_FOREACH_SAFE(pfse, &V_pf_sendqueue, pfse_next, next) { m_freem(pfse->pfse_m); free(pfse, M_PFTEMP); } mtx_destroy(&pf_sendqueue_mtx); mtx_destroy(&pf_overloadqueue_mtx); mtx_destroy(&pf_unlnkdrules_mtx); uma_zdestroy(V_pf_sources_z); uma_zdestroy(V_pf_state_z); uma_zdestroy(V_pf_state_key_z); } static int pf_mtag_uminit(void *mem, int size, int how) { struct m_tag *t; t = (struct m_tag *)mem; t->m_tag_cookie = MTAG_ABI_COMPAT; t->m_tag_id = PACKET_TAG_PF; t->m_tag_len = sizeof(struct pf_mtag); t->m_tag_free = pf_mtag_free; return (0); } static void pf_mtag_free(struct m_tag *t) { uma_zfree(pf_mtag_z, t); } struct pf_mtag * pf_get_mtag(struct mbuf *m) { struct m_tag *mtag; if ((mtag = m_tag_find(m, PACKET_TAG_PF, NULL)) != NULL) return ((struct pf_mtag *)(mtag + 1)); mtag = uma_zalloc(pf_mtag_z, M_NOWAIT); if (mtag == NULL) return (NULL); bzero(mtag + 1, sizeof(struct pf_mtag)); m_tag_prepend(m, mtag); return ((struct pf_mtag *)(mtag + 1)); } static int pf_state_key_attach(struct pf_state_key *skw, struct pf_state_key *sks, struct pf_state *s) { struct pf_keyhash *khs, *khw, *kh; struct pf_state_key *sk, *cur; struct pf_state *si, *olds = NULL; int idx; KASSERT(s->refs == 0, ("%s: state not pristine", __func__)); KASSERT(s->key[PF_SK_WIRE] == NULL, ("%s: state has key", __func__)); KASSERT(s->key[PF_SK_STACK] == NULL, ("%s: state has key", __func__)); /* * We need to lock hash slots of both keys. To avoid deadlock * we always lock the slot with lower address first. Unlock order * isn't important. * * We also need to lock ID hash slot before dropping key * locks. On success we return with ID hash slot locked. */ if (skw == sks) { khs = khw = &V_pf_keyhash[pf_hashkey(skw)]; PF_HASHROW_LOCK(khs); } else { khs = &V_pf_keyhash[pf_hashkey(sks)]; khw = &V_pf_keyhash[pf_hashkey(skw)]; if (khs == khw) { PF_HASHROW_LOCK(khs); } else if (khs < khw) { PF_HASHROW_LOCK(khs); PF_HASHROW_LOCK(khw); } else { PF_HASHROW_LOCK(khw); PF_HASHROW_LOCK(khs); } } #define KEYS_UNLOCK() do { \ if (khs != khw) { \ PF_HASHROW_UNLOCK(khs); \ PF_HASHROW_UNLOCK(khw); \ } else \ PF_HASHROW_UNLOCK(khs); \ } while (0) /* * First run: start with wire key. */ sk = skw; kh = khw; idx = PF_SK_WIRE; keyattach: LIST_FOREACH(cur, &kh->keys, entry) if (bcmp(cur, sk, sizeof(struct pf_state_key_cmp)) == 0) break; if (cur != NULL) { /* Key exists. Check for same kif, if none, add to key. */ TAILQ_FOREACH(si, &cur->states[idx], key_list[idx]) { struct pf_idhash *ih = &V_pf_idhash[PF_IDHASH(si)]; PF_HASHROW_LOCK(ih); if (si->kif == s->kif && si->direction == s->direction) { if (sk->proto == IPPROTO_TCP && si->src.state >= TCPS_FIN_WAIT_2 && si->dst.state >= TCPS_FIN_WAIT_2) { /* * New state matches an old >FIN_WAIT_2 * state. We can't drop key hash locks, * thus we can't unlink it properly. * * As a workaround we drop it into * TCPS_CLOSED state, schedule purge * ASAP and push it into the very end * of the slot TAILQ, so that it won't * conflict with our new state. */ si->src.state = si->dst.state = TCPS_CLOSED; si->timeout = PFTM_PURGE; olds = si; } else { if (V_pf_status.debug >= PF_DEBUG_MISC) { printf("pf: %s key attach " "failed on %s: ", (idx == PF_SK_WIRE) ? "wire" : "stack", s->kif->pfik_name); pf_print_state_parts(s, (idx == PF_SK_WIRE) ? sk : NULL, (idx == PF_SK_STACK) ? sk : NULL); printf(", existing: "); pf_print_state_parts(si, (idx == PF_SK_WIRE) ? sk : NULL, (idx == PF_SK_STACK) ? sk : NULL); printf("\n"); } PF_HASHROW_UNLOCK(ih); KEYS_UNLOCK(); uma_zfree(V_pf_state_key_z, sk); if (idx == PF_SK_STACK) pf_detach_state(s); return (EEXIST); /* collision! */ } } PF_HASHROW_UNLOCK(ih); } uma_zfree(V_pf_state_key_z, sk); s->key[idx] = cur; } else { LIST_INSERT_HEAD(&kh->keys, sk, entry); s->key[idx] = sk; } stateattach: /* List is sorted, if-bound states before floating. */ if (s->kif == V_pfi_all) TAILQ_INSERT_TAIL(&s->key[idx]->states[idx], s, key_list[idx]); else TAILQ_INSERT_HEAD(&s->key[idx]->states[idx], s, key_list[idx]); if (olds) { TAILQ_REMOVE(&s->key[idx]->states[idx], olds, key_list[idx]); TAILQ_INSERT_TAIL(&s->key[idx]->states[idx], olds, key_list[idx]); olds = NULL; } /* * Attach done. See how should we (or should not?) * attach a second key. */ if (sks == skw) { s->key[PF_SK_STACK] = s->key[PF_SK_WIRE]; idx = PF_SK_STACK; sks = NULL; goto stateattach; } else if (sks != NULL) { /* * Continue attaching with stack key. */ sk = sks; kh = khs; idx = PF_SK_STACK; sks = NULL; goto keyattach; } PF_STATE_LOCK(s); KEYS_UNLOCK(); KASSERT(s->key[PF_SK_WIRE] != NULL && s->key[PF_SK_STACK] != NULL, ("%s failure", __func__)); return (0); #undef KEYS_UNLOCK } static void pf_detach_state(struct pf_state *s) { struct pf_state_key *sks = s->key[PF_SK_STACK]; struct pf_keyhash *kh; if (sks != NULL) { kh = &V_pf_keyhash[pf_hashkey(sks)]; PF_HASHROW_LOCK(kh); if (s->key[PF_SK_STACK] != NULL) pf_state_key_detach(s, PF_SK_STACK); /* * If both point to same key, then we are done. */ if (sks == s->key[PF_SK_WIRE]) { pf_state_key_detach(s, PF_SK_WIRE); PF_HASHROW_UNLOCK(kh); return; } PF_HASHROW_UNLOCK(kh); } if (s->key[PF_SK_WIRE] != NULL) { kh = &V_pf_keyhash[pf_hashkey(s->key[PF_SK_WIRE])]; PF_HASHROW_LOCK(kh); if (s->key[PF_SK_WIRE] != NULL) pf_state_key_detach(s, PF_SK_WIRE); PF_HASHROW_UNLOCK(kh); } } static void pf_state_key_detach(struct pf_state *s, int idx) { struct pf_state_key *sk = s->key[idx]; #ifdef INVARIANTS struct pf_keyhash *kh = &V_pf_keyhash[pf_hashkey(sk)]; PF_HASHROW_ASSERT(kh); #endif TAILQ_REMOVE(&sk->states[idx], s, key_list[idx]); s->key[idx] = NULL; if (TAILQ_EMPTY(&sk->states[0]) && TAILQ_EMPTY(&sk->states[1])) { LIST_REMOVE(sk, entry); uma_zfree(V_pf_state_key_z, sk); } } static int pf_state_key_ctor(void *mem, int size, void *arg, int flags) { struct pf_state_key *sk = mem; bzero(sk, sizeof(struct pf_state_key_cmp)); TAILQ_INIT(&sk->states[PF_SK_WIRE]); TAILQ_INIT(&sk->states[PF_SK_STACK]); return (0); } struct pf_state_key * pf_state_key_setup(struct pf_pdesc *pd, struct pf_addr *saddr, struct pf_addr *daddr, u_int16_t sport, u_int16_t dport) { struct pf_state_key *sk; sk = uma_zalloc(V_pf_state_key_z, M_NOWAIT); if (sk == NULL) return (NULL); PF_ACPY(&sk->addr[pd->sidx], saddr, pd->af); PF_ACPY(&sk->addr[pd->didx], daddr, pd->af); sk->port[pd->sidx] = sport; sk->port[pd->didx] = dport; sk->proto = pd->proto; sk->af = pd->af; return (sk); } struct pf_state_key * pf_state_key_clone(struct pf_state_key *orig) { struct pf_state_key *sk; sk = uma_zalloc(V_pf_state_key_z, M_NOWAIT); if (sk == NULL) return (NULL); bcopy(orig, sk, sizeof(struct pf_state_key_cmp)); return (sk); } int pf_state_insert(struct pfi_kif *kif, struct pf_state_key *skw, struct pf_state_key *sks, struct pf_state *s) { struct pf_idhash *ih; struct pf_state *cur; int error; KASSERT(TAILQ_EMPTY(&sks->states[0]) && TAILQ_EMPTY(&sks->states[1]), ("%s: sks not pristine", __func__)); KASSERT(TAILQ_EMPTY(&skw->states[0]) && TAILQ_EMPTY(&skw->states[1]), ("%s: skw not pristine", __func__)); KASSERT(s->refs == 0, ("%s: state not pristine", __func__)); s->kif = kif; if (s->id == 0 && s->creatorid == 0) { /* XXX: should be atomic, but probability of collision low */ if ((s->id = V_pf_stateid[curcpu]++) == PFID_MAXID) V_pf_stateid[curcpu] = 1; s->id |= (uint64_t )curcpu << PFID_CPUSHIFT; s->id = htobe64(s->id); s->creatorid = V_pf_status.hostid; } /* Returns with ID locked on success. */ if ((error = pf_state_key_attach(skw, sks, s)) != 0) return (error); ih = &V_pf_idhash[PF_IDHASH(s)]; PF_HASHROW_ASSERT(ih); LIST_FOREACH(cur, &ih->states, entry) if (cur->id == s->id && cur->creatorid == s->creatorid) break; if (cur != NULL) { PF_HASHROW_UNLOCK(ih); if (V_pf_status.debug >= PF_DEBUG_MISC) { printf("pf: state ID collision: " "id: %016llx creatorid: %08x\n", (unsigned long long)be64toh(s->id), ntohl(s->creatorid)); } pf_detach_state(s); return (EEXIST); } LIST_INSERT_HEAD(&ih->states, s, entry); /* One for keys, one for ID hash. */ refcount_init(&s->refs, 2); counter_u64_add(V_pf_status.fcounters[FCNT_STATE_INSERT], 1); if (pfsync_insert_state_ptr != NULL) pfsync_insert_state_ptr(s); /* Returns locked. */ return (0); } /* * Find state by ID: returns with locked row on success. */ struct pf_state * pf_find_state_byid(uint64_t id, uint32_t creatorid) { struct pf_idhash *ih; struct pf_state *s; counter_u64_add(V_pf_status.fcounters[FCNT_STATE_SEARCH], 1); ih = &V_pf_idhash[(be64toh(id) % (pf_hashmask + 1))]; PF_HASHROW_LOCK(ih); LIST_FOREACH(s, &ih->states, entry) if (s->id == id && s->creatorid == creatorid) break; if (s == NULL) PF_HASHROW_UNLOCK(ih); return (s); } /* * Find state by key. * Returns with ID hash slot locked on success. */ static struct pf_state * pf_find_state(struct pfi_kif *kif, struct pf_state_key_cmp *key, u_int dir) { struct pf_keyhash *kh; struct pf_state_key *sk; struct pf_state *s; int idx; counter_u64_add(V_pf_status.fcounters[FCNT_STATE_SEARCH], 1); kh = &V_pf_keyhash[pf_hashkey((struct pf_state_key *)key)]; PF_HASHROW_LOCK(kh); LIST_FOREACH(sk, &kh->keys, entry) if (bcmp(sk, key, sizeof(struct pf_state_key_cmp)) == 0) break; if (sk == NULL) { PF_HASHROW_UNLOCK(kh); return (NULL); } idx = (dir == PF_IN ? PF_SK_WIRE : PF_SK_STACK); /* List is sorted, if-bound states before floating ones. */ TAILQ_FOREACH(s, &sk->states[idx], key_list[idx]) if (s->kif == V_pfi_all || s->kif == kif) { PF_STATE_LOCK(s); PF_HASHROW_UNLOCK(kh); if (s->timeout >= PFTM_MAX) { /* * State is either being processed by * pf_unlink_state() in an other thread, or * is scheduled for immediate expiry. */ PF_STATE_UNLOCK(s); return (NULL); } return (s); } PF_HASHROW_UNLOCK(kh); return (NULL); } struct pf_state * pf_find_state_all(struct pf_state_key_cmp *key, u_int dir, int *more) { struct pf_keyhash *kh; struct pf_state_key *sk; struct pf_state *s, *ret = NULL; int idx, inout = 0; counter_u64_add(V_pf_status.fcounters[FCNT_STATE_SEARCH], 1); kh = &V_pf_keyhash[pf_hashkey((struct pf_state_key *)key)]; PF_HASHROW_LOCK(kh); LIST_FOREACH(sk, &kh->keys, entry) if (bcmp(sk, key, sizeof(struct pf_state_key_cmp)) == 0) break; if (sk == NULL) { PF_HASHROW_UNLOCK(kh); return (NULL); } switch (dir) { case PF_IN: idx = PF_SK_WIRE; break; case PF_OUT: idx = PF_SK_STACK; break; case PF_INOUT: idx = PF_SK_WIRE; inout = 1; break; default: panic("%s: dir %u", __func__, dir); } second_run: TAILQ_FOREACH(s, &sk->states[idx], key_list[idx]) { if (more == NULL) { PF_HASHROW_UNLOCK(kh); return (s); } if (ret) (*more)++; else ret = s; } if (inout == 1) { inout = 0; idx = PF_SK_STACK; goto second_run; } PF_HASHROW_UNLOCK(kh); return (ret); } /* END state table stuff */ static void pf_send(struct pf_send_entry *pfse) { PF_SENDQ_LOCK(); STAILQ_INSERT_TAIL(&V_pf_sendqueue, pfse, pfse_next); PF_SENDQ_UNLOCK(); swi_sched(V_pf_swi_cookie, 0); } void pf_intr(void *v) { struct pf_send_head queue; struct pf_send_entry *pfse, *next; CURVNET_SET((struct vnet *)v); PF_SENDQ_LOCK(); queue = V_pf_sendqueue; STAILQ_INIT(&V_pf_sendqueue); PF_SENDQ_UNLOCK(); STAILQ_FOREACH_SAFE(pfse, &queue, pfse_next, next) { switch (pfse->pfse_type) { #ifdef INET case PFSE_IP: ip_output(pfse->pfse_m, NULL, NULL, 0, NULL, NULL); break; case PFSE_ICMP: icmp_error(pfse->pfse_m, pfse->icmpopts.type, pfse->icmpopts.code, 0, pfse->icmpopts.mtu); break; #endif /* INET */ #ifdef INET6 case PFSE_IP6: ip6_output(pfse->pfse_m, NULL, NULL, 0, NULL, NULL, NULL); break; case PFSE_ICMP6: icmp6_error(pfse->pfse_m, pfse->icmpopts.type, pfse->icmpopts.code, pfse->icmpopts.mtu); break; #endif /* INET6 */ default: panic("%s: unknown type", __func__); } free(pfse, M_PFTEMP); } CURVNET_RESTORE(); } void pf_purge_thread(void *v) { u_int idx = 0; CURVNET_SET((struct vnet *)v); for (;;) { PF_RULES_RLOCK(); rw_sleep(pf_purge_thread, &pf_rules_lock, 0, "pftm", hz / 10); if (V_pf_end_threads) { /* * To cleanse up all kifs and rules we need * two runs: first one clears reference flags, * then pf_purge_expired_states() doesn't * raise them, and then second run frees. */ PF_RULES_RUNLOCK(); pf_purge_unlinked_rules(); pfi_kif_purge(); /* * Now purge everything. */ pf_purge_expired_states(0, pf_hashmask); pf_purge_expired_fragments(); pf_purge_expired_src_nodes(); /* * Now all kifs & rules should be unreferenced, * thus should be successfully freed. */ pf_purge_unlinked_rules(); pfi_kif_purge(); /* * Announce success and exit. */ PF_RULES_RLOCK(); V_pf_end_threads++; PF_RULES_RUNLOCK(); wakeup(pf_purge_thread); kproc_exit(0); } PF_RULES_RUNLOCK(); /* Process 1/interval fraction of the state table every run. */ idx = pf_purge_expired_states(idx, pf_hashmask / (V_pf_default_rule.timeout[PFTM_INTERVAL] * 10)); /* Purge other expired types every PFTM_INTERVAL seconds. */ if (idx == 0) { /* * Order is important: * - states and src nodes reference rules * - states and rules reference kifs */ pf_purge_expired_fragments(); pf_purge_expired_src_nodes(); pf_purge_unlinked_rules(); pfi_kif_purge(); } } /* not reached */ CURVNET_RESTORE(); } u_int32_t pf_state_expires(const struct pf_state *state) { u_int32_t timeout; u_int32_t start; u_int32_t end; u_int32_t states; /* handle all PFTM_* > PFTM_MAX here */ if (state->timeout == PFTM_PURGE) return (time_uptime); KASSERT(state->timeout != PFTM_UNLINKED, ("pf_state_expires: timeout == PFTM_UNLINKED")); KASSERT((state->timeout < PFTM_MAX), ("pf_state_expires: timeout > PFTM_MAX")); timeout = state->rule.ptr->timeout[state->timeout]; if (!timeout) timeout = V_pf_default_rule.timeout[state->timeout]; start = state->rule.ptr->timeout[PFTM_ADAPTIVE_START]; if (start) { end = state->rule.ptr->timeout[PFTM_ADAPTIVE_END]; states = counter_u64_fetch(state->rule.ptr->states_cur); } else { start = V_pf_default_rule.timeout[PFTM_ADAPTIVE_START]; end = V_pf_default_rule.timeout[PFTM_ADAPTIVE_END]; states = V_pf_status.states; } if (end && states > start && start < end) { if (states < end) return (state->expire + timeout * (end - states) / (end - start)); else return (time_uptime); } return (state->expire + timeout); } void pf_purge_expired_src_nodes() { struct pf_src_node_list freelist; struct pf_srchash *sh; struct pf_src_node *cur, *next; int i; LIST_INIT(&freelist); for (i = 0, sh = V_pf_srchash; i <= pf_srchashmask; i++, sh++) { PF_HASHROW_LOCK(sh); LIST_FOREACH_SAFE(cur, &sh->nodes, entry, next) if (cur->states == 0 && cur->expire <= time_uptime) { pf_unlink_src_node(cur); LIST_INSERT_HEAD(&freelist, cur, entry); } else if (cur->rule.ptr != NULL) cur->rule.ptr->rule_flag |= PFRULE_REFS; PF_HASHROW_UNLOCK(sh); } pf_free_src_nodes(&freelist); V_pf_status.src_nodes = uma_zone_get_cur(V_pf_sources_z); } static void pf_src_tree_remove_state(struct pf_state *s) { struct pf_src_node *sn; struct pf_srchash *sh; uint32_t timeout; timeout = s->rule.ptr->timeout[PFTM_SRC_NODE] ? s->rule.ptr->timeout[PFTM_SRC_NODE] : V_pf_default_rule.timeout[PFTM_SRC_NODE]; if (s->src_node != NULL) { sn = s->src_node; sh = &V_pf_srchash[pf_hashsrc(&sn->addr, sn->af)]; PF_HASHROW_LOCK(sh); if (s->src.tcp_est) --sn->conn; if (--sn->states == 0) sn->expire = time_uptime + timeout; PF_HASHROW_UNLOCK(sh); } if (s->nat_src_node != s->src_node && s->nat_src_node != NULL) { sn = s->nat_src_node; sh = &V_pf_srchash[pf_hashsrc(&sn->addr, sn->af)]; PF_HASHROW_LOCK(sh); if (--sn->states == 0) sn->expire = time_uptime + timeout; PF_HASHROW_UNLOCK(sh); } s->src_node = s->nat_src_node = NULL; } /* * Unlink and potentilly free a state. Function may be * called with ID hash row locked, but always returns * unlocked, since it needs to go through key hash locking. */ int pf_unlink_state(struct pf_state *s, u_int flags) { struct pf_idhash *ih = &V_pf_idhash[PF_IDHASH(s)]; if ((flags & PF_ENTER_LOCKED) == 0) PF_HASHROW_LOCK(ih); else PF_HASHROW_ASSERT(ih); if (s->timeout == PFTM_UNLINKED) { /* * State is being processed * by pf_unlink_state() in * an other thread. */ PF_HASHROW_UNLOCK(ih); return (0); /* XXXGL: undefined actually */ } if (s->src.state == PF_TCPS_PROXY_DST) { /* XXX wire key the right one? */ pf_send_tcp(NULL, s->rule.ptr, s->key[PF_SK_WIRE]->af, &s->key[PF_SK_WIRE]->addr[1], &s->key[PF_SK_WIRE]->addr[0], s->key[PF_SK_WIRE]->port[1], s->key[PF_SK_WIRE]->port[0], s->src.seqhi, s->src.seqlo + 1, TH_RST|TH_ACK, 0, 0, 0, 1, s->tag, NULL); } LIST_REMOVE(s, entry); pf_src_tree_remove_state(s); if (pfsync_delete_state_ptr != NULL) pfsync_delete_state_ptr(s); STATE_DEC_COUNTERS(s); s->timeout = PFTM_UNLINKED; PF_HASHROW_UNLOCK(ih); pf_detach_state(s); refcount_release(&s->refs); return (pf_release_state(s)); } void pf_free_state(struct pf_state *cur) { KASSERT(cur->refs == 0, ("%s: %p has refs", __func__, cur)); KASSERT(cur->timeout == PFTM_UNLINKED, ("%s: timeout %u", __func__, cur->timeout)); pf_normalize_tcp_cleanup(cur); uma_zfree(V_pf_state_z, cur); counter_u64_add(V_pf_status.fcounters[FCNT_STATE_REMOVALS], 1); } /* * Called only from pf_purge_thread(), thus serialized. */ static u_int pf_purge_expired_states(u_int i, int maxcheck) { struct pf_idhash *ih; struct pf_state *s; V_pf_status.states = uma_zone_get_cur(V_pf_state_z); /* * Go through hash and unlink states that expire now. */ while (maxcheck > 0) { ih = &V_pf_idhash[i]; relock: PF_HASHROW_LOCK(ih); LIST_FOREACH(s, &ih->states, entry) { if (pf_state_expires(s) <= time_uptime) { V_pf_status.states -= pf_unlink_state(s, PF_ENTER_LOCKED); goto relock; } s->rule.ptr->rule_flag |= PFRULE_REFS; if (s->nat_rule.ptr != NULL) s->nat_rule.ptr->rule_flag |= PFRULE_REFS; if (s->anchor.ptr != NULL) s->anchor.ptr->rule_flag |= PFRULE_REFS; s->kif->pfik_flags |= PFI_IFLAG_REFS; if (s->rt_kif) s->rt_kif->pfik_flags |= PFI_IFLAG_REFS; } PF_HASHROW_UNLOCK(ih); /* Return when we hit end of hash. */ if (++i > pf_hashmask) { V_pf_status.states = uma_zone_get_cur(V_pf_state_z); return (0); } maxcheck--; } V_pf_status.states = uma_zone_get_cur(V_pf_state_z); return (i); } static void pf_purge_unlinked_rules() { struct pf_rulequeue tmpq; struct pf_rule *r, *r1; /* * If we have overloading task pending, then we'd * better skip purging this time. There is a tiny * probability that overloading task references * an already unlinked rule. */ PF_OVERLOADQ_LOCK(); if (!SLIST_EMPTY(&V_pf_overloadqueue)) { PF_OVERLOADQ_UNLOCK(); return; } PF_OVERLOADQ_UNLOCK(); /* * Do naive mark-and-sweep garbage collecting of old rules. * Reference flag is raised by pf_purge_expired_states() * and pf_purge_expired_src_nodes(). * * To avoid LOR between PF_UNLNKDRULES_LOCK/PF_RULES_WLOCK, * use a temporary queue. */ TAILQ_INIT(&tmpq); PF_UNLNKDRULES_LOCK(); TAILQ_FOREACH_SAFE(r, &V_pf_unlinked_rules, entries, r1) { if (!(r->rule_flag & PFRULE_REFS)) { TAILQ_REMOVE(&V_pf_unlinked_rules, r, entries); TAILQ_INSERT_TAIL(&tmpq, r, entries); } else r->rule_flag &= ~PFRULE_REFS; } PF_UNLNKDRULES_UNLOCK(); if (!TAILQ_EMPTY(&tmpq)) { PF_RULES_WLOCK(); TAILQ_FOREACH_SAFE(r, &tmpq, entries, r1) { TAILQ_REMOVE(&tmpq, r, entries); pf_free_rule(r); } PF_RULES_WUNLOCK(); } } void pf_print_host(struct pf_addr *addr, u_int16_t p, sa_family_t af) { switch (af) { #ifdef INET case AF_INET: { u_int32_t a = ntohl(addr->addr32[0]); printf("%u.%u.%u.%u", (a>>24)&255, (a>>16)&255, (a>>8)&255, a&255); if (p) { p = ntohs(p); printf(":%u", p); } break; } #endif /* INET */ #ifdef INET6 case AF_INET6: { u_int16_t b; u_int8_t i, curstart, curend, maxstart, maxend; curstart = curend = maxstart = maxend = 255; for (i = 0; i < 8; i++) { if (!addr->addr16[i]) { if (curstart == 255) curstart = i; curend = i; } else { if ((curend - curstart) > (maxend - maxstart)) { maxstart = curstart; maxend = curend; } curstart = curend = 255; } } if ((curend - curstart) > (maxend - maxstart)) { maxstart = curstart; maxend = curend; } for (i = 0; i < 8; i++) { if (i >= maxstart && i <= maxend) { if (i == 0) printf(":"); if (i == maxend) printf(":"); } else { b = ntohs(addr->addr16[i]); printf("%x", b); if (i < 7) printf(":"); } } if (p) { p = ntohs(p); printf("[%u]", p); } break; } #endif /* INET6 */ } } void pf_print_state(struct pf_state *s) { pf_print_state_parts(s, NULL, NULL); } static void pf_print_state_parts(struct pf_state *s, struct pf_state_key *skwp, struct pf_state_key *sksp) { struct pf_state_key *skw, *sks; u_int8_t proto, dir; /* Do our best to fill these, but they're skipped if NULL */ skw = skwp ? skwp : (s ? s->key[PF_SK_WIRE] : NULL); sks = sksp ? sksp : (s ? s->key[PF_SK_STACK] : NULL); proto = skw ? skw->proto : (sks ? sks->proto : 0); dir = s ? s->direction : 0; switch (proto) { case IPPROTO_IPV4: printf("IPv4"); break; case IPPROTO_IPV6: printf("IPv6"); break; case IPPROTO_TCP: printf("TCP"); break; case IPPROTO_UDP: printf("UDP"); break; case IPPROTO_ICMP: printf("ICMP"); break; case IPPROTO_ICMPV6: printf("ICMPv6"); break; default: printf("%u", skw->proto); break; } switch (dir) { case PF_IN: printf(" in"); break; case PF_OUT: printf(" out"); break; } if (skw) { printf(" wire: "); pf_print_host(&skw->addr[0], skw->port[0], skw->af); printf(" "); pf_print_host(&skw->addr[1], skw->port[1], skw->af); } if (sks) { printf(" stack: "); if (sks != skw) { pf_print_host(&sks->addr[0], sks->port[0], sks->af); printf(" "); pf_print_host(&sks->addr[1], sks->port[1], sks->af); } else printf("-"); } if (s) { if (proto == IPPROTO_TCP) { printf(" [lo=%u high=%u win=%u modulator=%u", s->src.seqlo, s->src.seqhi, s->src.max_win, s->src.seqdiff); if (s->src.wscale && s->dst.wscale) printf(" wscale=%u", s->src.wscale & PF_WSCALE_MASK); printf("]"); printf(" [lo=%u high=%u win=%u modulator=%u", s->dst.seqlo, s->dst.seqhi, s->dst.max_win, s->dst.seqdiff); if (s->src.wscale && s->dst.wscale) printf(" wscale=%u", s->dst.wscale & PF_WSCALE_MASK); printf("]"); } printf(" %u:%u", s->src.state, s->dst.state); } } void pf_print_flags(u_int8_t f) { if (f) printf(" "); if (f & TH_FIN) printf("F"); if (f & TH_SYN) printf("S"); if (f & TH_RST) printf("R"); if (f & TH_PUSH) printf("P"); if (f & TH_ACK) printf("A"); if (f & TH_URG) printf("U"); if (f & TH_ECE) printf("E"); if (f & TH_CWR) printf("W"); } #define PF_SET_SKIP_STEPS(i) \ do { \ while (head[i] != cur) { \ head[i]->skip[i].ptr = cur; \ head[i] = TAILQ_NEXT(head[i], entries); \ } \ } while (0) void pf_calc_skip_steps(struct pf_rulequeue *rules) { struct pf_rule *cur, *prev, *head[PF_SKIP_COUNT]; int i; cur = TAILQ_FIRST(rules); prev = cur; for (i = 0; i < PF_SKIP_COUNT; ++i) head[i] = cur; while (cur != NULL) { if (cur->kif != prev->kif || cur->ifnot != prev->ifnot) PF_SET_SKIP_STEPS(PF_SKIP_IFP); if (cur->direction != prev->direction) PF_SET_SKIP_STEPS(PF_SKIP_DIR); if (cur->af != prev->af) PF_SET_SKIP_STEPS(PF_SKIP_AF); if (cur->proto != prev->proto) PF_SET_SKIP_STEPS(PF_SKIP_PROTO); if (cur->src.neg != prev->src.neg || pf_addr_wrap_neq(&cur->src.addr, &prev->src.addr)) PF_SET_SKIP_STEPS(PF_SKIP_SRC_ADDR); if (cur->src.port[0] != prev->src.port[0] || cur->src.port[1] != prev->src.port[1] || cur->src.port_op != prev->src.port_op) PF_SET_SKIP_STEPS(PF_SKIP_SRC_PORT); if (cur->dst.neg != prev->dst.neg || pf_addr_wrap_neq(&cur->dst.addr, &prev->dst.addr)) PF_SET_SKIP_STEPS(PF_SKIP_DST_ADDR); if (cur->dst.port[0] != prev->dst.port[0] || cur->dst.port[1] != prev->dst.port[1] || cur->dst.port_op != prev->dst.port_op) PF_SET_SKIP_STEPS(PF_SKIP_DST_PORT); prev = cur; cur = TAILQ_NEXT(cur, entries); } for (i = 0; i < PF_SKIP_COUNT; ++i) PF_SET_SKIP_STEPS(i); } static int pf_addr_wrap_neq(struct pf_addr_wrap *aw1, struct pf_addr_wrap *aw2) { if (aw1->type != aw2->type) return (1); switch (aw1->type) { case PF_ADDR_ADDRMASK: case PF_ADDR_RANGE: if (PF_ANEQ(&aw1->v.a.addr, &aw2->v.a.addr, 0)) return (1); if (PF_ANEQ(&aw1->v.a.mask, &aw2->v.a.mask, 0)) return (1); return (0); case PF_ADDR_DYNIFTL: return (aw1->p.dyn->pfid_kt != aw2->p.dyn->pfid_kt); case PF_ADDR_NOROUTE: case PF_ADDR_URPFFAILED: return (0); case PF_ADDR_TABLE: return (aw1->p.tbl != aw2->p.tbl); default: printf("invalid address type: %d\n", aw1->type); return (1); } } u_int16_t pf_cksum_fixup(u_int16_t cksum, u_int16_t old, u_int16_t new, u_int8_t udp) { u_int32_t l; if (udp && !cksum) return (0x0000); l = cksum + old - new; l = (l >> 16) + (l & 65535); l = l & 65535; if (udp && !l) return (0xFFFF); return (l); } static void pf_change_ap(struct pf_addr *a, u_int16_t *p, u_int16_t *ic, u_int16_t *pc, struct pf_addr *an, u_int16_t pn, u_int8_t u, sa_family_t af) { struct pf_addr ao; u_int16_t po = *p; PF_ACPY(&ao, a, af); PF_ACPY(a, an, af); *p = pn; switch (af) { #ifdef INET case AF_INET: *ic = pf_cksum_fixup(pf_cksum_fixup(*ic, ao.addr16[0], an->addr16[0], 0), ao.addr16[1], an->addr16[1], 0); *p = pn; *pc = pf_cksum_fixup(pf_cksum_fixup(pf_cksum_fixup(*pc, ao.addr16[0], an->addr16[0], u), ao.addr16[1], an->addr16[1], u), po, pn, u); break; #endif /* INET */ #ifdef INET6 case AF_INET6: *pc = pf_cksum_fixup(pf_cksum_fixup(pf_cksum_fixup( pf_cksum_fixup(pf_cksum_fixup(pf_cksum_fixup( pf_cksum_fixup(pf_cksum_fixup(pf_cksum_fixup(*pc, ao.addr16[0], an->addr16[0], u), ao.addr16[1], an->addr16[1], u), ao.addr16[2], an->addr16[2], u), ao.addr16[3], an->addr16[3], u), ao.addr16[4], an->addr16[4], u), ao.addr16[5], an->addr16[5], u), ao.addr16[6], an->addr16[6], u), ao.addr16[7], an->addr16[7], u), po, pn, u); break; #endif /* INET6 */ } } /* Changes a u_int32_t. Uses a void * so there are no align restrictions */ void pf_change_a(void *a, u_int16_t *c, u_int32_t an, u_int8_t u) { u_int32_t ao; memcpy(&ao, a, sizeof(ao)); memcpy(a, &an, sizeof(u_int32_t)); *c = pf_cksum_fixup(pf_cksum_fixup(*c, ao / 65536, an / 65536, u), ao % 65536, an % 65536, u); } #ifdef INET6 static void pf_change_a6(struct pf_addr *a, u_int16_t *c, struct pf_addr *an, u_int8_t u) { struct pf_addr ao; PF_ACPY(&ao, a, AF_INET6); PF_ACPY(a, an, AF_INET6); *c = pf_cksum_fixup(pf_cksum_fixup(pf_cksum_fixup( pf_cksum_fixup(pf_cksum_fixup(pf_cksum_fixup( pf_cksum_fixup(pf_cksum_fixup(*c, ao.addr16[0], an->addr16[0], u), ao.addr16[1], an->addr16[1], u), ao.addr16[2], an->addr16[2], u), ao.addr16[3], an->addr16[3], u), ao.addr16[4], an->addr16[4], u), ao.addr16[5], an->addr16[5], u), ao.addr16[6], an->addr16[6], u), ao.addr16[7], an->addr16[7], u); } #endif /* INET6 */ static void pf_change_icmp(struct pf_addr *ia, u_int16_t *ip, struct pf_addr *oa, struct pf_addr *na, u_int16_t np, u_int16_t *pc, u_int16_t *h2c, u_int16_t *ic, u_int16_t *hc, u_int8_t u, sa_family_t af) { struct pf_addr oia, ooa; PF_ACPY(&oia, ia, af); if (oa) PF_ACPY(&ooa, oa, af); /* Change inner protocol port, fix inner protocol checksum. */ if (ip != NULL) { u_int16_t oip = *ip; u_int32_t opc; if (pc != NULL) opc = *pc; *ip = np; if (pc != NULL) *pc = pf_cksum_fixup(*pc, oip, *ip, u); *ic = pf_cksum_fixup(*ic, oip, *ip, 0); if (pc != NULL) *ic = pf_cksum_fixup(*ic, opc, *pc, 0); } /* Change inner ip address, fix inner ip and icmp checksums. */ PF_ACPY(ia, na, af); switch (af) { #ifdef INET case AF_INET: { u_int32_t oh2c = *h2c; *h2c = pf_cksum_fixup(pf_cksum_fixup(*h2c, oia.addr16[0], ia->addr16[0], 0), oia.addr16[1], ia->addr16[1], 0); *ic = pf_cksum_fixup(pf_cksum_fixup(*ic, oia.addr16[0], ia->addr16[0], 0), oia.addr16[1], ia->addr16[1], 0); *ic = pf_cksum_fixup(*ic, oh2c, *h2c, 0); break; } #endif /* INET */ #ifdef INET6 case AF_INET6: *ic = pf_cksum_fixup(pf_cksum_fixup(pf_cksum_fixup( pf_cksum_fixup(pf_cksum_fixup(pf_cksum_fixup( pf_cksum_fixup(pf_cksum_fixup(*ic, oia.addr16[0], ia->addr16[0], u), oia.addr16[1], ia->addr16[1], u), oia.addr16[2], ia->addr16[2], u), oia.addr16[3], ia->addr16[3], u), oia.addr16[4], ia->addr16[4], u), oia.addr16[5], ia->addr16[5], u), oia.addr16[6], ia->addr16[6], u), oia.addr16[7], ia->addr16[7], u); break; #endif /* INET6 */ } /* Outer ip address, fix outer ip or icmpv6 checksum, if necessary. */ if (oa) { PF_ACPY(oa, na, af); switch (af) { #ifdef INET case AF_INET: *hc = pf_cksum_fixup(pf_cksum_fixup(*hc, ooa.addr16[0], oa->addr16[0], 0), ooa.addr16[1], oa->addr16[1], 0); break; #endif /* INET */ #ifdef INET6 case AF_INET6: *ic = pf_cksum_fixup(pf_cksum_fixup(pf_cksum_fixup( pf_cksum_fixup(pf_cksum_fixup(pf_cksum_fixup( pf_cksum_fixup(pf_cksum_fixup(*ic, ooa.addr16[0], oa->addr16[0], u), ooa.addr16[1], oa->addr16[1], u), ooa.addr16[2], oa->addr16[2], u), ooa.addr16[3], oa->addr16[3], u), ooa.addr16[4], oa->addr16[4], u), ooa.addr16[5], oa->addr16[5], u), ooa.addr16[6], oa->addr16[6], u), ooa.addr16[7], oa->addr16[7], u); break; #endif /* INET6 */ } } } /* * Need to modulate the sequence numbers in the TCP SACK option * (credits to Krzysztof Pfaff for report and patch) */ static int pf_modulate_sack(struct mbuf *m, int off, struct pf_pdesc *pd, struct tcphdr *th, struct pf_state_peer *dst) { int hlen = (th->th_off << 2) - sizeof(*th), thoptlen = hlen; u_int8_t opts[TCP_MAXOLEN], *opt = opts; int copyback = 0, i, olen; struct sackblk sack; #define TCPOLEN_SACKLEN (TCPOLEN_SACK + 2) if (hlen < TCPOLEN_SACKLEN || !pf_pull_hdr(m, off + sizeof(*th), opts, hlen, NULL, NULL, pd->af)) return 0; while (hlen >= TCPOLEN_SACKLEN) { olen = opt[1]; switch (*opt) { case TCPOPT_EOL: /* FALLTHROUGH */ case TCPOPT_NOP: opt++; hlen--; break; case TCPOPT_SACK: if (olen > hlen) olen = hlen; if (olen >= TCPOLEN_SACKLEN) { for (i = 2; i + TCPOLEN_SACK <= olen; i += TCPOLEN_SACK) { memcpy(&sack, &opt[i], sizeof(sack)); pf_change_a(&sack.start, &th->th_sum, htonl(ntohl(sack.start) - dst->seqdiff), 0); pf_change_a(&sack.end, &th->th_sum, htonl(ntohl(sack.end) - dst->seqdiff), 0); memcpy(&opt[i], &sack, sizeof(sack)); } copyback = 1; } /* FALLTHROUGH */ default: if (olen < 2) olen = 2; hlen -= olen; opt += olen; } } if (copyback) m_copyback(m, off + sizeof(*th), thoptlen, (caddr_t)opts); return (copyback); } static void pf_send_tcp(struct mbuf *replyto, const struct pf_rule *r, sa_family_t af, const struct pf_addr *saddr, const struct pf_addr *daddr, u_int16_t sport, u_int16_t dport, u_int32_t seq, u_int32_t ack, u_int8_t flags, u_int16_t win, u_int16_t mss, u_int8_t ttl, int tag, u_int16_t rtag, struct ifnet *ifp) { struct pf_send_entry *pfse; struct mbuf *m; int len, tlen; #ifdef INET struct ip *h = NULL; #endif /* INET */ #ifdef INET6 struct ip6_hdr *h6 = NULL; #endif /* INET6 */ struct tcphdr *th; char *opt; struct pf_mtag *pf_mtag; len = 0; th = NULL; /* maximum segment size tcp option */ tlen = sizeof(struct tcphdr); if (mss) tlen += 4; switch (af) { #ifdef INET case AF_INET: len = sizeof(struct ip) + tlen; break; #endif /* INET */ #ifdef INET6 case AF_INET6: len = sizeof(struct ip6_hdr) + tlen; break; #endif /* INET6 */ default: panic("%s: unsupported af %d", __func__, af); } /* Allocate outgoing queue entry, mbuf and mbuf tag. */ pfse = malloc(sizeof(*pfse), M_PFTEMP, M_NOWAIT); if (pfse == NULL) return; m = m_gethdr(M_NOWAIT, MT_DATA); if (m == NULL) { free(pfse, M_PFTEMP); return; } #ifdef MAC mac_netinet_firewall_send(m); #endif if ((pf_mtag = pf_get_mtag(m)) == NULL) { free(pfse, M_PFTEMP); m_freem(m); return; } if (tag) m->m_flags |= M_SKIP_FIREWALL; pf_mtag->tag = rtag; if (r != NULL && r->rtableid >= 0) M_SETFIB(m, r->rtableid); #ifdef ALTQ if (r != NULL && r->qid) { pf_mtag->qid = r->qid; /* add hints for ecn */ pf_mtag->hdr = mtod(m, struct ip *); } #endif /* ALTQ */ m->m_data += max_linkhdr; m->m_pkthdr.len = m->m_len = len; m->m_pkthdr.rcvif = NULL; bzero(m->m_data, len); switch (af) { #ifdef INET case AF_INET: h = mtod(m, struct ip *); /* IP header fields included in the TCP checksum */ h->ip_p = IPPROTO_TCP; h->ip_len = htons(tlen); h->ip_src.s_addr = saddr->v4.s_addr; h->ip_dst.s_addr = daddr->v4.s_addr; th = (struct tcphdr *)((caddr_t)h + sizeof(struct ip)); break; #endif /* INET */ #ifdef INET6 case AF_INET6: h6 = mtod(m, struct ip6_hdr *); /* IP header fields included in the TCP checksum */ h6->ip6_nxt = IPPROTO_TCP; h6->ip6_plen = htons(tlen); memcpy(&h6->ip6_src, &saddr->v6, sizeof(struct in6_addr)); memcpy(&h6->ip6_dst, &daddr->v6, sizeof(struct in6_addr)); th = (struct tcphdr *)((caddr_t)h6 + sizeof(struct ip6_hdr)); break; #endif /* INET6 */ } /* TCP header */ th->th_sport = sport; th->th_dport = dport; th->th_seq = htonl(seq); th->th_ack = htonl(ack); th->th_off = tlen >> 2; th->th_flags = flags; th->th_win = htons(win); if (mss) { opt = (char *)(th + 1); opt[0] = TCPOPT_MAXSEG; opt[1] = 4; HTONS(mss); bcopy((caddr_t)&mss, (caddr_t)(opt + 2), 2); } switch (af) { #ifdef INET case AF_INET: /* TCP checksum */ th->th_sum = in_cksum(m, len); /* Finish the IP header */ h->ip_v = 4; h->ip_hl = sizeof(*h) >> 2; h->ip_tos = IPTOS_LOWDELAY; h->ip_off = htons(V_path_mtu_discovery ? IP_DF : 0); h->ip_len = htons(len); h->ip_ttl = ttl ? ttl : V_ip_defttl; h->ip_sum = 0; pfse->pfse_type = PFSE_IP; break; #endif /* INET */ #ifdef INET6 case AF_INET6: /* TCP checksum */ th->th_sum = in6_cksum(m, IPPROTO_TCP, sizeof(struct ip6_hdr), tlen); h6->ip6_vfc |= IPV6_VERSION; h6->ip6_hlim = IPV6_DEFHLIM; pfse->pfse_type = PFSE_IP6; break; #endif /* INET6 */ } pfse->pfse_m = m; pf_send(pfse); } static void pf_send_icmp(struct mbuf *m, u_int8_t type, u_int8_t code, sa_family_t af, struct pf_rule *r) { struct pf_send_entry *pfse; struct mbuf *m0; struct pf_mtag *pf_mtag; /* Allocate outgoing queue entry, mbuf and mbuf tag. */ pfse = malloc(sizeof(*pfse), M_PFTEMP, M_NOWAIT); if (pfse == NULL) return; if ((m0 = m_copypacket(m, M_NOWAIT)) == NULL) { free(pfse, M_PFTEMP); return; } if ((pf_mtag = pf_get_mtag(m0)) == NULL) { free(pfse, M_PFTEMP); return; } /* XXX: revisit */ m0->m_flags |= M_SKIP_FIREWALL; if (r->rtableid >= 0) M_SETFIB(m0, r->rtableid); #ifdef ALTQ if (r->qid) { pf_mtag->qid = r->qid; /* add hints for ecn */ pf_mtag->hdr = mtod(m0, struct ip *); } #endif /* ALTQ */ switch (af) { #ifdef INET case AF_INET: pfse->pfse_type = PFSE_ICMP; break; #endif /* INET */ #ifdef INET6 case AF_INET6: pfse->pfse_type = PFSE_ICMP6; break; #endif /* INET6 */ } pfse->pfse_m = m0; pfse->icmpopts.type = type; pfse->icmpopts.code = code; pf_send(pfse); } /* * Return 1 if the addresses a and b match (with mask m), otherwise return 0. * If n is 0, they match if they are equal. If n is != 0, they match if they * are different. */ int pf_match_addr(u_int8_t n, struct pf_addr *a, struct pf_addr *m, struct pf_addr *b, sa_family_t af) { int match = 0; switch (af) { #ifdef INET case AF_INET: if ((a->addr32[0] & m->addr32[0]) == (b->addr32[0] & m->addr32[0])) match++; break; #endif /* INET */ #ifdef INET6 case AF_INET6: if (((a->addr32[0] & m->addr32[0]) == (b->addr32[0] & m->addr32[0])) && ((a->addr32[1] & m->addr32[1]) == (b->addr32[1] & m->addr32[1])) && ((a->addr32[2] & m->addr32[2]) == (b->addr32[2] & m->addr32[2])) && ((a->addr32[3] & m->addr32[3]) == (b->addr32[3] & m->addr32[3]))) match++; break; #endif /* INET6 */ } if (match) { if (n) return (0); else return (1); } else { if (n) return (1); else return (0); } } /* * Return 1 if b <= a <= e, otherwise return 0. */ int pf_match_addr_range(struct pf_addr *b, struct pf_addr *e, struct pf_addr *a, sa_family_t af) { switch (af) { #ifdef INET case AF_INET: if ((a->addr32[0] < b->addr32[0]) || (a->addr32[0] > e->addr32[0])) return (0); break; #endif /* INET */ #ifdef INET6 case AF_INET6: { int i; /* check a >= b */ for (i = 0; i < 4; ++i) if (a->addr32[i] > b->addr32[i]) break; else if (a->addr32[i] < b->addr32[i]) return (0); /* check a <= e */ for (i = 0; i < 4; ++i) if (a->addr32[i] < e->addr32[i]) break; else if (a->addr32[i] > e->addr32[i]) return (0); break; } #endif /* INET6 */ } return (1); } static int pf_match(u_int8_t op, u_int32_t a1, u_int32_t a2, u_int32_t p) { switch (op) { case PF_OP_IRG: return ((p > a1) && (p < a2)); case PF_OP_XRG: return ((p < a1) || (p > a2)); case PF_OP_RRG: return ((p >= a1) && (p <= a2)); case PF_OP_EQ: return (p == a1); case PF_OP_NE: return (p != a1); case PF_OP_LT: return (p < a1); case PF_OP_LE: return (p <= a1); case PF_OP_GT: return (p > a1); case PF_OP_GE: return (p >= a1); } return (0); /* never reached */ } int pf_match_port(u_int8_t op, u_int16_t a1, u_int16_t a2, u_int16_t p) { NTOHS(a1); NTOHS(a2); NTOHS(p); return (pf_match(op, a1, a2, p)); } static int pf_match_uid(u_int8_t op, uid_t a1, uid_t a2, uid_t u) { if (u == UID_MAX && op != PF_OP_EQ && op != PF_OP_NE) return (0); return (pf_match(op, a1, a2, u)); } static int pf_match_gid(u_int8_t op, gid_t a1, gid_t a2, gid_t g) { if (g == GID_MAX && op != PF_OP_EQ && op != PF_OP_NE) return (0); return (pf_match(op, a1, a2, g)); } int pf_match_tag(struct mbuf *m, struct pf_rule *r, int *tag, int mtag) { if (*tag == -1) *tag = mtag; return ((!r->match_tag_not && r->match_tag == *tag) || (r->match_tag_not && r->match_tag != *tag)); } int pf_tag_packet(struct mbuf *m, struct pf_pdesc *pd, int tag) { KASSERT(tag > 0, ("%s: tag %d", __func__, tag)); if (pd->pf_mtag == NULL && ((pd->pf_mtag = pf_get_mtag(m)) == NULL)) return (ENOMEM); pd->pf_mtag->tag = tag; return (0); } #define PF_ANCHOR_STACKSIZE 32 struct pf_anchor_stackframe { struct pf_ruleset *rs; struct pf_rule *r; /* XXX: + match bit */ struct pf_anchor *child; }; /* * XXX: We rely on malloc(9) returning pointer aligned addresses. */ #define PF_ANCHORSTACK_MATCH 0x00000001 #define PF_ANCHORSTACK_MASK (PF_ANCHORSTACK_MATCH) #define PF_ANCHOR_MATCH(f) ((uintptr_t)(f)->r & PF_ANCHORSTACK_MATCH) #define PF_ANCHOR_RULE(f) (struct pf_rule *) \ ((uintptr_t)(f)->r & ~PF_ANCHORSTACK_MASK) #define PF_ANCHOR_SET_MATCH(f) do { (f)->r = (void *) \ ((uintptr_t)(f)->r | PF_ANCHORSTACK_MATCH); \ } while (0) void pf_step_into_anchor(struct pf_anchor_stackframe *stack, int *depth, struct pf_ruleset **rs, int n, struct pf_rule **r, struct pf_rule **a, int *match) { struct pf_anchor_stackframe *f; PF_RULES_RASSERT(); if (match) *match = 0; if (*depth >= PF_ANCHOR_STACKSIZE) { printf("%s: anchor stack overflow on %s\n", __func__, (*r)->anchor->name); *r = TAILQ_NEXT(*r, entries); return; } else if (*depth == 0 && a != NULL) *a = *r; f = stack + (*depth)++; f->rs = *rs; f->r = *r; if ((*r)->anchor_wildcard) { struct pf_anchor_node *parent = &(*r)->anchor->children; if ((f->child = RB_MIN(pf_anchor_node, parent)) == NULL) { *r = NULL; return; } *rs = &f->child->ruleset; } else { f->child = NULL; *rs = &(*r)->anchor->ruleset; } *r = TAILQ_FIRST((*rs)->rules[n].active.ptr); } int pf_step_out_of_anchor(struct pf_anchor_stackframe *stack, int *depth, struct pf_ruleset **rs, int n, struct pf_rule **r, struct pf_rule **a, int *match) { struct pf_anchor_stackframe *f; struct pf_rule *fr; int quick = 0; PF_RULES_RASSERT(); do { if (*depth <= 0) break; f = stack + *depth - 1; fr = PF_ANCHOR_RULE(f); if (f->child != NULL) { struct pf_anchor_node *parent; /* * This block traverses through * a wildcard anchor. */ parent = &fr->anchor->children; if (match != NULL && *match) { /* * If any of "*" matched, then * "foo/ *" matched, mark frame * appropriately. */ PF_ANCHOR_SET_MATCH(f); *match = 0; } f->child = RB_NEXT(pf_anchor_node, parent, f->child); if (f->child != NULL) { *rs = &f->child->ruleset; *r = TAILQ_FIRST((*rs)->rules[n].active.ptr); if (*r == NULL) continue; else break; } } (*depth)--; if (*depth == 0 && a != NULL) *a = NULL; *rs = f->rs; if (PF_ANCHOR_MATCH(f) || (match != NULL && *match)) quick = fr->quick; *r = TAILQ_NEXT(fr, entries); } while (*r == NULL); return (quick); } #ifdef INET6 void pf_poolmask(struct pf_addr *naddr, struct pf_addr *raddr, struct pf_addr *rmask, struct pf_addr *saddr, sa_family_t af) { switch (af) { #ifdef INET case AF_INET: naddr->addr32[0] = (raddr->addr32[0] & rmask->addr32[0]) | ((rmask->addr32[0] ^ 0xffffffff ) & saddr->addr32[0]); break; #endif /* INET */ case AF_INET6: naddr->addr32[0] = (raddr->addr32[0] & rmask->addr32[0]) | ((rmask->addr32[0] ^ 0xffffffff ) & saddr->addr32[0]); naddr->addr32[1] = (raddr->addr32[1] & rmask->addr32[1]) | ((rmask->addr32[1] ^ 0xffffffff ) & saddr->addr32[1]); naddr->addr32[2] = (raddr->addr32[2] & rmask->addr32[2]) | ((rmask->addr32[2] ^ 0xffffffff ) & saddr->addr32[2]); naddr->addr32[3] = (raddr->addr32[3] & rmask->addr32[3]) | ((rmask->addr32[3] ^ 0xffffffff ) & saddr->addr32[3]); break; } } void pf_addr_inc(struct pf_addr *addr, sa_family_t af) { switch (af) { #ifdef INET case AF_INET: addr->addr32[0] = htonl(ntohl(addr->addr32[0]) + 1); break; #endif /* INET */ case AF_INET6: if (addr->addr32[3] == 0xffffffff) { addr->addr32[3] = 0; if (addr->addr32[2] == 0xffffffff) { addr->addr32[2] = 0; if (addr->addr32[1] == 0xffffffff) { addr->addr32[1] = 0; addr->addr32[0] = htonl(ntohl(addr->addr32[0]) + 1); } else addr->addr32[1] = htonl(ntohl(addr->addr32[1]) + 1); } else addr->addr32[2] = htonl(ntohl(addr->addr32[2]) + 1); } else addr->addr32[3] = htonl(ntohl(addr->addr32[3]) + 1); break; } } #endif /* INET6 */ int pf_socket_lookup(int direction, struct pf_pdesc *pd, struct mbuf *m) { struct pf_addr *saddr, *daddr; u_int16_t sport, dport; struct inpcbinfo *pi; struct inpcb *inp; pd->lookup.uid = UID_MAX; pd->lookup.gid = GID_MAX; switch (pd->proto) { case IPPROTO_TCP: if (pd->hdr.tcp == NULL) return (-1); sport = pd->hdr.tcp->th_sport; dport = pd->hdr.tcp->th_dport; pi = &V_tcbinfo; break; case IPPROTO_UDP: if (pd->hdr.udp == NULL) return (-1); sport = pd->hdr.udp->uh_sport; dport = pd->hdr.udp->uh_dport; pi = &V_udbinfo; break; default: return (-1); } if (direction == PF_IN) { saddr = pd->src; daddr = pd->dst; } else { u_int16_t p; p = sport; sport = dport; dport = p; saddr = pd->dst; daddr = pd->src; } switch (pd->af) { #ifdef INET case AF_INET: inp = in_pcblookup_mbuf(pi, saddr->v4, sport, daddr->v4, dport, INPLOOKUP_RLOCKPCB, NULL, m); if (inp == NULL) { inp = in_pcblookup_mbuf(pi, saddr->v4, sport, daddr->v4, dport, INPLOOKUP_WILDCARD | INPLOOKUP_RLOCKPCB, NULL, m); if (inp == NULL) return (-1); } break; #endif /* INET */ #ifdef INET6 case AF_INET6: inp = in6_pcblookup_mbuf(pi, &saddr->v6, sport, &daddr->v6, dport, INPLOOKUP_RLOCKPCB, NULL, m); if (inp == NULL) { inp = in6_pcblookup_mbuf(pi, &saddr->v6, sport, &daddr->v6, dport, INPLOOKUP_WILDCARD | INPLOOKUP_RLOCKPCB, NULL, m); if (inp == NULL) return (-1); } break; #endif /* INET6 */ default: return (-1); } INP_RLOCK_ASSERT(inp); pd->lookup.uid = inp->inp_cred->cr_uid; pd->lookup.gid = inp->inp_cred->cr_groups[0]; INP_RUNLOCK(inp); return (1); } static u_int8_t pf_get_wscale(struct mbuf *m, int off, u_int16_t th_off, sa_family_t af) { int hlen; u_int8_t hdr[60]; u_int8_t *opt, optlen; u_int8_t wscale = 0; hlen = th_off << 2; /* hlen <= sizeof(hdr) */ if (hlen <= sizeof(struct tcphdr)) return (0); if (!pf_pull_hdr(m, off, hdr, hlen, NULL, NULL, af)) return (0); opt = hdr + sizeof(struct tcphdr); hlen -= sizeof(struct tcphdr); while (hlen >= 3) { switch (*opt) { case TCPOPT_EOL: case TCPOPT_NOP: ++opt; --hlen; break; case TCPOPT_WINDOW: wscale = opt[2]; if (wscale > TCP_MAX_WINSHIFT) wscale = TCP_MAX_WINSHIFT; wscale |= PF_WSCALE_FLAG; /* FALLTHROUGH */ default: optlen = opt[1]; if (optlen < 2) optlen = 2; hlen -= optlen; opt += optlen; break; } } return (wscale); } static u_int16_t pf_get_mss(struct mbuf *m, int off, u_int16_t th_off, sa_family_t af) { int hlen; u_int8_t hdr[60]; u_int8_t *opt, optlen; u_int16_t mss = V_tcp_mssdflt; hlen = th_off << 2; /* hlen <= sizeof(hdr) */ if (hlen <= sizeof(struct tcphdr)) return (0); if (!pf_pull_hdr(m, off, hdr, hlen, NULL, NULL, af)) return (0); opt = hdr + sizeof(struct tcphdr); hlen -= sizeof(struct tcphdr); while (hlen >= TCPOLEN_MAXSEG) { switch (*opt) { case TCPOPT_EOL: case TCPOPT_NOP: ++opt; --hlen; break; case TCPOPT_MAXSEG: bcopy((caddr_t)(opt + 2), (caddr_t)&mss, 2); NTOHS(mss); /* FALLTHROUGH */ default: optlen = opt[1]; if (optlen < 2) optlen = 2; hlen -= optlen; opt += optlen; break; } } return (mss); } static u_int16_t pf_calc_mss(struct pf_addr *addr, sa_family_t af, int rtableid, u_int16_t offer) { #ifdef INET struct sockaddr_in *dst; struct route ro; #endif /* INET */ #ifdef INET6 struct sockaddr_in6 *dst6; struct route_in6 ro6; #endif /* INET6 */ struct rtentry *rt = NULL; int hlen = 0; u_int16_t mss = V_tcp_mssdflt; switch (af) { #ifdef INET case AF_INET: hlen = sizeof(struct ip); bzero(&ro, sizeof(ro)); dst = (struct sockaddr_in *)&ro.ro_dst; dst->sin_family = AF_INET; dst->sin_len = sizeof(*dst); dst->sin_addr = addr->v4; in_rtalloc_ign(&ro, 0, rtableid); rt = ro.ro_rt; break; #endif /* INET */ #ifdef INET6 case AF_INET6: hlen = sizeof(struct ip6_hdr); bzero(&ro6, sizeof(ro6)); dst6 = (struct sockaddr_in6 *)&ro6.ro_dst; dst6->sin6_family = AF_INET6; dst6->sin6_len = sizeof(*dst6); dst6->sin6_addr = addr->v6; in6_rtalloc_ign(&ro6, 0, rtableid); rt = ro6.ro_rt; break; #endif /* INET6 */ } if (rt && rt->rt_ifp) { mss = rt->rt_ifp->if_mtu - hlen - sizeof(struct tcphdr); mss = max(V_tcp_mssdflt, mss); RTFREE(rt); } mss = min(mss, offer); mss = max(mss, 64); /* sanity - at least max opt space */ return (mss); } static u_int32_t pf_tcp_iss(struct pf_pdesc *pd) { MD5_CTX ctx; u_int32_t digest[4]; if (V_pf_tcp_secret_init == 0) { read_random(&V_pf_tcp_secret, sizeof(V_pf_tcp_secret)); MD5Init(&V_pf_tcp_secret_ctx); MD5Update(&V_pf_tcp_secret_ctx, V_pf_tcp_secret, sizeof(V_pf_tcp_secret)); V_pf_tcp_secret_init = 1; } ctx = V_pf_tcp_secret_ctx; MD5Update(&ctx, (char *)&pd->hdr.tcp->th_sport, sizeof(u_short)); MD5Update(&ctx, (char *)&pd->hdr.tcp->th_dport, sizeof(u_short)); if (pd->af == AF_INET6) { MD5Update(&ctx, (char *)&pd->src->v6, sizeof(struct in6_addr)); MD5Update(&ctx, (char *)&pd->dst->v6, sizeof(struct in6_addr)); } else { MD5Update(&ctx, (char *)&pd->src->v4, sizeof(struct in_addr)); MD5Update(&ctx, (char *)&pd->dst->v4, sizeof(struct in_addr)); } MD5Final((u_char *)digest, &ctx); V_pf_tcp_iss_off += 4096; #define ISN_RANDOM_INCREMENT (4096 - 1) return (digest[0] + (arc4random() & ISN_RANDOM_INCREMENT) + V_pf_tcp_iss_off); #undef ISN_RANDOM_INCREMENT } static int pf_test_rule(struct pf_rule **rm, struct pf_state **sm, int direction, struct pfi_kif *kif, struct mbuf *m, int off, struct pf_pdesc *pd, struct pf_rule **am, struct pf_ruleset **rsm, struct inpcb *inp) { struct pf_rule *nr = NULL; struct pf_addr * const saddr = pd->src; struct pf_addr * const daddr = pd->dst; sa_family_t af = pd->af; struct pf_rule *r, *a = NULL; struct pf_ruleset *ruleset = NULL; struct pf_src_node *nsn = NULL; struct tcphdr *th = pd->hdr.tcp; struct pf_state_key *sk = NULL, *nk = NULL; u_short reason; int rewrite = 0, hdrlen = 0; int tag = -1, rtableid = -1; int asd = 0; int match = 0; int state_icmp = 0; u_int16_t sport = 0, dport = 0; u_int16_t bproto_sum = 0, bip_sum = 0; u_int8_t icmptype = 0, icmpcode = 0; struct pf_anchor_stackframe anchor_stack[PF_ANCHOR_STACKSIZE]; PF_RULES_RASSERT(); if (inp != NULL) { INP_LOCK_ASSERT(inp); pd->lookup.uid = inp->inp_cred->cr_uid; pd->lookup.gid = inp->inp_cred->cr_groups[0]; pd->lookup.done = 1; } switch (pd->proto) { case IPPROTO_TCP: sport = th->th_sport; dport = th->th_dport; hdrlen = sizeof(*th); break; case IPPROTO_UDP: sport = pd->hdr.udp->uh_sport; dport = pd->hdr.udp->uh_dport; hdrlen = sizeof(*pd->hdr.udp); break; #ifdef INET case IPPROTO_ICMP: if (pd->af != AF_INET) break; sport = dport = pd->hdr.icmp->icmp_id; hdrlen = sizeof(*pd->hdr.icmp); icmptype = pd->hdr.icmp->icmp_type; icmpcode = pd->hdr.icmp->icmp_code; if (icmptype == ICMP_UNREACH || icmptype == ICMP_SOURCEQUENCH || icmptype == ICMP_REDIRECT || icmptype == ICMP_TIMXCEED || icmptype == ICMP_PARAMPROB) state_icmp++; break; #endif /* INET */ #ifdef INET6 case IPPROTO_ICMPV6: if (af != AF_INET6) break; sport = dport = pd->hdr.icmp6->icmp6_id; hdrlen = sizeof(*pd->hdr.icmp6); icmptype = pd->hdr.icmp6->icmp6_type; icmpcode = pd->hdr.icmp6->icmp6_code; if (icmptype == ICMP6_DST_UNREACH || icmptype == ICMP6_PACKET_TOO_BIG || icmptype == ICMP6_TIME_EXCEEDED || icmptype == ICMP6_PARAM_PROB) state_icmp++; break; #endif /* INET6 */ default: sport = dport = hdrlen = 0; break; } r = TAILQ_FIRST(pf_main_ruleset.rules[PF_RULESET_FILTER].active.ptr); /* check packet for BINAT/NAT/RDR */ if ((nr = pf_get_translation(pd, m, off, direction, kif, &nsn, &sk, &nk, saddr, daddr, sport, dport, anchor_stack)) != NULL) { KASSERT(sk != NULL, ("%s: null sk", __func__)); KASSERT(nk != NULL, ("%s: null nk", __func__)); if (pd->ip_sum) bip_sum = *pd->ip_sum; switch (pd->proto) { case IPPROTO_TCP: bproto_sum = th->th_sum; pd->proto_sum = &th->th_sum; if (PF_ANEQ(saddr, &nk->addr[pd->sidx], af) || nk->port[pd->sidx] != sport) { pf_change_ap(saddr, &th->th_sport, pd->ip_sum, &th->th_sum, &nk->addr[pd->sidx], nk->port[pd->sidx], 0, af); pd->sport = &th->th_sport; sport = th->th_sport; } if (PF_ANEQ(daddr, &nk->addr[pd->didx], af) || nk->port[pd->didx] != dport) { pf_change_ap(daddr, &th->th_dport, pd->ip_sum, &th->th_sum, &nk->addr[pd->didx], nk->port[pd->didx], 0, af); dport = th->th_dport; pd->dport = &th->th_dport; } rewrite++; break; case IPPROTO_UDP: bproto_sum = pd->hdr.udp->uh_sum; pd->proto_sum = &pd->hdr.udp->uh_sum; if (PF_ANEQ(saddr, &nk->addr[pd->sidx], af) || nk->port[pd->sidx] != sport) { pf_change_ap(saddr, &pd->hdr.udp->uh_sport, pd->ip_sum, &pd->hdr.udp->uh_sum, &nk->addr[pd->sidx], nk->port[pd->sidx], 1, af); sport = pd->hdr.udp->uh_sport; pd->sport = &pd->hdr.udp->uh_sport; } if (PF_ANEQ(daddr, &nk->addr[pd->didx], af) || nk->port[pd->didx] != dport) { pf_change_ap(daddr, &pd->hdr.udp->uh_dport, pd->ip_sum, &pd->hdr.udp->uh_sum, &nk->addr[pd->didx], nk->port[pd->didx], 1, af); dport = pd->hdr.udp->uh_dport; pd->dport = &pd->hdr.udp->uh_dport; } rewrite++; break; #ifdef INET case IPPROTO_ICMP: nk->port[0] = nk->port[1]; if (PF_ANEQ(saddr, &nk->addr[pd->sidx], AF_INET)) pf_change_a(&saddr->v4.s_addr, pd->ip_sum, nk->addr[pd->sidx].v4.s_addr, 0); if (PF_ANEQ(daddr, &nk->addr[pd->didx], AF_INET)) pf_change_a(&daddr->v4.s_addr, pd->ip_sum, nk->addr[pd->didx].v4.s_addr, 0); if (nk->port[1] != pd->hdr.icmp->icmp_id) { pd->hdr.icmp->icmp_cksum = pf_cksum_fixup( pd->hdr.icmp->icmp_cksum, sport, nk->port[1], 0); pd->hdr.icmp->icmp_id = nk->port[1]; pd->sport = &pd->hdr.icmp->icmp_id; } m_copyback(m, off, ICMP_MINLEN, (caddr_t)pd->hdr.icmp); break; #endif /* INET */ #ifdef INET6 case IPPROTO_ICMPV6: nk->port[0] = nk->port[1]; if (PF_ANEQ(saddr, &nk->addr[pd->sidx], AF_INET6)) pf_change_a6(saddr, &pd->hdr.icmp6->icmp6_cksum, &nk->addr[pd->sidx], 0); if (PF_ANEQ(daddr, &nk->addr[pd->didx], AF_INET6)) pf_change_a6(daddr, &pd->hdr.icmp6->icmp6_cksum, &nk->addr[pd->didx], 0); rewrite++; break; #endif /* INET */ default: switch (af) { #ifdef INET case AF_INET: if (PF_ANEQ(saddr, &nk->addr[pd->sidx], AF_INET)) pf_change_a(&saddr->v4.s_addr, pd->ip_sum, nk->addr[pd->sidx].v4.s_addr, 0); if (PF_ANEQ(daddr, &nk->addr[pd->didx], AF_INET)) pf_change_a(&daddr->v4.s_addr, pd->ip_sum, nk->addr[pd->didx].v4.s_addr, 0); break; #endif /* INET */ #ifdef INET6 case AF_INET6: if (PF_ANEQ(saddr, &nk->addr[pd->sidx], AF_INET6)) PF_ACPY(saddr, &nk->addr[pd->sidx], af); if (PF_ANEQ(daddr, &nk->addr[pd->didx], AF_INET6)) PF_ACPY(saddr, &nk->addr[pd->didx], af); break; #endif /* INET */ } break; } if (nr->natpass) r = NULL; pd->nat_rule = nr; } while (r != NULL) { r->evaluations++; if (pfi_kif_match(r->kif, kif) == r->ifnot) r = r->skip[PF_SKIP_IFP].ptr; else if (r->direction && r->direction != direction) r = r->skip[PF_SKIP_DIR].ptr; else if (r->af && r->af != af) r = r->skip[PF_SKIP_AF].ptr; else if (r->proto && r->proto != pd->proto) r = r->skip[PF_SKIP_PROTO].ptr; else if (PF_MISMATCHAW(&r->src.addr, saddr, af, r->src.neg, kif, M_GETFIB(m))) r = r->skip[PF_SKIP_SRC_ADDR].ptr; /* tcp/udp only. port_op always 0 in other cases */ else if (r->src.port_op && !pf_match_port(r->src.port_op, r->src.port[0], r->src.port[1], sport)) r = r->skip[PF_SKIP_SRC_PORT].ptr; else if (PF_MISMATCHAW(&r->dst.addr, daddr, af, r->dst.neg, NULL, M_GETFIB(m))) r = r->skip[PF_SKIP_DST_ADDR].ptr; /* tcp/udp only. port_op always 0 in other cases */ else if (r->dst.port_op && !pf_match_port(r->dst.port_op, r->dst.port[0], r->dst.port[1], dport)) r = r->skip[PF_SKIP_DST_PORT].ptr; /* icmp only. type always 0 in other cases */ else if (r->type && r->type != icmptype + 1) r = TAILQ_NEXT(r, entries); /* icmp only. type always 0 in other cases */ else if (r->code && r->code != icmpcode + 1) r = TAILQ_NEXT(r, entries); else if (r->tos && !(r->tos == pd->tos)) r = TAILQ_NEXT(r, entries); else if (r->rule_flag & PFRULE_FRAGMENT) r = TAILQ_NEXT(r, entries); else if (pd->proto == IPPROTO_TCP && (r->flagset & th->th_flags) != r->flags) r = TAILQ_NEXT(r, entries); /* tcp/udp only. uid.op always 0 in other cases */ else if (r->uid.op && (pd->lookup.done || (pd->lookup.done = pf_socket_lookup(direction, pd, m), 1)) && !pf_match_uid(r->uid.op, r->uid.uid[0], r->uid.uid[1], pd->lookup.uid)) r = TAILQ_NEXT(r, entries); /* tcp/udp only. gid.op always 0 in other cases */ else if (r->gid.op && (pd->lookup.done || (pd->lookup.done = pf_socket_lookup(direction, pd, m), 1)) && !pf_match_gid(r->gid.op, r->gid.gid[0], r->gid.gid[1], pd->lookup.gid)) r = TAILQ_NEXT(r, entries); else if (r->prob && r->prob <= arc4random()) r = TAILQ_NEXT(r, entries); else if (r->match_tag && !pf_match_tag(m, r, &tag, pd->pf_mtag ? pd->pf_mtag->tag : 0)) r = TAILQ_NEXT(r, entries); else if (r->os_fingerprint != PF_OSFP_ANY && (pd->proto != IPPROTO_TCP || !pf_osfp_match( pf_osfp_fingerprint(pd, m, off, th), r->os_fingerprint))) r = TAILQ_NEXT(r, entries); else { if (r->tag) tag = r->tag; if (r->rtableid >= 0) rtableid = r->rtableid; if (r->anchor == NULL) { match = 1; *rm = r; *am = a; *rsm = ruleset; if ((*rm)->quick) break; r = TAILQ_NEXT(r, entries); } else pf_step_into_anchor(anchor_stack, &asd, &ruleset, PF_RULESET_FILTER, &r, &a, &match); } if (r == NULL && pf_step_out_of_anchor(anchor_stack, &asd, &ruleset, PF_RULESET_FILTER, &r, &a, &match)) break; } r = *rm; a = *am; ruleset = *rsm; REASON_SET(&reason, PFRES_MATCH); if (r->log || (nr != NULL && nr->log)) { if (rewrite) m_copyback(m, off, hdrlen, pd->hdr.any); PFLOG_PACKET(kif, m, af, direction, reason, r->log ? r : nr, a, ruleset, pd, 1); } if ((r->action == PF_DROP) && ((r->rule_flag & PFRULE_RETURNRST) || (r->rule_flag & PFRULE_RETURNICMP) || (r->rule_flag & PFRULE_RETURN))) { /* undo NAT changes, if they have taken place */ if (nr != NULL) { PF_ACPY(saddr, &sk->addr[pd->sidx], af); PF_ACPY(daddr, &sk->addr[pd->didx], af); if (pd->sport) *pd->sport = sk->port[pd->sidx]; if (pd->dport) *pd->dport = sk->port[pd->didx]; if (pd->proto_sum) *pd->proto_sum = bproto_sum; if (pd->ip_sum) *pd->ip_sum = bip_sum; m_copyback(m, off, hdrlen, pd->hdr.any); } if (pd->proto == IPPROTO_TCP && ((r->rule_flag & PFRULE_RETURNRST) || (r->rule_flag & PFRULE_RETURN)) && !(th->th_flags & TH_RST)) { u_int32_t ack = ntohl(th->th_seq) + pd->p_len; int len = 0; #ifdef INET struct ip *h4; #endif #ifdef INET6 struct ip6_hdr *h6; #endif switch (af) { #ifdef INET case AF_INET: h4 = mtod(m, struct ip *); len = ntohs(h4->ip_len) - off; break; #endif #ifdef INET6 case AF_INET6: h6 = mtod(m, struct ip6_hdr *); len = ntohs(h6->ip6_plen) - (off - sizeof(*h6)); break; #endif } if (pf_check_proto_cksum(m, off, len, IPPROTO_TCP, af)) REASON_SET(&reason, PFRES_PROTCKSUM); else { if (th->th_flags & TH_SYN) ack++; if (th->th_flags & TH_FIN) ack++; pf_send_tcp(m, r, af, pd->dst, pd->src, th->th_dport, th->th_sport, ntohl(th->th_ack), ack, TH_RST|TH_ACK, 0, 0, r->return_ttl, 1, 0, kif->pfik_ifp); } } else if (pd->proto != IPPROTO_ICMP && af == AF_INET && r->return_icmp) pf_send_icmp(m, r->return_icmp >> 8, r->return_icmp & 255, af, r); else if (pd->proto != IPPROTO_ICMPV6 && af == AF_INET6 && r->return_icmp6) pf_send_icmp(m, r->return_icmp6 >> 8, r->return_icmp6 & 255, af, r); } if (r->action == PF_DROP) goto cleanup; if (tag > 0 && pf_tag_packet(m, pd, tag)) { REASON_SET(&reason, PFRES_MEMORY); goto cleanup; } if (rtableid >= 0) M_SETFIB(m, rtableid); if (!state_icmp && (r->keep_state || nr != NULL || (pd->flags & PFDESC_TCP_NORM))) { int action; action = pf_create_state(r, nr, a, pd, nsn, nk, sk, m, off, sport, dport, &rewrite, kif, sm, tag, bproto_sum, bip_sum, hdrlen); if (action != PF_PASS) return (action); } else { if (sk != NULL) uma_zfree(V_pf_state_key_z, sk); if (nk != NULL) uma_zfree(V_pf_state_key_z, nk); } /* copy back packet headers if we performed NAT operations */ if (rewrite) m_copyback(m, off, hdrlen, pd->hdr.any); if (*sm != NULL && !((*sm)->state_flags & PFSTATE_NOSYNC) && direction == PF_OUT && pfsync_defer_ptr != NULL && pfsync_defer_ptr(*sm, m)) /* * We want the state created, but we dont * want to send this in case a partner * firewall has to know about it to allow * replies through it. */ return (PF_DEFER); return (PF_PASS); cleanup: if (sk != NULL) uma_zfree(V_pf_state_key_z, sk); if (nk != NULL) uma_zfree(V_pf_state_key_z, nk); return (PF_DROP); } static int pf_create_state(struct pf_rule *r, struct pf_rule *nr, struct pf_rule *a, struct pf_pdesc *pd, struct pf_src_node *nsn, struct pf_state_key *nk, struct pf_state_key *sk, struct mbuf *m, int off, u_int16_t sport, u_int16_t dport, int *rewrite, struct pfi_kif *kif, struct pf_state **sm, int tag, u_int16_t bproto_sum, u_int16_t bip_sum, int hdrlen) { struct pf_state *s = NULL; struct pf_src_node *sn = NULL; struct tcphdr *th = pd->hdr.tcp; u_int16_t mss = V_tcp_mssdflt; u_short reason; /* check maximums */ if (r->max_states && (counter_u64_fetch(r->states_cur) >= r->max_states)) { counter_u64_add(V_pf_status.lcounters[LCNT_STATES], 1); REASON_SET(&reason, PFRES_MAXSTATES); return (PF_DROP); } /* src node for filter rule */ if ((r->rule_flag & PFRULE_SRCTRACK || r->rpool.opts & PF_POOL_STICKYADDR) && pf_insert_src_node(&sn, r, pd->src, pd->af) != 0) { REASON_SET(&reason, PFRES_SRCLIMIT); goto csfailed; } /* src node for translation rule */ if (nr != NULL && (nr->rpool.opts & PF_POOL_STICKYADDR) && pf_insert_src_node(&nsn, nr, &sk->addr[pd->sidx], pd->af)) { REASON_SET(&reason, PFRES_SRCLIMIT); goto csfailed; } s = uma_zalloc(V_pf_state_z, M_NOWAIT | M_ZERO); if (s == NULL) { REASON_SET(&reason, PFRES_MEMORY); goto csfailed; } s->rule.ptr = r; s->nat_rule.ptr = nr; s->anchor.ptr = a; STATE_INC_COUNTERS(s); if (r->allow_opts) s->state_flags |= PFSTATE_ALLOWOPTS; if (r->rule_flag & PFRULE_STATESLOPPY) s->state_flags |= PFSTATE_SLOPPY; s->log = r->log & PF_LOG_ALL; s->sync_state = PFSYNC_S_NONE; if (nr != NULL) s->log |= nr->log & PF_LOG_ALL; switch (pd->proto) { case IPPROTO_TCP: s->src.seqlo = ntohl(th->th_seq); s->src.seqhi = s->src.seqlo + pd->p_len + 1; if ((th->th_flags & (TH_SYN|TH_ACK)) == TH_SYN && r->keep_state == PF_STATE_MODULATE) { /* Generate sequence number modulator */ if ((s->src.seqdiff = pf_tcp_iss(pd) - s->src.seqlo) == 0) s->src.seqdiff = 1; pf_change_a(&th->th_seq, &th->th_sum, htonl(s->src.seqlo + s->src.seqdiff), 0); *rewrite = 1; } else s->src.seqdiff = 0; if (th->th_flags & TH_SYN) { s->src.seqhi++; s->src.wscale = pf_get_wscale(m, off, th->th_off, pd->af); } s->src.max_win = MAX(ntohs(th->th_win), 1); if (s->src.wscale & PF_WSCALE_MASK) { /* Remove scale factor from initial window */ int win = s->src.max_win; win += 1 << (s->src.wscale & PF_WSCALE_MASK); s->src.max_win = (win - 1) >> (s->src.wscale & PF_WSCALE_MASK); } if (th->th_flags & TH_FIN) s->src.seqhi++; s->dst.seqhi = 1; s->dst.max_win = 1; s->src.state = TCPS_SYN_SENT; s->dst.state = TCPS_CLOSED; s->timeout = PFTM_TCP_FIRST_PACKET; break; case IPPROTO_UDP: s->src.state = PFUDPS_SINGLE; s->dst.state = PFUDPS_NO_TRAFFIC; s->timeout = PFTM_UDP_FIRST_PACKET; break; case IPPROTO_ICMP: #ifdef INET6 case IPPROTO_ICMPV6: #endif s->timeout = PFTM_ICMP_FIRST_PACKET; break; default: s->src.state = PFOTHERS_SINGLE; s->dst.state = PFOTHERS_NO_TRAFFIC; s->timeout = PFTM_OTHER_FIRST_PACKET; } if (r->rt && r->rt != PF_FASTROUTE) { if (pf_map_addr(pd->af, r, pd->src, &s->rt_addr, NULL, &sn)) { REASON_SET(&reason, PFRES_MAPFAILED); pf_src_tree_remove_state(s); STATE_DEC_COUNTERS(s); uma_zfree(V_pf_state_z, s); goto csfailed; } s->rt_kif = r->rpool.cur->kif; } s->creation = time_uptime; s->expire = time_uptime; if (sn != NULL) s->src_node = sn; if (nsn != NULL) { /* XXX We only modify one side for now. */ PF_ACPY(&nsn->raddr, &nk->addr[1], pd->af); s->nat_src_node = nsn; } if (pd->proto == IPPROTO_TCP) { if ((pd->flags & PFDESC_TCP_NORM) && pf_normalize_tcp_init(m, off, pd, th, &s->src, &s->dst)) { REASON_SET(&reason, PFRES_MEMORY); pf_src_tree_remove_state(s); STATE_DEC_COUNTERS(s); uma_zfree(V_pf_state_z, s); return (PF_DROP); } if ((pd->flags & PFDESC_TCP_NORM) && s->src.scrub && pf_normalize_tcp_stateful(m, off, pd, &reason, th, s, &s->src, &s->dst, rewrite)) { /* This really shouldn't happen!!! */ DPFPRINTF(PF_DEBUG_URGENT, ("pf_normalize_tcp_stateful failed on first pkt")); pf_normalize_tcp_cleanup(s); pf_src_tree_remove_state(s); STATE_DEC_COUNTERS(s); uma_zfree(V_pf_state_z, s); return (PF_DROP); } } s->direction = pd->dir; /* * sk/nk could already been setup by pf_get_translation(). */ if (nr == NULL) { KASSERT((sk == NULL && nk == NULL), ("%s: nr %p sk %p, nk %p", __func__, nr, sk, nk)); sk = pf_state_key_setup(pd, pd->src, pd->dst, sport, dport); if (sk == NULL) goto csfailed; nk = sk; } else KASSERT((sk != NULL && nk != NULL), ("%s: nr %p sk %p, nk %p", __func__, nr, sk, nk)); /* Swap sk/nk for PF_OUT. */ if (pf_state_insert(BOUND_IFACE(r, kif), (pd->dir == PF_IN) ? sk : nk, (pd->dir == PF_IN) ? nk : sk, s)) { if (pd->proto == IPPROTO_TCP) pf_normalize_tcp_cleanup(s); REASON_SET(&reason, PFRES_STATEINS); pf_src_tree_remove_state(s); STATE_DEC_COUNTERS(s); uma_zfree(V_pf_state_z, s); return (PF_DROP); } else *sm = s; if (tag > 0) s->tag = tag; if (pd->proto == IPPROTO_TCP && (th->th_flags & (TH_SYN|TH_ACK)) == TH_SYN && r->keep_state == PF_STATE_SYNPROXY) { s->src.state = PF_TCPS_PROXY_SRC; /* undo NAT changes, if they have taken place */ if (nr != NULL) { struct pf_state_key *skt = s->key[PF_SK_WIRE]; if (pd->dir == PF_OUT) skt = s->key[PF_SK_STACK]; PF_ACPY(pd->src, &skt->addr[pd->sidx], pd->af); PF_ACPY(pd->dst, &skt->addr[pd->didx], pd->af); if (pd->sport) *pd->sport = skt->port[pd->sidx]; if (pd->dport) *pd->dport = skt->port[pd->didx]; if (pd->proto_sum) *pd->proto_sum = bproto_sum; if (pd->ip_sum) *pd->ip_sum = bip_sum; m_copyback(m, off, hdrlen, pd->hdr.any); } s->src.seqhi = htonl(arc4random()); /* Find mss option */ int rtid = M_GETFIB(m); mss = pf_get_mss(m, off, th->th_off, pd->af); mss = pf_calc_mss(pd->src, pd->af, rtid, mss); mss = pf_calc_mss(pd->dst, pd->af, rtid, mss); s->src.mss = mss; pf_send_tcp(NULL, r, pd->af, pd->dst, pd->src, th->th_dport, th->th_sport, s->src.seqhi, ntohl(th->th_seq) + 1, TH_SYN|TH_ACK, 0, s->src.mss, 0, 1, 0, NULL); REASON_SET(&reason, PFRES_SYNPROXY); return (PF_SYNPROXY_DROP); } return (PF_PASS); csfailed: if (sk != NULL) uma_zfree(V_pf_state_key_z, sk); if (nk != NULL) uma_zfree(V_pf_state_key_z, nk); if (sn != NULL) { struct pf_srchash *sh; sh = &V_pf_srchash[pf_hashsrc(&sn->addr, sn->af)]; PF_HASHROW_LOCK(sh); if (--sn->states == 0 && sn->expire == 0) { pf_unlink_src_node(sn); uma_zfree(V_pf_sources_z, sn); counter_u64_add( V_pf_status.scounters[SCNT_SRC_NODE_REMOVALS], 1); } PF_HASHROW_UNLOCK(sh); } if (nsn != sn && nsn != NULL) { struct pf_srchash *sh; sh = &V_pf_srchash[pf_hashsrc(&nsn->addr, nsn->af)]; PF_HASHROW_LOCK(sh); if (--nsn->states == 1 && nsn->expire == 0) { pf_unlink_src_node(nsn); uma_zfree(V_pf_sources_z, nsn); counter_u64_add( V_pf_status.scounters[SCNT_SRC_NODE_REMOVALS], 1); } PF_HASHROW_UNLOCK(sh); } return (PF_DROP); } static int pf_test_fragment(struct pf_rule **rm, int direction, struct pfi_kif *kif, struct mbuf *m, void *h, struct pf_pdesc *pd, struct pf_rule **am, struct pf_ruleset **rsm) { struct pf_rule *r, *a = NULL; struct pf_ruleset *ruleset = NULL; sa_family_t af = pd->af; u_short reason; int tag = -1; int asd = 0; int match = 0; struct pf_anchor_stackframe anchor_stack[PF_ANCHOR_STACKSIZE]; PF_RULES_RASSERT(); r = TAILQ_FIRST(pf_main_ruleset.rules[PF_RULESET_FILTER].active.ptr); while (r != NULL) { r->evaluations++; if (pfi_kif_match(r->kif, kif) == r->ifnot) r = r->skip[PF_SKIP_IFP].ptr; else if (r->direction && r->direction != direction) r = r->skip[PF_SKIP_DIR].ptr; else if (r->af && r->af != af) r = r->skip[PF_SKIP_AF].ptr; else if (r->proto && r->proto != pd->proto) r = r->skip[PF_SKIP_PROTO].ptr; else if (PF_MISMATCHAW(&r->src.addr, pd->src, af, r->src.neg, kif, M_GETFIB(m))) r = r->skip[PF_SKIP_SRC_ADDR].ptr; else if (PF_MISMATCHAW(&r->dst.addr, pd->dst, af, r->dst.neg, NULL, M_GETFIB(m))) r = r->skip[PF_SKIP_DST_ADDR].ptr; else if (r->tos && !(r->tos == pd->tos)) r = TAILQ_NEXT(r, entries); else if (r->os_fingerprint != PF_OSFP_ANY) r = TAILQ_NEXT(r, entries); else if (pd->proto == IPPROTO_UDP && (r->src.port_op || r->dst.port_op)) r = TAILQ_NEXT(r, entries); else if (pd->proto == IPPROTO_TCP && (r->src.port_op || r->dst.port_op || r->flagset)) r = TAILQ_NEXT(r, entries); else if ((pd->proto == IPPROTO_ICMP || pd->proto == IPPROTO_ICMPV6) && (r->type || r->code)) r = TAILQ_NEXT(r, entries); else if (r->prob && r->prob <= (arc4random() % (UINT_MAX - 1) + 1)) r = TAILQ_NEXT(r, entries); else if (r->match_tag && !pf_match_tag(m, r, &tag, pd->pf_mtag ? pd->pf_mtag->tag : 0)) r = TAILQ_NEXT(r, entries); else { if (r->anchor == NULL) { match = 1; *rm = r; *am = a; *rsm = ruleset; if ((*rm)->quick) break; r = TAILQ_NEXT(r, entries); } else pf_step_into_anchor(anchor_stack, &asd, &ruleset, PF_RULESET_FILTER, &r, &a, &match); } if (r == NULL && pf_step_out_of_anchor(anchor_stack, &asd, &ruleset, PF_RULESET_FILTER, &r, &a, &match)) break; } r = *rm; a = *am; ruleset = *rsm; REASON_SET(&reason, PFRES_MATCH); if (r->log) PFLOG_PACKET(kif, m, af, direction, reason, r, a, ruleset, pd, 1); if (r->action != PF_PASS) return (PF_DROP); if (tag > 0 && pf_tag_packet(m, pd, tag)) { REASON_SET(&reason, PFRES_MEMORY); return (PF_DROP); } return (PF_PASS); } static int pf_tcp_track_full(struct pf_state_peer *src, struct pf_state_peer *dst, struct pf_state **state, struct pfi_kif *kif, struct mbuf *m, int off, struct pf_pdesc *pd, u_short *reason, int *copyback) { struct tcphdr *th = pd->hdr.tcp; u_int16_t win = ntohs(th->th_win); u_int32_t ack, end, seq, orig_seq; u_int8_t sws, dws; int ackskew; if (src->wscale && dst->wscale && !(th->th_flags & TH_SYN)) { sws = src->wscale & PF_WSCALE_MASK; dws = dst->wscale & PF_WSCALE_MASK; } else sws = dws = 0; /* * Sequence tracking algorithm from Guido van Rooij's paper: * http://www.madison-gurkha.com/publications/tcp_filtering/ * tcp_filtering.ps */ orig_seq = seq = ntohl(th->th_seq); if (src->seqlo == 0) { /* First packet from this end. Set its state */ if ((pd->flags & PFDESC_TCP_NORM || dst->scrub) && src->scrub == NULL) { if (pf_normalize_tcp_init(m, off, pd, th, src, dst)) { REASON_SET(reason, PFRES_MEMORY); return (PF_DROP); } } /* Deferred generation of sequence number modulator */ if (dst->seqdiff && !src->seqdiff) { /* use random iss for the TCP server */ while ((src->seqdiff = arc4random() - seq) == 0) ; ack = ntohl(th->th_ack) - dst->seqdiff; pf_change_a(&th->th_seq, &th->th_sum, htonl(seq + src->seqdiff), 0); pf_change_a(&th->th_ack, &th->th_sum, htonl(ack), 0); *copyback = 1; } else { ack = ntohl(th->th_ack); } end = seq + pd->p_len; if (th->th_flags & TH_SYN) { end++; if (dst->wscale & PF_WSCALE_FLAG) { src->wscale = pf_get_wscale(m, off, th->th_off, pd->af); if (src->wscale & PF_WSCALE_FLAG) { /* Remove scale factor from initial * window */ sws = src->wscale & PF_WSCALE_MASK; win = ((u_int32_t)win + (1 << sws) - 1) >> sws; dws = dst->wscale & PF_WSCALE_MASK; } else { /* fixup other window */ dst->max_win <<= dst->wscale & PF_WSCALE_MASK; /* in case of a retrans SYN|ACK */ dst->wscale = 0; } } } if (th->th_flags & TH_FIN) end++; src->seqlo = seq; if (src->state < TCPS_SYN_SENT) src->state = TCPS_SYN_SENT; /* * May need to slide the window (seqhi may have been set by * the crappy stack check or if we picked up the connection * after establishment) */ if (src->seqhi == 1 || SEQ_GEQ(end + MAX(1, dst->max_win << dws), src->seqhi)) src->seqhi = end + MAX(1, dst->max_win << dws); if (win > src->max_win) src->max_win = win; } else { ack = ntohl(th->th_ack) - dst->seqdiff; if (src->seqdiff) { /* Modulate sequence numbers */ pf_change_a(&th->th_seq, &th->th_sum, htonl(seq + src->seqdiff), 0); pf_change_a(&th->th_ack, &th->th_sum, htonl(ack), 0); *copyback = 1; } end = seq + pd->p_len; if (th->th_flags & TH_SYN) end++; if (th->th_flags & TH_FIN) end++; } if ((th->th_flags & TH_ACK) == 0) { /* Let it pass through the ack skew check */ ack = dst->seqlo; } else if ((ack == 0 && (th->th_flags & (TH_ACK|TH_RST)) == (TH_ACK|TH_RST)) || /* broken tcp stacks do not set ack */ (dst->state < TCPS_SYN_SENT)) { /* * Many stacks (ours included) will set the ACK number in an * FIN|ACK if the SYN times out -- no sequence to ACK. */ ack = dst->seqlo; } if (seq == end) { /* Ease sequencing restrictions on no data packets */ seq = src->seqlo; end = seq; } ackskew = dst->seqlo - ack; /* * Need to demodulate the sequence numbers in any TCP SACK options * (Selective ACK). We could optionally validate the SACK values * against the current ACK window, either forwards or backwards, but * I'm not confident that SACK has been implemented properly * everywhere. It wouldn't surprise me if several stacks accidently * SACK too far backwards of previously ACKed data. There really aren't * any security implications of bad SACKing unless the target stack * doesn't validate the option length correctly. Someone trying to * spoof into a TCP connection won't bother blindly sending SACK * options anyway. */ if (dst->seqdiff && (th->th_off << 2) > sizeof(struct tcphdr)) { if (pf_modulate_sack(m, off, pd, th, dst)) *copyback = 1; } #define MAXACKWINDOW (0xffff + 1500) /* 1500 is an arbitrary fudge factor */ if (SEQ_GEQ(src->seqhi, end) && /* Last octet inside other's window space */ SEQ_GEQ(seq, src->seqlo - (dst->max_win << dws)) && /* Retrans: not more than one window back */ (ackskew >= -MAXACKWINDOW) && /* Acking not more than one reassembled fragment backwards */ (ackskew <= (MAXACKWINDOW << sws)) && /* Acking not more than one window forward */ ((th->th_flags & TH_RST) == 0 || orig_seq == src->seqlo || (orig_seq == src->seqlo + 1) || (orig_seq + 1 == src->seqlo) || (pd->flags & PFDESC_IP_REAS) == 0)) { /* Require an exact/+1 sequence match on resets when possible */ if (dst->scrub || src->scrub) { if (pf_normalize_tcp_stateful(m, off, pd, reason, th, *state, src, dst, copyback)) return (PF_DROP); } /* update max window */ if (src->max_win < win) src->max_win = win; /* synchronize sequencing */ if (SEQ_GT(end, src->seqlo)) src->seqlo = end; /* slide the window of what the other end can send */ if (SEQ_GEQ(ack + (win << sws), dst->seqhi)) dst->seqhi = ack + MAX((win << sws), 1); /* update states */ if (th->th_flags & TH_SYN) if (src->state < TCPS_SYN_SENT) src->state = TCPS_SYN_SENT; if (th->th_flags & TH_FIN) if (src->state < TCPS_CLOSING) src->state = TCPS_CLOSING; if (th->th_flags & TH_ACK) { if (dst->state == TCPS_SYN_SENT) { dst->state = TCPS_ESTABLISHED; if (src->state == TCPS_ESTABLISHED && (*state)->src_node != NULL && pf_src_connlimit(state)) { REASON_SET(reason, PFRES_SRCLIMIT); return (PF_DROP); } } else if (dst->state == TCPS_CLOSING) dst->state = TCPS_FIN_WAIT_2; } if (th->th_flags & TH_RST) src->state = dst->state = TCPS_TIME_WAIT; /* update expire time */ (*state)->expire = time_uptime; if (src->state >= TCPS_FIN_WAIT_2 && dst->state >= TCPS_FIN_WAIT_2) (*state)->timeout = PFTM_TCP_CLOSED; else if (src->state >= TCPS_CLOSING && dst->state >= TCPS_CLOSING) (*state)->timeout = PFTM_TCP_FIN_WAIT; else if (src->state < TCPS_ESTABLISHED || dst->state < TCPS_ESTABLISHED) (*state)->timeout = PFTM_TCP_OPENING; else if (src->state >= TCPS_CLOSING || dst->state >= TCPS_CLOSING) (*state)->timeout = PFTM_TCP_CLOSING; else (*state)->timeout = PFTM_TCP_ESTABLISHED; /* Fall through to PASS packet */ } else if ((dst->state < TCPS_SYN_SENT || dst->state >= TCPS_FIN_WAIT_2 || src->state >= TCPS_FIN_WAIT_2) && SEQ_GEQ(src->seqhi + MAXACKWINDOW, end) && /* Within a window forward of the originating packet */ SEQ_GEQ(seq, src->seqlo - MAXACKWINDOW)) { /* Within a window backward of the originating packet */ /* * This currently handles three situations: * 1) Stupid stacks will shotgun SYNs before their peer * replies. * 2) When PF catches an already established stream (the * firewall rebooted, the state table was flushed, routes * changed...) * 3) Packets get funky immediately after the connection * closes (this should catch Solaris spurious ACK|FINs * that web servers like to spew after a close) * * This must be a little more careful than the above code * since packet floods will also be caught here. We don't * update the TTL here to mitigate the damage of a packet * flood and so the same code can handle awkward establishment * and a loosened connection close. * In the establishment case, a correct peer response will * validate the connection, go through the normal state code * and keep updating the state TTL. */ if (V_pf_status.debug >= PF_DEBUG_MISC) { printf("pf: loose state match: "); pf_print_state(*state); pf_print_flags(th->th_flags); printf(" seq=%u (%u) ack=%u len=%u ackskew=%d " "pkts=%llu:%llu dir=%s,%s\n", seq, orig_seq, ack, pd->p_len, ackskew, (unsigned long long)(*state)->packets[0], (unsigned long long)(*state)->packets[1], pd->dir == PF_IN ? "in" : "out", pd->dir == (*state)->direction ? "fwd" : "rev"); } if (dst->scrub || src->scrub) { if (pf_normalize_tcp_stateful(m, off, pd, reason, th, *state, src, dst, copyback)) return (PF_DROP); } /* update max window */ if (src->max_win < win) src->max_win = win; /* synchronize sequencing */ if (SEQ_GT(end, src->seqlo)) src->seqlo = end; /* slide the window of what the other end can send */ if (SEQ_GEQ(ack + (win << sws), dst->seqhi)) dst->seqhi = ack + MAX((win << sws), 1); /* * Cannot set dst->seqhi here since this could be a shotgunned * SYN and not an already established connection. */ if (th->th_flags & TH_FIN) if (src->state < TCPS_CLOSING) src->state = TCPS_CLOSING; if (th->th_flags & TH_RST) src->state = dst->state = TCPS_TIME_WAIT; /* Fall through to PASS packet */ } else { if ((*state)->dst.state == TCPS_SYN_SENT && (*state)->src.state == TCPS_SYN_SENT) { /* Send RST for state mismatches during handshake */ if (!(th->th_flags & TH_RST)) pf_send_tcp(NULL, (*state)->rule.ptr, pd->af, pd->dst, pd->src, th->th_dport, th->th_sport, ntohl(th->th_ack), 0, TH_RST, 0, 0, (*state)->rule.ptr->return_ttl, 1, 0, kif->pfik_ifp); src->seqlo = 0; src->seqhi = 1; src->max_win = 1; } else if (V_pf_status.debug >= PF_DEBUG_MISC) { printf("pf: BAD state: "); pf_print_state(*state); pf_print_flags(th->th_flags); printf(" seq=%u (%u) ack=%u len=%u ackskew=%d " "pkts=%llu:%llu dir=%s,%s\n", seq, orig_seq, ack, pd->p_len, ackskew, (unsigned long long)(*state)->packets[0], (unsigned long long)(*state)->packets[1], pd->dir == PF_IN ? "in" : "out", pd->dir == (*state)->direction ? "fwd" : "rev"); printf("pf: State failure on: %c %c %c %c | %c %c\n", SEQ_GEQ(src->seqhi, end) ? ' ' : '1', SEQ_GEQ(seq, src->seqlo - (dst->max_win << dws)) ? ' ': '2', (ackskew >= -MAXACKWINDOW) ? ' ' : '3', (ackskew <= (MAXACKWINDOW << sws)) ? ' ' : '4', SEQ_GEQ(src->seqhi + MAXACKWINDOW, end) ?' ' :'5', SEQ_GEQ(seq, src->seqlo - MAXACKWINDOW) ?' ' :'6'); } REASON_SET(reason, PFRES_BADSTATE); return (PF_DROP); } return (PF_PASS); } static int pf_tcp_track_sloppy(struct pf_state_peer *src, struct pf_state_peer *dst, struct pf_state **state, struct pf_pdesc *pd, u_short *reason) { struct tcphdr *th = pd->hdr.tcp; if (th->th_flags & TH_SYN) if (src->state < TCPS_SYN_SENT) src->state = TCPS_SYN_SENT; if (th->th_flags & TH_FIN) if (src->state < TCPS_CLOSING) src->state = TCPS_CLOSING; if (th->th_flags & TH_ACK) { if (dst->state == TCPS_SYN_SENT) { dst->state = TCPS_ESTABLISHED; if (src->state == TCPS_ESTABLISHED && (*state)->src_node != NULL && pf_src_connlimit(state)) { REASON_SET(reason, PFRES_SRCLIMIT); return (PF_DROP); } } else if (dst->state == TCPS_CLOSING) { dst->state = TCPS_FIN_WAIT_2; } else if (src->state == TCPS_SYN_SENT && dst->state < TCPS_SYN_SENT) { /* * Handle a special sloppy case where we only see one * half of the connection. If there is a ACK after * the initial SYN without ever seeing a packet from * the destination, set the connection to established. */ dst->state = src->state = TCPS_ESTABLISHED; if ((*state)->src_node != NULL && pf_src_connlimit(state)) { REASON_SET(reason, PFRES_SRCLIMIT); return (PF_DROP); } } else if (src->state == TCPS_CLOSING && dst->state == TCPS_ESTABLISHED && dst->seqlo == 0) { /* * Handle the closing of half connections where we * don't see the full bidirectional FIN/ACK+ACK * handshake. */ dst->state = TCPS_CLOSING; } } if (th->th_flags & TH_RST) src->state = dst->state = TCPS_TIME_WAIT; /* update expire time */ (*state)->expire = time_uptime; if (src->state >= TCPS_FIN_WAIT_2 && dst->state >= TCPS_FIN_WAIT_2) (*state)->timeout = PFTM_TCP_CLOSED; else if (src->state >= TCPS_CLOSING && dst->state >= TCPS_CLOSING) (*state)->timeout = PFTM_TCP_FIN_WAIT; else if (src->state < TCPS_ESTABLISHED || dst->state < TCPS_ESTABLISHED) (*state)->timeout = PFTM_TCP_OPENING; else if (src->state >= TCPS_CLOSING || dst->state >= TCPS_CLOSING) (*state)->timeout = PFTM_TCP_CLOSING; else (*state)->timeout = PFTM_TCP_ESTABLISHED; return (PF_PASS); } static int pf_test_state_tcp(struct pf_state **state, int direction, struct pfi_kif *kif, struct mbuf *m, int off, void *h, struct pf_pdesc *pd, u_short *reason) { struct pf_state_key_cmp key; struct tcphdr *th = pd->hdr.tcp; int copyback = 0; struct pf_state_peer *src, *dst; struct pf_state_key *sk; bzero(&key, sizeof(key)); key.af = pd->af; key.proto = IPPROTO_TCP; if (direction == PF_IN) { /* wire side, straight */ PF_ACPY(&key.addr[0], pd->src, key.af); PF_ACPY(&key.addr[1], pd->dst, key.af); key.port[0] = th->th_sport; key.port[1] = th->th_dport; } else { /* stack side, reverse */ PF_ACPY(&key.addr[1], pd->src, key.af); PF_ACPY(&key.addr[0], pd->dst, key.af); key.port[1] = th->th_sport; key.port[0] = th->th_dport; } STATE_LOOKUP(kif, &key, direction, *state, pd); if (direction == (*state)->direction) { src = &(*state)->src; dst = &(*state)->dst; } else { src = &(*state)->dst; dst = &(*state)->src; } sk = (*state)->key[pd->didx]; if ((*state)->src.state == PF_TCPS_PROXY_SRC) { if (direction != (*state)->direction) { REASON_SET(reason, PFRES_SYNPROXY); return (PF_SYNPROXY_DROP); } if (th->th_flags & TH_SYN) { if (ntohl(th->th_seq) != (*state)->src.seqlo) { REASON_SET(reason, PFRES_SYNPROXY); return (PF_DROP); } pf_send_tcp(NULL, (*state)->rule.ptr, pd->af, pd->dst, pd->src, th->th_dport, th->th_sport, (*state)->src.seqhi, ntohl(th->th_seq) + 1, TH_SYN|TH_ACK, 0, (*state)->src.mss, 0, 1, 0, NULL); REASON_SET(reason, PFRES_SYNPROXY); return (PF_SYNPROXY_DROP); } else if (!(th->th_flags & TH_ACK) || (ntohl(th->th_ack) != (*state)->src.seqhi + 1) || (ntohl(th->th_seq) != (*state)->src.seqlo + 1)) { REASON_SET(reason, PFRES_SYNPROXY); return (PF_DROP); } else if ((*state)->src_node != NULL && pf_src_connlimit(state)) { REASON_SET(reason, PFRES_SRCLIMIT); return (PF_DROP); } else (*state)->src.state = PF_TCPS_PROXY_DST; } if ((*state)->src.state == PF_TCPS_PROXY_DST) { if (direction == (*state)->direction) { if (((th->th_flags & (TH_SYN|TH_ACK)) != TH_ACK) || (ntohl(th->th_ack) != (*state)->src.seqhi + 1) || (ntohl(th->th_seq) != (*state)->src.seqlo + 1)) { REASON_SET(reason, PFRES_SYNPROXY); return (PF_DROP); } (*state)->src.max_win = MAX(ntohs(th->th_win), 1); if ((*state)->dst.seqhi == 1) (*state)->dst.seqhi = htonl(arc4random()); pf_send_tcp(NULL, (*state)->rule.ptr, pd->af, &sk->addr[pd->sidx], &sk->addr[pd->didx], sk->port[pd->sidx], sk->port[pd->didx], (*state)->dst.seqhi, 0, TH_SYN, 0, (*state)->src.mss, 0, 0, (*state)->tag, NULL); REASON_SET(reason, PFRES_SYNPROXY); return (PF_SYNPROXY_DROP); } else if (((th->th_flags & (TH_SYN|TH_ACK)) != (TH_SYN|TH_ACK)) || (ntohl(th->th_ack) != (*state)->dst.seqhi + 1)) { REASON_SET(reason, PFRES_SYNPROXY); return (PF_DROP); } else { (*state)->dst.max_win = MAX(ntohs(th->th_win), 1); (*state)->dst.seqlo = ntohl(th->th_seq); pf_send_tcp(NULL, (*state)->rule.ptr, pd->af, pd->dst, pd->src, th->th_dport, th->th_sport, ntohl(th->th_ack), ntohl(th->th_seq) + 1, TH_ACK, (*state)->src.max_win, 0, 0, 0, (*state)->tag, NULL); pf_send_tcp(NULL, (*state)->rule.ptr, pd->af, &sk->addr[pd->sidx], &sk->addr[pd->didx], sk->port[pd->sidx], sk->port[pd->didx], (*state)->src.seqhi + 1, (*state)->src.seqlo + 1, TH_ACK, (*state)->dst.max_win, 0, 0, 1, 0, NULL); (*state)->src.seqdiff = (*state)->dst.seqhi - (*state)->src.seqlo; (*state)->dst.seqdiff = (*state)->src.seqhi - (*state)->dst.seqlo; (*state)->src.seqhi = (*state)->src.seqlo + (*state)->dst.max_win; (*state)->dst.seqhi = (*state)->dst.seqlo + (*state)->src.max_win; (*state)->src.wscale = (*state)->dst.wscale = 0; (*state)->src.state = (*state)->dst.state = TCPS_ESTABLISHED; REASON_SET(reason, PFRES_SYNPROXY); return (PF_SYNPROXY_DROP); } } if (((th->th_flags & (TH_SYN|TH_ACK)) == TH_SYN) && dst->state >= TCPS_FIN_WAIT_2 && src->state >= TCPS_FIN_WAIT_2) { if (V_pf_status.debug >= PF_DEBUG_MISC) { printf("pf: state reuse "); pf_print_state(*state); pf_print_flags(th->th_flags); printf("\n"); } /* XXX make sure it's the same direction ?? */ (*state)->src.state = (*state)->dst.state = TCPS_CLOSED; pf_unlink_state(*state, PF_ENTER_LOCKED); *state = NULL; return (PF_DROP); } if ((*state)->state_flags & PFSTATE_SLOPPY) { if (pf_tcp_track_sloppy(src, dst, state, pd, reason) == PF_DROP) return (PF_DROP); } else { if (pf_tcp_track_full(src, dst, state, kif, m, off, pd, reason, ©back) == PF_DROP) return (PF_DROP); } /* translate source/destination address, if necessary */ if ((*state)->key[PF_SK_WIRE] != (*state)->key[PF_SK_STACK]) { struct pf_state_key *nk = (*state)->key[pd->didx]; if (PF_ANEQ(pd->src, &nk->addr[pd->sidx], pd->af) || nk->port[pd->sidx] != th->th_sport) pf_change_ap(pd->src, &th->th_sport, pd->ip_sum, &th->th_sum, &nk->addr[pd->sidx], nk->port[pd->sidx], 0, pd->af); if (PF_ANEQ(pd->dst, &nk->addr[pd->didx], pd->af) || nk->port[pd->didx] != th->th_dport) pf_change_ap(pd->dst, &th->th_dport, pd->ip_sum, &th->th_sum, &nk->addr[pd->didx], nk->port[pd->didx], 0, pd->af); copyback = 1; } /* Copyback sequence modulation or stateful scrub changes if needed */ if (copyback) m_copyback(m, off, sizeof(*th), (caddr_t)th); return (PF_PASS); } static int pf_test_state_udp(struct pf_state **state, int direction, struct pfi_kif *kif, struct mbuf *m, int off, void *h, struct pf_pdesc *pd) { struct pf_state_peer *src, *dst; struct pf_state_key_cmp key; struct udphdr *uh = pd->hdr.udp; bzero(&key, sizeof(key)); key.af = pd->af; key.proto = IPPROTO_UDP; if (direction == PF_IN) { /* wire side, straight */ PF_ACPY(&key.addr[0], pd->src, key.af); PF_ACPY(&key.addr[1], pd->dst, key.af); key.port[0] = uh->uh_sport; key.port[1] = uh->uh_dport; } else { /* stack side, reverse */ PF_ACPY(&key.addr[1], pd->src, key.af); PF_ACPY(&key.addr[0], pd->dst, key.af); key.port[1] = uh->uh_sport; key.port[0] = uh->uh_dport; } STATE_LOOKUP(kif, &key, direction, *state, pd); if (direction == (*state)->direction) { src = &(*state)->src; dst = &(*state)->dst; } else { src = &(*state)->dst; dst = &(*state)->src; } /* update states */ if (src->state < PFUDPS_SINGLE) src->state = PFUDPS_SINGLE; if (dst->state == PFUDPS_SINGLE) dst->state = PFUDPS_MULTIPLE; /* update expire time */ (*state)->expire = time_uptime; if (src->state == PFUDPS_MULTIPLE && dst->state == PFUDPS_MULTIPLE) (*state)->timeout = PFTM_UDP_MULTIPLE; else (*state)->timeout = PFTM_UDP_SINGLE; /* translate source/destination address, if necessary */ if ((*state)->key[PF_SK_WIRE] != (*state)->key[PF_SK_STACK]) { struct pf_state_key *nk = (*state)->key[pd->didx]; if (PF_ANEQ(pd->src, &nk->addr[pd->sidx], pd->af) || nk->port[pd->sidx] != uh->uh_sport) pf_change_ap(pd->src, &uh->uh_sport, pd->ip_sum, &uh->uh_sum, &nk->addr[pd->sidx], nk->port[pd->sidx], 1, pd->af); if (PF_ANEQ(pd->dst, &nk->addr[pd->didx], pd->af) || nk->port[pd->didx] != uh->uh_dport) pf_change_ap(pd->dst, &uh->uh_dport, pd->ip_sum, &uh->uh_sum, &nk->addr[pd->didx], nk->port[pd->didx], 1, pd->af); m_copyback(m, off, sizeof(*uh), (caddr_t)uh); } return (PF_PASS); } static int pf_test_state_icmp(struct pf_state **state, int direction, struct pfi_kif *kif, struct mbuf *m, int off, void *h, struct pf_pdesc *pd, u_short *reason) { struct pf_addr *saddr = pd->src, *daddr = pd->dst; u_int16_t icmpid = 0, *icmpsum; u_int8_t icmptype; int state_icmp = 0; struct pf_state_key_cmp key; bzero(&key, sizeof(key)); switch (pd->proto) { #ifdef INET case IPPROTO_ICMP: icmptype = pd->hdr.icmp->icmp_type; icmpid = pd->hdr.icmp->icmp_id; icmpsum = &pd->hdr.icmp->icmp_cksum; if (icmptype == ICMP_UNREACH || icmptype == ICMP_SOURCEQUENCH || icmptype == ICMP_REDIRECT || icmptype == ICMP_TIMXCEED || icmptype == ICMP_PARAMPROB) state_icmp++; break; #endif /* INET */ #ifdef INET6 case IPPROTO_ICMPV6: icmptype = pd->hdr.icmp6->icmp6_type; icmpid = pd->hdr.icmp6->icmp6_id; icmpsum = &pd->hdr.icmp6->icmp6_cksum; if (icmptype == ICMP6_DST_UNREACH || icmptype == ICMP6_PACKET_TOO_BIG || icmptype == ICMP6_TIME_EXCEEDED || icmptype == ICMP6_PARAM_PROB) state_icmp++; break; #endif /* INET6 */ } if (!state_icmp) { /* * ICMP query/reply message not related to a TCP/UDP packet. * Search for an ICMP state. */ key.af = pd->af; key.proto = pd->proto; key.port[0] = key.port[1] = icmpid; if (direction == PF_IN) { /* wire side, straight */ PF_ACPY(&key.addr[0], pd->src, key.af); PF_ACPY(&key.addr[1], pd->dst, key.af); } else { /* stack side, reverse */ PF_ACPY(&key.addr[1], pd->src, key.af); PF_ACPY(&key.addr[0], pd->dst, key.af); } STATE_LOOKUP(kif, &key, direction, *state, pd); (*state)->expire = time_uptime; (*state)->timeout = PFTM_ICMP_ERROR_REPLY; /* translate source/destination address, if necessary */ if ((*state)->key[PF_SK_WIRE] != (*state)->key[PF_SK_STACK]) { struct pf_state_key *nk = (*state)->key[pd->didx]; switch (pd->af) { #ifdef INET case AF_INET: if (PF_ANEQ(pd->src, &nk->addr[pd->sidx], AF_INET)) pf_change_a(&saddr->v4.s_addr, pd->ip_sum, nk->addr[pd->sidx].v4.s_addr, 0); if (PF_ANEQ(pd->dst, &nk->addr[pd->didx], AF_INET)) pf_change_a(&daddr->v4.s_addr, pd->ip_sum, nk->addr[pd->didx].v4.s_addr, 0); if (nk->port[0] != pd->hdr.icmp->icmp_id) { pd->hdr.icmp->icmp_cksum = pf_cksum_fixup( pd->hdr.icmp->icmp_cksum, icmpid, nk->port[pd->sidx], 0); pd->hdr.icmp->icmp_id = nk->port[pd->sidx]; } m_copyback(m, off, ICMP_MINLEN, (caddr_t )pd->hdr.icmp); break; #endif /* INET */ #ifdef INET6 case AF_INET6: if (PF_ANEQ(pd->src, &nk->addr[pd->sidx], AF_INET6)) pf_change_a6(saddr, &pd->hdr.icmp6->icmp6_cksum, &nk->addr[pd->sidx], 0); if (PF_ANEQ(pd->dst, &nk->addr[pd->didx], AF_INET6)) pf_change_a6(daddr, &pd->hdr.icmp6->icmp6_cksum, &nk->addr[pd->didx], 0); m_copyback(m, off, sizeof(struct icmp6_hdr), (caddr_t )pd->hdr.icmp6); break; #endif /* INET6 */ } } return (PF_PASS); } else { /* * ICMP error message in response to a TCP/UDP packet. * Extract the inner TCP/UDP header and search for that state. */ struct pf_pdesc pd2; bzero(&pd2, sizeof pd2); #ifdef INET struct ip h2; #endif /* INET */ #ifdef INET6 struct ip6_hdr h2_6; int terminal = 0; #endif /* INET6 */ int ipoff2 = 0; int off2 = 0; pd2.af = pd->af; /* Payload packet is from the opposite direction. */ pd2.sidx = (direction == PF_IN) ? 1 : 0; pd2.didx = (direction == PF_IN) ? 0 : 1; switch (pd->af) { #ifdef INET case AF_INET: /* offset of h2 in mbuf chain */ ipoff2 = off + ICMP_MINLEN; if (!pf_pull_hdr(m, ipoff2, &h2, sizeof(h2), NULL, reason, pd2.af)) { DPFPRINTF(PF_DEBUG_MISC, ("pf: ICMP error message too short " "(ip)\n")); return (PF_DROP); } /* * ICMP error messages don't refer to non-first * fragments */ if (h2.ip_off & htons(IP_OFFMASK)) { REASON_SET(reason, PFRES_FRAG); return (PF_DROP); } /* offset of protocol header that follows h2 */ off2 = ipoff2 + (h2.ip_hl << 2); pd2.proto = h2.ip_p; pd2.src = (struct pf_addr *)&h2.ip_src; pd2.dst = (struct pf_addr *)&h2.ip_dst; pd2.ip_sum = &h2.ip_sum; break; #endif /* INET */ #ifdef INET6 case AF_INET6: ipoff2 = off + sizeof(struct icmp6_hdr); if (!pf_pull_hdr(m, ipoff2, &h2_6, sizeof(h2_6), NULL, reason, pd2.af)) { DPFPRINTF(PF_DEBUG_MISC, ("pf: ICMP error message too short " "(ip6)\n")); return (PF_DROP); } pd2.proto = h2_6.ip6_nxt; pd2.src = (struct pf_addr *)&h2_6.ip6_src; pd2.dst = (struct pf_addr *)&h2_6.ip6_dst; pd2.ip_sum = NULL; off2 = ipoff2 + sizeof(h2_6); do { switch (pd2.proto) { case IPPROTO_FRAGMENT: /* * ICMPv6 error messages for * non-first fragments */ REASON_SET(reason, PFRES_FRAG); return (PF_DROP); case IPPROTO_AH: case IPPROTO_HOPOPTS: case IPPROTO_ROUTING: case IPPROTO_DSTOPTS: { /* get next header and header length */ struct ip6_ext opt6; if (!pf_pull_hdr(m, off2, &opt6, sizeof(opt6), NULL, reason, pd2.af)) { DPFPRINTF(PF_DEBUG_MISC, ("pf: ICMPv6 short opt\n")); return (PF_DROP); } if (pd2.proto == IPPROTO_AH) off2 += (opt6.ip6e_len + 2) * 4; else off2 += (opt6.ip6e_len + 1) * 8; pd2.proto = opt6.ip6e_nxt; /* goto the next header */ break; } default: terminal++; break; } } while (!terminal); break; #endif /* INET6 */ } switch (pd2.proto) { case IPPROTO_TCP: { struct tcphdr th; u_int32_t seq; struct pf_state_peer *src, *dst; u_int8_t dws; int copyback = 0; /* * Only the first 8 bytes of the TCP header can be * expected. Don't access any TCP header fields after * th_seq, an ackskew test is not possible. */ if (!pf_pull_hdr(m, off2, &th, 8, NULL, reason, pd2.af)) { DPFPRINTF(PF_DEBUG_MISC, ("pf: ICMP error message too short " "(tcp)\n")); return (PF_DROP); } key.af = pd2.af; key.proto = IPPROTO_TCP; PF_ACPY(&key.addr[pd2.sidx], pd2.src, key.af); PF_ACPY(&key.addr[pd2.didx], pd2.dst, key.af); key.port[pd2.sidx] = th.th_sport; key.port[pd2.didx] = th.th_dport; STATE_LOOKUP(kif, &key, direction, *state, pd); if (direction == (*state)->direction) { src = &(*state)->dst; dst = &(*state)->src; } else { src = &(*state)->src; dst = &(*state)->dst; } if (src->wscale && dst->wscale) dws = dst->wscale & PF_WSCALE_MASK; else dws = 0; /* Demodulate sequence number */ seq = ntohl(th.th_seq) - src->seqdiff; if (src->seqdiff) { pf_change_a(&th.th_seq, icmpsum, htonl(seq), 0); copyback = 1; } if (!((*state)->state_flags & PFSTATE_SLOPPY) && (!SEQ_GEQ(src->seqhi, seq) || !SEQ_GEQ(seq, src->seqlo - (dst->max_win << dws)))) { if (V_pf_status.debug >= PF_DEBUG_MISC) { printf("pf: BAD ICMP %d:%d ", icmptype, pd->hdr.icmp->icmp_code); pf_print_host(pd->src, 0, pd->af); printf(" -> "); pf_print_host(pd->dst, 0, pd->af); printf(" state: "); pf_print_state(*state); printf(" seq=%u\n", seq); } REASON_SET(reason, PFRES_BADSTATE); return (PF_DROP); } else { if (V_pf_status.debug >= PF_DEBUG_MISC) { printf("pf: OK ICMP %d:%d ", icmptype, pd->hdr.icmp->icmp_code); pf_print_host(pd->src, 0, pd->af); printf(" -> "); pf_print_host(pd->dst, 0, pd->af); printf(" state: "); pf_print_state(*state); printf(" seq=%u\n", seq); } } /* translate source/destination address, if necessary */ if ((*state)->key[PF_SK_WIRE] != (*state)->key[PF_SK_STACK]) { struct pf_state_key *nk = (*state)->key[pd->didx]; if (PF_ANEQ(pd2.src, &nk->addr[pd2.sidx], pd2.af) || nk->port[pd2.sidx] != th.th_sport) pf_change_icmp(pd2.src, &th.th_sport, daddr, &nk->addr[pd2.sidx], nk->port[pd2.sidx], NULL, pd2.ip_sum, icmpsum, pd->ip_sum, 0, pd2.af); if (PF_ANEQ(pd2.dst, &nk->addr[pd2.didx], pd2.af) || nk->port[pd2.didx] != th.th_dport) pf_change_icmp(pd2.dst, &th.th_dport, NULL, /* XXX Inbound NAT? */ &nk->addr[pd2.didx], nk->port[pd2.didx], NULL, pd2.ip_sum, icmpsum, pd->ip_sum, 0, pd2.af); copyback = 1; } if (copyback) { switch (pd2.af) { #ifdef INET case AF_INET: m_copyback(m, off, ICMP_MINLEN, (caddr_t )pd->hdr.icmp); m_copyback(m, ipoff2, sizeof(h2), (caddr_t )&h2); break; #endif /* INET */ #ifdef INET6 case AF_INET6: m_copyback(m, off, sizeof(struct icmp6_hdr), (caddr_t )pd->hdr.icmp6); m_copyback(m, ipoff2, sizeof(h2_6), (caddr_t )&h2_6); break; #endif /* INET6 */ } m_copyback(m, off2, 8, (caddr_t)&th); } return (PF_PASS); break; } case IPPROTO_UDP: { struct udphdr uh; if (!pf_pull_hdr(m, off2, &uh, sizeof(uh), NULL, reason, pd2.af)) { DPFPRINTF(PF_DEBUG_MISC, ("pf: ICMP error message too short " "(udp)\n")); return (PF_DROP); } key.af = pd2.af; key.proto = IPPROTO_UDP; PF_ACPY(&key.addr[pd2.sidx], pd2.src, key.af); PF_ACPY(&key.addr[pd2.didx], pd2.dst, key.af); key.port[pd2.sidx] = uh.uh_sport; key.port[pd2.didx] = uh.uh_dport; STATE_LOOKUP(kif, &key, direction, *state, pd); /* translate source/destination address, if necessary */ if ((*state)->key[PF_SK_WIRE] != (*state)->key[PF_SK_STACK]) { struct pf_state_key *nk = (*state)->key[pd->didx]; if (PF_ANEQ(pd2.src, &nk->addr[pd2.sidx], pd2.af) || nk->port[pd2.sidx] != uh.uh_sport) pf_change_icmp(pd2.src, &uh.uh_sport, daddr, &nk->addr[pd2.sidx], nk->port[pd2.sidx], &uh.uh_sum, pd2.ip_sum, icmpsum, pd->ip_sum, 1, pd2.af); if (PF_ANEQ(pd2.dst, &nk->addr[pd2.didx], pd2.af) || nk->port[pd2.didx] != uh.uh_dport) pf_change_icmp(pd2.dst, &uh.uh_dport, NULL, /* XXX Inbound NAT? */ &nk->addr[pd2.didx], nk->port[pd2.didx], &uh.uh_sum, pd2.ip_sum, icmpsum, pd->ip_sum, 1, pd2.af); switch (pd2.af) { #ifdef INET case AF_INET: m_copyback(m, off, ICMP_MINLEN, (caddr_t )pd->hdr.icmp); m_copyback(m, ipoff2, sizeof(h2), (caddr_t)&h2); break; #endif /* INET */ #ifdef INET6 case AF_INET6: m_copyback(m, off, sizeof(struct icmp6_hdr), (caddr_t )pd->hdr.icmp6); m_copyback(m, ipoff2, sizeof(h2_6), (caddr_t )&h2_6); break; #endif /* INET6 */ } m_copyback(m, off2, sizeof(uh), (caddr_t)&uh); } return (PF_PASS); break; } #ifdef INET case IPPROTO_ICMP: { struct icmp iih; if (!pf_pull_hdr(m, off2, &iih, ICMP_MINLEN, NULL, reason, pd2.af)) { DPFPRINTF(PF_DEBUG_MISC, ("pf: ICMP error message too short i" "(icmp)\n")); return (PF_DROP); } key.af = pd2.af; key.proto = IPPROTO_ICMP; PF_ACPY(&key.addr[pd2.sidx], pd2.src, key.af); PF_ACPY(&key.addr[pd2.didx], pd2.dst, key.af); key.port[0] = key.port[1] = iih.icmp_id; STATE_LOOKUP(kif, &key, direction, *state, pd); /* translate source/destination address, if necessary */ if ((*state)->key[PF_SK_WIRE] != (*state)->key[PF_SK_STACK]) { struct pf_state_key *nk = (*state)->key[pd->didx]; if (PF_ANEQ(pd2.src, &nk->addr[pd2.sidx], pd2.af) || nk->port[pd2.sidx] != iih.icmp_id) pf_change_icmp(pd2.src, &iih.icmp_id, daddr, &nk->addr[pd2.sidx], nk->port[pd2.sidx], NULL, pd2.ip_sum, icmpsum, pd->ip_sum, 0, AF_INET); if (PF_ANEQ(pd2.dst, &nk->addr[pd2.didx], pd2.af) || nk->port[pd2.didx] != iih.icmp_id) pf_change_icmp(pd2.dst, &iih.icmp_id, NULL, /* XXX Inbound NAT? */ &nk->addr[pd2.didx], nk->port[pd2.didx], NULL, pd2.ip_sum, icmpsum, pd->ip_sum, 0, AF_INET); m_copyback(m, off, ICMP_MINLEN, (caddr_t)pd->hdr.icmp); m_copyback(m, ipoff2, sizeof(h2), (caddr_t)&h2); m_copyback(m, off2, ICMP_MINLEN, (caddr_t)&iih); } return (PF_PASS); break; } #endif /* INET */ #ifdef INET6 case IPPROTO_ICMPV6: { struct icmp6_hdr iih; if (!pf_pull_hdr(m, off2, &iih, sizeof(struct icmp6_hdr), NULL, reason, pd2.af)) { DPFPRINTF(PF_DEBUG_MISC, ("pf: ICMP error message too short " "(icmp6)\n")); return (PF_DROP); } key.af = pd2.af; key.proto = IPPROTO_ICMPV6; PF_ACPY(&key.addr[pd2.sidx], pd2.src, key.af); PF_ACPY(&key.addr[pd2.didx], pd2.dst, key.af); key.port[0] = key.port[1] = iih.icmp6_id; STATE_LOOKUP(kif, &key, direction, *state, pd); /* translate source/destination address, if necessary */ if ((*state)->key[PF_SK_WIRE] != (*state)->key[PF_SK_STACK]) { struct pf_state_key *nk = (*state)->key[pd->didx]; if (PF_ANEQ(pd2.src, &nk->addr[pd2.sidx], pd2.af) || nk->port[pd2.sidx] != iih.icmp6_id) pf_change_icmp(pd2.src, &iih.icmp6_id, daddr, &nk->addr[pd2.sidx], nk->port[pd2.sidx], NULL, pd2.ip_sum, icmpsum, pd->ip_sum, 0, AF_INET6); if (PF_ANEQ(pd2.dst, &nk->addr[pd2.didx], pd2.af) || nk->port[pd2.didx] != iih.icmp6_id) pf_change_icmp(pd2.dst, &iih.icmp6_id, NULL, /* XXX Inbound NAT? */ &nk->addr[pd2.didx], nk->port[pd2.didx], NULL, pd2.ip_sum, icmpsum, pd->ip_sum, 0, AF_INET6); m_copyback(m, off, sizeof(struct icmp6_hdr), (caddr_t)pd->hdr.icmp6); m_copyback(m, ipoff2, sizeof(h2_6), (caddr_t)&h2_6); m_copyback(m, off2, sizeof(struct icmp6_hdr), (caddr_t)&iih); } return (PF_PASS); break; } #endif /* INET6 */ default: { key.af = pd2.af; key.proto = pd2.proto; PF_ACPY(&key.addr[pd2.sidx], pd2.src, key.af); PF_ACPY(&key.addr[pd2.didx], pd2.dst, key.af); key.port[0] = key.port[1] = 0; STATE_LOOKUP(kif, &key, direction, *state, pd); /* translate source/destination address, if necessary */ if ((*state)->key[PF_SK_WIRE] != (*state)->key[PF_SK_STACK]) { struct pf_state_key *nk = (*state)->key[pd->didx]; if (PF_ANEQ(pd2.src, &nk->addr[pd2.sidx], pd2.af)) pf_change_icmp(pd2.src, NULL, daddr, &nk->addr[pd2.sidx], 0, NULL, pd2.ip_sum, icmpsum, pd->ip_sum, 0, pd2.af); if (PF_ANEQ(pd2.dst, &nk->addr[pd2.didx], pd2.af)) pf_change_icmp(pd2.src, NULL, NULL, /* XXX Inbound NAT? */ &nk->addr[pd2.didx], 0, NULL, pd2.ip_sum, icmpsum, pd->ip_sum, 0, pd2.af); switch (pd2.af) { #ifdef INET case AF_INET: m_copyback(m, off, ICMP_MINLEN, (caddr_t)pd->hdr.icmp); m_copyback(m, ipoff2, sizeof(h2), (caddr_t)&h2); break; #endif /* INET */ #ifdef INET6 case AF_INET6: m_copyback(m, off, sizeof(struct icmp6_hdr), (caddr_t )pd->hdr.icmp6); m_copyback(m, ipoff2, sizeof(h2_6), (caddr_t )&h2_6); break; #endif /* INET6 */ } } return (PF_PASS); break; } } } } static int pf_test_state_other(struct pf_state **state, int direction, struct pfi_kif *kif, struct mbuf *m, struct pf_pdesc *pd) { struct pf_state_peer *src, *dst; struct pf_state_key_cmp key; bzero(&key, sizeof(key)); key.af = pd->af; key.proto = pd->proto; if (direction == PF_IN) { PF_ACPY(&key.addr[0], pd->src, key.af); PF_ACPY(&key.addr[1], pd->dst, key.af); key.port[0] = key.port[1] = 0; } else { PF_ACPY(&key.addr[1], pd->src, key.af); PF_ACPY(&key.addr[0], pd->dst, key.af); key.port[1] = key.port[0] = 0; } STATE_LOOKUP(kif, &key, direction, *state, pd); if (direction == (*state)->direction) { src = &(*state)->src; dst = &(*state)->dst; } else { src = &(*state)->dst; dst = &(*state)->src; } /* update states */ if (src->state < PFOTHERS_SINGLE) src->state = PFOTHERS_SINGLE; if (dst->state == PFOTHERS_SINGLE) dst->state = PFOTHERS_MULTIPLE; /* update expire time */ (*state)->expire = time_uptime; if (src->state == PFOTHERS_MULTIPLE && dst->state == PFOTHERS_MULTIPLE) (*state)->timeout = PFTM_OTHER_MULTIPLE; else (*state)->timeout = PFTM_OTHER_SINGLE; /* translate source/destination address, if necessary */ if ((*state)->key[PF_SK_WIRE] != (*state)->key[PF_SK_STACK]) { struct pf_state_key *nk = (*state)->key[pd->didx]; KASSERT(nk, ("%s: nk is null", __func__)); KASSERT(pd, ("%s: pd is null", __func__)); KASSERT(pd->src, ("%s: pd->src is null", __func__)); KASSERT(pd->dst, ("%s: pd->dst is null", __func__)); switch (pd->af) { #ifdef INET case AF_INET: if (PF_ANEQ(pd->src, &nk->addr[pd->sidx], AF_INET)) pf_change_a(&pd->src->v4.s_addr, pd->ip_sum, nk->addr[pd->sidx].v4.s_addr, 0); if (PF_ANEQ(pd->dst, &nk->addr[pd->didx], AF_INET)) pf_change_a(&pd->dst->v4.s_addr, pd->ip_sum, nk->addr[pd->didx].v4.s_addr, 0); break; #endif /* INET */ #ifdef INET6 case AF_INET6: if (PF_ANEQ(pd->src, &nk->addr[pd->sidx], AF_INET)) PF_ACPY(pd->src, &nk->addr[pd->sidx], pd->af); if (PF_ANEQ(pd->dst, &nk->addr[pd->didx], AF_INET)) PF_ACPY(pd->dst, &nk->addr[pd->didx], pd->af); #endif /* INET6 */ } } return (PF_PASS); } /* * ipoff and off are measured from the start of the mbuf chain. * h must be at "ipoff" on the mbuf chain. */ void * pf_pull_hdr(struct mbuf *m, int off, void *p, int len, u_short *actionp, u_short *reasonp, sa_family_t af) { switch (af) { #ifdef INET case AF_INET: { struct ip *h = mtod(m, struct ip *); u_int16_t fragoff = (ntohs(h->ip_off) & IP_OFFMASK) << 3; if (fragoff) { if (fragoff >= len) ACTION_SET(actionp, PF_PASS); else { ACTION_SET(actionp, PF_DROP); REASON_SET(reasonp, PFRES_FRAG); } return (NULL); } if (m->m_pkthdr.len < off + len || ntohs(h->ip_len) < off + len) { ACTION_SET(actionp, PF_DROP); REASON_SET(reasonp, PFRES_SHORT); return (NULL); } break; } #endif /* INET */ #ifdef INET6 case AF_INET6: { struct ip6_hdr *h = mtod(m, struct ip6_hdr *); if (m->m_pkthdr.len < off + len || (ntohs(h->ip6_plen) + sizeof(struct ip6_hdr)) < (unsigned)(off + len)) { ACTION_SET(actionp, PF_DROP); REASON_SET(reasonp, PFRES_SHORT); return (NULL); } break; } #endif /* INET6 */ } m_copydata(m, off, len, p); return (p); } int pf_routable(struct pf_addr *addr, sa_family_t af, struct pfi_kif *kif, int rtableid) { #ifdef RADIX_MPATH struct radix_node_head *rnh; #endif struct sockaddr_in *dst; int ret = 1; int check_mpath; #ifdef INET6 struct sockaddr_in6 *dst6; struct route_in6 ro; #else struct route ro; #endif struct radix_node *rn; struct rtentry *rt; struct ifnet *ifp; check_mpath = 0; #ifdef RADIX_MPATH /* XXX: stick to table 0 for now */ rnh = rt_tables_get_rnh(0, af); if (rnh != NULL && rn_mpath_capable(rnh)) check_mpath = 1; #endif bzero(&ro, sizeof(ro)); switch (af) { case AF_INET: dst = satosin(&ro.ro_dst); dst->sin_family = AF_INET; dst->sin_len = sizeof(*dst); dst->sin_addr = addr->v4; break; #ifdef INET6 case AF_INET6: /* * Skip check for addresses with embedded interface scope, * as they would always match anyway. */ if (IN6_IS_SCOPE_EMBED(&addr->v6)) goto out; dst6 = (struct sockaddr_in6 *)&ro.ro_dst; dst6->sin6_family = AF_INET6; dst6->sin6_len = sizeof(*dst6); dst6->sin6_addr = addr->v6; break; #endif /* INET6 */ default: return (0); } /* Skip checks for ipsec interfaces */ if (kif != NULL && kif->pfik_ifp->if_type == IFT_ENC) goto out; switch (af) { #ifdef INET6 case AF_INET6: in6_rtalloc_ign(&ro, 0, rtableid); break; #endif #ifdef INET case AF_INET: in_rtalloc_ign((struct route *)&ro, 0, rtableid); break; #endif default: rtalloc_ign((struct route *)&ro, 0); /* No/default FIB. */ break; } if (ro.ro_rt != NULL) { /* No interface given, this is a no-route check */ if (kif == NULL) goto out; if (kif->pfik_ifp == NULL) { ret = 0; goto out; } /* Perform uRPF check if passed input interface */ ret = 0; rn = (struct radix_node *)ro.ro_rt; do { rt = (struct rtentry *)rn; ifp = rt->rt_ifp; if (kif->pfik_ifp == ifp) ret = 1; #ifdef RADIX_MPATH rn = rn_mpath_next(rn); #endif } while (check_mpath == 1 && rn != NULL && ret == 0); } else ret = 0; out: if (ro.ro_rt != NULL) RTFREE(ro.ro_rt); return (ret); } #ifdef INET static void pf_route(struct mbuf **m, struct pf_rule *r, int dir, struct ifnet *oifp, struct pf_state *s, struct pf_pdesc *pd) { struct mbuf *m0, *m1; struct sockaddr_in dst; struct ip *ip; struct ifnet *ifp = NULL; struct pf_addr naddr; struct pf_src_node *sn = NULL; int error = 0; uint16_t ip_len, ip_off; KASSERT(m && *m && r && oifp, ("%s: invalid parameters", __func__)); KASSERT(dir == PF_IN || dir == PF_OUT, ("%s: invalid direction", __func__)); if ((pd->pf_mtag == NULL && ((pd->pf_mtag = pf_get_mtag(*m)) == NULL)) || pd->pf_mtag->routed++ > 3) { m0 = *m; *m = NULL; goto bad_locked; } if (r->rt == PF_DUPTO) { if ((m0 = m_dup(*m, M_NOWAIT)) == NULL) { if (s) PF_STATE_UNLOCK(s); return; } } else { if ((r->rt == PF_REPLYTO) == (r->direction == dir)) { if (s) PF_STATE_UNLOCK(s); return; } m0 = *m; } ip = mtod(m0, struct ip *); bzero(&dst, sizeof(dst)); dst.sin_family = AF_INET; dst.sin_len = sizeof(dst); dst.sin_addr = ip->ip_dst; if (r->rt == PF_FASTROUTE) { struct rtentry *rt; if (s) PF_STATE_UNLOCK(s); rt = rtalloc1_fib(sintosa(&dst), 0, 0, M_GETFIB(m0)); if (rt == NULL) { KMOD_IPSTAT_INC(ips_noroute); error = EHOSTUNREACH; goto bad; } ifp = rt->rt_ifp; counter_u64_add(rt->rt_pksent, 1); if (rt->rt_flags & RTF_GATEWAY) bcopy(satosin(rt->rt_gateway), &dst, sizeof(dst)); RTFREE_LOCKED(rt); } else { if (TAILQ_EMPTY(&r->rpool.list)) { DPFPRINTF(PF_DEBUG_URGENT, ("%s: TAILQ_EMPTY(&r->rpool.list)\n", __func__)); goto bad_locked; } if (s == NULL) { pf_map_addr(AF_INET, r, (struct pf_addr *)&ip->ip_src, &naddr, NULL, &sn); if (!PF_AZERO(&naddr, AF_INET)) dst.sin_addr.s_addr = naddr.v4.s_addr; ifp = r->rpool.cur->kif ? r->rpool.cur->kif->pfik_ifp : NULL; } else { if (!PF_AZERO(&s->rt_addr, AF_INET)) dst.sin_addr.s_addr = s->rt_addr.v4.s_addr; ifp = s->rt_kif ? s->rt_kif->pfik_ifp : NULL; PF_STATE_UNLOCK(s); } } if (ifp == NULL) goto bad; if (oifp != ifp) { if (pf_test(PF_OUT, ifp, &m0, NULL) != PF_PASS) goto bad; else if (m0 == NULL) goto done; if (m0->m_len < sizeof(struct ip)) { DPFPRINTF(PF_DEBUG_URGENT, ("%s: m0->m_len < sizeof(struct ip)\n", __func__)); goto bad; } ip = mtod(m0, struct ip *); } if (ifp->if_flags & IFF_LOOPBACK) m0->m_flags |= M_SKIP_FIREWALL; ip_len = ntohs(ip->ip_len); ip_off = ntohs(ip->ip_off); /* Copied from FreeBSD 10.0-CURRENT ip_output. */ m0->m_pkthdr.csum_flags |= CSUM_IP; if (m0->m_pkthdr.csum_flags & CSUM_DELAY_DATA & ~ifp->if_hwassist) { in_delayed_cksum(m0); m0->m_pkthdr.csum_flags &= ~CSUM_DELAY_DATA; } #ifdef SCTP if (m0->m_pkthdr.csum_flags & CSUM_SCTP & ~ifp->if_hwassist) { sctp_delayed_cksum(m, (uint32_t)(ip->ip_hl << 2)); m0->m_pkthdr.csum_flags &= ~CSUM_SCTP; } #endif /* * If small enough for interface, or the interface will take * care of the fragmentation for us, we can just send directly. */ if (ip_len <= ifp->if_mtu || (m0->m_pkthdr.csum_flags & ifp->if_hwassist & CSUM_TSO) != 0) { ip->ip_sum = 0; if (m0->m_pkthdr.csum_flags & CSUM_IP & ~ifp->if_hwassist) { ip->ip_sum = in_cksum(m0, ip->ip_hl << 2); m0->m_pkthdr.csum_flags &= ~CSUM_IP; } m_clrprotoflags(m0); /* Avoid confusing lower layers. */ error = (*ifp->if_output)(ifp, m0, sintosa(&dst), NULL); goto done; } /* Balk when DF bit is set or the interface didn't support TSO. */ if ((ip_off & IP_DF) || (m0->m_pkthdr.csum_flags & CSUM_TSO)) { error = EMSGSIZE; KMOD_IPSTAT_INC(ips_cantfrag); if (r->rt != PF_DUPTO) { icmp_error(m0, ICMP_UNREACH, ICMP_UNREACH_NEEDFRAG, 0, ifp->if_mtu); goto done; } else goto bad; } error = ip_fragment(ip, &m0, ifp->if_mtu, ifp->if_hwassist); if (error) goto bad; for (; m0; m0 = m1) { m1 = m0->m_nextpkt; m0->m_nextpkt = NULL; if (error == 0) { m_clrprotoflags(m0); error = (*ifp->if_output)(ifp, m0, sintosa(&dst), NULL); } else m_freem(m0); } if (error == 0) KMOD_IPSTAT_INC(ips_fragmented); done: if (r->rt != PF_DUPTO) *m = NULL; return; bad_locked: if (s) PF_STATE_UNLOCK(s); bad: m_freem(m0); goto done; } #endif /* INET */ #ifdef INET6 static void pf_route6(struct mbuf **m, struct pf_rule *r, int dir, struct ifnet *oifp, struct pf_state *s, struct pf_pdesc *pd) { struct mbuf *m0; struct sockaddr_in6 dst; struct ip6_hdr *ip6; struct ifnet *ifp = NULL; struct pf_addr naddr; struct pf_src_node *sn = NULL; KASSERT(m && *m && r && oifp, ("%s: invalid parameters", __func__)); KASSERT(dir == PF_IN || dir == PF_OUT, ("%s: invalid direction", __func__)); if ((pd->pf_mtag == NULL && ((pd->pf_mtag = pf_get_mtag(*m)) == NULL)) || pd->pf_mtag->routed++ > 3) { m0 = *m; *m = NULL; goto bad_locked; } if (r->rt == PF_DUPTO) { if ((m0 = m_dup(*m, M_NOWAIT)) == NULL) { if (s) PF_STATE_UNLOCK(s); return; } } else { if ((r->rt == PF_REPLYTO) == (r->direction == dir)) { if (s) PF_STATE_UNLOCK(s); return; } m0 = *m; } ip6 = mtod(m0, struct ip6_hdr *); bzero(&dst, sizeof(dst)); dst.sin6_family = AF_INET6; dst.sin6_len = sizeof(dst); dst.sin6_addr = ip6->ip6_dst; /* Cheat. XXX why only in the v6 case??? */ if (r->rt == PF_FASTROUTE) { if (s) PF_STATE_UNLOCK(s); m0->m_flags |= M_SKIP_FIREWALL; ip6_output(m0, NULL, NULL, 0, NULL, NULL, NULL); *m = NULL; return; } if (TAILQ_EMPTY(&r->rpool.list)) { DPFPRINTF(PF_DEBUG_URGENT, ("%s: TAILQ_EMPTY(&r->rpool.list)\n", __func__)); goto bad_locked; } if (s == NULL) { pf_map_addr(AF_INET6, r, (struct pf_addr *)&ip6->ip6_src, &naddr, NULL, &sn); if (!PF_AZERO(&naddr, AF_INET6)) PF_ACPY((struct pf_addr *)&dst.sin6_addr, &naddr, AF_INET6); ifp = r->rpool.cur->kif ? r->rpool.cur->kif->pfik_ifp : NULL; } else { if (!PF_AZERO(&s->rt_addr, AF_INET6)) PF_ACPY((struct pf_addr *)&dst.sin6_addr, &s->rt_addr, AF_INET6); ifp = s->rt_kif ? s->rt_kif->pfik_ifp : NULL; } if (s) PF_STATE_UNLOCK(s); if (ifp == NULL) goto bad; if (oifp != ifp) { if (pf_test6(PF_FWD, ifp, &m0, NULL) != PF_PASS) goto bad; else if (m0 == NULL) goto done; if (m0->m_len < sizeof(struct ip6_hdr)) { DPFPRINTF(PF_DEBUG_URGENT, ("%s: m0->m_len < sizeof(struct ip6_hdr)\n", __func__)); goto bad; } ip6 = mtod(m0, struct ip6_hdr *); } if (ifp->if_flags & IFF_LOOPBACK) m0->m_flags |= M_SKIP_FIREWALL; /* * If the packet is too large for the outgoing interface, * send back an icmp6 error. */ if (IN6_IS_SCOPE_EMBED(&dst.sin6_addr)) dst.sin6_addr.s6_addr16[1] = htons(ifp->if_index); if ((u_long)m0->m_pkthdr.len <= ifp->if_mtu) nd6_output(ifp, ifp, m0, &dst, NULL); else { in6_ifstat_inc(ifp, ifs6_in_toobig); if (r->rt != PF_DUPTO) icmp6_error(m0, ICMP6_PACKET_TOO_BIG, 0, ifp->if_mtu); else goto bad; } done: if (r->rt != PF_DUPTO) *m = NULL; return; bad_locked: if (s) PF_STATE_UNLOCK(s); bad: m_freem(m0); goto done; } #endif /* INET6 */ /* * FreeBSD supports cksum offloads for the following drivers. * em(4), fxp(4), ixgb(4), lge(4), ndis(4), nge(4), re(4), * ti(4), txp(4), xl(4) * * CSUM_DATA_VALID | CSUM_PSEUDO_HDR : * network driver performed cksum including pseudo header, need to verify * csum_data * CSUM_DATA_VALID : * network driver performed cksum, needs to additional pseudo header * cksum computation with partial csum_data(i.e. lack of H/W support for * pseudo header, for instance hme(4), sk(4) and possibly gem(4)) * * After validating the cksum of packet, set both flag CSUM_DATA_VALID and * CSUM_PSEUDO_HDR in order to avoid recomputation of the cksum in upper * TCP/UDP layer. * Also, set csum_data to 0xffff to force cksum validation. */ static int pf_check_proto_cksum(struct mbuf *m, int off, int len, u_int8_t p, sa_family_t af) { u_int16_t sum = 0; int hw_assist = 0; struct ip *ip; if (off < sizeof(struct ip) || len < sizeof(struct udphdr)) return (1); if (m->m_pkthdr.len < off + len) return (1); switch (p) { case IPPROTO_TCP: if (m->m_pkthdr.csum_flags & CSUM_DATA_VALID) { if (m->m_pkthdr.csum_flags & CSUM_PSEUDO_HDR) { sum = m->m_pkthdr.csum_data; } else { ip = mtod(m, struct ip *); sum = in_pseudo(ip->ip_src.s_addr, ip->ip_dst.s_addr, htonl((u_short)len + m->m_pkthdr.csum_data + IPPROTO_TCP)); } sum ^= 0xffff; ++hw_assist; } break; case IPPROTO_UDP: if (m->m_pkthdr.csum_flags & CSUM_DATA_VALID) { if (m->m_pkthdr.csum_flags & CSUM_PSEUDO_HDR) { sum = m->m_pkthdr.csum_data; } else { ip = mtod(m, struct ip *); sum = in_pseudo(ip->ip_src.s_addr, ip->ip_dst.s_addr, htonl((u_short)len + m->m_pkthdr.csum_data + IPPROTO_UDP)); } sum ^= 0xffff; ++hw_assist; } break; case IPPROTO_ICMP: #ifdef INET6 case IPPROTO_ICMPV6: #endif /* INET6 */ break; default: return (1); } if (!hw_assist) { switch (af) { case AF_INET: if (p == IPPROTO_ICMP) { if (m->m_len < off) return (1); m->m_data += off; m->m_len -= off; sum = in_cksum(m, len); m->m_data -= off; m->m_len += off; } else { if (m->m_len < sizeof(struct ip)) return (1); sum = in4_cksum(m, p, off, len); } break; #ifdef INET6 case AF_INET6: if (m->m_len < sizeof(struct ip6_hdr)) return (1); sum = in6_cksum(m, p, off, len); break; #endif /* INET6 */ default: return (1); } } if (sum) { switch (p) { case IPPROTO_TCP: { KMOD_TCPSTAT_INC(tcps_rcvbadsum); break; } case IPPROTO_UDP: { KMOD_UDPSTAT_INC(udps_badsum); break; } #ifdef INET case IPPROTO_ICMP: { KMOD_ICMPSTAT_INC(icps_checksum); break; } #endif #ifdef INET6 case IPPROTO_ICMPV6: { KMOD_ICMP6STAT_INC(icp6s_checksum); break; } #endif /* INET6 */ } return (1); } else { if (p == IPPROTO_TCP || p == IPPROTO_UDP) { m->m_pkthdr.csum_flags |= (CSUM_DATA_VALID | CSUM_PSEUDO_HDR); m->m_pkthdr.csum_data = 0xffff; } } return (0); } #ifdef INET int pf_test(int dir, struct ifnet *ifp, struct mbuf **m0, struct inpcb *inp) { struct pfi_kif *kif; u_short action, reason = 0, log = 0; struct mbuf *m = *m0; struct ip *h = NULL; struct m_tag *ipfwtag; struct pf_rule *a = NULL, *r = &V_pf_default_rule, *tr, *nr; struct pf_state *s = NULL; struct pf_ruleset *ruleset = NULL; struct pf_pdesc pd; int off, dirndx, pqid = 0; M_ASSERTPKTHDR(m); if (!V_pf_status.running) return (PF_PASS); memset(&pd, 0, sizeof(pd)); kif = (struct pfi_kif *)ifp->if_pf_kif; if (kif == NULL) { DPFPRINTF(PF_DEBUG_URGENT, ("pf_test: kif == NULL, if_xname %s\n", ifp->if_xname)); return (PF_DROP); } if (kif->pfik_flags & PFI_IFLAG_SKIP) return (PF_PASS); if (m->m_flags & M_SKIP_FIREWALL) return (PF_PASS); pd.pf_mtag = pf_find_mtag(m); PF_RULES_RLOCK(); if (ip_divert_ptr != NULL && ((ipfwtag = m_tag_locate(m, MTAG_IPFW_RULE, 0, NULL)) != NULL)) { struct ipfw_rule_ref *rr = (struct ipfw_rule_ref *)(ipfwtag+1); if (rr->info & IPFW_IS_DIVERT && rr->rulenum == 0) { if (pd.pf_mtag == NULL && ((pd.pf_mtag = pf_get_mtag(m)) == NULL)) { action = PF_DROP; goto done; } pd.pf_mtag->flags |= PF_PACKET_LOOPED; m_tag_delete(m, ipfwtag); } if (pd.pf_mtag && pd.pf_mtag->flags & PF_FASTFWD_OURS_PRESENT) { m->m_flags |= M_FASTFWD_OURS; pd.pf_mtag->flags &= ~PF_FASTFWD_OURS_PRESENT; } } else if (pf_normalize_ip(m0, dir, kif, &reason, &pd) != PF_PASS) { /* We do IP header normalization and packet reassembly here */ action = PF_DROP; goto done; } m = *m0; /* pf_normalize messes with m0 */ h = mtod(m, struct ip *); off = h->ip_hl << 2; if (off < (int)sizeof(struct ip)) { action = PF_DROP; REASON_SET(&reason, PFRES_SHORT); log = 1; goto done; } pd.src = (struct pf_addr *)&h->ip_src; pd.dst = (struct pf_addr *)&h->ip_dst; pd.sport = pd.dport = NULL; pd.ip_sum = &h->ip_sum; pd.proto_sum = NULL; pd.proto = h->ip_p; pd.dir = dir; pd.sidx = (dir == PF_IN) ? 0 : 1; pd.didx = (dir == PF_IN) ? 1 : 0; pd.af = AF_INET; pd.tos = h->ip_tos; pd.tot_len = ntohs(h->ip_len); /* handle fragments that didn't get reassembled by normalization */ if (h->ip_off & htons(IP_MF | IP_OFFMASK)) { action = pf_test_fragment(&r, dir, kif, m, h, &pd, &a, &ruleset); goto done; } switch (h->ip_p) { case IPPROTO_TCP: { struct tcphdr th; pd.hdr.tcp = &th; if (!pf_pull_hdr(m, off, &th, sizeof(th), &action, &reason, AF_INET)) { log = action != PF_PASS; goto done; } pd.p_len = pd.tot_len - off - (th.th_off << 2); if ((th.th_flags & TH_ACK) && pd.p_len == 0) pqid = 1; action = pf_normalize_tcp(dir, kif, m, 0, off, h, &pd); if (action == PF_DROP) goto done; action = pf_test_state_tcp(&s, dir, kif, m, off, h, &pd, &reason); if (action == PF_PASS) { if (pfsync_update_state_ptr != NULL) pfsync_update_state_ptr(s); r = s->rule.ptr; a = s->anchor.ptr; log = s->log; } else if (s == NULL) action = pf_test_rule(&r, &s, dir, kif, m, off, &pd, &a, &ruleset, inp); break; } case IPPROTO_UDP: { struct udphdr uh; pd.hdr.udp = &uh; if (!pf_pull_hdr(m, off, &uh, sizeof(uh), &action, &reason, AF_INET)) { log = action != PF_PASS; goto done; } if (uh.uh_dport == 0 || ntohs(uh.uh_ulen) > m->m_pkthdr.len - off || ntohs(uh.uh_ulen) < sizeof(struct udphdr)) { action = PF_DROP; REASON_SET(&reason, PFRES_SHORT); goto done; } action = pf_test_state_udp(&s, dir, kif, m, off, h, &pd); if (action == PF_PASS) { if (pfsync_update_state_ptr != NULL) pfsync_update_state_ptr(s); r = s->rule.ptr; a = s->anchor.ptr; log = s->log; } else if (s == NULL) action = pf_test_rule(&r, &s, dir, kif, m, off, &pd, &a, &ruleset, inp); break; } case IPPROTO_ICMP: { struct icmp ih; pd.hdr.icmp = &ih; if (!pf_pull_hdr(m, off, &ih, ICMP_MINLEN, &action, &reason, AF_INET)) { log = action != PF_PASS; goto done; } action = pf_test_state_icmp(&s, dir, kif, m, off, h, &pd, &reason); if (action == PF_PASS) { if (pfsync_update_state_ptr != NULL) pfsync_update_state_ptr(s); r = s->rule.ptr; a = s->anchor.ptr; log = s->log; } else if (s == NULL) action = pf_test_rule(&r, &s, dir, kif, m, off, &pd, &a, &ruleset, inp); break; } #ifdef INET6 case IPPROTO_ICMPV6: { action = PF_DROP; DPFPRINTF(PF_DEBUG_MISC, ("pf: dropping IPv4 packet with ICMPv6 payload\n")); goto done; } #endif default: action = pf_test_state_other(&s, dir, kif, m, &pd); if (action == PF_PASS) { if (pfsync_update_state_ptr != NULL) pfsync_update_state_ptr(s); r = s->rule.ptr; a = s->anchor.ptr; log = s->log; } else if (s == NULL) action = pf_test_rule(&r, &s, dir, kif, m, off, &pd, &a, &ruleset, inp); break; } done: PF_RULES_RUNLOCK(); if (action == PF_PASS && h->ip_hl > 5 && !((s && s->state_flags & PFSTATE_ALLOWOPTS) || r->allow_opts)) { action = PF_DROP; REASON_SET(&reason, PFRES_IPOPTIONS); log = 1; DPFPRINTF(PF_DEBUG_MISC, ("pf: dropping packet with ip options\n")); } if (s && s->tag > 0 && pf_tag_packet(m, &pd, s->tag)) { action = PF_DROP; REASON_SET(&reason, PFRES_MEMORY); } if (r->rtableid >= 0) M_SETFIB(m, r->rtableid); #ifdef ALTQ if (action == PF_PASS && r->qid) { if (pd.pf_mtag == NULL && ((pd.pf_mtag = pf_get_mtag(m)) == NULL)) { action = PF_DROP; REASON_SET(&reason, PFRES_MEMORY); } if (pqid || (pd.tos & IPTOS_LOWDELAY)) pd.pf_mtag->qid = r->pqid; else pd.pf_mtag->qid = r->qid; /* add hints for ecn */ pd.pf_mtag->hdr = h; } #endif /* ALTQ */ /* * connections redirected to loopback should not match sockets * bound specifically to loopback due to security implications, * see tcp_input() and in_pcblookup_listen(). */ if (dir == PF_IN && action == PF_PASS && (pd.proto == IPPROTO_TCP || pd.proto == IPPROTO_UDP) && s != NULL && s->nat_rule.ptr != NULL && (s->nat_rule.ptr->action == PF_RDR || s->nat_rule.ptr->action == PF_BINAT) && (ntohl(pd.dst->v4.s_addr) >> IN_CLASSA_NSHIFT) == IN_LOOPBACKNET) m->m_flags |= M_SKIP_FIREWALL; if (action == PF_PASS && r->divert.port && ip_divert_ptr != NULL && !PACKET_LOOPED(&pd)) { ipfwtag = m_tag_alloc(MTAG_IPFW_RULE, 0, sizeof(struct ipfw_rule_ref), M_NOWAIT | M_ZERO); if (ipfwtag != NULL) { ((struct ipfw_rule_ref *)(ipfwtag+1))->info = ntohs(r->divert.port); ((struct ipfw_rule_ref *)(ipfwtag+1))->rulenum = dir; if (s) PF_STATE_UNLOCK(s); m_tag_prepend(m, ipfwtag); if (m->m_flags & M_FASTFWD_OURS) { if (pd.pf_mtag == NULL && ((pd.pf_mtag = pf_get_mtag(m)) == NULL)) { action = PF_DROP; REASON_SET(&reason, PFRES_MEMORY); log = 1; DPFPRINTF(PF_DEBUG_MISC, ("pf: failed to allocate tag\n")); } pd.pf_mtag->flags |= PF_FASTFWD_OURS_PRESENT; m->m_flags &= ~M_FASTFWD_OURS; } ip_divert_ptr(*m0, dir == PF_IN ? DIR_IN : DIR_OUT); *m0 = NULL; return (action); } else { /* XXX: ipfw has the same behaviour! */ action = PF_DROP; REASON_SET(&reason, PFRES_MEMORY); log = 1; DPFPRINTF(PF_DEBUG_MISC, ("pf: failed to allocate divert tag\n")); } } if (log) { struct pf_rule *lr; if (s != NULL && s->nat_rule.ptr != NULL && s->nat_rule.ptr->log & PF_LOG_ALL) lr = s->nat_rule.ptr; else lr = r; PFLOG_PACKET(kif, m, AF_INET, dir, reason, lr, a, ruleset, &pd, (s == NULL)); } kif->pfik_bytes[0][dir == PF_OUT][action != PF_PASS] += pd.tot_len; kif->pfik_packets[0][dir == PF_OUT][action != PF_PASS]++; if (action == PF_PASS || r->action == PF_DROP) { dirndx = (dir == PF_OUT); r->packets[dirndx]++; r->bytes[dirndx] += pd.tot_len; if (a != NULL) { a->packets[dirndx]++; a->bytes[dirndx] += pd.tot_len; } if (s != NULL) { if (s->nat_rule.ptr != NULL) { s->nat_rule.ptr->packets[dirndx]++; s->nat_rule.ptr->bytes[dirndx] += pd.tot_len; } if (s->src_node != NULL) { s->src_node->packets[dirndx]++; s->src_node->bytes[dirndx] += pd.tot_len; } if (s->nat_src_node != NULL) { s->nat_src_node->packets[dirndx]++; s->nat_src_node->bytes[dirndx] += pd.tot_len; } dirndx = (dir == s->direction) ? 0 : 1; s->packets[dirndx]++; s->bytes[dirndx] += pd.tot_len; } tr = r; nr = (s != NULL) ? s->nat_rule.ptr : pd.nat_rule; if (nr != NULL && r == &V_pf_default_rule) tr = nr; if (tr->src.addr.type == PF_ADDR_TABLE) pfr_update_stats(tr->src.addr.p.tbl, (s == NULL) ? pd.src : &s->key[(s->direction == PF_IN)]-> addr[(s->direction == PF_OUT)], pd.af, pd.tot_len, dir == PF_OUT, r->action == PF_PASS, tr->src.neg); if (tr->dst.addr.type == PF_ADDR_TABLE) pfr_update_stats(tr->dst.addr.p.tbl, (s == NULL) ? pd.dst : &s->key[(s->direction == PF_IN)]-> addr[(s->direction == PF_IN)], pd.af, pd.tot_len, dir == PF_OUT, r->action == PF_PASS, tr->dst.neg); } switch (action) { case PF_SYNPROXY_DROP: m_freem(*m0); case PF_DEFER: *m0 = NULL; action = PF_PASS; break; case PF_DROP: m_freem(*m0); *m0 = NULL; break; default: /* pf_route() returns unlocked. */ if (r->rt) { pf_route(m0, r, dir, kif->pfik_ifp, s, &pd); return (action); } break; } if (s) PF_STATE_UNLOCK(s); return (action); } #endif /* INET */ #ifdef INET6 int pf_test6(int dir, struct ifnet *ifp, struct mbuf **m0, struct inpcb *inp) { struct pfi_kif *kif; u_short action, reason = 0, log = 0; struct mbuf *m = *m0, *n = NULL; struct m_tag *mtag; struct ip6_hdr *h = NULL; struct pf_rule *a = NULL, *r = &V_pf_default_rule, *tr, *nr; struct pf_state *s = NULL; struct pf_ruleset *ruleset = NULL; struct pf_pdesc pd; int off, terminal = 0, dirndx, rh_cnt = 0; int fwdir = dir; M_ASSERTPKTHDR(m); - if (ifp != m->m_pkthdr.rcvif) + if (dir == PF_OUT && m->m_pkthdr.rcvif && ifp != m->m_pkthdr.rcvif) fwdir = PF_FWD; if (!V_pf_status.running) return (PF_PASS); memset(&pd, 0, sizeof(pd)); pd.pf_mtag = pf_find_mtag(m); if (pd.pf_mtag && pd.pf_mtag->flags & PF_TAG_GENERATED) return (PF_PASS); kif = (struct pfi_kif *)ifp->if_pf_kif; if (kif == NULL) { DPFPRINTF(PF_DEBUG_URGENT, ("pf_test6: kif == NULL, if_xname %s\n", ifp->if_xname)); return (PF_DROP); } if (kif->pfik_flags & PFI_IFLAG_SKIP) return (PF_PASS); if (m->m_flags & M_SKIP_FIREWALL) return (PF_PASS); PF_RULES_RLOCK(); /* We do IP header normalization and packet reassembly here */ if (pf_normalize_ip6(m0, dir, kif, &reason, &pd) != PF_PASS) { action = PF_DROP; goto done; } m = *m0; /* pf_normalize messes with m0 */ h = mtod(m, struct ip6_hdr *); #if 1 /* * we do not support jumbogram yet. if we keep going, zero ip6_plen * will do something bad, so drop the packet for now. */ if (htons(h->ip6_plen) == 0) { action = PF_DROP; REASON_SET(&reason, PFRES_NORM); /*XXX*/ goto done; } #endif pd.src = (struct pf_addr *)&h->ip6_src; pd.dst = (struct pf_addr *)&h->ip6_dst; pd.sport = pd.dport = NULL; pd.ip_sum = NULL; pd.proto_sum = NULL; pd.dir = dir; pd.sidx = (dir == PF_IN) ? 0 : 1; pd.didx = (dir == PF_IN) ? 1 : 0; pd.af = AF_INET6; pd.tos = 0; pd.tot_len = ntohs(h->ip6_plen) + sizeof(struct ip6_hdr); off = ((caddr_t)h - m->m_data) + sizeof(struct ip6_hdr); pd.proto = h->ip6_nxt; do { switch (pd.proto) { case IPPROTO_FRAGMENT: action = pf_test_fragment(&r, dir, kif, m, h, &pd, &a, &ruleset); if (action == PF_DROP) REASON_SET(&reason, PFRES_FRAG); goto done; case IPPROTO_ROUTING: { struct ip6_rthdr rthdr; if (rh_cnt++) { DPFPRINTF(PF_DEBUG_MISC, ("pf: IPv6 more than one rthdr\n")); action = PF_DROP; REASON_SET(&reason, PFRES_IPOPTIONS); log = 1; goto done; } if (!pf_pull_hdr(m, off, &rthdr, sizeof(rthdr), NULL, &reason, pd.af)) { DPFPRINTF(PF_DEBUG_MISC, ("pf: IPv6 short rthdr\n")); action = PF_DROP; REASON_SET(&reason, PFRES_SHORT); log = 1; goto done; } if (rthdr.ip6r_type == IPV6_RTHDR_TYPE_0) { DPFPRINTF(PF_DEBUG_MISC, ("pf: IPv6 rthdr0\n")); action = PF_DROP; REASON_SET(&reason, PFRES_IPOPTIONS); log = 1; goto done; } /* FALLTHROUGH */ } case IPPROTO_AH: case IPPROTO_HOPOPTS: case IPPROTO_DSTOPTS: { /* get next header and header length */ struct ip6_ext opt6; if (!pf_pull_hdr(m, off, &opt6, sizeof(opt6), NULL, &reason, pd.af)) { DPFPRINTF(PF_DEBUG_MISC, ("pf: IPv6 short opt\n")); action = PF_DROP; log = 1; goto done; } if (pd.proto == IPPROTO_AH) off += (opt6.ip6e_len + 2) * 4; else off += (opt6.ip6e_len + 1) * 8; pd.proto = opt6.ip6e_nxt; /* goto the next header */ break; } default: terminal++; break; } } while (!terminal); /* if there's no routing header, use unmodified mbuf for checksumming */ if (!n) n = m; switch (pd.proto) { case IPPROTO_TCP: { struct tcphdr th; pd.hdr.tcp = &th; if (!pf_pull_hdr(m, off, &th, sizeof(th), &action, &reason, AF_INET6)) { log = action != PF_PASS; goto done; } pd.p_len = pd.tot_len - off - (th.th_off << 2); action = pf_normalize_tcp(dir, kif, m, 0, off, h, &pd); if (action == PF_DROP) goto done; action = pf_test_state_tcp(&s, dir, kif, m, off, h, &pd, &reason); if (action == PF_PASS) { if (pfsync_update_state_ptr != NULL) pfsync_update_state_ptr(s); r = s->rule.ptr; a = s->anchor.ptr; log = s->log; } else if (s == NULL) action = pf_test_rule(&r, &s, dir, kif, m, off, &pd, &a, &ruleset, inp); break; } case IPPROTO_UDP: { struct udphdr uh; pd.hdr.udp = &uh; if (!pf_pull_hdr(m, off, &uh, sizeof(uh), &action, &reason, AF_INET6)) { log = action != PF_PASS; goto done; } if (uh.uh_dport == 0 || ntohs(uh.uh_ulen) > m->m_pkthdr.len - off || ntohs(uh.uh_ulen) < sizeof(struct udphdr)) { action = PF_DROP; REASON_SET(&reason, PFRES_SHORT); goto done; } action = pf_test_state_udp(&s, dir, kif, m, off, h, &pd); if (action == PF_PASS) { if (pfsync_update_state_ptr != NULL) pfsync_update_state_ptr(s); r = s->rule.ptr; a = s->anchor.ptr; log = s->log; } else if (s == NULL) action = pf_test_rule(&r, &s, dir, kif, m, off, &pd, &a, &ruleset, inp); break; } case IPPROTO_ICMP: { action = PF_DROP; DPFPRINTF(PF_DEBUG_MISC, ("pf: dropping IPv6 packet with ICMPv4 payload\n")); goto done; } case IPPROTO_ICMPV6: { struct icmp6_hdr ih; pd.hdr.icmp6 = &ih; if (!pf_pull_hdr(m, off, &ih, sizeof(ih), &action, &reason, AF_INET6)) { log = action != PF_PASS; goto done; } action = pf_test_state_icmp(&s, dir, kif, m, off, h, &pd, &reason); if (action == PF_PASS) { if (pfsync_update_state_ptr != NULL) pfsync_update_state_ptr(s); r = s->rule.ptr; a = s->anchor.ptr; log = s->log; } else if (s == NULL) action = pf_test_rule(&r, &s, dir, kif, m, off, &pd, &a, &ruleset, inp); break; } default: action = pf_test_state_other(&s, dir, kif, m, &pd); if (action == PF_PASS) { if (pfsync_update_state_ptr != NULL) pfsync_update_state_ptr(s); r = s->rule.ptr; a = s->anchor.ptr; log = s->log; } else if (s == NULL) action = pf_test_rule(&r, &s, dir, kif, m, off, &pd, &a, &ruleset, inp); break; } done: PF_RULES_RUNLOCK(); if (n != m) { m_freem(n); n = NULL; } /* handle dangerous IPv6 extension headers. */ if (action == PF_PASS && rh_cnt && !((s && s->state_flags & PFSTATE_ALLOWOPTS) || r->allow_opts)) { action = PF_DROP; REASON_SET(&reason, PFRES_IPOPTIONS); log = 1; DPFPRINTF(PF_DEBUG_MISC, ("pf: dropping packet with dangerous v6 headers\n")); } if (s && s->tag > 0 && pf_tag_packet(m, &pd, s->tag)) { action = PF_DROP; REASON_SET(&reason, PFRES_MEMORY); } if (r->rtableid >= 0) M_SETFIB(m, r->rtableid); #ifdef ALTQ if (action == PF_PASS && r->qid) { if (pd.pf_mtag == NULL && ((pd.pf_mtag = pf_get_mtag(m)) == NULL)) { action = PF_DROP; REASON_SET(&reason, PFRES_MEMORY); } if (pd.tos & IPTOS_LOWDELAY) pd.pf_mtag->qid = r->pqid; else pd.pf_mtag->qid = r->qid; /* add hints for ecn */ pd.pf_mtag->hdr = h; } #endif /* ALTQ */ if (dir == PF_IN && action == PF_PASS && (pd.proto == IPPROTO_TCP || pd.proto == IPPROTO_UDP) && s != NULL && s->nat_rule.ptr != NULL && (s->nat_rule.ptr->action == PF_RDR || s->nat_rule.ptr->action == PF_BINAT) && IN6_IS_ADDR_LOOPBACK(&pd.dst->v6)) m->m_flags |= M_SKIP_FIREWALL; /* XXX: Anybody working on it?! */ if (r->divert.port) printf("pf: divert(9) is not supported for IPv6\n"); if (log) { struct pf_rule *lr; if (s != NULL && s->nat_rule.ptr != NULL && s->nat_rule.ptr->log & PF_LOG_ALL) lr = s->nat_rule.ptr; else lr = r; PFLOG_PACKET(kif, m, AF_INET6, dir, reason, lr, a, ruleset, &pd, (s == NULL)); } kif->pfik_bytes[1][dir == PF_OUT][action != PF_PASS] += pd.tot_len; kif->pfik_packets[1][dir == PF_OUT][action != PF_PASS]++; if (action == PF_PASS || r->action == PF_DROP) { dirndx = (dir == PF_OUT); r->packets[dirndx]++; r->bytes[dirndx] += pd.tot_len; if (a != NULL) { a->packets[dirndx]++; a->bytes[dirndx] += pd.tot_len; } if (s != NULL) { if (s->nat_rule.ptr != NULL) { s->nat_rule.ptr->packets[dirndx]++; s->nat_rule.ptr->bytes[dirndx] += pd.tot_len; } if (s->src_node != NULL) { s->src_node->packets[dirndx]++; s->src_node->bytes[dirndx] += pd.tot_len; } if (s->nat_src_node != NULL) { s->nat_src_node->packets[dirndx]++; s->nat_src_node->bytes[dirndx] += pd.tot_len; } dirndx = (dir == s->direction) ? 0 : 1; s->packets[dirndx]++; s->bytes[dirndx] += pd.tot_len; } tr = r; nr = (s != NULL) ? s->nat_rule.ptr : pd.nat_rule; if (nr != NULL && r == &V_pf_default_rule) tr = nr; if (tr->src.addr.type == PF_ADDR_TABLE) pfr_update_stats(tr->src.addr.p.tbl, (s == NULL) ? pd.src : &s->key[(s->direction == PF_IN)]->addr[0], pd.af, pd.tot_len, dir == PF_OUT, r->action == PF_PASS, tr->src.neg); if (tr->dst.addr.type == PF_ADDR_TABLE) pfr_update_stats(tr->dst.addr.p.tbl, (s == NULL) ? pd.dst : &s->key[(s->direction == PF_IN)]->addr[1], pd.af, pd.tot_len, dir == PF_OUT, r->action == PF_PASS, tr->dst.neg); } switch (action) { case PF_SYNPROXY_DROP: m_freem(*m0); case PF_DEFER: *m0 = NULL; action = PF_PASS; break; case PF_DROP: m_freem(*m0); *m0 = NULL; break; default: /* pf_route6() returns unlocked. */ if (r->rt) { pf_route6(m0, r, dir, kif->pfik_ifp, s, &pd); return (action); } break; } if (s) PF_STATE_UNLOCK(s); /* If reassembled packet passed, create new fragments. */ if (action == PF_PASS && *m0 && fwdir == PF_FWD && (mtag = m_tag_find(m, PF_REASSEMBLED, NULL)) != NULL) action = pf_refragment6(ifp, m0, mtag); return (action); } #endif /* INET6 */ Index: user/ngie/more-tests/sys/netpfil/pf/pf_norm.c =================================================================== --- user/ngie/more-tests/sys/netpfil/pf/pf_norm.c (revision 281584) +++ user/ngie/more-tests/sys/netpfil/pf/pf_norm.c (revision 281585) @@ -1,2294 +1,2294 @@ /*- * Copyright 2001 Niels Provos * Copyright 2011 Alexander Bluhm * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. * * $OpenBSD: pf_norm.c,v 1.114 2009/01/29 14:11:45 henning Exp $ */ #include __FBSDID("$FreeBSD$"); #include "opt_inet.h" #include "opt_inet6.h" #include "opt_pf.h" #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #ifdef INET6 #include #endif /* INET6 */ struct pf_frent { TAILQ_ENTRY(pf_frent) fr_next; struct mbuf *fe_m; uint16_t fe_hdrlen; /* ipv4 header lenght with ip options ipv6, extension, fragment header */ uint16_t fe_extoff; /* last extension header offset or 0 */ uint16_t fe_len; /* fragment length */ uint16_t fe_off; /* fragment offset */ uint16_t fe_mff; /* more fragment flag */ }; struct pf_fragment_cmp { struct pf_addr frc_src; struct pf_addr frc_dst; uint32_t frc_id; sa_family_t frc_af; uint8_t frc_proto; uint8_t frc_direction; }; struct pf_fragment { struct pf_fragment_cmp fr_key; #define fr_src fr_key.frc_src #define fr_dst fr_key.frc_dst #define fr_id fr_key.frc_id #define fr_af fr_key.frc_af #define fr_proto fr_key.frc_proto #define fr_direction fr_key.frc_direction RB_ENTRY(pf_fragment) fr_entry; TAILQ_ENTRY(pf_fragment) frag_next; uint8_t fr_flags; /* status flags */ #define PFFRAG_SEENLAST 0x0001 /* Seen the last fragment for this */ #define PFFRAG_NOBUFFER 0x0002 /* Non-buffering fragment cache */ #define PFFRAG_DROP 0x0004 /* Drop all fragments */ #define BUFFER_FRAGMENTS(fr) (!((fr)->fr_flags & PFFRAG_NOBUFFER)) uint16_t fr_max; /* fragment data max */ uint32_t fr_timeout; uint16_t fr_maxlen; /* maximum length of single fragment */ TAILQ_HEAD(pf_fragq, pf_frent) fr_queue; }; struct pf_fragment_tag { uint16_t ft_hdrlen; /* header length of reassembled pkt */ uint16_t ft_extoff; /* last extension header offset or 0 */ uint16_t ft_maxlen; /* maximum fragment payload length */ uint32_t ft_id; /* fragment id */ }; static struct mtx pf_frag_mtx; #define PF_FRAG_LOCK() mtx_lock(&pf_frag_mtx) #define PF_FRAG_UNLOCK() mtx_unlock(&pf_frag_mtx) #define PF_FRAG_ASSERT() mtx_assert(&pf_frag_mtx, MA_OWNED) VNET_DEFINE(uma_zone_t, pf_state_scrub_z); /* XXX: shared with pfsync */ static VNET_DEFINE(uma_zone_t, pf_frent_z); #define V_pf_frent_z VNET(pf_frent_z) static VNET_DEFINE(uma_zone_t, pf_frag_z); #define V_pf_frag_z VNET(pf_frag_z) TAILQ_HEAD(pf_fragqueue, pf_fragment); TAILQ_HEAD(pf_cachequeue, pf_fragment); static VNET_DEFINE(struct pf_fragqueue, pf_fragqueue); #define V_pf_fragqueue VNET(pf_fragqueue) static VNET_DEFINE(struct pf_cachequeue, pf_cachequeue); #define V_pf_cachequeue VNET(pf_cachequeue) RB_HEAD(pf_frag_tree, pf_fragment); static VNET_DEFINE(struct pf_frag_tree, pf_frag_tree); #define V_pf_frag_tree VNET(pf_frag_tree) static VNET_DEFINE(struct pf_frag_tree, pf_cache_tree); #define V_pf_cache_tree VNET(pf_cache_tree) static int pf_frag_compare(struct pf_fragment *, struct pf_fragment *); static RB_PROTOTYPE(pf_frag_tree, pf_fragment, fr_entry, pf_frag_compare); static RB_GENERATE(pf_frag_tree, pf_fragment, fr_entry, pf_frag_compare); static void pf_flush_fragments(void); static void pf_free_fragment(struct pf_fragment *); static void pf_remove_fragment(struct pf_fragment *); static int pf_normalize_tcpopt(struct pf_rule *, struct mbuf *, struct tcphdr *, int, sa_family_t); static struct pf_frent *pf_create_fragment(u_short *); static struct pf_fragment *pf_find_fragment(struct pf_fragment_cmp *key, struct pf_frag_tree *tree); static struct pf_fragment *pf_fillup_fragment(struct pf_fragment_cmp *, struct pf_frent *, u_short *); static int pf_isfull_fragment(struct pf_fragment *); static struct mbuf *pf_join_fragment(struct pf_fragment *); #ifdef INET static void pf_scrub_ip(struct mbuf **, uint32_t, uint8_t, uint8_t); static int pf_reassemble(struct mbuf **, struct ip *, int, u_short *); static struct mbuf *pf_fragcache(struct mbuf **, struct ip*, struct pf_fragment **, int, int, int *); #endif /* INET */ #ifdef INET6 static int pf_reassemble6(struct mbuf **, struct ip6_hdr *, struct ip6_frag *, uint16_t, uint16_t, int, u_short *); static void pf_scrub_ip6(struct mbuf **, uint8_t); #endif /* INET6 */ #define DPFPRINTF(x) do { \ if (V_pf_status.debug >= PF_DEBUG_MISC) { \ printf("%s: ", __func__); \ printf x ; \ } \ } while(0) #ifdef INET static void pf_ip2key(struct ip *ip, int dir, struct pf_fragment_cmp *key) { key->frc_src.v4 = ip->ip_src; key->frc_dst.v4 = ip->ip_dst; key->frc_af = AF_INET; key->frc_proto = ip->ip_p; key->frc_id = ip->ip_id; key->frc_direction = dir; } #endif /* INET */ void pf_normalize_init(void) { V_pf_frag_z = uma_zcreate("pf frags", sizeof(struct pf_fragment), NULL, NULL, NULL, NULL, UMA_ALIGN_PTR, 0); V_pf_frent_z = uma_zcreate("pf frag entries", sizeof(struct pf_frent), NULL, NULL, NULL, NULL, UMA_ALIGN_PTR, 0); V_pf_state_scrub_z = uma_zcreate("pf state scrubs", sizeof(struct pf_state_scrub), NULL, NULL, NULL, NULL, UMA_ALIGN_PTR, 0); V_pf_limits[PF_LIMIT_FRAGS].zone = V_pf_frent_z; V_pf_limits[PF_LIMIT_FRAGS].limit = PFFRAG_FRENT_HIWAT; uma_zone_set_max(V_pf_frent_z, PFFRAG_FRENT_HIWAT); uma_zone_set_warning(V_pf_frent_z, "PF frag entries limit reached"); mtx_init(&pf_frag_mtx, "pf fragments", NULL, MTX_DEF); TAILQ_INIT(&V_pf_fragqueue); TAILQ_INIT(&V_pf_cachequeue); } void pf_normalize_cleanup(void) { uma_zdestroy(V_pf_state_scrub_z); uma_zdestroy(V_pf_frent_z); uma_zdestroy(V_pf_frag_z); mtx_destroy(&pf_frag_mtx); } static int pf_frag_compare(struct pf_fragment *a, struct pf_fragment *b) { int diff; if ((diff = a->fr_id - b->fr_id) != 0) return (diff); if ((diff = a->fr_proto - b->fr_proto) != 0) return (diff); if ((diff = a->fr_af - b->fr_af) != 0) return (diff); if ((diff = pf_addr_cmp(&a->fr_src, &b->fr_src, a->fr_af)) != 0) return (diff); if ((diff = pf_addr_cmp(&a->fr_dst, &b->fr_dst, a->fr_af)) != 0) return (diff); return (0); } void pf_purge_expired_fragments(void) { struct pf_fragment *frag; u_int32_t expire = time_uptime - V_pf_default_rule.timeout[PFTM_FRAG]; PF_FRAG_LOCK(); while ((frag = TAILQ_LAST(&V_pf_fragqueue, pf_fragqueue)) != NULL) { KASSERT((BUFFER_FRAGMENTS(frag)), ("BUFFER_FRAGMENTS(frag) == 0: %s", __FUNCTION__)); if (frag->fr_timeout > expire) break; DPFPRINTF(("expiring %d(%p)\n", frag->fr_id, frag)); pf_free_fragment(frag); } while ((frag = TAILQ_LAST(&V_pf_cachequeue, pf_cachequeue)) != NULL) { KASSERT((!BUFFER_FRAGMENTS(frag)), ("BUFFER_FRAGMENTS(frag) != 0: %s", __FUNCTION__)); if (frag->fr_timeout > expire) break; DPFPRINTF(("expiring %d(%p)\n", frag->fr_id, frag)); pf_free_fragment(frag); KASSERT((TAILQ_EMPTY(&V_pf_cachequeue) || TAILQ_LAST(&V_pf_cachequeue, pf_cachequeue) != frag), ("!(TAILQ_EMPTY() || TAILQ_LAST() == farg): %s", __FUNCTION__)); } PF_FRAG_UNLOCK(); } /* * Try to flush old fragments to make space for new ones */ static void pf_flush_fragments(void) { struct pf_fragment *frag, *cache; int goal; PF_FRAG_ASSERT(); goal = uma_zone_get_cur(V_pf_frent_z) * 9 / 10; DPFPRINTF(("trying to free %d frag entriess\n", goal)); while (goal < uma_zone_get_cur(V_pf_frent_z)) { frag = TAILQ_LAST(&V_pf_fragqueue, pf_fragqueue); if (frag) pf_free_fragment(frag); cache = TAILQ_LAST(&V_pf_cachequeue, pf_cachequeue); if (cache) pf_free_fragment(cache); if (frag == NULL && cache == NULL) break; } } /* Frees the fragments and all associated entries */ static void pf_free_fragment(struct pf_fragment *frag) { struct pf_frent *frent; PF_FRAG_ASSERT(); /* Free all fragments */ if (BUFFER_FRAGMENTS(frag)) { for (frent = TAILQ_FIRST(&frag->fr_queue); frent; frent = TAILQ_FIRST(&frag->fr_queue)) { TAILQ_REMOVE(&frag->fr_queue, frent, fr_next); m_freem(frent->fe_m); uma_zfree(V_pf_frent_z, frent); } } else { for (frent = TAILQ_FIRST(&frag->fr_queue); frent; frent = TAILQ_FIRST(&frag->fr_queue)) { TAILQ_REMOVE(&frag->fr_queue, frent, fr_next); KASSERT((TAILQ_EMPTY(&frag->fr_queue) || TAILQ_FIRST(&frag->fr_queue)->fe_off > frent->fe_len), ("! (TAILQ_EMPTY() || TAILQ_FIRST()->fe_off >" " frent->fe_len): %s", __func__)); uma_zfree(V_pf_frent_z, frent); } } pf_remove_fragment(frag); } static struct pf_fragment * pf_find_fragment(struct pf_fragment_cmp *key, struct pf_frag_tree *tree) { struct pf_fragment *frag; PF_FRAG_ASSERT(); frag = RB_FIND(pf_frag_tree, tree, (struct pf_fragment *)key); if (frag != NULL) { /* XXX Are we sure we want to update the timeout? */ frag->fr_timeout = time_uptime; if (BUFFER_FRAGMENTS(frag)) { TAILQ_REMOVE(&V_pf_fragqueue, frag, frag_next); TAILQ_INSERT_HEAD(&V_pf_fragqueue, frag, frag_next); } else { TAILQ_REMOVE(&V_pf_cachequeue, frag, frag_next); TAILQ_INSERT_HEAD(&V_pf_cachequeue, frag, frag_next); } } return (frag); } /* Removes a fragment from the fragment queue and frees the fragment */ static void pf_remove_fragment(struct pf_fragment *frag) { PF_FRAG_ASSERT(); if (BUFFER_FRAGMENTS(frag)) { RB_REMOVE(pf_frag_tree, &V_pf_frag_tree, frag); TAILQ_REMOVE(&V_pf_fragqueue, frag, frag_next); uma_zfree(V_pf_frag_z, frag); } else { RB_REMOVE(pf_frag_tree, &V_pf_cache_tree, frag); TAILQ_REMOVE(&V_pf_cachequeue, frag, frag_next); uma_zfree(V_pf_frag_z, frag); } } static struct pf_frent * pf_create_fragment(u_short *reason) { struct pf_frent *frent; PF_FRAG_ASSERT(); frent = uma_zalloc(V_pf_frent_z, M_NOWAIT); if (frent == NULL) { pf_flush_fragments(); frent = uma_zalloc(V_pf_frent_z, M_NOWAIT); if (frent == NULL) { REASON_SET(reason, PFRES_MEMORY); return (NULL); } } return (frent); } static struct pf_fragment * pf_fillup_fragment(struct pf_fragment_cmp *key, struct pf_frent *frent, u_short *reason) { struct pf_frent *after, *next, *prev; struct pf_fragment *frag; uint16_t total; PF_FRAG_ASSERT(); /* No empty fragments. */ if (frent->fe_len == 0) { DPFPRINTF(("bad fragment: len 0")); goto bad_fragment; } /* All fragments are 8 byte aligned. */ if (frent->fe_mff && (frent->fe_len & 0x7)) { DPFPRINTF(("bad fragment: mff and len %d", frent->fe_len)); goto bad_fragment; } /* Respect maximum length, IP_MAXPACKET == IPV6_MAXPACKET. */ if (frent->fe_off + frent->fe_len > IP_MAXPACKET) { DPFPRINTF(("bad fragment: max packet %d", frent->fe_off + frent->fe_len)); goto bad_fragment; } DPFPRINTF((key->frc_af == AF_INET ? "reass frag %d @ %d-%d" : "reass frag %#08x @ %d-%d", key->frc_id, frent->fe_off, frent->fe_off + frent->fe_len)); /* Fully buffer all of the fragments in this fragment queue. */ frag = pf_find_fragment(key, &V_pf_frag_tree); /* Create a new reassembly queue for this packet. */ if (frag == NULL) { frag = uma_zalloc(V_pf_frag_z, M_NOWAIT); if (frag == NULL) { pf_flush_fragments(); frag = uma_zalloc(V_pf_frag_z, M_NOWAIT); if (frag == NULL) { REASON_SET(reason, PFRES_MEMORY); goto drop_fragment; } } *(struct pf_fragment_cmp *)frag = *key; frag->fr_timeout = time_second; frag->fr_maxlen = frent->fe_len; TAILQ_INIT(&frag->fr_queue); RB_INSERT(pf_frag_tree, &V_pf_frag_tree, frag); TAILQ_INSERT_HEAD(&V_pf_fragqueue, frag, frag_next); /* We do not have a previous fragment. */ TAILQ_INSERT_HEAD(&frag->fr_queue, frent, fr_next); return (frag); } KASSERT(!TAILQ_EMPTY(&frag->fr_queue), ("!TAILQ_EMPTY()->fr_queue")); /* Remember maximum fragment len for refragmentation. */ if (frent->fe_len > frag->fr_maxlen) frag->fr_maxlen = frent->fe_len; /* Maximum data we have seen already. */ total = TAILQ_LAST(&frag->fr_queue, pf_fragq)->fe_off + TAILQ_LAST(&frag->fr_queue, pf_fragq)->fe_len; /* Non terminal fragments must have more fragments flag. */ if (frent->fe_off + frent->fe_len < total && !frent->fe_mff) goto bad_fragment; /* Check if we saw the last fragment already. */ if (!TAILQ_LAST(&frag->fr_queue, pf_fragq)->fe_mff) { if (frent->fe_off + frent->fe_len > total || (frent->fe_off + frent->fe_len == total && frent->fe_mff)) goto bad_fragment; } else { if (frent->fe_off + frent->fe_len == total && !frent->fe_mff) goto bad_fragment; } /* Find a fragment after the current one. */ prev = NULL; TAILQ_FOREACH(after, &frag->fr_queue, fr_next) { if (after->fe_off > frent->fe_off) break; prev = after; } KASSERT(prev != NULL || after != NULL, ("prev != NULL || after != NULL")); if (prev != NULL && prev->fe_off + prev->fe_len > frent->fe_off) { uint16_t precut; precut = prev->fe_off + prev->fe_len - frent->fe_off; if (precut >= frent->fe_len) goto bad_fragment; DPFPRINTF(("overlap -%d", precut)); m_adj(frent->fe_m, precut); frent->fe_off += precut; frent->fe_len -= precut; } for (; after != NULL && frent->fe_off + frent->fe_len > after->fe_off; after = next) { uint16_t aftercut; aftercut = frent->fe_off + frent->fe_len - after->fe_off; DPFPRINTF(("adjust overlap %d", aftercut)); if (aftercut < after->fe_len) { m_adj(after->fe_m, aftercut); after->fe_off += aftercut; after->fe_len -= aftercut; break; } /* This fragment is completely overlapped, lose it. */ next = TAILQ_NEXT(after, fr_next); m_freem(after->fe_m); TAILQ_REMOVE(&frag->fr_queue, after, fr_next); uma_zfree(V_pf_frent_z, after); } if (prev == NULL) TAILQ_INSERT_HEAD(&frag->fr_queue, frent, fr_next); else TAILQ_INSERT_AFTER(&frag->fr_queue, prev, frent, fr_next); return (frag); bad_fragment: REASON_SET(reason, PFRES_FRAG); drop_fragment: uma_zfree(V_pf_frent_z, frent); return (NULL); } static int pf_isfull_fragment(struct pf_fragment *frag) { struct pf_frent *frent, *next; uint16_t off, total; /* Check if we are completely reassembled */ if (TAILQ_LAST(&frag->fr_queue, pf_fragq)->fe_mff) return (0); /* Maximum data we have seen already */ total = TAILQ_LAST(&frag->fr_queue, pf_fragq)->fe_off + TAILQ_LAST(&frag->fr_queue, pf_fragq)->fe_len; /* Check if we have all the data */ off = 0; for (frent = TAILQ_FIRST(&frag->fr_queue); frent; frent = next) { next = TAILQ_NEXT(frent, fr_next); off += frent->fe_len; if (off < total && (next == NULL || next->fe_off != off)) { DPFPRINTF(("missing fragment at %d, next %d, total %d", off, next == NULL ? -1 : next->fe_off, total)); return (0); } } DPFPRINTF(("%d < %d?", off, total)); if (off < total) return (0); KASSERT(off == total, ("off == total")); return (1); } static struct mbuf * pf_join_fragment(struct pf_fragment *frag) { struct mbuf *m, *m2; struct pf_frent *frent, *next; frent = TAILQ_FIRST(&frag->fr_queue); next = TAILQ_NEXT(frent, fr_next); m = frent->fe_m; m_adj(m, (frent->fe_hdrlen + frent->fe_len) - m->m_pkthdr.len); uma_zfree(V_pf_frent_z, frent); for (frent = next; frent != NULL; frent = next) { next = TAILQ_NEXT(frent, fr_next); m2 = frent->fe_m; /* Strip off ip header. */ m_adj(m2, frent->fe_hdrlen); /* Strip off any trailing bytes. */ m_adj(m2, frent->fe_len - m2->m_pkthdr.len); uma_zfree(V_pf_frent_z, frent); m_cat(m, m2); } /* Remove from fragment queue. */ pf_remove_fragment(frag); return (m); } #ifdef INET static int pf_reassemble(struct mbuf **m0, struct ip *ip, int dir, u_short *reason) { struct mbuf *m = *m0; struct pf_frent *frent; struct pf_fragment *frag; struct pf_fragment_cmp key; uint16_t total, hdrlen; /* Get an entry for the fragment queue */ if ((frent = pf_create_fragment(reason)) == NULL) return (PF_DROP); frent->fe_m = m; frent->fe_hdrlen = ip->ip_hl << 2; frent->fe_extoff = 0; frent->fe_len = ntohs(ip->ip_len) - (ip->ip_hl << 2); frent->fe_off = (ntohs(ip->ip_off) & IP_OFFMASK) << 3; frent->fe_mff = ntohs(ip->ip_off) & IP_MF; pf_ip2key(ip, dir, &key); if ((frag = pf_fillup_fragment(&key, frent, reason)) == NULL) return (PF_DROP); /* The mbuf is part of the fragment entry, no direct free or access */ m = *m0 = NULL; if (!pf_isfull_fragment(frag)) return (PF_PASS); /* drop because *m0 is NULL, no error */ /* We have all the data */ frent = TAILQ_FIRST(&frag->fr_queue); KASSERT(frent != NULL, ("frent != NULL")); total = TAILQ_LAST(&frag->fr_queue, pf_fragq)->fe_off + TAILQ_LAST(&frag->fr_queue, pf_fragq)->fe_len; hdrlen = frent->fe_hdrlen; m = *m0 = pf_join_fragment(frag); frag = NULL; if (m->m_flags & M_PKTHDR) { int plen = 0; for (m = *m0; m; m = m->m_next) plen += m->m_len; m = *m0; m->m_pkthdr.len = plen; } ip = mtod(m, struct ip *); ip->ip_len = htons(hdrlen + total); ip->ip_off &= ~(IP_MF|IP_OFFMASK); if (hdrlen + total > IP_MAXPACKET) { DPFPRINTF(("drop: too big: %d", total)); ip->ip_len = 0; REASON_SET(reason, PFRES_SHORT); /* PF_DROP requires a valid mbuf *m0 in pf_test() */ return (PF_DROP); } DPFPRINTF(("complete: %p(%d)\n", m, ntohs(ip->ip_len))); return (PF_PASS); } #endif /* INET */ #ifdef INET6 static int pf_reassemble6(struct mbuf **m0, struct ip6_hdr *ip6, struct ip6_frag *fraghdr, uint16_t hdrlen, uint16_t extoff, int dir, u_short *reason) { struct mbuf *m = *m0; struct pf_frent *frent; struct pf_fragment *frag; struct pf_fragment_cmp key; struct m_tag *mtag; struct pf_fragment_tag *ftag; int off; uint32_t frag_id; uint16_t total, maxlen; uint8_t proto; PF_FRAG_LOCK(); /* Get an entry for the fragment queue. */ if ((frent = pf_create_fragment(reason)) == NULL) { PF_FRAG_UNLOCK(); return (PF_DROP); } frent->fe_m = m; frent->fe_hdrlen = hdrlen; frent->fe_extoff = extoff; frent->fe_len = sizeof(struct ip6_hdr) + ntohs(ip6->ip6_plen) - hdrlen; frent->fe_off = ntohs(fraghdr->ip6f_offlg & IP6F_OFF_MASK); frent->fe_mff = fraghdr->ip6f_offlg & IP6F_MORE_FRAG; key.frc_src.v6 = ip6->ip6_src; key.frc_dst.v6 = ip6->ip6_dst; key.frc_af = AF_INET6; /* Only the first fragment's protocol is relevant. */ key.frc_proto = 0; key.frc_id = fraghdr->ip6f_ident; key.frc_direction = dir; if ((frag = pf_fillup_fragment(&key, frent, reason)) == NULL) { PF_FRAG_UNLOCK(); return (PF_DROP); } /* The mbuf is part of the fragment entry, no direct free or access. */ m = *m0 = NULL; if (!pf_isfull_fragment(frag)) { PF_FRAG_UNLOCK(); return (PF_PASS); /* Drop because *m0 is NULL, no error. */ } /* We have all the data. */ extoff = frent->fe_extoff; maxlen = frag->fr_maxlen; frag_id = frag->fr_id; frent = TAILQ_FIRST(&frag->fr_queue); KASSERT(frent != NULL, ("frent != NULL")); total = TAILQ_LAST(&frag->fr_queue, pf_fragq)->fe_off + TAILQ_LAST(&frag->fr_queue, pf_fragq)->fe_len; hdrlen = frent->fe_hdrlen - sizeof(struct ip6_frag); m = *m0 = pf_join_fragment(frag); frag = NULL; PF_FRAG_UNLOCK(); /* Take protocol from first fragment header. */ m = m_getptr(m, hdrlen + offsetof(struct ip6_frag, ip6f_nxt), &off); KASSERT(m, ("%s: short mbuf chain", __func__)); proto = *(mtod(m, caddr_t) + off); m = *m0; /* Delete frag6 header */ if (ip6_deletefraghdr(m, hdrlen, M_NOWAIT) != 0) goto fail; if (m->m_flags & M_PKTHDR) { int plen = 0; for (m = *m0; m; m = m->m_next) plen += m->m_len; m = *m0; m->m_pkthdr.len = plen; } if ((mtag = m_tag_get(PF_REASSEMBLED, sizeof(struct pf_fragment_tag), M_NOWAIT)) == NULL) goto fail; ftag = (struct pf_fragment_tag *)(mtag + 1); ftag->ft_hdrlen = hdrlen; ftag->ft_extoff = extoff; ftag->ft_maxlen = maxlen; ftag->ft_id = frag_id; m_tag_prepend(m, mtag); ip6 = mtod(m, struct ip6_hdr *); ip6->ip6_plen = htons(hdrlen - sizeof(struct ip6_hdr) + total); if (extoff) { /* Write protocol into next field of last extension header. */ m = m_getptr(m, extoff + offsetof(struct ip6_ext, ip6e_nxt), &off); KASSERT(m, ("%s: short mbuf chain", __func__)); *(mtod(m, char *) + off) = proto; m = *m0; } else ip6->ip6_nxt = proto; if (hdrlen - sizeof(struct ip6_hdr) + total > IPV6_MAXPACKET) { DPFPRINTF(("drop: too big: %d", total)); ip6->ip6_plen = 0; REASON_SET(reason, PFRES_SHORT); /* PF_DROP requires a valid mbuf *m0 in pf_test6(). */ return (PF_DROP); } DPFPRINTF(("complete: %p(%d)", m, ntohs(ip6->ip6_plen))); return (PF_PASS); fail: REASON_SET(reason, PFRES_MEMORY); /* PF_DROP requires a valid mbuf *m0 in pf_test6(), will free later. */ return (PF_DROP); } #endif /* INET6 */ #ifdef INET static struct mbuf * pf_fragcache(struct mbuf **m0, struct ip *h, struct pf_fragment **frag, int mff, int drop, int *nomem) { struct mbuf *m = *m0; struct pf_frent *frp, *fra, *cur = NULL; int ip_len = ntohs(h->ip_len) - (h->ip_hl << 2); u_int16_t off = ntohs(h->ip_off) << 3; u_int16_t max = ip_len + off; int hosed = 0; PF_FRAG_ASSERT(); KASSERT((*frag == NULL || !BUFFER_FRAGMENTS(*frag)), ("!(*frag == NULL || !BUFFER_FRAGMENTS(*frag)): %s", __FUNCTION__)); /* Create a new range queue for this packet */ if (*frag == NULL) { *frag = uma_zalloc(V_pf_frag_z, M_NOWAIT); if (*frag == NULL) { pf_flush_fragments(); *frag = uma_zalloc(V_pf_frag_z, M_NOWAIT); if (*frag == NULL) goto no_mem; } /* Get an entry for the queue */ cur = uma_zalloc(V_pf_frent_z, M_NOWAIT); if (cur == NULL) { uma_zfree(V_pf_frag_z, *frag); *frag = NULL; goto no_mem; } (*frag)->fr_flags = PFFRAG_NOBUFFER; (*frag)->fr_max = 0; (*frag)->fr_src.v4 = h->ip_src; (*frag)->fr_dst.v4 = h->ip_dst; (*frag)->fr_id = h->ip_id; (*frag)->fr_timeout = time_uptime; cur->fe_off = off; cur->fe_len = max; /* TODO: fe_len = max - off ? */ TAILQ_INIT(&(*frag)->fr_queue); TAILQ_INSERT_HEAD(&(*frag)->fr_queue, cur, fr_next); RB_INSERT(pf_frag_tree, &V_pf_cache_tree, *frag); TAILQ_INSERT_HEAD(&V_pf_cachequeue, *frag, frag_next); DPFPRINTF(("fragcache[%d]: new %d-%d\n", h->ip_id, off, max)); goto pass; } /* * Find a fragment after the current one: * - off contains the real shifted offset. */ frp = NULL; TAILQ_FOREACH(fra, &(*frag)->fr_queue, fr_next) { if (fra->fe_off > off) break; frp = fra; } KASSERT((frp != NULL || fra != NULL), ("!(frp != NULL || fra != NULL): %s", __FUNCTION__)); if (frp != NULL) { int precut; precut = frp->fe_len - off; if (precut >= ip_len) { /* Fragment is entirely a duplicate */ DPFPRINTF(("fragcache[%d]: dead (%d-%d) %d-%d\n", h->ip_id, frp->fe_off, frp->fe_len, off, max)); goto drop_fragment; } if (precut == 0) { /* They are adjacent. Fixup cache entry */ DPFPRINTF(("fragcache[%d]: adjacent (%d-%d) %d-%d\n", h->ip_id, frp->fe_off, frp->fe_len, off, max)); frp->fe_len = max; } else if (precut > 0) { /* The first part of this payload overlaps with a * fragment that has already been passed. * Need to trim off the first part of the payload. * But to do so easily, we need to create another * mbuf to throw the original header into. */ DPFPRINTF(("fragcache[%d]: chop %d (%d-%d) %d-%d\n", h->ip_id, precut, frp->fe_off, frp->fe_len, off, max)); off += precut; max -= precut; /* Update the previous frag to encompass this one */ frp->fe_len = max; if (!drop) { /* XXX Optimization opportunity * This is a very heavy way to trim the payload. * we could do it much faster by diddling mbuf * internals but that would be even less legible * than this mbuf magic. For my next trick, * I'll pull a rabbit out of my laptop. */ *m0 = m_dup(m, M_NOWAIT); if (*m0 == NULL) goto no_mem; /* From KAME Project : We have missed this! */ m_adj(*m0, (h->ip_hl << 2) - (*m0)->m_pkthdr.len); KASSERT(((*m0)->m_next == NULL), ("(*m0)->m_next != NULL: %s", __FUNCTION__)); m_adj(m, precut + (h->ip_hl << 2)); m_cat(*m0, m); m = *m0; if (m->m_flags & M_PKTHDR) { int plen = 0; struct mbuf *t; for (t = m; t; t = t->m_next) plen += t->m_len; m->m_pkthdr.len = plen; } h = mtod(m, struct ip *); KASSERT(((int)m->m_len == ntohs(h->ip_len) - precut), ("m->m_len != ntohs(h->ip_len) - precut: %s", __FUNCTION__)); h->ip_off = htons(ntohs(h->ip_off) + (precut >> 3)); h->ip_len = htons(ntohs(h->ip_len) - precut); } else { hosed++; } } else { /* There is a gap between fragments */ DPFPRINTF(("fragcache[%d]: gap %d (%d-%d) %d-%d\n", h->ip_id, -precut, frp->fe_off, frp->fe_len, off, max)); cur = uma_zalloc(V_pf_frent_z, M_NOWAIT); if (cur == NULL) goto no_mem; cur->fe_off = off; cur->fe_len = max; TAILQ_INSERT_AFTER(&(*frag)->fr_queue, frp, cur, fr_next); } } if (fra != NULL) { int aftercut; int merge = 0; aftercut = max - fra->fe_off; if (aftercut == 0) { /* Adjacent fragments */ DPFPRINTF(("fragcache[%d]: adjacent %d-%d (%d-%d)\n", h->ip_id, off, max, fra->fe_off, fra->fe_len)); fra->fe_off = off; merge = 1; } else if (aftercut > 0) { /* Need to chop off the tail of this fragment */ DPFPRINTF(("fragcache[%d]: chop %d %d-%d (%d-%d)\n", h->ip_id, aftercut, off, max, fra->fe_off, fra->fe_len)); fra->fe_off = off; max -= aftercut; merge = 1; if (!drop) { m_adj(m, -aftercut); if (m->m_flags & M_PKTHDR) { int plen = 0; struct mbuf *t; for (t = m; t; t = t->m_next) plen += t->m_len; m->m_pkthdr.len = plen; } h = mtod(m, struct ip *); KASSERT(((int)m->m_len == ntohs(h->ip_len) - aftercut), ("m->m_len != ntohs(h->ip_len) - aftercut: %s", __FUNCTION__)); h->ip_len = htons(ntohs(h->ip_len) - aftercut); } else { hosed++; } } else if (frp == NULL) { /* There is a gap between fragments */ DPFPRINTF(("fragcache[%d]: gap %d %d-%d (%d-%d)\n", h->ip_id, -aftercut, off, max, fra->fe_off, fra->fe_len)); cur = uma_zalloc(V_pf_frent_z, M_NOWAIT); if (cur == NULL) goto no_mem; cur->fe_off = off; cur->fe_len = max; TAILQ_INSERT_HEAD(&(*frag)->fr_queue, cur, fr_next); } /* Need to glue together two separate fragment descriptors */ if (merge) { if (cur && fra->fe_off <= cur->fe_len) { /* Need to merge in a previous 'cur' */ DPFPRINTF(("fragcache[%d]: adjacent(merge " "%d-%d) %d-%d (%d-%d)\n", h->ip_id, cur->fe_off, cur->fe_len, off, max, fra->fe_off, fra->fe_len)); fra->fe_off = cur->fe_off; TAILQ_REMOVE(&(*frag)->fr_queue, cur, fr_next); uma_zfree(V_pf_frent_z, cur); cur = NULL; } else if (frp && fra->fe_off <= frp->fe_len) { /* Need to merge in a modified 'frp' */ KASSERT((cur == NULL), ("cur != NULL: %s", __FUNCTION__)); DPFPRINTF(("fragcache[%d]: adjacent(merge " "%d-%d) %d-%d (%d-%d)\n", h->ip_id, frp->fe_off, frp->fe_len, off, max, fra->fe_off, fra->fe_len)); fra->fe_off = frp->fe_off; TAILQ_REMOVE(&(*frag)->fr_queue, frp, fr_next); uma_zfree(V_pf_frent_z, frp); frp = NULL; } } } if (hosed) { /* * We must keep tracking the overall fragment even when * we're going to drop it anyway so that we know when to * free the overall descriptor. Thus we drop the frag late. */ goto drop_fragment; } pass: /* Update maximum data size */ if ((*frag)->fr_max < max) (*frag)->fr_max = max; /* This is the last segment */ if (!mff) (*frag)->fr_flags |= PFFRAG_SEENLAST; /* Check if we are completely reassembled */ if (((*frag)->fr_flags & PFFRAG_SEENLAST) && TAILQ_FIRST(&(*frag)->fr_queue)->fe_off == 0 && TAILQ_FIRST(&(*frag)->fr_queue)->fe_len == (*frag)->fr_max) { /* Remove from fragment queue */ DPFPRINTF(("fragcache[%d]: done 0-%d\n", h->ip_id, (*frag)->fr_max)); pf_free_fragment(*frag); *frag = NULL; } return (m); no_mem: *nomem = 1; /* Still need to pay attention to !IP_MF */ if (!mff && *frag != NULL) (*frag)->fr_flags |= PFFRAG_SEENLAST; m_freem(m); return (NULL); drop_fragment: /* Still need to pay attention to !IP_MF */ if (!mff && *frag != NULL) (*frag)->fr_flags |= PFFRAG_SEENLAST; if (drop) { /* This fragment has been deemed bad. Don't reass */ if (((*frag)->fr_flags & PFFRAG_DROP) == 0) DPFPRINTF(("fragcache[%d]: dropping overall fragment\n", h->ip_id)); (*frag)->fr_flags |= PFFRAG_DROP; } m_freem(m); return (NULL); } #endif /* INET */ #ifdef INET6 int pf_refragment6(struct ifnet *ifp, struct mbuf **m0, struct m_tag *mtag) { struct mbuf *m = *m0, *t; struct pf_fragment_tag *ftag = (struct pf_fragment_tag *)(mtag + 1); struct pf_pdesc pd; uint32_t frag_id; uint16_t hdrlen, extoff, maxlen; uint8_t proto; int error, action; hdrlen = ftag->ft_hdrlen; extoff = ftag->ft_extoff; maxlen = ftag->ft_maxlen; frag_id = ftag->ft_id; m_tag_delete(m, mtag); mtag = NULL; ftag = NULL; if (extoff) { int off; /* Use protocol from next field of last extension header */ m = m_getptr(m, extoff + offsetof(struct ip6_ext, ip6e_nxt), &off); KASSERT((m != NULL), ("pf_refragment6: short mbuf chain")); proto = *(mtod(m, caddr_t) + off); *(mtod(m, char *) + off) = IPPROTO_FRAGMENT; m = *m0; } else { struct ip6_hdr *hdr; hdr = mtod(m, struct ip6_hdr *); proto = hdr->ip6_nxt; hdr->ip6_nxt = IPPROTO_FRAGMENT; } /* * Maxlen may be less than 8 if there was only a single * fragment. As it was fragmented before, add a fragment * header also for a single fragment. If total or maxlen * is less than 8, ip6_fragment() will return EMSGSIZE and * we drop the packet. */ error = ip6_fragment(ifp, m, hdrlen, proto, maxlen, frag_id); m = (*m0)->m_nextpkt; (*m0)->m_nextpkt = NULL; if (error == 0) { /* The first mbuf contains the unfragmented packet. */ m_freem(*m0); *m0 = NULL; action = PF_PASS; } else { /* Drop expects an mbuf to free. */ DPFPRINTF(("refragment error %d", error)); action = PF_DROP; } for (t = m; m; m = t) { t = m->m_nextpkt; m->m_nextpkt = NULL; m->m_flags |= M_SKIP_FIREWALL; memset(&pd, 0, sizeof(pd)); pd.pf_mtag = pf_find_mtag(m); if (error == 0) ip6_forward(m, 0); else m_freem(m); } return (action); } #endif /* INET6 */ #ifdef INET int pf_normalize_ip(struct mbuf **m0, int dir, struct pfi_kif *kif, u_short *reason, struct pf_pdesc *pd) { struct mbuf *m = *m0; struct pf_rule *r; struct pf_fragment *frag = NULL; struct pf_fragment_cmp key; struct ip *h = mtod(m, struct ip *); int mff = (ntohs(h->ip_off) & IP_MF); int hlen = h->ip_hl << 2; u_int16_t fragoff = (ntohs(h->ip_off) & IP_OFFMASK) << 3; u_int16_t max; int ip_len; int ip_off; int tag = -1; int verdict; PF_RULES_RASSERT(); r = TAILQ_FIRST(pf_main_ruleset.rules[PF_RULESET_SCRUB].active.ptr); while (r != NULL) { r->evaluations++; if (pfi_kif_match(r->kif, kif) == r->ifnot) r = r->skip[PF_SKIP_IFP].ptr; else if (r->direction && r->direction != dir) r = r->skip[PF_SKIP_DIR].ptr; else if (r->af && r->af != AF_INET) r = r->skip[PF_SKIP_AF].ptr; else if (r->proto && r->proto != h->ip_p) r = r->skip[PF_SKIP_PROTO].ptr; else if (PF_MISMATCHAW(&r->src.addr, (struct pf_addr *)&h->ip_src.s_addr, AF_INET, r->src.neg, kif, M_GETFIB(m))) r = r->skip[PF_SKIP_SRC_ADDR].ptr; else if (PF_MISMATCHAW(&r->dst.addr, (struct pf_addr *)&h->ip_dst.s_addr, AF_INET, r->dst.neg, NULL, M_GETFIB(m))) r = r->skip[PF_SKIP_DST_ADDR].ptr; else if (r->match_tag && !pf_match_tag(m, r, &tag, pd->pf_mtag ? pd->pf_mtag->tag : 0)) r = TAILQ_NEXT(r, entries); else break; } if (r == NULL || r->action == PF_NOSCRUB) return (PF_PASS); else { r->packets[dir == PF_OUT]++; r->bytes[dir == PF_OUT] += pd->tot_len; } /* Check for illegal packets */ if (hlen < (int)sizeof(struct ip)) goto drop; if (hlen > ntohs(h->ip_len)) goto drop; /* Clear IP_DF if the rule uses the no-df option */ if (r->rule_flag & PFRULE_NODF && h->ip_off & htons(IP_DF)) { u_int16_t ip_off = h->ip_off; h->ip_off &= htons(~IP_DF); h->ip_sum = pf_cksum_fixup(h->ip_sum, ip_off, h->ip_off, 0); } /* We will need other tests here */ if (!fragoff && !mff) goto no_fragment; /* We're dealing with a fragment now. Don't allow fragments * with IP_DF to enter the cache. If the flag was cleared by * no-df above, fine. Otherwise drop it. */ if (h->ip_off & htons(IP_DF)) { DPFPRINTF(("IP_DF\n")); goto bad; } ip_len = ntohs(h->ip_len) - hlen; ip_off = (ntohs(h->ip_off) & IP_OFFMASK) << 3; /* All fragments are 8 byte aligned */ if (mff && (ip_len & 0x7)) { DPFPRINTF(("mff and %d\n", ip_len)); goto bad; } /* Respect maximum length */ if (fragoff + ip_len > IP_MAXPACKET) { DPFPRINTF(("max packet %d\n", fragoff + ip_len)); goto bad; } max = fragoff + ip_len; if ((r->rule_flag & (PFRULE_FRAGCROP|PFRULE_FRAGDROP)) == 0) { /* Fully buffer all of the fragments */ PF_FRAG_LOCK(); pf_ip2key(h, dir, &key); frag = pf_find_fragment(&key, &V_pf_frag_tree); /* Check if we saw the last fragment already */ if (frag != NULL && (frag->fr_flags & PFFRAG_SEENLAST) && max > frag->fr_max) goto bad; /* Might return a completely reassembled mbuf, or NULL */ DPFPRINTF(("reass frag %d @ %d-%d\n", h->ip_id, fragoff, max)); verdict = pf_reassemble(m0, h, dir, reason); PF_FRAG_UNLOCK(); if (verdict != PF_PASS) return (PF_DROP); m = *m0; if (m == NULL) return (PF_DROP); if (frag != NULL && (frag->fr_flags & PFFRAG_DROP)) goto drop; h = mtod(m, struct ip *); } else { /* non-buffering fragment cache (drops or masks overlaps) */ int nomem = 0; if (dir == PF_OUT && pd->pf_mtag && pd->pf_mtag->flags & PF_TAG_FRAGCACHE) { /* * Already passed the fragment cache in the * input direction. If we continued, it would * appear to be a dup and would be dropped. */ goto fragment_pass; } PF_FRAG_LOCK(); pf_ip2key(h, dir, &key); frag = pf_find_fragment(&key, &V_pf_cache_tree); /* Check if we saw the last fragment already */ if (frag != NULL && (frag->fr_flags & PFFRAG_SEENLAST) && max > frag->fr_max) { if (r->rule_flag & PFRULE_FRAGDROP) frag->fr_flags |= PFFRAG_DROP; goto bad; } *m0 = m = pf_fragcache(m0, h, &frag, mff, (r->rule_flag & PFRULE_FRAGDROP) ? 1 : 0, &nomem); PF_FRAG_UNLOCK(); if (m == NULL) { if (nomem) goto no_mem; goto drop; } if (dir == PF_IN) { /* Use mtag from copied and trimmed mbuf chain. */ pd->pf_mtag = pf_get_mtag(m); if (pd->pf_mtag == NULL) { m_freem(m); *m0 = NULL; goto no_mem; } pd->pf_mtag->flags |= PF_TAG_FRAGCACHE; } if (frag != NULL && (frag->fr_flags & PFFRAG_DROP)) goto drop; goto fragment_pass; } no_fragment: /* At this point, only IP_DF is allowed in ip_off */ if (h->ip_off & ~htons(IP_DF)) { u_int16_t ip_off = h->ip_off; h->ip_off &= htons(IP_DF); h->ip_sum = pf_cksum_fixup(h->ip_sum, ip_off, h->ip_off, 0); } /* not missing a return here */ fragment_pass: pf_scrub_ip(&m, r->rule_flag, r->min_ttl, r->set_tos); if ((r->rule_flag & (PFRULE_FRAGCROP|PFRULE_FRAGDROP)) == 0) pd->flags |= PFDESC_IP_REAS; return (PF_PASS); no_mem: REASON_SET(reason, PFRES_MEMORY); if (r != NULL && r->log) PFLOG_PACKET(kif, m, AF_INET, dir, *reason, r, NULL, NULL, pd, 1); return (PF_DROP); drop: REASON_SET(reason, PFRES_NORM); if (r != NULL && r->log) PFLOG_PACKET(kif, m, AF_INET, dir, *reason, r, NULL, NULL, pd, 1); return (PF_DROP); bad: DPFPRINTF(("dropping bad fragment\n")); /* Free associated fragments */ if (frag != NULL) { pf_free_fragment(frag); PF_FRAG_UNLOCK(); } REASON_SET(reason, PFRES_FRAG); if (r != NULL && r->log) PFLOG_PACKET(kif, m, AF_INET, dir, *reason, r, NULL, NULL, pd, 1); return (PF_DROP); } #endif #ifdef INET6 int pf_normalize_ip6(struct mbuf **m0, int dir, struct pfi_kif *kif, u_short *reason, struct pf_pdesc *pd) { struct mbuf *m = *m0; struct pf_rule *r; struct ip6_hdr *h = mtod(m, struct ip6_hdr *); int extoff; int off; struct ip6_ext ext; struct ip6_opt opt; struct ip6_opt_jumbo jumbo; struct ip6_frag frag; u_int32_t jumbolen = 0, plen; int optend; int ooff; u_int8_t proto; int terminal; PF_RULES_RASSERT(); r = TAILQ_FIRST(pf_main_ruleset.rules[PF_RULESET_SCRUB].active.ptr); while (r != NULL) { r->evaluations++; if (pfi_kif_match(r->kif, kif) == r->ifnot) r = r->skip[PF_SKIP_IFP].ptr; else if (r->direction && r->direction != dir) r = r->skip[PF_SKIP_DIR].ptr; else if (r->af && r->af != AF_INET6) r = r->skip[PF_SKIP_AF].ptr; #if 0 /* header chain! */ else if (r->proto && r->proto != h->ip6_nxt) r = r->skip[PF_SKIP_PROTO].ptr; #endif else if (PF_MISMATCHAW(&r->src.addr, (struct pf_addr *)&h->ip6_src, AF_INET6, r->src.neg, kif, M_GETFIB(m))) r = r->skip[PF_SKIP_SRC_ADDR].ptr; else if (PF_MISMATCHAW(&r->dst.addr, (struct pf_addr *)&h->ip6_dst, AF_INET6, r->dst.neg, NULL, M_GETFIB(m))) r = r->skip[PF_SKIP_DST_ADDR].ptr; else break; } if (r == NULL || r->action == PF_NOSCRUB) return (PF_PASS); else { r->packets[dir == PF_OUT]++; r->bytes[dir == PF_OUT] += pd->tot_len; } /* Check for illegal packets */ if (sizeof(struct ip6_hdr) + IPV6_MAXPACKET < m->m_pkthdr.len) goto drop; extoff = 0; off = sizeof(struct ip6_hdr); proto = h->ip6_nxt; terminal = 0; do { switch (proto) { case IPPROTO_FRAGMENT: goto fragment; break; case IPPROTO_AH: case IPPROTO_ROUTING: case IPPROTO_DSTOPTS: if (!pf_pull_hdr(m, off, &ext, sizeof(ext), NULL, NULL, AF_INET6)) goto shortpkt; extoff = off; if (proto == IPPROTO_AH) off += (ext.ip6e_len + 2) * 4; else off += (ext.ip6e_len + 1) * 8; proto = ext.ip6e_nxt; break; case IPPROTO_HOPOPTS: if (!pf_pull_hdr(m, off, &ext, sizeof(ext), NULL, NULL, AF_INET6)) goto shortpkt; extoff = off; optend = off + (ext.ip6e_len + 1) * 8; ooff = off + sizeof(ext); do { if (!pf_pull_hdr(m, ooff, &opt.ip6o_type, sizeof(opt.ip6o_type), NULL, NULL, AF_INET6)) goto shortpkt; if (opt.ip6o_type == IP6OPT_PAD1) { ooff++; continue; } if (!pf_pull_hdr(m, ooff, &opt, sizeof(opt), NULL, NULL, AF_INET6)) goto shortpkt; if (ooff + sizeof(opt) + opt.ip6o_len > optend) goto drop; switch (opt.ip6o_type) { case IP6OPT_JUMBO: if (h->ip6_plen != 0) goto drop; if (!pf_pull_hdr(m, ooff, &jumbo, sizeof(jumbo), NULL, NULL, AF_INET6)) goto shortpkt; memcpy(&jumbolen, jumbo.ip6oj_jumbo_len, sizeof(jumbolen)); jumbolen = ntohl(jumbolen); if (jumbolen <= IPV6_MAXPACKET) goto drop; if (sizeof(struct ip6_hdr) + jumbolen != m->m_pkthdr.len) goto drop; break; default: break; } ooff += sizeof(opt) + opt.ip6o_len; } while (ooff < optend); off = optend; proto = ext.ip6e_nxt; break; default: terminal = 1; break; } } while (!terminal); /* jumbo payload option must be present, or plen > 0 */ if (ntohs(h->ip6_plen) == 0) plen = jumbolen; else plen = ntohs(h->ip6_plen); if (plen == 0) goto drop; if (sizeof(struct ip6_hdr) + plen > m->m_pkthdr.len) goto shortpkt; pf_scrub_ip6(&m, r->min_ttl); return (PF_PASS); fragment: /* Jumbo payload packets cannot be fragmented. */ plen = ntohs(h->ip6_plen); if (plen == 0 || jumbolen) goto drop; if (sizeof(struct ip6_hdr) + plen > m->m_pkthdr.len) goto shortpkt; if (!pf_pull_hdr(m, off, &frag, sizeof(frag), NULL, NULL, AF_INET6)) goto shortpkt; /* Offset now points to data portion. */ off += sizeof(frag); /* Returns PF_DROP or *m0 is NULL or completely reassembled mbuf. */ if (pf_reassemble6(m0, h, &frag, off, extoff, dir, reason) != PF_PASS) return (PF_DROP); m = *m0; if (m == NULL) return (PF_DROP); pd->flags |= PFDESC_IP_REAS; return (PF_PASS); shortpkt: REASON_SET(reason, PFRES_SHORT); if (r != NULL && r->log) PFLOG_PACKET(kif, m, AF_INET6, dir, *reason, r, NULL, NULL, pd, 1); return (PF_DROP); drop: REASON_SET(reason, PFRES_NORM); if (r != NULL && r->log) PFLOG_PACKET(kif, m, AF_INET6, dir, *reason, r, NULL, NULL, pd, 1); return (PF_DROP); } #endif /* INET6 */ int pf_normalize_tcp(int dir, struct pfi_kif *kif, struct mbuf *m, int ipoff, int off, void *h, struct pf_pdesc *pd) { struct pf_rule *r, *rm = NULL; struct tcphdr *th = pd->hdr.tcp; int rewrite = 0; u_short reason; u_int8_t flags; sa_family_t af = pd->af; PF_RULES_RASSERT(); r = TAILQ_FIRST(pf_main_ruleset.rules[PF_RULESET_SCRUB].active.ptr); while (r != NULL) { r->evaluations++; if (pfi_kif_match(r->kif, kif) == r->ifnot) r = r->skip[PF_SKIP_IFP].ptr; else if (r->direction && r->direction != dir) r = r->skip[PF_SKIP_DIR].ptr; else if (r->af && r->af != af) r = r->skip[PF_SKIP_AF].ptr; else if (r->proto && r->proto != pd->proto) r = r->skip[PF_SKIP_PROTO].ptr; else if (PF_MISMATCHAW(&r->src.addr, pd->src, af, r->src.neg, kif, M_GETFIB(m))) r = r->skip[PF_SKIP_SRC_ADDR].ptr; else if (r->src.port_op && !pf_match_port(r->src.port_op, r->src.port[0], r->src.port[1], th->th_sport)) r = r->skip[PF_SKIP_SRC_PORT].ptr; else if (PF_MISMATCHAW(&r->dst.addr, pd->dst, af, r->dst.neg, NULL, M_GETFIB(m))) r = r->skip[PF_SKIP_DST_ADDR].ptr; else if (r->dst.port_op && !pf_match_port(r->dst.port_op, r->dst.port[0], r->dst.port[1], th->th_dport)) r = r->skip[PF_SKIP_DST_PORT].ptr; else if (r->os_fingerprint != PF_OSFP_ANY && !pf_osfp_match( pf_osfp_fingerprint(pd, m, off, th), r->os_fingerprint)) r = TAILQ_NEXT(r, entries); else { rm = r; break; } } if (rm == NULL || rm->action == PF_NOSCRUB) return (PF_PASS); else { r->packets[dir == PF_OUT]++; r->bytes[dir == PF_OUT] += pd->tot_len; } if (rm->rule_flag & PFRULE_REASSEMBLE_TCP) pd->flags |= PFDESC_TCP_NORM; flags = th->th_flags; if (flags & TH_SYN) { /* Illegal packet */ if (flags & TH_RST) goto tcp_drop; if (flags & TH_FIN) - flags &= ~TH_FIN; + goto tcp_drop; } else { /* Illegal packet */ if (!(flags & (TH_ACK|TH_RST))) goto tcp_drop; } if (!(flags & TH_ACK)) { /* These flags are only valid if ACK is set */ if ((flags & TH_FIN) || (flags & TH_PUSH) || (flags & TH_URG)) goto tcp_drop; } /* Check for illegal header length */ if (th->th_off < (sizeof(struct tcphdr) >> 2)) goto tcp_drop; /* If flags changed, or reserved data set, then adjust */ if (flags != th->th_flags || th->th_x2 != 0) { u_int16_t ov, nv; ov = *(u_int16_t *)(&th->th_ack + 1); th->th_flags = flags; th->th_x2 = 0; nv = *(u_int16_t *)(&th->th_ack + 1); th->th_sum = pf_cksum_fixup(th->th_sum, ov, nv, 0); rewrite = 1; } /* Remove urgent pointer, if TH_URG is not set */ if (!(flags & TH_URG) && th->th_urp) { th->th_sum = pf_cksum_fixup(th->th_sum, th->th_urp, 0, 0); th->th_urp = 0; rewrite = 1; } /* Process options */ if (r->max_mss && pf_normalize_tcpopt(r, m, th, off, pd->af)) rewrite = 1; /* copy back packet headers if we sanitized */ if (rewrite) m_copyback(m, off, sizeof(*th), (caddr_t)th); return (PF_PASS); tcp_drop: REASON_SET(&reason, PFRES_NORM); if (rm != NULL && r->log) PFLOG_PACKET(kif, m, AF_INET, dir, reason, r, NULL, NULL, pd, 1); return (PF_DROP); } int pf_normalize_tcp_init(struct mbuf *m, int off, struct pf_pdesc *pd, struct tcphdr *th, struct pf_state_peer *src, struct pf_state_peer *dst) { u_int32_t tsval, tsecr; u_int8_t hdr[60]; u_int8_t *opt; KASSERT((src->scrub == NULL), ("pf_normalize_tcp_init: src->scrub != NULL")); src->scrub = uma_zalloc(V_pf_state_scrub_z, M_ZERO | M_NOWAIT); if (src->scrub == NULL) return (1); switch (pd->af) { #ifdef INET case AF_INET: { struct ip *h = mtod(m, struct ip *); src->scrub->pfss_ttl = h->ip_ttl; break; } #endif /* INET */ #ifdef INET6 case AF_INET6: { struct ip6_hdr *h = mtod(m, struct ip6_hdr *); src->scrub->pfss_ttl = h->ip6_hlim; break; } #endif /* INET6 */ } /* * All normalizations below are only begun if we see the start of * the connections. They must all set an enabled bit in pfss_flags */ if ((th->th_flags & TH_SYN) == 0) return (0); if (th->th_off > (sizeof(struct tcphdr) >> 2) && src->scrub && pf_pull_hdr(m, off, hdr, th->th_off << 2, NULL, NULL, pd->af)) { /* Diddle with TCP options */ int hlen; opt = hdr + sizeof(struct tcphdr); hlen = (th->th_off << 2) - sizeof(struct tcphdr); while (hlen >= TCPOLEN_TIMESTAMP) { switch (*opt) { case TCPOPT_EOL: /* FALLTHROUGH */ case TCPOPT_NOP: opt++; hlen--; break; case TCPOPT_TIMESTAMP: if (opt[1] >= TCPOLEN_TIMESTAMP) { src->scrub->pfss_flags |= PFSS_TIMESTAMP; src->scrub->pfss_ts_mod = htonl(arc4random()); /* note PFSS_PAWS not set yet */ memcpy(&tsval, &opt[2], sizeof(u_int32_t)); memcpy(&tsecr, &opt[6], sizeof(u_int32_t)); src->scrub->pfss_tsval0 = ntohl(tsval); src->scrub->pfss_tsval = ntohl(tsval); src->scrub->pfss_tsecr = ntohl(tsecr); getmicrouptime(&src->scrub->pfss_last); } /* FALLTHROUGH */ default: hlen -= MAX(opt[1], 2); opt += MAX(opt[1], 2); break; } } } return (0); } void pf_normalize_tcp_cleanup(struct pf_state *state) { if (state->src.scrub) uma_zfree(V_pf_state_scrub_z, state->src.scrub); if (state->dst.scrub) uma_zfree(V_pf_state_scrub_z, state->dst.scrub); /* Someday... flush the TCP segment reassembly descriptors. */ } int pf_normalize_tcp_stateful(struct mbuf *m, int off, struct pf_pdesc *pd, u_short *reason, struct tcphdr *th, struct pf_state *state, struct pf_state_peer *src, struct pf_state_peer *dst, int *writeback) { struct timeval uptime; u_int32_t tsval, tsecr; u_int tsval_from_last; u_int8_t hdr[60]; u_int8_t *opt; int copyback = 0; int got_ts = 0; KASSERT((src->scrub || dst->scrub), ("%s: src->scrub && dst->scrub!", __func__)); /* * Enforce the minimum TTL seen for this connection. Negate a common * technique to evade an intrusion detection system and confuse * firewall state code. */ switch (pd->af) { #ifdef INET case AF_INET: { if (src->scrub) { struct ip *h = mtod(m, struct ip *); if (h->ip_ttl > src->scrub->pfss_ttl) src->scrub->pfss_ttl = h->ip_ttl; h->ip_ttl = src->scrub->pfss_ttl; } break; } #endif /* INET */ #ifdef INET6 case AF_INET6: { if (src->scrub) { struct ip6_hdr *h = mtod(m, struct ip6_hdr *); if (h->ip6_hlim > src->scrub->pfss_ttl) src->scrub->pfss_ttl = h->ip6_hlim; h->ip6_hlim = src->scrub->pfss_ttl; } break; } #endif /* INET6 */ } if (th->th_off > (sizeof(struct tcphdr) >> 2) && ((src->scrub && (src->scrub->pfss_flags & PFSS_TIMESTAMP)) || (dst->scrub && (dst->scrub->pfss_flags & PFSS_TIMESTAMP))) && pf_pull_hdr(m, off, hdr, th->th_off << 2, NULL, NULL, pd->af)) { /* Diddle with TCP options */ int hlen; opt = hdr + sizeof(struct tcphdr); hlen = (th->th_off << 2) - sizeof(struct tcphdr); while (hlen >= TCPOLEN_TIMESTAMP) { switch (*opt) { case TCPOPT_EOL: /* FALLTHROUGH */ case TCPOPT_NOP: opt++; hlen--; break; case TCPOPT_TIMESTAMP: /* Modulate the timestamps. Can be used for * NAT detection, OS uptime determination or * reboot detection. */ if (got_ts) { /* Huh? Multiple timestamps!? */ if (V_pf_status.debug >= PF_DEBUG_MISC) { DPFPRINTF(("multiple TS??")); pf_print_state(state); printf("\n"); } REASON_SET(reason, PFRES_TS); return (PF_DROP); } if (opt[1] >= TCPOLEN_TIMESTAMP) { memcpy(&tsval, &opt[2], sizeof(u_int32_t)); if (tsval && src->scrub && (src->scrub->pfss_flags & PFSS_TIMESTAMP)) { tsval = ntohl(tsval); pf_change_a(&opt[2], &th->th_sum, htonl(tsval + src->scrub->pfss_ts_mod), 0); copyback = 1; } /* Modulate TS reply iff valid (!0) */ memcpy(&tsecr, &opt[6], sizeof(u_int32_t)); if (tsecr && dst->scrub && (dst->scrub->pfss_flags & PFSS_TIMESTAMP)) { tsecr = ntohl(tsecr) - dst->scrub->pfss_ts_mod; pf_change_a(&opt[6], &th->th_sum, htonl(tsecr), 0); copyback = 1; } got_ts = 1; } /* FALLTHROUGH */ default: hlen -= MAX(opt[1], 2); opt += MAX(opt[1], 2); break; } } if (copyback) { /* Copyback the options, caller copys back header */ *writeback = 1; m_copyback(m, off + sizeof(struct tcphdr), (th->th_off << 2) - sizeof(struct tcphdr), hdr + sizeof(struct tcphdr)); } } /* * Must invalidate PAWS checks on connections idle for too long. * The fastest allowed timestamp clock is 1ms. That turns out to * be about 24 days before it wraps. XXX Right now our lowerbound * TS echo check only works for the first 12 days of a connection * when the TS has exhausted half its 32bit space */ #define TS_MAX_IDLE (24*24*60*60) #define TS_MAX_CONN (12*24*60*60) /* XXX remove when better tsecr check */ getmicrouptime(&uptime); if (src->scrub && (src->scrub->pfss_flags & PFSS_PAWS) && (uptime.tv_sec - src->scrub->pfss_last.tv_sec > TS_MAX_IDLE || time_uptime - state->creation > TS_MAX_CONN)) { if (V_pf_status.debug >= PF_DEBUG_MISC) { DPFPRINTF(("src idled out of PAWS\n")); pf_print_state(state); printf("\n"); } src->scrub->pfss_flags = (src->scrub->pfss_flags & ~PFSS_PAWS) | PFSS_PAWS_IDLED; } if (dst->scrub && (dst->scrub->pfss_flags & PFSS_PAWS) && uptime.tv_sec - dst->scrub->pfss_last.tv_sec > TS_MAX_IDLE) { if (V_pf_status.debug >= PF_DEBUG_MISC) { DPFPRINTF(("dst idled out of PAWS\n")); pf_print_state(state); printf("\n"); } dst->scrub->pfss_flags = (dst->scrub->pfss_flags & ~PFSS_PAWS) | PFSS_PAWS_IDLED; } if (got_ts && src->scrub && dst->scrub && (src->scrub->pfss_flags & PFSS_PAWS) && (dst->scrub->pfss_flags & PFSS_PAWS)) { /* Validate that the timestamps are "in-window". * RFC1323 describes TCP Timestamp options that allow * measurement of RTT (round trip time) and PAWS * (protection against wrapped sequence numbers). PAWS * gives us a set of rules for rejecting packets on * long fat pipes (packets that were somehow delayed * in transit longer than the time it took to send the * full TCP sequence space of 4Gb). We can use these * rules and infer a few others that will let us treat * the 32bit timestamp and the 32bit echoed timestamp * as sequence numbers to prevent a blind attacker from * inserting packets into a connection. * * RFC1323 tells us: * - The timestamp on this packet must be greater than * or equal to the last value echoed by the other * endpoint. The RFC says those will be discarded * since it is a dup that has already been acked. * This gives us a lowerbound on the timestamp. * timestamp >= other last echoed timestamp * - The timestamp will be less than or equal to * the last timestamp plus the time between the * last packet and now. The RFC defines the max * clock rate as 1ms. We will allow clocks to be * up to 10% fast and will allow a total difference * or 30 seconds due to a route change. And this * gives us an upperbound on the timestamp. * timestamp <= last timestamp + max ticks * We have to be careful here. Windows will send an * initial timestamp of zero and then initialize it * to a random value after the 3whs; presumably to * avoid a DoS by having to call an expensive RNG * during a SYN flood. Proof MS has at least one * good security geek. * * - The TCP timestamp option must also echo the other * endpoints timestamp. The timestamp echoed is the * one carried on the earliest unacknowledged segment * on the left edge of the sequence window. The RFC * states that the host will reject any echoed * timestamps that were larger than any ever sent. * This gives us an upperbound on the TS echo. * tescr <= largest_tsval * - The lowerbound on the TS echo is a little more * tricky to determine. The other endpoint's echoed * values will not decrease. But there may be * network conditions that re-order packets and * cause our view of them to decrease. For now the * only lowerbound we can safely determine is that * the TS echo will never be less than the original * TS. XXX There is probably a better lowerbound. * Remove TS_MAX_CONN with better lowerbound check. * tescr >= other original TS * * It is also important to note that the fastest * timestamp clock of 1ms will wrap its 32bit space in * 24 days. So we just disable TS checking after 24 * days of idle time. We actually must use a 12d * connection limit until we can come up with a better * lowerbound to the TS echo check. */ struct timeval delta_ts; int ts_fudge; /* * PFTM_TS_DIFF is how many seconds of leeway to allow * a host's timestamp. This can happen if the previous * packet got delayed in transit for much longer than * this packet. */ if ((ts_fudge = state->rule.ptr->timeout[PFTM_TS_DIFF]) == 0) ts_fudge = V_pf_default_rule.timeout[PFTM_TS_DIFF]; /* Calculate max ticks since the last timestamp */ #define TS_MAXFREQ 1100 /* RFC max TS freq of 1Khz + 10% skew */ #define TS_MICROSECS 1000000 /* microseconds per second */ delta_ts = uptime; timevalsub(&delta_ts, &src->scrub->pfss_last); tsval_from_last = (delta_ts.tv_sec + ts_fudge) * TS_MAXFREQ; tsval_from_last += delta_ts.tv_usec / (TS_MICROSECS/TS_MAXFREQ); if ((src->state >= TCPS_ESTABLISHED && dst->state >= TCPS_ESTABLISHED) && (SEQ_LT(tsval, dst->scrub->pfss_tsecr) || SEQ_GT(tsval, src->scrub->pfss_tsval + tsval_from_last) || (tsecr && (SEQ_GT(tsecr, dst->scrub->pfss_tsval) || SEQ_LT(tsecr, dst->scrub->pfss_tsval0))))) { /* Bad RFC1323 implementation or an insertion attack. * * - Solaris 2.6 and 2.7 are known to send another ACK * after the FIN,FIN|ACK,ACK closing that carries * an old timestamp. */ DPFPRINTF(("Timestamp failed %c%c%c%c\n", SEQ_LT(tsval, dst->scrub->pfss_tsecr) ? '0' : ' ', SEQ_GT(tsval, src->scrub->pfss_tsval + tsval_from_last) ? '1' : ' ', SEQ_GT(tsecr, dst->scrub->pfss_tsval) ? '2' : ' ', SEQ_LT(tsecr, dst->scrub->pfss_tsval0)? '3' : ' ')); DPFPRINTF((" tsval: %u tsecr: %u +ticks: %u " "idle: %jus %lums\n", tsval, tsecr, tsval_from_last, (uintmax_t)delta_ts.tv_sec, delta_ts.tv_usec / 1000)); DPFPRINTF((" src->tsval: %u tsecr: %u\n", src->scrub->pfss_tsval, src->scrub->pfss_tsecr)); DPFPRINTF((" dst->tsval: %u tsecr: %u tsval0: %u" "\n", dst->scrub->pfss_tsval, dst->scrub->pfss_tsecr, dst->scrub->pfss_tsval0)); if (V_pf_status.debug >= PF_DEBUG_MISC) { pf_print_state(state); pf_print_flags(th->th_flags); printf("\n"); } REASON_SET(reason, PFRES_TS); return (PF_DROP); } /* XXX I'd really like to require tsecr but it's optional */ } else if (!got_ts && (th->th_flags & TH_RST) == 0 && ((src->state == TCPS_ESTABLISHED && dst->state == TCPS_ESTABLISHED) || pd->p_len > 0 || (th->th_flags & TH_SYN)) && src->scrub && dst->scrub && (src->scrub->pfss_flags & PFSS_PAWS) && (dst->scrub->pfss_flags & PFSS_PAWS)) { /* Didn't send a timestamp. Timestamps aren't really useful * when: * - connection opening or closing (often not even sent). * but we must not let an attacker to put a FIN on a * data packet to sneak it through our ESTABLISHED check. * - on a TCP reset. RFC suggests not even looking at TS. * - on an empty ACK. The TS will not be echoed so it will * probably not help keep the RTT calculation in sync and * there isn't as much danger when the sequence numbers * got wrapped. So some stacks don't include TS on empty * ACKs :-( * * To minimize the disruption to mostly RFC1323 conformant * stacks, we will only require timestamps on data packets. * * And what do ya know, we cannot require timestamps on data * packets. There appear to be devices that do legitimate * TCP connection hijacking. There are HTTP devices that allow * a 3whs (with timestamps) and then buffer the HTTP request. * If the intermediate device has the HTTP response cache, it * will spoof the response but not bother timestamping its * packets. So we can look for the presence of a timestamp in * the first data packet and if there, require it in all future * packets. */ if (pd->p_len > 0 && (src->scrub->pfss_flags & PFSS_DATA_TS)) { /* * Hey! Someone tried to sneak a packet in. Or the * stack changed its RFC1323 behavior?!?! */ if (V_pf_status.debug >= PF_DEBUG_MISC) { DPFPRINTF(("Did not receive expected RFC1323 " "timestamp\n")); pf_print_state(state); pf_print_flags(th->th_flags); printf("\n"); } REASON_SET(reason, PFRES_TS); return (PF_DROP); } } /* * We will note if a host sends his data packets with or without * timestamps. And require all data packets to contain a timestamp * if the first does. PAWS implicitly requires that all data packets be * timestamped. But I think there are middle-man devices that hijack * TCP streams immediately after the 3whs and don't timestamp their * packets (seen in a WWW accelerator or cache). */ if (pd->p_len > 0 && src->scrub && (src->scrub->pfss_flags & (PFSS_TIMESTAMP|PFSS_DATA_TS|PFSS_DATA_NOTS)) == PFSS_TIMESTAMP) { if (got_ts) src->scrub->pfss_flags |= PFSS_DATA_TS; else { src->scrub->pfss_flags |= PFSS_DATA_NOTS; if (V_pf_status.debug >= PF_DEBUG_MISC && dst->scrub && (dst->scrub->pfss_flags & PFSS_TIMESTAMP)) { /* Don't warn if other host rejected RFC1323 */ DPFPRINTF(("Broken RFC1323 stack did not " "timestamp data packet. Disabled PAWS " "security.\n")); pf_print_state(state); pf_print_flags(th->th_flags); printf("\n"); } } } /* * Update PAWS values */ if (got_ts && src->scrub && PFSS_TIMESTAMP == (src->scrub->pfss_flags & (PFSS_PAWS_IDLED|PFSS_TIMESTAMP))) { getmicrouptime(&src->scrub->pfss_last); if (SEQ_GEQ(tsval, src->scrub->pfss_tsval) || (src->scrub->pfss_flags & PFSS_PAWS) == 0) src->scrub->pfss_tsval = tsval; if (tsecr) { if (SEQ_GEQ(tsecr, src->scrub->pfss_tsecr) || (src->scrub->pfss_flags & PFSS_PAWS) == 0) src->scrub->pfss_tsecr = tsecr; if ((src->scrub->pfss_flags & PFSS_PAWS) == 0 && (SEQ_LT(tsval, src->scrub->pfss_tsval0) || src->scrub->pfss_tsval0 == 0)) { /* tsval0 MUST be the lowest timestamp */ src->scrub->pfss_tsval0 = tsval; } /* Only fully initialized after a TS gets echoed */ if ((src->scrub->pfss_flags & PFSS_PAWS) == 0) src->scrub->pfss_flags |= PFSS_PAWS; } } /* I have a dream.... TCP segment reassembly.... */ return (0); } static int pf_normalize_tcpopt(struct pf_rule *r, struct mbuf *m, struct tcphdr *th, int off, sa_family_t af) { u_int16_t *mss; int thoff; int opt, cnt, optlen = 0; int rewrite = 0; u_char opts[TCP_MAXOLEN]; u_char *optp = opts; thoff = th->th_off << 2; cnt = thoff - sizeof(struct tcphdr); if (cnt > 0 && !pf_pull_hdr(m, off + sizeof(*th), opts, cnt, NULL, NULL, af)) return (rewrite); for (; cnt > 0; cnt -= optlen, optp += optlen) { opt = optp[0]; if (opt == TCPOPT_EOL) break; if (opt == TCPOPT_NOP) optlen = 1; else { if (cnt < 2) break; optlen = optp[1]; if (optlen < 2 || optlen > cnt) break; } switch (opt) { case TCPOPT_MAXSEG: mss = (u_int16_t *)(optp + 2); if ((ntohs(*mss)) > r->max_mss) { th->th_sum = pf_cksum_fixup(th->th_sum, *mss, htons(r->max_mss), 0); *mss = htons(r->max_mss); rewrite = 1; } break; default: break; } } if (rewrite) m_copyback(m, off + sizeof(*th), thoff - sizeof(*th), opts); return (rewrite); } #ifdef INET static void pf_scrub_ip(struct mbuf **m0, u_int32_t flags, u_int8_t min_ttl, u_int8_t tos) { struct mbuf *m = *m0; struct ip *h = mtod(m, struct ip *); /* Clear IP_DF if no-df was requested */ if (flags & PFRULE_NODF && h->ip_off & htons(IP_DF)) { u_int16_t ip_off = h->ip_off; h->ip_off &= htons(~IP_DF); h->ip_sum = pf_cksum_fixup(h->ip_sum, ip_off, h->ip_off, 0); } /* Enforce a minimum ttl, may cause endless packet loops */ if (min_ttl && h->ip_ttl < min_ttl) { u_int16_t ip_ttl = h->ip_ttl; h->ip_ttl = min_ttl; h->ip_sum = pf_cksum_fixup(h->ip_sum, ip_ttl, h->ip_ttl, 0); } /* Enforce tos */ if (flags & PFRULE_SET_TOS) { u_int16_t ov, nv; ov = *(u_int16_t *)h; h->ip_tos = tos; nv = *(u_int16_t *)h; h->ip_sum = pf_cksum_fixup(h->ip_sum, ov, nv, 0); } /* random-id, but not for fragments */ if (flags & PFRULE_RANDOMID && !(h->ip_off & ~htons(IP_DF))) { uint16_t ip_id = h->ip_id; ip_fillid(h); h->ip_sum = pf_cksum_fixup(h->ip_sum, ip_id, h->ip_id, 0); } } #endif /* INET */ #ifdef INET6 static void pf_scrub_ip6(struct mbuf **m0, u_int8_t min_ttl) { struct mbuf *m = *m0; struct ip6_hdr *h = mtod(m, struct ip6_hdr *); /* Enforce a minimum ttl, may cause endless packet loops */ if (min_ttl && h->ip6_hlim < min_ttl) h->ip6_hlim = min_ttl; } #endif Index: user/ngie/more-tests/sys/sys/imgact.h =================================================================== --- user/ngie/more-tests/sys/sys/imgact.h (revision 281584) +++ user/ngie/more-tests/sys/sys/imgact.h (revision 281585) @@ -1,102 +1,103 @@ /*- * Copyright (c) 1993, David Greenman * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 4. Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * $FreeBSD$ */ #ifndef _SYS_IMGACT_H_ #define _SYS_IMGACT_H_ #include #include #define MAXSHELLCMDLEN PAGE_SIZE struct image_args { char *buf; /* pointer to string buffer */ char *begin_argv; /* beginning of argv in buf */ char *begin_envv; /* beginning of envv in buf */ char *endp; /* current `end' pointer of arg & env strings */ char *fname; /* pointer to filename of executable (system space) */ char *fname_buf; /* pointer to optional malloc(M_TEMP) buffer */ int stringspace; /* space left in arg & env buffer */ int argc; /* count of argument strings */ int envc; /* count of environment strings */ int fd; /* file descriptor of the executable */ }; struct image_params { struct proc *proc; /* our process struct */ struct label *execlabel; /* optional exec label */ struct vnode *vp; /* pointer to vnode of file to exec */ struct vm_object *object; /* The vm object for this vp */ struct vattr *attr; /* attributes of file */ const char *image_header; /* head of file to exec */ unsigned long entry_addr; /* entry address of target executable */ unsigned long reloc_base; /* load address of image */ char vmspace_destroyed; /* flag - we've blown away original vm space */ #define IMGACT_SHELL 0x1 #define IMGACT_BINMISC 0x2 unsigned char interpreted; /* mask of interpreters that have run */ char opened; /* flag - we have opened executable vnode */ char *interpreter_name; /* name of the interpreter */ void *auxargs; /* ELF Auxinfo structure pointer */ struct sf_buf *firstpage; /* first page that we mapped */ unsigned long ps_strings; /* PS_STRINGS for BSD/OS binaries */ size_t auxarg_size; struct image_args *args; /* system call arguments */ struct sysentvec *sysent; /* system entry vector */ char *execpath; unsigned long execpathp; char *freepath; unsigned long canary; int canarylen; unsigned long pagesizes; int pagesizeslen; vm_prot_t stack_prot; + u_long stack_sz; }; #ifdef _KERNEL struct sysentvec; struct thread; #define IMGACT_CORE_COMPRESS 0x01 int exec_alloc_args(struct image_args *); int exec_check_permissions(struct image_params *); register_t *exec_copyout_strings(struct image_params *); void exec_free_args(struct image_args *); int exec_new_vmspace(struct image_params *, struct sysentvec *); void exec_setregs(struct thread *, struct image_params *, u_long); int exec_shell_imgact(struct image_params *); int exec_copyin_args(struct image_args *, char *, enum uio_seg, char **, char **); #endif #endif /* !_SYS_IMGACT_H_ */ Index: user/ngie/more-tests/sys/sys/mount.h =================================================================== --- user/ngie/more-tests/sys/sys/mount.h (revision 281584) +++ user/ngie/more-tests/sys/sys/mount.h (revision 281585) @@ -1,949 +1,950 @@ /*- * Copyright (c) 1989, 1991, 1993 * The Regents of the University of California. All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 4. Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * @(#)mount.h 8.21 (Berkeley) 5/20/95 * $FreeBSD$ */ #ifndef _SYS_MOUNT_H_ #define _SYS_MOUNT_H_ #include #include #ifdef _KERNEL #include #include #include #include #endif /* * NOTE: When changing statfs structure, mount structure, MNT_* flags or * MNTK_* flags also update DDB show mount command in vfs_subr.c. */ typedef struct fsid { int32_t val[2]; } fsid_t; /* filesystem id type */ /* * File identifier. * These are unique per filesystem on a single machine. */ #define MAXFIDSZ 16 struct fid { u_short fid_len; /* length of data in bytes */ u_short fid_data0; /* force longword alignment */ char fid_data[MAXFIDSZ]; /* data (variable length) */ }; /* * filesystem statistics */ #define MFSNAMELEN 16 /* length of type name including null */ #define MNAMELEN 88 /* size of on/from name bufs */ #define STATFS_VERSION 0x20030518 /* current version number */ struct statfs { uint32_t f_version; /* structure version number */ uint32_t f_type; /* type of filesystem */ uint64_t f_flags; /* copy of mount exported flags */ uint64_t f_bsize; /* filesystem fragment size */ uint64_t f_iosize; /* optimal transfer block size */ uint64_t f_blocks; /* total data blocks in filesystem */ uint64_t f_bfree; /* free blocks in filesystem */ int64_t f_bavail; /* free blocks avail to non-superuser */ uint64_t f_files; /* total file nodes in filesystem */ int64_t f_ffree; /* free nodes avail to non-superuser */ uint64_t f_syncwrites; /* count of sync writes since mount */ uint64_t f_asyncwrites; /* count of async writes since mount */ uint64_t f_syncreads; /* count of sync reads since mount */ uint64_t f_asyncreads; /* count of async reads since mount */ uint64_t f_spare[10]; /* unused spare */ uint32_t f_namemax; /* maximum filename length */ uid_t f_owner; /* user that mounted the filesystem */ fsid_t f_fsid; /* filesystem id */ char f_charspare[80]; /* spare string space */ char f_fstypename[MFSNAMELEN]; /* filesystem type name */ char f_mntfromname[MNAMELEN]; /* mounted filesystem */ char f_mntonname[MNAMELEN]; /* directory on which mounted */ }; #ifdef _KERNEL #define OMFSNAMELEN 16 /* length of fs type name, including null */ #define OMNAMELEN (88 - 2 * sizeof(long)) /* size of on/from name bufs */ /* XXX getfsstat.2 is out of date with write and read counter changes here. */ /* XXX statfs.2 is out of date with read counter changes here. */ struct ostatfs { long f_spare2; /* placeholder */ long f_bsize; /* fundamental filesystem block size */ long f_iosize; /* optimal transfer block size */ long f_blocks; /* total data blocks in filesystem */ long f_bfree; /* free blocks in fs */ long f_bavail; /* free blocks avail to non-superuser */ long f_files; /* total file nodes in filesystem */ long f_ffree; /* free file nodes in fs */ fsid_t f_fsid; /* filesystem id */ uid_t f_owner; /* user that mounted the filesystem */ int f_type; /* type of filesystem */ int f_flags; /* copy of mount exported flags */ long f_syncwrites; /* count of sync writes since mount */ long f_asyncwrites; /* count of async writes since mount */ char f_fstypename[OMFSNAMELEN]; /* fs type name */ char f_mntonname[OMNAMELEN]; /* directory on which mounted */ long f_syncreads; /* count of sync reads since mount */ long f_asyncreads; /* count of async reads since mount */ short f_spares1; /* unused spare */ char f_mntfromname[OMNAMELEN];/* mounted filesystem */ short f_spares2; /* unused spare */ /* * XXX on machines where longs are aligned to 8-byte boundaries, there * is an unnamed int32_t here. This spare was after the apparent end * of the struct until we bit off the read counters from f_mntonname. */ long f_spare[2]; /* unused spare */ }; TAILQ_HEAD(vnodelst, vnode); /* Mount options list */ TAILQ_HEAD(vfsoptlist, vfsopt); struct vfsopt { TAILQ_ENTRY(vfsopt) link; char *name; void *value; int len; int pos; int seen; }; /* * Structure per mounted filesystem. Each mounted filesystem has an * array of operations and an instance record. The filesystems are * put on a doubly linked list. * * Lock reference: * m - mountlist_mtx * i - interlock * v - vnode freelist mutex * * Unmarked fields are considered stable as long as a ref is held. * */ struct mount { struct mtx mnt_mtx; /* mount structure interlock */ int mnt_gen; /* struct mount generation */ #define mnt_startzero mnt_list TAILQ_ENTRY(mount) mnt_list; /* (m) mount list */ struct vfsops *mnt_op; /* operations on fs */ struct vfsconf *mnt_vfc; /* configuration info */ struct vnode *mnt_vnodecovered; /* vnode we mounted on */ struct vnode *mnt_syncer; /* syncer vnode */ int mnt_ref; /* (i) Reference count */ struct vnodelst mnt_nvnodelist; /* (i) list of vnodes */ int mnt_nvnodelistsize; /* (i) # of vnodes */ struct vnodelst mnt_activevnodelist; /* (v) list of active vnodes */ int mnt_activevnodelistsize;/* (v) # of active vnodes */ int mnt_writeopcount; /* (i) write syscalls pending */ int mnt_kern_flag; /* (i) kernel only flags */ uint64_t mnt_flag; /* (i) flags shared with user */ struct vfsoptlist *mnt_opt; /* current mount options */ struct vfsoptlist *mnt_optnew; /* new options passed to fs */ int mnt_maxsymlinklen; /* max size of short symlink */ struct statfs mnt_stat; /* cache of filesystem stats */ struct ucred *mnt_cred; /* credentials of mounter */ void * mnt_data; /* private data */ time_t mnt_time; /* last time written*/ int mnt_iosize_max; /* max size for clusters, etc */ struct netexport *mnt_export; /* export list */ struct label *mnt_label; /* MAC label for the fs */ u_int mnt_hashseed; /* Random seed for vfs_hash */ int mnt_lockref; /* (i) Lock reference count */ int mnt_secondary_writes; /* (i) # of secondary writes */ int mnt_secondary_accwrites;/* (i) secondary wr. starts */ struct thread *mnt_susp_owner; /* (i) thread owning suspension */ #define mnt_endzero mnt_gjprovider char *mnt_gjprovider; /* gjournal provider name */ struct lock mnt_explock; /* vfs_export walkers lock */ TAILQ_ENTRY(mount) mnt_upper_link; /* (m) we in the all uppers */ TAILQ_HEAD(, mount) mnt_uppers; /* (m) upper mounts over us*/ }; /* * Definitions for MNT_VNODE_FOREACH_ALL. */ struct vnode *__mnt_vnode_next_all(struct vnode **mvp, struct mount *mp); struct vnode *__mnt_vnode_first_all(struct vnode **mvp, struct mount *mp); void __mnt_vnode_markerfree_all(struct vnode **mvp, struct mount *mp); #define MNT_VNODE_FOREACH_ALL(vp, mp, mvp) \ for (vp = __mnt_vnode_first_all(&(mvp), (mp)); \ (vp) != NULL; vp = __mnt_vnode_next_all(&(mvp), (mp))) #define MNT_VNODE_FOREACH_ALL_ABORT(mp, mvp) \ do { \ MNT_ILOCK(mp); \ __mnt_vnode_markerfree_all(&(mvp), (mp)); \ /* MNT_IUNLOCK(mp); -- done in above function */ \ mtx_assert(MNT_MTX(mp), MA_NOTOWNED); \ } while (0) /* * Definitions for MNT_VNODE_FOREACH_ACTIVE. */ struct vnode *__mnt_vnode_next_active(struct vnode **mvp, struct mount *mp); struct vnode *__mnt_vnode_first_active(struct vnode **mvp, struct mount *mp); void __mnt_vnode_markerfree_active(struct vnode **mvp, struct mount *); #define MNT_VNODE_FOREACH_ACTIVE(vp, mp, mvp) \ for (vp = __mnt_vnode_first_active(&(mvp), (mp)); \ (vp) != NULL; vp = __mnt_vnode_next_active(&(mvp), (mp))) #define MNT_VNODE_FOREACH_ACTIVE_ABORT(mp, mvp) \ __mnt_vnode_markerfree_active(&(mvp), (mp)) #define MNT_ILOCK(mp) mtx_lock(&(mp)->mnt_mtx) #define MNT_ITRYLOCK(mp) mtx_trylock(&(mp)->mnt_mtx) #define MNT_IUNLOCK(mp) mtx_unlock(&(mp)->mnt_mtx) #define MNT_MTX(mp) (&(mp)->mnt_mtx) #define MNT_REF(mp) (mp)->mnt_ref++ #define MNT_REL(mp) do { \ KASSERT((mp)->mnt_ref > 0, ("negative mnt_ref")); \ (mp)->mnt_ref--; \ if ((mp)->mnt_ref == 0) \ wakeup((mp)); \ } while (0) #endif /* _KERNEL */ /* * User specifiable flags, stored in mnt_flag. */ #define MNT_RDONLY 0x0000000000000001ULL /* read only filesystem */ #define MNT_SYNCHRONOUS 0x0000000000000002ULL /* fs written synchronously */ #define MNT_NOEXEC 0x0000000000000004ULL /* can't exec from filesystem */ #define MNT_NOSUID 0x0000000000000008ULL /* don't honor setuid fs bits */ #define MNT_NFS4ACLS 0x0000000000000010ULL /* enable NFS version 4 ACLs */ #define MNT_UNION 0x0000000000000020ULL /* union with underlying fs */ #define MNT_ASYNC 0x0000000000000040ULL /* fs written asynchronously */ #define MNT_SUIDDIR 0x0000000000100000ULL /* special SUID dir handling */ #define MNT_SOFTDEP 0x0000000000200000ULL /* using soft updates */ #define MNT_NOSYMFOLLOW 0x0000000000400000ULL /* do not follow symlinks */ #define MNT_GJOURNAL 0x0000000002000000ULL /* GEOM journal support enabled */ #define MNT_MULTILABEL 0x0000000004000000ULL /* MAC support for objects */ #define MNT_ACLS 0x0000000008000000ULL /* ACL support enabled */ #define MNT_NOATIME 0x0000000010000000ULL /* dont update file access time */ #define MNT_NOCLUSTERR 0x0000000040000000ULL /* disable cluster read */ #define MNT_NOCLUSTERW 0x0000000080000000ULL /* disable cluster write */ #define MNT_SUJ 0x0000000100000000ULL /* using journaled soft updates */ #define MNT_AUTOMOUNTED 0x0000000200000000ULL /* mounted by automountd(8) */ /* * NFS export related mount flags. */ #define MNT_EXRDONLY 0x0000000000000080ULL /* exported read only */ #define MNT_EXPORTED 0x0000000000000100ULL /* filesystem is exported */ #define MNT_DEFEXPORTED 0x0000000000000200ULL /* exported to the world */ #define MNT_EXPORTANON 0x0000000000000400ULL /* anon uid mapping for all */ #define MNT_EXKERB 0x0000000000000800ULL /* exported with Kerberos */ #define MNT_EXPUBLIC 0x0000000020000000ULL /* public export (WebNFS) */ /* * Flags set by internal operations, * but visible to the user. * XXX some of these are not quite right.. (I've never seen the root flag set) */ #define MNT_LOCAL 0x0000000000001000ULL /* filesystem is stored locally */ #define MNT_QUOTA 0x0000000000002000ULL /* quotas are enabled on fs */ #define MNT_ROOTFS 0x0000000000004000ULL /* identifies the root fs */ #define MNT_USER 0x0000000000008000ULL /* mounted by a user */ #define MNT_IGNORE 0x0000000000800000ULL /* do not show entry in df */ /* * Mask of flags that are visible to statfs(). * XXX I think that this could now become (~(MNT_CMDFLAGS)) * but the 'mount' program may need changing to handle this. */ #define MNT_VISFLAGMASK (MNT_RDONLY | MNT_SYNCHRONOUS | MNT_NOEXEC | \ MNT_NOSUID | MNT_UNION | MNT_SUJ | \ MNT_ASYNC | MNT_EXRDONLY | MNT_EXPORTED | \ MNT_DEFEXPORTED | MNT_EXPORTANON| MNT_EXKERB | \ MNT_LOCAL | MNT_USER | MNT_QUOTA | \ MNT_ROOTFS | MNT_NOATIME | MNT_NOCLUSTERR| \ MNT_NOCLUSTERW | MNT_SUIDDIR | MNT_SOFTDEP | \ MNT_IGNORE | MNT_EXPUBLIC | MNT_NOSYMFOLLOW | \ MNT_GJOURNAL | MNT_MULTILABEL | MNT_ACLS | \ MNT_NFS4ACLS | MNT_AUTOMOUNTED) /* Mask of flags that can be updated. */ #define MNT_UPDATEMASK (MNT_NOSUID | MNT_NOEXEC | \ MNT_SYNCHRONOUS | MNT_UNION | MNT_ASYNC | \ MNT_NOATIME | \ MNT_NOSYMFOLLOW | MNT_IGNORE | \ MNT_NOCLUSTERR | MNT_NOCLUSTERW | MNT_SUIDDIR | \ MNT_ACLS | MNT_USER | MNT_NFS4ACLS | \ MNT_AUTOMOUNTED) /* * External filesystem command modifier flags. * Unmount can use the MNT_FORCE flag. * XXX: These are not STATES and really should be somewhere else. * XXX: MNT_BYFSID collides with MNT_ACLS, but because MNT_ACLS is only used for * mount(2) and MNT_BYFSID is only used for unmount(2) it's harmless. */ #define MNT_UPDATE 0x0000000000010000ULL /* not real mount, just update */ #define MNT_DELEXPORT 0x0000000000020000ULL /* delete export host lists */ #define MNT_RELOAD 0x0000000000040000ULL /* reload filesystem data */ #define MNT_FORCE 0x0000000000080000ULL /* force unmount or readonly */ #define MNT_SNAPSHOT 0x0000000001000000ULL /* snapshot the filesystem */ #define MNT_BYFSID 0x0000000008000000ULL /* specify filesystem by ID. */ #define MNT_CMDFLAGS (MNT_UPDATE | MNT_DELEXPORT | MNT_RELOAD | \ MNT_FORCE | MNT_SNAPSHOT | MNT_BYFSID) /* * Internal filesystem control flags stored in mnt_kern_flag. * * MNTK_UNMOUNT locks the mount entry so that name lookup cannot proceed * past the mount point. This keeps the subtree stable during mounts * and unmounts. * * MNTK_UNMOUNTF permits filesystems to detect a forced unmount while * dounmount() is still waiting to lock the mountpoint. This allows * the filesystem to cancel operations that might otherwise deadlock * with the unmount attempt (used by NFS). * * MNTK_NOINSMNTQ is strict subset of MNTK_UNMOUNT. They are separated * to allow for failed unmount attempt to restore the syncer vnode for * the mount. */ #define MNTK_UNMOUNTF 0x00000001 /* forced unmount in progress */ #define MNTK_ASYNC 0x00000002 /* filtered async flag */ #define MNTK_SOFTDEP 0x00000004 /* async disabled by softdep */ #define MNTK_NOINSMNTQ 0x00000008 /* insmntque is not allowed */ #define MNTK_DRAINING 0x00000010 /* lock draining is happening */ #define MNTK_REFEXPIRE 0x00000020 /* refcount expiring is happening */ #define MNTK_EXTENDED_SHARED 0x00000040 /* Allow shared locking for more ops */ #define MNTK_SHARED_WRITES 0x00000080 /* Allow shared locking for writes */ #define MNTK_NO_IOPF 0x00000100 /* Disallow page faults during reads and writes. Filesystem shall properly handle i/o state on EFAULT. */ #define MNTK_VGONE_UPPER 0x00000200 #define MNTK_VGONE_WAITER 0x00000400 #define MNTK_LOOKUP_EXCL_DOTDOT 0x00000800 #define MNTK_MARKER 0x00001000 #define MNTK_UNMAPPED_BUFS 0x00002000 +#define MNTK_USES_BCACHE 0x00004000 /* FS uses the buffer cache. */ #define MNTK_NOASYNC 0x00800000 /* disable async */ #define MNTK_UNMOUNT 0x01000000 /* unmount in progress */ #define MNTK_MWAIT 0x02000000 /* waiting for unmount to finish */ #define MNTK_SUSPEND 0x08000000 /* request write suspension */ #define MNTK_SUSPEND2 0x04000000 /* block secondary writes */ #define MNTK_SUSPENDED 0x10000000 /* write operations are suspended */ #define MNTK_SUSPENDABLE 0x20000000 /* writes can be suspended */ #define MNTK_LOOKUP_SHARED 0x40000000 /* FS supports shared lock lookups */ #define MNTK_NOKNOTE 0x80000000 /* Don't send KNOTEs from VOP hooks */ #ifdef _KERNEL static inline int MNT_SHARED_WRITES(struct mount *mp) { return (mp != NULL && (mp->mnt_kern_flag & MNTK_SHARED_WRITES) != 0); } static inline int MNT_EXTENDED_SHARED(struct mount *mp) { return (mp != NULL && (mp->mnt_kern_flag & MNTK_EXTENDED_SHARED) != 0); } #endif /* * Sysctl CTL_VFS definitions. * * Second level identifier specifies which filesystem. Second level * identifier VFS_VFSCONF returns information about all filesystems. * Second level identifier VFS_GENERIC is non-terminal. */ #define VFS_VFSCONF 0 /* get configured filesystems */ #define VFS_GENERIC 0 /* generic filesystem information */ /* * Third level identifiers for VFS_GENERIC are given below; third * level identifiers for specific filesystems are given in their * mount specific header files. */ #define VFS_MAXTYPENUM 1 /* int: highest defined filesystem type */ #define VFS_CONF 2 /* struct: vfsconf for filesystem given as next argument */ /* * Flags for various system call interfaces. * * waitfor flags to vfs_sync() and getfsstat() */ #define MNT_WAIT 1 /* synchronously wait for I/O to complete */ #define MNT_NOWAIT 2 /* start all I/O, but do not wait for it */ #define MNT_LAZY 3 /* push data not written by filesystem syncer */ #define MNT_SUSPEND 4 /* Suspend file system after sync */ /* * Generic file handle */ struct fhandle { fsid_t fh_fsid; /* Filesystem id of mount point */ struct fid fh_fid; /* Filesys specific id */ }; typedef struct fhandle fhandle_t; /* * Old export arguments without security flavor list */ struct oexport_args { int ex_flags; /* export related flags */ uid_t ex_root; /* mapping for root uid */ struct xucred ex_anon; /* mapping for anonymous user */ struct sockaddr *ex_addr; /* net address to which exported */ u_char ex_addrlen; /* and the net address length */ struct sockaddr *ex_mask; /* mask of valid bits in saddr */ u_char ex_masklen; /* and the smask length */ char *ex_indexfile; /* index file for WebNFS URLs */ }; /* * Export arguments for local filesystem mount calls. */ #define MAXSECFLAVORS 5 struct export_args { int ex_flags; /* export related flags */ uid_t ex_root; /* mapping for root uid */ struct xucred ex_anon; /* mapping for anonymous user */ struct sockaddr *ex_addr; /* net address to which exported */ u_char ex_addrlen; /* and the net address length */ struct sockaddr *ex_mask; /* mask of valid bits in saddr */ u_char ex_masklen; /* and the smask length */ char *ex_indexfile; /* index file for WebNFS URLs */ int ex_numsecflavors; /* security flavor count */ int ex_secflavors[MAXSECFLAVORS]; /* list of security flavors */ }; /* * Structure holding information for a publicly exported filesystem * (WebNFS). Currently the specs allow just for one such filesystem. */ struct nfs_public { int np_valid; /* Do we hold valid information */ fhandle_t np_handle; /* Filehandle for pub fs (internal) */ struct mount *np_mount; /* Mountpoint of exported fs */ char *np_index; /* Index file */ }; /* * Filesystem configuration information. One of these exists for each * type of filesystem supported by the kernel. These are searched at * mount time to identify the requested filesystem. * * XXX: Never change the first two arguments! */ struct vfsconf { u_int vfc_version; /* ABI version number */ char vfc_name[MFSNAMELEN]; /* filesystem type name */ struct vfsops *vfc_vfsops; /* filesystem operations vector */ int vfc_typenum; /* historic filesystem type number */ int vfc_refcount; /* number mounted of this type */ int vfc_flags; /* permanent flags */ struct vfsoptdecl *vfc_opts; /* mount options */ TAILQ_ENTRY(vfsconf) vfc_list; /* list of vfscons */ }; /* Userland version of the struct vfsconf. */ struct xvfsconf { struct vfsops *vfc_vfsops; /* filesystem operations vector */ char vfc_name[MFSNAMELEN]; /* filesystem type name */ int vfc_typenum; /* historic filesystem type number */ int vfc_refcount; /* number mounted of this type */ int vfc_flags; /* permanent flags */ struct vfsconf *vfc_next; /* next in list */ }; #ifndef BURN_BRIDGES struct ovfsconf { void *vfc_vfsops; char vfc_name[32]; int vfc_index; int vfc_refcount; int vfc_flags; }; #endif /* * NB: these flags refer to IMPLEMENTATION properties, not properties of * any actual mounts; i.e., it does not make sense to change the flags. */ #define VFCF_STATIC 0x00010000 /* statically compiled into kernel */ #define VFCF_NETWORK 0x00020000 /* may get data over the network */ #define VFCF_READONLY 0x00040000 /* writes are not implemented */ #define VFCF_SYNTHETIC 0x00080000 /* data does not represent real files */ #define VFCF_LOOPBACK 0x00100000 /* aliases some other mounted FS */ #define VFCF_UNICODE 0x00200000 /* stores file names as Unicode */ #define VFCF_JAIL 0x00400000 /* can be mounted from within a jail */ #define VFCF_DELEGADMIN 0x00800000 /* supports delegated administration */ #define VFCF_SBDRY 0x01000000 /* defer stop requests */ typedef uint32_t fsctlop_t; struct vfsidctl { int vc_vers; /* should be VFSIDCTL_VERS1 (below) */ fsid_t vc_fsid; /* fsid to operate on */ char vc_fstypename[MFSNAMELEN]; /* type of fs 'nfs' or '*' */ fsctlop_t vc_op; /* operation VFS_CTL_* (below) */ void *vc_ptr; /* pointer to data structure */ size_t vc_len; /* sizeof said structure */ u_int32_t vc_spare[12]; /* spare (must be zero) */ }; /* vfsidctl API version. */ #define VFS_CTL_VERS1 0x01 /* * New style VFS sysctls, do not reuse/conflict with the namespace for * private sysctls. * All "global" sysctl ops have the 33rd bit set: * 0x...1.... * Private sysctl ops should have the 33rd bit unset. */ #define VFS_CTL_QUERY 0x00010001 /* anything wrong? (vfsquery) */ #define VFS_CTL_TIMEO 0x00010002 /* set timeout for vfs notification */ #define VFS_CTL_NOLOCKS 0x00010003 /* disable file locking */ struct vfsquery { u_int32_t vq_flags; u_int32_t vq_spare[31]; }; /* vfsquery flags */ #define VQ_NOTRESP 0x0001 /* server down */ #define VQ_NEEDAUTH 0x0002 /* server bad auth */ #define VQ_LOWDISK 0x0004 /* we're low on space */ #define VQ_MOUNT 0x0008 /* new filesystem arrived */ #define VQ_UNMOUNT 0x0010 /* filesystem has left */ #define VQ_DEAD 0x0020 /* filesystem is dead, needs force unmount */ #define VQ_ASSIST 0x0040 /* filesystem needs assistance from external program */ #define VQ_NOTRESPLOCK 0x0080 /* server lockd down */ #define VQ_FLAG0100 0x0100 /* placeholder */ #define VQ_FLAG0200 0x0200 /* placeholder */ #define VQ_FLAG0400 0x0400 /* placeholder */ #define VQ_FLAG0800 0x0800 /* placeholder */ #define VQ_FLAG1000 0x1000 /* placeholder */ #define VQ_FLAG2000 0x2000 /* placeholder */ #define VQ_FLAG4000 0x4000 /* placeholder */ #define VQ_FLAG8000 0x8000 /* placeholder */ #ifdef _KERNEL /* Point a sysctl request at a vfsidctl's data. */ #define VCTLTOREQ(vc, req) \ do { \ (req)->newptr = (vc)->vc_ptr; \ (req)->newlen = (vc)->vc_len; \ (req)->newidx = 0; \ } while (0) #endif struct iovec; struct uio; #ifdef _KERNEL /* * vfs_busy specific flags and mask. */ #define MBF_NOWAIT 0x01 #define MBF_MNTLSTLOCK 0x02 #define MBF_MASK (MBF_NOWAIT | MBF_MNTLSTLOCK) #ifdef MALLOC_DECLARE MALLOC_DECLARE(M_MOUNT); #endif extern int maxvfsconf; /* highest defined filesystem type */ extern int nfs_mount_type; /* vfc_typenum for nfs, or -1 */ TAILQ_HEAD(vfsconfhead, vfsconf); extern struct vfsconfhead vfsconf; /* * Operations supported on mounted filesystem. */ struct mount_args; struct nameidata; struct sysctl_req; struct mntarg; typedef int vfs_cmount_t(struct mntarg *ma, void *data, uint64_t flags); typedef int vfs_unmount_t(struct mount *mp, int mntflags); typedef int vfs_root_t(struct mount *mp, int flags, struct vnode **vpp); typedef int vfs_quotactl_t(struct mount *mp, int cmds, uid_t uid, void *arg); typedef int vfs_statfs_t(struct mount *mp, struct statfs *sbp); typedef int vfs_sync_t(struct mount *mp, int waitfor); typedef int vfs_vget_t(struct mount *mp, ino_t ino, int flags, struct vnode **vpp); typedef int vfs_fhtovp_t(struct mount *mp, struct fid *fhp, int flags, struct vnode **vpp); typedef int vfs_checkexp_t(struct mount *mp, struct sockaddr *nam, int *extflagsp, struct ucred **credanonp, int *numsecflavors, int **secflavors); typedef int vfs_init_t(struct vfsconf *); typedef int vfs_uninit_t(struct vfsconf *); typedef int vfs_extattrctl_t(struct mount *mp, int cmd, struct vnode *filename_vp, int attrnamespace, const char *attrname); typedef int vfs_mount_t(struct mount *mp); typedef int vfs_sysctl_t(struct mount *mp, fsctlop_t op, struct sysctl_req *req); typedef void vfs_susp_clean_t(struct mount *mp); typedef void vfs_notify_lowervp_t(struct mount *mp, struct vnode *lowervp); typedef void vfs_purge_t(struct mount *mp); struct vfsops { vfs_mount_t *vfs_mount; vfs_cmount_t *vfs_cmount; vfs_unmount_t *vfs_unmount; vfs_root_t *vfs_root; vfs_quotactl_t *vfs_quotactl; vfs_statfs_t *vfs_statfs; vfs_sync_t *vfs_sync; vfs_vget_t *vfs_vget; vfs_fhtovp_t *vfs_fhtovp; vfs_checkexp_t *vfs_checkexp; vfs_init_t *vfs_init; vfs_uninit_t *vfs_uninit; vfs_extattrctl_t *vfs_extattrctl; vfs_sysctl_t *vfs_sysctl; vfs_susp_clean_t *vfs_susp_clean; vfs_notify_lowervp_t *vfs_reclaim_lowervp; vfs_notify_lowervp_t *vfs_unlink_lowervp; vfs_purge_t *vfs_purge; vfs_mount_t *vfs_spare[6]; /* spares for ABI compat */ }; vfs_statfs_t __vfs_statfs; #define VFS_PROLOGUE(MP) do { \ struct mount *mp__; \ int _enable_stops; \ \ mp__ = (MP); \ _enable_stops = (mp__ != NULL && \ (mp__->mnt_vfc->vfc_flags & VFCF_SBDRY) && sigdeferstop()) #define VFS_EPILOGUE(MP) \ if (_enable_stops) \ sigallowstop(); \ } while (0) #define VFS_MOUNT(MP) ({ \ int _rc; \ \ VFS_PROLOGUE(MP); \ _rc = (*(MP)->mnt_op->vfs_mount)(MP); \ VFS_EPILOGUE(MP); \ _rc; }) #define VFS_UNMOUNT(MP, FORCE) ({ \ int _rc; \ \ VFS_PROLOGUE(MP); \ _rc = (*(MP)->mnt_op->vfs_unmount)(MP, FORCE); \ VFS_EPILOGUE(MP); \ _rc; }) #define VFS_ROOT(MP, FLAGS, VPP) ({ \ int _rc; \ \ VFS_PROLOGUE(MP); \ _rc = (*(MP)->mnt_op->vfs_root)(MP, FLAGS, VPP); \ VFS_EPILOGUE(MP); \ _rc; }) #define VFS_QUOTACTL(MP, C, U, A) ({ \ int _rc; \ \ VFS_PROLOGUE(MP); \ _rc = (*(MP)->mnt_op->vfs_quotactl)(MP, C, U, A); \ VFS_EPILOGUE(MP); \ _rc; }) #define VFS_STATFS(MP, SBP) ({ \ int _rc; \ \ VFS_PROLOGUE(MP); \ _rc = __vfs_statfs((MP), (SBP)); \ VFS_EPILOGUE(MP); \ _rc; }) #define VFS_SYNC(MP, WAIT) ({ \ int _rc; \ \ VFS_PROLOGUE(MP); \ _rc = (*(MP)->mnt_op->vfs_sync)(MP, WAIT); \ VFS_EPILOGUE(MP); \ _rc; }) #define VFS_VGET(MP, INO, FLAGS, VPP) ({ \ int _rc; \ \ VFS_PROLOGUE(MP); \ _rc = (*(MP)->mnt_op->vfs_vget)(MP, INO, FLAGS, VPP); \ VFS_EPILOGUE(MP); \ _rc; }) #define VFS_FHTOVP(MP, FIDP, FLAGS, VPP) ({ \ int _rc; \ \ VFS_PROLOGUE(MP); \ _rc = (*(MP)->mnt_op->vfs_fhtovp)(MP, FIDP, FLAGS, VPP); \ VFS_EPILOGUE(MP); \ _rc; }) #define VFS_CHECKEXP(MP, NAM, EXFLG, CRED, NUMSEC, SEC) ({ \ int _rc; \ \ VFS_PROLOGUE(MP); \ _rc = (*(MP)->mnt_op->vfs_checkexp)(MP, NAM, EXFLG, CRED, NUMSEC,\ SEC); \ VFS_EPILOGUE(MP); \ _rc; }) #define VFS_EXTATTRCTL(MP, C, FN, NS, N) ({ \ int _rc; \ \ VFS_PROLOGUE(MP); \ _rc = (*(MP)->mnt_op->vfs_extattrctl)(MP, C, FN, NS, N); \ VFS_EPILOGUE(MP); \ _rc; }) #define VFS_SYSCTL(MP, OP, REQ) ({ \ int _rc; \ \ VFS_PROLOGUE(MP); \ _rc = (*(MP)->mnt_op->vfs_sysctl)(MP, OP, REQ); \ VFS_EPILOGUE(MP); \ _rc; }) #define VFS_SUSP_CLEAN(MP) do { \ if (*(MP)->mnt_op->vfs_susp_clean != NULL) { \ VFS_PROLOGUE(MP); \ (*(MP)->mnt_op->vfs_susp_clean)(MP); \ VFS_EPILOGUE(MP); \ } \ } while (0) #define VFS_RECLAIM_LOWERVP(MP, VP) do { \ if (*(MP)->mnt_op->vfs_reclaim_lowervp != NULL) { \ VFS_PROLOGUE(MP); \ (*(MP)->mnt_op->vfs_reclaim_lowervp)((MP), (VP)); \ VFS_EPILOGUE(MP); \ } \ } while (0) #define VFS_UNLINK_LOWERVP(MP, VP) do { \ if (*(MP)->mnt_op->vfs_unlink_lowervp != NULL) { \ VFS_PROLOGUE(MP); \ (*(MP)->mnt_op->vfs_unlink_lowervp)((MP), (VP)); \ VFS_EPILOGUE(MP); \ } \ } while (0) #define VFS_PURGE(MP) do { \ if (*(MP)->mnt_op->vfs_purge != NULL) { \ VFS_PROLOGUE(MP); \ (*(MP)->mnt_op->vfs_purge)(MP); \ VFS_EPILOGUE(MP); \ } \ } while (0) #define VFS_KNOTE_LOCKED(vp, hint) do \ { \ if (((vp)->v_vflag & VV_NOKNOTE) == 0) \ VN_KNOTE((vp), (hint), KNF_LISTLOCKED); \ } while (0) #define VFS_KNOTE_UNLOCKED(vp, hint) do \ { \ if (((vp)->v_vflag & VV_NOKNOTE) == 0) \ VN_KNOTE((vp), (hint), 0); \ } while (0) #define VFS_NOTIFY_UPPER_RECLAIM 1 #define VFS_NOTIFY_UPPER_UNLINK 2 #include /* * Version numbers. */ #define VFS_VERSION_00 0x19660120 #define VFS_VERSION_01 0x20121030 #define VFS_VERSION VFS_VERSION_01 #define VFS_SET(vfsops, fsname, flags) \ static struct vfsconf fsname ## _vfsconf = { \ .vfc_version = VFS_VERSION, \ .vfc_name = #fsname, \ .vfc_vfsops = &vfsops, \ .vfc_typenum = -1, \ .vfc_flags = flags, \ }; \ static moduledata_t fsname ## _mod = { \ #fsname, \ vfs_modevent, \ & fsname ## _vfsconf \ }; \ DECLARE_MODULE(fsname, fsname ## _mod, SI_SUB_VFS, SI_ORDER_MIDDLE) /* * exported vnode operations */ int dounmount(struct mount *, int, struct thread *); int kernel_mount(struct mntarg *ma, uint64_t flags); int kernel_vmount(int flags, ...); struct mntarg *mount_arg(struct mntarg *ma, const char *name, const void *val, int len); struct mntarg *mount_argb(struct mntarg *ma, int flag, const char *name); struct mntarg *mount_argf(struct mntarg *ma, const char *name, const char *fmt, ...); struct mntarg *mount_argsu(struct mntarg *ma, const char *name, const void *val, int len); void statfs_scale_blocks(struct statfs *sf, long max_size); struct vfsconf *vfs_byname(const char *); struct vfsconf *vfs_byname_kld(const char *, struct thread *td, int *); void vfs_mount_destroy(struct mount *); void vfs_event_signal(fsid_t *, u_int32_t, intptr_t); void vfs_freeopts(struct vfsoptlist *opts); void vfs_deleteopt(struct vfsoptlist *opts, const char *name); int vfs_buildopts(struct uio *auio, struct vfsoptlist **options); int vfs_flagopt(struct vfsoptlist *opts, const char *name, uint64_t *w, uint64_t val); int vfs_getopt(struct vfsoptlist *, const char *, void **, int *); int vfs_getopt_pos(struct vfsoptlist *opts, const char *name); int vfs_getopt_size(struct vfsoptlist *opts, const char *name, off_t *value); char *vfs_getopts(struct vfsoptlist *, const char *, int *error); int vfs_copyopt(struct vfsoptlist *, const char *, void *, int); int vfs_filteropt(struct vfsoptlist *, const char **legal); void vfs_opterror(struct vfsoptlist *opts, const char *fmt, ...); int vfs_scanopt(struct vfsoptlist *opts, const char *name, const char *fmt, ...); int vfs_setopt(struct vfsoptlist *opts, const char *name, void *value, int len); int vfs_setopt_part(struct vfsoptlist *opts, const char *name, void *value, int len); int vfs_setopts(struct vfsoptlist *opts, const char *name, const char *value); int vfs_setpublicfs /* set publicly exported fs */ (struct mount *, struct netexport *, struct export_args *); void vfs_msync(struct mount *, int); int vfs_busy(struct mount *, int); int vfs_export /* process mount export info */ (struct mount *, struct export_args *); void vfs_allocate_syncvnode(struct mount *); void vfs_deallocate_syncvnode(struct mount *); int vfs_donmount(struct thread *td, uint64_t fsflags, struct uio *fsoptions); void vfs_getnewfsid(struct mount *); struct cdev *vfs_getrootfsid(struct mount *); struct mount *vfs_getvfs(fsid_t *); /* return vfs given fsid */ struct mount *vfs_busyfs(fsid_t *); int vfs_modevent(module_t, int, void *); void vfs_mount_error(struct mount *, const char *, ...); void vfs_mountroot(void); /* mount our root filesystem */ void vfs_mountedfrom(struct mount *, const char *from); void vfs_notify_upper(struct vnode *, int); void vfs_oexport_conv(const struct oexport_args *oexp, struct export_args *exp); void vfs_ref(struct mount *); void vfs_rel(struct mount *); struct mount *vfs_mount_alloc(struct vnode *, struct vfsconf *, const char *, struct ucred *); int vfs_suser(struct mount *, struct thread *); void vfs_unbusy(struct mount *); void vfs_unmountall(void); extern TAILQ_HEAD(mntlist, mount) mountlist; /* mounted filesystem list */ extern struct mtx mountlist_mtx; extern struct nfs_public nfs_pub; extern struct sx vfsconf_sx; #define vfsconf_lock() sx_xlock(&vfsconf_sx) #define vfsconf_unlock() sx_xunlock(&vfsconf_sx) #define vfsconf_slock() sx_slock(&vfsconf_sx) #define vfsconf_sunlock() sx_sunlock(&vfsconf_sx) /* * Declarations for these vfs default operations are located in * kern/vfs_default.c. They will be automatically used to replace * null entries in VFS ops tables when registering a new filesystem * type in the global table. */ vfs_root_t vfs_stdroot; vfs_quotactl_t vfs_stdquotactl; vfs_statfs_t vfs_stdstatfs; vfs_sync_t vfs_stdsync; vfs_sync_t vfs_stdnosync; vfs_vget_t vfs_stdvget; vfs_fhtovp_t vfs_stdfhtovp; vfs_checkexp_t vfs_stdcheckexp; vfs_init_t vfs_stdinit; vfs_uninit_t vfs_stduninit; vfs_extattrctl_t vfs_stdextattrctl; vfs_sysctl_t vfs_stdsysctl; void syncer_suspend(void); void syncer_resume(void); #else /* !_KERNEL */ #include struct stat; __BEGIN_DECLS int fhopen(const struct fhandle *, int); int fhstat(const struct fhandle *, struct stat *); int fhstatfs(const struct fhandle *, struct statfs *); int fstatfs(int, struct statfs *); int getfh(const char *, fhandle_t *); int getfsstat(struct statfs *, long, int); int getmntinfo(struct statfs **, int); int lgetfh(const char *, fhandle_t *); int mount(const char *, const char *, int, void *); int nmount(struct iovec *, unsigned int, int); int statfs(const char *, struct statfs *); int unmount(const char *, int); /* C library stuff */ int getvfsbyname(const char *, struct xvfsconf *); __END_DECLS #endif /* _KERNEL */ #endif /* !_SYS_MOUNT_H_ */ Index: user/ngie/more-tests/sys/sys/param.h =================================================================== --- user/ngie/more-tests/sys/sys/param.h (revision 281584) +++ user/ngie/more-tests/sys/sys/param.h (revision 281585) @@ -1,348 +1,348 @@ /*- * Copyright (c) 1982, 1986, 1989, 1993 * The Regents of the University of California. All rights reserved. * (c) UNIX System Laboratories, Inc. * All or some portions of this file are derived from material licensed * to the University of California by American Telephone and Telegraph * Co. or Unix System Laboratories, Inc. and are reproduced herein with * the permission of UNIX System Laboratories, Inc. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 4. Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * @(#)param.h 8.3 (Berkeley) 4/4/95 * $FreeBSD$ */ #ifndef _SYS_PARAM_H_ #define _SYS_PARAM_H_ #include #define BSD 199506 /* System version (year & month). */ #define BSD4_3 1 #define BSD4_4 1 /* * __FreeBSD_version numbers are documented in the Porter's Handbook. * If you bump the version for any reason, you should update the documentation * there. * Currently this lives here in the doc/ repository: * - * head/en_US.ISO8859-1/books/porters-handbook/book.xml + * head/en_US.ISO8859-1/books/porters-handbook/versions/chapter.xml * * scheme is: Rxx * 'R' is in the range 0 to 4 if this is a release branch or * x.0-CURRENT before RELENG_*_0 is created, otherwise 'R' is * in the range 5 to 9. */ #undef __FreeBSD_version -#define __FreeBSD_version 1100068 /* Master, propagated to newvers */ +#define __FreeBSD_version 1100069 /* Master, propagated to newvers */ /* * __FreeBSD_kernel__ indicates that this system uses the kernel of FreeBSD, * which by definition is always true on FreeBSD. This macro is also defined * on other systems that use the kernel of FreeBSD, such as GNU/kFreeBSD. * * It is tempting to use this macro in userland code when we want to enable * kernel-specific routines, and in fact it's fine to do this in code that * is part of FreeBSD itself. However, be aware that as presence of this * macro is still not widespread (e.g. older FreeBSD versions, 3rd party * compilers, etc), it is STRONGLY DISCOURAGED to check for this macro in * external applications without also checking for __FreeBSD__ as an * alternative. */ #undef __FreeBSD_kernel__ #define __FreeBSD_kernel__ #ifdef _KERNEL #define P_OSREL_SIGWAIT 700000 #define P_OSREL_SIGSEGV 700004 #define P_OSREL_MAP_ANON 800104 #define P_OSREL_MAP_FSTRICT 1100036 #define P_OSREL_MAJOR(x) ((x) / 100000) #endif #ifndef LOCORE #include #endif /* * Machine-independent constants (some used in following include files). * Redefined constants are from POSIX 1003.1 limits file. * * MAXCOMLEN should be >= sizeof(ac_comm) (see ) */ #include #define MAXCOMLEN 19 /* max command name remembered */ #define MAXINTERP PATH_MAX /* max interpreter file name length */ #define MAXLOGNAME 33 /* max login name length (incl. NUL) */ #define MAXUPRC CHILD_MAX /* max simultaneous processes */ #define NCARGS ARG_MAX /* max bytes for an exec function */ #define NGROUPS (NGROUPS_MAX+1) /* max number groups */ #define NOFILE OPEN_MAX /* max open files per process */ #define NOGROUP 65535 /* marker for empty group set member */ #define MAXHOSTNAMELEN 256 /* max hostname size */ #define SPECNAMELEN 63 /* max length of devicename */ /* More types and definitions used throughout the kernel. */ #ifdef _KERNEL #include #include #ifndef LOCORE #include #include #endif #ifndef FALSE #define FALSE 0 #endif #ifndef TRUE #define TRUE 1 #endif #endif #ifndef _KERNEL /* Signals. */ #include #endif /* Machine type dependent parameters. */ #include #ifndef _KERNEL #include #endif #ifndef DEV_BSHIFT #define DEV_BSHIFT 9 /* log2(DEV_BSIZE) */ #endif #define DEV_BSIZE (1<>PAGE_SHIFT) #endif /* * btodb() is messy and perhaps slow because `bytes' may be an off_t. We * want to shift an unsigned type to avoid sign extension and we don't * want to widen `bytes' unnecessarily. Assume that the result fits in * a daddr_t. */ #ifndef btodb #define btodb(bytes) /* calculates (bytes / DEV_BSIZE) */ \ (sizeof (bytes) > sizeof(long) \ ? (daddr_t)((unsigned long long)(bytes) >> DEV_BSHIFT) \ : (daddr_t)((unsigned long)(bytes) >> DEV_BSHIFT)) #endif #ifndef dbtob #define dbtob(db) /* calculates (db * DEV_BSIZE) */ \ ((off_t)(db) << DEV_BSHIFT) #endif #define PRIMASK 0x0ff #define PCATCH 0x100 /* OR'd with pri for tsleep to check signals */ #define PDROP 0x200 /* OR'd with pri to stop re-entry of interlock mutex */ #define NZERO 0 /* default "nice" */ #define NBBY 8 /* number of bits in a byte */ #define NBPW sizeof(int) /* number of bytes per word (integer) */ #define CMASK 022 /* default file mask: S_IWGRP|S_IWOTH */ #define NODEV (dev_t)(-1) /* non-existent device */ /* * File system parameters and macros. * * MAXBSIZE - Filesystems are made out of blocks of at most MAXBSIZE bytes * per block. MAXBSIZE may be made larger without effecting * any existing filesystems as long as it does not exceed MAXPHYS, * and may be made smaller at the risk of not being able to use * filesystems which require a block size exceeding MAXBSIZE. * * BKVASIZE - Nominal buffer space per buffer, in bytes. BKVASIZE is the * minimum KVM memory reservation the kernel is willing to make. * Filesystems can of course request smaller chunks. Actual * backing memory uses a chunk size of a page (PAGE_SIZE). * * If you make BKVASIZE too small you risk seriously fragmenting * the buffer KVM map which may slow things down a bit. If you * make it too big the kernel will not be able to optimally use * the KVM memory reserved for the buffer cache and will wind * up with too-few buffers. * * The default is 16384, roughly 2x the block size used by a * normal UFS filesystem. */ #define MAXBSIZE 65536 /* must be power of 2 */ #define BKVASIZE 16384 /* must be power of 2 */ #define BKVAMASK (BKVASIZE-1) /* * MAXPATHLEN defines the longest permissible path length after expanding * symbolic links. It is used to allocate a temporary buffer from the buffer * pool in which to do the name expansion, hence should be a power of two, * and must be less than or equal to MAXBSIZE. MAXSYMLINKS defines the * maximum number of symbolic links that may be expanded in a path name. * It should be set high enough to allow all legitimate uses, but halt * infinite loops reasonably quickly. */ #define MAXPATHLEN PATH_MAX #define MAXSYMLINKS 32 /* Bit map related macros. */ #define setbit(a,i) (((unsigned char *)(a))[(i)/NBBY] |= 1<<((i)%NBBY)) #define clrbit(a,i) (((unsigned char *)(a))[(i)/NBBY] &= ~(1<<((i)%NBBY))) #define isset(a,i) \ (((const unsigned char *)(a))[(i)/NBBY] & (1<<((i)%NBBY))) #define isclr(a,i) \ ((((const unsigned char *)(a))[(i)/NBBY] & (1<<((i)%NBBY))) == 0) /* Macros for counting and rounding. */ #ifndef howmany #define howmany(x, y) (((x)+((y)-1))/(y)) #endif #define nitems(x) (sizeof((x)) / sizeof((x)[0])) #define rounddown(x, y) (((x)/(y))*(y)) #define rounddown2(x, y) ((x)&(~((y)-1))) /* if y is power of two */ #define roundup(x, y) ((((x)+((y)-1))/(y))*(y)) /* to any y */ #define roundup2(x, y) (((x)+((y)-1))&(~((y)-1))) /* if y is powers of two */ #define powerof2(x) ((((x)-1)&(x))==0) /* Macros for min/max. */ #define MIN(a,b) (((a)<(b))?(a):(b)) #define MAX(a,b) (((a)>(b))?(a):(b)) #ifdef _KERNEL /* * Basic byte order function prototypes for non-inline functions. */ #ifndef LOCORE #ifndef _BYTEORDER_PROTOTYPED #define _BYTEORDER_PROTOTYPED __BEGIN_DECLS __uint32_t htonl(__uint32_t); __uint16_t htons(__uint16_t); __uint32_t ntohl(__uint32_t); __uint16_t ntohs(__uint16_t); __END_DECLS #endif #endif #ifndef lint #ifndef _BYTEORDER_FUNC_DEFINED #define _BYTEORDER_FUNC_DEFINED #define htonl(x) __htonl(x) #define htons(x) __htons(x) #define ntohl(x) __ntohl(x) #define ntohs(x) __ntohs(x) #endif /* !_BYTEORDER_FUNC_DEFINED */ #endif /* lint */ #endif /* _KERNEL */ /* * Scale factor for scaled integers used to count %cpu time and load avgs. * * The number of CPU `tick's that map to a unique `%age' can be expressed * by the formula (1 / (2 ^ (FSHIFT - 11))). The maximum load average that * can be calculated (assuming 32 bits) can be closely approximated using * the formula (2 ^ (2 * (16 - FSHIFT))) for (FSHIFT < 15). * * For the scheduler to maintain a 1:1 mapping of CPU `tick' to `%age', * FSHIFT must be at least 11; this gives us a maximum load avg of ~1024. */ #define FSHIFT 11 /* bits to right of fixed binary point */ #define FSCALE (1<> (PAGE_SHIFT - DEV_BSHIFT)) #define ctodb(db) /* calculates pages to devblks */ \ ((db) << (PAGE_SHIFT - DEV_BSHIFT)) /* * Old spelling of __containerof(). */ #define member2struct(s, m, x) \ ((struct s *)(void *)((char *)(x) - offsetof(struct s, m))) /* * Access a variable length array that has been declared as a fixed * length array. */ #define __PAST_END(array, offset) (((__typeof__(*(array)) *)(array))[offset]) #endif /* _SYS_PARAM_H_ */ Index: user/ngie/more-tests/sys/sys/syscallsubr.h =================================================================== --- user/ngie/more-tests/sys/sys/syscallsubr.h (revision 281584) +++ user/ngie/more-tests/sys/sys/syscallsubr.h (revision 281585) @@ -1,240 +1,240 @@ /*- * Copyright (c) 2002 Ian Dowse. All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * $FreeBSD$ */ #ifndef _SYS_SYSCALLSUBR_H_ #define _SYS_SYSCALLSUBR_H_ #include #include #include #include #include struct file; enum idtype; struct itimerval; struct image_args; struct jail; struct kevent; struct kevent_copyops; struct kld_file_stat; struct ksiginfo; struct mbuf; struct msghdr; struct msqid_ds; struct pollfd; struct ogetdirentries_args; struct rlimit; struct rusage; union semun; struct sendfile_args; struct sockaddr; struct stat; struct thr_param; struct __wrusage; int kern___getcwd(struct thread *td, char *buf, enum uio_seg bufseg, u_int buflen); int kern_accept(struct thread *td, int s, struct sockaddr **name, socklen_t *namelen, struct file **fp); int kern_accept4(struct thread *td, int s, struct sockaddr **name, socklen_t *namelen, int flags, struct file **fp); int kern_accessat(struct thread *td, int fd, char *path, enum uio_seg pathseg, int flags, int mode); int kern_adjtime(struct thread *td, struct timeval *delta, struct timeval *olddelta); int kern_alternate_path(struct thread *td, const char *prefix, const char *path, enum uio_seg pathseg, char **pathbuf, int create, int dirfd); int kern_bindat(struct thread *td, int dirfd, int fd, struct sockaddr *sa); int kern_cap_ioctls_limit(struct thread *td, int fd, u_long *cmds, size_t ncmds); int kern_chdir(struct thread *td, char *path, enum uio_seg pathseg); int kern_clock_getcpuclockid2(struct thread *td, id_t id, int which, clockid_t *clk_id); int kern_clock_getres(struct thread *td, clockid_t clock_id, struct timespec *ts); int kern_clock_gettime(struct thread *td, clockid_t clock_id, struct timespec *ats); int kern_clock_settime(struct thread *td, clockid_t clock_id, struct timespec *ats); int kern_close(struct thread *td, int fd); int kern_connectat(struct thread *td, int dirfd, int fd, struct sockaddr *sa); int kern_execve(struct thread *td, struct image_args *args, struct mac *mac_p); int kern_fchmodat(struct thread *td, int fd, char *path, enum uio_seg pathseg, mode_t mode, int flag); int kern_fchownat(struct thread *td, int fd, char *path, enum uio_seg pathseg, int uid, int gid, int flag); int kern_fcntl(struct thread *td, int fd, int cmd, intptr_t arg); int kern_fcntl_freebsd(struct thread *td, int fd, int cmd, long arg); int kern_fhstat(struct thread *td, fhandle_t fh, struct stat *buf); int kern_fhstatfs(struct thread *td, fhandle_t fh, struct statfs *buf); int kern_fstat(struct thread *td, int fd, struct stat *sbp); int kern_fstatfs(struct thread *td, int fd, struct statfs *buf); int kern_ftruncate(struct thread *td, int fd, off_t length); int kern_futimes(struct thread *td, int fd, struct timeval *tptr, enum uio_seg tptrseg); int kern_futimens(struct thread *td, int fd, struct timespec *tptr, enum uio_seg tptrseg); int kern_getdirentries(struct thread *td, int fd, char *buf, u_int count, long *basep, ssize_t *residp, enum uio_seg bufseg); int kern_getfsstat(struct thread *td, struct statfs **buf, size_t bufsize, - enum uio_seg bufseg, int flags); + size_t *countp, enum uio_seg bufseg, int flags); int kern_getitimer(struct thread *, u_int, struct itimerval *); int kern_getppid(struct thread *); int kern_getpeername(struct thread *td, int fd, struct sockaddr **sa, socklen_t *alen); int kern_getrusage(struct thread *td, int who, struct rusage *rup); int kern_getsockname(struct thread *td, int fd, struct sockaddr **sa, socklen_t *alen); int kern_getsockopt(struct thread *td, int s, int level, int name, void *optval, enum uio_seg valseg, socklen_t *valsize); int kern_ioctl(struct thread *td, int fd, u_long com, caddr_t data); int kern_jail(struct thread *td, struct jail *j); int kern_jail_get(struct thread *td, struct uio *options, int flags); int kern_jail_set(struct thread *td, struct uio *options, int flags); int kern_kevent(struct thread *td, int fd, int nchanges, int nevents, struct kevent_copyops *k_ops, const struct timespec *timeout); int kern_kldload(struct thread *td, const char *file, int *fileid); int kern_kldstat(struct thread *td, int fileid, struct kld_file_stat *stat); int kern_kldunload(struct thread *td, int fileid, int flags); int kern_linkat(struct thread *td, int fd1, int fd2, char *path1, char *path2, enum uio_seg segflg, int follow); int kern_lutimes(struct thread *td, char *path, enum uio_seg pathseg, struct timeval *tptr, enum uio_seg tptrseg); int kern_mkdirat(struct thread *td, int fd, char *path, enum uio_seg segflg, int mode); int kern_mkfifoat(struct thread *td, int fd, char *path, enum uio_seg pathseg, int mode); int kern_mknodat(struct thread *td, int fd, char *path, enum uio_seg pathseg, int mode, int dev); int kern_msgctl(struct thread *, int, int, struct msqid_ds *); int kern_msgsnd(struct thread *, int, const void *, size_t, int, long); int kern_msgrcv(struct thread *, int, void *, size_t, long, int, long *); int kern_nanosleep(struct thread *td, struct timespec *rqt, struct timespec *rmt); int kern_ogetdirentries(struct thread *td, struct ogetdirentries_args *uap, long *ploff); int kern_openat(struct thread *td, int fd, char *path, enum uio_seg pathseg, int flags, int mode); int kern_pathconf(struct thread *td, char *path, enum uio_seg pathseg, int name, u_long flags); int kern_pipe(struct thread *td, int fildes[2]); int kern_pipe2(struct thread *td, int fildes[2], int flags); int kern_poll(struct thread *td, struct pollfd *fds, u_int nfds, struct timespec *tsp, sigset_t *uset); int kern_posix_fadvise(struct thread *td, int fd, off_t offset, off_t len, int advice); int kern_posix_fallocate(struct thread *td, int fd, off_t offset, off_t len); int kern_procctl(struct thread *td, enum idtype idtype, id_t id, int com, void *data); int kern_preadv(struct thread *td, int fd, struct uio *auio, off_t offset); int kern_pselect(struct thread *td, int nd, fd_set *in, fd_set *ou, fd_set *ex, struct timeval *tvp, sigset_t *uset, int abi_nfdbits); int kern_ptrace(struct thread *td, int req, pid_t pid, void *addr, int data); int kern_pwritev(struct thread *td, int fd, struct uio *auio, off_t offset); int kern_readlinkat(struct thread *td, int fd, char *path, enum uio_seg pathseg, char *buf, enum uio_seg bufseg, size_t count); int kern_readv(struct thread *td, int fd, struct uio *auio); int kern_recvit(struct thread *td, int s, struct msghdr *mp, enum uio_seg fromseg, struct mbuf **controlp); int kern_renameat(struct thread *td, int oldfd, char *old, int newfd, char *new, enum uio_seg pathseg); int kern_rmdirat(struct thread *td, int fd, char *path, enum uio_seg pathseg); int kern_sched_rr_get_interval(struct thread *td, pid_t pid, struct timespec *ts); int kern_semctl(struct thread *td, int semid, int semnum, int cmd, union semun *arg, register_t *rval); int kern_select(struct thread *td, int nd, fd_set *fd_in, fd_set *fd_ou, fd_set *fd_ex, struct timeval *tvp, int abi_nfdbits); int kern_sendfile(struct thread *td, struct sendfile_args *uap, struct uio *hdr_uio, struct uio *trl_uio, int compat); int kern_sendit(struct thread *td, int s, struct msghdr *mp, int flags, struct mbuf *control, enum uio_seg segflg); int kern_setgroups(struct thread *td, u_int ngrp, gid_t *groups); int kern_setitimer(struct thread *, u_int, struct itimerval *, struct itimerval *); int kern_setrlimit(struct thread *, u_int, struct rlimit *); int kern_setsockopt(struct thread *td, int s, int level, int name, void *optval, enum uio_seg valseg, socklen_t valsize); int kern_settimeofday(struct thread *td, struct timeval *tv, struct timezone *tzp); int kern_shmat(struct thread *td, int shmid, const void *shmaddr, int shmflg); int kern_shmctl(struct thread *td, int shmid, int cmd, void *buf, size_t *bufsz); int kern_sigaction(struct thread *td, int sig, struct sigaction *act, struct sigaction *oact, int flags); int kern_sigaltstack(struct thread *td, stack_t *ss, stack_t *oss); int kern_sigprocmask(struct thread *td, int how, sigset_t *set, sigset_t *oset, int flags); int kern_sigsuspend(struct thread *td, sigset_t mask); int kern_sigtimedwait(struct thread *td, sigset_t waitset, struct ksiginfo *ksi, struct timespec *timeout); int kern_statat(struct thread *td, int flag, int fd, char *path, enum uio_seg pathseg, struct stat *sbp, void (*hook)(struct vnode *vp, struct stat *sbp)); int kern_statfs(struct thread *td, char *path, enum uio_seg pathseg, struct statfs *buf); int kern_symlinkat(struct thread *td, char *path1, int fd, char *path2, enum uio_seg segflg); int kern_ktimer_create(struct thread *td, clockid_t clock_id, struct sigevent *evp, int *timerid, int preset_id); int kern_ktimer_delete(struct thread *, int); int kern_ktimer_settime(struct thread *td, int timer_id, int flags, struct itimerspec *val, struct itimerspec *oval); int kern_ktimer_gettime(struct thread *td, int timer_id, struct itimerspec *val); int kern_ktimer_getoverrun(struct thread *td, int timer_id); int kern_thr_new(struct thread *td, struct thr_param *param); int kern_thr_suspend(struct thread *td, struct timespec *tsp); int kern_truncate(struct thread *td, char *path, enum uio_seg pathseg, off_t length); int kern_unlinkat(struct thread *td, int fd, char *path, enum uio_seg pathseg, ino_t oldinum); int kern_utimesat(struct thread *td, int fd, char *path, enum uio_seg pathseg, struct timeval *tptr, enum uio_seg tptrseg); int kern_utimensat(struct thread *td, int fd, char *path, enum uio_seg pathseg, struct timespec *tptr, enum uio_seg tptrseg, int follow); int kern_wait(struct thread *td, pid_t pid, int *status, int options, struct rusage *rup); int kern_wait6(struct thread *td, enum idtype idtype, id_t id, int *status, int options, struct __wrusage *wrup, siginfo_t *sip); int kern_writev(struct thread *td, int fd, struct uio *auio); int kern_socketpair(struct thread *td, int domain, int type, int protocol, int *rsv); /* flags for kern_sigaction */ #define KSA_OSIGSET 0x0001 /* uses osigact_t */ #define KSA_FREEBSD4 0x0002 /* uses ucontext4 */ #endif /* !_SYS_SYSCALLSUBR_H_ */ Index: user/ngie/more-tests/sys/ufs/ffs/ffs_vfsops.c =================================================================== --- user/ngie/more-tests/sys/ufs/ffs/ffs_vfsops.c (revision 281584) +++ user/ngie/more-tests/sys/ufs/ffs/ffs_vfsops.c (revision 281585) @@ -1,2216 +1,2217 @@ /*- * Copyright (c) 1989, 1991, 1993, 1994 * The Regents of the University of California. All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 4. Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * @(#)ffs_vfsops.c 8.31 (Berkeley) 5/20/95 */ #include __FBSDID("$FreeBSD$"); #include "opt_quota.h" #include "opt_ufs.h" #include "opt_ffs.h" #include "opt_ddb.h" #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include static uma_zone_t uma_inode, uma_ufs1, uma_ufs2; static int ffs_mountfs(struct vnode *, struct mount *, struct thread *); static void ffs_oldfscompat_read(struct fs *, struct ufsmount *, ufs2_daddr_t); static void ffs_ifree(struct ufsmount *ump, struct inode *ip); static int ffs_sync_lazy(struct mount *mp); static vfs_init_t ffs_init; static vfs_uninit_t ffs_uninit; static vfs_extattrctl_t ffs_extattrctl; static vfs_cmount_t ffs_cmount; static vfs_unmount_t ffs_unmount; static vfs_mount_t ffs_mount; static vfs_statfs_t ffs_statfs; static vfs_fhtovp_t ffs_fhtovp; static vfs_sync_t ffs_sync; static struct vfsops ufs_vfsops = { .vfs_extattrctl = ffs_extattrctl, .vfs_fhtovp = ffs_fhtovp, .vfs_init = ffs_init, .vfs_mount = ffs_mount, .vfs_cmount = ffs_cmount, .vfs_quotactl = ufs_quotactl, .vfs_root = ufs_root, .vfs_statfs = ffs_statfs, .vfs_sync = ffs_sync, .vfs_uninit = ffs_uninit, .vfs_unmount = ffs_unmount, .vfs_vget = ffs_vget, .vfs_susp_clean = process_deferred_inactive, }; VFS_SET(ufs_vfsops, ufs, 0); MODULE_VERSION(ufs, 1); static b_strategy_t ffs_geom_strategy; static b_write_t ffs_bufwrite; static struct buf_ops ffs_ops = { .bop_name = "FFS", .bop_write = ffs_bufwrite, .bop_strategy = ffs_geom_strategy, .bop_sync = bufsync, #ifdef NO_FFS_SNAPSHOT .bop_bdflush = bufbdflush, #else .bop_bdflush = ffs_bdflush, #endif }; /* * Note that userquota and groupquota options are not currently used * by UFS/FFS code and generally mount(8) does not pass those options * from userland, but they can be passed by loader(8) via * vfs.root.mountfrom.options. */ static const char *ffs_opts[] = { "acls", "async", "noatime", "noclusterr", "noclusterw", "noexec", "export", "force", "from", "groupquota", "multilabel", "nfsv4acls", "fsckpid", "snapshot", "nosuid", "suiddir", "nosymfollow", "sync", "union", "userquota", NULL }; static int ffs_mount(struct mount *mp) { struct vnode *devvp; struct thread *td; struct ufsmount *ump = NULL; struct fs *fs; pid_t fsckpid = 0; int error, flags; uint64_t mntorflags; accmode_t accmode; struct nameidata ndp; char *fspec; td = curthread; if (vfs_filteropt(mp->mnt_optnew, ffs_opts)) return (EINVAL); if (uma_inode == NULL) { uma_inode = uma_zcreate("FFS inode", sizeof(struct inode), NULL, NULL, NULL, NULL, UMA_ALIGN_PTR, 0); uma_ufs1 = uma_zcreate("FFS1 dinode", sizeof(struct ufs1_dinode), NULL, NULL, NULL, NULL, UMA_ALIGN_PTR, 0); uma_ufs2 = uma_zcreate("FFS2 dinode", sizeof(struct ufs2_dinode), NULL, NULL, NULL, NULL, UMA_ALIGN_PTR, 0); } vfs_deleteopt(mp->mnt_optnew, "groupquota"); vfs_deleteopt(mp->mnt_optnew, "userquota"); fspec = vfs_getopts(mp->mnt_optnew, "from", &error); if (error) return (error); mntorflags = 0; if (vfs_getopt(mp->mnt_optnew, "acls", NULL, NULL) == 0) mntorflags |= MNT_ACLS; if (vfs_getopt(mp->mnt_optnew, "snapshot", NULL, NULL) == 0) { mntorflags |= MNT_SNAPSHOT; /* * Once we have set the MNT_SNAPSHOT flag, do not * persist "snapshot" in the options list. */ vfs_deleteopt(mp->mnt_optnew, "snapshot"); vfs_deleteopt(mp->mnt_opt, "snapshot"); } if (vfs_getopt(mp->mnt_optnew, "fsckpid", NULL, NULL) == 0 && vfs_scanopt(mp->mnt_optnew, "fsckpid", "%d", &fsckpid) == 1) { /* * Once we have set the restricted PID, do not * persist "fsckpid" in the options list. */ vfs_deleteopt(mp->mnt_optnew, "fsckpid"); vfs_deleteopt(mp->mnt_opt, "fsckpid"); if (mp->mnt_flag & MNT_UPDATE) { if (VFSTOUFS(mp)->um_fs->fs_ronly == 0 && vfs_flagopt(mp->mnt_optnew, "ro", NULL, 0) == 0) { vfs_mount_error(mp, "Checker enable: Must be read-only"); return (EINVAL); } } else if (vfs_flagopt(mp->mnt_optnew, "ro", NULL, 0) == 0) { vfs_mount_error(mp, "Checker enable: Must be read-only"); return (EINVAL); } /* Set to -1 if we are done */ if (fsckpid == 0) fsckpid = -1; } if (vfs_getopt(mp->mnt_optnew, "nfsv4acls", NULL, NULL) == 0) { if (mntorflags & MNT_ACLS) { vfs_mount_error(mp, "\"acls\" and \"nfsv4acls\" options " "are mutually exclusive"); return (EINVAL); } mntorflags |= MNT_NFS4ACLS; } MNT_ILOCK(mp); mp->mnt_flag |= mntorflags; MNT_IUNLOCK(mp); /* * If updating, check whether changing from read-only to * read/write; if there is no device name, that's all we do. */ if (mp->mnt_flag & MNT_UPDATE) { ump = VFSTOUFS(mp); fs = ump->um_fs; devvp = ump->um_devvp; if (fsckpid == -1 && ump->um_fsckpid > 0) { if ((error = ffs_flushfiles(mp, WRITECLOSE, td)) != 0 || (error = ffs_sbupdate(ump, MNT_WAIT, 0)) != 0) return (error); DROP_GIANT(); g_topology_lock(); /* * Return to normal read-only mode. */ error = g_access(ump->um_cp, 0, -1, 0); g_topology_unlock(); PICKUP_GIANT(); ump->um_fsckpid = 0; } if (fs->fs_ronly == 0 && vfs_flagopt(mp->mnt_optnew, "ro", NULL, 0)) { /* * Flush any dirty data and suspend filesystem. */ if ((error = vn_start_write(NULL, &mp, V_WAIT)) != 0) return (error); error = vfs_write_suspend_umnt(mp); if (error != 0) return (error); /* * Check for and optionally get rid of files open * for writing. */ flags = WRITECLOSE; if (mp->mnt_flag & MNT_FORCE) flags |= FORCECLOSE; if (MOUNTEDSOFTDEP(mp)) { error = softdep_flushfiles(mp, flags, td); } else { error = ffs_flushfiles(mp, flags, td); } if (error) { vfs_write_resume(mp, 0); return (error); } if (fs->fs_pendingblocks != 0 || fs->fs_pendinginodes != 0) { printf("WARNING: %s Update error: blocks %jd " "files %d\n", fs->fs_fsmnt, (intmax_t)fs->fs_pendingblocks, fs->fs_pendinginodes); fs->fs_pendingblocks = 0; fs->fs_pendinginodes = 0; } if ((fs->fs_flags & (FS_UNCLEAN | FS_NEEDSFSCK)) == 0) fs->fs_clean = 1; if ((error = ffs_sbupdate(ump, MNT_WAIT, 0)) != 0) { fs->fs_ronly = 0; fs->fs_clean = 0; vfs_write_resume(mp, 0); return (error); } if (MOUNTEDSOFTDEP(mp)) softdep_unmount(mp); DROP_GIANT(); g_topology_lock(); /* * Drop our write and exclusive access. */ g_access(ump->um_cp, 0, -1, -1); g_topology_unlock(); PICKUP_GIANT(); fs->fs_ronly = 1; MNT_ILOCK(mp); mp->mnt_flag |= MNT_RDONLY; MNT_IUNLOCK(mp); /* * Allow the writers to note that filesystem * is ro now. */ vfs_write_resume(mp, 0); } if ((mp->mnt_flag & MNT_RELOAD) && (error = ffs_reload(mp, td, 0)) != 0) return (error); if (fs->fs_ronly && !vfs_flagopt(mp->mnt_optnew, "ro", NULL, 0)) { /* * If we are running a checker, do not allow upgrade. */ if (ump->um_fsckpid > 0) { vfs_mount_error(mp, "Active checker, cannot upgrade to write"); return (EINVAL); } /* * If upgrade to read-write by non-root, then verify * that user has necessary permissions on the device. */ vn_lock(devvp, LK_EXCLUSIVE | LK_RETRY); error = VOP_ACCESS(devvp, VREAD | VWRITE, td->td_ucred, td); if (error) error = priv_check(td, PRIV_VFS_MOUNT_PERM); if (error) { VOP_UNLOCK(devvp, 0); return (error); } VOP_UNLOCK(devvp, 0); fs->fs_flags &= ~FS_UNCLEAN; if (fs->fs_clean == 0) { fs->fs_flags |= FS_UNCLEAN; if ((mp->mnt_flag & MNT_FORCE) || ((fs->fs_flags & (FS_SUJ | FS_NEEDSFSCK)) == 0 && (fs->fs_flags & FS_DOSOFTDEP))) { printf("WARNING: %s was not properly " "dismounted\n", fs->fs_fsmnt); } else { vfs_mount_error(mp, "R/W mount of %s denied. %s.%s", fs->fs_fsmnt, "Filesystem is not clean - run fsck", (fs->fs_flags & FS_SUJ) == 0 ? "" : " Forced mount will invalidate" " journal contents"); return (EPERM); } } DROP_GIANT(); g_topology_lock(); /* * Request exclusive write access. */ error = g_access(ump->um_cp, 0, 1, 1); g_topology_unlock(); PICKUP_GIANT(); if (error) return (error); if ((error = vn_start_write(NULL, &mp, V_WAIT)) != 0) return (error); fs->fs_ronly = 0; MNT_ILOCK(mp); mp->mnt_flag &= ~MNT_RDONLY; MNT_IUNLOCK(mp); fs->fs_mtime = time_second; /* check to see if we need to start softdep */ if ((fs->fs_flags & FS_DOSOFTDEP) && (error = softdep_mount(devvp, mp, fs, td->td_ucred))){ vn_finished_write(mp); return (error); } fs->fs_clean = 0; if ((error = ffs_sbupdate(ump, MNT_WAIT, 0)) != 0) { vn_finished_write(mp); return (error); } if (fs->fs_snapinum[0] != 0) ffs_snapshot_mount(mp); vn_finished_write(mp); } /* * Soft updates is incompatible with "async", * so if we are doing softupdates stop the user * from setting the async flag in an update. * Softdep_mount() clears it in an initial mount * or ro->rw remount. */ if (MOUNTEDSOFTDEP(mp)) { /* XXX: Reset too late ? */ MNT_ILOCK(mp); mp->mnt_flag &= ~MNT_ASYNC; MNT_IUNLOCK(mp); } /* * Keep MNT_ACLS flag if it is stored in superblock. */ if ((fs->fs_flags & FS_ACLS) != 0) { /* XXX: Set too late ? */ MNT_ILOCK(mp); mp->mnt_flag |= MNT_ACLS; MNT_IUNLOCK(mp); } if ((fs->fs_flags & FS_NFS4ACLS) != 0) { /* XXX: Set too late ? */ MNT_ILOCK(mp); mp->mnt_flag |= MNT_NFS4ACLS; MNT_IUNLOCK(mp); } /* * If this is a request from fsck to clean up the filesystem, * then allow the specified pid to proceed. */ if (fsckpid > 0) { if (ump->um_fsckpid != 0) { vfs_mount_error(mp, "Active checker already running on %s", fs->fs_fsmnt); return (EINVAL); } KASSERT(MOUNTEDSOFTDEP(mp) == 0, ("soft updates enabled on read-only file system")); DROP_GIANT(); g_topology_lock(); /* * Request write access. */ error = g_access(ump->um_cp, 0, 1, 0); g_topology_unlock(); PICKUP_GIANT(); if (error) { vfs_mount_error(mp, "Checker activation failed on %s", fs->fs_fsmnt); return (error); } ump->um_fsckpid = fsckpid; if (fs->fs_snapinum[0] != 0) ffs_snapshot_mount(mp); fs->fs_mtime = time_second; fs->fs_fmod = 1; fs->fs_clean = 0; (void) ffs_sbupdate(ump, MNT_WAIT, 0); } /* * If this is a snapshot request, take the snapshot. */ if (mp->mnt_flag & MNT_SNAPSHOT) return (ffs_snapshot(mp, fspec)); } /* * Not an update, or updating the name: look up the name * and verify that it refers to a sensible disk device. */ NDINIT(&ndp, LOOKUP, FOLLOW | LOCKLEAF, UIO_SYSSPACE, fspec, td); if ((error = namei(&ndp)) != 0) return (error); NDFREE(&ndp, NDF_ONLY_PNBUF); devvp = ndp.ni_vp; if (!vn_isdisk(devvp, &error)) { vput(devvp); return (error); } /* * If mount by non-root, then verify that user has necessary * permissions on the device. */ accmode = VREAD; if ((mp->mnt_flag & MNT_RDONLY) == 0) accmode |= VWRITE; error = VOP_ACCESS(devvp, accmode, td->td_ucred, td); if (error) error = priv_check(td, PRIV_VFS_MOUNT_PERM); if (error) { vput(devvp); return (error); } if (mp->mnt_flag & MNT_UPDATE) { /* * Update only * * If it's not the same vnode, or at least the same device * then it's not correct. */ if (devvp->v_rdev != ump->um_devvp->v_rdev) error = EINVAL; /* needs translation */ vput(devvp); if (error) return (error); } else { /* * New mount * * We need the name for the mount point (also used for * "last mounted on") copied in. If an error occurs, * the mount point is discarded by the upper level code. * Note that vfs_mount() populates f_mntonname for us. */ if ((error = ffs_mountfs(devvp, mp, td)) != 0) { vrele(devvp); return (error); } if (fsckpid > 0) { KASSERT(MOUNTEDSOFTDEP(mp) == 0, ("soft updates enabled on read-only file system")); ump = VFSTOUFS(mp); fs = ump->um_fs; DROP_GIANT(); g_topology_lock(); /* * Request write access. */ error = g_access(ump->um_cp, 0, 1, 0); g_topology_unlock(); PICKUP_GIANT(); if (error) { printf("WARNING: %s: Checker activation " "failed\n", fs->fs_fsmnt); } else { ump->um_fsckpid = fsckpid; if (fs->fs_snapinum[0] != 0) ffs_snapshot_mount(mp); fs->fs_mtime = time_second; fs->fs_clean = 0; (void) ffs_sbupdate(ump, MNT_WAIT, 0); } } } vfs_mountedfrom(mp, fspec); return (0); } /* * Compatibility with old mount system call. */ static int ffs_cmount(struct mntarg *ma, void *data, uint64_t flags) { struct ufs_args args; struct export_args exp; int error; if (data == NULL) return (EINVAL); error = copyin(data, &args, sizeof args); if (error) return (error); vfs_oexport_conv(&args.export, &exp); ma = mount_argsu(ma, "from", args.fspec, MAXPATHLEN); ma = mount_arg(ma, "export", &exp, sizeof(exp)); error = kernel_mount(ma, flags); return (error); } /* * Reload all incore data for a filesystem (used after running fsck on * the root filesystem and finding things to fix). If the 'force' flag * is 0, the filesystem must be mounted read-only. * * Things to do to update the mount: * 1) invalidate all cached meta-data. * 2) re-read superblock from disk. * 3) re-read summary information from disk. * 4) invalidate all inactive vnodes. * 5) invalidate all cached file data. * 6) re-read inode data for all active vnodes. */ int ffs_reload(struct mount *mp, struct thread *td, int force) { struct vnode *vp, *mvp, *devvp; struct inode *ip; void *space; struct buf *bp; struct fs *fs, *newfs; struct ufsmount *ump; ufs2_daddr_t sblockloc; int i, blks, size, error; int32_t *lp; ump = VFSTOUFS(mp); MNT_ILOCK(mp); if ((mp->mnt_flag & MNT_RDONLY) == 0 && force == 0) { MNT_IUNLOCK(mp); return (EINVAL); } MNT_IUNLOCK(mp); /* * Step 1: invalidate all cached meta-data. */ devvp = VFSTOUFS(mp)->um_devvp; vn_lock(devvp, LK_EXCLUSIVE | LK_RETRY); if (vinvalbuf(devvp, 0, 0, 0) != 0) panic("ffs_reload: dirty1"); VOP_UNLOCK(devvp, 0); /* * Step 2: re-read superblock from disk. */ fs = VFSTOUFS(mp)->um_fs; if ((error = bread(devvp, btodb(fs->fs_sblockloc), fs->fs_sbsize, NOCRED, &bp)) != 0) return (error); newfs = (struct fs *)bp->b_data; if ((newfs->fs_magic != FS_UFS1_MAGIC && newfs->fs_magic != FS_UFS2_MAGIC) || newfs->fs_bsize > MAXBSIZE || newfs->fs_bsize < sizeof(struct fs)) { brelse(bp); return (EIO); /* XXX needs translation */ } /* * Copy pointer fields back into superblock before copying in XXX * new superblock. These should really be in the ufsmount. XXX * Note that important parameters (eg fs_ncg) are unchanged. */ newfs->fs_csp = fs->fs_csp; newfs->fs_maxcluster = fs->fs_maxcluster; newfs->fs_contigdirs = fs->fs_contigdirs; newfs->fs_active = fs->fs_active; newfs->fs_ronly = fs->fs_ronly; sblockloc = fs->fs_sblockloc; bcopy(newfs, fs, (u_int)fs->fs_sbsize); brelse(bp); mp->mnt_maxsymlinklen = fs->fs_maxsymlinklen; ffs_oldfscompat_read(fs, VFSTOUFS(mp), sblockloc); UFS_LOCK(ump); if (fs->fs_pendingblocks != 0 || fs->fs_pendinginodes != 0) { printf("WARNING: %s: reload pending error: blocks %jd " "files %d\n", fs->fs_fsmnt, (intmax_t)fs->fs_pendingblocks, fs->fs_pendinginodes); fs->fs_pendingblocks = 0; fs->fs_pendinginodes = 0; } UFS_UNLOCK(ump); /* * Step 3: re-read summary information from disk. */ size = fs->fs_cssize; blks = howmany(size, fs->fs_fsize); if (fs->fs_contigsumsize > 0) size += fs->fs_ncg * sizeof(int32_t); size += fs->fs_ncg * sizeof(u_int8_t); free(fs->fs_csp, M_UFSMNT); space = malloc((u_long)size, M_UFSMNT, M_WAITOK); fs->fs_csp = space; for (i = 0; i < blks; i += fs->fs_frag) { size = fs->fs_bsize; if (i + fs->fs_frag > blks) size = (blks - i) * fs->fs_fsize; error = bread(devvp, fsbtodb(fs, fs->fs_csaddr + i), size, NOCRED, &bp); if (error) return (error); bcopy(bp->b_data, space, (u_int)size); space = (char *)space + size; brelse(bp); } /* * We no longer know anything about clusters per cylinder group. */ if (fs->fs_contigsumsize > 0) { fs->fs_maxcluster = lp = space; for (i = 0; i < fs->fs_ncg; i++) *lp++ = fs->fs_contigsumsize; space = lp; } size = fs->fs_ncg * sizeof(u_int8_t); fs->fs_contigdirs = (u_int8_t *)space; bzero(fs->fs_contigdirs, size); loop: MNT_VNODE_FOREACH_ALL(vp, mp, mvp) { /* * Skip syncer vnode. */ if (vp->v_type == VNON) { VI_UNLOCK(vp); continue; } /* * Step 4: invalidate all cached file data. */ if (vget(vp, LK_EXCLUSIVE | LK_INTERLOCK, td)) { MNT_VNODE_FOREACH_ALL_ABORT(mp, mvp); goto loop; } if (vinvalbuf(vp, 0, 0, 0)) panic("ffs_reload: dirty2"); /* * Step 5: re-read inode data for all active vnodes. */ ip = VTOI(vp); error = bread(devvp, fsbtodb(fs, ino_to_fsba(fs, ip->i_number)), (int)fs->fs_bsize, NOCRED, &bp); if (error) { VOP_UNLOCK(vp, 0); vrele(vp); MNT_VNODE_FOREACH_ALL_ABORT(mp, mvp); return (error); } ffs_load_inode(bp, ip, fs, ip->i_number); ip->i_effnlink = ip->i_nlink; brelse(bp); VOP_UNLOCK(vp, 0); vrele(vp); } return (0); } /* * Possible superblock locations ordered from most to least likely. */ static int sblock_try[] = SBLOCKSEARCH; /* * Common code for mount and mountroot */ static int ffs_mountfs(devvp, mp, td) struct vnode *devvp; struct mount *mp; struct thread *td; { struct ufsmount *ump; struct buf *bp; struct fs *fs; struct cdev *dev; void *space; ufs2_daddr_t sblockloc; int error, i, blks, size, ronly; int32_t *lp; struct ucred *cred; struct g_consumer *cp; struct mount *nmp; bp = NULL; ump = NULL; cred = td ? td->td_ucred : NOCRED; ronly = (mp->mnt_flag & MNT_RDONLY) != 0; dev = devvp->v_rdev; dev_ref(dev); DROP_GIANT(); g_topology_lock(); error = g_vfs_open(devvp, &cp, "ffs", ronly ? 0 : 1); g_topology_unlock(); PICKUP_GIANT(); VOP_UNLOCK(devvp, 0); if (error) goto out; if (devvp->v_rdev->si_iosize_max != 0) mp->mnt_iosize_max = devvp->v_rdev->si_iosize_max; if (mp->mnt_iosize_max > MAXPHYS) mp->mnt_iosize_max = MAXPHYS; devvp->v_bufobj.bo_ops = &ffs_ops; fs = NULL; sblockloc = 0; /* * Try reading the superblock in each of its possible locations. */ for (i = 0; sblock_try[i] != -1; i++) { if ((SBLOCKSIZE % cp->provider->sectorsize) != 0) { error = EINVAL; vfs_mount_error(mp, "Invalid sectorsize %d for superblock size %d", cp->provider->sectorsize, SBLOCKSIZE); goto out; } if ((error = bread(devvp, btodb(sblock_try[i]), SBLOCKSIZE, cred, &bp)) != 0) goto out; fs = (struct fs *)bp->b_data; sblockloc = sblock_try[i]; if ((fs->fs_magic == FS_UFS1_MAGIC || (fs->fs_magic == FS_UFS2_MAGIC && (fs->fs_sblockloc == sblockloc || (fs->fs_old_flags & FS_FLAGS_UPDATED) == 0))) && fs->fs_bsize <= MAXBSIZE && fs->fs_bsize >= sizeof(struct fs)) break; brelse(bp); bp = NULL; } if (sblock_try[i] == -1) { error = EINVAL; /* XXX needs translation */ goto out; } fs->fs_fmod = 0; fs->fs_flags &= ~FS_INDEXDIRS; /* no support for directory indicies */ fs->fs_flags &= ~FS_UNCLEAN; if (fs->fs_clean == 0) { fs->fs_flags |= FS_UNCLEAN; if (ronly || (mp->mnt_flag & MNT_FORCE) || ((fs->fs_flags & (FS_SUJ | FS_NEEDSFSCK)) == 0 && (fs->fs_flags & FS_DOSOFTDEP))) { printf("WARNING: %s was not properly dismounted\n", fs->fs_fsmnt); } else { vfs_mount_error(mp, "R/W mount of %s denied. %s%s", fs->fs_fsmnt, "Filesystem is not clean - run fsck.", (fs->fs_flags & FS_SUJ) == 0 ? "" : " Forced mount will invalidate journal contents"); error = EPERM; goto out; } if ((fs->fs_pendingblocks != 0 || fs->fs_pendinginodes != 0) && (mp->mnt_flag & MNT_FORCE)) { printf("WARNING: %s: lost blocks %jd files %d\n", fs->fs_fsmnt, (intmax_t)fs->fs_pendingblocks, fs->fs_pendinginodes); fs->fs_pendingblocks = 0; fs->fs_pendinginodes = 0; } } if (fs->fs_pendingblocks != 0 || fs->fs_pendinginodes != 0) { printf("WARNING: %s: mount pending error: blocks %jd " "files %d\n", fs->fs_fsmnt, (intmax_t)fs->fs_pendingblocks, fs->fs_pendinginodes); fs->fs_pendingblocks = 0; fs->fs_pendinginodes = 0; } if ((fs->fs_flags & FS_GJOURNAL) != 0) { #ifdef UFS_GJOURNAL /* * Get journal provider name. */ size = 1024; mp->mnt_gjprovider = malloc(size, M_UFSMNT, M_WAITOK); if (g_io_getattr("GJOURNAL::provider", cp, &size, mp->mnt_gjprovider) == 0) { mp->mnt_gjprovider = realloc(mp->mnt_gjprovider, size, M_UFSMNT, M_WAITOK); MNT_ILOCK(mp); mp->mnt_flag |= MNT_GJOURNAL; MNT_IUNLOCK(mp); } else { printf("WARNING: %s: GJOURNAL flag on fs " "but no gjournal provider below\n", mp->mnt_stat.f_mntonname); free(mp->mnt_gjprovider, M_UFSMNT); mp->mnt_gjprovider = NULL; } #else printf("WARNING: %s: GJOURNAL flag on fs but no " "UFS_GJOURNAL support\n", mp->mnt_stat.f_mntonname); #endif } else { mp->mnt_gjprovider = NULL; } ump = malloc(sizeof *ump, M_UFSMNT, M_WAITOK | M_ZERO); ump->um_cp = cp; ump->um_bo = &devvp->v_bufobj; ump->um_fs = malloc((u_long)fs->fs_sbsize, M_UFSMNT, M_WAITOK); if (fs->fs_magic == FS_UFS1_MAGIC) { ump->um_fstype = UFS1; ump->um_balloc = ffs_balloc_ufs1; } else { ump->um_fstype = UFS2; ump->um_balloc = ffs_balloc_ufs2; } ump->um_blkatoff = ffs_blkatoff; ump->um_truncate = ffs_truncate; ump->um_update = ffs_update; ump->um_valloc = ffs_valloc; ump->um_vfree = ffs_vfree; ump->um_ifree = ffs_ifree; ump->um_rdonly = ffs_rdonly; ump->um_snapgone = ffs_snapgone; mtx_init(UFS_MTX(ump), "FFS", "FFS Lock", MTX_DEF); bcopy(bp->b_data, ump->um_fs, (u_int)fs->fs_sbsize); if (fs->fs_sbsize < SBLOCKSIZE) bp->b_flags |= B_INVAL | B_NOCACHE; brelse(bp); bp = NULL; fs = ump->um_fs; ffs_oldfscompat_read(fs, ump, sblockloc); fs->fs_ronly = ronly; size = fs->fs_cssize; blks = howmany(size, fs->fs_fsize); if (fs->fs_contigsumsize > 0) size += fs->fs_ncg * sizeof(int32_t); size += fs->fs_ncg * sizeof(u_int8_t); space = malloc((u_long)size, M_UFSMNT, M_WAITOK); fs->fs_csp = space; for (i = 0; i < blks; i += fs->fs_frag) { size = fs->fs_bsize; if (i + fs->fs_frag > blks) size = (blks - i) * fs->fs_fsize; if ((error = bread(devvp, fsbtodb(fs, fs->fs_csaddr + i), size, cred, &bp)) != 0) { free(fs->fs_csp, M_UFSMNT); goto out; } bcopy(bp->b_data, space, (u_int)size); space = (char *)space + size; brelse(bp); bp = NULL; } if (fs->fs_contigsumsize > 0) { fs->fs_maxcluster = lp = space; for (i = 0; i < fs->fs_ncg; i++) *lp++ = fs->fs_contigsumsize; space = lp; } size = fs->fs_ncg * sizeof(u_int8_t); fs->fs_contigdirs = (u_int8_t *)space; bzero(fs->fs_contigdirs, size); fs->fs_active = NULL; mp->mnt_data = ump; mp->mnt_stat.f_fsid.val[0] = fs->fs_id[0]; mp->mnt_stat.f_fsid.val[1] = fs->fs_id[1]; nmp = NULL; if (fs->fs_id[0] == 0 || fs->fs_id[1] == 0 || (nmp = vfs_getvfs(&mp->mnt_stat.f_fsid))) { if (nmp) vfs_rel(nmp); vfs_getnewfsid(mp); } mp->mnt_maxsymlinklen = fs->fs_maxsymlinklen; MNT_ILOCK(mp); mp->mnt_flag |= MNT_LOCAL; MNT_IUNLOCK(mp); if ((fs->fs_flags & FS_MULTILABEL) != 0) { #ifdef MAC MNT_ILOCK(mp); mp->mnt_flag |= MNT_MULTILABEL; MNT_IUNLOCK(mp); #else printf("WARNING: %s: multilabel flag on fs but " "no MAC support\n", mp->mnt_stat.f_mntonname); #endif } if ((fs->fs_flags & FS_ACLS) != 0) { #ifdef UFS_ACL MNT_ILOCK(mp); if (mp->mnt_flag & MNT_NFS4ACLS) printf("WARNING: %s: ACLs flag on fs conflicts with " "\"nfsv4acls\" mount option; option ignored\n", mp->mnt_stat.f_mntonname); mp->mnt_flag &= ~MNT_NFS4ACLS; mp->mnt_flag |= MNT_ACLS; MNT_IUNLOCK(mp); #else printf("WARNING: %s: ACLs flag on fs but no ACLs support\n", mp->mnt_stat.f_mntonname); #endif } if ((fs->fs_flags & FS_NFS4ACLS) != 0) { #ifdef UFS_ACL MNT_ILOCK(mp); if (mp->mnt_flag & MNT_ACLS) printf("WARNING: %s: NFSv4 ACLs flag on fs conflicts " "with \"acls\" mount option; option ignored\n", mp->mnt_stat.f_mntonname); mp->mnt_flag &= ~MNT_ACLS; mp->mnt_flag |= MNT_NFS4ACLS; MNT_IUNLOCK(mp); #else printf("WARNING: %s: NFSv4 ACLs flag on fs but no " "ACLs support\n", mp->mnt_stat.f_mntonname); #endif } if ((fs->fs_flags & FS_TRIM) != 0) { size = sizeof(int); if (g_io_getattr("GEOM::candelete", cp, &size, &ump->um_candelete) == 0) { if (!ump->um_candelete) printf("WARNING: %s: TRIM flag on fs but disk " "does not support TRIM\n", mp->mnt_stat.f_mntonname); } else { printf("WARNING: %s: TRIM flag on fs but disk does " "not confirm that it supports TRIM\n", mp->mnt_stat.f_mntonname); ump->um_candelete = 0; } } ump->um_mountp = mp; ump->um_dev = dev; ump->um_devvp = devvp; ump->um_nindir = fs->fs_nindir; ump->um_bptrtodb = fs->fs_fsbtodb; ump->um_seqinc = fs->fs_frag; for (i = 0; i < MAXQUOTAS; i++) ump->um_quotas[i] = NULLVP; #ifdef UFS_EXTATTR ufs_extattr_uepm_init(&ump->um_extattr); #endif /* * Set FS local "last mounted on" information (NULL pad) */ bzero(fs->fs_fsmnt, MAXMNTLEN); strlcpy(fs->fs_fsmnt, mp->mnt_stat.f_mntonname, MAXMNTLEN); mp->mnt_stat.f_iosize = fs->fs_bsize; if (mp->mnt_flag & MNT_ROOTFS) { /* * Root mount; update timestamp in mount structure. * this will be used by the common root mount code * to update the system clock. */ mp->mnt_time = fs->fs_time; } if (ronly == 0) { fs->fs_mtime = time_second; if ((fs->fs_flags & FS_DOSOFTDEP) && (error = softdep_mount(devvp, mp, fs, cred)) != 0) { free(fs->fs_csp, M_UFSMNT); ffs_flushfiles(mp, FORCECLOSE, td); goto out; } if (devvp->v_type == VCHR && devvp->v_rdev != NULL) devvp->v_rdev->si_mountpt = mp; if (fs->fs_snapinum[0] != 0) ffs_snapshot_mount(mp); fs->fs_fmod = 1; fs->fs_clean = 0; (void) ffs_sbupdate(ump, MNT_WAIT, 0); } /* * Initialize filesystem stat information in mount struct. */ MNT_ILOCK(mp); mp->mnt_kern_flag |= MNTK_LOOKUP_SHARED | MNTK_EXTENDED_SHARED | - MNTK_NO_IOPF | MNTK_UNMAPPED_BUFS | MNTK_SUSPENDABLE; + MNTK_NO_IOPF | MNTK_UNMAPPED_BUFS | MNTK_SUSPENDABLE | + MNTK_USES_BCACHE; MNT_IUNLOCK(mp); #ifdef UFS_EXTATTR #ifdef UFS_EXTATTR_AUTOSTART /* * * Auto-starting does the following: * - check for /.attribute in the fs, and extattr_start if so * - for each file in .attribute, enable that file with * an attribute of the same name. * Not clear how to report errors -- probably eat them. * This would all happen while the filesystem was busy/not * available, so would effectively be "atomic". */ (void) ufs_extattr_autostart(mp, td); #endif /* !UFS_EXTATTR_AUTOSTART */ #endif /* !UFS_EXTATTR */ return (0); out: if (bp) brelse(bp); if (cp != NULL) { DROP_GIANT(); g_topology_lock(); g_vfs_close(cp); g_topology_unlock(); PICKUP_GIANT(); } if (ump) { mtx_destroy(UFS_MTX(ump)); if (mp->mnt_gjprovider != NULL) { free(mp->mnt_gjprovider, M_UFSMNT); mp->mnt_gjprovider = NULL; } free(ump->um_fs, M_UFSMNT); free(ump, M_UFSMNT); mp->mnt_data = NULL; } dev_rel(dev); return (error); } #include static int bigcgs = 0; SYSCTL_INT(_debug, OID_AUTO, bigcgs, CTLFLAG_RW, &bigcgs, 0, ""); /* * Sanity checks for loading old filesystem superblocks. * See ffs_oldfscompat_write below for unwound actions. * * XXX - Parts get retired eventually. * Unfortunately new bits get added. */ static void ffs_oldfscompat_read(fs, ump, sblockloc) struct fs *fs; struct ufsmount *ump; ufs2_daddr_t sblockloc; { off_t maxfilesize; /* * If not yet done, update fs_flags location and value of fs_sblockloc. */ if ((fs->fs_old_flags & FS_FLAGS_UPDATED) == 0) { fs->fs_flags = fs->fs_old_flags; fs->fs_old_flags |= FS_FLAGS_UPDATED; fs->fs_sblockloc = sblockloc; } /* * If not yet done, update UFS1 superblock with new wider fields. */ if (fs->fs_magic == FS_UFS1_MAGIC && fs->fs_maxbsize != fs->fs_bsize) { fs->fs_maxbsize = fs->fs_bsize; fs->fs_time = fs->fs_old_time; fs->fs_size = fs->fs_old_size; fs->fs_dsize = fs->fs_old_dsize; fs->fs_csaddr = fs->fs_old_csaddr; fs->fs_cstotal.cs_ndir = fs->fs_old_cstotal.cs_ndir; fs->fs_cstotal.cs_nbfree = fs->fs_old_cstotal.cs_nbfree; fs->fs_cstotal.cs_nifree = fs->fs_old_cstotal.cs_nifree; fs->fs_cstotal.cs_nffree = fs->fs_old_cstotal.cs_nffree; } if (fs->fs_magic == FS_UFS1_MAGIC && fs->fs_old_inodefmt < FS_44INODEFMT) { fs->fs_maxfilesize = ((uint64_t)1 << 31) - 1; fs->fs_qbmask = ~fs->fs_bmask; fs->fs_qfmask = ~fs->fs_fmask; } if (fs->fs_magic == FS_UFS1_MAGIC) { ump->um_savedmaxfilesize = fs->fs_maxfilesize; maxfilesize = (uint64_t)0x80000000 * fs->fs_bsize - 1; if (fs->fs_maxfilesize > maxfilesize) fs->fs_maxfilesize = maxfilesize; } /* Compatibility for old filesystems */ if (fs->fs_avgfilesize <= 0) fs->fs_avgfilesize = AVFILESIZ; if (fs->fs_avgfpdir <= 0) fs->fs_avgfpdir = AFPDIR; if (bigcgs) { fs->fs_save_cgsize = fs->fs_cgsize; fs->fs_cgsize = fs->fs_bsize; } } /* * Unwinding superblock updates for old filesystems. * See ffs_oldfscompat_read above for details. * * XXX - Parts get retired eventually. * Unfortunately new bits get added. */ void ffs_oldfscompat_write(fs, ump) struct fs *fs; struct ufsmount *ump; { /* * Copy back UFS2 updated fields that UFS1 inspects. */ if (fs->fs_magic == FS_UFS1_MAGIC) { fs->fs_old_time = fs->fs_time; fs->fs_old_cstotal.cs_ndir = fs->fs_cstotal.cs_ndir; fs->fs_old_cstotal.cs_nbfree = fs->fs_cstotal.cs_nbfree; fs->fs_old_cstotal.cs_nifree = fs->fs_cstotal.cs_nifree; fs->fs_old_cstotal.cs_nffree = fs->fs_cstotal.cs_nffree; fs->fs_maxfilesize = ump->um_savedmaxfilesize; } if (bigcgs) { fs->fs_cgsize = fs->fs_save_cgsize; fs->fs_save_cgsize = 0; } } /* * unmount system call */ static int ffs_unmount(mp, mntflags) struct mount *mp; int mntflags; { struct thread *td; struct ufsmount *ump = VFSTOUFS(mp); struct fs *fs; int error, flags, susp; #ifdef UFS_EXTATTR int e_restart; #endif flags = 0; td = curthread; fs = ump->um_fs; susp = 0; if (mntflags & MNT_FORCE) { flags |= FORCECLOSE; susp = fs->fs_ronly == 0; } #ifdef UFS_EXTATTR if ((error = ufs_extattr_stop(mp, td))) { if (error != EOPNOTSUPP) printf("WARNING: unmount %s: ufs_extattr_stop " "returned errno %d\n", mp->mnt_stat.f_mntonname, error); e_restart = 0; } else { ufs_extattr_uepm_destroy(&ump->um_extattr); e_restart = 1; } #endif if (susp) { error = vfs_write_suspend_umnt(mp); if (error != 0) goto fail1; } if (MOUNTEDSOFTDEP(mp)) error = softdep_flushfiles(mp, flags, td); else error = ffs_flushfiles(mp, flags, td); if (error != 0 && error != ENXIO) goto fail; UFS_LOCK(ump); if (fs->fs_pendingblocks != 0 || fs->fs_pendinginodes != 0) { printf("WARNING: unmount %s: pending error: blocks %jd " "files %d\n", fs->fs_fsmnt, (intmax_t)fs->fs_pendingblocks, fs->fs_pendinginodes); fs->fs_pendingblocks = 0; fs->fs_pendinginodes = 0; } UFS_UNLOCK(ump); if (MOUNTEDSOFTDEP(mp)) softdep_unmount(mp); if (fs->fs_ronly == 0 || ump->um_fsckpid > 0) { fs->fs_clean = fs->fs_flags & (FS_UNCLEAN|FS_NEEDSFSCK) ? 0 : 1; error = ffs_sbupdate(ump, MNT_WAIT, 0); if (error && error != ENXIO) { fs->fs_clean = 0; goto fail; } } if (susp) vfs_write_resume(mp, VR_START_WRITE); DROP_GIANT(); g_topology_lock(); if (ump->um_fsckpid > 0) { /* * Return to normal read-only mode. */ error = g_access(ump->um_cp, 0, -1, 0); ump->um_fsckpid = 0; } g_vfs_close(ump->um_cp); g_topology_unlock(); PICKUP_GIANT(); if (ump->um_devvp->v_type == VCHR && ump->um_devvp->v_rdev != NULL) ump->um_devvp->v_rdev->si_mountpt = NULL; vrele(ump->um_devvp); dev_rel(ump->um_dev); mtx_destroy(UFS_MTX(ump)); if (mp->mnt_gjprovider != NULL) { free(mp->mnt_gjprovider, M_UFSMNT); mp->mnt_gjprovider = NULL; } free(fs->fs_csp, M_UFSMNT); free(fs, M_UFSMNT); free(ump, M_UFSMNT); mp->mnt_data = NULL; MNT_ILOCK(mp); mp->mnt_flag &= ~MNT_LOCAL; MNT_IUNLOCK(mp); return (error); fail: if (susp) vfs_write_resume(mp, VR_START_WRITE); fail1: #ifdef UFS_EXTATTR if (e_restart) { ufs_extattr_uepm_init(&ump->um_extattr); #ifdef UFS_EXTATTR_AUTOSTART (void) ufs_extattr_autostart(mp, td); #endif } #endif return (error); } /* * Flush out all the files in a filesystem. */ int ffs_flushfiles(mp, flags, td) struct mount *mp; int flags; struct thread *td; { struct ufsmount *ump; int qerror, error; ump = VFSTOUFS(mp); qerror = 0; #ifdef QUOTA if (mp->mnt_flag & MNT_QUOTA) { int i; error = vflush(mp, 0, SKIPSYSTEM|flags, td); if (error) return (error); for (i = 0; i < MAXQUOTAS; i++) { error = quotaoff(td, mp, i); if (error != 0) { if ((flags & EARLYFLUSH) == 0) return (error); else qerror = error; } } /* * Here we fall through to vflush again to ensure that * we have gotten rid of all the system vnodes, unless * quotas must not be closed. */ } #endif ASSERT_VOP_LOCKED(ump->um_devvp, "ffs_flushfiles"); if (ump->um_devvp->v_vflag & VV_COPYONWRITE) { if ((error = vflush(mp, 0, SKIPSYSTEM | flags, td)) != 0) return (error); ffs_snapshot_unmount(mp); flags |= FORCECLOSE; /* * Here we fall through to vflush again to ensure * that we have gotten rid of all the system vnodes. */ } /* * Do not close system files if quotas were not closed, to be * able to sync the remaining dquots. The freeblks softupdate * workitems might hold a reference on a dquot, preventing * quotaoff() from completing. Next round of * softdep_flushworklist() iteration should process the * blockers, allowing the next run of quotaoff() to finally * flush held dquots. * * Otherwise, flush all the files. */ if (qerror == 0 && (error = vflush(mp, 0, flags, td)) != 0) return (error); /* * Flush filesystem metadata. */ vn_lock(ump->um_devvp, LK_EXCLUSIVE | LK_RETRY); error = VOP_FSYNC(ump->um_devvp, MNT_WAIT, td); VOP_UNLOCK(ump->um_devvp, 0); return (error); } /* * Get filesystem statistics. */ static int ffs_statfs(mp, sbp) struct mount *mp; struct statfs *sbp; { struct ufsmount *ump; struct fs *fs; ump = VFSTOUFS(mp); fs = ump->um_fs; if (fs->fs_magic != FS_UFS1_MAGIC && fs->fs_magic != FS_UFS2_MAGIC) panic("ffs_statfs"); sbp->f_version = STATFS_VERSION; sbp->f_bsize = fs->fs_fsize; sbp->f_iosize = fs->fs_bsize; sbp->f_blocks = fs->fs_dsize; UFS_LOCK(ump); sbp->f_bfree = fs->fs_cstotal.cs_nbfree * fs->fs_frag + fs->fs_cstotal.cs_nffree + dbtofsb(fs, fs->fs_pendingblocks); sbp->f_bavail = freespace(fs, fs->fs_minfree) + dbtofsb(fs, fs->fs_pendingblocks); sbp->f_files = fs->fs_ncg * fs->fs_ipg - ROOTINO; sbp->f_ffree = fs->fs_cstotal.cs_nifree + fs->fs_pendinginodes; UFS_UNLOCK(ump); sbp->f_namemax = NAME_MAX; return (0); } /* * For a lazy sync, we only care about access times, quotas and the * superblock. Other filesystem changes are already converted to * cylinder group blocks or inode blocks updates and are written to * disk by syncer. */ static int ffs_sync_lazy(mp) struct mount *mp; { struct vnode *mvp, *vp; struct inode *ip; struct thread *td; int allerror, error; allerror = 0; td = curthread; if ((mp->mnt_flag & MNT_NOATIME) != 0) goto qupdate; MNT_VNODE_FOREACH_ACTIVE(vp, mp, mvp) { if (vp->v_type == VNON) { VI_UNLOCK(vp); continue; } ip = VTOI(vp); /* * The IN_ACCESS flag is converted to IN_MODIFIED by * ufs_close() and ufs_getattr() by the calls to * ufs_itimes_locked(), without subsequent UFS_UPDATE(). * Test also all the other timestamp flags too, to pick up * any other cases that could be missed. */ if ((ip->i_flag & (IN_ACCESS | IN_CHANGE | IN_MODIFIED | IN_UPDATE)) == 0) { VI_UNLOCK(vp); continue; } if ((error = vget(vp, LK_EXCLUSIVE | LK_NOWAIT | LK_INTERLOCK, td)) != 0) continue; error = ffs_update(vp, 0); if (error != 0) allerror = error; vput(vp); } qupdate: #ifdef QUOTA qsync(mp); #endif if (VFSTOUFS(mp)->um_fs->fs_fmod != 0 && (error = ffs_sbupdate(VFSTOUFS(mp), MNT_LAZY, 0)) != 0) allerror = error; return (allerror); } /* * Go through the disk queues to initiate sandbagged IO; * go through the inodes to write those that have been modified; * initiate the writing of the super block if it has been modified. * * Note: we are always called with the filesystem marked busy using * vfs_busy(). */ static int ffs_sync(mp, waitfor) struct mount *mp; int waitfor; { struct vnode *mvp, *vp, *devvp; struct thread *td; struct inode *ip; struct ufsmount *ump = VFSTOUFS(mp); struct fs *fs; int error, count, wait, lockreq, allerror = 0; int suspend; int suspended; int secondary_writes; int secondary_accwrites; int softdep_deps; int softdep_accdeps; struct bufobj *bo; wait = 0; suspend = 0; suspended = 0; td = curthread; fs = ump->um_fs; if (fs->fs_fmod != 0 && fs->fs_ronly != 0 && ump->um_fsckpid == 0) panic("%s: ffs_sync: modification on read-only filesystem", fs->fs_fsmnt); if (waitfor == MNT_LAZY) { if (!rebooting) return (ffs_sync_lazy(mp)); waitfor = MNT_NOWAIT; } /* * Write back each (modified) inode. */ lockreq = LK_EXCLUSIVE | LK_NOWAIT; if (waitfor == MNT_SUSPEND) { suspend = 1; waitfor = MNT_WAIT; } if (waitfor == MNT_WAIT) { wait = 1; lockreq = LK_EXCLUSIVE; } lockreq |= LK_INTERLOCK | LK_SLEEPFAIL; loop: /* Grab snapshot of secondary write counts */ MNT_ILOCK(mp); secondary_writes = mp->mnt_secondary_writes; secondary_accwrites = mp->mnt_secondary_accwrites; MNT_IUNLOCK(mp); /* Grab snapshot of softdep dependency counts */ softdep_get_depcounts(mp, &softdep_deps, &softdep_accdeps); MNT_VNODE_FOREACH_ALL(vp, mp, mvp) { /* * Depend on the vnode interlock to keep things stable enough * for a quick test. Since there might be hundreds of * thousands of vnodes, we cannot afford even a subroutine * call unless there's a good chance that we have work to do. */ if (vp->v_type == VNON) { VI_UNLOCK(vp); continue; } ip = VTOI(vp); if ((ip->i_flag & (IN_ACCESS | IN_CHANGE | IN_MODIFIED | IN_UPDATE)) == 0 && vp->v_bufobj.bo_dirty.bv_cnt == 0) { VI_UNLOCK(vp); continue; } if ((error = vget(vp, lockreq, td)) != 0) { if (error == ENOENT || error == ENOLCK) { MNT_VNODE_FOREACH_ALL_ABORT(mp, mvp); goto loop; } continue; } if ((error = ffs_syncvnode(vp, waitfor, 0)) != 0) allerror = error; vput(vp); } /* * Force stale filesystem control information to be flushed. */ if (waitfor == MNT_WAIT || rebooting) { if ((error = softdep_flushworklist(ump->um_mountp, &count, td))) allerror = error; /* Flushed work items may create new vnodes to clean */ if (allerror == 0 && count) goto loop; } #ifdef QUOTA qsync(mp); #endif devvp = ump->um_devvp; bo = &devvp->v_bufobj; BO_LOCK(bo); if (bo->bo_numoutput > 0 || bo->bo_dirty.bv_cnt > 0) { BO_UNLOCK(bo); vn_lock(devvp, LK_EXCLUSIVE | LK_RETRY); error = VOP_FSYNC(devvp, waitfor, td); VOP_UNLOCK(devvp, 0); if (MOUNTEDSOFTDEP(mp) && (error == 0 || error == EAGAIN)) error = ffs_sbupdate(ump, waitfor, 0); if (error != 0) allerror = error; if (allerror == 0 && waitfor == MNT_WAIT) goto loop; } else if (suspend != 0) { if (softdep_check_suspend(mp, devvp, softdep_deps, softdep_accdeps, secondary_writes, secondary_accwrites) != 0) { MNT_IUNLOCK(mp); goto loop; /* More work needed */ } mtx_assert(MNT_MTX(mp), MA_OWNED); mp->mnt_kern_flag |= MNTK_SUSPEND2 | MNTK_SUSPENDED; MNT_IUNLOCK(mp); suspended = 1; } else BO_UNLOCK(bo); /* * Write back modified superblock. */ if (fs->fs_fmod != 0 && (error = ffs_sbupdate(ump, waitfor, suspended)) != 0) allerror = error; return (allerror); } int ffs_vget(mp, ino, flags, vpp) struct mount *mp; ino_t ino; int flags; struct vnode **vpp; { return (ffs_vgetf(mp, ino, flags, vpp, 0)); } int ffs_vgetf(mp, ino, flags, vpp, ffs_flags) struct mount *mp; ino_t ino; int flags; struct vnode **vpp; int ffs_flags; { struct fs *fs; struct inode *ip; struct ufsmount *ump; struct buf *bp; struct vnode *vp; struct cdev *dev; int error; error = vfs_hash_get(mp, ino, flags, curthread, vpp, NULL, NULL); if (error || *vpp != NULL) return (error); /* * We must promote to an exclusive lock for vnode creation. This * can happen if lookup is passed LOCKSHARED. */ if ((flags & LK_TYPE_MASK) == LK_SHARED) { flags &= ~LK_TYPE_MASK; flags |= LK_EXCLUSIVE; } /* * We do not lock vnode creation as it is believed to be too * expensive for such rare case as simultaneous creation of vnode * for same ino by different processes. We just allow them to race * and check later to decide who wins. Let the race begin! */ ump = VFSTOUFS(mp); dev = ump->um_dev; fs = ump->um_fs; ip = uma_zalloc(uma_inode, M_WAITOK | M_ZERO); /* Allocate a new vnode/inode. */ if (fs->fs_magic == FS_UFS1_MAGIC) error = getnewvnode("ufs", mp, &ffs_vnodeops1, &vp); else error = getnewvnode("ufs", mp, &ffs_vnodeops2, &vp); if (error) { *vpp = NULL; uma_zfree(uma_inode, ip); return (error); } /* * FFS supports recursive locking. */ lockmgr(vp->v_vnlock, LK_EXCLUSIVE, NULL); VN_LOCK_AREC(vp); vp->v_data = ip; vp->v_bufobj.bo_bsize = fs->fs_bsize; ip->i_vnode = vp; ip->i_ump = ump; ip->i_fs = fs; ip->i_dev = dev; ip->i_number = ino; ip->i_ea_refs = 0; #ifdef QUOTA { int i; for (i = 0; i < MAXQUOTAS; i++) ip->i_dquot[i] = NODQUOT; } #endif if (ffs_flags & FFSV_FORCEINSMQ) vp->v_vflag |= VV_FORCEINSMQ; error = insmntque(vp, mp); if (error != 0) { uma_zfree(uma_inode, ip); *vpp = NULL; return (error); } vp->v_vflag &= ~VV_FORCEINSMQ; error = vfs_hash_insert(vp, ino, flags, curthread, vpp, NULL, NULL); if (error || *vpp != NULL) return (error); /* Read in the disk contents for the inode, copy into the inode. */ error = bread(ump->um_devvp, fsbtodb(fs, ino_to_fsba(fs, ino)), (int)fs->fs_bsize, NOCRED, &bp); if (error) { /* * The inode does not contain anything useful, so it would * be misleading to leave it on its hash chain. With mode * still zero, it will be unlinked and returned to the free * list by vput(). */ brelse(bp); vput(vp); *vpp = NULL; return (error); } if (ip->i_ump->um_fstype == UFS1) ip->i_din1 = uma_zalloc(uma_ufs1, M_WAITOK); else ip->i_din2 = uma_zalloc(uma_ufs2, M_WAITOK); ffs_load_inode(bp, ip, fs, ino); if (DOINGSOFTDEP(vp)) softdep_load_inodeblock(ip); else ip->i_effnlink = ip->i_nlink; bqrelse(bp); /* * Initialize the vnode from the inode, check for aliases. * Note that the underlying vnode may have changed. */ if (ip->i_ump->um_fstype == UFS1) error = ufs_vinit(mp, &ffs_fifoops1, &vp); else error = ufs_vinit(mp, &ffs_fifoops2, &vp); if (error) { vput(vp); *vpp = NULL; return (error); } /* * Finish inode initialization. */ if (vp->v_type != VFIFO) { /* FFS supports shared locking for all files except fifos. */ VN_LOCK_ASHARE(vp); } /* * Set up a generation number for this inode if it does not * already have one. This should only happen on old filesystems. */ if (ip->i_gen == 0) { ip->i_gen = arc4random() / 2 + 1; if ((vp->v_mount->mnt_flag & MNT_RDONLY) == 0) { ip->i_flag |= IN_MODIFIED; DIP_SET(ip, i_gen, ip->i_gen); } } #ifdef MAC if ((mp->mnt_flag & MNT_MULTILABEL) && ip->i_mode) { /* * If this vnode is already allocated, and we're running * multi-label, attempt to perform a label association * from the extended attributes on the inode. */ error = mac_vnode_associate_extattr(mp, vp); if (error) { /* ufs_inactive will release ip->i_devvp ref. */ vput(vp); *vpp = NULL; return (error); } } #endif *vpp = vp; return (0); } /* * File handle to vnode * * Have to be really careful about stale file handles: * - check that the inode number is valid * - call ffs_vget() to get the locked inode * - check for an unallocated inode (i_mode == 0) * - check that the given client host has export rights and return * those rights via. exflagsp and credanonp */ static int ffs_fhtovp(mp, fhp, flags, vpp) struct mount *mp; struct fid *fhp; int flags; struct vnode **vpp; { struct ufid *ufhp; struct fs *fs; ufhp = (struct ufid *)fhp; fs = VFSTOUFS(mp)->um_fs; if (ufhp->ufid_ino < ROOTINO || ufhp->ufid_ino >= fs->fs_ncg * fs->fs_ipg) return (ESTALE); return (ufs_fhtovp(mp, ufhp, flags, vpp)); } /* * Initialize the filesystem. */ static int ffs_init(vfsp) struct vfsconf *vfsp; { ffs_susp_initialize(); softdep_initialize(); return (ufs_init(vfsp)); } /* * Undo the work of ffs_init(). */ static int ffs_uninit(vfsp) struct vfsconf *vfsp; { int ret; ret = ufs_uninit(vfsp); softdep_uninitialize(); ffs_susp_uninitialize(); return (ret); } /* * Write a superblock and associated information back to disk. */ int ffs_sbupdate(ump, waitfor, suspended) struct ufsmount *ump; int waitfor; int suspended; { struct fs *fs = ump->um_fs; struct buf *sbbp; struct buf *bp; int blks; void *space; int i, size, error, allerror = 0; if (fs->fs_ronly == 1 && (ump->um_mountp->mnt_flag & (MNT_RDONLY | MNT_UPDATE)) != (MNT_RDONLY | MNT_UPDATE) && ump->um_fsckpid == 0) panic("ffs_sbupdate: write read-only filesystem"); /* * We use the superblock's buf to serialize calls to ffs_sbupdate(). */ sbbp = getblk(ump->um_devvp, btodb(fs->fs_sblockloc), (int)fs->fs_sbsize, 0, 0, 0); /* * First write back the summary information. */ blks = howmany(fs->fs_cssize, fs->fs_fsize); space = fs->fs_csp; for (i = 0; i < blks; i += fs->fs_frag) { size = fs->fs_bsize; if (i + fs->fs_frag > blks) size = (blks - i) * fs->fs_fsize; bp = getblk(ump->um_devvp, fsbtodb(fs, fs->fs_csaddr + i), size, 0, 0, 0); bcopy(space, bp->b_data, (u_int)size); space = (char *)space + size; if (suspended) bp->b_flags |= B_VALIDSUSPWRT; if (waitfor != MNT_WAIT) bawrite(bp); else if ((error = bwrite(bp)) != 0) allerror = error; } /* * Now write back the superblock itself. If any errors occurred * up to this point, then fail so that the superblock avoids * being written out as clean. */ if (allerror) { brelse(sbbp); return (allerror); } bp = sbbp; if (fs->fs_magic == FS_UFS1_MAGIC && fs->fs_sblockloc != SBLOCK_UFS1 && (fs->fs_old_flags & FS_FLAGS_UPDATED) == 0) { printf("WARNING: %s: correcting fs_sblockloc from %jd to %d\n", fs->fs_fsmnt, fs->fs_sblockloc, SBLOCK_UFS1); fs->fs_sblockloc = SBLOCK_UFS1; } if (fs->fs_magic == FS_UFS2_MAGIC && fs->fs_sblockloc != SBLOCK_UFS2 && (fs->fs_old_flags & FS_FLAGS_UPDATED) == 0) { printf("WARNING: %s: correcting fs_sblockloc from %jd to %d\n", fs->fs_fsmnt, fs->fs_sblockloc, SBLOCK_UFS2); fs->fs_sblockloc = SBLOCK_UFS2; } fs->fs_fmod = 0; fs->fs_time = time_second; if (MOUNTEDSOFTDEP(ump->um_mountp)) softdep_setup_sbupdate(ump, (struct fs *)bp->b_data, bp); bcopy((caddr_t)fs, bp->b_data, (u_int)fs->fs_sbsize); ffs_oldfscompat_write((struct fs *)bp->b_data, ump); if (suspended) bp->b_flags |= B_VALIDSUSPWRT; if (waitfor != MNT_WAIT) bawrite(bp); else if ((error = bwrite(bp)) != 0) allerror = error; return (allerror); } static int ffs_extattrctl(struct mount *mp, int cmd, struct vnode *filename_vp, int attrnamespace, const char *attrname) { #ifdef UFS_EXTATTR return (ufs_extattrctl(mp, cmd, filename_vp, attrnamespace, attrname)); #else return (vfs_stdextattrctl(mp, cmd, filename_vp, attrnamespace, attrname)); #endif } static void ffs_ifree(struct ufsmount *ump, struct inode *ip) { if (ump->um_fstype == UFS1 && ip->i_din1 != NULL) uma_zfree(uma_ufs1, ip->i_din1); else if (ip->i_din2 != NULL) uma_zfree(uma_ufs2, ip->i_din2); uma_zfree(uma_inode, ip); } static int dobkgrdwrite = 1; SYSCTL_INT(_debug, OID_AUTO, dobkgrdwrite, CTLFLAG_RW, &dobkgrdwrite, 0, "Do background writes (honoring the BV_BKGRDWRITE flag)?"); /* * Complete a background write started from bwrite. */ static void ffs_backgroundwritedone(struct buf *bp) { struct bufobj *bufobj; struct buf *origbp; /* * Find the original buffer that we are writing. */ bufobj = bp->b_bufobj; BO_LOCK(bufobj); if ((origbp = gbincore(bp->b_bufobj, bp->b_lblkno)) == NULL) panic("backgroundwritedone: lost buffer"); BO_UNLOCK(bufobj); /* * Process dependencies then return any unfinished ones. */ pbrelvp(bp); if (!LIST_EMPTY(&bp->b_dep)) buf_complete(bp); #ifdef SOFTUPDATES if (!LIST_EMPTY(&bp->b_dep)) softdep_move_dependencies(bp, origbp); #endif /* * This buffer is marked B_NOCACHE so when it is released * by biodone it will be tossed. */ bp->b_flags |= B_NOCACHE; bp->b_flags &= ~B_CACHE; bufdone(bp); BO_LOCK(bufobj); /* * Clear the BV_BKGRDINPROG flag in the original buffer * and awaken it if it is waiting for the write to complete. * If BV_BKGRDINPROG is not set in the original buffer it must * have been released and re-instantiated - which is not legal. */ KASSERT((origbp->b_vflags & BV_BKGRDINPROG), ("backgroundwritedone: lost buffer2")); origbp->b_vflags &= ~BV_BKGRDINPROG; if (origbp->b_vflags & BV_BKGRDWAIT) { origbp->b_vflags &= ~BV_BKGRDWAIT; wakeup(&origbp->b_xflags); } BO_UNLOCK(bufobj); } /* * Write, release buffer on completion. (Done by iodone * if async). Do not bother writing anything if the buffer * is invalid. * * Note that we set B_CACHE here, indicating that buffer is * fully valid and thus cacheable. This is true even of NFS * now so we set it generally. This could be set either here * or in biodone() since the I/O is synchronous. We put it * here. */ static int ffs_bufwrite(struct buf *bp) { struct buf *newbp; int oldflags; CTR3(KTR_BUF, "bufwrite(%p) vp %p flags %X", bp, bp->b_vp, bp->b_flags); if (bp->b_flags & B_INVAL) { brelse(bp); return (0); } oldflags = bp->b_flags; if (!BUF_ISLOCKED(bp)) panic("bufwrite: buffer is not busy???"); /* * If a background write is already in progress, delay * writing this block if it is asynchronous. Otherwise * wait for the background write to complete. */ BO_LOCK(bp->b_bufobj); if (bp->b_vflags & BV_BKGRDINPROG) { if (bp->b_flags & B_ASYNC) { BO_UNLOCK(bp->b_bufobj); bdwrite(bp); return (0); } bp->b_vflags |= BV_BKGRDWAIT; msleep(&bp->b_xflags, BO_LOCKPTR(bp->b_bufobj), PRIBIO, "bwrbg", 0); if (bp->b_vflags & BV_BKGRDINPROG) panic("bufwrite: still writing"); } BO_UNLOCK(bp->b_bufobj); /* * If this buffer is marked for background writing and we * do not have to wait for it, make a copy and write the * copy so as to leave this buffer ready for further use. * * This optimization eats a lot of memory. If we have a page * or buffer shortfall we can't do it. */ if (dobkgrdwrite && (bp->b_xflags & BX_BKGRDWRITE) && (bp->b_flags & B_ASYNC) && !vm_page_count_severe() && !buf_dirty_count_severe()) { KASSERT(bp->b_iodone == NULL, ("bufwrite: needs chained iodone (%p)", bp->b_iodone)); /* get a new block */ newbp = geteblk(bp->b_bufsize, GB_NOWAIT_BD); if (newbp == NULL) goto normal_write; KASSERT((bp->b_flags & B_UNMAPPED) == 0, ("Unmapped cg")); memcpy(newbp->b_data, bp->b_data, bp->b_bufsize); BO_LOCK(bp->b_bufobj); bp->b_vflags |= BV_BKGRDINPROG; BO_UNLOCK(bp->b_bufobj); newbp->b_xflags |= BX_BKGRDMARKER; newbp->b_lblkno = bp->b_lblkno; newbp->b_blkno = bp->b_blkno; newbp->b_offset = bp->b_offset; newbp->b_iodone = ffs_backgroundwritedone; newbp->b_flags |= B_ASYNC; newbp->b_flags &= ~B_INVAL; pbgetvp(bp->b_vp, newbp); #ifdef SOFTUPDATES /* * Move over the dependencies. If there are rollbacks, * leave the parent buffer dirtied as it will need to * be written again. */ if (LIST_EMPTY(&bp->b_dep) || softdep_move_dependencies(bp, newbp) == 0) bundirty(bp); #else bundirty(bp); #endif /* * Initiate write on the copy, release the original. The * BKGRDINPROG flag prevents it from going away until * the background write completes. */ bqrelse(bp); bp = newbp; } else /* Mark the buffer clean */ bundirty(bp); /* Let the normal bufwrite do the rest for us */ normal_write: return (bufwrite(bp)); } static void ffs_geom_strategy(struct bufobj *bo, struct buf *bp) { struct vnode *vp; int error; struct buf *tbp; int nocopy; vp = bo->__bo_vnode; if (bp->b_iocmd == BIO_WRITE) { if ((bp->b_flags & B_VALIDSUSPWRT) == 0 && bp->b_vp != NULL && bp->b_vp->v_mount != NULL && (bp->b_vp->v_mount->mnt_kern_flag & MNTK_SUSPENDED) != 0) panic("ffs_geom_strategy: bad I/O"); nocopy = bp->b_flags & B_NOCOPY; bp->b_flags &= ~(B_VALIDSUSPWRT | B_NOCOPY); if ((vp->v_vflag & VV_COPYONWRITE) && nocopy == 0 && vp->v_rdev->si_snapdata != NULL) { if ((bp->b_flags & B_CLUSTER) != 0) { runningbufwakeup(bp); TAILQ_FOREACH(tbp, &bp->b_cluster.cluster_head, b_cluster.cluster_entry) { error = ffs_copyonwrite(vp, tbp); if (error != 0 && error != EOPNOTSUPP) { bp->b_error = error; bp->b_ioflags |= BIO_ERROR; bufdone(bp); return; } } bp->b_runningbufspace = bp->b_bufsize; atomic_add_long(&runningbufspace, bp->b_runningbufspace); } else { error = ffs_copyonwrite(vp, bp); if (error != 0 && error != EOPNOTSUPP) { bp->b_error = error; bp->b_ioflags |= BIO_ERROR; bufdone(bp); return; } } } #ifdef SOFTUPDATES if ((bp->b_flags & B_CLUSTER) != 0) { TAILQ_FOREACH(tbp, &bp->b_cluster.cluster_head, b_cluster.cluster_entry) { if (!LIST_EMPTY(&tbp->b_dep)) buf_start(tbp); } } else { if (!LIST_EMPTY(&bp->b_dep)) buf_start(bp); } #endif } g_vfs_strategy(bo, bp); } int ffs_own_mount(const struct mount *mp) { if (mp->mnt_op == &ufs_vfsops) return (1); return (0); } #ifdef DDB #ifdef SOFTUPDATES /* defined in ffs_softdep.c */ extern void db_print_ffs(struct ufsmount *ump); DB_SHOW_COMMAND(ffs, db_show_ffs) { struct mount *mp; struct ufsmount *ump; if (have_addr) { ump = VFSTOUFS((struct mount *)addr); db_print_ffs(ump); return; } TAILQ_FOREACH(mp, &mountlist, mnt_list) { if (!strcmp(mp->mnt_stat.f_fstypename, ufs_vfsconf.vfc_name)) db_print_ffs(VFSTOUFS(mp)); } } #endif /* SOFTUPDATES */ #endif /* DDB */ Index: user/ngie/more-tests/sys =================================================================== --- user/ngie/more-tests/sys (revision 281584) +++ user/ngie/more-tests/sys (revision 281585) Property changes on: user/ngie/more-tests/sys ___________________________________________________________________ Modified: svn:mergeinfo ## -0,0 +0,1 ## Merged /head/sys:r281504-281584 Index: user/ngie/more-tests/usr.bin/Makefile =================================================================== --- user/ngie/more-tests/usr.bin/Makefile (revision 281584) +++ user/ngie/more-tests/usr.bin/Makefile (revision 281585) @@ -1,425 +1,431 @@ # From: @(#)Makefile 8.3 (Berkeley) 1/7/94 # $FreeBSD$ .include # XXX MISSING: deroff diction graph learn plot # spell spline struct xsend # XXX Use GNU versions: diff ld patch # Moved to secure: bdes # SUBDIR= ${_addr2line} \ alias \ apply \ asa \ awk \ banner \ basename \ brandelf \ bsdiff \ bzip2 \ bzip2recover \ cap_mkdb \ chat \ chpass \ cksum \ ${_clang} \ cmp \ col \ colldef \ colrm \ column \ comm \ compress \ cpuset \ csplit \ ctlstat \ cut \ demandoc \ dirname \ dpv \ du \ elf2aout \ ${_elfcopy} \ elfdump \ enigma \ env \ expand \ false \ fetch \ find \ fmt \ fold \ fstat \ fsync \ gcore \ gencat \ getconf \ getent \ getopt \ grep \ gzip \ head \ hexdump \ ${_iconv} \ id \ ipcrm \ ipcs \ join \ jot \ ${_kdump} \ keylogin \ keylogout \ killall \ ktrace \ ktrdump \ lam \ lastcomm \ ldd \ leave \ less \ lessecho \ lesskey \ limits \ locale \ lock \ lockf \ logger \ login \ logins \ logname \ look \ lorder \ lsvfs \ lzmainfo \ m4 \ ${_makewhatis} \ ${_man} \ mandoc \ mesg \ minigzip \ ministat \ ${_mkcsmapper} \ mkdep \ ${_mkesdb} \ mkfifo \ mkimg \ mklocale \ mktemp \ mkulzma \ mkuzip \ mt \ ncal \ netstat \ newgrp \ nfsstat \ nice \ nl \ ${_nm} \ nohup \ opieinfo \ opiekey \ opiepasswd \ pagesize \ passwd \ paste \ patch \ pathchk \ perror \ pr \ printenv \ printf \ procstat \ protect \ rctl \ ${_readelf} \ renice \ rev \ revoke \ rpcinfo \ rs \ rup \ rusers \ rwall \ script \ sed \ send-pr \ seq \ shar \ showmount \ ${_size} \ sockstat \ soeliminate \ sort \ split \ stat \ stdbuf \ ${_strings} \ su \ systat \ tabs \ tail \ tar \ tcopy \ tee \ ${_tests} \ time \ timeout \ tip \ top \ touch \ tput \ tr \ true \ truncate \ ${_truss} \ tset \ tsort \ tty \ uname \ unexpand \ uniq \ unzip \ units \ unvis \ uudecode \ uuencode \ vis \ vmstat \ w \ wall \ wc \ what \ whereis \ which \ whois \ write \ xargs \ xinstall \ xo \ xz \ xzdec \ yes \ ${_ypcat} \ ${_ypmatch} \ ${_ypwhich} # NB: keep these sorted by MK_* knobs .if ${MK_AT} != "no" SUBDIR+= at .endif .if ${MK_ATM} != "no" SUBDIR+= atm .endif .if ${MK_BLUETOOTH} != "no" SUBDIR+= bluetooth .endif .if ${MK_BSD_CPIO} != "no" SUBDIR+= cpio .endif .if ${MK_CALENDAR} != "no" SUBDIR+= calendar .endif .if ${MK_CLANG} != "no" _clang= clang .endif .if ${MK_EE} != "no" SUBDIR+= ee .endif .if ${MK_ELFTOOLCHAIN_TOOLS} != "no" _addr2line= addr2line _elfcopy= elfcopy _nm= nm _readelf= readelf _size= size _strings= strings .endif .if ${MK_FILE} != "no" SUBDIR+= file .endif .if ${MK_FINGER} != "no" SUBDIR+= finger .endif .if ${MK_FMAKE} != "no" SUBDIR+= make .endif .if ${MK_FTP} != "no" SUBDIR+= ftp .endif .if ${MK_GPL_DTC} != "yes" SUBDIR+= dtc .endif .if ${MK_GROFF} != "no" SUBDIR+= vgrind .endif .if ${MK_HESIOD} != "no" SUBDIR+= hesinfo .endif .if ${MK_ICONV} != "no" _iconv= iconv _mkcsmapper= mkcsmapper _mkesdb= mkesdb .endif .if ${MK_ISCSI} != "no" SUBDIR+= iscsictl .endif .if ${MK_KDUMP} != "no" SUBDIR+= kdump +.if ${MACHINE_ARCH} != "aarch64" # ARM64TODO truss does not build SUBDIR+= truss .endif +.endif .if ${MK_KERBEROS_SUPPORT} != "no" SUBDIR+= compile_et .endif .if ${MK_LDNS_UTILS} != "no" SUBDIR+= drill SUBDIR+= host .endif .if ${MK_LOCATE} != "no" SUBDIR+= locate .endif # XXX msgs? .if ${MK_MAIL} != "no" SUBDIR+= biff SUBDIR+= from SUBDIR+= mail SUBDIR+= msgs .endif .if ${MK_MAKE} != "no" SUBDIR+= bmake .endif .if ${MK_MAN_UTILS} != "no" SUBDIR+= catman _makewhatis= makewhatis _man= man .endif .if ${MK_NETCAT} != "no" SUBDIR+= nc .endif .if ${MK_NIS} != "no" SUBDIR+= ypcat SUBDIR+= ypmatch SUBDIR+= ypwhich .endif .if ${MK_OPENSSH} != "no" SUBDIR+= ssh-copy-id .endif .if ${MK_OPENSSL} != "no" SUBDIR+= bc SUBDIR+= chkey SUBDIR+= dc SUBDIR+= newkey .endif .if ${MK_QUOTAS} != "no" SUBDIR+= quota .endif .if ${MK_RCMDS} != "no" SUBDIR+= rlogin SUBDIR+= rsh SUBDIR+= ruptime SUBDIR+= rwho .endif .if ${MK_SENDMAIL} != "no" SUBDIR+= vacation .endif .if ${MK_TALK} != "no" SUBDIR+= talk .endif .if ${MK_TELNET} != "no" SUBDIR+= telnet .endif .if ${MK_TESTS} != "no" _tests= tests .endif .if ${MK_TEXTPROC} != "no" SUBDIR+= checknr SUBDIR+= colcrt SUBDIR+= ul .endif .if ${MK_TFTP} != "no" SUBDIR+= tftp .endif .if ${MK_TOOLCHAIN} != "no" SUBDIR+= ar SUBDIR+= c89 SUBDIR+= c99 SUBDIR+= ctags SUBDIR+= file2c +.if ${MACHINE_ARCH} != "aarch64" # ARM64TODO gprof does not build SUBDIR+= gprof +.endif SUBDIR+= indent SUBDIR+= lex SUBDIR+= mkstr SUBDIR+= rpcgen SUBDIR+= unifdef +.if ${MACHINE_ARCH} != "aarch64" # ARM64TODO xlint does not build SUBDIR+= xlint +.endif SUBDIR+= xstr SUBDIR+= yacc .endif .if ${MK_VI} != "no" SUBDIR+= vi .endif .if ${MK_VT} != "no" SUBDIR+= vtfontcvt .endif .if ${MK_USB} != "no" SUBDIR+= usbhidaction SUBDIR+= usbhidctl .endif .if ${MK_UTMPX} != "no" SUBDIR+= last SUBDIR+= users SUBDIR+= who .endif .if ${MK_SVN} == "yes" || ${MK_SVNLITE} == "yes" SUBDIR+= svn .endif .include SUBDIR:= ${SUBDIR:O} SUBDIR_PARALLEL= .include Index: user/ngie/more-tests/usr.bin/gzip/gzip.c =================================================================== --- user/ngie/more-tests/usr.bin/gzip/gzip.c (revision 281584) +++ user/ngie/more-tests/usr.bin/gzip/gzip.c (revision 281585) @@ -1,2170 +1,2173 @@ /* $NetBSD: gzip.c,v 1.107 2015/01/13 02:37:20 mrg Exp $ */ /*- * Copyright (c) 1997, 1998, 2003, 2004, 2006 Matthew R. Green * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, * BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED * AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * */ #include #ifndef lint __COPYRIGHT("@(#) Copyright (c) 1997, 1998, 2003, 2004, 2006\ Matthew R. Green. All rights reserved."); __FBSDID("$FreeBSD$"); #endif /* not lint */ /* * gzip.c -- GPL free gzip using zlib. * * RFC 1950 covers the zlib format * RFC 1951 covers the deflate format * RFC 1952 covers the gzip format * * TODO: * - use mmap where possible * - make bzip2/compress -v/-t/-l support work as well as possible */ #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include /* what type of file are we dealing with */ enum filetype { FT_GZIP, #ifndef NO_BZIP2_SUPPORT FT_BZIP2, #endif #ifndef NO_COMPRESS_SUPPORT FT_Z, #endif #ifndef NO_PACK_SUPPORT FT_PACK, #endif #ifndef NO_XZ_SUPPORT FT_XZ, #endif FT_LAST, FT_UNKNOWN }; #ifndef NO_BZIP2_SUPPORT #include #define BZ2_SUFFIX ".bz2" #define BZIP2_MAGIC "\102\132\150" #endif #ifndef NO_COMPRESS_SUPPORT #define Z_SUFFIX ".Z" #define Z_MAGIC "\037\235" #endif #ifndef NO_PACK_SUPPORT #define PACK_MAGIC "\037\036" #endif #ifndef NO_XZ_SUPPORT #include #define XZ_SUFFIX ".xz" #define XZ_MAGIC "\3757zXZ" #endif #define GZ_SUFFIX ".gz" #define BUFLEN (64 * 1024) #define GZIP_MAGIC0 0x1F #define GZIP_MAGIC1 0x8B #define GZIP_OMAGIC1 0x9E #define GZIP_TIMESTAMP (off_t)4 #define GZIP_ORIGNAME (off_t)10 #define HEAD_CRC 0x02 #define EXTRA_FIELD 0x04 #define ORIG_NAME 0x08 #define COMMENT 0x10 #define OS_CODE 3 /* Unix */ typedef struct { const char *zipped; int ziplen; const char *normal; /* for unzip - must not be longer than zipped */ } suffixes_t; static suffixes_t suffixes[] = { #define SUFFIX(Z, N) {Z, sizeof Z - 1, N} SUFFIX(GZ_SUFFIX, ""), /* Overwritten by -S .xxx */ #ifndef SMALL SUFFIX(GZ_SUFFIX, ""), SUFFIX(".z", ""), SUFFIX("-gz", ""), SUFFIX("-z", ""), SUFFIX("_z", ""), SUFFIX(".taz", ".tar"), SUFFIX(".tgz", ".tar"), #ifndef NO_BZIP2_SUPPORT SUFFIX(BZ2_SUFFIX, ""), SUFFIX(".tbz", ".tar"), SUFFIX(".tbz2", ".tar"), #endif #ifndef NO_COMPRESS_SUPPORT SUFFIX(Z_SUFFIX, ""), #endif #ifndef NO_XZ_SUPPORT SUFFIX(XZ_SUFFIX, ""), #endif SUFFIX(GZ_SUFFIX, ""), /* Overwritten by -S "" */ #endif /* SMALL */ #undef SUFFIX }; #define NUM_SUFFIXES (sizeof suffixes / sizeof suffixes[0]) #define SUFFIX_MAXLEN 30 static const char gzip_version[] = "FreeBSD gzip 20150413"; #ifndef SMALL static const char gzip_copyright[] = \ " Copyright (c) 1997, 1998, 2003, 2004, 2006 Matthew R. Green\n" " All rights reserved.\n" "\n" " Redistribution and use in source and binary forms, with or without\n" " modification, are permitted provided that the following conditions\n" " are met:\n" " 1. Redistributions of source code must retain the above copyright\n" " notice, this list of conditions and the following disclaimer.\n" " 2. Redistributions in binary form must reproduce the above copyright\n" " notice, this list of conditions and the following disclaimer in the\n" " documentation and/or other materials provided with the distribution.\n" "\n" " THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR\n" " IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES\n" " OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.\n" " IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,\n" " INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,\n" " BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;\n" " LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED\n" " AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,\n" " OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY\n" " OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF\n" " SUCH DAMAGE."; #endif static int cflag; /* stdout mode */ static int dflag; /* decompress mode */ static int lflag; /* list mode */ static int numflag = 6; /* gzip -1..-9 value */ #ifndef SMALL static int fflag; /* force mode */ static int kflag; /* don't delete input files */ static int nflag; /* don't save name/timestamp */ static int Nflag; /* don't restore name/timestamp */ static int qflag; /* quiet mode */ static int rflag; /* recursive mode */ static int tflag; /* test */ static int vflag; /* verbose mode */ static const char *remove_file = NULL; /* file to be removed upon SIGINT */ #else #define qflag 0 #define tflag 0 #endif static int exit_value = 0; /* exit value */ static char *infile; /* name of file coming in */ static void maybe_err(const char *fmt, ...) __printflike(1, 2) __dead2; #if !defined(NO_BZIP2_SUPPORT) || !defined(NO_PACK_SUPPORT) || \ !defined(NO_XZ_SUPPORT) static void maybe_errx(const char *fmt, ...) __printflike(1, 2) __dead2; #endif static void maybe_warn(const char *fmt, ...) __printflike(1, 2); static void maybe_warnx(const char *fmt, ...) __printflike(1, 2); static enum filetype file_gettype(u_char *); #ifdef SMALL #define gz_compress(if, of, sz, fn, tm) gz_compress(if, of, sz) #endif static off_t gz_compress(int, int, off_t *, const char *, uint32_t); static off_t gz_uncompress(int, int, char *, size_t, off_t *, const char *); static off_t file_compress(char *, char *, size_t); static off_t file_uncompress(char *, char *, size_t); static void handle_pathname(char *); static void handle_file(char *, struct stat *); static void handle_stdin(void); static void handle_stdout(void); static void print_ratio(off_t, off_t, FILE *); static void print_list(int fd, off_t, const char *, time_t); static void usage(void) __dead2; static void display_version(void) __dead2; #ifndef SMALL static void display_license(void); static void sigint_handler(int); #endif static const suffixes_t *check_suffix(char *, int); static ssize_t read_retry(int, void *, size_t); #ifdef SMALL #define unlink_input(f, sb) unlink(f) #else static off_t cat_fd(unsigned char *, size_t, off_t *, int fd); static void prepend_gzip(char *, int *, char ***); static void handle_dir(char *); static void print_verbage(const char *, const char *, off_t, off_t); static void print_test(const char *, int); static void copymodes(int fd, const struct stat *, const char *file); static int check_outfile(const char *outfile); #endif #ifndef NO_BZIP2_SUPPORT static off_t unbzip2(int, int, char *, size_t, off_t *); #endif #ifndef NO_COMPRESS_SUPPORT static FILE *zdopen(int); static off_t zuncompress(FILE *, FILE *, char *, size_t, off_t *); #endif #ifndef NO_PACK_SUPPORT static off_t unpack(int, int, char *, size_t, off_t *); #endif #ifndef NO_XZ_SUPPORT static off_t unxz(int, int, char *, size_t, off_t *); #endif #ifdef SMALL #define getopt_long(a,b,c,d,e) getopt(a,b,c) #else static const struct option longopts[] = { { "stdout", no_argument, 0, 'c' }, { "to-stdout", no_argument, 0, 'c' }, { "decompress", no_argument, 0, 'd' }, { "uncompress", no_argument, 0, 'd' }, { "force", no_argument, 0, 'f' }, { "help", no_argument, 0, 'h' }, { "keep", no_argument, 0, 'k' }, { "list", no_argument, 0, 'l' }, { "no-name", no_argument, 0, 'n' }, { "name", no_argument, 0, 'N' }, { "quiet", no_argument, 0, 'q' }, { "recursive", no_argument, 0, 'r' }, { "suffix", required_argument, 0, 'S' }, { "test", no_argument, 0, 't' }, { "verbose", no_argument, 0, 'v' }, { "version", no_argument, 0, 'V' }, { "fast", no_argument, 0, '1' }, { "best", no_argument, 0, '9' }, { "ascii", no_argument, 0, 'a' }, { "license", no_argument, 0, 'L' }, { NULL, no_argument, 0, 0 }, }; #endif int main(int argc, char **argv) { const char *progname = getprogname(); #ifndef SMALL char *gzip; int len; #endif int ch; #ifndef SMALL if ((gzip = getenv("GZIP")) != NULL) prepend_gzip(gzip, &argc, &argv); signal(SIGINT, sigint_handler); #endif /* * XXX * handle being called `gunzip', `zcat' and `gzcat' */ if (strcmp(progname, "gunzip") == 0) dflag = 1; else if (strcmp(progname, "zcat") == 0 || strcmp(progname, "gzcat") == 0) dflag = cflag = 1; #ifdef SMALL #define OPT_LIST "123456789cdhlV" #else #define OPT_LIST "123456789acdfhklLNnqrS:tVv" #endif while ((ch = getopt_long(argc, argv, OPT_LIST, longopts, NULL)) != -1) { switch (ch) { case '1': case '2': case '3': case '4': case '5': case '6': case '7': case '8': case '9': numflag = ch - '0'; break; case 'c': cflag = 1; break; case 'd': dflag = 1; break; case 'l': lflag = 1; dflag = 1; break; case 'V': display_version(); /* NOTREACHED */ #ifndef SMALL case 'a': fprintf(stderr, "%s: option --ascii ignored on this system\n", progname); break; case 'f': fflag = 1; break; case 'k': kflag = 1; break; case 'L': display_license(); /* NOT REACHED */ case 'N': nflag = 0; Nflag = 1; break; case 'n': nflag = 1; Nflag = 0; break; case 'q': qflag = 1; break; case 'r': rflag = 1; break; case 'S': len = strlen(optarg); if (len != 0) { if (len > SUFFIX_MAXLEN) errx(1, "incorrect suffix: '%s': too long", optarg); suffixes[0].zipped = optarg; suffixes[0].ziplen = len; } else { suffixes[NUM_SUFFIXES - 1].zipped = ""; suffixes[NUM_SUFFIXES - 1].ziplen = 0; } break; case 't': cflag = 1; tflag = 1; dflag = 1; break; case 'v': vflag = 1; break; #endif default: usage(); /* NOTREACHED */ } } argv += optind; argc -= optind; if (argc == 0) { if (dflag) /* stdin mode */ handle_stdin(); else /* stdout mode */ handle_stdout(); } else { do { handle_pathname(argv[0]); } while (*++argv); } #ifndef SMALL if (qflag == 0 && lflag && argc > 1) print_list(-1, 0, "(totals)", 0); #endif exit(exit_value); } /* maybe print a warning */ void maybe_warn(const char *fmt, ...) { va_list ap; if (qflag == 0) { va_start(ap, fmt); vwarn(fmt, ap); va_end(ap); } if (exit_value == 0) exit_value = 1; } /* ... without an errno. */ void maybe_warnx(const char *fmt, ...) { va_list ap; if (qflag == 0) { va_start(ap, fmt); vwarnx(fmt, ap); va_end(ap); } if (exit_value == 0) exit_value = 1; } /* maybe print an error */ void maybe_err(const char *fmt, ...) { va_list ap; if (qflag == 0) { va_start(ap, fmt); vwarn(fmt, ap); va_end(ap); } exit(2); } #if !defined(NO_BZIP2_SUPPORT) || !defined(NO_PACK_SUPPORT) || \ !defined(NO_XZ_SUPPORT) /* ... without an errno. */ void maybe_errx(const char *fmt, ...) { va_list ap; if (qflag == 0) { va_start(ap, fmt); vwarnx(fmt, ap); va_end(ap); } exit(2); } #endif #ifndef SMALL /* split up $GZIP and prepend it to the argument list */ static void prepend_gzip(char *gzip, int *argc, char ***argv) { char *s, **nargv, **ac; int nenvarg = 0, i; /* scan how many arguments there are */ for (s = gzip;;) { while (*s == ' ' || *s == '\t') s++; if (*s == 0) goto count_done; nenvarg++; while (*s != ' ' && *s != '\t') if (*s++ == 0) goto count_done; } count_done: /* punt early */ if (nenvarg == 0) return; *argc += nenvarg; ac = *argv; nargv = (char **)malloc((*argc + 1) * sizeof(char *)); if (nargv == NULL) maybe_err("malloc"); /* stash this away */ *argv = nargv; /* copy the program name first */ i = 0; nargv[i++] = *(ac++); /* take a copy of $GZIP and add it to the array */ s = strdup(gzip); if (s == NULL) maybe_err("strdup"); for (;;) { /* Skip whitespaces. */ while (*s == ' ' || *s == '\t') s++; if (*s == 0) goto copy_done; nargv[i++] = s; /* Find the end of this argument. */ while (*s != ' ' && *s != '\t') if (*s++ == 0) /* Argument followed by NUL. */ goto copy_done; /* Terminate by overwriting ' ' or '\t' with NUL. */ *s++ = 0; } copy_done: /* copy the original arguments and a NULL */ while (*ac) nargv[i++] = *(ac++); nargv[i] = NULL; } #endif /* compress input to output. Return bytes read, -1 on error */ static off_t gz_compress(int in, int out, off_t *gsizep, const char *origname, uint32_t mtime) { z_stream z; char *outbufp, *inbufp; off_t in_tot = 0, out_tot = 0; ssize_t in_size; int i, error; uLong crc; #ifdef SMALL static char header[] = { GZIP_MAGIC0, GZIP_MAGIC1, Z_DEFLATED, 0, 0, 0, 0, 0, 0, OS_CODE }; #endif outbufp = malloc(BUFLEN); inbufp = malloc(BUFLEN); if (outbufp == NULL || inbufp == NULL) { maybe_err("malloc failed"); goto out; } memset(&z, 0, sizeof z); z.zalloc = Z_NULL; z.zfree = Z_NULL; z.opaque = 0; #ifdef SMALL memcpy(outbufp, header, sizeof header); i = sizeof header; #else if (nflag != 0) { mtime = 0; origname = ""; } i = snprintf(outbufp, BUFLEN, "%c%c%c%c%c%c%c%c%c%c%s", GZIP_MAGIC0, GZIP_MAGIC1, Z_DEFLATED, *origname ? ORIG_NAME : 0, mtime & 0xff, (mtime >> 8) & 0xff, (mtime >> 16) & 0xff, (mtime >> 24) & 0xff, numflag == 1 ? 4 : numflag == 9 ? 2 : 0, OS_CODE, origname); if (i >= BUFLEN) /* this need PATH_MAX > BUFLEN ... */ maybe_err("snprintf"); if (*origname) i++; #endif z.next_out = (unsigned char *)outbufp + i; z.avail_out = BUFLEN - i; error = deflateInit2(&z, numflag, Z_DEFLATED, (-MAX_WBITS), 8, Z_DEFAULT_STRATEGY); if (error != Z_OK) { maybe_warnx("deflateInit2 failed"); in_tot = -1; goto out; } crc = crc32(0L, Z_NULL, 0); for (;;) { if (z.avail_out == 0) { if (write(out, outbufp, BUFLEN) != BUFLEN) { maybe_warn("write"); out_tot = -1; goto out; } out_tot += BUFLEN; z.next_out = (unsigned char *)outbufp; z.avail_out = BUFLEN; } if (z.avail_in == 0) { in_size = read(in, inbufp, BUFLEN); if (in_size < 0) { maybe_warn("read"); in_tot = -1; goto out; } if (in_size == 0) break; crc = crc32(crc, (const Bytef *)inbufp, (unsigned)in_size); in_tot += in_size; z.next_in = (unsigned char *)inbufp; z.avail_in = in_size; } error = deflate(&z, Z_NO_FLUSH); if (error != Z_OK && error != Z_STREAM_END) { maybe_warnx("deflate failed"); in_tot = -1; goto out; } } /* clean up */ for (;;) { size_t len; ssize_t w; error = deflate(&z, Z_FINISH); if (error != Z_OK && error != Z_STREAM_END) { maybe_warnx("deflate failed"); in_tot = -1; goto out; } len = (char *)z.next_out - outbufp; w = write(out, outbufp, len); if (w == -1 || (size_t)w != len) { maybe_warn("write"); out_tot = -1; goto out; } out_tot += len; z.next_out = (unsigned char *)outbufp; z.avail_out = BUFLEN; if (error == Z_STREAM_END) break; } if (deflateEnd(&z) != Z_OK) { maybe_warnx("deflateEnd failed"); in_tot = -1; goto out; } i = snprintf(outbufp, BUFLEN, "%c%c%c%c%c%c%c%c", (int)crc & 0xff, (int)(crc >> 8) & 0xff, (int)(crc >> 16) & 0xff, (int)(crc >> 24) & 0xff, (int)in_tot & 0xff, (int)(in_tot >> 8) & 0xff, (int)(in_tot >> 16) & 0xff, (int)(in_tot >> 24) & 0xff); if (i != 8) maybe_err("snprintf"); if (write(out, outbufp, i) != i) { maybe_warn("write"); in_tot = -1; } else out_tot += i; out: if (inbufp != NULL) free(inbufp); if (outbufp != NULL) free(outbufp); if (gsizep) *gsizep = out_tot; return in_tot; } /* * uncompress input to output then close the input. return the * uncompressed size written, and put the compressed sized read * into `*gsizep'. */ static off_t gz_uncompress(int in, int out, char *pre, size_t prelen, off_t *gsizep, const char *filename) { z_stream z; char *outbufp, *inbufp; off_t out_tot = -1, in_tot = 0; uint32_t out_sub_tot = 0; enum { GZSTATE_MAGIC0, GZSTATE_MAGIC1, GZSTATE_METHOD, GZSTATE_FLAGS, GZSTATE_SKIPPING, GZSTATE_EXTRA, GZSTATE_EXTRA2, GZSTATE_EXTRA3, GZSTATE_ORIGNAME, GZSTATE_COMMENT, GZSTATE_HEAD_CRC1, GZSTATE_HEAD_CRC2, GZSTATE_INIT, GZSTATE_READ, GZSTATE_CRC, GZSTATE_LEN, } state = GZSTATE_MAGIC0; int flags = 0, skip_count = 0; int error = Z_STREAM_ERROR, done_reading = 0; uLong crc = 0; ssize_t wr; int needmore = 0; #define ADVANCE() { z.next_in++; z.avail_in--; } if ((outbufp = malloc(BUFLEN)) == NULL) { maybe_err("malloc failed"); goto out2; } if ((inbufp = malloc(BUFLEN)) == NULL) { maybe_err("malloc failed"); goto out1; } memset(&z, 0, sizeof z); z.avail_in = prelen; z.next_in = (unsigned char *)pre; z.avail_out = BUFLEN; z.next_out = (unsigned char *)outbufp; z.zalloc = NULL; z.zfree = NULL; z.opaque = 0; in_tot = prelen; out_tot = 0; for (;;) { if ((z.avail_in == 0 || needmore) && done_reading == 0) { ssize_t in_size; if (z.avail_in > 0) { memmove(inbufp, z.next_in, z.avail_in); } z.next_in = (unsigned char *)inbufp; in_size = read(in, z.next_in + z.avail_in, BUFLEN - z.avail_in); if (in_size == -1) { maybe_warn("failed to read stdin"); goto stop_and_fail; } else if (in_size == 0) { done_reading = 1; } z.avail_in += in_size; needmore = 0; in_tot += in_size; } if (z.avail_in == 0) { if (done_reading && state != GZSTATE_MAGIC0) { maybe_warnx("%s: unexpected end of file", filename); goto stop_and_fail; } goto stop; } switch (state) { case GZSTATE_MAGIC0: if (*z.next_in != GZIP_MAGIC0) { if (in_tot > 0) { maybe_warnx("%s: trailing garbage " "ignored", filename); goto stop; } maybe_warnx("input not gziped (MAGIC0)"); goto stop_and_fail; } ADVANCE(); state++; out_sub_tot = 0; crc = crc32(0L, Z_NULL, 0); break; case GZSTATE_MAGIC1: if (*z.next_in != GZIP_MAGIC1 && *z.next_in != GZIP_OMAGIC1) { maybe_warnx("input not gziped (MAGIC1)"); goto stop_and_fail; } ADVANCE(); state++; break; case GZSTATE_METHOD: if (*z.next_in != Z_DEFLATED) { maybe_warnx("unknown compression method"); goto stop_and_fail; } ADVANCE(); state++; break; case GZSTATE_FLAGS: flags = *z.next_in; ADVANCE(); skip_count = 6; state++; break; case GZSTATE_SKIPPING: if (skip_count > 0) { skip_count--; ADVANCE(); } else state++; break; case GZSTATE_EXTRA: if ((flags & EXTRA_FIELD) == 0) { state = GZSTATE_ORIGNAME; break; } skip_count = *z.next_in; ADVANCE(); state++; break; case GZSTATE_EXTRA2: skip_count |= ((*z.next_in) << 8); ADVANCE(); state++; break; case GZSTATE_EXTRA3: if (skip_count > 0) { skip_count--; ADVANCE(); } else state++; break; case GZSTATE_ORIGNAME: if ((flags & ORIG_NAME) == 0) { state++; break; } if (*z.next_in == 0) state++; ADVANCE(); break; case GZSTATE_COMMENT: if ((flags & COMMENT) == 0) { state++; break; } if (*z.next_in == 0) state++; ADVANCE(); break; case GZSTATE_HEAD_CRC1: if (flags & HEAD_CRC) skip_count = 2; else skip_count = 0; state++; break; case GZSTATE_HEAD_CRC2: if (skip_count > 0) { skip_count--; ADVANCE(); } else state++; break; case GZSTATE_INIT: if (inflateInit2(&z, -MAX_WBITS) != Z_OK) { maybe_warnx("failed to inflateInit"); goto stop_and_fail; } state++; break; case GZSTATE_READ: error = inflate(&z, Z_FINISH); switch (error) { /* Z_BUF_ERROR goes with Z_FINISH... */ case Z_BUF_ERROR: if (z.avail_out > 0 && !done_reading) continue; case Z_STREAM_END: case Z_OK: break; case Z_NEED_DICT: maybe_warnx("Z_NEED_DICT error"); goto stop_and_fail; case Z_DATA_ERROR: maybe_warnx("data stream error"); goto stop_and_fail; case Z_STREAM_ERROR: maybe_warnx("internal stream error"); goto stop_and_fail; case Z_MEM_ERROR: maybe_warnx("memory allocation error"); goto stop_and_fail; default: maybe_warn("unknown error from inflate(): %d", error); } wr = BUFLEN - z.avail_out; if (wr != 0) { crc = crc32(crc, (const Bytef *)outbufp, (unsigned)wr); if ( #ifndef SMALL /* don't write anything with -t */ tflag == 0 && #endif write(out, outbufp, wr) != wr) { maybe_warn("error writing to output"); goto stop_and_fail; } out_tot += wr; out_sub_tot += wr; } if (error == Z_STREAM_END) { inflateEnd(&z); state++; } z.next_out = (unsigned char *)outbufp; z.avail_out = BUFLEN; break; case GZSTATE_CRC: { uLong origcrc; if (z.avail_in < 4) { if (!done_reading) { needmore = 1; continue; } maybe_warnx("truncated input"); goto stop_and_fail; } origcrc = ((unsigned)z.next_in[0] & 0xff) | ((unsigned)z.next_in[1] & 0xff) << 8 | ((unsigned)z.next_in[2] & 0xff) << 16 | ((unsigned)z.next_in[3] & 0xff) << 24; if (origcrc != crc) { maybe_warnx("invalid compressed" " data--crc error"); goto stop_and_fail; } } z.avail_in -= 4; z.next_in += 4; if (!z.avail_in && done_reading) { goto stop; } state++; break; case GZSTATE_LEN: { uLong origlen; if (z.avail_in < 4) { if (!done_reading) { needmore = 1; continue; } maybe_warnx("truncated input"); goto stop_and_fail; } origlen = ((unsigned)z.next_in[0] & 0xff) | ((unsigned)z.next_in[1] & 0xff) << 8 | ((unsigned)z.next_in[2] & 0xff) << 16 | ((unsigned)z.next_in[3] & 0xff) << 24; if (origlen != out_sub_tot) { maybe_warnx("invalid compressed" " data--length error"); goto stop_and_fail; } } z.avail_in -= 4; z.next_in += 4; if (error < 0) { maybe_warnx("decompression error"); goto stop_and_fail; } state = GZSTATE_MAGIC0; break; } continue; stop_and_fail: out_tot = -1; stop: break; } if (state > GZSTATE_INIT) inflateEnd(&z); free(inbufp); out1: free(outbufp); out2: if (gsizep) *gsizep = in_tot; return (out_tot); } #ifndef SMALL /* * set the owner, mode, flags & utimes using the given file descriptor. * file is only used in possible warning messages. */ static void copymodes(int fd, const struct stat *sbp, const char *file) { struct timespec times[2]; struct stat sb; /* * If we have no info on the input, give this file some * default values and return.. */ if (sbp == NULL) { mode_t mask = umask(022); (void)fchmod(fd, DEFFILEMODE & ~mask); (void)umask(mask); return; } sb = *sbp; /* if the chown fails, remove set-id bits as-per compress(1) */ if (fchown(fd, sb.st_uid, sb.st_gid) < 0) { if (errno != EPERM) maybe_warn("couldn't fchown: %s", file); sb.st_mode &= ~(S_ISUID|S_ISGID); } /* we only allow set-id and the 9 normal permission bits */ sb.st_mode &= S_ISUID | S_ISGID | S_IRWXU | S_IRWXG | S_IRWXO; if (fchmod(fd, sb.st_mode) < 0) maybe_warn("couldn't fchmod: %s", file); times[0] = sb.st_atim; times[1] = sb.st_mtim; if (futimens(fd, times) < 0) maybe_warn("couldn't futimens: %s", file); /* only try flags if they exist already */ if (sb.st_flags != 0 && fchflags(fd, sb.st_flags) < 0) maybe_warn("couldn't fchflags: %s", file); } #endif /* what sort of file is this? */ static enum filetype file_gettype(u_char *buf) { if (buf[0] == GZIP_MAGIC0 && (buf[1] == GZIP_MAGIC1 || buf[1] == GZIP_OMAGIC1)) return FT_GZIP; else #ifndef NO_BZIP2_SUPPORT if (memcmp(buf, BZIP2_MAGIC, 3) == 0 && buf[3] >= '0' && buf[3] <= '9') return FT_BZIP2; else #endif #ifndef NO_COMPRESS_SUPPORT if (memcmp(buf, Z_MAGIC, 2) == 0) return FT_Z; else #endif #ifndef NO_PACK_SUPPORT if (memcmp(buf, PACK_MAGIC, 2) == 0) return FT_PACK; else #endif #ifndef NO_XZ_SUPPORT if (memcmp(buf, XZ_MAGIC, 4) == 0) /* XXX: We only have 4 bytes */ return FT_XZ; else #endif return FT_UNKNOWN; } #ifndef SMALL /* check the outfile is OK. */ static int check_outfile(const char *outfile) { struct stat sb; int ok = 1; if (lflag == 0 && stat(outfile, &sb) == 0) { if (fflag) unlink(outfile); else if (isatty(STDIN_FILENO)) { char ans[10] = { 'n', '\0' }; /* default */ fprintf(stderr, "%s already exists -- do you wish to " "overwrite (y or n)? " , outfile); (void)fgets(ans, sizeof(ans) - 1, stdin); if (ans[0] != 'y' && ans[0] != 'Y') { fprintf(stderr, "\tnot overwriting\n"); ok = 0; } else unlink(outfile); } else { maybe_warnx("%s already exists -- skipping", outfile); ok = 0; } } return ok; } static void unlink_input(const char *file, const struct stat *sb) { struct stat nsb; if (kflag) return; if (stat(file, &nsb) != 0) /* Must be gone already */ return; if (nsb.st_dev != sb->st_dev || nsb.st_ino != sb->st_ino) /* Definitely a different file */ return; unlink(file); } static void sigint_handler(int signo __unused) { if (remove_file != NULL) unlink(remove_file); _exit(2); } #endif static const suffixes_t * check_suffix(char *file, int xlate) { const suffixes_t *s; int len = strlen(file); char *sp; for (s = suffixes; s != suffixes + NUM_SUFFIXES; s++) { /* if it doesn't fit in "a.suf", don't bother */ if (s->ziplen >= len) continue; sp = file + len - s->ziplen; if (strcmp(s->zipped, sp) != 0) continue; if (xlate) strcpy(sp, s->normal); return s; } return NULL; } /* * compress the given file: create a corresponding .gz file and remove the * original. */ static off_t file_compress(char *file, char *outfile, size_t outsize) { int in; int out; off_t size, insize; #ifndef SMALL struct stat isb, osb; const suffixes_t *suff; #endif in = open(file, O_RDONLY); if (in == -1) { maybe_warn("can't open %s", file); return (-1); } #ifndef SMALL if (fstat(in, &isb) != 0) { maybe_warn("couldn't stat: %s", file); close(in); return (-1); } #endif if (cflag == 0) { #ifndef SMALL if (isb.st_nlink > 1 && fflag == 0) { maybe_warnx("%s has %d other link%s -- skipping", file, isb.st_nlink - 1, (isb.st_nlink - 1) == 1 ? "" : "s"); close(in); return (-1); } if (fflag == 0 && (suff = check_suffix(file, 0)) && suff->zipped[0] != 0) { maybe_warnx("%s already has %s suffix -- unchanged", file, suff->zipped); close(in); return (-1); } #endif /* Add (usually) .gz to filename */ if ((size_t)snprintf(outfile, outsize, "%s%s", file, suffixes[0].zipped) >= outsize) memcpy(outfile + outsize - suffixes[0].ziplen - 1, suffixes[0].zipped, suffixes[0].ziplen + 1); #ifndef SMALL if (check_outfile(outfile) == 0) { close(in); return (-1); } #endif } if (cflag == 0) { out = open(outfile, O_WRONLY | O_CREAT | O_EXCL, 0600); if (out == -1) { maybe_warn("could not create output: %s", outfile); fclose(stdin); return (-1); } #ifndef SMALL remove_file = outfile; #endif } else out = STDOUT_FILENO; insize = gz_compress(in, out, &size, basename(file), (uint32_t)isb.st_mtime); (void)close(in); /* * If there was an error, insize will be -1. * If we compressed to stdout, just return the size. * Otherwise stat the file and check it is the correct size. * We only blow away the file if we can stat the output and it * has the expected size. */ if (cflag != 0) return (insize == -1 ? -1 : size); #ifndef SMALL if (fstat(out, &osb) != 0) { maybe_warn("couldn't stat: %s", outfile); goto bad_outfile; } if (osb.st_size != size) { maybe_warnx("output file: %s wrong size (%ju != %ju), deleting", outfile, (uintmax_t)osb.st_size, (uintmax_t)size); goto bad_outfile; } copymodes(out, &isb, outfile); remove_file = NULL; #endif if (close(out) == -1) maybe_warn("couldn't close output"); /* output is good, ok to delete input */ unlink_input(file, &isb); return (size); #ifndef SMALL bad_outfile: if (close(out) == -1) maybe_warn("couldn't close output"); maybe_warnx("leaving original %s", file); unlink(outfile); return (size); #endif } /* uncompress the given file and remove the original */ static off_t file_uncompress(char *file, char *outfile, size_t outsize) { struct stat isb, osb; off_t size; ssize_t rbytes; unsigned char header1[4]; enum filetype method; int fd, ofd, zfd = -1; #ifndef SMALL ssize_t rv; time_t timestamp = 0; char name[PATH_MAX + 1]; #endif /* gather the old name info */ fd = open(file, O_RDONLY); if (fd < 0) { maybe_warn("can't open %s", file); goto lose; } strlcpy(outfile, file, outsize); if (check_suffix(outfile, 1) == NULL && !(cflag || lflag)) { maybe_warnx("%s: unknown suffix -- ignored", file); goto lose; } rbytes = read(fd, header1, sizeof header1); if (rbytes != sizeof header1) { /* we don't want to fail here. */ #ifndef SMALL if (fflag) goto lose; #endif if (rbytes == -1) maybe_warn("can't read %s", file); else goto unexpected_EOF; goto lose; } method = file_gettype(header1); #ifndef SMALL if (fflag == 0 && method == FT_UNKNOWN) { maybe_warnx("%s: not in gzip format", file); goto lose; } #endif #ifndef SMALL if (method == FT_GZIP && Nflag) { unsigned char ts[4]; /* timestamp */ rv = pread(fd, ts, sizeof ts, GZIP_TIMESTAMP); if (rv >= 0 && rv < (ssize_t)(sizeof ts)) goto unexpected_EOF; if (rv == -1) { if (!fflag) maybe_warn("can't read %s", file); goto lose; } timestamp = ts[3] << 24 | ts[2] << 16 | ts[1] << 8 | ts[0]; if (header1[3] & ORIG_NAME) { - rbytes = pread(fd, name, sizeof name, GZIP_ORIGNAME); + rbytes = pread(fd, name, sizeof(name) - 1, GZIP_ORIGNAME); if (rbytes < 0) { maybe_warn("can't read %s", file); goto lose; } - if (name[0] != 0) { + if (name[0] != '\0') { char *dp, *nf; + + /* Make sure that name is NUL-terminated */ + name[rbytes] = '\0'; /* strip saved directory name */ nf = strrchr(name, '/'); if (nf == NULL) nf = name; else nf++; /* preserve original directory name */ dp = strrchr(file, '/'); if (dp == NULL) dp = file; else dp++; snprintf(outfile, outsize, "%.*s%.*s", (int) (dp - file), file, (int) rbytes, nf); } } } #endif lseek(fd, 0, SEEK_SET); if (cflag == 0 || lflag) { if (fstat(fd, &isb) != 0) goto lose; #ifndef SMALL if (isb.st_nlink > 1 && lflag == 0 && fflag == 0) { maybe_warnx("%s has %d other links -- skipping", file, isb.st_nlink - 1); goto lose; } if (nflag == 0 && timestamp) isb.st_mtime = timestamp; if (check_outfile(outfile) == 0) goto lose; #endif } if (cflag == 0 && lflag == 0) { zfd = open(outfile, O_WRONLY|O_CREAT|O_EXCL, 0600); if (zfd == STDOUT_FILENO) { /* We won't close STDOUT_FILENO later... */ zfd = dup(zfd); close(STDOUT_FILENO); } if (zfd == -1) { maybe_warn("can't open %s", outfile); goto lose; } #ifndef SMALL remove_file = outfile; #endif } else zfd = STDOUT_FILENO; switch (method) { #ifndef NO_BZIP2_SUPPORT case FT_BZIP2: /* XXX */ if (lflag) { maybe_warnx("no -l with bzip2 files"); goto lose; } size = unbzip2(fd, zfd, NULL, 0, NULL); break; #endif #ifndef NO_COMPRESS_SUPPORT case FT_Z: { FILE *in, *out; /* XXX */ if (lflag) { maybe_warnx("no -l with Lempel-Ziv files"); goto lose; } if ((in = zdopen(fd)) == NULL) { maybe_warn("zdopen for read: %s", file); goto lose; } out = fdopen(dup(zfd), "w"); if (out == NULL) { maybe_warn("fdopen for write: %s", outfile); fclose(in); goto lose; } size = zuncompress(in, out, NULL, 0, NULL); /* need to fclose() if ferror() is true... */ if (ferror(in) | fclose(in)) { maybe_warn("failed infile fclose"); unlink(outfile); (void)fclose(out); } if (fclose(out) != 0) { maybe_warn("failed outfile fclose"); unlink(outfile); goto lose; } break; } #endif #ifndef NO_PACK_SUPPORT case FT_PACK: if (lflag) { maybe_warnx("no -l with packed files"); goto lose; } size = unpack(fd, zfd, NULL, 0, NULL); break; #endif #ifndef NO_XZ_SUPPORT case FT_XZ: if (lflag) { maybe_warnx("no -l with xz files"); goto lose; } size = unxz(fd, zfd, NULL, 0, NULL); break; #endif #ifndef SMALL case FT_UNKNOWN: if (lflag) { maybe_warnx("no -l for unknown filetypes"); goto lose; } size = cat_fd(NULL, 0, NULL, fd); break; #endif default: if (lflag) { print_list(fd, isb.st_size, outfile, isb.st_mtime); close(fd); return -1; /* XXX */ } size = gz_uncompress(fd, zfd, NULL, 0, NULL, file); break; } if (close(fd) != 0) maybe_warn("couldn't close input"); if (zfd != STDOUT_FILENO && close(zfd) != 0) maybe_warn("couldn't close output"); if (size == -1) { if (cflag == 0) unlink(outfile); maybe_warnx("%s: uncompress failed", file); return -1; } /* if testing, or we uncompressed to stdout, this is all we need */ #ifndef SMALL if (tflag) return size; #endif /* if we are uncompressing to stdin, don't remove the file. */ if (cflag) return size; /* * if we create a file... */ /* * if we can't stat the file don't remove the file. */ ofd = open(outfile, O_RDWR, 0); if (ofd == -1) { maybe_warn("couldn't open (leaving original): %s", outfile); return -1; } if (fstat(ofd, &osb) != 0) { maybe_warn("couldn't stat (leaving original): %s", outfile); close(ofd); return -1; } if (osb.st_size != size) { maybe_warnx("stat gave different size: %ju != %ju (leaving original)", (uintmax_t)size, (uintmax_t)osb.st_size); close(ofd); unlink(outfile); return -1; } #ifndef SMALL copymodes(ofd, &isb, outfile); remove_file = NULL; #endif close(ofd); unlink_input(file, &isb); return size; unexpected_EOF: maybe_warnx("%s: unexpected end of file", file); lose: if (fd != -1) close(fd); if (zfd != -1 && zfd != STDOUT_FILENO) close(fd); return -1; } #ifndef SMALL static off_t cat_fd(unsigned char * prepend, size_t count, off_t *gsizep, int fd) { char buf[BUFLEN]; off_t in_tot; ssize_t w; in_tot = count; w = write(STDOUT_FILENO, prepend, count); if (w == -1 || (size_t)w != count) { maybe_warn("write to stdout"); return -1; } for (;;) { ssize_t rv; rv = read(fd, buf, sizeof buf); if (rv == 0) break; if (rv < 0) { maybe_warn("read from fd %d", fd); break; } if (write(STDOUT_FILENO, buf, rv) != rv) { maybe_warn("write to stdout"); break; } in_tot += rv; } if (gsizep) *gsizep = in_tot; return (in_tot); } #endif static void handle_stdin(void) { unsigned char header1[4]; off_t usize, gsize; enum filetype method; ssize_t bytes_read; #ifndef NO_COMPRESS_SUPPORT FILE *in; #endif #ifndef SMALL if (fflag == 0 && lflag == 0 && isatty(STDIN_FILENO)) { maybe_warnx("standard input is a terminal -- ignoring"); return; } #endif if (lflag) { struct stat isb; /* XXX could read the whole file, etc. */ if (fstat(STDIN_FILENO, &isb) < 0) { maybe_warn("fstat"); return; } print_list(STDIN_FILENO, isb.st_size, "stdout", isb.st_mtime); return; } bytes_read = read_retry(STDIN_FILENO, header1, sizeof header1); if (bytes_read == -1) { maybe_warn("can't read stdin"); return; } else if (bytes_read != sizeof(header1)) { maybe_warnx("(stdin): unexpected end of file"); return; } method = file_gettype(header1); switch (method) { default: #ifndef SMALL if (fflag == 0) { maybe_warnx("unknown compression format"); return; } usize = cat_fd(header1, sizeof header1, &gsize, STDIN_FILENO); break; #endif case FT_GZIP: usize = gz_uncompress(STDIN_FILENO, STDOUT_FILENO, (char *)header1, sizeof header1, &gsize, "(stdin)"); break; #ifndef NO_BZIP2_SUPPORT case FT_BZIP2: usize = unbzip2(STDIN_FILENO, STDOUT_FILENO, (char *)header1, sizeof header1, &gsize); break; #endif #ifndef NO_COMPRESS_SUPPORT case FT_Z: if ((in = zdopen(STDIN_FILENO)) == NULL) { maybe_warnx("zopen of stdin"); return; } usize = zuncompress(in, stdout, (char *)header1, sizeof header1, &gsize); fclose(in); break; #endif #ifndef NO_PACK_SUPPORT case FT_PACK: usize = unpack(STDIN_FILENO, STDOUT_FILENO, (char *)header1, sizeof header1, &gsize); break; #endif #ifndef NO_XZ_SUPPORT case FT_XZ: usize = unxz(STDIN_FILENO, STDOUT_FILENO, (char *)header1, sizeof header1, &gsize); break; #endif } #ifndef SMALL if (vflag && !tflag && usize != -1 && gsize != -1) print_verbage(NULL, NULL, usize, gsize); if (vflag && tflag) print_test("(stdin)", usize != -1); #endif } static void handle_stdout(void) { off_t gsize, usize; struct stat sb; time_t systime; uint32_t mtime; int ret; #ifndef SMALL if (fflag == 0 && isatty(STDOUT_FILENO)) { maybe_warnx("standard output is a terminal -- ignoring"); return; } #endif /* If stdin is a file use its mtime, otherwise use current time */ ret = fstat(STDIN_FILENO, &sb); #ifndef SMALL if (ret < 0) { maybe_warn("Can't stat stdin"); return; } #endif if (S_ISREG(sb.st_mode)) mtime = (uint32_t)sb.st_mtime; else { systime = time(NULL); #ifndef SMALL if (systime == -1) { maybe_warn("time"); return; } #endif mtime = (uint32_t)systime; } usize = gz_compress(STDIN_FILENO, STDOUT_FILENO, &gsize, "", mtime); #ifndef SMALL if (vflag && !tflag && usize != -1 && gsize != -1) print_verbage(NULL, NULL, usize, gsize); #endif } /* do what is asked for, for the path name */ static void handle_pathname(char *path) { char *opath = path, *s = NULL; ssize_t len; int slen; struct stat sb; /* check for stdout/stdin */ if (path[0] == '-' && path[1] == '\0') { if (dflag) handle_stdin(); else handle_stdout(); return; } retry: if (stat(path, &sb) != 0 || (fflag == 0 && cflag == 0 && lstat(path, &sb) != 0)) { /* lets try .gz if we're decompressing */ if (dflag && s == NULL && errno == ENOENT) { len = strlen(path); slen = suffixes[0].ziplen; s = malloc(len + slen + 1); if (s == NULL) maybe_err("malloc"); memcpy(s, path, len); memcpy(s + len, suffixes[0].zipped, slen + 1); path = s; goto retry; } maybe_warn("can't stat: %s", opath); goto out; } if (S_ISDIR(sb.st_mode)) { #ifndef SMALL if (rflag) handle_dir(path); else #endif maybe_warnx("%s is a directory", path); goto out; } if (S_ISREG(sb.st_mode)) handle_file(path, &sb); else maybe_warnx("%s is not a regular file", path); out: if (s) free(s); } /* compress/decompress a file */ static void handle_file(char *file, struct stat *sbp) { off_t usize, gsize; char outfile[PATH_MAX]; infile = file; if (dflag) { usize = file_uncompress(file, outfile, sizeof(outfile)); #ifndef SMALL if (vflag && tflag) print_test(file, usize != -1); #endif if (usize == -1) return; gsize = sbp->st_size; } else { gsize = file_compress(file, outfile, sizeof(outfile)); if (gsize == -1) return; usize = sbp->st_size; } #ifndef SMALL if (vflag && !tflag) print_verbage(file, (cflag) ? NULL : outfile, usize, gsize); #endif } #ifndef SMALL /* this is used with -r to recursively descend directories */ static void handle_dir(char *dir) { char *path_argv[2]; FTS *fts; FTSENT *entry; path_argv[0] = dir; path_argv[1] = 0; fts = fts_open(path_argv, FTS_PHYSICAL | FTS_NOCHDIR, NULL); if (fts == NULL) { warn("couldn't fts_open %s", dir); return; } while ((entry = fts_read(fts))) { switch(entry->fts_info) { case FTS_D: case FTS_DP: continue; case FTS_DNR: case FTS_ERR: case FTS_NS: maybe_warn("%s", entry->fts_path); continue; case FTS_F: handle_file(entry->fts_path, entry->fts_statp); } } (void)fts_close(fts); } #endif /* print a ratio - size reduction as a fraction of uncompressed size */ static void print_ratio(off_t in, off_t out, FILE *where) { int percent10; /* 10 * percent */ off_t diff; char buff[8]; int len; diff = in - out/2; if (diff <= 0) /* * Output is more than double size of input! print -99.9% * Quite possibly we've failed to get the original size. */ percent10 = -999; else { /* * We only need 12 bits of result from the final division, * so reduce the values until a 32bit division will suffice. */ while (in > 0x100000) { diff >>= 1; in >>= 1; } if (in != 0) percent10 = ((u_int)diff * 2000) / (u_int)in - 1000; else percent10 = 0; } len = snprintf(buff, sizeof buff, "%2.2d.", percent10); /* Move the '.' to before the last digit */ buff[len - 1] = buff[len - 2]; buff[len - 2] = '.'; fprintf(where, "%5s%%", buff); } #ifndef SMALL /* print compression statistics, and the new name (if there is one!) */ static void print_verbage(const char *file, const char *nfile, off_t usize, off_t gsize) { if (file) fprintf(stderr, "%s:%s ", file, strlen(file) < 7 ? "\t\t" : "\t"); print_ratio(usize, gsize, stderr); if (nfile) fprintf(stderr, " -- replaced with %s", nfile); fprintf(stderr, "\n"); fflush(stderr); } /* print test results */ static void print_test(const char *file, int ok) { if (exit_value == 0 && ok == 0) exit_value = 1; fprintf(stderr, "%s:%s %s\n", file, strlen(file) < 7 ? "\t\t" : "\t", ok ? "OK" : "NOT OK"); fflush(stderr); } #endif /* print a file's info ala --list */ /* eg: compressed uncompressed ratio uncompressed_name 354841 1679360 78.8% /usr/pkgsrc/distfiles/libglade-2.0.1.tar */ static void print_list(int fd, off_t out, const char *outfile, time_t ts) { static int first = 1; #ifndef SMALL static off_t in_tot, out_tot; uint32_t crc = 0; #endif off_t in = 0, rv; if (first) { #ifndef SMALL if (vflag) printf("method crc date time "); #endif if (qflag == 0) printf(" compressed uncompressed " "ratio uncompressed_name\n"); } first = 0; /* print totals? */ #ifndef SMALL if (fd == -1) { in = in_tot; out = out_tot; } else #endif { /* read the last 4 bytes - this is the uncompressed size */ rv = lseek(fd, (off_t)(-8), SEEK_END); if (rv != -1) { unsigned char buf[8]; uint32_t usize; rv = read(fd, (char *)buf, sizeof(buf)); if (rv == -1) maybe_warn("read of uncompressed size"); else if (rv != sizeof(buf)) maybe_warnx("read of uncompressed size"); else { usize = buf[4] | buf[5] << 8 | buf[6] << 16 | buf[7] << 24; in = (off_t)usize; #ifndef SMALL crc = buf[0] | buf[1] << 8 | buf[2] << 16 | buf[3] << 24; #endif } } } #ifndef SMALL if (vflag && fd == -1) printf(" "); else if (vflag) { char *date = ctime(&ts); /* skip the day, 1/100th second, and year */ date += 4; date[12] = 0; printf("%5s %08x %11s ", "defla"/*XXX*/, crc, date); } in_tot += in; out_tot += out; #else (void)&ts; /* XXX */ #endif printf("%12llu %12llu ", (unsigned long long)out, (unsigned long long)in); print_ratio(in, out, stdout); printf(" %s\n", outfile); } /* display the usage of NetBSD gzip */ static void usage(void) { fprintf(stderr, "%s\n", gzip_version); fprintf(stderr, #ifdef SMALL "usage: %s [-" OPT_LIST "] [ [ ...]]\n", #else "usage: %s [-123456789acdfhklLNnqrtVv] [-S .suffix] [ [ ...]]\n" " -1 --fast fastest (worst) compression\n" " -2 .. -8 set compression level\n" " -9 --best best (slowest) compression\n" " -c --stdout write to stdout, keep original files\n" " --to-stdout\n" " -d --decompress uncompress files\n" " --uncompress\n" " -f --force force overwriting & compress links\n" " -h --help display this help\n" " -k --keep don't delete input files during operation\n" " -l --list list compressed file contents\n" " -N --name save or restore original file name and time stamp\n" " -n --no-name don't save original file name or time stamp\n" " -q --quiet output no warnings\n" " -r --recursive recursively compress files in directories\n" " -S .suf use suffix .suf instead of .gz\n" " --suffix .suf\n" " -t --test test compressed file\n" " -V --version display program version\n" " -v --verbose print extra statistics\n", #endif getprogname()); exit(0); } #ifndef SMALL /* display the license information of FreeBSD gzip */ static void display_license(void) { fprintf(stderr, "%s (based on NetBSD gzip 20150113)\n", gzip_version); fprintf(stderr, "%s\n", gzip_copyright); exit(0); } #endif /* display the version of NetBSD gzip */ static void display_version(void) { fprintf(stderr, "%s\n", gzip_version); exit(0); } #ifndef NO_BZIP2_SUPPORT #include "unbzip2.c" #endif #ifndef NO_COMPRESS_SUPPORT #include "zuncompress.c" #endif #ifndef NO_PACK_SUPPORT #include "unpack.c" #endif #ifndef NO_XZ_SUPPORT #include "unxz.c" #endif static ssize_t read_retry(int fd, void *buf, size_t sz) { char *cp = buf; size_t left = MIN(sz, (size_t) SSIZE_MAX); while (left > 0) { ssize_t ret; ret = read(fd, cp, left); if (ret == -1) { return ret; } else if (ret == 0) { break; /* EOF */ } cp += ret; left -= ret; } return sz - left; } Index: user/ngie/more-tests/usr.bin/iconv/iconv.c =================================================================== --- user/ngie/more-tests/usr.bin/iconv/iconv.c (revision 281584) +++ user/ngie/more-tests/usr.bin/iconv/iconv.c (revision 281585) @@ -1,220 +1,219 @@ /* $FreeBSD$ */ /* $NetBSD: iconv.c,v 1.16 2009/02/20 15:28:21 yamt Exp $ */ /*- * Copyright (c)2003 Citrus Project, * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ #include #include #include #include #include #include #include #include #include #include #include #include static int do_conv(FILE *, const char *, const char *, bool, bool); static int do_list(unsigned int, const char * const *, void *); static void usage(void) __dead2; static const struct option long_options[] = { {"from-code", required_argument, NULL, 'f'}, {"list", no_argument, NULL, 'l'}, {"silent", no_argument, NULL, 's'}, {"to-code", required_argument, NULL, 't'}, {NULL, no_argument, NULL, 0} }; static void usage(void) { (void)fprintf(stderr, "Usage:\t%1$s [-cs] -f -t [file ...]\n" "\t%1$s -f [-cs] [-t ] [file ...]\n" "\t%1$s -t [-cs] [-f ] [file ...]\n" "\t%1$s -l\n", getprogname()); exit(1); } #define INBUFSIZE 1024 #define OUTBUFSIZE (INBUFSIZE * 2) static int do_conv(FILE *fp, const char *from, const char *to, bool silent, bool hide_invalid) { iconv_t cd; - char inbuf[INBUFSIZE], outbuf[OUTBUFSIZE], *out; + char inbuf[INBUFSIZE], outbuf[OUTBUFSIZE], *in, *out; unsigned long long invalids; - const char *in; size_t inbytes, outbytes, ret; if ((cd = iconv_open(to, from)) == (iconv_t)-1) err(EXIT_FAILURE, "iconv_open(%s, %s)", to, from); if (hide_invalid) { int arg = 1; if (iconvctl(cd, ICONV_SET_DISCARD_ILSEQ, (void *)&arg) == -1) err(EXIT_FAILURE, NULL); } invalids = 0; while ((inbytes = fread(inbuf, 1, INBUFSIZE, fp)) > 0) { in = inbuf; while (inbytes > 0) { size_t inval; out = outbuf; outbytes = OUTBUFSIZE; ret = __iconv(cd, &in, &inbytes, &out, &outbytes, 0, &inval); invalids += inval; if (outbytes < OUTBUFSIZE) (void)fwrite(outbuf, 1, OUTBUFSIZE - outbytes, stdout); if (ret == (size_t)-1 && errno != E2BIG) { if (errno != EINVAL || in == inbuf) err(EXIT_FAILURE, "iconv()"); /* incomplete input character */ (void)memmove(inbuf, in, inbytes); ret = fread(inbuf + inbytes, 1, INBUFSIZE - inbytes, fp); if (ret == 0) { fflush(stdout); if (feof(fp)) errx(EXIT_FAILURE, "unexpected end of file; " "the last character is " "incomplete."); else err(EXIT_FAILURE, "fread()"); } in = inbuf; inbytes += ret; } } } /* reset the shift state of the output buffer */ outbytes = OUTBUFSIZE; out = outbuf; ret = iconv(cd, NULL, NULL, &out, &outbytes); if (ret == (size_t)-1) err(EXIT_FAILURE, "iconv()"); if (outbytes < OUTBUFSIZE) (void)fwrite(outbuf, 1, OUTBUFSIZE - outbytes, stdout); if (invalids > 0 && !silent) warnx("warning: invalid characters: %llu", invalids); iconv_close(cd); return (invalids > 0); } static int do_list(unsigned int n, const char * const *list, void *data __unused) { unsigned int i; for(i = 0; i < n; i++) { printf("%s", list[i]); if (i < n - 1) printf(" "); } printf("\n"); return (1); } int main(int argc, char **argv) { FILE *fp; char *opt_f, *opt_t; int ch, i, res; bool opt_c = false, opt_s = false; opt_f = opt_t = strdup(""); setlocale(LC_ALL, ""); setprogname(argv[0]); while ((ch = getopt_long(argc, argv, "csLlf:t:", long_options, NULL)) != -1) { switch (ch) { case 'c': opt_c = true; break; case 's': opt_s = true; break; case 'l': /* list */ if (opt_s || opt_c || strcmp(opt_f, "") != 0 || strcmp(opt_t, "") != 0) { warnx("-l is not allowed with other flags."); usage(); } iconvlist(do_list, NULL); return (EXIT_SUCCESS); case 'f': /* from */ if (optarg != NULL) opt_f = strdup(optarg); break; case 't': /* to */ if (optarg != NULL) opt_t = strdup(optarg); break; default: usage(); } } argc -= optind; argv += optind; if ((strcmp(opt_f, "") == 0) && (strcmp(opt_t, "") == 0)) usage(); if (argc == 0) res = do_conv(stdin, opt_f, opt_t, opt_s, opt_c); else { res = 0; for (i = 0; i < argc; i++) { fp = (strcmp(argv[i], "-") != 0) ? fopen(argv[i], "r") : stdin; if (fp == NULL) err(EXIT_FAILURE, "Cannot open `%s'", argv[i]); res |= do_conv(fp, opt_f, opt_t, opt_s, opt_c); (void)fclose(fp); } } return (res == 0 ? EXIT_SUCCESS : EXIT_FAILURE); } Index: user/ngie/more-tests/usr.bin/ipcs/ipc.c =================================================================== --- user/ngie/more-tests/usr.bin/ipcs/ipc.c (revision 281584) +++ user/ngie/more-tests/usr.bin/ipcs/ipc.c (revision 281585) @@ -1,206 +1,205 @@ /* * Copyright (c) 1994 SigmaSoft, Th. Lockert * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. The name of the author may not be used to endorse or promote products * derived from this software without specific prior written permission. * * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, * INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY * AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL * THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, * EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, * PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; * OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, * WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR * OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF * ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. * * The split of ipcs.c into ipcs.c and ipc.c to accommodate the * changes in ipcrm.c was done by Edwin Groothuis */ #include __FBSDID("$FreeBSD$"); #include #include #define _KERNEL #include #include #include #undef _KERNEL #include #include #include #include #include #include #include "ipc.h" int use_sysctl = 1; struct semid_kernel *sema; struct seminfo seminfo; struct msginfo msginfo; struct msqid_kernel *msqids; struct shminfo shminfo; struct shmid_kernel *shmsegs; -void kget(int idx, void *addr, size_t size); struct nlist symbols[] = { { .n_name = "sema" }, { .n_name = "seminfo" }, { .n_name = "msginfo" }, { .n_name = "msqids" }, { .n_name = "shminfo" }, { .n_name = "shmsegs" }, { .n_name = NULL } }; #define SHMINFO_XVEC X(shmmax, sizeof(u_long)) \ X(shmmin, sizeof(u_long)) \ X(shmmni, sizeof(u_long)) \ X(shmseg, sizeof(u_long)) \ X(shmall, sizeof(u_long)) #define SEMINFO_XVEC X(semmni, sizeof(int)) \ X(semmns, sizeof(int)) \ X(semmnu, sizeof(int)) \ X(semmsl, sizeof(int)) \ X(semopm, sizeof(int)) \ X(semume, sizeof(int)) \ X(semusz, sizeof(int)) \ X(semvmx, sizeof(int)) \ X(semaem, sizeof(int)) #define MSGINFO_XVEC X(msgmax, sizeof(int)) \ X(msgmni, sizeof(int)) \ X(msgmnb, sizeof(int)) \ X(msgtql, sizeof(int)) \ X(msgssz, sizeof(int)) \ X(msgseg, sizeof(int)) #define X(a, b) { "kern.ipc." #a, offsetof(TYPEC, a), (b) }, #define TYPEC struct shminfo static struct scgs_vector shminfo_scgsv[] = { SHMINFO_XVEC { .sysctl=NULL } }; #undef TYPEC #define TYPEC struct seminfo static struct scgs_vector seminfo_scgsv[] = { SEMINFO_XVEC { .sysctl=NULL } }; #undef TYPEC #define TYPEC struct msginfo static struct scgs_vector msginfo_scgsv[] = { MSGINFO_XVEC { .sysctl=NULL } }; #undef TYPEC #undef X kvm_t *kd; void sysctlgatherstruct(void *addr, size_t size, struct scgs_vector *vecarr) { struct scgs_vector *xp; size_t tsiz; int rv; for (xp = vecarr; xp->sysctl != NULL; xp++) { assert(xp->offset <= size); tsiz = xp->size; rv = sysctlbyname(xp->sysctl, (char *)addr + xp->offset, &tsiz, NULL, 0); if (rv == -1) err(1, "sysctlbyname: %s", xp->sysctl); if (tsiz != xp->size) errx(1, "%s size mismatch (expected %zu, got %zu)", xp->sysctl, xp->size, tsiz); } } void kget(int idx, void *addr, size_t size) { const char *symn; /* symbol name */ size_t tsiz; int rv; unsigned long kaddr; const char *sym2sysctl[] = { /* symbol to sysctl name table */ "kern.ipc.sema", "kern.ipc.seminfo", "kern.ipc.msginfo", "kern.ipc.msqids", "kern.ipc.shminfo", "kern.ipc.shmsegs" }; assert((unsigned)idx <= sizeof(sym2sysctl) / sizeof(*sym2sysctl)); if (!use_sysctl) { symn = symbols[idx].n_name; if (*symn == '_') symn++; if (symbols[idx].n_type == 0 || symbols[idx].n_value == 0) errx(1, "symbol %s undefined", symn); /* * For some symbols, the value we retrieve is * actually a pointer; since we want the actual value, * we have to manually dereference it. */ switch (idx) { case X_MSQIDS: tsiz = sizeof(msqids); rv = kvm_read(kd, symbols[idx].n_value, &msqids, tsiz); kaddr = (u_long)msqids; break; case X_SHMSEGS: tsiz = sizeof(shmsegs); rv = kvm_read(kd, symbols[idx].n_value, &shmsegs, tsiz); kaddr = (u_long)shmsegs; break; case X_SEMA: tsiz = sizeof(sema); rv = kvm_read(kd, symbols[idx].n_value, &sema, tsiz); kaddr = (u_long)sema; break; default: rv = tsiz = 0; kaddr = symbols[idx].n_value; break; } if ((unsigned)rv != tsiz) errx(1, "%s: %s", symn, kvm_geterr(kd)); if ((unsigned)kvm_read(kd, kaddr, addr, size) != size) errx(1, "%s: %s", symn, kvm_geterr(kd)); } else { switch (idx) { case X_SHMINFO: sysctlgatherstruct(addr, size, shminfo_scgsv); break; case X_SEMINFO: sysctlgatherstruct(addr, size, seminfo_scgsv); break; case X_MSGINFO: sysctlgatherstruct(addr, size, msginfo_scgsv); break; default: tsiz = size; rv = sysctlbyname(sym2sysctl[idx], addr, &tsiz, NULL, 0); if (rv == -1) err(1, "sysctlbyname: %s", sym2sysctl[idx]); if (tsiz != size) errx(1, "%s size mismatch " "(expected %zu, got %zu)", sym2sysctl[idx], size, tsiz); break; } } } Index: user/ngie/more-tests/usr.bin/ipcs/ipc.h =================================================================== --- user/ngie/more-tests/usr.bin/ipcs/ipc.h (revision 281584) +++ user/ngie/more-tests/usr.bin/ipcs/ipc.h (revision 281585) @@ -1,71 +1,68 @@ /* * Copyright (c) 1994 SigmaSoft, Th. Lockert * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. The name of the author may not be used to endorse or promote products * derived from this software without specific prior written permission. * * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, * INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY * AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL * THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, * EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, * PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; * OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, * WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR * OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF * ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. * * The split of ipcs.c into ipcs.c and ipc.c to accommodate the * changes in ipcrm.c was done by Edwin Groothuis * * $FreeBSD$ */ /* Part of struct nlist symbols[] */ #define X_SEMA 0 #define X_SEMINFO 1 #define X_MSGINFO 2 #define X_MSQIDS 3 #define X_SHMINFO 4 #define X_SHMSEGS 5 #define SHMINFO 1 #define SHMTOTAL 2 #define MSGINFO 4 #define MSGTOTAL 8 #define SEMINFO 16 #define SEMTOTAL 32 #define IPC_TO_STR(x) (x == 'Q' ? "msq" : (x == 'M' ? "shm" : "sem")) #define IPC_TO_STRING(x) (x == 'Q' ? "message queue" : \ (x == 'M' ? "shared memory segment" : "semaphore")) /* SysCtlGatherStruct structure. */ struct scgs_vector { const char *sysctl; size_t offset; size_t size; }; void kget(int idx, void *addr, size_t size); void sysctlgatherstruct(void *addr, size_t size, struct scgs_vector *vec); extern int use_sysctl; extern struct nlist symbols[]; extern kvm_t *kd; extern struct semid_kernel *sema; -extern struct seminfo seminfo; -extern struct msginfo msginfo; extern struct msqid_kernel *msqids; -extern struct shminfo shminfo; extern struct shmid_kernel *shmsegs; Index: user/ngie/more-tests/usr.bin/lockf/lockf.c =================================================================== --- user/ngie/more-tests/usr.bin/lockf/lockf.c (revision 281584) +++ user/ngie/more-tests/usr.bin/lockf/lockf.c (revision 281585) @@ -1,241 +1,241 @@ /* * Copyright (C) 1997 John D. Polstra. All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY JOHN D. POLSTRA AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL JOHN D. POLSTRA OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ #include __FBSDID("$FreeBSD$"); #include #include #include #include #include #include #include #include #include #include static int acquire_lock(const char *name, int flags); static void cleanup(void); static void killed(int sig); static void timeout(int sig); static void usage(void); static void wait_for_lock(const char *name); static const char *lockname; static int lockfd = -1; static int keep; static volatile sig_atomic_t timed_out; /* * Execute an arbitrary command while holding a file lock. */ int main(int argc, char **argv) { int ch, flags, silent, status, waitsec; pid_t child; silent = keep = 0; flags = O_CREAT; waitsec = -1; /* Infinite. */ while ((ch = getopt(argc, argv, "sknt:")) != -1) { switch (ch) { case 'k': keep = 1; break; case 'n': flags &= ~O_CREAT; break; case 's': silent = 1; break; case 't': { char *endptr; waitsec = strtol(optarg, &endptr, 0); if (*optarg == '\0' || *endptr != '\0' || waitsec < 0) errx(EX_USAGE, "invalid timeout \"%s\"", optarg); } break; default: usage(); } } if (argc - optind < 2) usage(); lockname = argv[optind++]; argc -= optind; argv += optind; if (waitsec > 0) { /* Set up a timeout. */ struct sigaction act; act.sa_handler = timeout; sigemptyset(&act.sa_mask); act.sa_flags = 0; /* Note that we do not set SA_RESTART. */ sigaction(SIGALRM, &act, NULL); alarm(waitsec); } /* * If the "-k" option is not given, then we must not block when * acquiring the lock. If we did, then the lock holder would * unlink the file upon releasing the lock, and we would acquire * a lock on a file with no directory entry. Then another * process could come along and acquire the same lock. To avoid * this problem, we separate out the actions of waiting for the * lock to be available and of actually acquiring the lock. * * That approach produces behavior that is technically correct; * however, it causes some performance & ordering problems for * locks that have a lot of contention. First, it is unfair in * the sense that a released lock isn't necessarily granted to * the process that has been waiting the longest. A waiter may * be starved out indefinitely. Second, it creates a thundering * herd situation each time the lock is released. * * When the "-k" option is used, the unlink race no longer * exists. In that case we can block while acquiring the lock, * avoiding the separate step of waiting for the lock. This * yields fairness and improved performance. */ lockfd = acquire_lock(lockname, flags | O_NONBLOCK); while (lockfd == -1 && !timed_out && waitsec != 0) { if (keep) lockfd = acquire_lock(lockname, flags); else { wait_for_lock(lockname); lockfd = acquire_lock(lockname, flags | O_NONBLOCK); } } if (waitsec > 0) alarm(0); if (lockfd == -1) { /* We failed to acquire the lock. */ if (silent) exit(EX_TEMPFAIL); errx(EX_TEMPFAIL, "%s: already locked", lockname); } /* At this point, we own the lock. */ if (atexit(cleanup) == -1) err(EX_OSERR, "atexit failed"); if ((child = fork()) == -1) err(EX_OSERR, "cannot fork"); if (child == 0) { /* The child process. */ close(lockfd); execvp(argv[0], argv); warn("%s", argv[0]); _exit(1); } /* This is the parent process. */ signal(SIGINT, SIG_IGN); signal(SIGQUIT, SIG_IGN); signal(SIGTERM, killed); if (waitpid(child, &status, 0) == -1) err(EX_OSERR, "waitpid failed"); return (WIFEXITED(status) ? WEXITSTATUS(status) : EX_SOFTWARE); } /* * Try to acquire a lock on the given file, creating the file if * necessary. The flags argument is O_NONBLOCK or 0, depending on * whether we should wait for the lock. Returns an open file descriptor * on success, or -1 on failure. */ static int acquire_lock(const char *name, int flags) { int fd; - if ((fd = open(name, flags|O_RDONLY|O_EXLOCK|flags, 0666)) == -1) { + if ((fd = open(name, O_RDONLY|O_EXLOCK|flags, 0666)) == -1) { if (errno == EAGAIN || errno == EINTR) return (-1); err(EX_CANTCREAT, "cannot open %s", name); } return (fd); } /* * Remove the lock file. */ static void cleanup(void) { if (keep) flock(lockfd, LOCK_UN); else unlink(lockname); } /* * Signal handler for SIGTERM. Cleans up the lock file, then re-raises * the signal. */ static void killed(int sig) { cleanup(); signal(sig, SIG_DFL); if (kill(getpid(), sig) == -1) err(EX_OSERR, "kill failed"); } /* * Signal handler for SIGALRM. */ static void timeout(int sig __unused) { timed_out = 1; } static void usage(void) { fprintf(stderr, "usage: lockf [-kns] [-t seconds] file command [arguments]\n"); exit(EX_USAGE); } /* * Wait until it might be possible to acquire a lock on the given file. * If the file does not exist, return immediately without creating it. */ static void wait_for_lock(const char *name) { int fd; if ((fd = open(name, O_RDONLY|O_EXLOCK, 0666)) == -1) { if (errno == ENOENT || errno == EINTR) return; err(EX_CANTCREAT, "cannot open %s", name); } close(fd); } Index: user/ngie/more-tests/usr.sbin/bhyve/bhyverun.c =================================================================== --- user/ngie/more-tests/usr.sbin/bhyve/bhyverun.c (revision 281584) +++ user/ngie/more-tests/usr.sbin/bhyve/bhyverun.c (revision 281585) @@ -1,885 +1,887 @@ /*- * Copyright (c) 2011 NetApp, Inc. * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY NETAPP, INC ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL NETAPP, INC OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * $FreeBSD$ */ #include __FBSDID("$FreeBSD$"); #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include "bhyverun.h" #include "acpi.h" #include "inout.h" #include "dbgport.h" #include "ioapic.h" #include "mem.h" #include "mevent.h" #include "mptbl.h" #include "pci_emul.h" #include "pci_irq.h" #include "pci_lpc.h" #include "smbiostbl.h" #include "xmsr.h" #include "spinup_ap.h" #include "rtc.h" #define GUEST_NIO_PORT 0x488 /* guest upcalls via i/o port */ #define MB (1024UL * 1024) #define GB (1024UL * MB) typedef int (*vmexit_handler_t)(struct vmctx *, struct vm_exit *, int *vcpu); extern int vmexit_task_switch(struct vmctx *, struct vm_exit *, int *vcpu); char *vmname; int guest_ncpus; char *guest_uuid_str; static int guest_vmexit_on_hlt, guest_vmexit_on_pause; static int virtio_msix = 1; static int x2apic_mode = 0; /* default is xAPIC */ static int strictio; static int strictmsr = 1; static int acpi; static char *progname; static const int BSP = 0; static cpuset_t cpumask; static void vm_loop(struct vmctx *ctx, int vcpu, uint64_t rip); static struct vm_exit vmexit[VM_MAXCPU]; struct bhyvestats { uint64_t vmexit_bogus; uint64_t vmexit_bogus_switch; uint64_t vmexit_hlt; uint64_t vmexit_pause; uint64_t vmexit_mtrap; uint64_t vmexit_inst_emul; uint64_t cpu_switch_rotate; uint64_t cpu_switch_direct; } stats; struct mt_vmm_info { pthread_t mt_thr; struct vmctx *mt_ctx; int mt_vcpu; } mt_vmm_info[VM_MAXCPU]; static cpuset_t *vcpumap[VM_MAXCPU] = { NULL }; static void usage(int code) { fprintf(stderr, "Usage: %s [-abehuwxACHPWY] [-c vcpus] [-g ] [-l ]\n" " %*s [-m mem] [-p vcpu:hostcpu] [-s ] [-U uuid] \n" " -a: local apic is in xAPIC mode (deprecated)\n" " -A: create ACPI tables\n" " -c: # cpus (default 1)\n" " -C: include guest memory in core file\n" " -e: exit on unhandled I/O access\n" " -g: gdb port\n" " -h: help\n" " -H: vmexit from the guest on hlt\n" " -l: LPC device configuration\n" " -m: memory size in MB\n" " -p: pin 'vcpu' to 'hostcpu'\n" " -P: vmexit from the guest on pause\n" " -s: PCI slot config\n" " -u: RTC keeps UTC time\n" " -U: uuid\n" " -w: ignore unimplemented MSRs\n" " -W: force virtio to use single-vector MSI\n" " -x: local apic is in x2APIC mode\n" " -Y: disable MPtable generation\n", progname, (int)strlen(progname), ""); exit(code); } static int pincpu_parse(const char *opt) { int vcpu, pcpu; if (sscanf(opt, "%d:%d", &vcpu, &pcpu) != 2) { fprintf(stderr, "invalid format: %s\n", opt); return (-1); } if (vcpu < 0 || vcpu >= VM_MAXCPU) { fprintf(stderr, "vcpu '%d' outside valid range from 0 to %d\n", vcpu, VM_MAXCPU - 1); return (-1); } if (pcpu < 0 || pcpu >= CPU_SETSIZE) { fprintf(stderr, "hostcpu '%d' outside valid range from " "0 to %d\n", pcpu, CPU_SETSIZE - 1); return (-1); } if (vcpumap[vcpu] == NULL) { if ((vcpumap[vcpu] = malloc(sizeof(cpuset_t))) == NULL) { perror("malloc"); return (-1); } CPU_ZERO(vcpumap[vcpu]); } CPU_SET(pcpu, vcpumap[vcpu]); return (0); } void vm_inject_fault(void *arg, int vcpu, int vector, int errcode_valid, int errcode) { struct vmctx *ctx; int error, restart_instruction; ctx = arg; restart_instruction = 1; error = vm_inject_exception(ctx, vcpu, vector, errcode_valid, errcode, restart_instruction); assert(error == 0); } void * paddr_guest2host(struct vmctx *ctx, uintptr_t gaddr, size_t len) { return (vm_map_gpa(ctx, gaddr, len)); } int fbsdrun_vmexit_on_pause(void) { return (guest_vmexit_on_pause); } int fbsdrun_vmexit_on_hlt(void) { return (guest_vmexit_on_hlt); } int fbsdrun_virtio_msix(void) { return (virtio_msix); } static void * fbsdrun_start_thread(void *param) { char tname[MAXCOMLEN + 1]; struct mt_vmm_info *mtp; int vcpu; mtp = param; vcpu = mtp->mt_vcpu; snprintf(tname, sizeof(tname), "vcpu %d", vcpu); pthread_set_name_np(mtp->mt_thr, tname); vm_loop(mtp->mt_ctx, vcpu, vmexit[vcpu].rip); /* not reached */ exit(1); return (NULL); } void fbsdrun_addcpu(struct vmctx *ctx, int fromcpu, int newcpu, uint64_t rip) { int error; assert(fromcpu == BSP); /* * The 'newcpu' must be activated in the context of 'fromcpu'. If * vm_activate_cpu() is delayed until newcpu's pthread starts running * then vmm.ko is out-of-sync with bhyve and this can create a race * with vm_suspend(). */ error = vm_activate_cpu(ctx, newcpu); assert(error == 0); CPU_SET_ATOMIC(newcpu, &cpumask); /* * Set up the vmexit struct to allow execution to start * at the given RIP */ vmexit[newcpu].rip = rip; vmexit[newcpu].inst_length = 0; mt_vmm_info[newcpu].mt_ctx = ctx; mt_vmm_info[newcpu].mt_vcpu = newcpu; error = pthread_create(&mt_vmm_info[newcpu].mt_thr, NULL, fbsdrun_start_thread, &mt_vmm_info[newcpu]); assert(error == 0); } static int fbsdrun_deletecpu(struct vmctx *ctx, int vcpu) { if (!CPU_ISSET(vcpu, &cpumask)) { fprintf(stderr, "Attempting to delete unknown cpu %d\n", vcpu); exit(1); } CPU_CLR_ATOMIC(vcpu, &cpumask); return (CPU_EMPTY(&cpumask)); } static int vmexit_handle_notify(struct vmctx *ctx, struct vm_exit *vme, int *pvcpu, uint32_t eax) { #if BHYVE_DEBUG /* * put guest-driven debug here */ #endif return (VMEXIT_CONTINUE); } static int vmexit_inout(struct vmctx *ctx, struct vm_exit *vme, int *pvcpu) { int error; int bytes, port, in, out, string; int vcpu; vcpu = *pvcpu; port = vme->u.inout.port; bytes = vme->u.inout.bytes; string = vme->u.inout.string; in = vme->u.inout.in; out = !in; /* Extra-special case of host notifications */ if (out && port == GUEST_NIO_PORT) { error = vmexit_handle_notify(ctx, vme, pvcpu, vme->u.inout.eax); return (error); } error = emulate_inout(ctx, vcpu, vme, strictio); if (error) { - fprintf(stderr, "Unhandled %s%c 0x%04x\n", in ? "in" : "out", - bytes == 1 ? 'b' : (bytes == 2 ? 'w' : 'l'), port); + fprintf(stderr, "Unhandled %s%c 0x%04x at 0x%lx\n", + in ? "in" : "out", + bytes == 1 ? 'b' : (bytes == 2 ? 'w' : 'l'), + port, vmexit->rip); return (VMEXIT_ABORT); } else { return (VMEXIT_CONTINUE); } } static int vmexit_rdmsr(struct vmctx *ctx, struct vm_exit *vme, int *pvcpu) { uint64_t val; uint32_t eax, edx; int error; val = 0; error = emulate_rdmsr(ctx, *pvcpu, vme->u.msr.code, &val); if (error != 0) { fprintf(stderr, "rdmsr to register %#x on vcpu %d\n", vme->u.msr.code, *pvcpu); if (strictmsr) { vm_inject_gp(ctx, *pvcpu); return (VMEXIT_CONTINUE); } } eax = val; error = vm_set_register(ctx, *pvcpu, VM_REG_GUEST_RAX, eax); assert(error == 0); edx = val >> 32; error = vm_set_register(ctx, *pvcpu, VM_REG_GUEST_RDX, edx); assert(error == 0); return (VMEXIT_CONTINUE); } static int vmexit_wrmsr(struct vmctx *ctx, struct vm_exit *vme, int *pvcpu) { int error; error = emulate_wrmsr(ctx, *pvcpu, vme->u.msr.code, vme->u.msr.wval); if (error != 0) { fprintf(stderr, "wrmsr to register %#x(%#lx) on vcpu %d\n", vme->u.msr.code, vme->u.msr.wval, *pvcpu); if (strictmsr) { vm_inject_gp(ctx, *pvcpu); return (VMEXIT_CONTINUE); } } return (VMEXIT_CONTINUE); } static int vmexit_spinup_ap(struct vmctx *ctx, struct vm_exit *vme, int *pvcpu) { int newcpu; int retval = VMEXIT_CONTINUE; newcpu = spinup_ap(ctx, *pvcpu, vme->u.spinup_ap.vcpu, vme->u.spinup_ap.rip); return (retval); } #define DEBUG_EPT_MISCONFIG #ifdef DEBUG_EPT_MISCONFIG #define EXIT_REASON_EPT_MISCONFIG 49 #define VMCS_GUEST_PHYSICAL_ADDRESS 0x00002400 #define VMCS_IDENT(x) ((x) | 0x80000000) static uint64_t ept_misconfig_gpa, ept_misconfig_pte[4]; static int ept_misconfig_ptenum; #endif static int vmexit_vmx(struct vmctx *ctx, struct vm_exit *vmexit, int *pvcpu) { fprintf(stderr, "vm exit[%d]\n", *pvcpu); fprintf(stderr, "\treason\t\tVMX\n"); fprintf(stderr, "\trip\t\t0x%016lx\n", vmexit->rip); fprintf(stderr, "\tinst_length\t%d\n", vmexit->inst_length); fprintf(stderr, "\tstatus\t\t%d\n", vmexit->u.vmx.status); fprintf(stderr, "\texit_reason\t%u\n", vmexit->u.vmx.exit_reason); fprintf(stderr, "\tqualification\t0x%016lx\n", vmexit->u.vmx.exit_qualification); fprintf(stderr, "\tinst_type\t\t%d\n", vmexit->u.vmx.inst_type); fprintf(stderr, "\tinst_error\t\t%d\n", vmexit->u.vmx.inst_error); #ifdef DEBUG_EPT_MISCONFIG if (vmexit->u.vmx.exit_reason == EXIT_REASON_EPT_MISCONFIG) { vm_get_register(ctx, *pvcpu, VMCS_IDENT(VMCS_GUEST_PHYSICAL_ADDRESS), &ept_misconfig_gpa); vm_get_gpa_pmap(ctx, ept_misconfig_gpa, ept_misconfig_pte, &ept_misconfig_ptenum); fprintf(stderr, "\tEPT misconfiguration:\n"); fprintf(stderr, "\t\tGPA: %#lx\n", ept_misconfig_gpa); fprintf(stderr, "\t\tPTE(%d): %#lx %#lx %#lx %#lx\n", ept_misconfig_ptenum, ept_misconfig_pte[0], ept_misconfig_pte[1], ept_misconfig_pte[2], ept_misconfig_pte[3]); } #endif /* DEBUG_EPT_MISCONFIG */ return (VMEXIT_ABORT); } static int vmexit_svm(struct vmctx *ctx, struct vm_exit *vmexit, int *pvcpu) { fprintf(stderr, "vm exit[%d]\n", *pvcpu); fprintf(stderr, "\treason\t\tSVM\n"); fprintf(stderr, "\trip\t\t0x%016lx\n", vmexit->rip); fprintf(stderr, "\tinst_length\t%d\n", vmexit->inst_length); fprintf(stderr, "\texitcode\t%#lx\n", vmexit->u.svm.exitcode); fprintf(stderr, "\texitinfo1\t%#lx\n", vmexit->u.svm.exitinfo1); fprintf(stderr, "\texitinfo2\t%#lx\n", vmexit->u.svm.exitinfo2); return (VMEXIT_ABORT); } static int vmexit_bogus(struct vmctx *ctx, struct vm_exit *vmexit, int *pvcpu) { assert(vmexit->inst_length == 0); stats.vmexit_bogus++; return (VMEXIT_CONTINUE); } static int vmexit_hlt(struct vmctx *ctx, struct vm_exit *vmexit, int *pvcpu) { stats.vmexit_hlt++; /* * Just continue execution with the next instruction. We use * the HLT VM exit as a way to be friendly with the host * scheduler. */ return (VMEXIT_CONTINUE); } static int vmexit_pause(struct vmctx *ctx, struct vm_exit *vmexit, int *pvcpu) { stats.vmexit_pause++; return (VMEXIT_CONTINUE); } static int vmexit_mtrap(struct vmctx *ctx, struct vm_exit *vmexit, int *pvcpu) { assert(vmexit->inst_length == 0); stats.vmexit_mtrap++; return (VMEXIT_CONTINUE); } static int vmexit_inst_emul(struct vmctx *ctx, struct vm_exit *vmexit, int *pvcpu) { int err, i; struct vie *vie; stats.vmexit_inst_emul++; vie = &vmexit->u.inst_emul.vie; err = emulate_mem(ctx, *pvcpu, vmexit->u.inst_emul.gpa, vie, &vmexit->u.inst_emul.paging); if (err) { if (err == ESRCH) { fprintf(stderr, "Unhandled memory access to 0x%lx\n", vmexit->u.inst_emul.gpa); } fprintf(stderr, "Failed to emulate instruction ["); for (i = 0; i < vie->num_valid; i++) { fprintf(stderr, "0x%02x%s", vie->inst[i], i != (vie->num_valid - 1) ? " " : ""); } fprintf(stderr, "] at 0x%lx\n", vmexit->rip); return (VMEXIT_ABORT); } return (VMEXIT_CONTINUE); } static pthread_mutex_t resetcpu_mtx = PTHREAD_MUTEX_INITIALIZER; static pthread_cond_t resetcpu_cond = PTHREAD_COND_INITIALIZER; static int vmexit_suspend(struct vmctx *ctx, struct vm_exit *vmexit, int *pvcpu) { enum vm_suspend_how how; how = vmexit->u.suspended.how; fbsdrun_deletecpu(ctx, *pvcpu); if (*pvcpu != BSP) { pthread_mutex_lock(&resetcpu_mtx); pthread_cond_signal(&resetcpu_cond); pthread_mutex_unlock(&resetcpu_mtx); pthread_exit(NULL); } pthread_mutex_lock(&resetcpu_mtx); while (!CPU_EMPTY(&cpumask)) { pthread_cond_wait(&resetcpu_cond, &resetcpu_mtx); } pthread_mutex_unlock(&resetcpu_mtx); switch (how) { case VM_SUSPEND_RESET: exit(0); case VM_SUSPEND_POWEROFF: exit(1); case VM_SUSPEND_HALT: exit(2); case VM_SUSPEND_TRIPLEFAULT: exit(3); default: fprintf(stderr, "vmexit_suspend: invalid reason %d\n", how); exit(100); } return (0); /* NOTREACHED */ } static vmexit_handler_t handler[VM_EXITCODE_MAX] = { [VM_EXITCODE_INOUT] = vmexit_inout, [VM_EXITCODE_INOUT_STR] = vmexit_inout, [VM_EXITCODE_VMX] = vmexit_vmx, [VM_EXITCODE_SVM] = vmexit_svm, [VM_EXITCODE_BOGUS] = vmexit_bogus, [VM_EXITCODE_RDMSR] = vmexit_rdmsr, [VM_EXITCODE_WRMSR] = vmexit_wrmsr, [VM_EXITCODE_MTRAP] = vmexit_mtrap, [VM_EXITCODE_INST_EMUL] = vmexit_inst_emul, [VM_EXITCODE_SPINUP_AP] = vmexit_spinup_ap, [VM_EXITCODE_SUSPENDED] = vmexit_suspend, [VM_EXITCODE_TASK_SWITCH] = vmexit_task_switch, }; static void vm_loop(struct vmctx *ctx, int vcpu, uint64_t startrip) { int error, rc, prevcpu; enum vm_exitcode exitcode; cpuset_t active_cpus; if (vcpumap[vcpu] != NULL) { error = pthread_setaffinity_np(pthread_self(), sizeof(cpuset_t), vcpumap[vcpu]); assert(error == 0); } error = vm_active_cpus(ctx, &active_cpus); assert(CPU_ISSET(vcpu, &active_cpus)); error = vm_set_register(ctx, vcpu, VM_REG_GUEST_RIP, startrip); assert(error == 0); while (1) { error = vm_run(ctx, vcpu, &vmexit[vcpu]); if (error != 0) break; prevcpu = vcpu; exitcode = vmexit[vcpu].exitcode; if (exitcode >= VM_EXITCODE_MAX || handler[exitcode] == NULL) { fprintf(stderr, "vm_loop: unexpected exitcode 0x%x\n", exitcode); exit(1); } rc = (*handler[exitcode])(ctx, &vmexit[vcpu], &vcpu); switch (rc) { case VMEXIT_CONTINUE: break; case VMEXIT_ABORT: abort(); default: exit(1); } } fprintf(stderr, "vm_run error %d, errno %d\n", error, errno); } static int num_vcpus_allowed(struct vmctx *ctx) { int tmp, error; error = vm_get_capability(ctx, BSP, VM_CAP_UNRESTRICTED_GUEST, &tmp); /* * The guest is allowed to spinup more than one processor only if the * UNRESTRICTED_GUEST capability is available. */ if (error == 0) return (VM_MAXCPU); else return (1); } void fbsdrun_set_capabilities(struct vmctx *ctx, int cpu) { int err, tmp; if (fbsdrun_vmexit_on_hlt()) { err = vm_get_capability(ctx, cpu, VM_CAP_HALT_EXIT, &tmp); if (err < 0) { fprintf(stderr, "VM exit on HLT not supported\n"); exit(1); } vm_set_capability(ctx, cpu, VM_CAP_HALT_EXIT, 1); if (cpu == BSP) handler[VM_EXITCODE_HLT] = vmexit_hlt; } if (fbsdrun_vmexit_on_pause()) { /* * pause exit support required for this mode */ err = vm_get_capability(ctx, cpu, VM_CAP_PAUSE_EXIT, &tmp); if (err < 0) { fprintf(stderr, "SMP mux requested, no pause support\n"); exit(1); } vm_set_capability(ctx, cpu, VM_CAP_PAUSE_EXIT, 1); if (cpu == BSP) handler[VM_EXITCODE_PAUSE] = vmexit_pause; } if (x2apic_mode) err = vm_set_x2apic_state(ctx, cpu, X2APIC_ENABLED); else err = vm_set_x2apic_state(ctx, cpu, X2APIC_DISABLED); if (err) { fprintf(stderr, "Unable to set x2apic state (%d)\n", err); exit(1); } vm_set_capability(ctx, cpu, VM_CAP_ENABLE_INVPCID, 1); } int main(int argc, char *argv[]) { int c, error, gdb_port, err, bvmcons; int dump_guest_memory, max_vcpus, mptgen; int rtc_localtime; struct vmctx *ctx; uint64_t rip; size_t memsize; bvmcons = 0; dump_guest_memory = 0; progname = basename(argv[0]); gdb_port = 0; guest_ncpus = 1; memsize = 256 * MB; mptgen = 1; rtc_localtime = 1; while ((c = getopt(argc, argv, "abehuwxACHIPWYp:g:c:s:m:l:U:")) != -1) { switch (c) { case 'a': x2apic_mode = 0; break; case 'A': acpi = 1; break; case 'b': bvmcons = 1; break; case 'p': if (pincpu_parse(optarg) != 0) { errx(EX_USAGE, "invalid vcpu pinning " "configuration '%s'", optarg); } break; case 'c': guest_ncpus = atoi(optarg); break; case 'C': dump_guest_memory = 1; break; case 'g': gdb_port = atoi(optarg); break; case 'l': if (lpc_device_parse(optarg) != 0) { errx(EX_USAGE, "invalid lpc device " "configuration '%s'", optarg); } break; case 's': if (pci_parse_slot(optarg) != 0) exit(1); else break; case 'm': error = vm_parse_memsize(optarg, &memsize); if (error) errx(EX_USAGE, "invalid memsize '%s'", optarg); break; case 'H': guest_vmexit_on_hlt = 1; break; case 'I': /* * The "-I" option was used to add an ioapic to the * virtual machine. * * An ioapic is now provided unconditionally for each * virtual machine and this option is now deprecated. */ break; case 'P': guest_vmexit_on_pause = 1; break; case 'e': strictio = 1; break; case 'u': rtc_localtime = 0; break; case 'U': guest_uuid_str = optarg; break; case 'w': strictmsr = 0; break; case 'W': virtio_msix = 0; break; case 'x': x2apic_mode = 1; break; case 'Y': mptgen = 0; break; case 'h': usage(0); default: usage(1); } } argc -= optind; argv += optind; if (argc != 1) usage(1); vmname = argv[0]; ctx = vm_open(vmname); if (ctx == NULL) { perror("vm_open"); exit(1); } max_vcpus = num_vcpus_allowed(ctx); if (guest_ncpus > max_vcpus) { fprintf(stderr, "%d vCPUs requested but only %d available\n", guest_ncpus, max_vcpus); exit(1); } fbsdrun_set_capabilities(ctx, BSP); if (dump_guest_memory) vm_set_memflags(ctx, VM_MEM_F_INCORE); err = vm_setup_memory(ctx, memsize, VM_MMAP_ALL); if (err) { fprintf(stderr, "Unable to setup memory (%d)\n", err); exit(1); } error = init_msr(); if (error) { fprintf(stderr, "init_msr error %d", error); exit(1); } init_mem(); init_inout(); pci_irq_init(ctx); ioapic_init(ctx); rtc_init(ctx, rtc_localtime); sci_init(ctx); /* * Exit if a device emulation finds an error in it's initilization */ if (init_pci(ctx) != 0) exit(1); if (gdb_port != 0) init_dbgport(gdb_port); if (bvmcons) init_bvmcons(); error = vm_get_register(ctx, BSP, VM_REG_GUEST_RIP, &rip); assert(error == 0); /* * build the guest tables, MP etc. */ if (mptgen) { error = mptable_build(ctx, guest_ncpus); if (error) exit(1); } error = smbios_build(ctx); assert(error == 0); if (acpi) { error = acpi_build(ctx, guest_ncpus); assert(error == 0); } /* * Change the proc title to include the VM name. */ setproctitle("%s", vmname); /* * Add CPU 0 */ fbsdrun_addcpu(ctx, BSP, BSP, rip); /* * Head off to the main event dispatch loop */ mevent_dispatch(); exit(1); } Index: user/ngie/more-tests/usr.sbin/bhyve =================================================================== --- user/ngie/more-tests/usr.sbin/bhyve (revision 281584) +++ user/ngie/more-tests/usr.sbin/bhyve (revision 281585) Property changes on: user/ngie/more-tests/usr.sbin/bhyve ___________________________________________________________________ Modified: svn:mergeinfo ## -0,0 +0,1 ## Merged /head/usr.sbin/bhyve:r281414-281584 Index: user/ngie/more-tests/usr.sbin/bhyvectl/bhyvectl.c =================================================================== --- user/ngie/more-tests/usr.sbin/bhyvectl/bhyvectl.c (revision 281584) +++ user/ngie/more-tests/usr.sbin/bhyvectl/bhyvectl.c (revision 281585) @@ -1,2142 +1,2142 @@ /*- * Copyright (c) 2011 NetApp, Inc. * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY NETAPP, INC ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL NETAPP, INC OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * $FreeBSD$ */ #include __FBSDID("$FreeBSD$"); #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include "amd/vmcb.h" #include "intel/vmcs.h" #define MB (1UL << 20) #define GB (1UL << 30) #define REQ_ARG required_argument #define NO_ARG no_argument #define OPT_ARG optional_argument static const char *progname; static void usage(bool cpu_intel) { (void)fprintf(stderr, "Usage: %s --vm=\n" " [--cpu=]\n" " [--create]\n" " [--destroy]\n" " [--get-all]\n" " [--get-stats]\n" " [--set-desc-ds]\n" " [--get-desc-ds]\n" " [--set-desc-es]\n" " [--get-desc-es]\n" " [--set-desc-gs]\n" " [--get-desc-gs]\n" " [--set-desc-fs]\n" " [--get-desc-fs]\n" " [--set-desc-cs]\n" " [--get-desc-cs]\n" " [--set-desc-ss]\n" " [--get-desc-ss]\n" " [--set-desc-tr]\n" " [--get-desc-tr]\n" " [--set-desc-ldtr]\n" " [--get-desc-ldtr]\n" " [--set-desc-gdtr]\n" " [--get-desc-gdtr]\n" " [--set-desc-idtr]\n" " [--get-desc-idtr]\n" " [--run]\n" " [--capname=]\n" " [--getcap]\n" " [--setcap=<0|1>]\n" " [--desc-base=]\n" " [--desc-limit=]\n" " [--desc-access=]\n" " [--set-cr0=]\n" " [--get-cr0]\n" " [--set-cr3=]\n" " [--get-cr3]\n" " [--set-cr4=]\n" " [--get-cr4]\n" " [--set-dr7=]\n" " [--get-dr7]\n" " [--set-rsp=]\n" " [--get-rsp]\n" " [--set-rip=]\n" " [--get-rip]\n" " [--get-rax]\n" " [--set-rax=]\n" " [--get-rbx]\n" " [--get-rcx]\n" " [--get-rdx]\n" " [--get-rsi]\n" " [--get-rdi]\n" " [--get-rbp]\n" " [--get-r8]\n" " [--get-r9]\n" " [--get-r10]\n" " [--get-r11]\n" " [--get-r12]\n" " [--get-r13]\n" " [--get-r14]\n" " [--get-r15]\n" " [--set-rflags=]\n" " [--get-rflags]\n" " [--set-cs]\n" " [--get-cs]\n" " [--set-ds]\n" " [--get-ds]\n" " [--set-es]\n" " [--get-es]\n" " [--set-fs]\n" " [--get-fs]\n" " [--set-gs]\n" " [--get-gs]\n" " [--set-ss]\n" " [--get-ss]\n" " [--get-tr]\n" " [--get-ldtr]\n" " [--set-x2apic-state=]\n" " [--get-x2apic-state]\n" " [--unassign-pptdev=]\n" " [--set-mem=]\n" " [--get-lowmem]\n" " [--get-highmem]\n" " [--get-gpa-pmap]\n" " [--assert-lapic-lvt=]\n" " [--inject-nmi]\n" " [--force-reset]\n" " [--force-poweroff]\n" " [--get-rtc-time]\n" " [--set-rtc-time=]\n" " [--get-rtc-nvram]\n" " [--set-rtc-nvram=]\n" " [--rtc-nvram-offset=]\n" " [--get-active-cpus]\n" " [--get-suspended-cpus]\n" " [--get-intinfo]\n" " [--get-eptp]\n" " [--set-exception-bitmap]\n" " [--get-exception-bitmap]\n" " [--get-tsc-offset]\n" " [--get-guest-pat]\n" " [--get-io-bitmap-address]\n" " [--get-msr-bitmap]\n" " [--get-msr-bitmap-address]\n" " [--get-guest-sysenter]\n" " [--get-exit-reason]\n", progname); if (cpu_intel) { (void)fprintf(stderr, " [--get-vmcs-pinbased-ctls]\n" " [--get-vmcs-procbased-ctls]\n" " [--get-vmcs-procbased-ctls2]\n" " [--get-vmcs-entry-interruption-info]\n" " [--set-vmcs-entry-interruption-info=]\n" " [--get-vmcs-guest-physical-address\n" " [--get-vmcs-guest-linear-address\n" " [--get-vmcs-host-pat]\n" " [--get-vmcs-host-cr0]\n" " [--get-vmcs-host-cr3]\n" " [--get-vmcs-host-cr4]\n" " [--get-vmcs-host-rip]\n" " [--get-vmcs-host-rsp]\n" " [--get-vmcs-cr0-mask]\n" " [--get-vmcs-cr0-shadow]\n" " [--get-vmcs-cr4-mask]\n" " [--get-vmcs-cr4-shadow]\n" " [--get-vmcs-cr3-targets]\n" " [--get-vmcs-apic-access-address]\n" " [--get-vmcs-virtual-apic-address]\n" " [--get-vmcs-tpr-threshold]\n" " [--get-vmcs-vpid]\n" " [--get-vmcs-instruction-error]\n" " [--get-vmcs-exit-ctls]\n" " [--get-vmcs-entry-ctls]\n" " [--get-vmcs-link]\n" " [--get-vmcs-exit-qualification]\n" " [--get-vmcs-exit-interruption-info]\n" " [--get-vmcs-exit-interruption-error]\n" " [--get-vmcs-interruptibility]\n" ); } else { (void)fprintf(stderr, " [--get-vmcb-intercepts]\n" " [--get-vmcb-asid]\n" " [--get-vmcb-exit-details]\n" " [--get-vmcb-tlb-ctrl]\n" " [--get-vmcb-virq]\n" " [--get-avic-apic-bar]\n" " [--get-avic-backing-page]\n" " [--get-avic-table]\n" ); } exit(1); } static int get_rtc_time, set_rtc_time; static int get_rtc_nvram, set_rtc_nvram; static int rtc_nvram_offset; static uint8_t rtc_nvram_value; static time_t rtc_secs; static int get_stats, getcap, setcap, capval, get_gpa_pmap; static int inject_nmi, assert_lapic_lvt; static int force_reset, force_poweroff; static const char *capname; static int create, destroy, get_lowmem, get_highmem; static int get_intinfo; static int get_active_cpus, get_suspended_cpus; static uint64_t memsize; static int set_cr0, get_cr0, set_cr3, get_cr3, set_cr4, get_cr4; static int set_efer, get_efer; static int set_dr7, get_dr7; static int set_rsp, get_rsp, set_rip, get_rip, set_rflags, get_rflags; static int set_rax, get_rax; static int get_rbx, get_rcx, get_rdx, get_rsi, get_rdi, get_rbp; static int get_r8, get_r9, get_r10, get_r11, get_r12, get_r13, get_r14, get_r15; static int set_desc_ds, get_desc_ds; static int set_desc_es, get_desc_es; static int set_desc_fs, get_desc_fs; static int set_desc_gs, get_desc_gs; static int set_desc_cs, get_desc_cs; static int set_desc_ss, get_desc_ss; static int set_desc_gdtr, get_desc_gdtr; static int set_desc_idtr, get_desc_idtr; static int set_desc_tr, get_desc_tr; static int set_desc_ldtr, get_desc_ldtr; static int set_cs, set_ds, set_es, set_fs, set_gs, set_ss, set_tr, set_ldtr; static int get_cs, get_ds, get_es, get_fs, get_gs, get_ss, get_tr, get_ldtr; static int set_x2apic_state, get_x2apic_state; enum x2apic_state x2apic_state; static int unassign_pptdev, bus, slot, func; static int run; /* * VMCB specific. */ static int get_vmcb_intercept, get_vmcb_exit_details, get_vmcb_tlb_ctrl; static int get_vmcb_virq, get_avic_table; /* * VMCS-specific fields */ static int get_pinbased_ctls, get_procbased_ctls, get_procbased_ctls2; static int get_eptp, get_io_bitmap, get_tsc_offset; static int get_vmcs_entry_interruption_info, set_vmcs_entry_interruption_info; static int get_vmcs_interruptibility; uint32_t vmcs_entry_interruption_info; static int get_vmcs_gpa, get_vmcs_gla; static int get_exception_bitmap, set_exception_bitmap, exception_bitmap; static int get_cr0_mask, get_cr0_shadow; static int get_cr4_mask, get_cr4_shadow; static int get_cr3_targets; static int get_apic_access_addr, get_virtual_apic_addr, get_tpr_threshold; static int get_msr_bitmap, get_msr_bitmap_address; static int get_vpid_asid; static int get_inst_err, get_exit_ctls, get_entry_ctls; static int get_host_cr0, get_host_cr3, get_host_cr4; static int get_host_rip, get_host_rsp; static int get_guest_pat, get_host_pat; static int get_guest_sysenter, get_vmcs_link; static int get_exit_reason, get_vmcs_exit_qualification; static int get_vmcs_exit_interruption_info, get_vmcs_exit_interruption_error; static uint64_t desc_base; static uint32_t desc_limit, desc_access; static int get_all; static void dump_vm_run_exitcode(struct vm_exit *vmexit, int vcpu) { printf("vm exit[%d]\n", vcpu); printf("\trip\t\t0x%016lx\n", vmexit->rip); printf("\tinst_length\t%d\n", vmexit->inst_length); switch (vmexit->exitcode) { case VM_EXITCODE_INOUT: printf("\treason\t\tINOUT\n"); printf("\tdirection\t%s\n", vmexit->u.inout.in ? "IN" : "OUT"); printf("\tbytes\t\t%d\n", vmexit->u.inout.bytes); printf("\tflags\t\t%s%s\n", vmexit->u.inout.string ? "STRING " : "", vmexit->u.inout.rep ? "REP " : ""); printf("\tport\t\t0x%04x\n", vmexit->u.inout.port); printf("\teax\t\t0x%08x\n", vmexit->u.inout.eax); break; case VM_EXITCODE_VMX: printf("\treason\t\tVMX\n"); printf("\tstatus\t\t%d\n", vmexit->u.vmx.status); printf("\texit_reason\t0x%08x (%u)\n", vmexit->u.vmx.exit_reason, vmexit->u.vmx.exit_reason); printf("\tqualification\t0x%016lx\n", vmexit->u.vmx.exit_qualification); printf("\tinst_type\t\t%d\n", vmexit->u.vmx.inst_type); printf("\tinst_error\t\t%d\n", vmexit->u.vmx.inst_error); break; case VM_EXITCODE_SVM: printf("\treason\t\tSVM\n"); printf("\texit_reason\t\t%#lx\n", vmexit->u.svm.exitcode); printf("\texitinfo1\t\t%#lx\n", vmexit->u.svm.exitinfo1); printf("\texitinfo2\t\t%#lx\n", vmexit->u.svm.exitinfo2); break; default: printf("*** unknown vm run exitcode %d\n", vmexit->exitcode); break; } } /* AMD 6th generation and Intel compatible MSRs */ #define MSR_AMD6TH_START 0xC0000000 #define MSR_AMD6TH_END 0xC0001FFF /* AMD 7th and 8th generation compatible MSRs */ #define MSR_AMD7TH_START 0xC0010000 #define MSR_AMD7TH_END 0xC0011FFF static const char * msr_name(uint32_t msr) { static char buf[32]; switch(msr) { case MSR_TSC: return ("MSR_TSC"); case MSR_EFER: return ("MSR_EFER"); case MSR_STAR: return ("MSR_STAR"); case MSR_LSTAR: return ("MSR_LSTAR"); case MSR_CSTAR: return ("MSR_CSTAR"); case MSR_SF_MASK: return ("MSR_SF_MASK"); case MSR_FSBASE: return ("MSR_FSBASE"); case MSR_GSBASE: return ("MSR_GSBASE"); case MSR_KGSBASE: return ("MSR_KGSBASE"); case MSR_SYSENTER_CS_MSR: return ("MSR_SYSENTER_CS_MSR"); case MSR_SYSENTER_ESP_MSR: return ("MSR_SYSENTER_ESP_MSR"); case MSR_SYSENTER_EIP_MSR: return ("MSR_SYSENTER_EIP_MSR"); case MSR_PAT: return ("MSR_PAT"); } snprintf(buf, sizeof(buf), "MSR %#08x", msr); return (buf); } static inline void print_msr_pm(uint64_t msr, int vcpu, int readable, int writeable) { if (readable || writeable) { printf("%-20s[%d]\t\t%c%c\n", msr_name(msr), vcpu, readable ? 'R' : '-', writeable ? 'W' : '-'); } } /* * Reference APM vol2, section 15.11 MSR Intercepts. */ static void dump_amd_msr_pm(const char *bitmap, int vcpu) { int byte, bit, readable, writeable; uint32_t msr; for (msr = 0; msr < 0x2000; msr++) { byte = msr / 4; bit = (msr % 4) * 2; /* Look at MSRs in the range 0x00000000 to 0x00001FFF */ readable = (bitmap[byte] & (1 << bit)) ? 0 : 1; writeable = (bitmap[byte] & (2 << bit)) ? 0 : 1; print_msr_pm(msr, vcpu, readable, writeable); /* Look at MSRs in the range 0xC0000000 to 0xC0001FFF */ byte += 2048; readable = (bitmap[byte] & (1 << bit)) ? 0 : 1; writeable = (bitmap[byte] & (2 << bit)) ? 0 : 1; print_msr_pm(msr + MSR_AMD6TH_START, vcpu, readable, writeable); /* MSR 0xC0010000 to 0xC0011FF is only for AMD */ byte += 4096; readable = (bitmap[byte] & (1 << bit)) ? 0 : 1; writeable = (bitmap[byte] & (2 << bit)) ? 0 : 1; print_msr_pm(msr + MSR_AMD7TH_START, vcpu, readable, writeable); } } /* * Reference Intel SDM Vol3 Section 24.6.9 MSR-Bitmap Address */ static void dump_intel_msr_pm(const char *bitmap, int vcpu) { int byte, bit, readable, writeable; uint32_t msr; for (msr = 0; msr < 0x2000; msr++) { byte = msr / 8; bit = msr & 0x7; /* Look at MSRs in the range 0x00000000 to 0x00001FFF */ readable = (bitmap[byte] & (1 << bit)) ? 0 : 1; writeable = (bitmap[2048 + byte] & (1 << bit)) ? 0 : 1; print_msr_pm(msr, vcpu, readable, writeable); /* Look at MSRs in the range 0xC0000000 to 0xC0001FFF */ byte += 1024; readable = (bitmap[byte] & (1 << bit)) ? 0 : 1; writeable = (bitmap[2048 + byte] & (1 << bit)) ? 0 : 1; print_msr_pm(msr + MSR_AMD6TH_START, vcpu, readable, writeable); } } static int dump_msr_bitmap(int vcpu, uint64_t addr, bool cpu_intel) { int error, fd, map_size; const char *bitmap; error = -1; bitmap = MAP_FAILED; fd = open("/dev/mem", O_RDONLY, 0); if (fd < 0) { perror("Couldn't open /dev/mem"); goto done; } if (cpu_intel) map_size = PAGE_SIZE; else map_size = 2 * PAGE_SIZE; bitmap = mmap(NULL, map_size, PROT_READ, MAP_SHARED, fd, addr); if (bitmap == MAP_FAILED) { perror("mmap failed"); goto done; } if (cpu_intel) dump_intel_msr_pm(bitmap, vcpu); else dump_amd_msr_pm(bitmap, vcpu); error = 0; done: if (bitmap != MAP_FAILED) munmap((void *)bitmap, map_size); if (fd >= 0) close(fd); return (error); } static int vm_get_vmcs_field(struct vmctx *ctx, int vcpu, int field, uint64_t *ret_val) { return (vm_get_register(ctx, vcpu, VMCS_IDENT(field), ret_val)); } static int vm_set_vmcs_field(struct vmctx *ctx, int vcpu, int field, uint64_t val) { return (vm_set_register(ctx, vcpu, VMCS_IDENT(field), val)); } static int vm_get_vmcb_field(struct vmctx *ctx, int vcpu, int off, int bytes, uint64_t *ret_val) { return (vm_get_register(ctx, vcpu, VMCB_ACCESS(off, bytes), ret_val)); } static int vm_set_vmcb_field(struct vmctx *ctx, int vcpu, int off, int bytes, uint64_t val) { return (vm_set_register(ctx, vcpu, VMCB_ACCESS(off, bytes), val)); } enum { VMNAME = 1000, /* avoid collision with return values from getopt */ VCPU, SET_MEM, SET_EFER, SET_CR0, SET_CR3, SET_CR4, SET_DR7, SET_RSP, SET_RIP, SET_RAX, SET_RFLAGS, DESC_BASE, DESC_LIMIT, DESC_ACCESS, SET_CS, SET_DS, SET_ES, SET_FS, SET_GS, SET_SS, SET_TR, SET_LDTR, SET_X2APIC_STATE, SET_EXCEPTION_BITMAP, SET_VMCS_ENTRY_INTERRUPTION_INFO, SET_CAP, CAPNAME, UNASSIGN_PPTDEV, GET_GPA_PMAP, ASSERT_LAPIC_LVT, SET_RTC_TIME, SET_RTC_NVRAM, RTC_NVRAM_OFFSET, }; static void print_cpus(const char *banner, const cpuset_t *cpus) { int i, first; first = 1; printf("%s:\t", banner); if (!CPU_EMPTY(cpus)) { for (i = 0; i < CPU_SETSIZE; i++) { if (CPU_ISSET(i, cpus)) { printf("%s%d", first ? " " : ", ", i); first = 0; } } } else printf(" (none)"); printf("\n"); } static void print_intinfo(const char *banner, uint64_t info) { int type; printf("%s:\t", banner); if (info & VM_INTINFO_VALID) { type = info & VM_INTINFO_TYPE; switch (type) { case VM_INTINFO_HWINTR: printf("extint"); break; case VM_INTINFO_NMI: printf("nmi"); break; case VM_INTINFO_SWINTR: printf("swint"); break; default: printf("exception"); break; } printf(" vector %d", (int)VM_INTINFO_VECTOR(info)); if (info & VM_INTINFO_DEL_ERRCODE) printf(" errcode %#x", (u_int)(info >> 32)); } else { printf("n/a"); } printf("\n"); } static bool cpu_vendor_intel(void) { u_int regs[4]; char cpu_vendor[13]; do_cpuid(0, regs); ((u_int *)&cpu_vendor)[0] = regs[1]; ((u_int *)&cpu_vendor)[1] = regs[3]; ((u_int *)&cpu_vendor)[2] = regs[2]; cpu_vendor[12] = '\0'; if (strcmp(cpu_vendor, "AuthenticAMD") == 0) { return (false); } else if (strcmp(cpu_vendor, "GenuineIntel") == 0) { return (true); } else { fprintf(stderr, "Unknown cpu vendor \"%s\"\n", cpu_vendor); exit(1); } } static int get_all_registers(struct vmctx *ctx, int vcpu) { uint64_t cr0, cr3, cr4, dr7, rsp, rip, rflags, efer; uint64_t rax, rbx, rcx, rdx, rsi, rdi, rbp; uint64_t r8, r9, r10, r11, r12, r13, r14, r15; - int error; + int error = 0; - if (get_efer || get_all) { + if (!error && (get_efer || get_all)) { error = vm_get_register(ctx, vcpu, VM_REG_GUEST_EFER, &efer); if (error == 0) printf("efer[%d]\t\t0x%016lx\n", vcpu, efer); } if (!error && (get_cr0 || get_all)) { error = vm_get_register(ctx, vcpu, VM_REG_GUEST_CR0, &cr0); if (error == 0) printf("cr0[%d]\t\t0x%016lx\n", vcpu, cr0); } if (!error && (get_cr3 || get_all)) { error = vm_get_register(ctx, vcpu, VM_REG_GUEST_CR3, &cr3); if (error == 0) printf("cr3[%d]\t\t0x%016lx\n", vcpu, cr3); } if (!error && (get_cr4 || get_all)) { error = vm_get_register(ctx, vcpu, VM_REG_GUEST_CR4, &cr4); if (error == 0) printf("cr4[%d]\t\t0x%016lx\n", vcpu, cr4); } if (!error && (get_dr7 || get_all)) { error = vm_get_register(ctx, vcpu, VM_REG_GUEST_DR7, &dr7); if (error == 0) printf("dr7[%d]\t\t0x%016lx\n", vcpu, dr7); } if (!error && (get_rsp || get_all)) { error = vm_get_register(ctx, vcpu, VM_REG_GUEST_RSP, &rsp); if (error == 0) printf("rsp[%d]\t\t0x%016lx\n", vcpu, rsp); } if (!error && (get_rip || get_all)) { error = vm_get_register(ctx, vcpu, VM_REG_GUEST_RIP, &rip); if (error == 0) printf("rip[%d]\t\t0x%016lx\n", vcpu, rip); } if (!error && (get_rax || get_all)) { error = vm_get_register(ctx, vcpu, VM_REG_GUEST_RAX, &rax); if (error == 0) printf("rax[%d]\t\t0x%016lx\n", vcpu, rax); } if (!error && (get_rbx || get_all)) { error = vm_get_register(ctx, vcpu, VM_REG_GUEST_RBX, &rbx); if (error == 0) printf("rbx[%d]\t\t0x%016lx\n", vcpu, rbx); } if (!error && (get_rcx || get_all)) { error = vm_get_register(ctx, vcpu, VM_REG_GUEST_RCX, &rcx); if (error == 0) printf("rcx[%d]\t\t0x%016lx\n", vcpu, rcx); } if (!error && (get_rdx || get_all)) { error = vm_get_register(ctx, vcpu, VM_REG_GUEST_RDX, &rdx); if (error == 0) printf("rdx[%d]\t\t0x%016lx\n", vcpu, rdx); } if (!error && (get_rsi || get_all)) { error = vm_get_register(ctx, vcpu, VM_REG_GUEST_RSI, &rsi); if (error == 0) printf("rsi[%d]\t\t0x%016lx\n", vcpu, rsi); } if (!error && (get_rdi || get_all)) { error = vm_get_register(ctx, vcpu, VM_REG_GUEST_RDI, &rdi); if (error == 0) printf("rdi[%d]\t\t0x%016lx\n", vcpu, rdi); } if (!error && (get_rbp || get_all)) { error = vm_get_register(ctx, vcpu, VM_REG_GUEST_RBP, &rbp); if (error == 0) printf("rbp[%d]\t\t0x%016lx\n", vcpu, rbp); } if (!error && (get_r8 || get_all)) { error = vm_get_register(ctx, vcpu, VM_REG_GUEST_R8, &r8); if (error == 0) printf("r8[%d]\t\t0x%016lx\n", vcpu, r8); } if (!error && (get_r9 || get_all)) { error = vm_get_register(ctx, vcpu, VM_REG_GUEST_R9, &r9); if (error == 0) printf("r9[%d]\t\t0x%016lx\n", vcpu, r9); } if (!error && (get_r10 || get_all)) { error = vm_get_register(ctx, vcpu, VM_REG_GUEST_R10, &r10); if (error == 0) printf("r10[%d]\t\t0x%016lx\n", vcpu, r10); } if (!error && (get_r11 || get_all)) { error = vm_get_register(ctx, vcpu, VM_REG_GUEST_R11, &r11); if (error == 0) printf("r11[%d]\t\t0x%016lx\n", vcpu, r11); } if (!error && (get_r12 || get_all)) { error = vm_get_register(ctx, vcpu, VM_REG_GUEST_R12, &r12); if (error == 0) printf("r12[%d]\t\t0x%016lx\n", vcpu, r12); } if (!error && (get_r13 || get_all)) { error = vm_get_register(ctx, vcpu, VM_REG_GUEST_R13, &r13); if (error == 0) printf("r13[%d]\t\t0x%016lx\n", vcpu, r13); } if (!error && (get_r14 || get_all)) { error = vm_get_register(ctx, vcpu, VM_REG_GUEST_R14, &r14); if (error == 0) printf("r14[%d]\t\t0x%016lx\n", vcpu, r14); } if (!error && (get_r15 || get_all)) { error = vm_get_register(ctx, vcpu, VM_REG_GUEST_R15, &r15); if (error == 0) printf("r15[%d]\t\t0x%016lx\n", vcpu, r15); } if (!error && (get_rflags || get_all)) { error = vm_get_register(ctx, vcpu, VM_REG_GUEST_RFLAGS, &rflags); if (error == 0) printf("rflags[%d]\t0x%016lx\n", vcpu, rflags); } return (error); } static int get_all_segments(struct vmctx *ctx, int vcpu) { - int error; uint64_t cs, ds, es, fs, gs, ss, tr, ldtr; + int error = 0; - if (get_desc_ds || get_all) { + if (!error && (get_desc_ds || get_all)) { error = vm_get_desc(ctx, vcpu, VM_REG_GUEST_DS, &desc_base, &desc_limit, &desc_access); if (error == 0) { printf("ds desc[%d]\t0x%016lx/0x%08x/0x%08x\n", vcpu, desc_base, desc_limit, desc_access); } } if (!error && (get_desc_es || get_all)) { error = vm_get_desc(ctx, vcpu, VM_REG_GUEST_ES, &desc_base, &desc_limit, &desc_access); if (error == 0) { printf("es desc[%d]\t0x%016lx/0x%08x/0x%08x\n", vcpu, desc_base, desc_limit, desc_access); } } if (!error && (get_desc_fs || get_all)) { error = vm_get_desc(ctx, vcpu, VM_REG_GUEST_FS, &desc_base, &desc_limit, &desc_access); if (error == 0) { printf("fs desc[%d]\t0x%016lx/0x%08x/0x%08x\n", vcpu, desc_base, desc_limit, desc_access); } } if (!error && (get_desc_gs || get_all)) { error = vm_get_desc(ctx, vcpu, VM_REG_GUEST_GS, &desc_base, &desc_limit, &desc_access); if (error == 0) { printf("gs desc[%d]\t0x%016lx/0x%08x/0x%08x\n", vcpu, desc_base, desc_limit, desc_access); } } if (!error && (get_desc_ss || get_all)) { error = vm_get_desc(ctx, vcpu, VM_REG_GUEST_SS, &desc_base, &desc_limit, &desc_access); if (error == 0) { printf("ss desc[%d]\t0x%016lx/0x%08x/0x%08x\n", vcpu, desc_base, desc_limit, desc_access); } } if (!error && (get_desc_cs || get_all)) { error = vm_get_desc(ctx, vcpu, VM_REG_GUEST_CS, &desc_base, &desc_limit, &desc_access); if (error == 0) { printf("cs desc[%d]\t0x%016lx/0x%08x/0x%08x\n", vcpu, desc_base, desc_limit, desc_access); } } if (!error && (get_desc_tr || get_all)) { error = vm_get_desc(ctx, vcpu, VM_REG_GUEST_TR, &desc_base, &desc_limit, &desc_access); if (error == 0) { printf("tr desc[%d]\t0x%016lx/0x%08x/0x%08x\n", vcpu, desc_base, desc_limit, desc_access); } } if (!error && (get_desc_ldtr || get_all)) { error = vm_get_desc(ctx, vcpu, VM_REG_GUEST_LDTR, &desc_base, &desc_limit, &desc_access); if (error == 0) { printf("ldtr desc[%d]\t0x%016lx/0x%08x/0x%08x\n", vcpu, desc_base, desc_limit, desc_access); } } if (!error && (get_desc_gdtr || get_all)) { error = vm_get_desc(ctx, vcpu, VM_REG_GUEST_GDTR, &desc_base, &desc_limit, &desc_access); if (error == 0) { printf("gdtr[%d]\t\t0x%016lx/0x%08x\n", vcpu, desc_base, desc_limit); } } if (!error && (get_desc_idtr || get_all)) { error = vm_get_desc(ctx, vcpu, VM_REG_GUEST_IDTR, &desc_base, &desc_limit, &desc_access); if (error == 0) { printf("idtr[%d]\t\t0x%016lx/0x%08x\n", vcpu, desc_base, desc_limit); } } if (!error && (get_cs || get_all)) { error = vm_get_register(ctx, vcpu, VM_REG_GUEST_CS, &cs); if (error == 0) printf("cs[%d]\t\t0x%04lx\n", vcpu, cs); } if (!error && (get_ds || get_all)) { error = vm_get_register(ctx, vcpu, VM_REG_GUEST_DS, &ds); if (error == 0) printf("ds[%d]\t\t0x%04lx\n", vcpu, ds); } if (!error && (get_es || get_all)) { error = vm_get_register(ctx, vcpu, VM_REG_GUEST_ES, &es); if (error == 0) printf("es[%d]\t\t0x%04lx\n", vcpu, es); } if (!error && (get_fs || get_all)) { error = vm_get_register(ctx, vcpu, VM_REG_GUEST_FS, &fs); if (error == 0) printf("fs[%d]\t\t0x%04lx\n", vcpu, fs); } if (!error && (get_gs || get_all)) { error = vm_get_register(ctx, vcpu, VM_REG_GUEST_GS, &gs); if (error == 0) printf("gs[%d]\t\t0x%04lx\n", vcpu, gs); } if (!error && (get_ss || get_all)) { error = vm_get_register(ctx, vcpu, VM_REG_GUEST_SS, &ss); if (error == 0) printf("ss[%d]\t\t0x%04lx\n", vcpu, ss); } if (!error && (get_tr || get_all)) { error = vm_get_register(ctx, vcpu, VM_REG_GUEST_TR, &tr); if (error == 0) printf("tr[%d]\t\t0x%04lx\n", vcpu, tr); } if (!error && (get_ldtr || get_all)) { error = vm_get_register(ctx, vcpu, VM_REG_GUEST_LDTR, &ldtr); if (error == 0) printf("ldtr[%d]\t\t0x%04lx\n", vcpu, ldtr); } return (error); } static int get_misc_vmcs(struct vmctx *ctx, int vcpu) { uint64_t ctl, cr0, cr3, cr4, rsp, rip, pat, addr, u64; - int error; - - if (get_cr0_mask || get_all) { + int error = 0; + + if (!error && (get_cr0_mask || get_all)) { uint64_t cr0mask; error = vm_get_vmcs_field(ctx, vcpu, VMCS_CR0_MASK, &cr0mask); if (error == 0) printf("cr0_mask[%d]\t\t0x%016lx\n", vcpu, cr0mask); } if (!error && (get_cr0_shadow || get_all)) { uint64_t cr0shadow; error = vm_get_vmcs_field(ctx, vcpu, VMCS_CR0_SHADOW, &cr0shadow); if (error == 0) printf("cr0_shadow[%d]\t\t0x%016lx\n", vcpu, cr0shadow); } if (!error && (get_cr4_mask || get_all)) { uint64_t cr4mask; error = vm_get_vmcs_field(ctx, vcpu, VMCS_CR4_MASK, &cr4mask); if (error == 0) printf("cr4_mask[%d]\t\t0x%016lx\n", vcpu, cr4mask); } if (!error && (get_cr4_shadow || get_all)) { uint64_t cr4shadow; error = vm_get_vmcs_field(ctx, vcpu, VMCS_CR4_SHADOW, &cr4shadow); if (error == 0) printf("cr4_shadow[%d]\t\t0x%016lx\n", vcpu, cr4shadow); } if (!error && (get_cr3_targets || get_all)) { uint64_t target_count, target_addr; error = vm_get_vmcs_field(ctx, vcpu, VMCS_CR3_TARGET_COUNT, &target_count); if (error == 0) { printf("cr3_target_count[%d]\t0x%016lx\n", vcpu, target_count); } error = vm_get_vmcs_field(ctx, vcpu, VMCS_CR3_TARGET0, &target_addr); if (error == 0) { printf("cr3_target0[%d]\t\t0x%016lx\n", vcpu, target_addr); } error = vm_get_vmcs_field(ctx, vcpu, VMCS_CR3_TARGET1, &target_addr); if (error == 0) { printf("cr3_target1[%d]\t\t0x%016lx\n", vcpu, target_addr); } error = vm_get_vmcs_field(ctx, vcpu, VMCS_CR3_TARGET2, &target_addr); if (error == 0) { printf("cr3_target2[%d]\t\t0x%016lx\n", vcpu, target_addr); } error = vm_get_vmcs_field(ctx, vcpu, VMCS_CR3_TARGET3, &target_addr); if (error == 0) { printf("cr3_target3[%d]\t\t0x%016lx\n", vcpu, target_addr); } } if (!error && (get_pinbased_ctls || get_all)) { error = vm_get_vmcs_field(ctx, vcpu, VMCS_PIN_BASED_CTLS, &ctl); if (error == 0) printf("pinbased_ctls[%d]\t0x%016lx\n", vcpu, ctl); } if (!error && (get_procbased_ctls || get_all)) { error = vm_get_vmcs_field(ctx, vcpu, VMCS_PRI_PROC_BASED_CTLS, &ctl); if (error == 0) printf("procbased_ctls[%d]\t0x%016lx\n", vcpu, ctl); } if (!error && (get_procbased_ctls2 || get_all)) { error = vm_get_vmcs_field(ctx, vcpu, VMCS_SEC_PROC_BASED_CTLS, &ctl); if (error == 0) printf("procbased_ctls2[%d]\t0x%016lx\n", vcpu, ctl); } if (!error && (get_vmcs_gla || get_all)) { error = vm_get_vmcs_field(ctx, vcpu, VMCS_GUEST_LINEAR_ADDRESS, &u64); if (error == 0) printf("gla[%d]\t\t0x%016lx\n", vcpu, u64); } if (!error && (get_vmcs_gpa || get_all)) { error = vm_get_vmcs_field(ctx, vcpu, VMCS_GUEST_PHYSICAL_ADDRESS, &u64); if (error == 0) printf("gpa[%d]\t\t0x%016lx\n", vcpu, u64); } if (!error && (get_vmcs_entry_interruption_info || get_all)) { error = vm_get_vmcs_field(ctx, vcpu, VMCS_ENTRY_INTR_INFO,&u64); if (error == 0) { printf("entry_interruption_info[%d]\t0x%016lx\n", vcpu, u64); } } if (!error && (get_tpr_threshold || get_all)) { uint64_t threshold; error = vm_get_vmcs_field(ctx, vcpu, VMCS_TPR_THRESHOLD, &threshold); if (error == 0) printf("tpr_threshold[%d]\t0x%016lx\n", vcpu, threshold); } if (!error && (get_inst_err || get_all)) { uint64_t insterr; error = vm_get_vmcs_field(ctx, vcpu, VMCS_INSTRUCTION_ERROR, &insterr); if (error == 0) { printf("instruction_error[%d]\t0x%016lx\n", vcpu, insterr); } } if (!error && (get_exit_ctls || get_all)) { error = vm_get_vmcs_field(ctx, vcpu, VMCS_EXIT_CTLS, &ctl); if (error == 0) printf("exit_ctls[%d]\t\t0x%016lx\n", vcpu, ctl); } if (!error && (get_entry_ctls || get_all)) { error = vm_get_vmcs_field(ctx, vcpu, VMCS_ENTRY_CTLS, &ctl); if (error == 0) printf("entry_ctls[%d]\t\t0x%016lx\n", vcpu, ctl); } if (!error && (get_host_pat || get_all)) { error = vm_get_vmcs_field(ctx, vcpu, VMCS_HOST_IA32_PAT, &pat); if (error == 0) printf("host_pat[%d]\t\t0x%016lx\n", vcpu, pat); } if (!error && (get_host_cr0 || get_all)) { error = vm_get_vmcs_field(ctx, vcpu, VMCS_HOST_CR0, &cr0); if (error == 0) printf("host_cr0[%d]\t\t0x%016lx\n", vcpu, cr0); } if (!error && (get_host_cr3 || get_all)) { error = vm_get_vmcs_field(ctx, vcpu, VMCS_HOST_CR3, &cr3); if (error == 0) printf("host_cr3[%d]\t\t0x%016lx\n", vcpu, cr3); } if (!error && (get_host_cr4 || get_all)) { error = vm_get_vmcs_field(ctx, vcpu, VMCS_HOST_CR4, &cr4); if (error == 0) printf("host_cr4[%d]\t\t0x%016lx\n", vcpu, cr4); } if (!error && (get_host_rip || get_all)) { error = vm_get_vmcs_field(ctx, vcpu, VMCS_HOST_RIP, &rip); if (error == 0) printf("host_rip[%d]\t\t0x%016lx\n", vcpu, rip); } if (!error && (get_host_rsp || get_all)) { error = vm_get_vmcs_field(ctx, vcpu, VMCS_HOST_RSP, &rsp); if (error == 0) printf("host_rsp[%d]\t\t0x%016lx\n", vcpu, rsp); } if (!error && (get_vmcs_link || get_all)) { error = vm_get_vmcs_field(ctx, vcpu, VMCS_LINK_POINTER, &addr); if (error == 0) printf("vmcs_pointer[%d]\t0x%016lx\n", vcpu, addr); } if (!error && (get_vmcs_exit_interruption_info || get_all)) { error = vm_get_vmcs_field(ctx, vcpu, VMCS_EXIT_INTR_INFO, &u64); if (error == 0) { printf("vmcs_exit_interruption_info[%d]\t0x%016lx\n", vcpu, u64); } } if (!error && (get_vmcs_exit_interruption_error || get_all)) { error = vm_get_vmcs_field(ctx, vcpu, VMCS_EXIT_INTR_ERRCODE, &u64); if (error == 0) { printf("vmcs_exit_interruption_error[%d]\t0x%016lx\n", vcpu, u64); } } if (!error && (get_vmcs_interruptibility || get_all)) { error = vm_get_vmcs_field(ctx, vcpu, VMCS_GUEST_INTERRUPTIBILITY, &u64); if (error == 0) { printf("vmcs_guest_interruptibility[%d]\t0x%016lx\n", vcpu, u64); } } if (!error && (get_vmcs_exit_qualification || get_all)) { error = vm_get_vmcs_field(ctx, vcpu, VMCS_EXIT_QUALIFICATION, &u64); if (error == 0) printf("vmcs_exit_qualification[%d]\t0x%016lx\n", vcpu, u64); } return (error); } static int get_misc_vmcb(struct vmctx *ctx, int vcpu) { uint64_t ctl, addr; - int error; + int error = 0; - if (get_vmcb_intercept || get_all) { + if (!error && (get_vmcb_intercept || get_all)) { error = vm_get_vmcb_field(ctx, vcpu, VMCB_OFF_CR_INTERCEPT, 4, &ctl); if (error == 0) printf("cr_intercept[%d]\t0x%08x\n", vcpu, (int)ctl); error = vm_get_vmcb_field(ctx, vcpu, VMCB_OFF_DR_INTERCEPT, 4, &ctl); if (error == 0) printf("dr_intercept[%d]\t0x%08x\n", vcpu, (int)ctl); error = vm_get_vmcb_field(ctx, vcpu, VMCB_OFF_EXC_INTERCEPT, 4, &ctl); if (error == 0) printf("exc_intercept[%d]\t0x%08x\n", vcpu, (int)ctl); error = vm_get_vmcb_field(ctx, vcpu, VMCB_OFF_INST1_INTERCEPT, 4, &ctl); if (error == 0) printf("inst1_intercept[%d]\t0x%08x\n", vcpu, (int)ctl); error = vm_get_vmcb_field(ctx, vcpu, VMCB_OFF_INST2_INTERCEPT, 4, &ctl); if (error == 0) printf("inst2_intercept[%d]\t0x%08x\n", vcpu, (int)ctl); } if (!error && (get_vmcb_tlb_ctrl || get_all)) { error = vm_get_vmcb_field(ctx, vcpu, VMCB_OFF_TLB_CTRL, 4, &ctl); if (error == 0) printf("TLB ctrl[%d]\t0x%016lx\n", vcpu, ctl); } if (!error && (get_vmcb_exit_details || get_all)) { error = vm_get_vmcb_field(ctx, vcpu, VMCB_OFF_EXITINFO1, 8, &ctl); if (error == 0) printf("exitinfo1[%d]\t0x%016lx\n", vcpu, ctl); error = vm_get_vmcb_field(ctx, vcpu, VMCB_OFF_EXITINFO2, 8, &ctl); if (error == 0) printf("exitinfo2[%d]\t0x%016lx\n", vcpu, ctl); error = vm_get_vmcb_field(ctx, vcpu, VMCB_OFF_EXITINTINFO, 8, &ctl); if (error == 0) printf("exitintinfo[%d]\t0x%016lx\n", vcpu, ctl); } if (!error && (get_vmcb_virq || get_all)) { error = vm_get_vmcb_field(ctx, vcpu, VMCB_OFF_VIRQ, 8, &ctl); if (error == 0) printf("v_irq/tpr[%d]\t0x%016lx\n", vcpu, ctl); } if (!error && (get_apic_access_addr || get_all)) { error = vm_get_vmcb_field(ctx, vcpu, VMCB_OFF_AVIC_BAR, 8, &addr); if (error == 0) printf("AVIC apic_bar[%d]\t0x%016lx\n", vcpu, addr); } if (!error && (get_virtual_apic_addr || get_all)) { error = vm_get_vmcb_field(ctx, vcpu, VMCB_OFF_AVIC_PAGE, 8, &addr); if (error == 0) printf("AVIC backing page[%d]\t0x%016lx\n", vcpu, addr); } if (!error && (get_avic_table || get_all)) { error = vm_get_vmcb_field(ctx, vcpu, VMCB_OFF_AVIC_LT, 8, &addr); if (error == 0) printf("AVIC logical table[%d]\t0x%016lx\n", vcpu, addr); error = vm_get_vmcb_field(ctx, vcpu, VMCB_OFF_AVIC_PT, 8, &addr); if (error == 0) printf("AVIC physical table[%d]\t0x%016lx\n", vcpu, addr); } return (error); } static struct option * setup_options(bool cpu_intel) { const struct option common_opts[] = { { "vm", REQ_ARG, 0, VMNAME }, { "cpu", REQ_ARG, 0, VCPU }, { "set-mem", REQ_ARG, 0, SET_MEM }, { "set-efer", REQ_ARG, 0, SET_EFER }, { "set-cr0", REQ_ARG, 0, SET_CR0 }, { "set-cr3", REQ_ARG, 0, SET_CR3 }, { "set-cr4", REQ_ARG, 0, SET_CR4 }, { "set-dr7", REQ_ARG, 0, SET_DR7 }, { "set-rsp", REQ_ARG, 0, SET_RSP }, { "set-rip", REQ_ARG, 0, SET_RIP }, { "set-rax", REQ_ARG, 0, SET_RAX }, { "set-rflags", REQ_ARG, 0, SET_RFLAGS }, { "desc-base", REQ_ARG, 0, DESC_BASE }, { "desc-limit", REQ_ARG, 0, DESC_LIMIT }, { "desc-access",REQ_ARG, 0, DESC_ACCESS }, { "set-cs", REQ_ARG, 0, SET_CS }, { "set-ds", REQ_ARG, 0, SET_DS }, { "set-es", REQ_ARG, 0, SET_ES }, { "set-fs", REQ_ARG, 0, SET_FS }, { "set-gs", REQ_ARG, 0, SET_GS }, { "set-ss", REQ_ARG, 0, SET_SS }, { "set-tr", REQ_ARG, 0, SET_TR }, { "set-ldtr", REQ_ARG, 0, SET_LDTR }, { "set-x2apic-state",REQ_ARG, 0, SET_X2APIC_STATE }, { "set-exception-bitmap", REQ_ARG, 0, SET_EXCEPTION_BITMAP }, { "capname", REQ_ARG, 0, CAPNAME }, { "unassign-pptdev", REQ_ARG, 0, UNASSIGN_PPTDEV }, { "setcap", REQ_ARG, 0, SET_CAP }, { "get-gpa-pmap", REQ_ARG, 0, GET_GPA_PMAP }, { "assert-lapic-lvt", REQ_ARG, 0, ASSERT_LAPIC_LVT }, { "get-rtc-time", NO_ARG, &get_rtc_time, 1 }, { "set-rtc-time", REQ_ARG, 0, SET_RTC_TIME }, { "rtc-nvram-offset", REQ_ARG, 0, RTC_NVRAM_OFFSET }, { "get-rtc-nvram", NO_ARG, &get_rtc_nvram, 1 }, { "set-rtc-nvram", REQ_ARG, 0, SET_RTC_NVRAM }, { "getcap", NO_ARG, &getcap, 1 }, { "get-stats", NO_ARG, &get_stats, 1 }, { "get-desc-ds",NO_ARG, &get_desc_ds, 1 }, { "set-desc-ds",NO_ARG, &set_desc_ds, 1 }, { "get-desc-es",NO_ARG, &get_desc_es, 1 }, { "set-desc-es",NO_ARG, &set_desc_es, 1 }, { "get-desc-ss",NO_ARG, &get_desc_ss, 1 }, { "set-desc-ss",NO_ARG, &set_desc_ss, 1 }, { "get-desc-cs",NO_ARG, &get_desc_cs, 1 }, { "set-desc-cs",NO_ARG, &set_desc_cs, 1 }, { "get-desc-fs",NO_ARG, &get_desc_fs, 1 }, { "set-desc-fs",NO_ARG, &set_desc_fs, 1 }, { "get-desc-gs",NO_ARG, &get_desc_gs, 1 }, { "set-desc-gs",NO_ARG, &set_desc_gs, 1 }, { "get-desc-tr",NO_ARG, &get_desc_tr, 1 }, { "set-desc-tr",NO_ARG, &set_desc_tr, 1 }, { "set-desc-ldtr", NO_ARG, &set_desc_ldtr, 1 }, { "get-desc-ldtr", NO_ARG, &get_desc_ldtr, 1 }, { "set-desc-gdtr", NO_ARG, &set_desc_gdtr, 1 }, { "get-desc-gdtr", NO_ARG, &get_desc_gdtr, 1 }, { "set-desc-idtr", NO_ARG, &set_desc_idtr, 1 }, { "get-desc-idtr", NO_ARG, &get_desc_idtr, 1 }, { "get-lowmem", NO_ARG, &get_lowmem, 1 }, { "get-highmem",NO_ARG, &get_highmem, 1 }, { "get-efer", NO_ARG, &get_efer, 1 }, { "get-cr0", NO_ARG, &get_cr0, 1 }, { "get-cr3", NO_ARG, &get_cr3, 1 }, { "get-cr4", NO_ARG, &get_cr4, 1 }, { "get-dr7", NO_ARG, &get_dr7, 1 }, { "get-rsp", NO_ARG, &get_rsp, 1 }, { "get-rip", NO_ARG, &get_rip, 1 }, { "get-rax", NO_ARG, &get_rax, 1 }, { "get-rbx", NO_ARG, &get_rbx, 1 }, { "get-rcx", NO_ARG, &get_rcx, 1 }, { "get-rdx", NO_ARG, &get_rdx, 1 }, { "get-rsi", NO_ARG, &get_rsi, 1 }, { "get-rdi", NO_ARG, &get_rdi, 1 }, { "get-rbp", NO_ARG, &get_rbp, 1 }, { "get-r8", NO_ARG, &get_r8, 1 }, { "get-r9", NO_ARG, &get_r9, 1 }, { "get-r10", NO_ARG, &get_r10, 1 }, { "get-r11", NO_ARG, &get_r11, 1 }, { "get-r12", NO_ARG, &get_r12, 1 }, { "get-r13", NO_ARG, &get_r13, 1 }, { "get-r14", NO_ARG, &get_r14, 1 }, { "get-r15", NO_ARG, &get_r15, 1 }, { "get-rflags", NO_ARG, &get_rflags, 1 }, { "get-cs", NO_ARG, &get_cs, 1 }, { "get-ds", NO_ARG, &get_ds, 1 }, { "get-es", NO_ARG, &get_es, 1 }, { "get-fs", NO_ARG, &get_fs, 1 }, { "get-gs", NO_ARG, &get_gs, 1 }, { "get-ss", NO_ARG, &get_ss, 1 }, { "get-tr", NO_ARG, &get_tr, 1 }, { "get-ldtr", NO_ARG, &get_ldtr, 1 }, { "get-eptp", NO_ARG, &get_eptp, 1 }, { "get-exception-bitmap", NO_ARG, &get_exception_bitmap, 1 }, { "get-io-bitmap-address", NO_ARG, &get_io_bitmap, 1 }, { "get-tsc-offset", NO_ARG, &get_tsc_offset, 1 }, { "get-msr-bitmap", NO_ARG, &get_msr_bitmap, 1 }, { "get-msr-bitmap-address", NO_ARG, &get_msr_bitmap_address, 1 }, { "get-guest-pat", NO_ARG, &get_guest_pat, 1 }, { "get-guest-sysenter", NO_ARG, &get_guest_sysenter, 1 }, { "get-exit-reason", NO_ARG, &get_exit_reason, 1 }, { "get-x2apic-state", NO_ARG, &get_x2apic_state, 1 }, { "get-all", NO_ARG, &get_all, 1 }, { "run", NO_ARG, &run, 1 }, { "create", NO_ARG, &create, 1 }, { "destroy", NO_ARG, &destroy, 1 }, { "inject-nmi", NO_ARG, &inject_nmi, 1 }, { "force-reset", NO_ARG, &force_reset, 1 }, { "force-poweroff", NO_ARG, &force_poweroff, 1 }, { "get-active-cpus", NO_ARG, &get_active_cpus, 1 }, { "get-suspended-cpus", NO_ARG, &get_suspended_cpus, 1 }, { "get-intinfo", NO_ARG, &get_intinfo, 1 }, }; const struct option intel_opts[] = { { "get-vmcs-pinbased-ctls", NO_ARG, &get_pinbased_ctls, 1 }, { "get-vmcs-procbased-ctls", NO_ARG, &get_procbased_ctls, 1 }, { "get-vmcs-procbased-ctls2", NO_ARG, &get_procbased_ctls2, 1 }, { "get-vmcs-guest-linear-address", NO_ARG, &get_vmcs_gla, 1 }, { "get-vmcs-guest-physical-address", NO_ARG, &get_vmcs_gpa, 1 }, { "get-vmcs-entry-interruption-info", NO_ARG, &get_vmcs_entry_interruption_info, 1}, { "get-vmcs-cr0-mask", NO_ARG, &get_cr0_mask, 1 }, { "get-vmcs-cr0-shadow", NO_ARG,&get_cr0_shadow, 1 }, { "get-vmcs-cr4-mask", NO_ARG, &get_cr4_mask, 1 }, { "get-vmcs-cr4-shadow", NO_ARG, &get_cr4_shadow, 1 }, { "get-vmcs-cr3-targets", NO_ARG, &get_cr3_targets, 1 }, { "get-vmcs-tpr-threshold", NO_ARG, &get_tpr_threshold, 1 }, { "get-vmcs-vpid", NO_ARG, &get_vpid_asid, 1 }, { "get-vmcs-exit-ctls", NO_ARG, &get_exit_ctls, 1 }, { "get-vmcs-entry-ctls", NO_ARG, &get_entry_ctls, 1 }, { "get-vmcs-instruction-error", NO_ARG, &get_inst_err, 1 }, { "get-vmcs-host-pat", NO_ARG, &get_host_pat, 1 }, { "get-vmcs-host-cr0", NO_ARG, &get_host_cr0, 1 }, { "set-vmcs-entry-interruption-info", REQ_ARG, 0, SET_VMCS_ENTRY_INTERRUPTION_INFO }, { "get-vmcs-exit-qualification", NO_ARG, &get_vmcs_exit_qualification, 1 }, { "get-vmcs-interruptibility", NO_ARG, &get_vmcs_interruptibility, 1 }, { "get-vmcs-exit-interruption-error", NO_ARG, &get_vmcs_exit_interruption_error, 1 }, { "get-vmcs-exit-interruption-info", NO_ARG, &get_vmcs_exit_interruption_info, 1 }, { "get-vmcs-link", NO_ARG, &get_vmcs_link, 1 }, { "get-vmcs-host-cr3", NO_ARG, &get_host_cr3, 1 }, { "get-vmcs-host-cr4", NO_ARG, &get_host_cr4, 1 }, { "get-vmcs-host-rip", NO_ARG, &get_host_rip, 1 }, { "get-vmcs-host-rsp", NO_ARG, &get_host_rsp, 1 }, { "get-apic-access-address", NO_ARG, &get_apic_access_addr, 1}, { "get-virtual-apic-address", NO_ARG, &get_virtual_apic_addr, 1} }; const struct option amd_opts[] = { { "get-vmcb-intercepts", NO_ARG, &get_vmcb_intercept, 1 }, { "get-vmcb-asid", NO_ARG, &get_vpid_asid, 1 }, { "get-vmcb-exit-details", NO_ARG, &get_vmcb_exit_details, 1 }, { "get-vmcb-tlb-ctrl", NO_ARG, &get_vmcb_tlb_ctrl, 1 }, { "get-vmcb-virq", NO_ARG, &get_vmcb_virq, 1 }, { "get-avic-apic-bar", NO_ARG, &get_apic_access_addr, 1 }, { "get-avic-backing-page", NO_ARG, &get_virtual_apic_addr, 1 }, { "get-avic-table", NO_ARG, &get_avic_table, 1 } }; const struct option null_opt = { NULL, 0, NULL, 0 }; struct option *all_opts; char *cp; int optlen; optlen = sizeof(common_opts); if (cpu_intel) optlen += sizeof(intel_opts); else optlen += sizeof(amd_opts); optlen += sizeof(null_opt); all_opts = malloc(optlen); cp = (char *)all_opts; memcpy(cp, common_opts, sizeof(common_opts)); cp += sizeof(common_opts); if (cpu_intel) { memcpy(cp, intel_opts, sizeof(intel_opts)); cp += sizeof(intel_opts); } else { memcpy(cp, amd_opts, sizeof(amd_opts)); cp += sizeof(amd_opts); } memcpy(cp, &null_opt, sizeof(null_opt)); cp += sizeof(null_opt); return (all_opts); } static const char * wday_str(int idx) { static const char *weekdays[] = { "Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat" }; if (idx >= 0 && idx < 7) return (weekdays[idx]); else return ("UNK"); } static const char * mon_str(int idx) { static const char *months[] = { "Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec" }; if (idx >= 0 && idx < 12) return (months[idx]); else return ("UNK"); } int main(int argc, char *argv[]) { char *vmname; int error, ch, vcpu, ptenum; vm_paddr_t gpa, gpa_pmap; size_t len; struct vm_exit vmexit; uint64_t rax, cr0, cr3, cr4, dr7, rsp, rip, rflags, efer, pat; uint64_t eptp, bm, addr, u64, pteval[4], *pte, info[2]; struct vmctx *ctx; int wired; cpuset_t cpus; bool cpu_intel; uint64_t cs, ds, es, fs, gs, ss, tr, ldtr; struct tm tm; struct option *opts; cpu_intel = cpu_vendor_intel(); opts = setup_options(cpu_intel); vcpu = 0; vmname = NULL; assert_lapic_lvt = -1; progname = basename(argv[0]); while ((ch = getopt_long(argc, argv, "", opts, NULL)) != -1) { switch (ch) { case 0: break; case VMNAME: vmname = optarg; break; case VCPU: vcpu = atoi(optarg); break; case SET_MEM: memsize = atoi(optarg) * MB; memsize = roundup(memsize, 2 * MB); break; case SET_EFER: efer = strtoul(optarg, NULL, 0); set_efer = 1; break; case SET_CR0: cr0 = strtoul(optarg, NULL, 0); set_cr0 = 1; break; case SET_CR3: cr3 = strtoul(optarg, NULL, 0); set_cr3 = 1; break; case SET_CR4: cr4 = strtoul(optarg, NULL, 0); set_cr4 = 1; break; case SET_DR7: dr7 = strtoul(optarg, NULL, 0); set_dr7 = 1; break; case SET_RSP: rsp = strtoul(optarg, NULL, 0); set_rsp = 1; break; case SET_RIP: rip = strtoul(optarg, NULL, 0); set_rip = 1; break; case SET_RAX: rax = strtoul(optarg, NULL, 0); set_rax = 1; break; case SET_RFLAGS: rflags = strtoul(optarg, NULL, 0); set_rflags = 1; break; case DESC_BASE: desc_base = strtoul(optarg, NULL, 0); break; case DESC_LIMIT: desc_limit = strtoul(optarg, NULL, 0); break; case DESC_ACCESS: desc_access = strtoul(optarg, NULL, 0); break; case SET_CS: cs = strtoul(optarg, NULL, 0); set_cs = 1; break; case SET_DS: ds = strtoul(optarg, NULL, 0); set_ds = 1; break; case SET_ES: es = strtoul(optarg, NULL, 0); set_es = 1; break; case SET_FS: fs = strtoul(optarg, NULL, 0); set_fs = 1; break; case SET_GS: gs = strtoul(optarg, NULL, 0); set_gs = 1; break; case SET_SS: ss = strtoul(optarg, NULL, 0); set_ss = 1; break; case SET_TR: tr = strtoul(optarg, NULL, 0); set_tr = 1; break; case SET_LDTR: ldtr = strtoul(optarg, NULL, 0); set_ldtr = 1; break; case SET_X2APIC_STATE: x2apic_state = strtol(optarg, NULL, 0); set_x2apic_state = 1; break; case SET_EXCEPTION_BITMAP: exception_bitmap = strtoul(optarg, NULL, 0); set_exception_bitmap = 1; break; case SET_VMCS_ENTRY_INTERRUPTION_INFO: vmcs_entry_interruption_info = strtoul(optarg, NULL, 0); set_vmcs_entry_interruption_info = 1; break; case SET_CAP: capval = strtoul(optarg, NULL, 0); setcap = 1; break; case SET_RTC_TIME: rtc_secs = strtoul(optarg, NULL, 0); set_rtc_time = 1; break; case SET_RTC_NVRAM: rtc_nvram_value = (uint8_t)strtoul(optarg, NULL, 0); set_rtc_nvram = 1; break; case RTC_NVRAM_OFFSET: rtc_nvram_offset = strtoul(optarg, NULL, 0); break; case GET_GPA_PMAP: gpa_pmap = strtoul(optarg, NULL, 0); get_gpa_pmap = 1; break; case CAPNAME: capname = optarg; break; case UNASSIGN_PPTDEV: unassign_pptdev = 1; if (sscanf(optarg, "%d/%d/%d", &bus, &slot, &func) != 3) usage(cpu_intel); break; case ASSERT_LAPIC_LVT: assert_lapic_lvt = atoi(optarg); break; default: usage(cpu_intel); } } argc -= optind; argv += optind; if (vmname == NULL) usage(cpu_intel); error = 0; if (!error && create) error = vm_create(vmname); if (!error) { ctx = vm_open(vmname); if (ctx == NULL) { printf("VM:%s is not created.\n", vmname); exit (1); } } if (!error && memsize) error = vm_setup_memory(ctx, memsize, VM_MMAP_NONE); if (!error && set_efer) error = vm_set_register(ctx, vcpu, VM_REG_GUEST_EFER, efer); if (!error && set_cr0) error = vm_set_register(ctx, vcpu, VM_REG_GUEST_CR0, cr0); if (!error && set_cr3) error = vm_set_register(ctx, vcpu, VM_REG_GUEST_CR3, cr3); if (!error && set_cr4) error = vm_set_register(ctx, vcpu, VM_REG_GUEST_CR4, cr4); if (!error && set_dr7) error = vm_set_register(ctx, vcpu, VM_REG_GUEST_DR7, dr7); if (!error && set_rsp) error = vm_set_register(ctx, vcpu, VM_REG_GUEST_RSP, rsp); if (!error && set_rip) error = vm_set_register(ctx, vcpu, VM_REG_GUEST_RIP, rip); if (!error && set_rax) error = vm_set_register(ctx, vcpu, VM_REG_GUEST_RAX, rax); if (!error && set_rflags) { error = vm_set_register(ctx, vcpu, VM_REG_GUEST_RFLAGS, rflags); } if (!error && set_desc_ds) { error = vm_set_desc(ctx, vcpu, VM_REG_GUEST_DS, desc_base, desc_limit, desc_access); } if (!error && set_desc_es) { error = vm_set_desc(ctx, vcpu, VM_REG_GUEST_ES, desc_base, desc_limit, desc_access); } if (!error && set_desc_ss) { error = vm_set_desc(ctx, vcpu, VM_REG_GUEST_SS, desc_base, desc_limit, desc_access); } if (!error && set_desc_cs) { error = vm_set_desc(ctx, vcpu, VM_REG_GUEST_CS, desc_base, desc_limit, desc_access); } if (!error && set_desc_fs) { error = vm_set_desc(ctx, vcpu, VM_REG_GUEST_FS, desc_base, desc_limit, desc_access); } if (!error && set_desc_gs) { error = vm_set_desc(ctx, vcpu, VM_REG_GUEST_GS, desc_base, desc_limit, desc_access); } if (!error && set_desc_tr) { error = vm_set_desc(ctx, vcpu, VM_REG_GUEST_TR, desc_base, desc_limit, desc_access); } if (!error && set_desc_ldtr) { error = vm_set_desc(ctx, vcpu, VM_REG_GUEST_LDTR, desc_base, desc_limit, desc_access); } if (!error && set_desc_gdtr) { error = vm_set_desc(ctx, vcpu, VM_REG_GUEST_GDTR, desc_base, desc_limit, 0); } if (!error && set_desc_idtr) { error = vm_set_desc(ctx, vcpu, VM_REG_GUEST_IDTR, desc_base, desc_limit, 0); } if (!error && set_cs) error = vm_set_register(ctx, vcpu, VM_REG_GUEST_CS, cs); if (!error && set_ds) error = vm_set_register(ctx, vcpu, VM_REG_GUEST_DS, ds); if (!error && set_es) error = vm_set_register(ctx, vcpu, VM_REG_GUEST_ES, es); if (!error && set_fs) error = vm_set_register(ctx, vcpu, VM_REG_GUEST_FS, fs); if (!error && set_gs) error = vm_set_register(ctx, vcpu, VM_REG_GUEST_GS, gs); if (!error && set_ss) error = vm_set_register(ctx, vcpu, VM_REG_GUEST_SS, ss); if (!error && set_tr) error = vm_set_register(ctx, vcpu, VM_REG_GUEST_TR, tr); if (!error && set_ldtr) error = vm_set_register(ctx, vcpu, VM_REG_GUEST_LDTR, ldtr); if (!error && set_x2apic_state) error = vm_set_x2apic_state(ctx, vcpu, x2apic_state); if (!error && unassign_pptdev) error = vm_unassign_pptdev(ctx, bus, slot, func); if (!error && set_exception_bitmap) { if (cpu_intel) error = vm_set_vmcs_field(ctx, vcpu, VMCS_EXCEPTION_BITMAP, exception_bitmap); else error = vm_set_vmcb_field(ctx, vcpu, VMCB_OFF_EXC_INTERCEPT, 4, exception_bitmap); } if (!error && cpu_intel && set_vmcs_entry_interruption_info) { error = vm_set_vmcs_field(ctx, vcpu, VMCS_ENTRY_INTR_INFO, vmcs_entry_interruption_info); } if (!error && inject_nmi) { error = vm_inject_nmi(ctx, vcpu); } if (!error && assert_lapic_lvt != -1) { error = vm_lapic_local_irq(ctx, vcpu, assert_lapic_lvt); } if (!error && (get_lowmem || get_all)) { gpa = 0; error = vm_get_memory_seg(ctx, gpa, &len, &wired); if (error == 0) printf("lowmem\t\t0x%016lx/%ld%s\n", gpa, len, wired ? " wired" : ""); } if (!error && (get_highmem || get_all)) { gpa = 4 * GB; error = vm_get_memory_seg(ctx, gpa, &len, &wired); if (error == 0) printf("highmem\t\t0x%016lx/%ld%s\n", gpa, len, wired ? " wired" : ""); } if (!error) error = get_all_registers(ctx, vcpu); if (!error) error = get_all_segments(ctx, vcpu); if (!error) { if (cpu_intel) error = get_misc_vmcs(ctx, vcpu); else error = get_misc_vmcb(ctx, vcpu); } if (!error && (get_x2apic_state || get_all)) { error = vm_get_x2apic_state(ctx, vcpu, &x2apic_state); if (error == 0) printf("x2apic_state[%d]\t%d\n", vcpu, x2apic_state); } if (!error && (get_eptp || get_all)) { if (cpu_intel) error = vm_get_vmcs_field(ctx, vcpu, VMCS_EPTP, &eptp); else error = vm_get_vmcb_field(ctx, vcpu, VMCB_OFF_NPT_BASE, 8, &eptp); if (error == 0) printf("%s[%d]\t\t0x%016lx\n", cpu_intel ? "eptp" : "rvi/npt", vcpu, eptp); } if (!error && (get_exception_bitmap || get_all)) { if(cpu_intel) error = vm_get_vmcs_field(ctx, vcpu, VMCS_EXCEPTION_BITMAP, &bm); else error = vm_get_vmcb_field(ctx, vcpu, VMCB_OFF_EXC_INTERCEPT, 4, &bm); if (error == 0) printf("exception_bitmap[%d]\t%#lx\n", vcpu, bm); } if (!error && (get_io_bitmap || get_all)) { if (cpu_intel) { error = vm_get_vmcs_field(ctx, vcpu, VMCS_IO_BITMAP_A, &bm); if (error == 0) printf("io_bitmap_a[%d]\t%#lx\n", vcpu, bm); error = vm_get_vmcs_field(ctx, vcpu, VMCS_IO_BITMAP_B, &bm); if (error == 0) printf("io_bitmap_b[%d]\t%#lx\n", vcpu, bm); } else { error = vm_get_vmcb_field(ctx, vcpu, VMCB_OFF_IO_PERM, 8, &bm); if (error == 0) printf("io_bitmap[%d]\t%#lx\n", vcpu, bm); } } if (!error && (get_tsc_offset || get_all)) { uint64_t tscoff; if (cpu_intel) error = vm_get_vmcs_field(ctx, vcpu, VMCS_TSC_OFFSET, &tscoff); else error = vm_get_vmcb_field(ctx, vcpu, VMCB_OFF_TSC_OFFSET, 8, &tscoff); if (error == 0) printf("tsc_offset[%d]\t0x%016lx\n", vcpu, tscoff); } if (!error && (get_msr_bitmap_address || get_all)) { if (cpu_intel) error = vm_get_vmcs_field(ctx, vcpu, VMCS_MSR_BITMAP, &addr); else error = vm_get_vmcb_field(ctx, vcpu, VMCB_OFF_MSR_PERM, 8, &addr); if (error == 0) printf("msr_bitmap[%d]\t\t%#lx\n", vcpu, addr); } if (!error && (get_msr_bitmap || get_all)) { if (cpu_intel) { error = vm_get_vmcs_field(ctx, vcpu, VMCS_MSR_BITMAP, &addr); } else { error = vm_get_vmcb_field(ctx, vcpu, VMCB_OFF_MSR_PERM, 8, &addr); } if (error == 0) error = dump_msr_bitmap(vcpu, addr, cpu_intel); } if (!error && (get_vpid_asid || get_all)) { uint64_t vpid; if (cpu_intel) error = vm_get_vmcs_field(ctx, vcpu, VMCS_VPID, &vpid); else error = vm_get_vmcb_field(ctx, vcpu, VMCB_OFF_ASID, 4, &vpid); if (error == 0) printf("%s[%d]\t\t0x%04lx\n", cpu_intel ? "vpid" : "asid", vcpu, vpid); } if (!error && (get_guest_pat || get_all)) { if (cpu_intel) error = vm_get_vmcs_field(ctx, vcpu, VMCS_GUEST_IA32_PAT, &pat); else error = vm_get_vmcb_field(ctx, vcpu, VMCB_OFF_GUEST_PAT, 8, &pat); if (error == 0) printf("guest_pat[%d]\t\t0x%016lx\n", vcpu, pat); } if (!error && (get_guest_sysenter || get_all)) { if (cpu_intel) error = vm_get_vmcs_field(ctx, vcpu, VMCS_GUEST_IA32_SYSENTER_CS, &cs); else error = vm_get_vmcb_field(ctx, vcpu, VMCB_OFF_SYSENTER_CS, 8, &cs); if (error == 0) printf("guest_sysenter_cs[%d]\t%#lx\n", vcpu, cs); if (cpu_intel) error = vm_get_vmcs_field(ctx, vcpu, VMCS_GUEST_IA32_SYSENTER_ESP, &rsp); else error = vm_get_vmcb_field(ctx, vcpu, VMCB_OFF_SYSENTER_ESP, 8, &rsp); if (error == 0) printf("guest_sysenter_sp[%d]\t%#lx\n", vcpu, rsp); if (cpu_intel) error = vm_get_vmcs_field(ctx, vcpu, VMCS_GUEST_IA32_SYSENTER_EIP, &rip); else error = vm_get_vmcb_field(ctx, vcpu, VMCB_OFF_SYSENTER_EIP, 8, &rip); if (error == 0) printf("guest_sysenter_ip[%d]\t%#lx\n", vcpu, rip); } if (!error && (get_exit_reason || get_all)) { if (cpu_intel) error = vm_get_vmcs_field(ctx, vcpu, VMCS_EXIT_REASON, &u64); else error = vm_get_vmcb_field(ctx, vcpu, VMCB_OFF_EXIT_REASON, 8, &u64); if (error == 0) printf("exit_reason[%d]\t%#lx\n", vcpu, u64); } if (!error && setcap) { int captype; captype = vm_capability_name2type(capname); error = vm_set_capability(ctx, vcpu, captype, capval); if (error != 0 && errno == ENOENT) printf("Capability \"%s\" is not available\n", capname); } if (!error && get_gpa_pmap) { error = vm_get_gpa_pmap(ctx, gpa_pmap, pteval, &ptenum); if (error == 0) { printf("gpa %#lx:", gpa_pmap); pte = &pteval[0]; while (ptenum-- > 0) printf(" %#lx", *pte++); printf("\n"); } } if (!error && set_rtc_nvram) error = vm_rtc_write(ctx, rtc_nvram_offset, rtc_nvram_value); if (!error && (get_rtc_nvram || get_all)) { error = vm_rtc_read(ctx, rtc_nvram_offset, &rtc_nvram_value); if (error == 0) { printf("rtc nvram[%03d]: 0x%02x\n", rtc_nvram_offset, rtc_nvram_value); } } if (!error && set_rtc_time) error = vm_rtc_settime(ctx, rtc_secs); if (!error && (get_rtc_time || get_all)) { error = vm_rtc_gettime(ctx, &rtc_secs); if (error == 0) { gmtime_r(&rtc_secs, &tm); printf("rtc time %#lx: %s %s %02d %02d:%02d:%02d %d\n", rtc_secs, wday_str(tm.tm_wday), mon_str(tm.tm_mon), tm.tm_mday, tm.tm_hour, tm.tm_min, tm.tm_sec, 1900 + tm.tm_year); } } if (!error && (getcap || get_all)) { int captype, val, getcaptype; if (getcap && capname) getcaptype = vm_capability_name2type(capname); else getcaptype = -1; for (captype = 0; captype < VM_CAP_MAX; captype++) { if (getcaptype >= 0 && captype != getcaptype) continue; error = vm_get_capability(ctx, vcpu, captype, &val); if (error == 0) { printf("Capability \"%s\" is %s on vcpu %d\n", vm_capability_type2name(captype), val ? "set" : "not set", vcpu); } else if (errno == ENOENT) { error = 0; printf("Capability \"%s\" is not available\n", vm_capability_type2name(captype)); } else { break; } } } if (!error && (get_active_cpus || get_all)) { error = vm_active_cpus(ctx, &cpus); if (!error) print_cpus("active cpus", &cpus); } if (!error && (get_suspended_cpus || get_all)) { error = vm_suspended_cpus(ctx, &cpus); if (!error) print_cpus("suspended cpus", &cpus); } if (!error && (get_intinfo || get_all)) { error = vm_get_intinfo(ctx, vcpu, &info[0], &info[1]); if (!error) { print_intinfo("pending", info[0]); print_intinfo("current", info[1]); } } if (!error && (get_stats || get_all)) { int i, num_stats; uint64_t *stats; struct timeval tv; const char *desc; stats = vm_get_stats(ctx, vcpu, &tv, &num_stats); if (stats != NULL) { printf("vcpu%d stats:\n", vcpu); for (i = 0; i < num_stats; i++) { desc = vm_get_stat_desc(ctx, i); printf("%-40s\t%ld\n", desc, stats[i]); } } } if (!error && run) { error = vm_run(ctx, vcpu, &vmexit); if (error == 0) dump_vm_run_exitcode(&vmexit, vcpu); else printf("vm_run error %d\n", error); } if (!error && force_reset) error = vm_suspend(ctx, VM_SUSPEND_RESET); if (!error && force_poweroff) error = vm_suspend(ctx, VM_SUSPEND_POWEROFF); if (error) printf("errno = %d\n", errno); if (!error && destroy) vm_destroy(ctx); free (opts); exit(error); } Index: user/ngie/more-tests/usr.sbin/bhyvectl =================================================================== --- user/ngie/more-tests/usr.sbin/bhyvectl (revision 281584) +++ user/ngie/more-tests/usr.sbin/bhyvectl (revision 281585) Property changes on: user/ngie/more-tests/usr.sbin/bhyvectl ___________________________________________________________________ Modified: svn:mergeinfo ## -0,0 +0,1 ## Merged /head/usr.sbin/bhyvectl:r281414-281584 Index: user/ngie/more-tests/usr.sbin/ctld/discovery.c =================================================================== --- user/ngie/more-tests/usr.sbin/ctld/discovery.c (revision 281584) +++ user/ngie/more-tests/usr.sbin/ctld/discovery.c (revision 281585) @@ -1,337 +1,336 @@ /*- * Copyright (c) 2012 The FreeBSD Foundation * All rights reserved. * * This software was developed by Edward Tomasz Napierala under sponsorship * from the FreeBSD Foundation. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * */ #include __FBSDID("$FreeBSD$"); #include -#include #include #include #include #include #include #include #include "ctld.h" #include "iscsi_proto.h" static struct pdu * text_receive(struct connection *conn) { struct pdu *request; struct iscsi_bhs_text_request *bhstr; request = pdu_new(conn); pdu_receive(request); if ((request->pdu_bhs->bhs_opcode & ~ISCSI_BHS_OPCODE_IMMEDIATE) != ISCSI_BHS_OPCODE_TEXT_REQUEST) log_errx(1, "protocol error: received invalid opcode 0x%x", request->pdu_bhs->bhs_opcode); bhstr = (struct iscsi_bhs_text_request *)request->pdu_bhs; #if 0 if ((bhstr->bhstr_flags & ISCSI_BHSTR_FLAGS_FINAL) == 0) log_errx(1, "received Text PDU without the \"F\" flag"); #endif /* * XXX: Implement the C flag some day. */ if ((bhstr->bhstr_flags & BHSTR_FLAGS_CONTINUE) != 0) log_errx(1, "received Text PDU with unsupported \"C\" flag"); if (ISCSI_SNLT(ntohl(bhstr->bhstr_cmdsn), conn->conn_cmdsn)) { log_errx(1, "received Text PDU with decreasing CmdSN: " "was %u, is %u", conn->conn_cmdsn, ntohl(bhstr->bhstr_cmdsn)); } if (ntohl(bhstr->bhstr_expstatsn) != conn->conn_statsn) { log_errx(1, "received Text PDU with wrong StatSN: " "is %u, should be %u", ntohl(bhstr->bhstr_expstatsn), conn->conn_statsn); } conn->conn_cmdsn = ntohl(bhstr->bhstr_cmdsn); if ((bhstr->bhstr_opcode & ISCSI_BHS_OPCODE_IMMEDIATE) == 0) conn->conn_cmdsn++; return (request); } static struct pdu * text_new_response(struct pdu *request) { struct pdu *response; struct connection *conn; struct iscsi_bhs_text_request *bhstr; struct iscsi_bhs_text_response *bhstr2; bhstr = (struct iscsi_bhs_text_request *)request->pdu_bhs; conn = request->pdu_connection; response = pdu_new_response(request); bhstr2 = (struct iscsi_bhs_text_response *)response->pdu_bhs; bhstr2->bhstr_opcode = ISCSI_BHS_OPCODE_TEXT_RESPONSE; bhstr2->bhstr_flags = BHSTR_FLAGS_FINAL; bhstr2->bhstr_lun = bhstr->bhstr_lun; bhstr2->bhstr_initiator_task_tag = bhstr->bhstr_initiator_task_tag; bhstr2->bhstr_target_transfer_tag = bhstr->bhstr_target_transfer_tag; bhstr2->bhstr_statsn = htonl(conn->conn_statsn++); bhstr2->bhstr_expcmdsn = htonl(conn->conn_cmdsn); bhstr2->bhstr_maxcmdsn = htonl(conn->conn_cmdsn); return (response); } static struct pdu * logout_receive(struct connection *conn) { struct pdu *request; struct iscsi_bhs_logout_request *bhslr; request = pdu_new(conn); pdu_receive(request); if ((request->pdu_bhs->bhs_opcode & ~ISCSI_BHS_OPCODE_IMMEDIATE) != ISCSI_BHS_OPCODE_LOGOUT_REQUEST) log_errx(1, "protocol error: received invalid opcode 0x%x", request->pdu_bhs->bhs_opcode); bhslr = (struct iscsi_bhs_logout_request *)request->pdu_bhs; if ((bhslr->bhslr_reason & 0x7f) != BHSLR_REASON_CLOSE_SESSION) log_debugx("received Logout PDU with invalid reason 0x%x; " "continuing anyway", bhslr->bhslr_reason & 0x7f); if (ISCSI_SNLT(ntohl(bhslr->bhslr_cmdsn), conn->conn_cmdsn)) { log_errx(1, "received Logout PDU with decreasing CmdSN: " "was %u, is %u", conn->conn_cmdsn, ntohl(bhslr->bhslr_cmdsn)); } if (ntohl(bhslr->bhslr_expstatsn) != conn->conn_statsn) { log_errx(1, "received Logout PDU with wrong StatSN: " "is %u, should be %u", ntohl(bhslr->bhslr_expstatsn), conn->conn_statsn); } conn->conn_cmdsn = ntohl(bhslr->bhslr_cmdsn); if ((bhslr->bhslr_opcode & ISCSI_BHS_OPCODE_IMMEDIATE) == 0) conn->conn_cmdsn++; return (request); } static struct pdu * logout_new_response(struct pdu *request) { struct pdu *response; struct connection *conn; struct iscsi_bhs_logout_request *bhslr; struct iscsi_bhs_logout_response *bhslr2; bhslr = (struct iscsi_bhs_logout_request *)request->pdu_bhs; conn = request->pdu_connection; response = pdu_new_response(request); bhslr2 = (struct iscsi_bhs_logout_response *)response->pdu_bhs; bhslr2->bhslr_opcode = ISCSI_BHS_OPCODE_LOGOUT_RESPONSE; bhslr2->bhslr_flags = 0x80; bhslr2->bhslr_response = BHSLR_RESPONSE_CLOSED_SUCCESSFULLY; bhslr2->bhslr_initiator_task_tag = bhslr->bhslr_initiator_task_tag; bhslr2->bhslr_statsn = htonl(conn->conn_statsn++); bhslr2->bhslr_expcmdsn = htonl(conn->conn_cmdsn); bhslr2->bhslr_maxcmdsn = htonl(conn->conn_cmdsn); return (response); } static void discovery_add_target(struct keys *response_keys, const struct target *targ) { struct port *port; struct portal *portal; char *buf; char hbuf[NI_MAXHOST], sbuf[NI_MAXSERV]; struct addrinfo *ai; int ret; keys_add(response_keys, "TargetName", targ->t_name); TAILQ_FOREACH(port, &targ->t_ports, p_ts) { if (port->p_portal_group == NULL) continue; TAILQ_FOREACH(portal, &port->p_portal_group->pg_portals, p_next) { ai = portal->p_ai; ret = getnameinfo(ai->ai_addr, ai->ai_addrlen, hbuf, sizeof(hbuf), sbuf, sizeof(sbuf), NI_NUMERICHOST | NI_NUMERICSERV); if (ret != 0) { log_warnx("getnameinfo: %s", gai_strerror(ret)); continue; } switch (ai->ai_addr->sa_family) { case AF_INET: if (strcmp(hbuf, "0.0.0.0") == 0) continue; ret = asprintf(&buf, "%s:%s,%d", hbuf, sbuf, port->p_portal_group->pg_tag); break; case AF_INET6: if (strcmp(hbuf, "::") == 0) continue; ret = asprintf(&buf, "[%s]:%s,%d", hbuf, sbuf, port->p_portal_group->pg_tag); break; default: continue; } if (ret <= 0) log_err(1, "asprintf"); keys_add(response_keys, "TargetAddress", buf); free(buf); } } } static bool discovery_target_filtered_out(const struct connection *conn, const struct port *port) { const struct auth_group *ag; const struct portal_group *pg; const struct target *targ; const struct auth *auth; int error; targ = port->p_target; ag = port->p_auth_group; if (ag == NULL) ag = targ->t_auth_group; pg = conn->conn_portal->p_portal_group; assert(pg->pg_discovery_auth_group != PG_FILTER_UNKNOWN); if (pg->pg_discovery_filter >= PG_FILTER_PORTAL && auth_portal_check(ag, &conn->conn_initiator_sa) != 0) { log_debugx("initiator does not match initiator portals " "allowed for target \"%s\"; skipping", targ->t_name); return (true); } if (pg->pg_discovery_filter >= PG_FILTER_PORTAL_NAME && auth_name_check(ag, conn->conn_initiator_name) != 0) { log_debugx("initiator does not match initiator names " "allowed for target \"%s\"; skipping", targ->t_name); return (true); } if (pg->pg_discovery_filter >= PG_FILTER_PORTAL_NAME_AUTH && ag->ag_type != AG_TYPE_NO_AUTHENTICATION) { if (conn->conn_chap == NULL) { assert(pg->pg_discovery_auth_group->ag_type == AG_TYPE_NO_AUTHENTICATION); log_debugx("initiator didn't authenticate, but target " "\"%s\" requires CHAP; skipping", targ->t_name); return (true); } assert(conn->conn_user != NULL); auth = auth_find(ag, conn->conn_user); if (auth == NULL) { log_debugx("CHAP user \"%s\" doesn't match target " "\"%s\"; skipping", conn->conn_user, targ->t_name); return (true); } error = chap_authenticate(conn->conn_chap, auth->a_secret); if (error != 0) { log_debugx("password for CHAP user \"%s\" doesn't " "match target \"%s\"; skipping", conn->conn_user, targ->t_name); return (true); } } return (false); } void discovery(struct connection *conn) { struct pdu *request, *response; struct keys *request_keys, *response_keys; const struct port *port; const struct portal_group *pg; const char *send_targets; pg = conn->conn_portal->p_portal_group; log_debugx("beginning discovery session; waiting for Text PDU"); request = text_receive(conn); request_keys = keys_new(); keys_load(request_keys, request); send_targets = keys_find(request_keys, "SendTargets"); if (send_targets == NULL) log_errx(1, "received Text PDU without SendTargets"); response = text_new_response(request); response_keys = keys_new(); if (strcmp(send_targets, "All") == 0) { TAILQ_FOREACH(port, &pg->pg_ports, p_pgs) { if (discovery_target_filtered_out(conn, port)) { /* Ignore this target. */ continue; } discovery_add_target(response_keys, port->p_target); } } else { port = port_find_in_pg(pg, send_targets); if (port == NULL) { log_debugx("initiator requested information on unknown " "target \"%s\"; returning nothing", send_targets); } else { if (discovery_target_filtered_out(conn, port)) { /* Ignore this target. */ } else { discovery_add_target(response_keys, port->p_target); } } } keys_save(response_keys, response); pdu_send(response); pdu_delete(response); keys_delete(response_keys); pdu_delete(request); keys_delete(request_keys); log_debugx("done sending targets; waiting for Logout PDU"); request = logout_receive(conn); response = logout_new_response(request); pdu_send(response); pdu_delete(response); pdu_delete(request); log_debugx("discovery session done"); } Index: user/ngie/more-tests/usr.sbin/ctld/isns.c =================================================================== --- user/ngie/more-tests/usr.sbin/ctld/isns.c (revision 281584) +++ user/ngie/more-tests/usr.sbin/ctld/isns.c (revision 281585) @@ -1,258 +1,252 @@ /*- * Copyright (c) 2014 Alexander Motin * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * */ #include __FBSDID("$FreeBSD$"); #include #include #include #include #include #include #include -#include -#include -#include #include -#include #include -#include -#include #include #include #include #include "ctld.h" #include "isns.h" struct isns_req * isns_req_alloc(void) { struct isns_req *req; req = calloc(sizeof(struct isns_req), 1); if (req == NULL) { log_err(1, "calloc"); return (NULL); } req->ir_buflen = sizeof(struct isns_hdr); req->ir_usedlen = 0; req->ir_buf = calloc(req->ir_buflen, 1); if (req->ir_buf == NULL) { free(req); log_err(1, "calloc"); return (NULL); } return (req); } struct isns_req * isns_req_create(uint16_t func, uint16_t flags) { struct isns_req *req; struct isns_hdr *hdr; req = isns_req_alloc(); req->ir_usedlen = sizeof(struct isns_hdr); hdr = (struct isns_hdr *)req->ir_buf; be16enc(hdr->ih_version, ISNS_VERSION); be16enc(hdr->ih_function, func); be16enc(hdr->ih_flags, flags); return (req); } void isns_req_free(struct isns_req *req) { free(req->ir_buf); free(req); } static int isns_req_getspace(struct isns_req *req, uint32_t len) { void *newbuf; int newlen; if (req->ir_usedlen + len <= req->ir_buflen) return (0); newlen = 1 << flsl(req->ir_usedlen + len); newbuf = realloc(req->ir_buf, newlen); if (newbuf == NULL) { log_err(1, "realloc"); return (1); } req->ir_buf = newbuf; req->ir_buflen = newlen; return (0); } void isns_req_add(struct isns_req *req, uint32_t tag, uint32_t len, const void *value) { struct isns_tlv *tlv; uint32_t vlen; vlen = len + ((len & 3) ? (4 - (len & 3)) : 0); isns_req_getspace(req, sizeof(*tlv) + vlen); tlv = (struct isns_tlv *)&req->ir_buf[req->ir_usedlen]; be32enc(tlv->it_tag, tag); be32enc(tlv->it_length, vlen); memcpy(tlv->it_value, value, len); if (vlen != len) memset(&tlv->it_value[len], 0, vlen - len); req->ir_usedlen += sizeof(*tlv) + vlen; } void isns_req_add_delim(struct isns_req *req) { isns_req_add(req, 0, 0, NULL); } void isns_req_add_str(struct isns_req *req, uint32_t tag, const char *value) { isns_req_add(req, tag, strlen(value) + 1, value); } void isns_req_add_32(struct isns_req *req, uint32_t tag, uint32_t value) { uint32_t beval; be32enc(&beval, value); isns_req_add(req, tag, sizeof(value), &beval); } void isns_req_add_addr(struct isns_req *req, uint32_t tag, struct addrinfo *ai) { struct sockaddr_in *in4; struct sockaddr_in6 *in6; uint8_t buf[16]; switch (ai->ai_addr->sa_family) { case AF_INET: in4 = (struct sockaddr_in *)(void *)ai->ai_addr; memset(buf, 0, 10); buf[10] = 0xff; buf[11] = 0xff; memcpy(&buf[12], &in4->sin_addr, sizeof(in4->sin_addr)); isns_req_add(req, tag, sizeof(buf), buf); break; case AF_INET6: in6 = (struct sockaddr_in6 *)(void *)ai->ai_addr; isns_req_add(req, tag, sizeof(in6->sin6_addr), &in6->sin6_addr); break; default: log_errx(1, "Unsupported address family %d", ai->ai_addr->sa_family); } } void isns_req_add_port(struct isns_req *req, uint32_t tag, struct addrinfo *ai) { struct sockaddr_in *in4; struct sockaddr_in6 *in6; uint32_t buf; switch (ai->ai_addr->sa_family) { case AF_INET: in4 = (struct sockaddr_in *)(void *)ai->ai_addr; be32enc(&buf, ntohs(in4->sin_port)); isns_req_add(req, tag, sizeof(buf), &buf); break; case AF_INET6: in6 = (struct sockaddr_in6 *)(void *)ai->ai_addr; be32enc(&buf, ntohs(in6->sin6_port)); isns_req_add(req, tag, sizeof(buf), &buf); break; default: log_errx(1, "Unsupported address family %d", ai->ai_addr->sa_family); } } int isns_req_send(int s, struct isns_req *req) { struct isns_hdr *hdr; int res; hdr = (struct isns_hdr *)req->ir_buf; be16enc(hdr->ih_length, req->ir_usedlen - sizeof(*hdr)); be16enc(hdr->ih_flags, be16dec(hdr->ih_flags) | ISNS_FLAG_LAST | ISNS_FLAG_FIRST); be16enc(hdr->ih_transaction, 0); be16enc(hdr->ih_sequence, 0); res = write(s, req->ir_buf, req->ir_usedlen); return ((res < 0) ? -1 : 0); } int isns_req_receive(int s, struct isns_req *req) { struct isns_hdr *hdr; ssize_t res, len; req->ir_usedlen = 0; isns_req_getspace(req, sizeof(*hdr)); res = read(s, req->ir_buf, sizeof(*hdr)); if (res < (ssize_t)sizeof(*hdr)) return (-1); req->ir_usedlen = sizeof(*hdr); hdr = (struct isns_hdr *)req->ir_buf; if (be16dec(hdr->ih_version) != ISNS_VERSION) return (-1); if ((be16dec(hdr->ih_flags) & (ISNS_FLAG_LAST | ISNS_FLAG_FIRST)) != (ISNS_FLAG_LAST | ISNS_FLAG_FIRST)) return (-1); len = be16dec(hdr->ih_length); isns_req_getspace(req, len); res = read(s, &req->ir_buf[req->ir_usedlen], len); if (res < len) return (-1); req->ir_usedlen += len; return (0); } uint32_t isns_req_get_status(struct isns_req *req) { if (req->ir_usedlen < sizeof(struct isns_hdr) + 4) return (-1); return (be32dec(&req->ir_buf[sizeof(struct isns_hdr)])); } Index: user/ngie/more-tests/usr.sbin/ctld/keys.c =================================================================== --- user/ngie/more-tests/usr.sbin/ctld/keys.c (revision 281584) +++ user/ngie/more-tests/usr.sbin/ctld/keys.c (revision 281585) @@ -1,198 +1,197 @@ /*- * Copyright (c) 2012 The FreeBSD Foundation * All rights reserved. * * This software was developed by Edward Tomasz Napierala under sponsorship * from the FreeBSD Foundation. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * */ #include __FBSDID("$FreeBSD$"); #include -#include #include #include #include #include "ctld.h" struct keys * keys_new(void) { struct keys *keys; keys = calloc(sizeof(*keys), 1); if (keys == NULL) log_err(1, "calloc"); return (keys); } void keys_delete(struct keys *keys) { free(keys->keys_data); free(keys); } void keys_load(struct keys *keys, const struct pdu *pdu) { int i; char *pair; size_t pair_len; if (pdu->pdu_data_len == 0) return; if (pdu->pdu_data[pdu->pdu_data_len - 1] != '\0') log_errx(1, "protocol error: key not NULL-terminated\n"); assert(keys->keys_data == NULL); keys->keys_data_len = pdu->pdu_data_len; keys->keys_data = malloc(keys->keys_data_len); if (keys->keys_data == NULL) log_err(1, "malloc"); memcpy(keys->keys_data, pdu->pdu_data, keys->keys_data_len); /* * XXX: Review this carefully. */ pair = keys->keys_data; for (i = 0;; i++) { if (i >= KEYS_MAX) log_errx(1, "too many keys received"); pair_len = strlen(pair); keys->keys_values[i] = pair; keys->keys_names[i] = strsep(&keys->keys_values[i], "="); if (keys->keys_names[i] == NULL || keys->keys_values[i] == NULL) log_errx(1, "malformed keys"); log_debugx("key received: \"%s=%s\"", keys->keys_names[i], keys->keys_values[i]); pair += pair_len + 1; /* +1 to skip the terminating '\0'. */ if (pair == keys->keys_data + keys->keys_data_len) break; assert(pair < keys->keys_data + keys->keys_data_len); } } void keys_save(struct keys *keys, struct pdu *pdu) { char *data; size_t len; int i; /* * XXX: Not particularly efficient. */ len = 0; for (i = 0; i < KEYS_MAX; i++) { if (keys->keys_names[i] == NULL) break; /* * +1 for '=', +1 for '\0'. */ len += strlen(keys->keys_names[i]) + strlen(keys->keys_values[i]) + 2; } if (len == 0) return; data = malloc(len); if (data == NULL) log_err(1, "malloc"); pdu->pdu_data = data; pdu->pdu_data_len = len; for (i = 0; i < KEYS_MAX; i++) { if (keys->keys_names[i] == NULL) break; data += sprintf(data, "%s=%s", keys->keys_names[i], keys->keys_values[i]); data += 1; /* for '\0'. */ } } const char * keys_find(struct keys *keys, const char *name) { int i; /* * Note that we don't handle duplicated key names here, * as they are not supposed to happen in requests, and if they do, * it's an initiator error. */ for (i = 0; i < KEYS_MAX; i++) { if (keys->keys_names[i] == NULL) return (NULL); if (strcmp(keys->keys_names[i], name) == 0) return (keys->keys_values[i]); } return (NULL); } void keys_add(struct keys *keys, const char *name, const char *value) { int i; log_debugx("key to send: \"%s=%s\"", name, value); /* * Note that we don't check for duplicates here, as they are perfectly * fine in responses, e.g. the "TargetName" keys in discovery sesion * response. */ for (i = 0; i < KEYS_MAX; i++) { if (keys->keys_names[i] == NULL) { keys->keys_names[i] = checked_strdup(name); keys->keys_values[i] = checked_strdup(value); return; } } log_errx(1, "too many keys"); } void keys_add_int(struct keys *keys, const char *name, int value) { char *str; int ret; ret = asprintf(&str, "%d", value); if (ret <= 0) log_err(1, "asprintf"); keys_add(keys, name, str); free(str); } Index: user/ngie/more-tests/usr.sbin/ctld/login.c =================================================================== --- user/ngie/more-tests/usr.sbin/ctld/login.c (revision 281584) +++ user/ngie/more-tests/usr.sbin/ctld/login.c (revision 281585) @@ -1,1006 +1,1004 @@ /*- * Copyright (c) 2012 The FreeBSD Foundation * All rights reserved. * * This software was developed by Edward Tomasz Napierala under sponsorship * from the FreeBSD Foundation. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * */ #include __FBSDID("$FreeBSD$"); #include #include -#include -#include #include #include #include #include #include "ctld.h" #include "iscsi_proto.h" static void login_send_error(struct pdu *request, char class, char detail); static void login_set_nsg(struct pdu *response, int nsg) { struct iscsi_bhs_login_response *bhslr; assert(nsg == BHSLR_STAGE_SECURITY_NEGOTIATION || nsg == BHSLR_STAGE_OPERATIONAL_NEGOTIATION || nsg == BHSLR_STAGE_FULL_FEATURE_PHASE); bhslr = (struct iscsi_bhs_login_response *)response->pdu_bhs; bhslr->bhslr_flags &= 0xFC; bhslr->bhslr_flags |= nsg; } static int login_csg(const struct pdu *request) { struct iscsi_bhs_login_request *bhslr; bhslr = (struct iscsi_bhs_login_request *)request->pdu_bhs; return ((bhslr->bhslr_flags & 0x0C) >> 2); } static void login_set_csg(struct pdu *response, int csg) { struct iscsi_bhs_login_response *bhslr; assert(csg == BHSLR_STAGE_SECURITY_NEGOTIATION || csg == BHSLR_STAGE_OPERATIONAL_NEGOTIATION || csg == BHSLR_STAGE_FULL_FEATURE_PHASE); bhslr = (struct iscsi_bhs_login_response *)response->pdu_bhs; bhslr->bhslr_flags &= 0xF3; bhslr->bhslr_flags |= csg << 2; } static struct pdu * login_receive(struct connection *conn, bool initial) { struct pdu *request; struct iscsi_bhs_login_request *bhslr; request = pdu_new(conn); pdu_receive(request); if ((request->pdu_bhs->bhs_opcode & ~ISCSI_BHS_OPCODE_IMMEDIATE) != ISCSI_BHS_OPCODE_LOGIN_REQUEST) { /* * The first PDU in session is special - if we receive any PDU * different than login request, we have to drop the connection * without sending response ("A target receiving any PDU * except a Login request before the Login Phase is started MUST * immediately terminate the connection on which the PDU * was received.") */ if (initial == false) login_send_error(request, 0x02, 0x0b); log_errx(1, "protocol error: received invalid opcode 0x%x", request->pdu_bhs->bhs_opcode); } bhslr = (struct iscsi_bhs_login_request *)request->pdu_bhs; /* * XXX: Implement the C flag some day. */ if ((bhslr->bhslr_flags & BHSLR_FLAGS_CONTINUE) != 0) { login_send_error(request, 0x03, 0x00); log_errx(1, "received Login PDU with unsupported \"C\" flag"); } if (bhslr->bhslr_version_max != 0x00) { login_send_error(request, 0x02, 0x05); log_errx(1, "received Login PDU with unsupported " "Version-max 0x%x", bhslr->bhslr_version_max); } if (bhslr->bhslr_version_min != 0x00) { login_send_error(request, 0x02, 0x05); log_errx(1, "received Login PDU with unsupported " "Version-min 0x%x", bhslr->bhslr_version_min); } if (ISCSI_SNLT(ntohl(bhslr->bhslr_cmdsn), conn->conn_cmdsn)) { login_send_error(request, 0x02, 0x05); log_errx(1, "received Login PDU with decreasing CmdSN: " "was %u, is %u", conn->conn_cmdsn, ntohl(bhslr->bhslr_cmdsn)); } if (initial == false && ntohl(bhslr->bhslr_expstatsn) != conn->conn_statsn) { login_send_error(request, 0x02, 0x05); log_errx(1, "received Login PDU with wrong ExpStatSN: " "is %u, should be %u", ntohl(bhslr->bhslr_expstatsn), conn->conn_statsn); } conn->conn_cmdsn = ntohl(bhslr->bhslr_cmdsn); return (request); } static struct pdu * login_new_response(struct pdu *request) { struct pdu *response; struct connection *conn; struct iscsi_bhs_login_request *bhslr; struct iscsi_bhs_login_response *bhslr2; bhslr = (struct iscsi_bhs_login_request *)request->pdu_bhs; conn = request->pdu_connection; response = pdu_new_response(request); bhslr2 = (struct iscsi_bhs_login_response *)response->pdu_bhs; bhslr2->bhslr_opcode = ISCSI_BHS_OPCODE_LOGIN_RESPONSE; login_set_csg(response, BHSLR_STAGE_SECURITY_NEGOTIATION); memcpy(bhslr2->bhslr_isid, bhslr->bhslr_isid, sizeof(bhslr2->bhslr_isid)); bhslr2->bhslr_initiator_task_tag = bhslr->bhslr_initiator_task_tag; bhslr2->bhslr_statsn = htonl(conn->conn_statsn++); bhslr2->bhslr_expcmdsn = htonl(conn->conn_cmdsn); bhslr2->bhslr_maxcmdsn = htonl(conn->conn_cmdsn); return (response); } static void login_send_error(struct pdu *request, char class, char detail) { struct pdu *response; struct iscsi_bhs_login_response *bhslr2; log_debugx("sending Login Response PDU with failure class 0x%x/0x%x; " "see next line for reason", class, detail); response = login_new_response(request); bhslr2 = (struct iscsi_bhs_login_response *)response->pdu_bhs; bhslr2->bhslr_status_class = class; bhslr2->bhslr_status_detail = detail; pdu_send(response); pdu_delete(response); } static int login_list_contains(const char *list, const char *what) { char *tofree, *str, *token; tofree = str = checked_strdup(list); while ((token = strsep(&str, ",")) != NULL) { if (strcmp(token, what) == 0) { free(tofree); return (1); } } free(tofree); return (0); } static int login_list_prefers(const char *list, const char *choice1, const char *choice2) { char *tofree, *str, *token; tofree = str = checked_strdup(list); while ((token = strsep(&str, ",")) != NULL) { if (strcmp(token, choice1) == 0) { free(tofree); return (1); } if (strcmp(token, choice2) == 0) { free(tofree); return (2); } } free(tofree); return (-1); } static struct pdu * login_receive_chap_a(struct connection *conn) { struct pdu *request; struct keys *request_keys; const char *chap_a; request = login_receive(conn, false); request_keys = keys_new(); keys_load(request_keys, request); chap_a = keys_find(request_keys, "CHAP_A"); if (chap_a == NULL) { login_send_error(request, 0x02, 0x07); log_errx(1, "received CHAP Login PDU without CHAP_A"); } if (login_list_contains(chap_a, "5") == 0) { login_send_error(request, 0x02, 0x01); log_errx(1, "received CHAP Login PDU with unsupported CHAP_A " "\"%s\"", chap_a); } keys_delete(request_keys); return (request); } static void login_send_chap_c(struct pdu *request, struct chap *chap) { struct pdu *response; struct keys *response_keys; char *chap_c, *chap_i; chap_c = chap_get_challenge(chap); chap_i = chap_get_id(chap); response = login_new_response(request); response_keys = keys_new(); keys_add(response_keys, "CHAP_A", "5"); keys_add(response_keys, "CHAP_I", chap_i); keys_add(response_keys, "CHAP_C", chap_c); free(chap_i); free(chap_c); keys_save(response_keys, response); pdu_send(response); pdu_delete(response); keys_delete(response_keys); } static struct pdu * login_receive_chap_r(struct connection *conn, struct auth_group *ag, struct chap *chap, const struct auth **authp) { struct pdu *request; struct keys *request_keys; const char *chap_n, *chap_r; const struct auth *auth; int error; request = login_receive(conn, false); request_keys = keys_new(); keys_load(request_keys, request); chap_n = keys_find(request_keys, "CHAP_N"); if (chap_n == NULL) { login_send_error(request, 0x02, 0x07); log_errx(1, "received CHAP Login PDU without CHAP_N"); } chap_r = keys_find(request_keys, "CHAP_R"); if (chap_r == NULL) { login_send_error(request, 0x02, 0x07); log_errx(1, "received CHAP Login PDU without CHAP_R"); } error = chap_receive(chap, chap_r); if (error != 0) { login_send_error(request, 0x02, 0x07); log_errx(1, "received CHAP Login PDU with malformed CHAP_R"); } /* * Verify the response. */ assert(ag->ag_type == AG_TYPE_CHAP || ag->ag_type == AG_TYPE_CHAP_MUTUAL); auth = auth_find(ag, chap_n); if (auth == NULL) { login_send_error(request, 0x02, 0x01); log_errx(1, "received CHAP Login with invalid user \"%s\"", chap_n); } assert(auth->a_secret != NULL); assert(strlen(auth->a_secret) > 0); error = chap_authenticate(chap, auth->a_secret); if (error != 0) { login_send_error(request, 0x02, 0x01); log_errx(1, "CHAP authentication failed for user \"%s\"", auth->a_user); } keys_delete(request_keys); *authp = auth; return (request); } static void login_send_chap_success(struct pdu *request, const struct auth *auth) { struct pdu *response; struct keys *request_keys, *response_keys; struct iscsi_bhs_login_response *bhslr2; struct rchap *rchap; const char *chap_i, *chap_c; char *chap_r; int error; response = login_new_response(request); bhslr2 = (struct iscsi_bhs_login_response *)response->pdu_bhs; bhslr2->bhslr_flags |= BHSLR_FLAGS_TRANSIT; login_set_nsg(response, BHSLR_STAGE_OPERATIONAL_NEGOTIATION); /* * Actually, one more thing: mutual authentication. */ request_keys = keys_new(); keys_load(request_keys, request); chap_i = keys_find(request_keys, "CHAP_I"); chap_c = keys_find(request_keys, "CHAP_C"); if (chap_i != NULL || chap_c != NULL) { if (chap_i == NULL) { login_send_error(request, 0x02, 0x07); log_errx(1, "initiator requested target " "authentication, but didn't send CHAP_I"); } if (chap_c == NULL) { login_send_error(request, 0x02, 0x07); log_errx(1, "initiator requested target " "authentication, but didn't send CHAP_C"); } if (auth->a_auth_group->ag_type != AG_TYPE_CHAP_MUTUAL) { login_send_error(request, 0x02, 0x01); log_errx(1, "initiator requests target authentication " "for user \"%s\", but mutual user/secret " "is not set", auth->a_user); } log_debugx("performing mutual authentication as user \"%s\"", auth->a_mutual_user); rchap = rchap_new(auth->a_mutual_secret); error = rchap_receive(rchap, chap_i, chap_c); if (error != 0) { login_send_error(request, 0x02, 0x07); log_errx(1, "received CHAP Login PDU with malformed " "CHAP_I or CHAP_C"); } chap_r = rchap_get_response(rchap); rchap_delete(rchap); response_keys = keys_new(); keys_add(response_keys, "CHAP_N", auth->a_mutual_user); keys_add(response_keys, "CHAP_R", chap_r); free(chap_r); keys_save(response_keys, response); keys_delete(response_keys); } else { log_debugx("initiator did not request target authentication"); } keys_delete(request_keys); pdu_send(response); pdu_delete(response); } static void login_chap(struct connection *conn, struct auth_group *ag) { const struct auth *auth; struct chap *chap; struct pdu *request; /* * Receive CHAP_A PDU. */ log_debugx("beginning CHAP authentication; waiting for CHAP_A"); request = login_receive_chap_a(conn); /* * Generate the challenge. */ chap = chap_new(); /* * Send the challenge. */ log_debugx("sending CHAP_C, binary challenge size is %zd bytes", sizeof(chap->chap_challenge)); login_send_chap_c(request, chap); pdu_delete(request); /* * Receive CHAP_N/CHAP_R PDU and authenticate. */ log_debugx("waiting for CHAP_N/CHAP_R"); request = login_receive_chap_r(conn, ag, chap, &auth); /* * Yay, authentication succeeded! */ log_debugx("authentication succeeded for user \"%s\"; " "transitioning to Negotiation Phase", auth->a_user); login_send_chap_success(request, auth); pdu_delete(request); /* * Leave username and CHAP information for discovery(). */ conn->conn_user = auth->a_user; conn->conn_chap = chap; } static void login_negotiate_key(struct pdu *request, const char *name, const char *value, bool skipped_security, struct keys *response_keys) { int which; size_t tmp; struct connection *conn; conn = request->pdu_connection; if (strcmp(name, "InitiatorName") == 0) { if (!skipped_security) log_errx(1, "initiator resent InitiatorName"); } else if (strcmp(name, "SessionType") == 0) { if (!skipped_security) log_errx(1, "initiator resent SessionType"); } else if (strcmp(name, "TargetName") == 0) { if (!skipped_security) log_errx(1, "initiator resent TargetName"); } else if (strcmp(name, "InitiatorAlias") == 0) { if (conn->conn_initiator_alias != NULL) free(conn->conn_initiator_alias); conn->conn_initiator_alias = checked_strdup(value); } else if (strcmp(value, "Irrelevant") == 0) { /* Ignore. */ } else if (strcmp(name, "HeaderDigest") == 0) { /* * We don't handle digests for discovery sessions. */ if (conn->conn_session_type == CONN_SESSION_TYPE_DISCOVERY) { log_debugx("discovery session; digests disabled"); keys_add(response_keys, name, "None"); return; } which = login_list_prefers(value, "CRC32C", "None"); switch (which) { case 1: log_debugx("initiator prefers CRC32C " "for header digest; we'll use it"); conn->conn_header_digest = CONN_DIGEST_CRC32C; keys_add(response_keys, name, "CRC32C"); break; case 2: log_debugx("initiator prefers not to do " "header digest; we'll comply"); keys_add(response_keys, name, "None"); break; default: log_warnx("initiator sent unrecognized " "HeaderDigest value \"%s\"; will use None", value); keys_add(response_keys, name, "None"); break; } } else if (strcmp(name, "DataDigest") == 0) { if (conn->conn_session_type == CONN_SESSION_TYPE_DISCOVERY) { log_debugx("discovery session; digests disabled"); keys_add(response_keys, name, "None"); return; } which = login_list_prefers(value, "CRC32C", "None"); switch (which) { case 1: log_debugx("initiator prefers CRC32C " "for data digest; we'll use it"); conn->conn_data_digest = CONN_DIGEST_CRC32C; keys_add(response_keys, name, "CRC32C"); break; case 2: log_debugx("initiator prefers not to do " "data digest; we'll comply"); keys_add(response_keys, name, "None"); break; default: log_warnx("initiator sent unrecognized " "DataDigest value \"%s\"; will use None", value); keys_add(response_keys, name, "None"); break; } } else if (strcmp(name, "MaxConnections") == 0) { keys_add(response_keys, name, "1"); } else if (strcmp(name, "InitialR2T") == 0) { keys_add(response_keys, name, "Yes"); } else if (strcmp(name, "ImmediateData") == 0) { if (conn->conn_session_type == CONN_SESSION_TYPE_DISCOVERY) { log_debugx("discovery session; ImmediateData irrelevant"); keys_add(response_keys, name, "Irrelevant"); } else { if (strcmp(value, "Yes") == 0) { conn->conn_immediate_data = true; keys_add(response_keys, name, "Yes"); } else { conn->conn_immediate_data = false; keys_add(response_keys, name, "No"); } } } else if (strcmp(name, "MaxRecvDataSegmentLength") == 0) { tmp = strtoul(value, NULL, 10); if (tmp <= 0) { login_send_error(request, 0x02, 0x00); log_errx(1, "received invalid " "MaxRecvDataSegmentLength"); } if (tmp > conn->conn_data_segment_limit) { log_debugx("capping MaxRecvDataSegmentLength " "from %zd to %zd", tmp, conn->conn_data_segment_limit); tmp = conn->conn_data_segment_limit; } conn->conn_max_data_segment_length = tmp; keys_add_int(response_keys, name, tmp); } else if (strcmp(name, "MaxBurstLength") == 0) { tmp = strtoul(value, NULL, 10); if (tmp <= 0) { login_send_error(request, 0x02, 0x00); log_errx(1, "received invalid MaxBurstLength"); } if (tmp > MAX_BURST_LENGTH) { log_debugx("capping MaxBurstLength from %zd to %d", tmp, MAX_BURST_LENGTH); tmp = MAX_BURST_LENGTH; } conn->conn_max_burst_length = tmp; keys_add(response_keys, name, value); } else if (strcmp(name, "FirstBurstLength") == 0) { tmp = strtoul(value, NULL, 10); if (tmp <= 0) { login_send_error(request, 0x02, 0x00); log_errx(1, "received invalid " "FirstBurstLength"); } if (tmp > conn->conn_data_segment_limit) { log_debugx("capping FirstBurstLength from %zd to %zd", tmp, conn->conn_data_segment_limit); tmp = conn->conn_data_segment_limit; } /* * We don't pass the value to the kernel; it only enforces * hardcoded limit anyway. */ keys_add_int(response_keys, name, tmp); } else if (strcmp(name, "DefaultTime2Wait") == 0) { keys_add(response_keys, name, value); } else if (strcmp(name, "DefaultTime2Retain") == 0) { keys_add(response_keys, name, "0"); } else if (strcmp(name, "MaxOutstandingR2T") == 0) { keys_add(response_keys, name, "1"); } else if (strcmp(name, "DataPDUInOrder") == 0) { keys_add(response_keys, name, "Yes"); } else if (strcmp(name, "DataSequenceInOrder") == 0) { keys_add(response_keys, name, "Yes"); } else if (strcmp(name, "ErrorRecoveryLevel") == 0) { keys_add(response_keys, name, "0"); } else if (strcmp(name, "OFMarker") == 0) { keys_add(response_keys, name, "No"); } else if (strcmp(name, "IFMarker") == 0) { keys_add(response_keys, name, "No"); } else { log_debugx("unknown key \"%s\"; responding " "with NotUnderstood", name); keys_add(response_keys, name, "NotUnderstood"); } } static void login_redirect(struct pdu *request, const char *target_address) { struct pdu *response; struct iscsi_bhs_login_response *bhslr2; struct keys *response_keys; response = login_new_response(request); login_set_csg(response, login_csg(request)); bhslr2 = (struct iscsi_bhs_login_response *)response->pdu_bhs; bhslr2->bhslr_status_class = 0x01; bhslr2->bhslr_status_detail = 0x01; response_keys = keys_new(); keys_add(response_keys, "TargetAddress", target_address); keys_save(response_keys, response); pdu_send(response); pdu_delete(response); keys_delete(response_keys); } static bool login_portal_redirect(struct connection *conn, struct pdu *request) { const struct portal_group *pg; pg = conn->conn_portal->p_portal_group; if (pg->pg_redirection == NULL) return (false); log_debugx("portal-group \"%s\" configured to redirect to %s", pg->pg_name, pg->pg_redirection); login_redirect(request, pg->pg_redirection); return (true); } static bool login_target_redirect(struct connection *conn, struct pdu *request) { const char *target_address; assert(conn->conn_portal->p_portal_group->pg_redirection == NULL); if (conn->conn_target == NULL) return (false); target_address = conn->conn_target->t_redirection; if (target_address == NULL) return (false); log_debugx("target \"%s\" configured to redirect to %s", conn->conn_target->t_name, target_address); login_redirect(request, target_address); return (true); } static void login_negotiate(struct connection *conn, struct pdu *request) { struct pdu *response; struct iscsi_bhs_login_response *bhslr2; struct keys *request_keys, *response_keys; int i; bool redirected, skipped_security; if (conn->conn_session_type == CONN_SESSION_TYPE_NORMAL) { /* * Query the kernel for MaxDataSegmentLength it can handle. * In case of offload, it depends on hardware capabilities. */ assert(conn->conn_target != NULL); kernel_limits(conn->conn_portal->p_portal_group->pg_offload, &conn->conn_data_segment_limit); } else { conn->conn_data_segment_limit = MAX_DATA_SEGMENT_LENGTH; } if (request == NULL) { log_debugx("beginning operational parameter negotiation; " "waiting for Login PDU"); request = login_receive(conn, false); skipped_security = false; } else skipped_security = true; /* * RFC 3720, 10.13.5. Status-Class and Status-Detail, says * the redirection SHOULD be accepted by the initiator before * authentication, but MUST be be accepted afterwards; that's * why we're doing it here and not earlier. */ redirected = login_target_redirect(conn, request); if (redirected) { log_debugx("initiator redirected; exiting"); exit(0); } request_keys = keys_new(); keys_load(request_keys, request); response = login_new_response(request); bhslr2 = (struct iscsi_bhs_login_response *)response->pdu_bhs; bhslr2->bhslr_flags |= BHSLR_FLAGS_TRANSIT; bhslr2->bhslr_tsih = htons(0xbadd); login_set_csg(response, BHSLR_STAGE_OPERATIONAL_NEGOTIATION); login_set_nsg(response, BHSLR_STAGE_FULL_FEATURE_PHASE); response_keys = keys_new(); if (skipped_security && conn->conn_session_type == CONN_SESSION_TYPE_NORMAL) { if (conn->conn_target->t_alias != NULL) keys_add(response_keys, "TargetAlias", conn->conn_target->t_alias); keys_add_int(response_keys, "TargetPortalGroupTag", conn->conn_portal->p_portal_group->pg_tag); } for (i = 0; i < KEYS_MAX; i++) { if (request_keys->keys_names[i] == NULL) break; login_negotiate_key(request, request_keys->keys_names[i], request_keys->keys_values[i], skipped_security, response_keys); } log_debugx("operational parameter negotiation done; " "transitioning to Full Feature Phase"); keys_save(response_keys, response); pdu_send(response); pdu_delete(response); keys_delete(response_keys); pdu_delete(request); keys_delete(request_keys); } void login(struct connection *conn) { struct pdu *request, *response; struct iscsi_bhs_login_request *bhslr; struct iscsi_bhs_login_response *bhslr2; struct keys *request_keys, *response_keys; struct auth_group *ag; struct portal_group *pg; const char *initiator_name, *initiator_alias, *session_type, *target_name, *auth_method; bool redirected; /* * Handle the initial Login Request - figure out required authentication * method and either transition to the next phase, if no authentication * is required, or call appropriate authentication code. */ log_debugx("beginning Login Phase; waiting for Login PDU"); request = login_receive(conn, true); bhslr = (struct iscsi_bhs_login_request *)request->pdu_bhs; if (bhslr->bhslr_tsih != 0) { login_send_error(request, 0x02, 0x0a); log_errx(1, "received Login PDU with non-zero TSIH"); } pg = conn->conn_portal->p_portal_group; memcpy(conn->conn_initiator_isid, bhslr->bhslr_isid, sizeof(conn->conn_initiator_isid)); /* * XXX: Implement the C flag some day. */ request_keys = keys_new(); keys_load(request_keys, request); assert(conn->conn_initiator_name == NULL); initiator_name = keys_find(request_keys, "InitiatorName"); if (initiator_name == NULL) { login_send_error(request, 0x02, 0x07); log_errx(1, "received Login PDU without InitiatorName"); } if (valid_iscsi_name(initiator_name) == false) { login_send_error(request, 0x02, 0x00); log_errx(1, "received Login PDU with invalid InitiatorName"); } conn->conn_initiator_name = checked_strdup(initiator_name); log_set_peer_name(conn->conn_initiator_name); /* * XXX: This doesn't work (does nothing) because of Capsicum. */ setproctitle("%s (%s)", conn->conn_initiator_addr, conn->conn_initiator_name); redirected = login_portal_redirect(conn, request); if (redirected) { log_debugx("initiator redirected; exiting"); exit(0); } initiator_alias = keys_find(request_keys, "InitiatorAlias"); if (initiator_alias != NULL) conn->conn_initiator_alias = checked_strdup(initiator_alias); assert(conn->conn_session_type == CONN_SESSION_TYPE_NONE); session_type = keys_find(request_keys, "SessionType"); if (session_type != NULL) { if (strcmp(session_type, "Normal") == 0) { conn->conn_session_type = CONN_SESSION_TYPE_NORMAL; } else if (strcmp(session_type, "Discovery") == 0) { conn->conn_session_type = CONN_SESSION_TYPE_DISCOVERY; } else { login_send_error(request, 0x02, 0x00); log_errx(1, "received Login PDU with invalid " "SessionType \"%s\"", session_type); } } else conn->conn_session_type = CONN_SESSION_TYPE_NORMAL; assert(conn->conn_target == NULL); if (conn->conn_session_type == CONN_SESSION_TYPE_NORMAL) { target_name = keys_find(request_keys, "TargetName"); if (target_name == NULL) { login_send_error(request, 0x02, 0x07); log_errx(1, "received Login PDU without TargetName"); } conn->conn_port = port_find_in_pg(pg, target_name); if (conn->conn_port == NULL) { login_send_error(request, 0x02, 0x03); log_errx(1, "requested target \"%s\" not found", target_name); } conn->conn_target = conn->conn_port->p_target; } /* * At this point we know what kind of authentication we need. */ if (conn->conn_session_type == CONN_SESSION_TYPE_NORMAL) { ag = conn->conn_port->p_auth_group; if (ag == NULL) ag = conn->conn_target->t_auth_group; if (ag->ag_name != NULL) { log_debugx("initiator requests to connect " "to target \"%s\"; auth-group \"%s\"", conn->conn_target->t_name, ag->ag_name); } else { log_debugx("initiator requests to connect " "to target \"%s\"", conn->conn_target->t_name); } } else { assert(conn->conn_session_type == CONN_SESSION_TYPE_DISCOVERY); ag = pg->pg_discovery_auth_group; if (ag->ag_name != NULL) { log_debugx("initiator requests " "discovery session; auth-group \"%s\"", ag->ag_name); } else { log_debugx("initiator requests discovery session"); } } /* * Enforce initiator-name and initiator-portal. */ if (auth_name_check(ag, initiator_name) != 0) { login_send_error(request, 0x02, 0x02); log_errx(1, "initiator does not match allowed initiator names"); } if (auth_portal_check(ag, &conn->conn_initiator_sa) != 0) { login_send_error(request, 0x02, 0x02); log_errx(1, "initiator does not match allowed " "initiator portals"); } /* * Let's see if the initiator intends to do any kind of authentication * at all. */ if (login_csg(request) == BHSLR_STAGE_OPERATIONAL_NEGOTIATION) { if (ag->ag_type != AG_TYPE_NO_AUTHENTICATION) { login_send_error(request, 0x02, 0x01); log_errx(1, "initiator skipped the authentication, " "but authentication is required"); } keys_delete(request_keys); log_debugx("initiator skipped the authentication, " "and we don't need it; proceeding with negotiation"); login_negotiate(conn, request); return; } if (ag->ag_type == AG_TYPE_NO_AUTHENTICATION) { /* * Initiator might want to to authenticate, * but we don't need it. */ log_debugx("authentication not required; " "transitioning to operational parameter negotiation"); if ((bhslr->bhslr_flags & BHSLR_FLAGS_TRANSIT) == 0) log_warnx("initiator did not set the \"T\" flag; " "transitioning anyway"); response = login_new_response(request); bhslr2 = (struct iscsi_bhs_login_response *)response->pdu_bhs; bhslr2->bhslr_flags |= BHSLR_FLAGS_TRANSIT; login_set_nsg(response, BHSLR_STAGE_OPERATIONAL_NEGOTIATION); response_keys = keys_new(); /* * Required by Linux initiator. */ auth_method = keys_find(request_keys, "AuthMethod"); if (auth_method != NULL && login_list_contains(auth_method, "None")) keys_add(response_keys, "AuthMethod", "None"); if (conn->conn_session_type == CONN_SESSION_TYPE_NORMAL) { if (conn->conn_target->t_alias != NULL) keys_add(response_keys, "TargetAlias", conn->conn_target->t_alias); keys_add_int(response_keys, "TargetPortalGroupTag", pg->pg_tag); } keys_save(response_keys, response); pdu_send(response); pdu_delete(response); keys_delete(response_keys); pdu_delete(request); keys_delete(request_keys); login_negotiate(conn, NULL); return; } if (ag->ag_type == AG_TYPE_DENY) { login_send_error(request, 0x02, 0x01); log_errx(1, "auth-type is \"deny\""); } if (ag->ag_type == AG_TYPE_UNKNOWN) { /* * This can happen with empty auth-group. */ login_send_error(request, 0x02, 0x01); log_errx(1, "auth-type not set, denying access"); } log_debugx("CHAP authentication required"); auth_method = keys_find(request_keys, "AuthMethod"); if (auth_method == NULL) { login_send_error(request, 0x02, 0x07); log_errx(1, "received Login PDU without AuthMethod"); } /* * XXX: This should be Reject, not just a login failure (5.3.2). */ if (login_list_contains(auth_method, "CHAP") == 0) { login_send_error(request, 0x02, 0x01); log_errx(1, "initiator requests unsupported AuthMethod \"%s\" " "instead of \"CHAP\"", auth_method); } response = login_new_response(request); response_keys = keys_new(); keys_add(response_keys, "AuthMethod", "CHAP"); if (conn->conn_session_type == CONN_SESSION_TYPE_NORMAL) { if (conn->conn_target->t_alias != NULL) keys_add(response_keys, "TargetAlias", conn->conn_target->t_alias); keys_add_int(response_keys, "TargetPortalGroupTag", pg->pg_tag); } keys_save(response_keys, response); pdu_send(response); pdu_delete(response); keys_delete(response_keys); pdu_delete(request); keys_delete(request_keys); login_chap(conn, ag); login_negotiate(conn, NULL); } Index: user/ngie/more-tests/usr.sbin/ctld/parse.y =================================================================== --- user/ngie/more-tests/usr.sbin/ctld/parse.y (revision 281584) +++ user/ngie/more-tests/usr.sbin/ctld/parse.y (revision 281585) @@ -1,1062 +1,1061 @@ %{ /*- * Copyright (c) 2012 The FreeBSD Foundation * All rights reserved. * * This software was developed by Edward Tomasz Napierala under sponsorship * from the FreeBSD Foundation. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * $FreeBSD$ */ #include #include #include #include #include -#include #include #include #include "ctld.h" extern FILE *yyin; extern char *yytext; extern int lineno; static struct conf *conf = NULL; static struct auth_group *auth_group = NULL; static struct portal_group *portal_group = NULL; static struct target *target = NULL; static struct lun *lun = NULL; extern void yyerror(const char *); extern int yylex(void); extern void yyrestart(FILE *); %} %token ALIAS AUTH_GROUP AUTH_TYPE BACKEND BLOCKSIZE CHAP CHAP_MUTUAL %token CLOSING_BRACKET DEBUG DEVICE_ID DISCOVERY_AUTH_GROUP DISCOVERY_FILTER %token INITIATOR_NAME INITIATOR_PORTAL ISNS_SERVER ISNS_PERIOD ISNS_TIMEOUT %token LISTEN LISTEN_ISER LUN MAXPROC OFFLOAD OPENING_BRACKET OPTION %token PATH PIDFILE PORT PORTAL_GROUP REDIRECT SEMICOLON SERIAL SIZE STR %token TARGET TIMEOUT %union { char *str; } %token STR %% statements: | statements statement | statements statement SEMICOLON ; statement: debug | timeout | maxproc | pidfile | isns_server | isns_period | isns_timeout | auth_group | portal_group | lun | target ; debug: DEBUG STR { uint64_t tmp; if (expand_number($2, &tmp) != 0) { yyerror("invalid numeric value"); free($2); return (1); } conf->conf_debug = tmp; } ; timeout: TIMEOUT STR { uint64_t tmp; if (expand_number($2, &tmp) != 0) { yyerror("invalid numeric value"); free($2); return (1); } conf->conf_timeout = tmp; } ; maxproc: MAXPROC STR { uint64_t tmp; if (expand_number($2, &tmp) != 0) { yyerror("invalid numeric value"); free($2); return (1); } conf->conf_maxproc = tmp; } ; pidfile: PIDFILE STR { if (conf->conf_pidfile_path != NULL) { log_warnx("pidfile specified more than once"); free($2); return (1); } conf->conf_pidfile_path = $2; } ; isns_server: ISNS_SERVER STR { int error; error = isns_new(conf, $2); free($2); if (error != 0) return (1); } ; isns_period: ISNS_PERIOD STR { uint64_t tmp; if (expand_number($2, &tmp) != 0) { yyerror("invalid numeric value"); free($2); return (1); } conf->conf_isns_period = tmp; } ; isns_timeout: ISNS_TIMEOUT STR { uint64_t tmp; if (expand_number($2, &tmp) != 0) { yyerror("invalid numeric value"); free($2); return (1); } conf->conf_isns_timeout = tmp; } ; auth_group: AUTH_GROUP auth_group_name OPENING_BRACKET auth_group_entries CLOSING_BRACKET { auth_group = NULL; } ; auth_group_name: STR { /* * Make it possible to redefine default * auth-group. but only once. */ if (strcmp($1, "default") == 0 && conf->conf_default_ag_defined == false) { auth_group = auth_group_find(conf, $1); conf->conf_default_ag_defined = true; } else { auth_group = auth_group_new(conf, $1); } free($1); if (auth_group == NULL) return (1); } ; auth_group_entries: | auth_group_entries auth_group_entry | auth_group_entries auth_group_entry SEMICOLON ; auth_group_entry: auth_group_auth_type | auth_group_chap | auth_group_chap_mutual | auth_group_initiator_name | auth_group_initiator_portal ; auth_group_auth_type: AUTH_TYPE STR { int error; error = auth_group_set_type(auth_group, $2); free($2); if (error != 0) return (1); } ; auth_group_chap: CHAP STR STR { const struct auth *ca; ca = auth_new_chap(auth_group, $2, $3); free($2); free($3); if (ca == NULL) return (1); } ; auth_group_chap_mutual: CHAP_MUTUAL STR STR STR STR { const struct auth *ca; ca = auth_new_chap_mutual(auth_group, $2, $3, $4, $5); free($2); free($3); free($4); free($5); if (ca == NULL) return (1); } ; auth_group_initiator_name: INITIATOR_NAME STR { const struct auth_name *an; an = auth_name_new(auth_group, $2); free($2); if (an == NULL) return (1); } ; auth_group_initiator_portal: INITIATOR_PORTAL STR { const struct auth_portal *ap; ap = auth_portal_new(auth_group, $2); free($2); if (ap == NULL) return (1); } ; portal_group: PORTAL_GROUP portal_group_name OPENING_BRACKET portal_group_entries CLOSING_BRACKET { portal_group = NULL; } ; portal_group_name: STR { /* * Make it possible to redefine default * portal-group. but only once. */ if (strcmp($1, "default") == 0 && conf->conf_default_pg_defined == false) { portal_group = portal_group_find(conf, $1); conf->conf_default_pg_defined = true; } else { portal_group = portal_group_new(conf, $1); } free($1); if (portal_group == NULL) return (1); } ; portal_group_entries: | portal_group_entries portal_group_entry | portal_group_entries portal_group_entry SEMICOLON ; portal_group_entry: portal_group_discovery_auth_group | portal_group_discovery_filter | portal_group_listen | portal_group_listen_iser | portal_group_offload | portal_group_redirect ; portal_group_discovery_auth_group: DISCOVERY_AUTH_GROUP STR { if (portal_group->pg_discovery_auth_group != NULL) { log_warnx("discovery-auth-group for portal-group " "\"%s\" specified more than once", portal_group->pg_name); return (1); } portal_group->pg_discovery_auth_group = auth_group_find(conf, $2); if (portal_group->pg_discovery_auth_group == NULL) { log_warnx("unknown discovery-auth-group \"%s\" " "for portal-group \"%s\"", $2, portal_group->pg_name); return (1); } free($2); } ; portal_group_discovery_filter: DISCOVERY_FILTER STR { int error; error = portal_group_set_filter(portal_group, $2); free($2); if (error != 0) return (1); } ; portal_group_listen: LISTEN STR { int error; error = portal_group_add_listen(portal_group, $2, false); free($2); if (error != 0) return (1); } ; portal_group_listen_iser: LISTEN_ISER STR { int error; error = portal_group_add_listen(portal_group, $2, true); free($2); if (error != 0) return (1); } ; portal_group_offload: OFFLOAD STR { int error; error = portal_group_set_offload(portal_group, $2); free($2); if (error != 0) return (1); } ; portal_group_redirect: REDIRECT STR { int error; error = portal_group_set_redirection(portal_group, $2); free($2); if (error != 0) return (1); } ; lun: LUN lun_name OPENING_BRACKET lun_entries CLOSING_BRACKET { lun = NULL; } ; lun_name: STR { lun = lun_new(conf, $1); free($1); if (lun == NULL) return (1); } ; target: TARGET target_name OPENING_BRACKET target_entries CLOSING_BRACKET { target = NULL; } ; target_name: STR { target = target_new(conf, $1); free($1); if (target == NULL) return (1); } ; target_entries: | target_entries target_entry | target_entries target_entry SEMICOLON ; target_entry: target_alias | target_auth_group | target_auth_type | target_chap | target_chap_mutual | target_initiator_name | target_initiator_portal | target_portal_group | target_port | target_redirect | target_lun | target_lun_ref ; target_alias: ALIAS STR { if (target->t_alias != NULL) { log_warnx("alias for target \"%s\" " "specified more than once", target->t_name); return (1); } target->t_alias = $2; } ; target_auth_group: AUTH_GROUP STR { if (target->t_auth_group != NULL) { if (target->t_auth_group->ag_name != NULL) log_warnx("auth-group for target \"%s\" " "specified more than once", target->t_name); else log_warnx("cannot use both auth-group and explicit " "authorisations for target \"%s\"", target->t_name); return (1); } target->t_auth_group = auth_group_find(conf, $2); if (target->t_auth_group == NULL) { log_warnx("unknown auth-group \"%s\" for target " "\"%s\"", $2, target->t_name); return (1); } free($2); } ; target_auth_type: AUTH_TYPE STR { int error; if (target->t_auth_group != NULL) { if (target->t_auth_group->ag_name != NULL) { log_warnx("cannot use both auth-group and " "auth-type for target \"%s\"", target->t_name); return (1); } } else { target->t_auth_group = auth_group_new(conf, NULL); if (target->t_auth_group == NULL) { free($2); return (1); } target->t_auth_group->ag_target = target; } error = auth_group_set_type(target->t_auth_group, $2); free($2); if (error != 0) return (1); } ; target_chap: CHAP STR STR { const struct auth *ca; if (target->t_auth_group != NULL) { if (target->t_auth_group->ag_name != NULL) { log_warnx("cannot use both auth-group and " "chap for target \"%s\"", target->t_name); free($2); free($3); return (1); } } else { target->t_auth_group = auth_group_new(conf, NULL); if (target->t_auth_group == NULL) { free($2); free($3); return (1); } target->t_auth_group->ag_target = target; } ca = auth_new_chap(target->t_auth_group, $2, $3); free($2); free($3); if (ca == NULL) return (1); } ; target_chap_mutual: CHAP_MUTUAL STR STR STR STR { const struct auth *ca; if (target->t_auth_group != NULL) { if (target->t_auth_group->ag_name != NULL) { log_warnx("cannot use both auth-group and " "chap-mutual for target \"%s\"", target->t_name); free($2); free($3); free($4); free($5); return (1); } } else { target->t_auth_group = auth_group_new(conf, NULL); if (target->t_auth_group == NULL) { free($2); free($3); free($4); free($5); return (1); } target->t_auth_group->ag_target = target; } ca = auth_new_chap_mutual(target->t_auth_group, $2, $3, $4, $5); free($2); free($3); free($4); free($5); if (ca == NULL) return (1); } ; target_initiator_name: INITIATOR_NAME STR { const struct auth_name *an; if (target->t_auth_group != NULL) { if (target->t_auth_group->ag_name != NULL) { log_warnx("cannot use both auth-group and " "initiator-name for target \"%s\"", target->t_name); free($2); return (1); } } else { target->t_auth_group = auth_group_new(conf, NULL); if (target->t_auth_group == NULL) { free($2); return (1); } target->t_auth_group->ag_target = target; } an = auth_name_new(target->t_auth_group, $2); free($2); if (an == NULL) return (1); } ; target_initiator_portal: INITIATOR_PORTAL STR { const struct auth_portal *ap; if (target->t_auth_group != NULL) { if (target->t_auth_group->ag_name != NULL) { log_warnx("cannot use both auth-group and " "initiator-portal for target \"%s\"", target->t_name); free($2); return (1); } } else { target->t_auth_group = auth_group_new(conf, NULL); if (target->t_auth_group == NULL) { free($2); return (1); } target->t_auth_group->ag_target = target; } ap = auth_portal_new(target->t_auth_group, $2); free($2); if (ap == NULL) return (1); } ; target_portal_group: PORTAL_GROUP STR STR { struct portal_group *tpg; struct auth_group *tag; struct port *tp; tpg = portal_group_find(conf, $2); if (tpg == NULL) { log_warnx("unknown portal-group \"%s\" for target " "\"%s\"", $2, target->t_name); free($2); free($3); return (1); } tag = auth_group_find(conf, $3); if (tag == NULL) { log_warnx("unknown auth-group \"%s\" for target " "\"%s\"", $3, target->t_name); free($2); free($3); return (1); } tp = port_new(conf, target, tpg); if (tp == NULL) { log_warnx("can't link portal-group \"%s\" to target " "\"%s\"", $2, target->t_name); free($2); return (1); } tp->p_auth_group = tag; free($2); free($3); } | PORTAL_GROUP STR { struct portal_group *tpg; struct port *tp; tpg = portal_group_find(conf, $2); if (tpg == NULL) { log_warnx("unknown portal-group \"%s\" for target " "\"%s\"", $2, target->t_name); free($2); return (1); } tp = port_new(conf, target, tpg); if (tp == NULL) { log_warnx("can't link portal-group \"%s\" to target " "\"%s\"", $2, target->t_name); free($2); return (1); } free($2); } ; target_port: PORT STR { struct pport *pp; struct port *tp; pp = pport_find(conf, $2); if (pp == NULL) { log_warnx("unknown port \"%s\" for target \"%s\"", $2, target->t_name); free($2); return (1); } if (!TAILQ_EMPTY(&pp->pp_ports)) { log_warnx("can't link port \"%s\" to target \"%s\", " "port already linked to some target", $2, target->t_name); free($2); return (1); } tp = port_new_pp(conf, target, pp); if (tp == NULL) { log_warnx("can't link port \"%s\" to target \"%s\"", $2, target->t_name); free($2); return (1); } free($2); } ; target_redirect: REDIRECT STR { int error; error = target_set_redirection(target, $2); free($2); if (error != 0) return (1); } ; target_lun: LUN lun_number OPENING_BRACKET lun_entries CLOSING_BRACKET { lun = NULL; } ; lun_number: STR { uint64_t tmp; int ret; char *name; if (expand_number($1, &tmp) != 0) { yyerror("invalid numeric value"); free($1); return (1); } ret = asprintf(&name, "%s,lun,%ju", target->t_name, tmp); if (ret <= 0) log_err(1, "asprintf"); lun = lun_new(conf, name); if (lun == NULL) return (1); lun_set_scsiname(lun, name); target->t_luns[tmp] = lun; } ; target_lun_ref: LUN STR STR { uint64_t tmp; if (expand_number($2, &tmp) != 0) { yyerror("invalid numeric value"); free($2); free($3); return (1); } free($2); lun = lun_find(conf, $3); free($3); if (lun == NULL) return (1); target->t_luns[tmp] = lun; } ; lun_entries: | lun_entries lun_entry | lun_entries lun_entry SEMICOLON ; lun_entry: lun_backend | lun_blocksize | lun_device_id | lun_option | lun_path | lun_serial | lun_size ; lun_backend: BACKEND STR { if (lun->l_backend != NULL) { log_warnx("backend for lun \"%s\" " "specified more than once", lun->l_name); free($2); return (1); } lun_set_backend(lun, $2); free($2); } ; lun_blocksize: BLOCKSIZE STR { uint64_t tmp; if (expand_number($2, &tmp) != 0) { yyerror("invalid numeric value"); free($2); return (1); } if (lun->l_blocksize != 0) { log_warnx("blocksize for lun \"%s\" " "specified more than once", lun->l_name); return (1); } lun_set_blocksize(lun, tmp); } ; lun_device_id: DEVICE_ID STR { if (lun->l_device_id != NULL) { log_warnx("device_id for lun \"%s\" " "specified more than once", lun->l_name); free($2); return (1); } lun_set_device_id(lun, $2); free($2); } ; lun_option: OPTION STR STR { struct lun_option *clo; clo = lun_option_new(lun, $2, $3); free($2); free($3); if (clo == NULL) return (1); } ; lun_path: PATH STR { if (lun->l_path != NULL) { log_warnx("path for lun \"%s\" " "specified more than once", lun->l_name); free($2); return (1); } lun_set_path(lun, $2); free($2); } ; lun_serial: SERIAL STR { if (lun->l_serial != NULL) { log_warnx("serial for lun \"%s\" " "specified more than once", lun->l_name); free($2); return (1); } lun_set_serial(lun, $2); free($2); } ; lun_size: SIZE STR { uint64_t tmp; if (expand_number($2, &tmp) != 0) { yyerror("invalid numeric value"); free($2); return (1); } if (lun->l_size != 0) { log_warnx("size for lun \"%s\" " "specified more than once", lun->l_name); return (1); } lun_set_size(lun, tmp); } ; %% void yyerror(const char *str) { log_warnx("error in configuration file at line %d near '%s': %s", lineno, yytext, str); } static void check_perms(const char *path) { struct stat sb; int error; error = stat(path, &sb); if (error != 0) { log_warn("stat"); return; } if (sb.st_mode & S_IWOTH) { log_warnx("%s is world-writable", path); } else if (sb.st_mode & S_IROTH) { log_warnx("%s is world-readable", path); } else if (sb.st_mode & S_IXOTH) { /* * Ok, this one doesn't matter, but still do it, * just for consistency. */ log_warnx("%s is world-executable", path); } /* * XXX: Should we also check for owner != 0? */ } struct conf * conf_new_from_file(const char *path, struct conf *oldconf) { struct auth_group *ag; struct portal_group *pg; struct pport *pp; int error; log_debugx("obtaining configuration from %s", path); conf = conf_new(); TAILQ_FOREACH(pp, &oldconf->conf_pports, pp_next) pport_copy(pp, conf); ag = auth_group_new(conf, "default"); assert(ag != NULL); ag = auth_group_new(conf, "no-authentication"); assert(ag != NULL); ag->ag_type = AG_TYPE_NO_AUTHENTICATION; ag = auth_group_new(conf, "no-access"); assert(ag != NULL); ag->ag_type = AG_TYPE_DENY; pg = portal_group_new(conf, "default"); assert(pg != NULL); yyin = fopen(path, "r"); if (yyin == NULL) { log_warn("unable to open configuration file %s", path); conf_delete(conf); return (NULL); } check_perms(path); lineno = 1; yyrestart(yyin); error = yyparse(); auth_group = NULL; portal_group = NULL; target = NULL; lun = NULL; fclose(yyin); if (error != 0) { conf_delete(conf); return (NULL); } if (conf->conf_default_ag_defined == false) { log_debugx("auth-group \"default\" not defined; " "going with defaults"); ag = auth_group_find(conf, "default"); assert(ag != NULL); ag->ag_type = AG_TYPE_DENY; } if (conf->conf_default_pg_defined == false) { log_debugx("portal-group \"default\" not defined; " "going with defaults"); pg = portal_group_find(conf, "default"); assert(pg != NULL); portal_group_add_listen(pg, "0.0.0.0:3260", false); portal_group_add_listen(pg, "[::]:3260", false); } conf->conf_kernel_port_on = true; error = conf_verify(conf); if (error != 0) { conf_delete(conf); return (NULL); } return (conf); } Index: user/ngie/more-tests/usr.sbin/ctld/pdu.c =================================================================== --- user/ngie/more-tests/usr.sbin/ctld/pdu.c (revision 281584) +++ user/ngie/more-tests/usr.sbin/ctld/pdu.c (revision 281585) @@ -1,266 +1,264 @@ /*- * Copyright (c) 2012 The FreeBSD Foundation * All rights reserved. * * This software was developed by Edward Tomasz Napierala under sponsorship * from the FreeBSD Foundation. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * */ #include __FBSDID("$FreeBSD$"); #include #include #include -#include -#include #include #include #include "ctld.h" #include "iscsi_proto.h" #ifdef ICL_KERNEL_PROXY #include #endif extern bool proxy_mode; static int pdu_ahs_length(const struct pdu *pdu) { return (pdu->pdu_bhs->bhs_total_ahs_len * 4); } static int pdu_data_segment_length(const struct pdu *pdu) { uint32_t len = 0; len += pdu->pdu_bhs->bhs_data_segment_len[0]; len <<= 8; len += pdu->pdu_bhs->bhs_data_segment_len[1]; len <<= 8; len += pdu->pdu_bhs->bhs_data_segment_len[2]; return (len); } static void pdu_set_data_segment_length(struct pdu *pdu, uint32_t len) { pdu->pdu_bhs->bhs_data_segment_len[2] = len; pdu->pdu_bhs->bhs_data_segment_len[1] = len >> 8; pdu->pdu_bhs->bhs_data_segment_len[0] = len >> 16; } struct pdu * pdu_new(struct connection *conn) { struct pdu *pdu; pdu = calloc(sizeof(*pdu), 1); if (pdu == NULL) log_err(1, "calloc"); pdu->pdu_bhs = calloc(sizeof(*pdu->pdu_bhs), 1); if (pdu->pdu_bhs == NULL) log_err(1, "calloc"); pdu->pdu_connection = conn; return (pdu); } struct pdu * pdu_new_response(struct pdu *request) { return (pdu_new(request->pdu_connection)); } #ifdef ICL_KERNEL_PROXY static void pdu_receive_proxy(struct pdu *pdu) { size_t len; assert(proxy_mode); kernel_receive(pdu); len = pdu_ahs_length(pdu); if (len > 0) log_errx(1, "protocol error: non-empty AHS"); len = pdu_data_segment_length(pdu); assert(len <= MAX_DATA_SEGMENT_LENGTH); pdu->pdu_data_len = len; } static void pdu_send_proxy(struct pdu *pdu) { assert(proxy_mode); pdu_set_data_segment_length(pdu, pdu->pdu_data_len); kernel_send(pdu); } #endif /* ICL_KERNEL_PROXY */ static size_t pdu_padding(const struct pdu *pdu) { if ((pdu->pdu_data_len % 4) != 0) return (4 - (pdu->pdu_data_len % 4)); return (0); } static void pdu_read(int fd, char *data, size_t len) { ssize_t ret; while (len > 0) { ret = read(fd, data, len); if (ret < 0) { if (timed_out()) log_errx(1, "exiting due to timeout"); log_err(1, "read"); } else if (ret == 0) log_errx(1, "read: connection lost"); len -= ret; data += ret; } } void pdu_receive(struct pdu *pdu) { size_t len, padding; char dummy[4]; #ifdef ICL_KERNEL_PROXY if (proxy_mode) return (pdu_receive_proxy(pdu)); #endif assert(proxy_mode == false); pdu_read(pdu->pdu_connection->conn_socket, (char *)pdu->pdu_bhs, sizeof(*pdu->pdu_bhs)); len = pdu_ahs_length(pdu); if (len > 0) log_errx(1, "protocol error: non-empty AHS"); len = pdu_data_segment_length(pdu); if (len > 0) { if (len > MAX_DATA_SEGMENT_LENGTH) { log_errx(1, "protocol error: received PDU " "with DataSegmentLength exceeding %d", MAX_DATA_SEGMENT_LENGTH); } pdu->pdu_data_len = len; pdu->pdu_data = malloc(len); if (pdu->pdu_data == NULL) log_err(1, "malloc"); pdu_read(pdu->pdu_connection->conn_socket, (char *)pdu->pdu_data, pdu->pdu_data_len); padding = pdu_padding(pdu); if (padding != 0) { assert(padding < sizeof(dummy)); pdu_read(pdu->pdu_connection->conn_socket, (char *)dummy, padding); } } } void pdu_send(struct pdu *pdu) { ssize_t ret, total_len; size_t padding; uint32_t zero = 0; struct iovec iov[3]; int iovcnt; #ifdef ICL_KERNEL_PROXY if (proxy_mode) return (pdu_send_proxy(pdu)); #endif assert(proxy_mode == false); pdu_set_data_segment_length(pdu, pdu->pdu_data_len); iov[0].iov_base = pdu->pdu_bhs; iov[0].iov_len = sizeof(*pdu->pdu_bhs); total_len = iov[0].iov_len; iovcnt = 1; if (pdu->pdu_data_len > 0) { iov[1].iov_base = pdu->pdu_data; iov[1].iov_len = pdu->pdu_data_len; total_len += iov[1].iov_len; iovcnt = 2; padding = pdu_padding(pdu); if (padding > 0) { assert(padding < sizeof(zero)); iov[2].iov_base = &zero; iov[2].iov_len = padding; total_len += iov[2].iov_len; iovcnt = 3; } } ret = writev(pdu->pdu_connection->conn_socket, iov, iovcnt); if (ret < 0) { if (timed_out()) log_errx(1, "exiting due to timeout"); log_err(1, "writev"); } if (ret != total_len) log_errx(1, "short write"); } void pdu_delete(struct pdu *pdu) { free(pdu->pdu_data); free(pdu->pdu_bhs); free(pdu); } Index: user/ngie/more-tests/usr.sbin/ctld/token.l =================================================================== --- user/ngie/more-tests/usr.sbin/ctld/token.l (revision 281584) +++ user/ngie/more-tests/usr.sbin/ctld/token.l (revision 281585) @@ -1,93 +1,92 @@ %{ /*- * Copyright (c) 2012 The FreeBSD Foundation * All rights reserved. * * This software was developed by Edward Tomasz Napierala under sponsorship * from the FreeBSD Foundation. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * $FreeBSD$ */ #include #include #include -#include "ctld.h" #include "y.tab.h" int lineno; #define YY_DECL int yylex(void) extern int yylex(void); %} %option noinput %option nounput %% alias { return ALIAS; } auth-group { return AUTH_GROUP; } auth-type { return AUTH_TYPE; } backend { return BACKEND; } blocksize { return BLOCKSIZE; } chap { return CHAP; } chap-mutual { return CHAP_MUTUAL; } debug { return DEBUG; } device-id { return DEVICE_ID; } discovery-auth-group { return DISCOVERY_AUTH_GROUP; } discovery-filter { return DISCOVERY_FILTER; } initiator-name { return INITIATOR_NAME; } initiator-portal { return INITIATOR_PORTAL; } listen { return LISTEN; } listen-iser { return LISTEN_ISER; } lun { return LUN; } maxproc { return MAXPROC; } offload { return OFFLOAD; } option { return OPTION; } path { return PATH; } pidfile { return PIDFILE; } isns-server { return ISNS_SERVER; } isns-period { return ISNS_PERIOD; } isns-timeout { return ISNS_TIMEOUT; } port { return PORT; } portal-group { return PORTAL_GROUP; } redirect { return REDIRECT; } serial { return SERIAL; } size { return SIZE; } target { return TARGET; } timeout { return TIMEOUT; } \"[^"]+\" { yylval.str = strndup(yytext + 1, strlen(yytext) - 2); return STR; } [a-zA-Z0-9\.\-_/\:\[\]]+ { yylval.str = strdup(yytext); return STR; } \{ { return OPENING_BRACKET; } \} { return CLOSING_BRACKET; } #.*$ /* ignore comments */; \r\n { lineno++; } \n { lineno++; } ; { return SEMICOLON; } [ \t]+ /* ignore whitespace */; . { yylval.str = strdup(yytext); return STR; } %% Index: user/ngie/more-tests/usr.sbin/freebsd-update/freebsd-update.sh =================================================================== --- user/ngie/more-tests/usr.sbin/freebsd-update/freebsd-update.sh (revision 281584) +++ user/ngie/more-tests/usr.sbin/freebsd-update/freebsd-update.sh (revision 281585) @@ -1,3291 +1,3291 @@ #!/bin/sh #- # Copyright 2004-2007 Colin Percival # All rights reserved # # Redistribution and use in source and binary forms, with or without # modification, are permitted providing that the following conditions # are met: # 1. Redistributions of source code must retain the above copyright # notice, this list of conditions and the following disclaimer. # 2. Redistributions in binary form must reproduce the above copyright # notice, this list of conditions and the following disclaimer in the # documentation and/or other materials provided with the distribution. # # THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR # IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE # ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY # DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL # DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS # OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) # HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, # STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING # IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE # POSSIBILITY OF SUCH DAMAGE. # $FreeBSD$ #### Usage function -- called from command-line handling code. # Usage instructions. Options not listed: # --debug -- don't filter output from utilities # --no-stats -- don't show progress statistics while fetching files usage () { cat < ${LINE}" exit 1 fi done < ${CONFFILE} # Merge the settings read from the configuration file with those # provided at the command line. mergeconfig } # Provide some default parameters default_params () { # Save any parameters already configured, and clear the slate saveconfig nullconfig # Default configurations config_WorkDir /var/db/freebsd-update config_MailTo root config_AllowAdd yes config_AllowDelete yes config_KeepModifiedMetadata yes config_BaseDir / config_VerboseLevel stats config_StrictComponents no config_BackupKernel yes config_BackupKernelDir /boot/kernel.old config_BackupKernelSymbolFiles no # Merge these defaults into the earlier-configured settings mergeconfig } # Set utility output filtering options, based on ${VERBOSELEVEL} fetch_setup_verboselevel () { case ${VERBOSELEVEL} in debug) QUIETREDIR="/dev/stderr" QUIETFLAG=" " STATSREDIR="/dev/stderr" DDSTATS=".." XARGST="-t" NDEBUG=" " ;; nostats) QUIETREDIR="" QUIETFLAG="" STATSREDIR="/dev/null" DDSTATS=".." XARGST="" NDEBUG="" ;; stats) QUIETREDIR="/dev/null" QUIETFLAG="-q" STATSREDIR="/dev/stdout" DDSTATS="" XARGST="" NDEBUG="-n" ;; esac } # Perform sanity checks and set some final parameters # in preparation for fetching files. Figure out which # set of updates should be downloaded: If the user is # running *-p[0-9]+, strip off the last part; if the # user is running -SECURITY, call it -RELEASE. Chdir # into the working directory. fetchupgrade_check_params () { export HTTP_USER_AGENT="freebsd-update (${COMMAND}, `uname -r`)" _SERVERNAME_z=\ "SERVERNAME must be given via command line or configuration file." _KEYPRINT_z="Key must be given via -k option or configuration file." _KEYPRINT_bad="Invalid key fingerprint: " _WORKDIR_bad="Directory does not exist or is not writable: " _WORKDIR_bad2="Directory is not on a persistent filesystem: " if [ -z "${SERVERNAME}" ]; then echo -n "`basename $0`: " echo "${_SERVERNAME_z}" exit 1 fi if [ -z "${KEYPRINT}" ]; then echo -n "`basename $0`: " echo "${_KEYPRINT_z}" exit 1 fi if ! echo "${KEYPRINT}" | grep -qE "^[0-9a-f]{64}$"; then echo -n "`basename $0`: " echo -n "${_KEYPRINT_bad}" echo ${KEYPRINT} exit 1 fi if ! [ -d "${WORKDIR}" -a -w "${WORKDIR}" ]; then echo -n "`basename $0`: " echo -n "${_WORKDIR_bad}" echo ${WORKDIR} exit 1 fi case `df -T ${WORKDIR}` in */dev/md[0-9]* | *tmpfs*) echo -n "`basename $0`: " echo -n "${_WORKDIR_bad2}" echo ${WORKDIR} exit 1 ;; esac chmod 700 ${WORKDIR} cd ${WORKDIR} || exit 1 # Generate release number. The s/SECURITY/RELEASE/ bit exists # to provide an upgrade path for FreeBSD Update 1.x users, since # the kernels provided by FreeBSD Update 1.x are always labelled # as X.Y-SECURITY. RELNUM=`uname -r | sed -E 's,-p[0-9]+,,' | sed -E 's,-SECURITY,-RELEASE,'` ARCH=`uname -m` FETCHDIR=${RELNUM}/${ARCH} PATCHDIR=${RELNUM}/${ARCH}/bp # Figure out what directory contains the running kernel BOOTFILE=`sysctl -n kern.bootfile` KERNELDIR=${BOOTFILE%/kernel} if ! [ -d ${KERNELDIR} ]; then echo "Cannot identify running kernel" exit 1 fi # Figure out what kernel configuration is running. We start with # the output of `uname -i`, and then make the following adjustments: # 1. Replace "SMP-GENERIC" with "SMP". Why the SMP kernel config # file says "ident SMP-GENERIC", I don't know... # 2. If the kernel claims to be GENERIC _and_ ${ARCH} is "amd64" # _and_ `sysctl kern.version` contains a line which ends "/SMP", then # we're running an SMP kernel. This mis-identification is a bug # which was fixed in 6.2-STABLE. KERNCONF=`uname -i` if [ ${KERNCONF} = "SMP-GENERIC" ]; then KERNCONF=SMP fi if [ ${KERNCONF} = "GENERIC" ] && [ ${ARCH} = "amd64" ]; then if sysctl kern.version | grep -qE '/SMP$'; then KERNCONF=SMP fi fi # Define some paths BSPATCH=/usr/bin/bspatch SHA256=/sbin/sha256 PHTTPGET=/usr/libexec/phttpget # Set up variables relating to VERBOSELEVEL fetch_setup_verboselevel # Construct a unique name from ${BASEDIR} BDHASH=`echo ${BASEDIR} | sha256 -q` } # Perform sanity checks etc. before fetching updates. fetch_check_params () { fetchupgrade_check_params if ! [ -z "${TARGETRELEASE}" ]; then echo -n "`basename $0`: " echo -n "-r option is meaningless with 'fetch' command. " echo "(Did you mean 'upgrade' instead?)" exit 1 fi # Check that we have updates ready to install - if [ -f ${BDHASH}-install/kerneldone && $FORCEFETCH -eq 0 ]; then + if [ -f ${BDHASH}-install/kerneldone -a $FORCEFETCH -eq 0 ]; then echo "You have a partially completed upgrade pending" echo "Run '$0 install' first." echo "Run '$0 fetch -F' to proceed anyway." exit 1 fi } # Perform sanity checks etc. before fetching upgrades. upgrade_check_params () { fetchupgrade_check_params # Unless set otherwise, we're upgrading to the same kernel config. NKERNCONF=${KERNCONF} # We need TARGETRELEASE set _TARGETRELEASE_z="Release target must be specified via -r option." if [ -z "${TARGETRELEASE}" ]; then echo -n "`basename $0`: " echo "${_TARGETRELEASE_z}" exit 1 fi # The target release should be != the current release. if [ "${TARGETRELEASE}" = "${RELNUM}" ]; then echo -n "`basename $0`: " echo "Cannot upgrade from ${RELNUM} to itself" exit 1 fi # Turning off AllowAdd or AllowDelete is a bad idea for upgrades. if [ "${ALLOWADD}" = "no" ]; then echo -n "`basename $0`: " echo -n "WARNING: \"AllowAdd no\" is a bad idea " echo "when upgrading between releases." echo fi if [ "${ALLOWDELETE}" = "no" ]; then echo -n "`basename $0`: " echo -n "WARNING: \"AllowDelete no\" is a bad idea " echo "when upgrading between releases." echo fi # Set EDITOR to /usr/bin/vi if it isn't already set : ${EDITOR:='/usr/bin/vi'} } # Perform sanity checks and set some final parameters in # preparation for installing updates. install_check_params () { # Check that we are root. All sorts of things won't work otherwise. if [ `id -u` != 0 ]; then echo "You must be root to run this." exit 1 fi # Check that securelevel <= 0. Otherwise we can't update schg files. if [ `sysctl -n kern.securelevel` -gt 0 ]; then echo "Updates cannot be installed when the system securelevel" echo "is greater than zero." exit 1 fi # Check that we have a working directory _WORKDIR_bad="Directory does not exist or is not writable: " if ! [ -d "${WORKDIR}" -a -w "${WORKDIR}" ]; then echo -n "`basename $0`: " echo -n "${_WORKDIR_bad}" echo ${WORKDIR} exit 1 fi cd ${WORKDIR} || exit 1 # Construct a unique name from ${BASEDIR} BDHASH=`echo ${BASEDIR} | sha256 -q` # Check that we have updates ready to install if ! [ -L ${BDHASH}-install ]; then echo "No updates are available to install." echo "Run '$0 fetch' first." exit 1 fi if ! [ -f ${BDHASH}-install/INDEX-OLD ] || ! [ -f ${BDHASH}-install/INDEX-NEW ]; then echo "Update manifest is corrupt -- this should never happen." echo "Re-run '$0 fetch'." exit 1 fi # Figure out what directory contains the running kernel BOOTFILE=`sysctl -n kern.bootfile` KERNELDIR=${BOOTFILE%/kernel} if ! [ -d ${KERNELDIR} ]; then echo "Cannot identify running kernel" exit 1 fi } # Perform sanity checks and set some final parameters in # preparation for UNinstalling updates. rollback_check_params () { # Check that we are root. All sorts of things won't work otherwise. if [ `id -u` != 0 ]; then echo "You must be root to run this." exit 1 fi # Check that we have a working directory _WORKDIR_bad="Directory does not exist or is not writable: " if ! [ -d "${WORKDIR}" -a -w "${WORKDIR}" ]; then echo -n "`basename $0`: " echo -n "${_WORKDIR_bad}" echo ${WORKDIR} exit 1 fi cd ${WORKDIR} || exit 1 # Construct a unique name from ${BASEDIR} BDHASH=`echo ${BASEDIR} | sha256 -q` # Check that we have updates ready to rollback if ! [ -L ${BDHASH}-rollback ]; then echo "No rollback directory found." exit 1 fi if ! [ -f ${BDHASH}-rollback/INDEX-OLD ] || ! [ -f ${BDHASH}-rollback/INDEX-NEW ]; then echo "Update manifest is corrupt -- this should never happen." exit 1 fi } # Perform sanity checks and set some final parameters # in preparation for comparing the system against the # published index. Figure out which index we should # compare against: If the user is running *-p[0-9]+, # strip off the last part; if the user is running # -SECURITY, call it -RELEASE. Chdir into the working # directory. IDS_check_params () { export HTTP_USER_AGENT="freebsd-update (${COMMAND}, `uname -r`)" _SERVERNAME_z=\ "SERVERNAME must be given via command line or configuration file." _KEYPRINT_z="Key must be given via -k option or configuration file." _KEYPRINT_bad="Invalid key fingerprint: " _WORKDIR_bad="Directory does not exist or is not writable: " if [ -z "${SERVERNAME}" ]; then echo -n "`basename $0`: " echo "${_SERVERNAME_z}" exit 1 fi if [ -z "${KEYPRINT}" ]; then echo -n "`basename $0`: " echo "${_KEYPRINT_z}" exit 1 fi if ! echo "${KEYPRINT}" | grep -qE "^[0-9a-f]{64}$"; then echo -n "`basename $0`: " echo -n "${_KEYPRINT_bad}" echo ${KEYPRINT} exit 1 fi if ! [ -d "${WORKDIR}" -a -w "${WORKDIR}" ]; then echo -n "`basename $0`: " echo -n "${_WORKDIR_bad}" echo ${WORKDIR} exit 1 fi cd ${WORKDIR} || exit 1 # Generate release number. The s/SECURITY/RELEASE/ bit exists # to provide an upgrade path for FreeBSD Update 1.x users, since # the kernels provided by FreeBSD Update 1.x are always labelled # as X.Y-SECURITY. RELNUM=`uname -r | sed -E 's,-p[0-9]+,,' | sed -E 's,-SECURITY,-RELEASE,'` ARCH=`uname -m` FETCHDIR=${RELNUM}/${ARCH} PATCHDIR=${RELNUM}/${ARCH}/bp # Figure out what directory contains the running kernel BOOTFILE=`sysctl -n kern.bootfile` KERNELDIR=${BOOTFILE%/kernel} if ! [ -d ${KERNELDIR} ]; then echo "Cannot identify running kernel" exit 1 fi # Figure out what kernel configuration is running. We start with # the output of `uname -i`, and then make the following adjustments: # 1. Replace "SMP-GENERIC" with "SMP". Why the SMP kernel config # file says "ident SMP-GENERIC", I don't know... # 2. If the kernel claims to be GENERIC _and_ ${ARCH} is "amd64" # _and_ `sysctl kern.version` contains a line which ends "/SMP", then # we're running an SMP kernel. This mis-identification is a bug # which was fixed in 6.2-STABLE. KERNCONF=`uname -i` if [ ${KERNCONF} = "SMP-GENERIC" ]; then KERNCONF=SMP fi if [ ${KERNCONF} = "GENERIC" ] && [ ${ARCH} = "amd64" ]; then if sysctl kern.version | grep -qE '/SMP$'; then KERNCONF=SMP fi fi # Define some paths SHA256=/sbin/sha256 PHTTPGET=/usr/libexec/phttpget # Set up variables relating to VERBOSELEVEL fetch_setup_verboselevel } #### Core functionality -- the actual work gets done here # Use an SRV query to pick a server. If the SRV query doesn't provide # a useful answer, use the server name specified by the user. # Put another way... look up _http._tcp.${SERVERNAME} and pick a server # from that; or if no servers are returned, use ${SERVERNAME}. # This allows a user to specify "portsnap.freebsd.org" (in which case # portsnap will select one of the mirrors) or "portsnap5.tld.freebsd.org" # (in which case portsnap will use that particular server, since there # won't be an SRV entry for that name). # # We ignore the Port field, since we are always going to use port 80. # Fetch the mirror list, but do not pick a mirror yet. Returns 1 if # no mirrors are available for any reason. fetch_pick_server_init () { : > serverlist_tried # Check that host(1) exists (i.e., that the system wasn't built with the # WITHOUT_BIND set) and don't try to find a mirror if it doesn't exist. if ! which -s host; then : > serverlist_full return 1 fi echo -n "Looking up ${SERVERNAME} mirrors... " # Issue the SRV query and pull out the Priority, Weight, and Target fields. # BIND 9 prints "$name has SRV record ..." while BIND 8 prints # "$name server selection ..."; we allow either format. MLIST="_http._tcp.${SERVERNAME}" host -t srv "${MLIST}" | sed -nE "s/${MLIST} (has SRV record|server selection) //p" | cut -f 1,2,4 -d ' ' | sed -e 's/\.$//' | sort > serverlist_full # If no records, give up -- we'll just use the server name we were given. if [ `wc -l < serverlist_full` -eq 0 ]; then echo "none found." return 1 fi # Report how many mirrors we found. echo `wc -l < serverlist_full` "mirrors found." # Generate a random seed for use in picking mirrors. If HTTP_PROXY # is set, this will be used to generate the seed; otherwise, the seed # will be random. if [ -n "${HTTP_PROXY}${http_proxy}" ]; then RANDVALUE=`sha256 -qs "${HTTP_PROXY}${http_proxy}" | tr -d 'a-f' | cut -c 1-9` else RANDVALUE=`jot -r 1 0 999999999` fi } # Pick a mirror. Returns 1 if we have run out of mirrors to try. fetch_pick_server () { # Generate a list of not-yet-tried mirrors sort serverlist_tried | comm -23 serverlist_full - > serverlist # Have we run out of mirrors? if [ `wc -l < serverlist` -eq 0 ]; then echo "No mirrors remaining, giving up." return 1 fi # Find the highest priority level (lowest numeric value). SRV_PRIORITY=`cut -f 1 -d ' ' serverlist | sort -n | head -1` # Add up the weights of the response lines at that priority level. SRV_WSUM=0; while read X; do case "$X" in ${SRV_PRIORITY}\ *) SRV_W=`echo $X | cut -f 2 -d ' '` SRV_WSUM=$(($SRV_WSUM + $SRV_W)) ;; esac done < serverlist # If all the weights are 0, pretend that they are all 1 instead. if [ ${SRV_WSUM} -eq 0 ]; then SRV_WSUM=`grep -E "^${SRV_PRIORITY} " serverlist | wc -l` SRV_W_ADD=1 else SRV_W_ADD=0 fi # Pick a value between 0 and the sum of the weights - 1 SRV_RND=`expr ${RANDVALUE} % ${SRV_WSUM}` # Read through the list of mirrors and set SERVERNAME. Write the line # corresponding to the mirror we selected into serverlist_tried so that # we won't try it again. while read X; do case "$X" in ${SRV_PRIORITY}\ *) SRV_W=`echo $X | cut -f 2 -d ' '` SRV_W=$(($SRV_W + $SRV_W_ADD)) if [ $SRV_RND -lt $SRV_W ]; then SERVERNAME=`echo $X | cut -f 3 -d ' '` echo "$X" >> serverlist_tried break else SRV_RND=$(($SRV_RND - $SRV_W)) fi ;; esac done < serverlist } # Take a list of ${oldhash}|${newhash} and output a list of needed patches, # i.e., those for which we have ${oldhash} and don't have ${newhash}. fetch_make_patchlist () { grep -vE "^([0-9a-f]{64})\|\1$" | tr '|' ' ' | while read X Y; do if [ -f "files/${Y}.gz" ] || [ ! -f "files/${X}.gz" ]; then continue fi echo "${X}|${Y}" done | uniq } # Print user-friendly progress statistics fetch_progress () { LNC=0 while read x; do LNC=$(($LNC + 1)) if [ $(($LNC % 10)) = 0 ]; then echo -n $LNC elif [ $(($LNC % 2)) = 0 ]; then echo -n . fi done echo -n " " } # Function for asking the user if everything is ok continuep () { while read -p "Does this look reasonable (y/n)? " CONTINUE; do case "${CONTINUE}" in y*) return 0 ;; n*) return 1 ;; esac done } # Initialize the working directory workdir_init () { mkdir -p files touch tINDEX.present } # Check that we have a public key with an appropriate hash, or # fetch the key if it doesn't exist. Returns 1 if the key has # not yet been fetched. fetch_key () { if [ -r pub.ssl ] && [ `${SHA256} -q pub.ssl` = ${KEYPRINT} ]; then return 0 fi echo -n "Fetching public key from ${SERVERNAME}... " rm -f pub.ssl fetch ${QUIETFLAG} http://${SERVERNAME}/${FETCHDIR}/pub.ssl \ 2>${QUIETREDIR} || true if ! [ -r pub.ssl ]; then echo "failed." return 1 fi if ! [ `${SHA256} -q pub.ssl` = ${KEYPRINT} ]; then echo "key has incorrect hash." rm -f pub.ssl return 1 fi echo "done." } # Fetch metadata signature, aka "tag". fetch_tag () { echo -n "Fetching metadata signature " echo ${NDEBUG} "for ${RELNUM} from ${SERVERNAME}... " rm -f latest.ssl fetch ${QUIETFLAG} http://${SERVERNAME}/${FETCHDIR}/latest.ssl \ 2>${QUIETREDIR} || true if ! [ -r latest.ssl ]; then echo "failed." return 1 fi openssl rsautl -pubin -inkey pub.ssl -verify \ < latest.ssl > tag.new 2>${QUIETREDIR} || true rm latest.ssl if ! [ `wc -l < tag.new` = 1 ] || ! grep -qE \ "^freebsd-update\|${ARCH}\|${RELNUM}\|[0-9]+\|[0-9a-f]{64}\|[0-9]{10}" \ tag.new; then echo "invalid signature." return 1 fi echo "done." RELPATCHNUM=`cut -f 4 -d '|' < tag.new` TINDEXHASH=`cut -f 5 -d '|' < tag.new` EOLTIME=`cut -f 6 -d '|' < tag.new` } # Sanity-check the patch number in a tag, to make sure that we're not # going to "update" backwards and to prevent replay attacks. fetch_tagsanity () { # Check that we're not going to move from -pX to -pY with Y < X. RELPX=`uname -r | sed -E 's,.*-,,'` if echo ${RELPX} | grep -qE '^p[0-9]+$'; then RELPX=`echo ${RELPX} | cut -c 2-` else RELPX=0 fi if [ "${RELPATCHNUM}" -lt "${RELPX}" ]; then echo echo -n "Files on mirror (${RELNUM}-p${RELPATCHNUM})" echo " appear older than what" echo "we are currently running (`uname -r`)!" echo "Cowardly refusing to proceed any further." return 1 fi # If "tag" exists and corresponds to ${RELNUM}, make sure that # it contains a patch number <= RELPATCHNUM, in order to protect # against rollback (replay) attacks. if [ -f tag ] && grep -qE \ "^freebsd-update\|${ARCH}\|${RELNUM}\|[0-9]+\|[0-9a-f]{64}\|[0-9]{10}" \ tag; then LASTRELPATCHNUM=`cut -f 4 -d '|' < tag` if [ "${RELPATCHNUM}" -lt "${LASTRELPATCHNUM}" ]; then echo echo -n "Files on mirror (${RELNUM}-p${RELPATCHNUM})" echo " are older than the" echo -n "most recently seen updates" echo " (${RELNUM}-p${LASTRELPATCHNUM})." echo "Cowardly refusing to proceed any further." return 1 fi fi } # Fetch metadata index file fetch_metadata_index () { echo ${NDEBUG} "Fetching metadata index... " rm -f ${TINDEXHASH} fetch ${QUIETFLAG} http://${SERVERNAME}/${FETCHDIR}/t/${TINDEXHASH} 2>${QUIETREDIR} if ! [ -f ${TINDEXHASH} ]; then echo "failed." return 1 fi if [ `${SHA256} -q ${TINDEXHASH}` != ${TINDEXHASH} ]; then echo "update metadata index corrupt." return 1 fi echo "done." } # Print an error message about signed metadata being bogus. fetch_metadata_bogus () { echo echo "The update metadata$1 is correctly signed, but" echo "failed an integrity check." echo "Cowardly refusing to proceed any further." return 1 } # Construct tINDEX.new by merging the lines named in $1 from ${TINDEXHASH} # with the lines not named in $@ from tINDEX.present (if that file exists). fetch_metadata_index_merge () { for METAFILE in $@; do if [ `grep -E "^${METAFILE}\|" ${TINDEXHASH} | wc -l` \ -ne 1 ]; then fetch_metadata_bogus " index" return 1 fi grep -E "${METAFILE}\|" ${TINDEXHASH} done | sort > tINDEX.wanted if [ -f tINDEX.present ]; then join -t '|' -v 2 tINDEX.wanted tINDEX.present | sort -m - tINDEX.wanted > tINDEX.new rm tINDEX.wanted else mv tINDEX.wanted tINDEX.new fi } # Sanity check all the lines of tINDEX.new. Even if more metadata lines # are added by future versions of the server, this won't cause problems, # since the only lines which appear in tINDEX.new are the ones which we # specifically grepped out of ${TINDEXHASH}. fetch_metadata_index_sanity () { if grep -qvE '^[0-9A-Z.-]+\|[0-9a-f]{64}$' tINDEX.new; then fetch_metadata_bogus " index" return 1 fi } # Sanity check the metadata file $1. fetch_metadata_sanity () { # Some aliases to save space later: ${P} is a character which can # appear in a path; ${M} is the four numeric metadata fields; and # ${H} is a sha256 hash. P="[-+./:=,%@_[~[:alnum:]]" M="[0-9]+\|[0-9]+\|[0-9]+\|[0-9]+" H="[0-9a-f]{64}" # Check that the first four fields make sense. if gunzip -c < files/$1.gz | grep -qvE "^[a-z]+\|[0-9a-z]+\|${P}+\|[fdL-]\|"; then fetch_metadata_bogus "" return 1 fi # Remove the first three fields. gunzip -c < files/$1.gz | cut -f 4- -d '|' > sanitycheck.tmp # Sanity check entries with type 'f' if grep -E '^f' sanitycheck.tmp | grep -qvE "^f\|${M}\|${H}\|${P}*\$"; then fetch_metadata_bogus "" return 1 fi # Sanity check entries with type 'd' if grep -E '^d' sanitycheck.tmp | grep -qvE "^d\|${M}\|\|\$"; then fetch_metadata_bogus "" return 1 fi # Sanity check entries with type 'L' if grep -E '^L' sanitycheck.tmp | grep -qvE "^L\|${M}\|${P}*\|\$"; then fetch_metadata_bogus "" return 1 fi # Sanity check entries with type '-' if grep -E '^-' sanitycheck.tmp | grep -qvE "^-\|\|\|\|\|\|"; then fetch_metadata_bogus "" return 1 fi # Clean up rm sanitycheck.tmp } # Fetch the metadata index and metadata files listed in $@, # taking advantage of metadata patches where possible. fetch_metadata () { fetch_metadata_index || return 1 fetch_metadata_index_merge $@ || return 1 fetch_metadata_index_sanity || return 1 # Generate a list of wanted metadata patches join -t '|' -o 1.2,2.2 tINDEX.present tINDEX.new | fetch_make_patchlist > patchlist if [ -s patchlist ]; then # Attempt to fetch metadata patches echo -n "Fetching `wc -l < patchlist | tr -d ' '` " echo ${NDEBUG} "metadata patches.${DDSTATS}" tr '|' '-' < patchlist | lam -s "${FETCHDIR}/tp/" - -s ".gz" | xargs ${XARGST} ${PHTTPGET} ${SERVERNAME} \ 2>${STATSREDIR} | fetch_progress echo "done." # Attempt to apply metadata patches echo -n "Applying metadata patches... " tr '|' ' ' < patchlist | while read X Y; do if [ ! -f "${X}-${Y}.gz" ]; then continue; fi gunzip -c < ${X}-${Y}.gz > diff gunzip -c < files/${X}.gz > diff-OLD # Figure out which lines are being added and removed grep -E '^-' diff | cut -c 2- | while read PREFIX; do look "${PREFIX}" diff-OLD done | sort > diff-rm grep -E '^\+' diff | cut -c 2- > diff-add # Generate the new file comm -23 diff-OLD diff-rm | sort - diff-add > diff-NEW if [ `${SHA256} -q diff-NEW` = ${Y} ]; then mv diff-NEW files/${Y} gzip -n files/${Y} else mv diff-NEW ${Y}.bad fi rm -f ${X}-${Y}.gz diff rm -f diff-OLD diff-NEW diff-add diff-rm done 2>${QUIETREDIR} echo "done." fi # Update metadata without patches cut -f 2 -d '|' < tINDEX.new | while read Y; do if [ ! -f "files/${Y}.gz" ]; then echo ${Y}; fi done | sort -u > filelist if [ -s filelist ]; then echo -n "Fetching `wc -l < filelist | tr -d ' '` " echo ${NDEBUG} "metadata files... " lam -s "${FETCHDIR}/m/" - -s ".gz" < filelist | xargs ${XARGST} ${PHTTPGET} ${SERVERNAME} \ 2>${QUIETREDIR} while read Y; do if ! [ -f ${Y}.gz ]; then echo "failed." return 1 fi if [ `gunzip -c < ${Y}.gz | ${SHA256} -q` = ${Y} ]; then mv ${Y}.gz files/${Y}.gz else echo "metadata is corrupt." return 1 fi done < filelist echo "done." fi # Sanity-check the metadata files. cut -f 2 -d '|' tINDEX.new > filelist while read X; do fetch_metadata_sanity ${X} || return 1 done < filelist # Remove files which are no longer needed cut -f 2 -d '|' tINDEX.present | sort > oldfiles cut -f 2 -d '|' tINDEX.new | sort | comm -13 - oldfiles | lam -s "files/" - -s ".gz" | xargs rm -f rm patchlist filelist oldfiles rm ${TINDEXHASH} # We're done! mv tINDEX.new tINDEX.present mv tag.new tag return 0 } # Extract a subset of a downloaded metadata file containing only the parts # which are listed in COMPONENTS. fetch_filter_metadata_components () { METAHASH=`look "$1|" tINDEX.present | cut -f 2 -d '|'` gunzip -c < files/${METAHASH}.gz > $1.all # Fish out the lines belonging to components we care about. for C in ${COMPONENTS}; do look "`echo ${C} | tr '/' '|'`|" $1.all done > $1 # Remove temporary file. rm $1.all } # Generate a filtered version of the metadata file $1 from the downloaded # file, by fishing out the lines corresponding to components we're trying # to keep updated, and then removing lines corresponding to paths we want # to ignore. fetch_filter_metadata () { # Fish out the lines belonging to components we care about. fetch_filter_metadata_components $1 # Canonicalize directory names by removing any trailing / in # order to avoid listing directories multiple times if they # belong to multiple components. Turning "/" into "" doesn't # matter, since we add a leading "/" when we use paths later. cut -f 3- -d '|' $1 | sed -e 's,/|d|,|d|,' | sed -e 's,/|-|,|-|,' | sort -u > $1.tmp # Figure out which lines to ignore and remove them. for X in ${IGNOREPATHS}; do grep -E "^${X}" $1.tmp done | sort -u | comm -13 - $1.tmp > $1 # Remove temporary files. rm $1.tmp } # Filter the metadata file $1 by adding lines with "/boot/$2" # replaced by ${KERNELDIR} (which is `sysctl -n kern.bootfile` minus the # trailing "/kernel"); and if "/boot/$2" does not exist, remove # the original lines which start with that. # Put another way: Deal with the fact that the FOO kernel is sometimes # installed in /boot/FOO/ and is sometimes installed elsewhere. fetch_filter_kernel_names () { grep ^/boot/$2 $1 | sed -e "s,/boot/$2,${KERNELDIR},g" | sort - $1 > $1.tmp mv $1.tmp $1 if ! [ -d /boot/$2 ]; then grep -v ^/boot/$2 $1 > $1.tmp mv $1.tmp $1 fi } # For all paths appearing in $1 or $3, inspect the system # and generate $2 describing what is currently installed. fetch_inspect_system () { # No errors yet... rm -f .err # Tell the user why his disk is suddenly making lots of noise echo -n "Inspecting system... " # Generate list of files to inspect cat $1 $3 | cut -f 1 -d '|' | sort -u > filelist # Examine each file and output lines of the form # /path/to/file|type|device-inum|user|group|perm|flags|value # sorted by device and inode number. while read F; do # If the symlink/file/directory does not exist, record this. if ! [ -e ${BASEDIR}/${F} ]; then echo "${F}|-||||||" continue fi if ! [ -r ${BASEDIR}/${F} ]; then echo "Cannot read file: ${BASEDIR}/${F}" \ >/dev/stderr touch .err return 1 fi # Otherwise, output an index line. if [ -L ${BASEDIR}/${F} ]; then echo -n "${F}|L|" stat -n -f '%d-%i|%u|%g|%Mp%Lp|%Of|' ${BASEDIR}/${F}; readlink ${BASEDIR}/${F}; elif [ -f ${BASEDIR}/${F} ]; then echo -n "${F}|f|" stat -n -f '%d-%i|%u|%g|%Mp%Lp|%Of|' ${BASEDIR}/${F}; sha256 -q ${BASEDIR}/${F}; elif [ -d ${BASEDIR}/${F} ]; then echo -n "${F}|d|" stat -f '%d-%i|%u|%g|%Mp%Lp|%Of|' ${BASEDIR}/${F}; else echo "Unknown file type: ${BASEDIR}/${F}" \ >/dev/stderr touch .err return 1 fi done < filelist | sort -k 3,3 -t '|' > $2.tmp rm filelist # Check if an error occurred during system inspection if [ -f .err ]; then return 1 fi # Convert to the form # /path/to/file|type|user|group|perm|flags|value|hlink # by resolving identical device and inode numbers into hard links. cut -f 1,3 -d '|' $2.tmp | sort -k 1,1 -t '|' | sort -s -u -k 2,2 -t '|' | join -1 2 -2 3 -t '|' - $2.tmp | awk -F \| -v OFS=\| \ '{ if (($2 == $3) || ($4 == "-")) print $3,$4,$5,$6,$7,$8,$9,"" else print $3,$4,$5,$6,$7,$8,$9,$2 }' | sort > $2 rm $2.tmp # We're finished looking around echo "done." } # For any paths matching ${MERGECHANGES}, compare $1 and $2 and find any # files which differ; generate $3 containing these paths and the old hashes. fetch_filter_mergechanges () { # Pull out the paths and hashes of the files matching ${MERGECHANGES}. for F in $1 $2; do for X in ${MERGECHANGES}; do grep -E "^${X}" ${F} done | cut -f 1,2,7 -d '|' | sort > ${F}-values done # Any line in $2-values which doesn't appear in $1-values and is a # file means that we should list the path in $3. comm -13 $1-values $2-values | fgrep '|f|' | cut -f 1 -d '|' > $2-paths # For each path, pull out one (and only one!) entry from $1-values. # Note that we cannot distinguish which "old" version the user made # changes to; but hopefully any changes which occur due to security # updates will exist in both the "new" version and the version which # the user has installed, so the merging will still work. while read X; do look "${X}|" $1-values | head -1 done < $2-paths > $3 # Clean up rm $1-values $2-values $2-paths } # For any paths matching ${UPDATEIFUNMODIFIED}, remove lines from $[123] # which correspond to lines in $2 with hashes not matching $1 or $3, unless # the paths are listed in $4. For entries in $2 marked "not present" # (aka. type -), remove lines from $[123] unless there is a corresponding # entry in $1. fetch_filter_unmodified_notpresent () { # Figure out which lines of $1 and $3 correspond to bits which # should only be updated if they haven't changed, and fish out # the (path, type, value) tuples. # NOTE: We don't consider a file to be "modified" if it matches # the hash from $3. for X in ${UPDATEIFUNMODIFIED}; do grep -E "^${X}" $1 grep -E "^${X}" $3 done | cut -f 1,2,7 -d '|' | sort > $1-values # Do the same for $2. for X in ${UPDATEIFUNMODIFIED}; do grep -E "^${X}" $2 done | cut -f 1,2,7 -d '|' | sort > $2-values # Any entry in $2-values which is not in $1-values corresponds to # a path which we need to remove from $1, $2, and $3, unless it # that path appears in $4. comm -13 $1-values $2-values | sort -t '|' -k 1,1 > mlines.tmp cut -f 1 -d '|' $4 | sort | join -v 2 -t '|' - mlines.tmp | sort > mlines rm $1-values $2-values mlines.tmp # Any lines in $2 which are not in $1 AND are "not present" lines # also belong in mlines. comm -13 $1 $2 | cut -f 1,2,7 -d '|' | fgrep '|-|' >> mlines # Remove lines from $1, $2, and $3 for X in $1 $2 $3; do sort -t '|' -k 1,1 ${X} > ${X}.tmp cut -f 1 -d '|' < mlines | sort | join -v 2 -t '|' - ${X}.tmp | sort > ${X} rm ${X}.tmp done # Store a list of the modified files, for future reference fgrep -v '|-|' mlines | cut -f 1 -d '|' > modifiedfiles rm mlines } # For each entry in $1 of type -, remove any corresponding # entry from $2 if ${ALLOWADD} != "yes". Remove all entries # of type - from $1. fetch_filter_allowadd () { cut -f 1,2 -d '|' < $1 | fgrep '|-' | cut -f 1 -d '|' > filesnotpresent if [ ${ALLOWADD} != "yes" ]; then sort < $2 | join -v 1 -t '|' - filesnotpresent | sort > $2.tmp mv $2.tmp $2 fi sort < $1 | join -v 1 -t '|' - filesnotpresent | sort > $1.tmp mv $1.tmp $1 rm filesnotpresent } # If ${ALLOWDELETE} != "yes", then remove any entries from $1 # which don't correspond to entries in $2. fetch_filter_allowdelete () { # Produce a lists ${PATH}|${TYPE} for X in $1 $2; do cut -f 1-2 -d '|' < ${X} | sort -u > ${X}.nodes done # Figure out which lines need to be removed from $1. if [ ${ALLOWDELETE} != "yes" ]; then comm -23 $1.nodes $2.nodes > $1.badnodes else : > $1.badnodes fi # Remove the relevant lines from $1 while read X; do look "${X}|" $1 done < $1.badnodes | comm -13 - $1 > $1.tmp mv $1.tmp $1 rm $1.badnodes $1.nodes $2.nodes } # If ${KEEPMODIFIEDMETADATA} == "yes", then for each entry in $2 # with metadata not matching any entry in $1, replace the corresponding # line of $3 with one having the same metadata as the entry in $2. fetch_filter_modified_metadata () { # Fish out the metadata from $1 and $2 for X in $1 $2; do cut -f 1-6 -d '|' < ${X} > ${X}.metadata done # Find the metadata we need to keep if [ ${KEEPMODIFIEDMETADATA} = "yes" ]; then comm -13 $1.metadata $2.metadata > keepmeta else : > keepmeta fi # Extract the lines which we need to remove from $3, and # construct the lines which we need to add to $3. : > $3.remove : > $3.add while read LINE; do NODE=`echo "${LINE}" | cut -f 1-2 -d '|'` look "${NODE}|" $3 >> $3.remove look "${NODE}|" $3 | cut -f 7- -d '|' | lam -s "${LINE}|" - >> $3.add done < keepmeta # Remove the specified lines and add the new lines. sort $3.remove | comm -13 - $3 | sort -u - $3.add > $3.tmp mv $3.tmp $3 rm keepmeta $1.metadata $2.metadata $3.add $3.remove } # Remove lines from $1 and $2 which are identical; # no need to update a file if it isn't changing. fetch_filter_uptodate () { comm -23 $1 $2 > $1.tmp comm -13 $1 $2 > $2.tmp mv $1.tmp $1 mv $2.tmp $2 } # Fetch any "clean" old versions of files we need for merging changes. fetch_files_premerge () { # We only need to do anything if $1 is non-empty. if [ -s $1 ]; then # Tell the user what we're doing echo -n "Fetching files from ${OLDRELNUM} for merging... " # List of files wanted fgrep '|f|' < $1 | cut -f 3 -d '|' | sort -u > files.wanted # Only fetch the files we don't already have while read Y; do if [ ! -f "files/${Y}.gz" ]; then echo ${Y}; fi done < files.wanted > filelist # Actually fetch them lam -s "${OLDFETCHDIR}/f/" - -s ".gz" < filelist | xargs ${XARGST} ${PHTTPGET} ${SERVERNAME} \ 2>${QUIETREDIR} # Make sure we got them all, and move them into /files/ while read Y; do if ! [ -f ${Y}.gz ]; then echo "failed." return 1 fi if [ `gunzip -c < ${Y}.gz | ${SHA256} -q` = ${Y} ]; then mv ${Y}.gz files/${Y}.gz else echo "${Y} has incorrect hash." return 1 fi done < filelist echo "done." # Clean up rm filelist files.wanted fi } # Prepare to fetch files: Generate a list of the files we need, # copy the unmodified files we have into /files/, and generate # a list of patches to download. fetch_files_prepare () { # Tell the user why his disk is suddenly making lots of noise echo -n "Preparing to download files... " # Reduce indices to ${PATH}|${HASH} pairs for X in $1 $2 $3; do cut -f 1,2,7 -d '|' < ${X} | fgrep '|f|' | cut -f 1,3 -d '|' | sort > ${X}.hashes done # List of files wanted cut -f 2 -d '|' < $3.hashes | sort -u | while read HASH; do if ! [ -f files/${HASH}.gz ]; then echo ${HASH} fi done > files.wanted # Generate a list of unmodified files comm -12 $1.hashes $2.hashes | sort -k 1,1 -t '|' > unmodified.files # Copy all files into /files/. We only need the unmodified files # for use in patching; but we'll want all of them if the user asks # to rollback the updates later. while read LINE; do F=`echo "${LINE}" | cut -f 1 -d '|'` HASH=`echo "${LINE}" | cut -f 2 -d '|'` # Skip files we already have. if [ -f files/${HASH}.gz ]; then continue fi # Make sure the file hasn't changed. cp "${BASEDIR}/${F}" tmpfile if [ `sha256 -q tmpfile` != ${HASH} ]; then echo echo "File changed while FreeBSD Update running: ${F}" return 1 fi # Place the file into storage. gzip -c < tmpfile > files/${HASH}.gz rm tmpfile done < $2.hashes # Produce a list of patches to download sort -k 1,1 -t '|' $3.hashes | join -t '|' -o 2.2,1.2 - unmodified.files | fetch_make_patchlist > patchlist # Garbage collect rm unmodified.files $1.hashes $2.hashes $3.hashes # We don't need the list of possible old files any more. rm $1 # We're finished making noise echo "done." } # Fetch files. fetch_files () { # Attempt to fetch patches if [ -s patchlist ]; then echo -n "Fetching `wc -l < patchlist | tr -d ' '` " echo ${NDEBUG} "patches.${DDSTATS}" tr '|' '-' < patchlist | lam -s "${PATCHDIR}/" - | xargs ${XARGST} ${PHTTPGET} ${SERVERNAME} \ 2>${STATSREDIR} | fetch_progress echo "done." # Attempt to apply patches echo -n "Applying patches... " tr '|' ' ' < patchlist | while read X Y; do if [ ! -f "${X}-${Y}" ]; then continue; fi gunzip -c < files/${X}.gz > OLD bspatch OLD NEW ${X}-${Y} if [ `${SHA256} -q NEW` = ${Y} ]; then mv NEW files/${Y} gzip -n files/${Y} fi rm -f diff OLD NEW ${X}-${Y} done 2>${QUIETREDIR} echo "done." fi # Download files which couldn't be generate via patching while read Y; do if [ ! -f "files/${Y}.gz" ]; then echo ${Y}; fi done < files.wanted > filelist if [ -s filelist ]; then echo -n "Fetching `wc -l < filelist | tr -d ' '` " echo ${NDEBUG} "files... " lam -s "${FETCHDIR}/f/" - -s ".gz" < filelist | xargs ${XARGST} ${PHTTPGET} ${SERVERNAME} \ 2>${QUIETREDIR} while read Y; do if ! [ -f ${Y}.gz ]; then echo "failed." return 1 fi if [ `gunzip -c < ${Y}.gz | ${SHA256} -q` = ${Y} ]; then mv ${Y}.gz files/${Y}.gz else echo "${Y} has incorrect hash." return 1 fi done < filelist echo "done." fi # Clean up rm files.wanted filelist patchlist } # Create and populate install manifest directory; and report what updates # are available. fetch_create_manifest () { # If we have an existing install manifest, nuke it. if [ -L "${BDHASH}-install" ]; then rm -r ${BDHASH}-install/ rm ${BDHASH}-install fi # Report to the user if any updates were avoided due to local changes if [ -s modifiedfiles ]; then echo echo -n "The following files are affected by updates, " echo "but no changes have" echo -n "been downloaded because the files have been " echo "modified locally:" cat modifiedfiles fi | $PAGER rm modifiedfiles # If no files will be updated, tell the user and exit if ! [ -s INDEX-PRESENT ] && ! [ -s INDEX-NEW ]; then rm INDEX-PRESENT INDEX-NEW echo echo -n "No updates needed to update system to " echo "${RELNUM}-p${RELPATCHNUM}." return fi # Divide files into (a) removed files, (b) added files, and # (c) updated files. cut -f 1 -d '|' < INDEX-PRESENT | sort > INDEX-PRESENT.flist cut -f 1 -d '|' < INDEX-NEW | sort > INDEX-NEW.flist comm -23 INDEX-PRESENT.flist INDEX-NEW.flist > files.removed comm -13 INDEX-PRESENT.flist INDEX-NEW.flist > files.added comm -12 INDEX-PRESENT.flist INDEX-NEW.flist > files.updated rm INDEX-PRESENT.flist INDEX-NEW.flist # Report removed files, if any if [ -s files.removed ]; then echo echo -n "The following files will be removed " echo "as part of updating to ${RELNUM}-p${RELPATCHNUM}:" cat files.removed fi | $PAGER rm files.removed # Report added files, if any if [ -s files.added ]; then echo echo -n "The following files will be added " echo "as part of updating to ${RELNUM}-p${RELPATCHNUM}:" cat files.added fi | $PAGER rm files.added # Report updated files, if any if [ -s files.updated ]; then echo echo -n "The following files will be updated " echo "as part of updating to ${RELNUM}-p${RELPATCHNUM}:" cat files.updated fi | $PAGER rm files.updated # Create a directory for the install manifest. MDIR=`mktemp -d install.XXXXXX` || return 1 # Populate it mv INDEX-PRESENT ${MDIR}/INDEX-OLD mv INDEX-NEW ${MDIR}/INDEX-NEW # Link it into place ln -s ${MDIR} ${BDHASH}-install } # Warn about any upcoming EoL fetch_warn_eol () { # What's the current time? NOWTIME=`date "+%s"` # When did we last warn about the EoL date? if [ -f lasteolwarn ]; then LASTWARN=`cat lasteolwarn` else LASTWARN=`expr ${NOWTIME} - 63072000` fi # If the EoL time is past, warn. if [ ${EOLTIME} -lt ${NOWTIME} ]; then echo cat <<-EOF WARNING: `uname -sr` HAS PASSED ITS END-OF-LIFE DATE. Any security issues discovered after `date -r ${EOLTIME}` will not have been corrected. EOF return 1 fi # Figure out how long it has been since we last warned about the # upcoming EoL, and how much longer we have left. SINCEWARN=`expr ${NOWTIME} - ${LASTWARN}` TIMELEFT=`expr ${EOLTIME} - ${NOWTIME}` # Don't warn if the EoL is more than 3 months away if [ ${TIMELEFT} -gt 7884000 ]; then return 0 fi # Don't warn if the time remaining is more than 3 times the time # since the last warning. if [ ${TIMELEFT} -gt `expr ${SINCEWARN} \* 3` ]; then return 0 fi # Figure out what time units to use. if [ ${TIMELEFT} -lt 604800 ]; then UNIT="day" SIZE=86400 elif [ ${TIMELEFT} -lt 2678400 ]; then UNIT="week" SIZE=604800 else UNIT="month" SIZE=2678400 fi # Compute the right number of units NUM=`expr ${TIMELEFT} / ${SIZE}` if [ ${NUM} != 1 ]; then UNIT="${UNIT}s" fi # Print the warning echo cat <<-EOF WARNING: `uname -sr` is approaching its End-of-Life date. It is strongly recommended that you upgrade to a newer release within the next ${NUM} ${UNIT}. EOF # Update the stored time of last warning echo ${NOWTIME} > lasteolwarn } # Do the actual work involved in "fetch" / "cron". fetch_run () { workdir_init || return 1 # Prepare the mirror list. fetch_pick_server_init && fetch_pick_server # Try to fetch the public key until we run out of servers. while ! fetch_key; do fetch_pick_server || return 1 done # Try to fetch the metadata index signature ("tag") until we run # out of available servers; and sanity check the downloaded tag. while ! fetch_tag; do fetch_pick_server || return 1 done fetch_tagsanity || return 1 # Fetch the latest INDEX-NEW and INDEX-OLD files. fetch_metadata INDEX-NEW INDEX-OLD || return 1 # Generate filtered INDEX-NEW and INDEX-OLD files containing only # the lines which (a) belong to components we care about, and (b) # don't correspond to paths we're explicitly ignoring. fetch_filter_metadata INDEX-NEW || return 1 fetch_filter_metadata INDEX-OLD || return 1 # Translate /boot/${KERNCONF} into ${KERNELDIR} fetch_filter_kernel_names INDEX-NEW ${KERNCONF} fetch_filter_kernel_names INDEX-OLD ${KERNCONF} # For all paths appearing in INDEX-OLD or INDEX-NEW, inspect the # system and generate an INDEX-PRESENT file. fetch_inspect_system INDEX-OLD INDEX-PRESENT INDEX-NEW || return 1 # Based on ${UPDATEIFUNMODIFIED}, remove lines from INDEX-* which # correspond to lines in INDEX-PRESENT with hashes not appearing # in INDEX-OLD or INDEX-NEW. Also remove lines where the entry in # INDEX-PRESENT has type - and there isn't a corresponding entry in # INDEX-OLD with type -. fetch_filter_unmodified_notpresent \ INDEX-OLD INDEX-PRESENT INDEX-NEW /dev/null # For each entry in INDEX-PRESENT of type -, remove any corresponding # entry from INDEX-NEW if ${ALLOWADD} != "yes". Remove all entries # of type - from INDEX-PRESENT. fetch_filter_allowadd INDEX-PRESENT INDEX-NEW # If ${ALLOWDELETE} != "yes", then remove any entries from # INDEX-PRESENT which don't correspond to entries in INDEX-NEW. fetch_filter_allowdelete INDEX-PRESENT INDEX-NEW # If ${KEEPMODIFIEDMETADATA} == "yes", then for each entry in # INDEX-PRESENT with metadata not matching any entry in INDEX-OLD, # replace the corresponding line of INDEX-NEW with one having the # same metadata as the entry in INDEX-PRESENT. fetch_filter_modified_metadata INDEX-OLD INDEX-PRESENT INDEX-NEW # Remove lines from INDEX-PRESENT and INDEX-NEW which are identical; # no need to update a file if it isn't changing. fetch_filter_uptodate INDEX-PRESENT INDEX-NEW # Prepare to fetch files: Generate a list of the files we need, # copy the unmodified files we have into /files/, and generate # a list of patches to download. fetch_files_prepare INDEX-OLD INDEX-PRESENT INDEX-NEW || return 1 # Fetch files. fetch_files || return 1 # Create and populate install manifest directory; and report what # updates are available. fetch_create_manifest || return 1 # Warn about any upcoming EoL fetch_warn_eol || return 1 } # If StrictComponents is not "yes", generate a new components list # with only the components which appear to be installed. upgrade_guess_components () { if [ "${STRICTCOMPONENTS}" = "no" ]; then # Generate filtered INDEX-ALL with only the components listed # in COMPONENTS. fetch_filter_metadata_components $1 || return 1 # Tell the user why his disk is suddenly making lots of noise echo -n "Inspecting system... " # Look at the files on disk, and assume that a component is # supposed to be present if it is more than half-present. cut -f 1-3 -d '|' < INDEX-ALL | tr '|' ' ' | while read C S F; do if [ -e ${BASEDIR}/${F} ]; then echo "+ ${C}|${S}" fi echo "= ${C}|${S}" done | sort | uniq -c | sed -E 's,^ +,,' > compfreq grep ' = ' compfreq | cut -f 1,3 -d ' ' | sort -k 2,2 -t ' ' > compfreq.total grep ' + ' compfreq | cut -f 1,3 -d ' ' | sort -k 2,2 -t ' ' > compfreq.present join -t ' ' -1 2 -2 2 compfreq.present compfreq.total | while read S P T; do if [ ${P} -gt `expr ${T} / 2` ]; then echo ${S} fi done > comp.present cut -f 2 -d ' ' < compfreq.total > comp.total rm INDEX-ALL compfreq compfreq.total compfreq.present # We're done making noise. echo "done." # Sometimes the kernel isn't installed where INDEX-ALL # thinks that it should be: In particular, it is often in # /boot/kernel instead of /boot/GENERIC or /boot/SMP. To # deal with this, if "kernel|X" is listed in comp.total # (i.e., is a component which would be upgraded if it is # found to be present) we will add it to comp.present. # If "kernel|" is in comp.total but "kernel|X" is # not, we print a warning -- the user is running a kernel # which isn't part of the release. KCOMP=`echo ${KERNCONF} | tr 'A-Z' 'a-z'` grep -E "^kernel\|${KCOMP}\$" comp.total >> comp.present if grep -qE "^kernel\|" comp.total && ! grep -qE "^kernel\|${KCOMP}\$" comp.total; then cat <<-EOF WARNING: This system is running a "${KCOMP}" kernel, which is not a kernel configuration distributed as part of FreeBSD ${RELNUM}. This kernel will not be updated: you MUST update the kernel manually before running "$0 install". EOF fi # Re-sort the list of installed components and generate # the list of non-installed components. sort -u < comp.present > comp.present.tmp mv comp.present.tmp comp.present comm -13 comp.present comp.total > comp.absent # Ask the user to confirm that what we have is correct. To # reduce user confusion, translate "X|Y" back to "X/Y" (as # subcomponents must be listed in the configuration file). echo echo -n "The following components of FreeBSD " echo "seem to be installed:" tr '|' '/' < comp.present | fmt -72 echo echo -n "The following components of FreeBSD " echo "do not seem to be installed:" tr '|' '/' < comp.absent | fmt -72 echo continuep || return 1 echo # Suck the generated list of components into ${COMPONENTS}. # Note that comp.present.tmp is used due to issues with # pipelines and setting variables. COMPONENTS="" tr '|' '/' < comp.present > comp.present.tmp while read C; do COMPONENTS="${COMPONENTS} ${C}" done < comp.present.tmp # Delete temporary files rm comp.present comp.present.tmp comp.absent comp.total fi } # If StrictComponents is not "yes", COMPONENTS contains an entry # corresponding to the currently running kernel, and said kernel # does not exist in the new release, add "kernel/generic" to the # list of components. upgrade_guess_new_kernel () { if [ "${STRICTCOMPONENTS}" = "no" ]; then # Grab the unfiltered metadata file. METAHASH=`look "$1|" tINDEX.present | cut -f 2 -d '|'` gunzip -c < files/${METAHASH}.gz > $1.all # If "kernel/${KCOMP}" is in ${COMPONENTS} and that component # isn't in $1.all, we need to add kernel/generic. for C in ${COMPONENTS}; do if [ ${C} = "kernel/${KCOMP}" ] && ! grep -qE "^kernel\|${KCOMP}\|" $1.all; then COMPONENTS="${COMPONENTS} kernel/generic" NKERNCONF="GENERIC" cat <<-EOF WARNING: This system is running a "${KCOMP}" kernel, which is not a kernel configuration distributed as part of FreeBSD ${RELNUM}. As part of upgrading to FreeBSD ${RELNUM}, this kernel will be replaced with a "generic" kernel. EOF continuep || return 1 fi done # Don't need this any more... rm $1.all fi } # Convert INDEX-OLD (last release) and INDEX-ALL (new release) into # INDEX-OLD and INDEX-NEW files (in the sense of normal upgrades). upgrade_oldall_to_oldnew () { # For each ${F}|... which appears in INDEX-ALL but does not appear # in INDEX-OLD, add ${F}|-|||||| to INDEX-OLD. cut -f 1 -d '|' < $1 | sort -u > $1.paths cut -f 1 -d '|' < $2 | sort -u | comm -13 $1.paths - | lam - -s "|-||||||" | sort - $1 > $1.tmp mv $1.tmp $1 # Remove lines from INDEX-OLD which also appear in INDEX-ALL comm -23 $1 $2 > $1.tmp mv $1.tmp $1 # Remove lines from INDEX-ALL which have a file name not appearing # anywhere in INDEX-OLD (since these must be files which haven't # changed -- if they were new, there would be an entry of type "-"). cut -f 1 -d '|' < $1 | sort -u > $1.paths sort -k 1,1 -t '|' < $2 | join -t '|' - $1.paths | sort > $2.tmp rm $1.paths mv $2.tmp $2 # Rename INDEX-ALL to INDEX-NEW. mv $2 $3 } # Helper for upgrade_merge: Return zero true iff the two files differ only # in the contents of their RCS tags. samef () { X=`sed -E 's/\\$FreeBSD.*\\$/\$FreeBSD\$/' < $1 | ${SHA256}` Y=`sed -E 's/\\$FreeBSD.*\\$/\$FreeBSD\$/' < $2 | ${SHA256}` if [ $X = $Y ]; then return 0; else return 1; fi } # From the list of "old" files in $1, merge changes in $2 with those in $3, # and update $3 to reflect the hashes of merged files. upgrade_merge () { # We only need to do anything if $1 is non-empty. if [ -s $1 ]; then cut -f 1 -d '|' $1 | sort > $1-paths # Create staging area for merging files rm -rf merge/ while read F; do D=`dirname ${F}` mkdir -p merge/old/${D} mkdir -p merge/${OLDRELNUM}/${D} mkdir -p merge/${RELNUM}/${D} mkdir -p merge/new/${D} done < $1-paths # Copy in files while read F; do # Currently installed file V=`look "${F}|" $2 | cut -f 7 -d '|'` gunzip < files/${V}.gz > merge/old/${F} # Old release if look "${F}|" $1 | fgrep -q "|f|"; then V=`look "${F}|" $1 | cut -f 3 -d '|'` gunzip < files/${V}.gz \ > merge/${OLDRELNUM}/${F} fi # New release if look "${F}|" $3 | cut -f 1,2,7 -d '|' | fgrep -q "|f|"; then V=`look "${F}|" $3 | cut -f 7 -d '|'` gunzip < files/${V}.gz \ > merge/${RELNUM}/${F} fi done < $1-paths # Attempt to automatically merge changes echo -n "Attempting to automatically merge " echo -n "changes in files..." : > failed.merges while read F; do # If the file doesn't exist in the new release, # the result of "merging changes" is having the file # not exist. if ! [ -f merge/${RELNUM}/${F} ]; then continue fi # If the file didn't exist in the old release, we're # going to throw away the existing file and hope that # the version from the new release is what we want. if ! [ -f merge/${OLDRELNUM}/${F} ]; then cp merge/${RELNUM}/${F} merge/new/${F} continue fi # Some files need special treatment. case ${F} in /etc/spwd.db | /etc/pwd.db | /etc/login.conf.db) # Don't merge these -- we're rebuild them # after updates are installed. cp merge/old/${F} merge/new/${F} ;; *) if ! merge -p -L "current version" \ -L "${OLDRELNUM}" -L "${RELNUM}" \ merge/old/${F} \ merge/${OLDRELNUM}/${F} \ merge/${RELNUM}/${F} \ > merge/new/${F} 2>/dev/null; then echo ${F} >> failed.merges fi ;; esac done < $1-paths echo " done." # Ask the user to handle any files which didn't merge. while read F; do # If the installed file differs from the version in # the old release only due to RCS tag expansion # then just use the version in the new release. if samef merge/old/${F} merge/${OLDRELNUM}/${F}; then cp merge/${RELNUM}/${F} merge/new/${F} continue fi cat <<-EOF The following file could not be merged automatically: ${F} Press Enter to edit this file in ${EDITOR} and resolve the conflicts manually... EOF read dummy files/${V}.gz echo "${F}|${V}" fi done < $1-paths > newhashes # Pull lines out from $3 which need to be updated to # reflect merged files. while read F; do look "${F}|" $3 done < $1-paths > $3-oldlines # Update lines to reflect merged files join -t '|' -o 1.1,1.2,1.3,1.4,1.5,1.6,2.2,1.8 \ $3-oldlines newhashes > $3-newlines # Remove old lines from $3 and add new lines. sort $3-oldlines | comm -13 - $3 | sort - $3-newlines > $3.tmp mv $3.tmp $3 # Clean up rm $1-paths newhashes $3-oldlines $3-newlines rm -rf merge/ fi # We're done with merging files. rm $1 } # Do the work involved in fetching upgrades to a new release upgrade_run () { workdir_init || return 1 # Prepare the mirror list. fetch_pick_server_init && fetch_pick_server # Try to fetch the public key until we run out of servers. while ! fetch_key; do fetch_pick_server || return 1 done # Try to fetch the metadata index signature ("tag") until we run # out of available servers; and sanity check the downloaded tag. while ! fetch_tag; do fetch_pick_server || return 1 done fetch_tagsanity || return 1 # Fetch the INDEX-OLD and INDEX-ALL. fetch_metadata INDEX-OLD INDEX-ALL || return 1 # If StrictComponents is not "yes", generate a new components list # with only the components which appear to be installed. upgrade_guess_components INDEX-ALL || return 1 # Generate filtered INDEX-OLD and INDEX-ALL files containing only # the components we want and without anything marked as "Ignore". fetch_filter_metadata INDEX-OLD || return 1 fetch_filter_metadata INDEX-ALL || return 1 # Merge the INDEX-OLD and INDEX-ALL files into INDEX-OLD. sort INDEX-OLD INDEX-ALL > INDEX-OLD.tmp mv INDEX-OLD.tmp INDEX-OLD rm INDEX-ALL # Adjust variables for fetching files from the new release. OLDRELNUM=${RELNUM} RELNUM=${TARGETRELEASE} OLDFETCHDIR=${FETCHDIR} FETCHDIR=${RELNUM}/${ARCH} # Try to fetch the NEW metadata index signature ("tag") until we run # out of available servers; and sanity check the downloaded tag. while ! fetch_tag; do fetch_pick_server || return 1 done # Fetch the new INDEX-ALL. fetch_metadata INDEX-ALL || return 1 # If StrictComponents is not "yes", COMPONENTS contains an entry # corresponding to the currently running kernel, and said kernel # does not exist in the new release, add "kernel/generic" to the # list of components. upgrade_guess_new_kernel INDEX-ALL || return 1 # Filter INDEX-ALL to contain only the components we want and without # anything marked as "Ignore". fetch_filter_metadata INDEX-ALL || return 1 # Convert INDEX-OLD (last release) and INDEX-ALL (new release) into # INDEX-OLD and INDEX-NEW files (in the sense of normal upgrades). upgrade_oldall_to_oldnew INDEX-OLD INDEX-ALL INDEX-NEW # Translate /boot/${KERNCONF} or /boot/${NKERNCONF} into ${KERNELDIR} fetch_filter_kernel_names INDEX-NEW ${NKERNCONF} fetch_filter_kernel_names INDEX-OLD ${KERNCONF} # For all paths appearing in INDEX-OLD or INDEX-NEW, inspect the # system and generate an INDEX-PRESENT file. fetch_inspect_system INDEX-OLD INDEX-PRESENT INDEX-NEW || return 1 # Based on ${MERGECHANGES}, generate a file tomerge-old with the # paths and hashes of old versions of files to merge. fetch_filter_mergechanges INDEX-OLD INDEX-PRESENT tomerge-old # Based on ${UPDATEIFUNMODIFIED}, remove lines from INDEX-* which # correspond to lines in INDEX-PRESENT with hashes not appearing # in INDEX-OLD or INDEX-NEW. Also remove lines where the entry in # INDEX-PRESENT has type - and there isn't a corresponding entry in # INDEX-OLD with type -. fetch_filter_unmodified_notpresent \ INDEX-OLD INDEX-PRESENT INDEX-NEW tomerge-old # For each entry in INDEX-PRESENT of type -, remove any corresponding # entry from INDEX-NEW if ${ALLOWADD} != "yes". Remove all entries # of type - from INDEX-PRESENT. fetch_filter_allowadd INDEX-PRESENT INDEX-NEW # If ${ALLOWDELETE} != "yes", then remove any entries from # INDEX-PRESENT which don't correspond to entries in INDEX-NEW. fetch_filter_allowdelete INDEX-PRESENT INDEX-NEW # If ${KEEPMODIFIEDMETADATA} == "yes", then for each entry in # INDEX-PRESENT with metadata not matching any entry in INDEX-OLD, # replace the corresponding line of INDEX-NEW with one having the # same metadata as the entry in INDEX-PRESENT. fetch_filter_modified_metadata INDEX-OLD INDEX-PRESENT INDEX-NEW # Remove lines from INDEX-PRESENT and INDEX-NEW which are identical; # no need to update a file if it isn't changing. fetch_filter_uptodate INDEX-PRESENT INDEX-NEW # Fetch "clean" files from the old release for merging changes. fetch_files_premerge tomerge-old # Prepare to fetch files: Generate a list of the files we need, # copy the unmodified files we have into /files/, and generate # a list of patches to download. fetch_files_prepare INDEX-OLD INDEX-PRESENT INDEX-NEW || return 1 # Fetch patches from to-${RELNUM}/${ARCH}/bp/ PATCHDIR=to-${RELNUM}/${ARCH}/bp fetch_files || return 1 # Merge configuration file changes. upgrade_merge tomerge-old INDEX-PRESENT INDEX-NEW || return 1 # Create and populate install manifest directory; and report what # updates are available. fetch_create_manifest || return 1 # Leave a note behind to tell the "install" command that the kernel # needs to be installed before the world. touch ${BDHASH}-install/kernelfirst # Remind the user that they need to run "freebsd-update install" # to install the downloaded bits, in case they didn't RTFM. echo "To install the downloaded upgrades, run \"$0 install\"." } # Make sure that all the file hashes mentioned in $@ have corresponding # gzipped files stored in /files/. install_verify () { # Generate a list of hashes cat $@ | cut -f 2,7 -d '|' | grep -E '^f' | cut -f 2 -d '|' | sort -u > filelist # Make sure all the hashes exist while read HASH; do if ! [ -f files/${HASH}.gz ]; then echo -n "Update files missing -- " echo "this should never happen." echo "Re-run '$0 fetch'." return 1 fi done < filelist # Clean up rm filelist } # Remove the system immutable flag from files install_unschg () { # Generate file list cat $@ | cut -f 1 -d '|' > filelist # Remove flags while read F; do if ! [ -e ${BASEDIR}/${F} ]; then continue fi chflags noschg ${BASEDIR}/${F} || return 1 done < filelist # Clean up rm filelist } # Decide which directory name to use for kernel backups. backup_kernel_finddir () { CNT=0 while true ; do # Pathname does not exist, so it is OK use that name # for backup directory. if [ ! -e $BASEDIR/$BACKUPKERNELDIR ]; then return 0 fi # If directory do exist, we only use if it has our # marker file. if [ -d $BASEDIR/$BACKUPKERNELDIR -a \ -e $BASEDIR/$BACKUPKERNELDIR/.freebsd-update ]; then return 0 fi # We could not use current directory name, so add counter to # the end and try again. CNT=$((CNT + 1)) if [ $CNT -gt 9 ]; then echo "Could not find valid backup dir ($BASEDIR/$BACKUPKERNELDIR)" exit 1 fi BACKUPKERNELDIR="`echo $BACKUPKERNELDIR | sed -Ee 's/[0-9]\$//'`" BACKUPKERNELDIR="${BACKUPKERNELDIR}${CNT}" done } # Backup the current kernel using hardlinks, if not disabled by user. # Since we delete all files in the directory used for previous backups # we create a marker file called ".freebsd-update" in the directory so # we can determine on the next run that the directory was created by # freebsd-update and we then do not accidentally remove user files in # the unlikely case that the user has created a directory with a # conflicting name. backup_kernel () { # Only make kernel backup is so configured. if [ $BACKUPKERNEL != yes ]; then return 0 fi # Decide which directory name to use for kernel backups. backup_kernel_finddir # Remove old kernel backup files. If $BACKUPKERNELDIR was # "not ours", backup_kernel_finddir would have exited, so # deleting the directory content is as safe as we can make it. if [ -d $BASEDIR/$BACKUPKERNELDIR ]; then rm -fr $BASEDIR/$BACKUPKERNELDIR fi # Create directories for backup. mkdir -p $BASEDIR/$BACKUPKERNELDIR mtree -cdn -p "${BASEDIR}/${KERNELDIR}" | \ mtree -Ue -p "${BASEDIR}/${BACKUPKERNELDIR}" > /dev/null # Mark the directory as having been created by freebsd-update. touch $BASEDIR/$BACKUPKERNELDIR/.freebsd-update if [ $? -ne 0 ]; then echo "Could not create kernel backup directory" exit 1 fi # Disable pathname expansion to be sure *.symbols is not # expanded. set -f # Use find to ignore symbol files, unless disabled by user. if [ $BACKUPKERNELSYMBOLFILES = yes ]; then FINDFILTER="" else FINDFILTER=-"a ! -name *.symbols" fi # Backup all the kernel files using hardlinks. (cd ${BASEDIR}/${KERNELDIR} && find . -type f $FINDFILTER -exec \ cp -pl '{}' ${BASEDIR}/${BACKUPKERNELDIR}/'{}' \;) # Re-enable patchname expansion. set +f } # Install new files install_from_index () { # First pass: Do everything apart from setting file flags. We # can't set flags yet, because schg inhibits hard linking. sort -k 1,1 -t '|' $1 | tr '|' ' ' | while read FPATH TYPE OWNER GROUP PERM FLAGS HASH LINK; do case ${TYPE} in d) # Create a directory install -d -o ${OWNER} -g ${GROUP} \ -m ${PERM} ${BASEDIR}/${FPATH} ;; f) if [ -z "${LINK}" ]; then # Create a file, without setting flags. gunzip < files/${HASH}.gz > ${HASH} install -S -o ${OWNER} -g ${GROUP} \ -m ${PERM} ${HASH} ${BASEDIR}/${FPATH} rm ${HASH} else # Create a hard link. ln -f ${BASEDIR}/${LINK} ${BASEDIR}/${FPATH} fi ;; L) # Create a symlink ln -sfh ${HASH} ${BASEDIR}/${FPATH} ;; esac done # Perform a second pass, adding file flags. tr '|' ' ' < $1 | while read FPATH TYPE OWNER GROUP PERM FLAGS HASH LINK; do if [ ${TYPE} = "f" ] && ! [ ${FLAGS} = "0" ]; then chflags ${FLAGS} ${BASEDIR}/${FPATH} fi done } # Remove files which we want to delete install_delete () { # Generate list of new files cut -f 1 -d '|' < $2 | sort > newfiles # Generate subindex of old files we want to nuke sort -k 1,1 -t '|' $1 | join -t '|' -v 1 - newfiles | sort -r -k 1,1 -t '|' | cut -f 1,2 -d '|' | tr '|' ' ' > killfiles # Remove the offending bits while read FPATH TYPE; do case ${TYPE} in d) rmdir ${BASEDIR}/${FPATH} ;; f) rm ${BASEDIR}/${FPATH} ;; L) rm ${BASEDIR}/${FPATH} ;; esac done < killfiles # Clean up rm newfiles killfiles } # Install new files, delete old files, and update linker.hints install_files () { # If we haven't already dealt with the kernel, deal with it. if ! [ -f $1/kerneldone ]; then grep -E '^/boot/' $1/INDEX-OLD > INDEX-OLD grep -E '^/boot/' $1/INDEX-NEW > INDEX-NEW # Backup current kernel before installing a new one backup_kernel || return 1 # Install new files install_from_index INDEX-NEW || return 1 # Remove files which need to be deleted install_delete INDEX-OLD INDEX-NEW || return 1 # Update linker.hints if necessary if [ -s INDEX-OLD -o -s INDEX-NEW ]; then kldxref -R ${BASEDIR}/boot/ 2>/dev/null fi # We've finished updating the kernel. touch $1/kerneldone # Do we need to ask for a reboot now? if [ -f $1/kernelfirst ] && [ -s INDEX-OLD -o -s INDEX-NEW ]; then cat <<-EOF Kernel updates have been installed. Please reboot and run "$0 install" again to finish installing updates. EOF exit 0 fi fi # If we haven't already dealt with the world, deal with it. if ! [ -f $1/worlddone ]; then # Create any necessary directories first grep -vE '^/boot/' $1/INDEX-NEW | grep -E '^[^|]+\|d\|' > INDEX-NEW install_from_index INDEX-NEW || return 1 # Install new runtime linker grep -vE '^/boot/' $1/INDEX-NEW | grep -vE '^[^|]+\|d\|' | grep -E '^/libexec/ld-elf[^|]*\.so\.[0-9]+\|' > INDEX-NEW install_from_index INDEX-NEW || return 1 # Install new shared libraries next grep -vE '^/boot/' $1/INDEX-NEW | grep -vE '^[^|]+\|d\|' | grep -vE '^/libexec/ld-elf[^|]*\.so\.[0-9]+\|' | grep -E '^[^|]*/lib/[^|]*\.so\.[0-9]+\|' > INDEX-NEW install_from_index INDEX-NEW || return 1 # Deal with everything else grep -vE '^/boot/' $1/INDEX-OLD | grep -vE '^[^|]+\|d\|' | grep -vE '^/libexec/ld-elf[^|]*\.so\.[0-9]+\|' | grep -vE '^[^|]*/lib/[^|]*\.so\.[0-9]+\|' > INDEX-OLD grep -vE '^/boot/' $1/INDEX-NEW | grep -vE '^[^|]+\|d\|' | grep -vE '^/libexec/ld-elf[^|]*\.so\.[0-9]+\|' | grep -vE '^[^|]*/lib/[^|]*\.so\.[0-9]+\|' > INDEX-NEW install_from_index INDEX-NEW || return 1 install_delete INDEX-OLD INDEX-NEW || return 1 # Rebuild /etc/spwd.db and /etc/pwd.db if necessary. if [ ${BASEDIR}/etc/master.passwd -nt ${BASEDIR}/etc/spwd.db ] || [ ${BASEDIR}/etc/master.passwd -nt ${BASEDIR}/etc/pwd.db ]; then pwd_mkdb -d ${BASEDIR}/etc ${BASEDIR}/etc/master.passwd fi # Rebuild /etc/login.conf.db if necessary. if [ ${BASEDIR}/etc/login.conf -nt ${BASEDIR}/etc/login.conf.db ]; then cap_mkdb ${BASEDIR}/etc/login.conf fi # We've finished installing the world and deleting old files # which are not shared libraries. touch $1/worlddone # Do we need to ask the user to portupgrade now? grep -vE '^/boot/' $1/INDEX-NEW | grep -E '^[^|]*/lib/[^|]*\.so\.[0-9]+\|' | cut -f 1 -d '|' | sort > newfiles if grep -vE '^/boot/' $1/INDEX-OLD | grep -E '^[^|]*/lib/[^|]*\.so\.[0-9]+\|' | cut -f 1 -d '|' | sort | join -v 1 - newfiles | grep -q .; then cat <<-EOF Completing this upgrade requires removing old shared object files. Please rebuild all installed 3rd party software (e.g., programs installed from the ports tree) and then run "$0 install" again to finish installing updates. EOF rm newfiles exit 0 fi rm newfiles fi # Remove old shared libraries grep -vE '^/boot/' $1/INDEX-NEW | grep -vE '^[^|]+\|d\|' | grep -E '^[^|]*/lib/[^|]*\.so\.[0-9]+\|' > INDEX-NEW grep -vE '^/boot/' $1/INDEX-OLD | grep -vE '^[^|]+\|d\|' | grep -E '^[^|]*/lib/[^|]*\.so\.[0-9]+\|' > INDEX-OLD install_delete INDEX-OLD INDEX-NEW || return 1 # Remove old directories grep -vE '^/boot/' $1/INDEX-NEW | grep -E '^[^|]+\|d\|' > INDEX-NEW grep -vE '^/boot/' $1/INDEX-OLD | grep -E '^[^|]+\|d\|' > INDEX-OLD install_delete INDEX-OLD INDEX-NEW || return 1 # Remove temporary files rm INDEX-OLD INDEX-NEW } # Rearrange bits to allow the installed updates to be rolled back install_setup_rollback () { # Remove the "reboot after installing kernel", "kernel updated", and # "finished installing the world" flags if present -- they are # irrelevant when rolling back updates. if [ -f ${BDHASH}-install/kernelfirst ]; then rm ${BDHASH}-install/kernelfirst rm ${BDHASH}-install/kerneldone fi if [ -f ${BDHASH}-install/worlddone ]; then rm ${BDHASH}-install/worlddone fi if [ -L ${BDHASH}-rollback ]; then mv ${BDHASH}-rollback ${BDHASH}-install/rollback fi mv ${BDHASH}-install ${BDHASH}-rollback } # Actually install updates install_run () { echo -n "Installing updates..." # Make sure we have all the files we should have install_verify ${BDHASH}-install/INDEX-OLD \ ${BDHASH}-install/INDEX-NEW || return 1 # Remove system immutable flag from files install_unschg ${BDHASH}-install/INDEX-OLD \ ${BDHASH}-install/INDEX-NEW || return 1 # Install new files, delete old files, and update linker.hints install_files ${BDHASH}-install || return 1 # Rearrange bits to allow the installed updates to be rolled back install_setup_rollback echo " done." } # Rearrange bits to allow the previous set of updates to be rolled back next. rollback_setup_rollback () { if [ -L ${BDHASH}-rollback/rollback ]; then mv ${BDHASH}-rollback/rollback rollback-tmp rm -r ${BDHASH}-rollback/ rm ${BDHASH}-rollback mv rollback-tmp ${BDHASH}-rollback else rm -r ${BDHASH}-rollback/ rm ${BDHASH}-rollback fi } # Install old files, delete new files, and update linker.hints rollback_files () { # Install old shared library files which don't have the same path as # a new shared library file. grep -vE '^/boot/' $1/INDEX-NEW | grep -E '/lib/.*\.so\.[0-9]+\|' | cut -f 1 -d '|' | sort > INDEX-NEW.libs.flist grep -vE '^/boot/' $1/INDEX-OLD | grep -E '/lib/.*\.so\.[0-9]+\|' | sort -k 1,1 -t '|' - | join -t '|' -v 1 - INDEX-NEW.libs.flist > INDEX-OLD install_from_index INDEX-OLD || return 1 # Deal with files which are neither kernel nor shared library grep -vE '^/boot/' $1/INDEX-OLD | grep -vE '/lib/.*\.so\.[0-9]+\|' > INDEX-OLD grep -vE '^/boot/' $1/INDEX-NEW | grep -vE '/lib/.*\.so\.[0-9]+\|' > INDEX-NEW install_from_index INDEX-OLD || return 1 install_delete INDEX-NEW INDEX-OLD || return 1 # Install any old shared library files which we didn't install above. grep -vE '^/boot/' $1/INDEX-OLD | grep -E '/lib/.*\.so\.[0-9]+\|' | sort -k 1,1 -t '|' - | join -t '|' - INDEX-NEW.libs.flist > INDEX-OLD install_from_index INDEX-OLD || return 1 # Delete unneeded shared library files grep -vE '^/boot/' $1/INDEX-OLD | grep -E '/lib/.*\.so\.[0-9]+\|' > INDEX-OLD grep -vE '^/boot/' $1/INDEX-NEW | grep -E '/lib/.*\.so\.[0-9]+\|' > INDEX-NEW install_delete INDEX-NEW INDEX-OLD || return 1 # Deal with kernel files grep -E '^/boot/' $1/INDEX-OLD > INDEX-OLD grep -E '^/boot/' $1/INDEX-NEW > INDEX-NEW install_from_index INDEX-OLD || return 1 install_delete INDEX-NEW INDEX-OLD || return 1 if [ -s INDEX-OLD -o -s INDEX-NEW ]; then kldxref -R /boot/ 2>/dev/null fi # Remove temporary files rm INDEX-OLD INDEX-NEW INDEX-NEW.libs.flist } # Actually rollback updates rollback_run () { echo -n "Uninstalling updates..." # If there are updates waiting to be installed, remove them; we # want the user to re-run 'fetch' after rolling back updates. if [ -L ${BDHASH}-install ]; then rm -r ${BDHASH}-install/ rm ${BDHASH}-install fi # Make sure we have all the files we should have install_verify ${BDHASH}-rollback/INDEX-NEW \ ${BDHASH}-rollback/INDEX-OLD || return 1 # Remove system immutable flag from files install_unschg ${BDHASH}-rollback/INDEX-NEW \ ${BDHASH}-rollback/INDEX-OLD || return 1 # Install old files, delete new files, and update linker.hints rollback_files ${BDHASH}-rollback || return 1 # Remove the rollback directory and the symlink pointing to it; and # rearrange bits to allow the previous set of updates to be rolled # back next. rollback_setup_rollback echo " done." } # Compare INDEX-ALL and INDEX-PRESENT and print warnings about differences. IDS_compare () { # Get all the lines which mismatch in something other than file # flags. We ignore file flags because sysinstall doesn't seem to # set them when it installs FreeBSD; warning about these adds a # very large amount of noise. cut -f 1-5,7-8 -d '|' $1 > $1.noflags sort -k 1,1 -t '|' $1.noflags > $1.sorted cut -f 1-5,7-8 -d '|' $2 | comm -13 $1.noflags - | fgrep -v '|-|||||' | sort -k 1,1 -t '|' | join -t '|' $1.sorted - > INDEX-NOTMATCHING # Ignore files which match IDSIGNOREPATHS. for X in ${IDSIGNOREPATHS}; do grep -E "^${X}" INDEX-NOTMATCHING done | sort -u | comm -13 - INDEX-NOTMATCHING > INDEX-NOTMATCHING.tmp mv INDEX-NOTMATCHING.tmp INDEX-NOTMATCHING # Go through the lines and print warnings. local IFS='|' while read FPATH TYPE OWNER GROUP PERM HASH LINK P_TYPE P_OWNER P_GROUP P_PERM P_HASH P_LINK; do # Warn about different object types. if ! [ "${TYPE}" = "${P_TYPE}" ]; then echo -n "${FPATH} is a " case "${P_TYPE}" in f) echo -n "regular file, " ;; d) echo -n "directory, " ;; L) echo -n "symlink, " ;; esac echo -n "but should be a " case "${TYPE}" in f) echo -n "regular file." ;; d) echo -n "directory." ;; L) echo -n "symlink." ;; esac echo # Skip other tests, since they don't make sense if # we're comparing different object types. continue fi # Warn about different owners. if ! [ "${OWNER}" = "${P_OWNER}" ]; then echo -n "${FPATH} is owned by user id ${P_OWNER}, " echo "but should be owned by user id ${OWNER}." fi # Warn about different groups. if ! [ "${GROUP}" = "${P_GROUP}" ]; then echo -n "${FPATH} is owned by group id ${P_GROUP}, " echo "but should be owned by group id ${GROUP}." fi # Warn about different permissions. We do not warn about # different permissions on symlinks, since some archivers # don't extract symlink permissions correctly and they are # ignored anyway. if ! [ "${PERM}" = "${P_PERM}" ] && ! [ "${TYPE}" = "L" ]; then echo -n "${FPATH} has ${P_PERM} permissions, " echo "but should have ${PERM} permissions." fi # Warn about different file hashes / symlink destinations. if ! [ "${HASH}" = "${P_HASH}" ]; then if [ "${TYPE}" = "L" ]; then echo -n "${FPATH} is a symlink to ${P_HASH}, " echo "but should be a symlink to ${HASH}." fi if [ "${TYPE}" = "f" ]; then echo -n "${FPATH} has SHA256 hash ${P_HASH}, " echo "but should have SHA256 hash ${HASH}." fi fi # We don't warn about different hard links, since some # some archivers break hard links, and as long as the # underlying data is correct they really don't matter. done < INDEX-NOTMATCHING # Clean up rm $1 $1.noflags $1.sorted $2 INDEX-NOTMATCHING } # Do the work involved in comparing the system to a "known good" index IDS_run () { workdir_init || return 1 # Prepare the mirror list. fetch_pick_server_init && fetch_pick_server # Try to fetch the public key until we run out of servers. while ! fetch_key; do fetch_pick_server || return 1 done # Try to fetch the metadata index signature ("tag") until we run # out of available servers; and sanity check the downloaded tag. while ! fetch_tag; do fetch_pick_server || return 1 done fetch_tagsanity || return 1 # Fetch INDEX-OLD and INDEX-ALL. fetch_metadata INDEX-OLD INDEX-ALL || return 1 # Generate filtered INDEX-OLD and INDEX-ALL files containing only # the components we want and without anything marked as "Ignore". fetch_filter_metadata INDEX-OLD || return 1 fetch_filter_metadata INDEX-ALL || return 1 # Merge the INDEX-OLD and INDEX-ALL files into INDEX-ALL. sort INDEX-OLD INDEX-ALL > INDEX-ALL.tmp mv INDEX-ALL.tmp INDEX-ALL rm INDEX-OLD # Translate /boot/${KERNCONF} to ${KERNELDIR} fetch_filter_kernel_names INDEX-ALL ${KERNCONF} # Inspect the system and generate an INDEX-PRESENT file. fetch_inspect_system INDEX-ALL INDEX-PRESENT /dev/null || return 1 # Compare INDEX-ALL and INDEX-PRESENT and print warnings about any # differences. IDS_compare INDEX-ALL INDEX-PRESENT } #### Main functions -- call parameter-handling and core functions # Using the command line, configuration file, and defaults, # set all the parameters which are needed later. get_params () { init_params parse_cmdline $@ parse_conffile default_params } # Fetch command. Make sure that we're being called # interactively, then run fetch_check_params and fetch_run cmd_fetch () { - if [ ! -t 0 && $NOTTYOK -eq 0 ]; then + if [ ! -t 0 -a $NOTTYOK -eq 0 ]; then echo -n "`basename $0` fetch should not " echo "be run non-interactively." echo "Run `basename $0` cron instead." exit 1 fi fetch_check_params fetch_run || exit 1 } # Cron command. Make sure the parameters are sensible; wait # rand(3600) seconds; then fetch updates. While fetching updates, # send output to a temporary file; only print that file if the # fetching failed. cmd_cron () { fetch_check_params sleep `jot -r 1 0 3600` TMPFILE=`mktemp /tmp/freebsd-update.XXXXXX` || exit 1 if ! fetch_run >> ${TMPFILE} || ! grep -q "No updates needed" ${TMPFILE} || [ ${VERBOSELEVEL} = "debug" ]; then mail -s "`hostname` security updates" ${MAILTO} < ${TMPFILE} fi rm ${TMPFILE} } # Fetch files for upgrading to a new release. cmd_upgrade () { upgrade_check_params upgrade_run || exit 1 } # Install downloaded updates. cmd_install () { install_check_params install_run || exit 1 } # Rollback most recently installed updates. cmd_rollback () { rollback_check_params rollback_run || exit 1 } # Compare system against a "known good" index. cmd_IDS () { IDS_check_params IDS_run || exit 1 } #### Entry point # Make sure we find utilities from the base system export PATH=/sbin:/bin:/usr/sbin:/usr/bin:${PATH} # Set a pager if the user doesn't if [ -z "$PAGER" ]; then PAGER=/usr/bin/more fi # Set LC_ALL in order to avoid problems with character ranges like [A-Z]. export LC_ALL=C get_params $@ for COMMAND in ${COMMANDS}; do cmd_${COMMAND} done Index: user/ngie/more-tests/usr.sbin/vidcontrol/vidcontrol.c =================================================================== --- user/ngie/more-tests/usr.sbin/vidcontrol/vidcontrol.c (revision 281584) +++ user/ngie/more-tests/usr.sbin/vidcontrol/vidcontrol.c (revision 281585) @@ -1,1480 +1,1480 @@ /*- * Copyright (c) 1994-1996 Søren Schmidt * All rights reserved. * * Portions of this software are based in part on the work of * Sascha Wildner contributed to The DragonFly Project * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer, * in this position and unchanged. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. The name of the author may not be used to endorse or promote products * derived from this software without specific prior written permission * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. * * $DragonFly: src/usr.sbin/vidcontrol/vidcontrol.c,v 1.10 2005/03/02 06:08:29 joerg Exp $ */ #ifndef lint static const char rcsid[] = "$FreeBSD$"; #endif /* not lint */ #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include "path.h" #include "decode.h" #define DATASIZE(x) ((x).w * (x).h * 256 / 8) /* Screen dump modes */ #define DUMP_FMT_RAW 1 #define DUMP_FMT_TXT 2 /* Screen dump options */ #define DUMP_FBF 0 #define DUMP_ALL 1 /* Screen dump file format revision */ #define DUMP_FMT_REV 1 static const char *legal_colors[16] = { "black", "blue", "green", "cyan", "red", "magenta", "brown", "white", "grey", "lightblue", "lightgreen", "lightcyan", "lightred", "lightmagenta", "yellow", "lightwhite" }; static struct { int active_vty; vid_info_t console_info; unsigned char screen_map[256]; int video_mode_number; struct video_info video_mode_info; } cur_info; struct vt4font_header { uint8_t magic[8]; uint8_t width; uint8_t height; uint16_t pad; uint32_t glyph_count; uint32_t map_count[4]; } __packed; static int hex = 0; static int vesa_cols; static int vesa_rows; static int font_height; static int colors_changed; static int video_mode_changed; static int normal_fore_color, normal_back_color; static int revers_fore_color, revers_back_color; static int vt4_mode = 0; static struct vid_info info; static struct video_info new_mode_info; /* * Initialize revert data. * * NOTE: the following parameters are not yet saved/restored: * * screen saver timeout * cursor type * mouse character and mouse show/hide state * vty switching on/off state * history buffer size * history contents * font maps */ static void init(void) { if (ioctl(0, VT_GETACTIVE, &cur_info.active_vty) == -1) errc(1, errno, "getting active vty"); cur_info.console_info.size = sizeof(cur_info.console_info); if (ioctl(0, CONS_GETINFO, &cur_info.console_info) == -1) errc(1, errno, "getting console information"); /* vt(4) use unicode, so no screen mapping required. */ if (vt4_mode == 0 && ioctl(0, GIO_SCRNMAP, &cur_info.screen_map) == -1) errc(1, errno, "getting screen map"); if (ioctl(0, CONS_GET, &cur_info.video_mode_number) == -1) errc(1, errno, "getting video mode number"); cur_info.video_mode_info.vi_mode = cur_info.video_mode_number; if (ioctl(0, CONS_MODEINFO, &cur_info.video_mode_info) == -1) errc(1, errno, "getting video mode parameters"); normal_fore_color = cur_info.console_info.mv_norm.fore; normal_back_color = cur_info.console_info.mv_norm.back; revers_fore_color = cur_info.console_info.mv_rev.fore; revers_back_color = cur_info.console_info.mv_rev.back; } /* * If something goes wrong along the way we call revert() to go back to the * console state we came from (which is assumed to be working). * * NOTE: please also read the comments of init(). */ static void revert(void) { int size[3]; ioctl(0, VT_ACTIVATE, cur_info.active_vty); fprintf(stderr, "\033[=%dA", cur_info.console_info.mv_ovscan); fprintf(stderr, "\033[=%dF", cur_info.console_info.mv_norm.fore); fprintf(stderr, "\033[=%dG", cur_info.console_info.mv_norm.back); fprintf(stderr, "\033[=%dH", cur_info.console_info.mv_rev.fore); fprintf(stderr, "\033[=%dI", cur_info.console_info.mv_rev.back); if (vt4_mode == 0) ioctl(0, PIO_SCRNMAP, &cur_info.screen_map); if (cur_info.video_mode_number >= M_VESA_BASE) ioctl(0, _IO('V', cur_info.video_mode_number - M_VESA_BASE), NULL); else ioctl(0, _IO('S', cur_info.video_mode_number), NULL); if (cur_info.video_mode_info.vi_flags & V_INFO_GRAPHICS) { size[0] = cur_info.video_mode_info.vi_width / 8; size[1] = cur_info.video_mode_info.vi_height / cur_info.console_info.font_size; size[2] = cur_info.console_info.font_size; ioctl(0, KDRASTER, size); } } /* * Print a short usage string describing all options, then exit. */ static void usage(void) { if (vt4_mode) fprintf(stderr, "%s\n%s\n%s\n%s\n%s\n", "usage: vidcontrol [-CHPpx] [-b color] [-c appearance] [-f [[size] file]]", " [-g geometry] [-h size] [-i adapter | mode]", " [-M char] [-m on | off] [-r foreground background]", " [-S on | off] [-s number] [-T xterm | cons25] [-t N | off]", " [mode] [foreground [background]] [show]"); else fprintf(stderr, "%s\n%s\n%s\n%s\n%s\n", "usage: vidcontrol [-CdHLPpx] [-b color] [-c appearance] [-f [size] file]", " [-g geometry] [-h size] [-i adapter | mode] [-l screen_map]", " [-M char] [-m on | off] [-r foreground background]", " [-S on | off] [-s number] [-T xterm | cons25] [-t N | off]", " [mode] [foreground [background]] [show]"); exit(1); } /* Detect presence of vt(4). */ static int is_vt4(void) { char vty_name[4] = ""; size_t len = sizeof(vty_name); if (sysctlbyname("kern.vty", vty_name, &len, NULL, 0) != 0) return (0); return (strcmp(vty_name, "vt") == 0); } /* * Retrieve the next argument from the command line (for options that require * more than one argument). */ static char * nextarg(int ac, char **av, int *indp, int oc, int strict) { if (*indp < ac) return(av[(*indp)++]); if (strict != 0) { revert(); errx(1, "option requires two arguments -- %c", oc); } return(NULL); } /* * Guess which file to open. Try to open each combination of a specified set * of file name components. */ static FILE * openguess(const char *a[], const char *b[], const char *c[], const char *d[], char **name) { FILE *f; int i, j, k, l; for (i = 0; a[i] != NULL; i++) { for (j = 0; b[j] != NULL; j++) { for (k = 0; c[k] != NULL; k++) { for (l = 0; d[l] != NULL; l++) { asprintf(name, "%s%s%s%s", a[i], b[j], c[k], d[l]); f = fopen(*name, "r"); if (f != NULL) return (f); free(*name); } } } } return (NULL); } /* * Load a screenmap from a file and set it. */ static void load_scrnmap(const char *filename) { FILE *fd; int size; char *name; scrmap_t scrnmap; const char *a[] = {"", SCRNMAP_PATH, NULL}; const char *b[] = {filename, NULL}; const char *c[] = {"", ".scm", NULL}; const char *d[] = {"", NULL}; fd = openguess(a, b, c, d, &name); if (fd == NULL) { revert(); errx(1, "screenmap file not found"); } size = sizeof(scrnmap); if (decode(fd, (char *)&scrnmap, size) != size) { rewind(fd); if (fread(&scrnmap, 1, size, fd) != (size_t)size) { warnx("bad screenmap file"); fclose(fd); revert(); errx(1, "bad screenmap file"); } } if (ioctl(0, PIO_SCRNMAP, &scrnmap) == -1) { revert(); errc(1, errno, "loading screenmap"); } fclose(fd); } /* * Set the default screenmap. */ static void load_default_scrnmap(void) { scrmap_t scrnmap; int i; for (i=0; i<256; i++) *((char*)&scrnmap + i) = i; if (ioctl(0, PIO_SCRNMAP, &scrnmap) == -1) { revert(); errc(1, errno, "loading default screenmap"); } } /* * Print the current screenmap to stdout. */ static void print_scrnmap(void) { unsigned char map[256]; size_t i; if (ioctl(0, GIO_SCRNMAP, &map) == -1) { revert(); errc(1, errno, "getting screenmap"); } for (i=0; i= M_VESA_BASE) mode = _IO('V', new_mode_num - M_VESA_BASE); else mode = _IO('S', new_mode_num); } /* * Try setting the new mode. */ if (ioctl(0, mode, NULL) == -1) { revert(); errc(1, errno, "setting video mode"); } /* * For raster modes it's not enough to just set the mode. * We also need to explicitly set the raster mode. */ if (new_mode_info.vi_flags & V_INFO_GRAPHICS) { /* font size */ if (font_height == 0) font_height = cur_info.console_info.font_size; size[2] = font_height; /* adjust columns */ if ((vesa_cols * 8 > new_mode_info.vi_width) || (vesa_cols <= 0)) { size[0] = new_mode_info.vi_width / 8; } else { size[0] = vesa_cols; } /* adjust rows */ if ((vesa_rows * font_height > new_mode_info.vi_height) || (vesa_rows <= 0)) { size[1] = new_mode_info.vi_height / font_height; } else { size[1] = vesa_rows; } /* set raster mode */ if (ioctl(0, KDRASTER, size)) { ioerr = errno; if (cur_mode >= M_VESA_BASE) ioctl(0, _IO('V', cur_mode - M_VESA_BASE), NULL); else ioctl(0, _IO('S', cur_mode), NULL); revert(); warnc(ioerr, "cannot activate raster display"); return EXIT_FAILURE; } } video_mode_changed = 1; (*mode_index)++; } return EXIT_SUCCESS; } /* * Return the number for a specified color name. */ static int get_color_number(char *color) { int i; for (i=0; i<16; i++) { if (!strcmp(color, legal_colors[i])) return i; } return -1; } /* * Get normal text and background colors. */ static void get_normal_colors(int argc, char **argv, int *_index) { int color; if (*_index < argc && (color = get_color_number(argv[*_index])) != -1) { (*_index)++; fprintf(stderr, "\033[=%dF", color); normal_fore_color=color; colors_changed = 1; if (*_index < argc && (color = get_color_number(argv[*_index])) != -1 && color < 8) { (*_index)++; fprintf(stderr, "\033[=%dG", color); normal_back_color=color; } } } /* * Get reverse text and background colors. */ static void get_reverse_colors(int argc, char **argv, int *_index) { int color; if ((color = get_color_number(argv[*(_index)-1])) != -1) { fprintf(stderr, "\033[=%dH", color); revers_fore_color=color; colors_changed = 1; if (*_index < argc && (color = get_color_number(argv[*_index])) != -1 && color < 8) { (*_index)++; fprintf(stderr, "\033[=%dI", color); revers_back_color=color; } } } /* * Set normal and reverse foreground and background colors. */ static void set_colors(void) { fprintf(stderr, "\033[=%dF", normal_fore_color); fprintf(stderr, "\033[=%dG", normal_back_color); fprintf(stderr, "\033[=%dH", revers_fore_color); fprintf(stderr, "\033[=%dI", revers_back_color); } /* * Switch to virtual terminal #arg. */ static void set_console(char *arg) { int n; if(!arg || strspn(arg,"0123456789") != strlen(arg)) { revert(); errx(1, "bad console number"); } n = atoi(arg); if (n < 1 || n > 16) { revert(); errx(1, "console number out of range"); } else if (ioctl(0, VT_ACTIVATE, n) == -1) { revert(); errc(1, errno, "switching vty"); } } /* * Sets the border color. */ static void set_border_color(char *arg) { int color; if ((color = get_color_number(arg)) != -1) { fprintf(stderr, "\033[=%dA", color); } else usage(); } static void set_mouse_char(char *arg) { struct mouse_info mouse; long l; l = strtol(arg, NULL, 0); if ((l < 0) || (l > UCHAR_MAX - 3)) { revert(); warnx("argument to -M must be 0 through %d", UCHAR_MAX - 3); return; } mouse.operation = MOUSE_MOUSECHAR; mouse.u.mouse_char = (int)l; if (ioctl(0, CONS_MOUSECTL, &mouse) == -1) { revert(); errc(1, errno, "setting mouse character"); } } /* * Show/hide the mouse. */ static void set_mouse(char *arg) { struct mouse_info mouse; if (!strcmp(arg, "on")) { mouse.operation = MOUSE_SHOW; } else if (!strcmp(arg, "off")) { mouse.operation = MOUSE_HIDE; } else { revert(); errx(1, "argument to -m must be either on or off"); } if (ioctl(0, CONS_MOUSECTL, &mouse) == -1) { revert(); errc(1, errno, "%sing the mouse", mouse.operation == MOUSE_SHOW ? "show" : "hid"); } } static void set_lockswitch(char *arg) { int data; if (!strcmp(arg, "off")) { data = 0x01; } else if (!strcmp(arg, "on")) { data = 0x02; } else { revert(); errx(1, "argument to -S must be either on or off"); } if (ioctl(0, VT_LOCKSWITCH, &data) == -1) { revert(); errc(1, errno, "turning %s vty switching", data == 0x01 ? "off" : "on"); } } /* * Return the adapter name for a specified type. */ static const char *adapter_name(int type) { static struct { int type; const char *name; } names[] = { { KD_MONO, "MDA" }, { KD_HERCULES, "Hercules" }, { KD_CGA, "CGA" }, { KD_EGA, "EGA" }, { KD_VGA, "VGA" }, { KD_PC98, "PC-98xx" }, { KD_TGA, "TGA" }, { -1, "Unknown" }, }; int i; for (i = 0; names[i].type != -1; ++i) if (names[i].type == type) break; return names[i].name; } /* * Show graphics adapter information. */ static void show_adapter_info(void) { struct video_adapter_info ad; ad.va_index = 0; if (ioctl(0, CONS_ADPINFO, &ad) == -1) { revert(); errc(1, errno, "obtaining adapter information"); } printf("fb%d:\n", ad.va_index); printf(" %.*s%d, type:%s%s (%d), flags:0x%x\n", (int)sizeof(ad.va_name), ad.va_name, ad.va_unit, (ad.va_flags & V_ADP_VESA) ? "VESA " : "", adapter_name(ad.va_type), ad.va_type, ad.va_flags); printf(" initial mode:%d, current mode:%d, BIOS mode:%d\n", ad.va_initial_mode, ad.va_mode, ad.va_initial_bios_mode); printf(" frame buffer window:0x%zx, buffer size:0x%zx\n", ad.va_window, ad.va_buffer_size); printf(" window size:0x%zx, origin:0x%x\n", ad.va_window_size, ad.va_window_orig); printf(" display start address (%d, %d), scan line width:%d\n", ad.va_disp_start.x, ad.va_disp_start.y, ad.va_line_width); printf(" reserved:0x%zx\n", ad.va_unused0); } /* * Show video mode information. */ static void show_mode_info(void) { char buf[80]; struct video_info _info; int c; int mm; int mode; printf(" mode# flags type size " "font window linear buffer\n"); printf("---------------------------------------" "---------------------------------------\n"); for (mode = 0; mode <= M_VESA_MODE_MAX; ++mode) { _info.vi_mode = mode; if (ioctl(0, CONS_MODEINFO, &_info)) continue; if (_info.vi_mode != mode) continue; printf("%3d (0x%03x)", mode, mode); printf(" 0x%08x", _info.vi_flags); if (_info.vi_flags & V_INFO_GRAPHICS) { c = 'G'; if (_info.vi_mem_model == V_INFO_MM_PLANAR) snprintf(buf, sizeof(buf), "%dx%dx%d %d", _info.vi_width, _info.vi_height, _info.vi_depth, _info.vi_planes); else { switch (_info.vi_mem_model) { case V_INFO_MM_PACKED: mm = 'P'; break; case V_INFO_MM_DIRECT: mm = 'D'; break; case V_INFO_MM_CGA: mm = 'C'; break; case V_INFO_MM_HGC: mm = 'H'; break; case V_INFO_MM_VGAX: mm = 'V'; break; default: mm = ' '; break; } snprintf(buf, sizeof(buf), "%dx%dx%d %c", _info.vi_width, _info.vi_height, _info.vi_depth, mm); } } else { c = 'T'; snprintf(buf, sizeof(buf), "%dx%d", _info.vi_width, _info.vi_height); } printf(" %c %-15s", c, buf); snprintf(buf, sizeof(buf), "%dx%d", _info.vi_cwidth, _info.vi_cheight); printf(" %-5s", buf); printf(" 0x%05zx %2dk %2dk", _info.vi_window, (int)_info.vi_window_size/1024, (int)_info.vi_window_gran/1024); printf(" 0x%08zx %dk\n", _info.vi_buffer, (int)_info.vi_buffer_size/1024); } } static void show_info(char *arg) { if (!strcmp(arg, "adapter")) { show_adapter_info(); } else if (!strcmp(arg, "mode")) { show_mode_info(); } else { revert(); errx(1, "argument to -i must be either adapter or mode"); } } static void test_frame(void) { int i, cur_mode, fore; fore = 15; if (ioctl(0, CONS_GET, &cur_mode) < 0) err(1, "must be on a virtual console"); switch (cur_mode) { case M_PC98_80x25: case M_PC98_80x30: fore = 7; break; } fprintf(stdout, "\033[=0G\n\n"); for (i=0; i<8; i++) { fprintf(stdout, "\033[=%dF\033[=0G %2d \033[=%dF%-16s" "\033[=%dF\033[=0G %2d \033[=%dF%-16s " "\033[=%dF %2d \033[=%dGBACKGROUND\033[=0G\n", fore, i, i, legal_colors[i], fore, i+8, i+8, legal_colors[i+8], fore, i, i); } fprintf(stdout, "\033[=%dF\033[=%dG\033[=%dH\033[=%dI\n", info.mv_norm.fore, info.mv_norm.back, info.mv_rev.fore, info.mv_rev.back); } /* * Snapshot the video memory of that terminal, using the CONS_SCRSHOT * ioctl, and writes the results to stdout either in the special * binary format (see manual page for details), or in the plain * text format. */ static void dump_screen(int mode, int opt) { scrshot_t shot; vid_info_t _info; _info.size = sizeof(_info); if (ioctl(0, CONS_GETINFO, &_info) == -1) { revert(); errc(1, errno, "obtaining current video mode parameters"); return; } shot.x = shot.y = 0; shot.xsize = _info.mv_csz; shot.ysize = _info.mv_rsz; if (opt == DUMP_ALL) shot.ysize += _info.mv_hsz; shot.buf = alloca(shot.xsize * shot.ysize * sizeof(u_int16_t)); if (shot.buf == NULL) { revert(); errx(1, "failed to allocate memory for dump"); } if (ioctl(0, CONS_SCRSHOT, &shot) == -1) { revert(); errc(1, errno, "dumping screen"); } if (mode == DUMP_FMT_RAW) { printf("SCRSHOT_%c%c%c%c", DUMP_FMT_REV, 2, shot.xsize, shot.ysize); fflush(stdout); write(STDOUT_FILENO, shot.buf, shot.xsize * shot.ysize * sizeof(u_int16_t)); } else { char *line; int x, y; u_int16_t ch; line = alloca(shot.xsize + 1); if (line == NULL) { revert(); errx(1, "failed to allocate memory for line buffer"); } for (y = 0; y < shot.ysize; y++) { for (x = 0; x < shot.xsize; x++) { ch = shot.buf[x + (y * shot.xsize)]; ch &= 0xff; if (isprint(ch) == 0) ch = ' '; line[x] = (char)ch; } /* Trim trailing spaces */ do { line[x--] = '\0'; } while (line[x] == ' ' && x != 0); puts(line); } fflush(stdout); } } /* * Set the console history buffer size. */ static void set_history(char *opt) { int size; size = atoi(opt); if ((*opt == '\0') || size < 0) { revert(); errx(1, "argument must be a positive number"); } if (ioctl(0, CONS_HISTORY, &size) == -1) { revert(); errc(1, errno, "setting history buffer size"); } } /* * Clear the console history buffer. */ static void clear_history(void) { if (ioctl(0, CONS_CLRHIST) == -1) { revert(); errc(1, errno, "clearing history buffer"); } } static void set_terminal_mode(char *arg) { if (strcmp(arg, "xterm") == 0) fprintf(stderr, "\033[=T"); else if (strcmp(arg, "cons25") == 0) fprintf(stderr, "\033[=1T"); } int main(int argc, char **argv) { char *font, *type, *termmode; const char *opts; int dumpmod, dumpopt, opt; int reterr; vt4_mode = is_vt4(); init(); info.size = sizeof(info); if (ioctl(0, CONS_GETINFO, &info) == -1) err(1, "must be on a virtual console"); dumpmod = 0; dumpopt = DUMP_FBF; termmode = NULL; if (vt4_mode) opts = "b:Cc:fg:h:Hi:M:m:pPr:S:s:T:t:x"; else - opts = "b:Cc:df:g:h:Hi:l:LM:m:pPr:S:s:T:t:x"; + opts = "b:Cc:dfg:h:Hi:l:LM:m:pPr:S:s:T:t:x"; while ((opt = getopt(argc, argv, opts)) != -1) switch(opt) { case 'b': set_border_color(optarg); break; case 'C': clear_history(); break; case 'c': set_cursor_type(optarg); break; case 'd': if (vt4_mode) break; print_scrnmap(); break; case 'f': optarg = nextarg(argc, argv, &optind, 'f', 0); if (optarg != NULL) { font = nextarg(argc, argv, &optind, 'f', 0); if (font == NULL) { type = NULL; font = optarg; } else type = optarg; load_font(type, font); } else { if (!vt4_mode) usage(); /* Switch syscons to ROM? */ load_default_vt4font(); } break; case 'g': if (sscanf(optarg, "%dx%d", &vesa_cols, &vesa_rows) != 2) { revert(); warnx("incorrect geometry: %s", optarg); usage(); } break; case 'h': set_history(optarg); break; case 'H': dumpopt = DUMP_ALL; break; case 'i': show_info(optarg); break; case 'l': if (vt4_mode) break; load_scrnmap(optarg); break; case 'L': if (vt4_mode) break; load_default_scrnmap(); break; case 'M': set_mouse_char(optarg); break; case 'm': set_mouse(optarg); break; case 'p': dumpmod = DUMP_FMT_RAW; break; case 'P': dumpmod = DUMP_FMT_TXT; break; case 'r': get_reverse_colors(argc, argv, &optind); break; case 'S': set_lockswitch(optarg); break; case 's': set_console(optarg); break; case 'T': if (strcmp(optarg, "xterm") != 0 && strcmp(optarg, "cons25") != 0) usage(); termmode = optarg; break; case 't': set_screensaver_timeout(optarg); break; case 'x': hex = 1; break; default: usage(); } if (dumpmod != 0) dump_screen(dumpmod, dumpopt); reterr = video_mode(argc, argv, &optind); get_normal_colors(argc, argv, &optind); if (optind < argc && !strcmp(argv[optind], "show")) { test_frame(); optind++; } video_mode(argc, argv, &optind); if (termmode != NULL) set_terminal_mode(termmode); get_normal_colors(argc, argv, &optind); if (colors_changed || video_mode_changed) { if (!(new_mode_info.vi_flags & V_INFO_GRAPHICS)) { if ((normal_back_color < 8) && (revers_back_color < 8)) { set_colors(); } else { revert(); errx(1, "bg color for text modes must be < 8"); } } else { set_colors(); } } if ((optind != argc) || (argc == 1)) usage(); return reterr; } Index: user/ngie/more-tests =================================================================== --- user/ngie/more-tests (revision 281584) +++ user/ngie/more-tests (revision 281585) Property changes on: user/ngie/more-tests ___________________________________________________________________ Modified: svn:mergeinfo ## -0,0 +0,1 ## Merged /head:r281504-281584