Index: projects/runtime-coverage/UPDATING =================================================================== --- projects/runtime-coverage/UPDATING (revision 322921) +++ projects/runtime-coverage/UPDATING (revision 322922) @@ -1,1958 +1,1963 @@ Updating Information for FreeBSD current users. This file is maintained and copyrighted by M. Warner Losh . See end of file for further details. For commonly done items, please see the COMMON ITEMS: section later in the file. These instructions assume that you basically know what you are doing. If not, then please consult the FreeBSD handbook: https://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/updating-src.html Items affecting the ports and packages system can be found in /usr/ports/UPDATING. Please read that file before running portupgrade. NOTE: FreeBSD has switched from gcc to clang. If you have trouble bootstrapping from older versions of FreeBSD, try WITHOUT_CLANG and WITH_GCC to bootstrap to the tip of head, and then rebuild without this option. The bootstrap process from older version of current across the gcc/clang cutover is a bit fragile. NOTE TO PEOPLE WHO THINK THAT FreeBSD 12.x IS SLOW: FreeBSD 12.x has many debugging features turned on, in both the kernel and userland. These features attempt to detect incorrect use of system primitives, and encourage loud failure through extra sanity checking and fail stop semantics. They also substantially impact system performance. If you want to do performance measurement, benchmarking, and optimization, you'll want to turn them off. This includes various WITNESS- related kernel options, INVARIANTS, malloc debugging flags in userland, and various verbose features in the kernel. Many developers choose to disable these features on build machines to maximize performance. (To completely disable malloc debugging, define MALLOC_PRODUCTION in /etc/make.conf, or to merely disable the most expensive debugging functionality run "ln -s 'abort:false,junk:false' /etc/malloc.conf".) ****************************** SPECIAL WARNING: ****************************** Due to a bug in some versions of clang that's very hard to workaround in the upgrade process, to upgrade to -current you must first upgrade either stable/9 after r286035 or stable/10 after r286033 (including 10.3-RELEASE) or current after r286007 (including stable/11 and 11.0-RELEASE). These revisions post-date the 10.2 and 9.3 releases, so you'll need to take the unusual step of upgrading to the tip of the stable branch before moving to 11 or -current via a source upgrade. stable/11 and 11.0-RELEASE have working newer compiler. This differs from the historical situation where one could upgrade from anywhere on the last couple of stable branches, so be careful. If you're running a hybrid system on 9.x or 10.x with an updated clang compiler or are using an supported external toolchain, the build system will allow the upgrade. Otherwise it will print a reminder. ****************************** SPECIAL WARNING: ****************************** +20170825: + Move PMTUD blackhole counters to TCPSTATS and remove them from bare + sysctl values. Minor nit, but requires a rebuild of both world/kernel + to complete. + 20170814: "make check" behavior (made in ^/head@r295380) has been changed to execute from a limited sandbox, as opposed to executing from ${TESTSDIR}. Behavioral changes: - The "beforecheck" and "aftercheck" targets are now specified. - ${CHECKDIR} (added in commit noted above) has been removed. 
- Legacy behavior can be enabled by setting WITHOUT_MAKE_CHECK_USE_SANDBOX in src.conf(5) or the environment.

	If the limited sandbox mode is enabled, "make check" will execute "make distribution", then install, execute the tests, and clean up the sandbox if successful.

	The "make distribution" and "make install" targets are typically run as root to set appropriate permissions and ownership at installation time. The end-user should set "WITH_INSTALL_AS_USER" in src.conf(5) or the environment if executing "make check" with limited sandbox mode using an unprivileged user.

20170808:
	Since the switch to GPT disk labels, fsck for UFS/FFS has been unable to automatically find alternate superblocks. As of r322297, the information needed to find alternate superblocks has been moved to the end of the area reserved for the boot block. Filesystems created with a newfs of this vintage or later will create the recovery information. If you have a filesystem created prior to this change and wish to have a recovery block created for your filesystem, you can do so by running fsck in foreground mode (i.e., do not use the -p or -y options). As it starts, fsck will ask ``SAVE DATA TO FIND ALTERNATE SUPERBLOCKS'' to which you should answer yes.

20170728:
	As of r321665, an NFSv4 server configuration that services Kerberos mounts or clients that do not support the uid/gid in owner/owner_group string capability must explicitly enable the nfsuserd daemon by adding nfsuserd_enable="YES" to the machine's /etc/rc.conf file.

20170722:
	Clang, llvm, lldb, compiler-rt and libc++ have been upgraded to 5.0.0. Please see the 20141231 entry below for information about prerequisites and upgrading, if you are not already using clang 3.5.0 or higher.

20170701:
	WITHOUT_RCMDS is now the default. Set WITH_RCMDS if you need the r-commands (rlogin, rsh, etc.) to be built with the base system.

20170625:
	The FreeBSD/powerpc platform now uses a 64-bit type for time_t. This is a major, ABI-incompatible change, so users of FreeBSD/powerpc must be careful when performing source upgrades. It is best to run 'make installworld' from an alternate root system, either a live CD/memory stick, or a temporary root partition. Additionally, all ports must be recompiled. powerpc64 is largely unaffected, except in the case of 32-bit compatibility. All 32-bit binaries will be affected.

20170623:
	Forward compatibility for the "ino64" project has been committed. This will allow most new binaries to run on older kernels in a limited fashion. This prevents many of the common foot-shooting actions during the upgrade, and also provides a limited ability to roll back the kernel across the ino64 upgrade. Complicated use cases may not work properly, though enough simpler ones work to allow recovery in most situations.

20170620:
	Switch back to the BSDL dtc (Device Tree Compiler). Set WITH_GPL_DTC if you require the GPL compiler.

20170618:
	The internal ABI used for communication between the NFS kernel modules was changed by r320085, so __FreeBSD_version was bumped to ensure all the NFS related modules are updated together.

20170617:
	The ABI of struct event was changed by extending the data member to 64 bits and adding ext fields. For an upgrade, the same precautions as for the 20170523 "ino64" entry must be followed.

20170531:
	The GNU roff toolchain has been removed from base. To render manpages which are not supported by mandoc(1), man(1) can fall back to GNU roff from ports (and will recommend installing it).
To render roff(7) documents, consider using GNU roff from ports or the heirloom doctools roff toolchain from ports via pkg install groff or via pkg install heirloom-doctools. 20170524: The ath(4) and ath_hal(4) modules now build piecemeal to allow for smaller runtime footprint builds. This is useful for embedded systems which only require one chipset support. If you load it as a module, make sure this is in /boot/loader.conf: if_ath_load="YES" This will load the HAL, all chip/RF backends and if_ath_pci. If you have if_ath_pci in /boot/loader.conf, ensure it is after if_ath or it will not load any HAL chipset support. If you want to selectively load things (eg on ye cheape ARM/MIPS platforms where RAM is at a premium) you should: * load ath_hal * load the chip modules in question * load ath_rate, ath_dfs * load ath_main * load if_ath_pci and/or if_ath_ahb depending upon your particular bus bind type - this is where probe/attach is done. For further comments/feedback, poke adrian@ . 20170523: The "ino64" 64-bit inode project has been committed, which extends a number of types to 64 bits. Upgrading in place requires care and adherence to the documented upgrade procedure. If using a custom kernel configuration ensure that the COMPAT_FREEBSD11 option is included (as during the upgrade the system will be running the ino64 kernel with the existing world). For the safest in-place upgrade begin by removing previous build artifacts via "rm -rf /usr/obj/*". Then, carefully follow the full procedure documented below under the heading "To rebuild everything and install it on the current system." Specifically, a reboot is required after installing the new kernel before installing world. 20170424: The NATM framework including the en(4), fatm(4), hatm(4), and patm(4) devices has been removed. Consumers should plan a migration before the end-of-life date for FreeBSD 11. 20170420: GNU diff has been replaced by a BSD licensed diff. Some features of GNU diff has not been implemented, if those are needed a newer version of GNU diff is available via the diffutils package under the gdiff name. 20170413: As of r316810 for ipfilter, keep frags is no longer assumed when keep state is specified in a rule. r316810 aligns ipfilter with documentation in man pages separating keep frags from keep state. This allows keep state to be specified without forcing keep frags and allows keep frags to be specified independently of keep state. To maintain previous behaviour, also specify keep frags with keep state (as documented in ipf.conf.5). 20170407: arm64 builds now use the base system LLD 4.0.0 linker by default, instead of requiring that the aarch64-binutils port or package be installed. To continue using aarch64-binutils, set CROSS_BINUTILS_PREFIX=/usr/local/aarch64-freebsd/bin . 20170405: The UDP optimization in entry 20160818 that added the sysctl net.inet.udp.require_l2_bcast has been reverted. L2 broadcast packets will no longer be treated as L3 broadcast packets. 20170331: Binds and sends to the loopback addresses, IPv6 and IPv4, will now use any explicitly assigned loopback address available in the jail instead of using the first assigned address of the jail. 20170329: The ctl.ko module no longer implements the iSCSI target frontend: cfiscsi.ko does instead. If building cfiscsi.ko as a kernel module, the module can be loaded via one of the following methods: - `cfiscsi_load="YES"` in loader.conf(5). - Add `cfiscsi` to `$kld_list` in rc.conf(5). 
- ctladm(8)/ctld(8), when compiled with iSCSI support (`WITH_ISCSI=yes` in src.conf(5)) Please see cfiscsi(4) for more details. 20170316: The mmcsd.ko module now additionally depends on geom_flashmap.ko. Also, mmc.ko and mmcsd.ko need to be a matching pair built from the same source (previously, the dependency of mmcsd.ko on mmc.ko was missing, but mmcsd.ko now will refuse to load if it is incompatible with mmc.ko). 20170315: The syntax of ipfw(8) named states was changed to avoid ambiguity. If you have used named states in the firewall rules, you need to modify them after installworld and before rebooting. Now named states must be prefixed with colon. 20170311: The old drm (sys/dev/drm/) drivers for i915 and radeon have been removed as the userland we provide cannot use them. The KMS version (sys/dev/drm2) supports the same hardware. 20170302: Clang, llvm, lldb, compiler-rt and libc++ have been upgraded to 4.0.0. Please see the 20141231 entry below for information about prerequisites and upgrading, if you are not already using clang 3.5.0 or higher. 20170221: The code that provides support for ZFS .zfs/ directory functionality has been reimplemented. It's not possible now to create a snapshot by mkdir under .zfs/snapshot/. That should be the only user visible change. 20170216: EISA bus support has been removed. The WITH_EISA option is no longer valid. 20170215: MCA bus support has been removed. 20170127: The WITH_LLD_AS_LD / WITHOUT_LLD_AS_LD build knobs have been renamed WITH_LLD_IS_LD / WITHOUT_LLD_IS_LD, for consistency with CLANG_IS_CC. 20170112: The EM_MULTIQUEUE kernel configuration option is deprecated now that the em(4) driver conforms to iflib specifications. 20170109: The igb(4), em(4) and lem(4) ethernet drivers are now implemented via IFLIB. If you have a custom kernel configuration that excludes em(4) but you use igb(4), you need to re-add em(4) to your custom configuration. 20161217: Clang, llvm, lldb, compiler-rt and libc++ have been upgraded to 3.9.1. Please see the 20141231 entry below for information about prerequisites and upgrading, if you are not already using clang 3.5.0 or higher. 20161124: Clang, llvm, lldb, compiler-rt and libc++ have been upgraded to 3.9.0. Please see the 20141231 entry below for information about prerequisites and upgrading, if you are not already using clang 3.5.0 or higher. 20161119: The layout of the pmap structure has changed for powerpc to put the pmap statistics at the front for all CPU variations. libkvm(3) and all tools that link against it need to be recompiled. 20161030: isl(4) and cyapa(4) drivers now require a new driver, chromebook_platform(4), to work properly on Chromebook-class hardware. On other types of hardware the drivers may need to be configured using device hints. Please see the corresponding manual pages for details. 20161017: The urtwn(4) driver was merged into rtwn(4) and now consists of rtwn(4) main module + rtwn_usb(4) and rtwn_pci(4) bus-specific parts. Also, firmware for RTL8188CE was renamed due to possible name conflict (rtwnrtl8192cU(B) -> rtwnrtl8192cE(B)) 20161015: GNU rcs has been removed from base. It is available as packages: - rcs: Latest GPLv3 GNU rcs version. - rcs57: Copy of the latest version of GNU rcs (GPLv2) before it was removed from base. 20161008: Use of the cc_cdg, cc_chd, cc_hd, or cc_vegas congestion control modules now requires that the kernel configuration contain the TCP_HHOOK option. (This option is included in the GENERIC kernel.) 
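	For example, if you run a stripped-down custom kernel configuration and use one of these congestion control modules, a config line along the following lines (already present in GENERIC) is sufficient:

		options 	TCP_HHOOK		# needed by cc_cdg, cc_chd, cc_hd, cc_vegas
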
20161003: The WITHOUT_ELFCOPY_AS_OBJCOPY src.conf(5) knob has been retired. ELF Tool Chain's elfcopy is always installed as /usr/bin/objcopy. 20160924: Relocatable object files with the extension of .So have been renamed to use an extension of .pico instead. The purpose of this change is to avoid a name clash with shared libraries on case-insensitive file systems. On those file systems, foo.So is the same file as foo.so. 20160918: GNU rcs has been turned off by default. It can (temporarily) be built again by adding WITH_RCS knob in src.conf. Otherwise, GNU rcs is available from packages: - rcs: Latest GPLv3 GNU rcs version. - rcs57: Copy of the latest version of GNU rcs (GPLv2) from base. 20160918: The backup_uses_rcs functionality has been removed from rc.subr. 20160908: The queue(3) debugging macro, QUEUE_MACRO_DEBUG, has been split into two separate components, QUEUE_MACRO_DEBUG_TRACE and QUEUE_MACRO_DEBUG_TRASH. Define both for the original QUEUE_MACRO_DEBUG behavior. 20160824: r304787 changed some ioctl interfaces between the iSCSI userspace programs and the kernel. ctladm, ctld, iscsictl, and iscsid must be rebuilt to work with new kernels. __FreeBSD_version has been bumped to 1200005. 20160818: The UDP receive code has been updated to only treat incoming UDP packets that were addressed to an L2 broadcast address as L3 broadcast packets. It is not expected that this will affect any standards-conforming UDP application. The new behaviour can be disabled by setting the sysctl net.inet.udp.require_l2_bcast to 0. 20160818: Remove the openbsd_poll system call. __FreeBSD_version has been bumped because of this. 20160622: The libc stub for the pipe(2) system call has been replaced with a wrapper that calls the pipe2(2) system call and the pipe(2) system call is now only implemented by the kernels that include "options COMPAT_FREEBSD10" in their config file (this is the default). Users should ensure that this option is enabled in their kernel or upgrade userspace to r302092 before upgrading their kernel. 20160527: CAM will now strip leading spaces from SCSI disks' serial numbers. This will affect users who create UFS filesystems on SCSI disks using those disk's diskid device nodes. For example, if /etc/fstab previously contained a line like "/dev/diskid/DISK-%20%20%20%20%20%20%20ABCDEFG0123456", you should change it to "/dev/diskid/DISK-ABCDEFG0123456". Users of geom transforms like gmirror may also be affected. ZFS users should generally be fine. 20160523: The bitstring(3) API has been updated with new functionality and improved performance. But it is binary-incompatible with the old API. Objects built with the new headers may not be linked against objects built with the old headers. 20160520: The brk and sbrk functions have been removed from libc on arm64. Binutils from ports has been updated to not link to these functions and should be updated to the latest version before installing a new libc. 20160517: The armv6 port now defaults to hard float ABI. Limited support for running both hardfloat and soft float on the same system is available using the libraries installed with -DWITH_LIBSOFT. This has only been tested as an upgrade path for installworld and packages may fail or need manual intervention to run. New packages will be needed. To update an existing self-hosted armv6hf system, you must add TARGET_ARCH=armv6 on the make command line for both the build and the install steps. 
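	For example, on such a self-hosted armv6hf system the upgrade would typically look something like the following (a sketch only; adjust KERNCONF and any other options you normally use):

		make TARGET_ARCH=armv6 buildworld
		make TARGET_ARCH=armv6 kernel
		make TARGET_ARCH=armv6 installworld
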
20160510:
	Kernel modules compiled outside of a kernel build now default to installing to /boot/modules instead of /boot/kernel. Many kernel modules built this way (such as those in ports) already overrode KMODDIR explicitly to install into /boot/modules. However, manually building and installing a module from /sys/modules will now install to /boot/modules instead of /boot/kernel.

20160414:
	The CAM I/O scheduler has been committed to the kernel. There should be no user visible impact. This does enable NCQ Trim on ada SSDs. The following list of known rogues that claim support for this but actually corrupt data is believed to be complete, but be on the lookout for data corruption:
		o Crucial MX100, M550 drives with MU01 firmware.
		o Micron M510 and M550 drives with MU01 firmware.
		o Micron M500 prior to MU07 firmware
		o Samsung 830, 840, and 850 all firmwares
		o FCCT M500 all firmwares

	Crucial has firmware http://www.crucial.com/usa/en/support-ssd-firmware with working NCQ TRIM. For Micron branded drives, see your sales rep for updated firmware. Blacklisted drives will continue to work correctly, since they behave properly as long as no NCQ TRIMs are sent to them. Given this list is the same as the one found in Linux, it is believed there are no other rogues in the marketplace. All other models from the above vendors work.

	To be safe, if you are at all concerned, you can quirk each of your drives to prevent NCQ from being sent by setting:

		kern.cam.ada.X.quirks="0x2"

	in loader.conf. If the drive requires the 4k sector quirk, set the quirks entry to 0x3.

20160330:
	The FAST_DEPEND build option has been removed and its functionality is now the one true way. The old mkdep(1) style of 'make depend' has been removed. See 20160311 for further details.

20160317:
	Resource range types have grown from unsigned long to uintmax_t. All drivers, and anything using libdevinfo, need to be recompiled.

20160311:
	WITH_FAST_DEPEND is now enabled by default for in-tree and out-of-tree builds. It no longer runs mkdep(1) during 'make depend', and the 'make depend' stage can now safely be skipped, as it is run automatically when building 'make all' and will generate all SRCS and DPSRCS before building anything else. Dependencies are gathered at compile time with -MF flags kept in separate .depend files per object file. Users should run 'make cleandepend' once if using -DNO_CLEAN to clean out older stale .depend files.

20160306:
	On amd64, clang 3.8.0 can now insert sections of type AMD64_UNWIND into kernel modules. Therefore, if you load any kernel modules at boot time, please install the boot loaders after you install the kernel, but before rebooting, e.g.:

		make buildworld
		make kernel KERNCONF=YOUR_KERNEL_HERE
		make -C sys/boot install

	Then follow the usual steps, described in the General Notes section, below.

20160305:
	Clang, llvm, lldb and compiler-rt have been upgraded to 3.8.0. Please see the 20141231 entry below for information about prerequisites and upgrading, if you are not already using clang 3.5.0 or higher.

20160301:
	The AIO subsystem is now a standard part of the kernel. The VFS_AIO kernel option and aio.ko kernel module have been removed. Due to stability concerns, asynchronous I/O requests are only permitted on sockets and raw disks by default. To enable asynchronous I/O requests on all file types, set the vfs.aio.enable_unsafe sysctl to a non-zero value.

20160226:
	The ELF object manipulation tool objcopy is now provided by the ELF Tool Chain project rather than by GNU binutils.
	It should be a drop-in replacement, with the addition of arm64 support. The (temporary) src.conf(5) knob WITHOUT_ELFCOPY_AS_OBJCOPY may be set to obtain the GNU version if necessary.

20160129:
	Building ZFS pools on top of zvols is prohibited by default. That feature has never worked safely; it's always been prone to deadlocks. Using a zvol as the backing store for a VM guest's virtual disk will still work, even if the guest is using ZFS. Legacy behavior can be restored by setting vfs.zfs.vol.recursive=1.

20160119:
	The NONE and HPN patches have been removed from OpenSSH. They are still available in the security/openssh-portable port.

20160113:
	With the addition of ypldap(8), a new _ypldap user is now required during installworld. "mergemaster -p" can be used to add the user prior to installworld, as documented in the handbook.

20151216:
	The tftp loader (pxeboot) now uses the option root-path directive. As a consequence it no longer looks for a pxeboot.4th file on the tftp server. Instead it uses the regular /boot infrastructure as with the other loaders.

20151211:
	The code to start recording plug and play data into the modules has been committed. While the old tools will properly build a new kernel, a number of warnings about "unknown metadata record 4" will be produced for an older kldxref. To avoid such warnings, make sure to rebuild the kernel toolchain (or world). Make sure that you have r292078 or later when trying to build r292077 or later before rebuilding.

20151207:
	Debug data files are now built by default with 'make buildworld' and installed with 'make installworld'. This facilitates debugging but requires more disk space both during the build and for the installed world. Debug files may be disabled by setting WITHOUT_DEBUG_FILES=yes in src.conf(5).

20151130:
	r291527 changed the internal interface between the nfsd.ko and nfscommon.ko modules. As such, they must both be upgraded together. __FreeBSD_version has been bumped because of this.

20151108:
	The addition of support for Unicode collation strings changes the order of files listed by ls(1), for example. To get back to the old behaviour, set the LC_COLLATE environment variable to "C". Database administrators will need to reindex their databases since collation results will be different.

	Due to a bug in install(1) it is recommended to remove the ancient locales before running make installworld:

		rm -rf /usr/share/locale/*

20151030:
	OpenSSL has been upgraded to 1.0.2d. Any binaries requiring libcrypto.so.7 or libssl.so.7 must be recompiled.

20151020:
	Qlogic 24xx/25xx firmware images were updated from 5.5.0 to 7.3.0. Kernel modules isp_2400_multi and isp_2500_multi were removed and should be replaced with the isp_2400 and isp_2500 modules respectively.

20151017:
	The build previously allowed using 'make -n' to not recurse into sub-directories while showing what commands would be executed, and 'make -n -n' to recursively show commands. Now 'make -n' will recurse and 'make -N' will not.

20151012:
	If you specify SENDMAIL_MC or SENDMAIL_CF in make.conf, mergemaster and etcupdate will now use this file. A custom sendmail.cf is now updated via this mechanism rather than via installworld. If you had excluded sendmail.cf in mergemaster.rc or etcupdate.conf, you may want to remove the exclusion or change it to "always install". /etc/mail/sendmail.cf is now managed the same way regardless of whether SENDMAIL_MC/SENDMAIL_CF is used. If you are not using SENDMAIL_MC/SENDMAIL_CF there should be no change in behavior.
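	As a purely illustrative example (the .mc file name here is hypothetical), a host whose sendmail configuration is generated from a custom .mc file might carry the following in /etc/make.conf, which mergemaster and etcupdate will now pick up:

		SENDMAIL_MC=/etc/mail/myhost.mc
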
20151011: Compatibility shims for legacy ATA device names have been removed. It includes ATA_STATIC_ID kernel option, kern.cam.ada.legacy_aliases and kern.geom.raid.legacy_aliases loader tunables, kern.devalias.* environment variables, /dev/ad* and /dev/ar* symbolic links. 20151006: Clang, llvm, lldb, compiler-rt and libc++ have been upgraded to 3.7.0. Please see the 20141231 entry below for information about prerequisites and upgrading, if you are not already using clang 3.5.0 or higher. 20150924: Kernel debug files have been moved to /usr/lib/debug/boot/kernel/, and renamed from .symbols to .debug. This reduces the size requirements on the boot partition or file system and provides consistency with userland debug files. When using the supported kernel installation method the /usr/lib/debug/boot/kernel directory will be renamed (to kernel.old) as is done with /boot/kernel. Developers wishing to maintain the historical behavior of installing debug files in /boot/kernel/ can set KERN_DEBUGDIR="" in src.conf(5). 20150827: The wireless drivers had undergone changes that remove the 'parent interface' from the ifconfig -l output. The rc.d network scripts used to check presence of a parent interface in the list, so old scripts would fail to start wireless networking. Thus, etcupdate(3) or mergemaster(8) run is required after kernel update, to update your rc.d scripts in /etc. 20150827: pf no longer supports 'scrub fragment crop' or 'scrub fragment drop-ovl' These configurations are now automatically interpreted as 'scrub fragment reassemble'. 20150817: Kernel-loadable modules for the random(4) device are back. To use them, the kernel must have device random options RANDOM_LOADABLE kldload(8) can then be used to load random_fortuna.ko or random_yarrow.ko. Please note that due to the indirect function calls that the loadable modules need to provide, the build-in variants will be slightly more efficient. The random(4) kernel option RANDOM_DUMMY has been retired due to unpopularity. It was not all that useful anyway. 20150813: The WITHOUT_ELFTOOLCHAIN_TOOLS src.conf(5) knob has been retired. Control over building the ELF Tool Chain tools is now provided by the WITHOUT_TOOLCHAIN knob. 20150810: The polarity of Pulse Per Second (PPS) capture events with the uart(4) driver has been corrected. Prior to this change the PPS "assert" event corresponded to the trailing edge of a positive PPS pulse and the "clear" event was the leading edge of the next pulse. As the width of a PPS pulse in a typical GPS receiver is on the order of 1 millisecond, most users will not notice any significant difference with this change. Anyone who has compensated for the historical polarity reversal by configuring a negative offset equal to the pulse width will need to remove that workaround. 20150809: The default group assigned to /dev/dri entries has been changed from 'wheel' to 'video' with the id of '44'. If you want to have access to the dri devices please add yourself to the video group with: # pw groupmod video -m $USER 20150806: The menu.rc and loader.rc files will now be replaced during upgrades. Please migrate local changes to menu.rc.local and loader.rc.local instead. 20150805: GNU Binutils versions of addr2line, c++filt, nm, readelf, size, strings and strip have been removed. The src.conf(5) knob WITHOUT_ELFTOOLCHAIN_TOOLS no longer provides the binutils tools. 20150728: As ZFS requires more kernel stack pages than is the default on some architectures e.g. 
i386, it now warns if KSTACK_PAGES is less than ZFS_MIN_KSTACK_PAGES (which is 4 at the time of writing). Please consider using 'options KSTACK_PAGES=X', where X is greater than or equal to ZFS_MIN_KSTACK_PAGES (i.e. 4), in such configurations.

20150706:
	sendmail has been updated to 8.15.2. Starting with FreeBSD 11.0 and sendmail 8.15, sendmail uses uncompressed IPv6 addresses by default, i.e., they will not contain "::". For example, instead of ::1, it will be 0:0:0:0:0:0:0:1. This permits a zero subnet to have a more specific match, such as different map entries for IPv6:0:0 vs IPv6:0. This change requires that configuration data (including maps, files, classes, custom rulesets, etc.) use the same format, so make certain such configuration data is upgraded. As a very simple check, search for patterns like 'IPv6:[0-9a-fA-F:]*::' and 'IPv6::'. To return to the old behavior, set the m4 option confUSE_COMPRESSED_IPV6_ADDRESSES or the cf option UseCompressedIPv6Addresses.

20150630:
	The default kernel entropy-processing algorithm is now Fortuna, replacing Yarrow.

	Assuming you have 'device random' in your kernel config file, the configurations allow a kernel option to override this default. You may choose *ONE* of:

		options 	RANDOM_YARROW	# Legacy /dev/random algorithm.
		options 	RANDOM_DUMMY	# Blocking-only driver.

	If you have neither, you get Fortuna. For most people, read no further: Fortuna will give a /dev/random that works like it always used to, and the difference will be irrelevant.

	If you remove 'device random', you get *NO* kernel-processed entropy at all. This may be acceptable to folks building embedded systems, but has complications. Carry on reading, and it is assumed you know what you need.

	*PLEASE* read random(4) and random(9) if you are in the habit of tweaking kernel configs, and/or if you are a member of the embedded community, wanting specific and not-usual behaviour from your security subsystems.

	NOTE!! If you use RANDOM_DUMMY and/or have no 'device random', you will NOT have a functioning /dev/random, and many cryptographic features will not work, including SSH. You may also find strange behaviour from the random(3) set of library functions, in particular sranddev(3), srandomdev(3) and arc4random(3). The reason for this is that the KERN_ARND sysctl only returns entropy if it thinks it has some to share, and with RANDOM_DUMMY or no 'device random' this will never happen.

20150623:
	An additional fix for the issue described in the 20150614 sendmail entry below has been committed in revision 284717.

20150616:
	FreeBSD's old make (fmake) has been removed from the system. It is available as the devel/fmake port or via pkg install fmake.

20150615:
	The fix for the issue described in the 20150614 sendmail entry below has been committed in revision 284436. The workaround described in that entry is no longer needed unless the default setting is overridden by a confDH_PARAMETERS configuration setting of '5' or pointing to a 512 bit DH parameter file.

20150614:
	ALLOW_DEPRECATED_ATF_TOOLS/ATFFILE support has been removed from atf.test.mk (included from bsd.test.mk). Please upgrade devel/atf and devel/kyua to version 0.20+ and adjust any calling code to work with Kyuafile and kyua.

20150614:
	The import of openssl to address the FreeBSD-SA-15:10.openssl security advisory includes a change which rejects handshakes with DH parameters below 768 bits. sendmail releases prior to 8.15.2 (not yet released) defaulted to a 512 bit DH parameter setting for client connections.
To work around this interoperability, sendmail can be configured to use a 2048 bit DH parameter by: 1. Edit /etc/mail/`hostname`.mc 2. If a setting for confDH_PARAMETERS does not exist or exists and is set to a string beginning with '5', replace it with '2'. 3. If a setting for confDH_PARAMETERS exists and is set to a file path, create a new file with: openssl dhparam -out /path/to/file 2048 4. Rebuild the .cf file: cd /etc/mail/; make; make install 5. Restart sendmail: cd /etc/mail/; make restart A sendmail patch is coming, at which time this file will be updated. 20150604: Generation of legacy formatted entries have been disabled by default in pwd_mkdb(8), as all base system consumers of the legacy formatted entries were converted to use the new format by default when the new, machine independent format have been added and supported since FreeBSD 5.x. Please see the pwd_mkdb(8) manual page for further details. 20150525: Clang and llvm have been upgraded to 3.6.1 release. Please see the 20141231 entry below for information about prerequisites and upgrading, if you are not already using 3.5.0 or higher. 20150521: TI platform code switched to using vendor DTS files and this update may break existing systems running on Beaglebone, Beaglebone Black, and Pandaboard: - dtb files should be regenerated/reinstalled. Filenames are the same but content is different now - GPIO addressing was changed, now each GPIO bank (32 pins per bank) has its own /dev/gpiocX device, e.g. pin 121 on /dev/gpioc0 in old addressing scheme is now pin 25 on /dev/gpioc3. - Pandaboard: /etc/ttys should be updated, serial console device is now /dev/ttyu2, not /dev/ttyu0 20150501: soelim(1) from gnu/usr.bin/groff has been replaced by usr.bin/soelim. If you need the GNU extension from groff soelim(1), install groff from package: pkg install groff, or via ports: textproc/groff. 20150423: chmod, chflags, chown and chgrp now affect symlinks in -R mode as defined in symlink(7); previously symlinks were silently ignored. 20150415: The const qualifier has been removed from iconv(3) to comply with POSIX. The ports tree is aware of this from r384038 onwards. 20150416: Libraries specified by LIBADD in Makefiles must have a corresponding DPADD_ variable to ensure correct dependencies. This is now enforced in src.libnames.mk. 20150324: From legacy ata(4) driver was removed support for SATA controllers supported by more functional drivers ahci(4), siis(4) and mvs(4). Kernel modules ataahci and ataadaptec were removed completely, replaced by ahci and mvs modules respectively. 20150315: Clang, llvm and lldb have been upgraded to 3.6.0 release. Please see the 20141231 entry below for information about prerequisites and upgrading, if you are not already using 3.5.0 or higher. 20150307: The 32-bit PowerPC kernel has been changed to a position-independent executable. This can only be booted with a version of loader(8) newer than January 31, 2015, so make sure to update both world and kernel before rebooting. 20150217: If you are running a -CURRENT kernel since r273872 (Oct 30th, 2014), but before r278950, the RNG was not seeded properly. Immediately upgrade the kernel to r278950 or later and regenerate any keys (e.g. ssh keys or openssl keys) that were generated w/ a kernel from that range. This does not affect programs that directly used /dev/random or /dev/urandom. All userland uses of arc4random(3) are affected. 20150210: The autofs(4) ABI was changed in order to restore binary compatibility with 10.1-RELEASE. 
The automountd(8) daemon needs to be rebuilt to work with the new kernel. 20150131: The powerpc64 kernel has been changed to a position-independent executable. This can only be booted with a new version of loader(8), so make sure to update both world and kernel before rebooting. 20150118: Clang and llvm have been upgraded to 3.5.1 release. This is a bugfix only release, no new features have been added. Please see the 20141231 entry below for information about prerequisites and upgrading, if you are not already using 3.5.0. 20150107: ELF tools addr2line, elfcopy (strip), nm, size, and strings are now taken from the ELF Tool Chain project rather than GNU binutils. They should be drop-in replacements, with the addition of arm64 support. The WITHOUT_ELFTOOLCHAIN_TOOLS= knob may be used to obtain the binutils tools, if necessary. See 20150805 for updated information. 20150105: The default Unbound configuration now enables remote control using a local socket. Users who have already enabled the local_unbound service should regenerate their configuration by running "service local_unbound setup" as root. 20150102: The GNU texinfo and GNU info pages have been removed. To be able to view GNU info pages please install texinfo from ports. 20141231: Clang, llvm and lldb have been upgraded to 3.5.0 release. As of this release, a prerequisite for building clang, llvm and lldb is a C++11 capable compiler and C++11 standard library. This means that to be able to successfully build the cross-tools stage of buildworld, with clang as the bootstrap compiler, your system compiler or cross compiler should either be clang 3.3 or later, or gcc 4.8 or later, and your system C++ library should be libc++, or libdstdc++ from gcc 4.8 or later. On any standard FreeBSD 10.x or 11.x installation, where clang and libc++ are on by default (that is, on x86 or arm), this should work out of the box. On 9.x installations where clang is enabled by default, e.g. on x86 and powerpc, libc++ will not be enabled by default, so libc++ should be built (with clang) and installed first. If both clang and libc++ are missing, build clang first, then use it to build libc++. On 8.x and earlier installations, upgrade to 9.x first, and then follow the instructions for 9.x above. Sparc64 and mips users are unaffected, as they still use gcc 4.2.1 by default, and do not build clang. Many embedded systems are resource constrained, and will not be able to build clang in a reasonable time, or in some cases at all. In those cases, cross building bootable systems on amd64 is a workaround. This new version of clang introduces a number of new warnings, of which the following are most likely to appear: -Wabsolute-value This warns in two cases, for both C and C++: * When the code is trying to take the absolute value of an unsigned quantity, which is effectively a no-op, and almost never what was intended. The code should be fixed, if at all possible. If you are sure that the unsigned quantity can be safely cast to signed, without loss of information or undefined behavior, you can add an explicit cast, or disable the warning. * When the code is trying to take an absolute value, but the called abs() variant is for the wrong type, which can lead to truncation. If you want to disable the warning instead of fixing the code, please make sure that truncation will not occur, or it might lead to unwanted side-effects. 
-Wtautological-undefined-compare and -Wundefined-bool-conversion These warn when C++ code is trying to compare 'this' against NULL, while 'this' should never be NULL in well-defined C++ code. However, there is some legacy (pre C++11) code out there, which actively abuses this feature, which was less strictly defined in previous C++ versions. Squid and openjdk do this, for example. The warning can be turned off for C++98 and earlier, but compiling the code in C++11 mode might result in unexpected behavior; for example, the parts of the program that are unreachable could be optimized away. 20141222: The old NFS client and server (kernel options NFSCLIENT, NFSSERVER) kernel sources have been removed. The .h files remain, since some utilities include them. This will need to be fixed later. If "mount -t oldnfs ..." is attempted, it will fail. If the "-o" option on mountd(8), nfsd(8) or nfsstat(1) is used, the utilities will report errors. 20141121: The handling of LOCAL_LIB_DIRS has been altered to skip addition of directories to top level SUBDIR variable when their parent directory is included in LOCAL_DIRS. Users with build systems with such hierarchies and without SUBDIR entries in the parent directory Makefiles should add them or add the directories to LOCAL_DIRS. 20141109: faith(4) and faithd(8) have been removed from the base system. Faith has been obsolete for a very long time. 20141104: vt(4), the new console driver, is enabled by default. It brings support for Unicode and double-width characters, as well as support for UEFI and integration with the KMS kernel video drivers. You may need to update your console settings in /etc/rc.conf, most probably the keymap. During boot, /etc/rc.d/syscons will indicate what you need to do. vt(4) still has issues and lacks some features compared to syscons(4). See the wiki for up-to-date information: https://wiki.freebsd.org/Newcons If you want to keep using syscons(4), you can do so by adding the following line to /boot/loader.conf: kern.vty=sc 20141102: pjdfstest has been integrated into kyua as an opt-in test suite. Please see share/doc/pjdfstest/README for more details on how to execute it. 20141009: gperf has been removed from the base system for architectures that use clang. Ports that require gperf will obtain it from the devel/gperf port. 20140923: pjdfstest has been moved from tools/regression/pjdfstest to contrib/pjdfstest . 20140922: At svn r271982, The default linux compat kernel ABI has been adjusted to 2.6.18 in support of the linux-c6 compat ports infrastructure update. If you wish to continue using the linux-f10 compat ports, add compat.linux.osrelease=2.6.16 to your local sysctl.conf. Users are encouraged to update their linux-compat packages to linux-c6 during their next update cycle. 20140729: The ofwfb driver, used to provide a graphics console on PowerPC when using vt(4), no longer allows mmap() of all physical memory. This will prevent Xorg on PowerPC with some ATI graphics cards from initializing properly unless x11-servers/xorg-server is updated to 1.12.4_8 or newer. 20140723: The xdev targets have been converted to using TARGET and TARGET_ARCH instead of XDEV and XDEV_ARCH. 20140719: The default unbound configuration has been modified to address issues with reverse lookups on networks that use private address ranges. If you use the local_unbound service, run "service local_unbound setup" as root to regenerate your configuration, then "service local_unbound reload" to load the new configuration. 
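	In other words, as root:

		# service local_unbound setup
		# service local_unbound reload
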
20140709:
	The GNU texinfo and GNU info pages are no longer built and installed by default. The WITH_INFO knob has been added to allow building and installing them again. UPDATE: see the 20150102 entry on texinfo's removal.

20140708:
	The GNU readline library is now an INTERNALLIB - that is, it is statically linked into consumers (GDB and variants) in the base system, and the shared library is no longer installed. The devel/readline port is available for third party software that requires readline.

20140702:
	The Itanium architecture (ia64) has been removed from the list of known architectures. This is the first step in the removal of the architecture.

20140701:
	Commit r268115 has added NFSv4.1 server support, merged from projects/nfsv4.1-server. Since this includes changes to the internal interfaces between the NFS related modules, a full build of the kernel and modules will be necessary. __FreeBSD_version has been bumped.

20140629:
	The WITHOUT_VT_SUPPORT kernel config knob has been renamed WITHOUT_VT. (The other _SUPPORT knobs have a consistent meaning which differs from the behaviour controlled by this knob.)

20140619:
	The maximum length of the serial number in CTL was increased from 16 to 64 chars, which breaks the ABI. All CTL-related tools, such as ctladm and ctld, need to be rebuilt to work with a new kernel.

20140606:
	The libatf-c and libatf-c++ major versions were downgraded to 0 and 1 respectively to match the upstream numbers. They were out of sync because, when they were originally added to FreeBSD, the upstream versions were not respected. These libraries are private and not yet built by default, so renumbering them should be a non-issue. However, unclean source trees will yield broken test programs once the operator executes "make delete-old-libs" after a "make installworld".

	Additionally, the atf-sh binary was made private by moving it into /usr/libexec/. Already-built shell test programs will keep the path to the old binary so they will break after "make delete-old" is run.

	If you are using WITH_TESTS=yes (not the default), wipe the object tree and rebuild from scratch to prevent spurious test failures. This is only needed once: the misnumbered libraries and misplaced binaries have been added to OptionalObsoleteFiles.inc so they will be removed during a clean upgrade.

20140512:
	Clang and llvm have been upgraded to 3.4.1 release.

20140508:
	We bogusly installed src.opts.mk in /usr/share/mk. This file should be removed to avoid issues in the future (and has been added to ObsoleteFiles.inc).

20140505:
	/etc/src.conf now affects only builds of the FreeBSD src tree. In the past, it affected all builds that used the bsd.*.mk files. The old behavior was a bug, but people may have relied upon it. To get this behavior back, you can .include /etc/src.conf from /etc/make.conf (which is still global and isn't changed). This also changes the behavior of incremental builds inside the tree of individual directories. Set MAKESYSPATH to ".../share/mk" to do that. Although this has survived make universe and some upgrade scenarios, other upgrade scenarios may have broken. At least one form of temporary breakage was fixed with MAKESYSPATH settings for buildworld as well... In cases where MAKESYSPATH isn't working with this setting, you'll need to set it to the full path to your tree.

	One side effect of all this cleaning up is that bsd.compiler.mk is no longer implicitly included by bsd.own.mk. If you wish to use COMPILER_TYPE, you must now explicitly include bsd.compiler.mk as well.
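	A minimal sketch of the two workarounds mentioned above: to restore the old global behaviour, add the following to /etc/make.conf:

		.include "/etc/src.conf"

	and for incremental builds of individual directories inside the tree, set MAKESYSPATH in the environment (use the full path to your tree's share/mk if the relative form does not work for you), e.g. with csh:

		setenv MAKESYSPATH .../share/mk
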
20140430: The lindev device has been removed since /dev/full has been made a standard device. __FreeBSD_version has been bumped. 20140424: The knob WITHOUT_VI was added to the base system, which controls building ex(1), vi(1), etc. Older releases of FreeBSD required ex(1) in order to reorder files share/termcap and didn't build ex(1) as a build tool, so building/installing with WITH_VI is highly advised for build hosts for older releases. This issue has been fixed in stable/9 and stable/10 in r277022 and r276991, respectively. 20140418: The YES_HESIOD knob has been removed. It has been obsolete for a decade. Please move to using WITH_HESIOD instead or your builds will silently lack HESIOD. 20140405: The uart(4) driver has been changed with respect to its handling of the low-level console. Previously the uart(4) driver prevented any process from changing the baudrate or the CLOCAL and HUPCL control flags. By removing the restrictions, operators can make changes to the serial console port without having to reboot. However, when getty(8) is started on the serial device that is associated with the low-level console, a misconfigured terminal line in /etc/ttys will now have a real impact. Before upgrading the kernel, make sure that /etc/ttys has the serial console device configured as 3wire without baudrate to preserve the previous behaviour. E.g: ttyu0 "/usr/libexec/getty 3wire" vt100 on secure 20140306: Support for libwrap (TCP wrappers) in rpcbind was disabled by default to improve performance. To re-enable it, if needed, run rpcbind with command line option -W. 20140226: Switched back to the GPL dtc compiler due to updates in the upstream dts files not being supported by the BSDL dtc compiler. You will need to rebuild your kernel toolchain to pick up the new compiler. Core dumps may result while building dtb files during a kernel build if you fail to do so. Set WITHOUT_GPL_DTC if you require the BSDL compiler. 20140216: Clang and llvm have been upgraded to 3.4 release. 20140216: The nve(4) driver has been removed. Please use the nfe(4) driver for NVIDIA nForce MCP Ethernet adapters instead. 20140212: An ABI incompatibility crept into the libc++ 3.4 import in r261283. This could cause certain C++ applications using shared libraries built against the previous version of libc++ to crash. The incompatibility has now been fixed, but any C++ applications or shared libraries built between r261283 and r261801 should be recompiled. 20140204: OpenSSH will now ignore errors caused by kernel lacking of Capsicum capability mode support. Please note that enabling the feature in kernel is still highly recommended. 20140131: OpenSSH is now built with sandbox support, and will use sandbox as the default privilege separation method. This requires Capsicum capability mode support in kernel. 20140128: The libelf and libdwarf libraries have been updated to newer versions from upstream. Shared library version numbers for these two libraries were bumped. Any ports or binaries requiring these two libraries should be recompiled. __FreeBSD_version is bumped to 1100006. 20140110: If a Makefile in a tests/ directory was auto-generating a Kyuafile instead of providing an explicit one, this would prevent such Makefile from providing its own Kyuafile in the future during NO_CLEAN builds. 
This has been fixed in the Makefiles but manual intervention is needed to clean an objdir if you use NO_CLEAN:

	# find /usr/obj -name Kyuafile | xargs rm -f

20131213:
	The behavior of gss_pseudo_random() for the krb5 mechanism has changed, for applications requesting a longer random string than produced by the underlying enctype's pseudo-random() function. In particular, the random string produced from a session key of enctype aes256-cts-hmac-sha1-96 or aes256-cts-hmac-sha1-96 will be different at the 17th octet and later, after this change. The counter used in the PRF+ construction is now encoded as a big-endian integer in accordance with RFC 4402. __FreeBSD_version is bumped to 1100004.

20131108:
	The WITHOUT_ATF build knob has been removed and its functionality has been subsumed into the more generic WITHOUT_TESTS. If you were using the former to disable the build of the ATF libraries, you should change your settings to use the latter.

20131025:
	The default version of mtree is nmtree, which is obtained from NetBSD. The output is generally the same, but may vary slightly. If you find you need identical output, adding "-F freebsd9" to the command line should do the trick. For the time being, the old mtree is available as fmtree.

20131014:
	libbsdyml has been renamed to libyaml and moved to /usr/lib/private. This will break ports-mgmt/pkg. Rebuild the port, or upgrade to pkg 1.1.4_8 and verify bsdyml is not linked in, before running "make delete-old-libs":

		# make -C /usr/ports/ports-mgmt/pkg build deinstall install clean

	or

		# pkg install pkg; ldd /usr/local/sbin/pkg | grep bsdyml

20131010:
	The stable/10 branch has been created in subversion from head revision r256279.

20131010:
	The rc.d/jail script has been updated to support the jail(8) configuration file. The "jail_<jname>_*" rc.conf(5) variables for per-jail configuration are automatically converted to /var/run/jail.<jname>.conf before the jail(8) utility is invoked. This is transparently backward compatible. See below about some incompatibilities and the rc.conf(5) manual page for more details.

	These variables are now deprecated in favor of the jail(8) configuration file. One can use the "rc.d/jail config <jname>" command to generate a jail(8) configuration file in /var/run/jail.<jname>.conf without running the jail(8) utility. The default pathname of the configuration file is /etc/jail.conf and can be specified by using the $jail_conf or $jail_<jname>_conf variables.

	Please note that jail_devfs_ruleset accepts an integer at this moment. Please consider rewriting the ruleset name as an integer.

20130930:
	BIND has been removed from the base system. If all you need is a local resolver, simply enable and start the local_unbound service instead. Otherwise, several versions of BIND are available in the ports tree. The dns/bind99 port is one example.

	With this change, nslookup(1) and dig(1) are no longer in the base system. Users should instead use host(1) and drill(1), which are in the base system. Alternatively, nslookup and dig can be obtained by installing the dns/bind-tools port.

20130916:
	With the addition of unbound(8), a new unbound user is now required during installworld. "mergemaster -p" can be used to add the user prior to installworld, as documented in the handbook.

20130911:
	OpenSSH is now built with DNSSEC support, and will by default silently trust signed SSHFP records. This can be controlled with the VerifyHostKeyDNS client configuration setting. DNSSEC support can be disabled entirely with the WITHOUT_LDNS option in src.conf.
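	For example, to be asked before SSHFP records are trusted rather than trusting them silently, the client option can be set in ~/.ssh/config or /etc/ssh/ssh_config:

		VerifyHostKeyDNS ask
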
20130906: The GNU Compiler Collection and C++ standard library (libstdc++) are no longer built by default on platforms where clang is the system compiler. You can enable them with the WITH_GCC and WITH_GNUCXX options in src.conf. 20130905: The PROCDESC kernel option is now part of the GENERIC kernel configuration and is required for the rwhod(8) to work. If you are using custom kernel configuration, you should include 'options PROCDESC'. 20130905: The API and ABI related to the Capsicum framework was modified in backward incompatible way. The userland libraries and programs have to be recompiled to work with the new kernel. This includes the following libraries and programs, but the whole buildworld is advised: libc, libprocstat, dhclient, tcpdump, hastd, hastctl, kdump, procstat, rwho, rwhod, uniq. 20130903: AES-NI intrinsic support has been added to gcc. The AES-NI module has been updated to use this support. A new gcc is required to build the aesni module on both i386 and amd64. 20130821: The PADLOCK_RNG and RDRAND_RNG kernel options are now devices. Thus "device padlock_rng" and "device rdrand_rng" should be used instead of "options PADLOCK_RNG" & "options RDRAND_RNG". 20130813: WITH_ICONV has been split into two feature sets. WITH_ICONV now enables just the iconv* functionality and is now on by default. WITH_LIBICONV_COMPAT enables the libiconv api and link time compatibility. Set WITHOUT_ICONV to build the old way. If you have been using WITH_ICONV before, you will very likely need to turn on WITH_LIBICONV_COMPAT. 20130806: INVARIANTS option now enables DEBUG for code with OpenSolaris and Illumos origin, including ZFS. If you have INVARIANTS in your kernel configuration, then there is no need to set DEBUG or ZFS_DEBUG explicitly. DEBUG used to enable witness(9) tracking of OpenSolaris (mostly ZFS) locks if WITNESS option was set. Because that generated a lot of witness(9) reports and all of them were believed to be false positives, this is no longer done. New option OPENSOLARIS_WITNESS can be used to achieve the previous behavior. 20130806: Timer values in IPv6 data structures now use time_uptime instead of time_second. Although this is not a user-visible functional change, userland utilities which directly use them---ndp(8), rtadvd(8), and rtsold(8) in the base system---need to be updated to r253970 or later. 20130802: find -delete can now delete the pathnames given as arguments, instead of only files found below them or if the pathname did not contain any slashes. Formerly, the following error message would result: find: -delete: : relative path potentially not safe Deleting the pathnames given as arguments can be prevented without error messages using -mindepth 1 or by changing directory and passing "." as argument to find. This works in the old as well as the new version of find. 20130726: Behavior of devfs rules path matching has been changed. Pattern is now always matched against fully qualified devfs path and slash characters must be explicitly matched by slashes in pattern (FNM_PATHNAME). Rulesets involving devfs subdirectories must be reviewed. 20130716: The default ARM ABI has changed to the ARM EABI. The old ABI is incompatible with the ARM EABI and all programs and modules will need to be rebuilt to work with a new kernel. To keep using the old ABI ensure the WITHOUT_ARM_EABI knob is set. NOTE: Support for the old ABI will be removed in the future and users are advised to upgrade. 
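	For example, to keep building for the old ABI for now, the knob can be set in src.conf(5) (or make.conf) before rebuilding world and kernel:

		WITHOUT_ARM_EABI=yes
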
20130709:
	pkg_install has been disconnected from the build. If you really need it, add WITH_PKGTOOLS to your src.conf(5).

20130709:
	Most of the network statistics structures were changed to be able to keep 64-bit counters. Thus all tools that work with networking statistics must be rebuilt (netstat(1), bsnmpd(1), etc.)

20130618:
	Fix a bug that allowed a tracing process (e.g. gdb) to write to a memory-mapped file in the traced process's address space even if neither the traced process nor the tracing process had write access to that file.

20130615:
	CVS has been removed from the base system. An exact copy of the code is available from the devel/cvs port.

20130613:
	Some people report the following error after the switch to bmake:

		make: illegal option -- J
		usage: make [-BPSXeiknpqrstv] [-C directory] [-D variable] ...
		*** [buildworld] Error code 2

	This is likely due to an old instance of make in ${MAKEPATH} (${MAKEOBJDIRPREFIX}${.CURDIR}/make.${MACHINE}), which src/Makefile will use blindly if it exists. If you see the above error, running:

		rm -rf `make -V MAKEPATH`

	should resolve it.

20130516:
	Use bmake by default. Whereas before one could choose to build with bmake via -DWITH_BMAKE, one must now use -DWITHOUT_BMAKE to use the old make. The goal is to remove these knobs for 10-RELEASE.

	It is worth noting that bmake (like gmake) treats the command line as the unit of failure, rather than statements within the command line. Thus '(cd some/where && dosomething)' is safer than 'cd some/where; dosomething'. The '()' allows consistent behavior in parallel builds.

20130429:
	Fix a bug that allows NFS clients to issue READDIR on files.

20130426:
	The WITHOUT_IDEA option has been removed because the IDEA patent expired.

20130426:
	The sysctl which controls TRIM support under ZFS has been renamed from vfs.zfs.trim_disable -> vfs.zfs.trim.enabled and has been enabled by default.

20130425:
	The mergemaster command now uses the default MAKEOBJDIRPREFIX rather than creating its own in the temporary directory in order to allow access to bootstrapped versions of tools such as install and mtree. When upgrading from a version of FreeBSD where the install command does not support -l, you will need to install a new mergemaster command if mergemaster -p is required. This can be accomplished with the command (cd src/usr.sbin/mergemaster && make install).

20130404:
	The legacy ATA stack, disabled and replaced by the new CAM-based one since FreeBSD 9.0, has been completely removed from the sources. The kernel modules atadisk and atapi*, and the user-level tools atacontrol and burncd, have been removed. The kernel option `options ATA_CAM` is now permanently enabled and has been removed.

20130319:
	SOCK_CLOEXEC and SOCK_NONBLOCK flags have been added to socket(2) and socketpair(2). Software, in particular Kerberos, may automatically detect and use these during building. The resulting binaries will not work on older kernels.

20130308:
	CTL_DISABLE has also been added to the sparc64 GENERIC (for further information, see the respective 20130304 entry).

20130304:
	Recent commits to callout(9) changed the size of struct callout, so the KBI is probably heavily disturbed. Also, some functions in the callout(9)/sleep(9)/sleepqueue(9)/condvar(9) KPIs were replaced by macros. Any kernel module using them will not load, so a rebuild is needed.

	The ctl device has been re-enabled in GENERIC for i386 and amd64, but does not initialize by default (because of the new CTL_DISABLE option) to save memory.
	To re-enable it, remove the CTL_DISABLE option from the kernel config file or set kern.cam.ctl.disable=0 in /boot/loader.conf.

20130301:
	The ctl device has been disabled in GENERIC for i386 and amd64. This was done due to the extra memory being allocated at system initialisation time by the ctl driver, which was only used if a CAM target device was created. This makes a FreeBSD system unusable on 128MB or less of RAM.

20130208:
	A new compression method (lz4) has been merged to -HEAD. Please refer to zpool-features(7) for more information. Please refer to the "ZFS notes" section of this file for information on upgrading boot ZFS pools.

20130129:
	A BSD-licensed patch(1) variant has been added and is installed as bsdpatch, with the GNU version remaining the default patch. To invert the logic and use the BSD-licensed one as the default, while having the GNU version installed as gnupatch, rebuild and install world with the WITH_BSD_PATCH knob set.

20130121:
	Due to the use of the new -l option to install(1) during build and install, you must take care not to directly set the INSTALL make variable in your /etc/make.conf, /etc/src.conf, or on the command line. If you wish to use the -C flag for all installs, you may be able to add INSTALL+=-C to /etc/make.conf or /etc/src.conf.

20130118:
	The install(1) option -M has changed meaning and now takes an argument that is a file or path to append logs to. In the unlikely event that -M was the last option on the command line and the command line contained at least two files and a target directory, the first file will have logs appended to it. The -M option served little practical purpose in the last decade, so its use is expected to be extremely rare.

20121223:
	After switching to Clang as the default compiler, some users of ZFS on i386 systems started to experience stack overflow kernel panics. Please consider using 'options KSTACK_PAGES=4' in such configurations.

20121222:
	GEOM_LABEL now mangles label names read from file system metadata. Mangling affects labels containing spaces, non-printable characters, '%' or '"'. Device names in /etc/fstab and other places may need to be updated.

20121217:
	By default, only the 10 most recent kernel dumps will be saved. To restore the previous behaviour (no limit on the number of kernel dumps stored in the dump directory), add the following line to /etc/rc.conf:

		savecore_flags=""

20121201:
	With the addition of auditdistd(8), a new auditdistd user is now required during installworld. "mergemaster -p" can be used to add the user prior to installworld, as documented in the handbook.

20121117:
	The sin6_scope_id member variable in struct sockaddr_in6 is now filled by the kernel before passing the structure to userland via sysctl or routing socket. This means the KAME-specific embedded scope id in sin6_addr.s6_addr[2] is always cleared in userland applications. This behavior can be controlled by net.inet6.ip6.deembed_scopeid. __FreeBSD_version is bumped to 1000025.

20121105:
	On i386 and amd64 systems WITH_CLANG_IS_CC is now the default. This means that the world and kernel will be compiled with clang and that clang will be installed as /usr/bin/cc, /usr/bin/c++, and /usr/bin/cpp. To disable this behavior and revert to building with gcc, compile with WITHOUT_CLANG_IS_CC. Really old versions of current may need to bootstrap WITHOUT_CLANG first if the clang build fails (its compatibility window doesn't extend to the 9 stable branch point).

20121102:
	The IPFIREWALL_FORWARD kernel option has been removed.
	Its functionality is now turned on by default.

20121023:
	The ZERO_COPY_SOCKET kernel option has been removed and split into SOCKET_SEND_COW and SOCKET_RECV_PFLIP. NB: SOCKET_SEND_COW uses the VM page based copy-on-write mechanism which is not safe and may result in kernel crashes. NB: The SOCKET_RECV_PFLIP mechanism is useless as no current driver supports disposable external page-sized mbuf storage. Proper replacements for both zero-copy mechanisms are under consideration and will eventually lead to complete removal of the two kernel options.

20121023:
	The IPv4 network stack has been converted to network byte order. The following modules need to be recompiled together with the kernel: carp(4), divert(4), gif(4), siftr(4), gre(4), pf(4), ipfw(4), ng_ipfw(4), stf(4).

20121022:
	Support for non-MPSAFE filesystems was removed from VFS. VFS_VERSION was bumped; all filesystem modules must be recompiled.

20121018:
	All the non-MPSAFE filesystems have been disconnected from the build. The full list includes: codafs, hpfs, ntfs, nwfs, portalfs, smbfs, xfs.

20121016:
	The interface cloning API and ABI have changed. The following modules need to be recompiled together with the kernel: ipfw(4), pfsync(4), pflog(4), usb(4), wlan(4), stf(4), vlan(4), disc(4), edsc(4), if_bridge(4), gif(4), tap(4), faith(4), epair(4), enc(4), tun(4), if_lagg(4), gre(4).

20121015:
	The sdhci driver was split into two parts: sdhci (generic SD Host Controller logic) and sdhci_pci (actual hardware driver). No kernel config modifications are required, but if you load sdhci as a module you must switch to sdhci_pci instead.

20121014:
	Import the FUSE kernel and userland support into the base system.

20121013:
	The GNU sort(1) program has been removed since the BSD-licensed sort(1) has been the default for quite some time and no serious problems have been reported. The corresponding WITH_GNU_SORT knob has also gone.

20121006:
	The pfil(9) API/ABI for the AF_INET family has been changed. Packet filtering modules pf(4), ipfw(4), and ipfilter(4) need to be recompiled with the new kernel.

20121001:
	The net80211(4) ABI has been changed to allow for improved driver PS-POLL and power-save support. All wireless drivers need to be recompiled to work with the new kernel.

20120913:
	The random(4) support for the VIA hardware random number generator (`PADLOCK') is no longer enabled unconditionally. Add the padlock_rng device to the custom kernel config if needed. The GENERIC kernels on i386 and amd64 do include the device, so the change only affects custom kernel configurations.

20120908:
	The pf(4) packet filter ABI has been changed. pfctl(8) and the snmp_pf module need to be recompiled to work with the new kernel.

20120828:
	A new ZFS feature flag "com.delphix:empty_bpobj" has been merged to -HEAD. Pools that have empty_bpobj in active state cannot be imported read-write with ZFS implementations that do not support this feature. For more information read the zpool-features(5) manual page.

20120727:
	The sparc64 ZFS loader has been changed to no longer try to auto-detect ZFS providers based on diskN aliases but now requires these to be explicitly listed in the OFW boot-device environment variable.

20120712:
	OpenSSL has been upgraded to 1.0.1c. Any binaries requiring libcrypto.so.6 or libssl.so.6 must be recompiled. Also, there are configuration changes. Make sure to merge /etc/ssl/openssl.cnf.
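	One way to spot binaries that are still linked against the old library versions (a sketch, not part of the original entry; adjust the paths to whatever you have installed):

		# list locally installed binaries still linked against the old libs
		for f in /usr/local/bin/* /usr/local/sbin/*; do
			ldd "$f" 2>/dev/null | grep -qE 'lib(ssl|crypto)\.so\.6' && echo "$f"
		done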
20120712: The following sysctls and tunables have been renamed for consistency with other variables: kern.cam.da.da_send_ordered -> kern.cam.da.send_ordered kern.cam.ada.ada_send_ordered -> kern.cam.ada.send_ordered 20120628: The sort utility has been replaced with BSD sort. For now, GNU sort is also available as "gnusort" or the default can be set back to GNU sort by setting WITH_GNU_SORT. In this case, BSD sort will be installed as "bsdsort". 20120611: A new version of ZFS (pool version 5000) has been merged to -HEAD. Starting with this version the old system of ZFS pool versioning is superseded by "feature flags". This concept enables forward compatibility against certain future changes in functionality of ZFS pools. The first read-only compatible "feature flag" for ZFS pools is named "com.delphix:async_destroy". For more information read the new zpool-features(5) manual page. Please refer to the "ZFS notes" section of this file for information on upgrading boot ZFS pools. 20120417: The malloc(3) implementation embedded in libc now uses sources imported as contrib/jemalloc. The most disruptive API change is to /etc/malloc.conf. If your system has an old-style /etc/malloc.conf, delete it prior to installworld, and optionally re-create it using the new format after rebooting. See malloc.conf(5) for details (specifically the TUNING section and the "opt.*" entries in the MALLCTL NAMESPACE section). 20120328: Big-endian MIPS TARGET_ARCH values no longer end in "eb". mips64eb is now spelled mips64. mipsn32eb is now spelled mipsn32. mipseb is now spelled mips. This is to aid compatibility with third-party software that expects this naming scheme in uname(3). Little-endian settings are unchanged. If you are updating a big-endian mips64 machine from before this change, you may need to set MACHINE_ARCH=mips64 in your environment before the new build system will recognize your machine. 20120306: Disable by default the option VFS_ALLOW_NONMPSAFE for all supported platforms. 20120229: Now unix domain sockets behave "as expected" on nullfs(5). Previously nullfs(5) did not pass through all behaviours to the underlying layer, as a result if we bound to a socket on the lower layer we could connect only to the lower path; if we bound to the upper layer we could connect only to the upper path. The new behavior is one can connect to both the lower and the upper paths regardless what layer path one binds to. 20120211: The getifaddrs upgrade path broken with 20111215 has been restored. If you have upgraded in between 20111215 and 20120209 you need to recompile libc again with your kernel. You still need to recompile world to be able to configure CARP but this restriction already comes from 20111215. 20120114: The set_rcvar() function has been removed from /etc/rc.subr. All base and ports rc.d scripts have been updated, so if you have a port installed with a script in /usr/local/etc/rc.d you can either hand-edit the rcvar= line, or reinstall the port. An easy way to handle the mass-update of /etc/rc.d: rm /etc/rc.d/* && mergemaster -i 20120109: panic(9) now stops other CPUs in the SMP systems, disables interrupts on the current CPU and prevents other threads from running. This behavior can be reverted using the kern.stop_scheduler_on_panic tunable/sysctl. The new behavior can be incompatible with kern.sync_on_panic. 20111215: The carp(4) facility has been changed significantly. Configuration of the CARP protocol via ifconfig(8) has changed, as well as format of CARP events submitted to devd(8) has changed. 
	See manual pages for more information. The arpbalance feature of carp(4) is currently not supported anymore.

	The size of struct in_aliasreq and struct in6_aliasreq has changed. User utilities using SIOCAIFADDR or SIOCAIFADDR_IN6, e.g. ifconfig(8), need to be recompiled.

20111122:
	The acpi_wmi(4) status device /dev/wmistat has been renamed to /dev/wmistat0.

20111108:
	The VFS_ALLOW_NONMPSAFE option has been added in order to explicitly support non-MPSAFE filesystems. It is on by default for all supported platforms at the present time.

20111101:
	The broken amd(4) driver has been replaced with esp(4) in the amd64, i386 and pc98 GENERIC kernel configuration files.

20110930:
	sysinstall has been removed.

20110923:
	The stable/9 branch was created in subversion. This corresponds to the RELENG_9 branch in CVS.

COMMON ITEMS:

	General Notes
	-------------
	Avoid using make -j when upgrading. While generally safe, there are sometimes problems using -j to upgrade. If your upgrade fails with -j, please try again without -j. From time to time in the past there have been problems using -j with buildworld and/or installworld. This is especially true when upgrading between "distant" versions (e.g. one that crosses a major release boundary or several minor releases, or when several months have passed on the -current branch).

	Sometimes, obscure build problems are the result of environment poisoning. This can happen because the make utility reads its environment when searching for values for global variables. To run your build attempts in an "environmental clean room", prefix all make commands with 'env -i '. See the env(1) manual page for more details.

	When upgrading from one major version to another it is generally best to upgrade to the latest code in the currently installed branch first, then do an upgrade to the new branch. This is the best-tested upgrade path, and has the highest probability of being successful. Please try this approach if you encounter problems with a major version upgrade. Since the stable 4.x branch point, one has generally been able to upgrade from anywhere in the most recent stable branch to head / current (or even the last couple of stable branches). See the top of this file when there's an exception.

	When upgrading a live system, having a root shell around before installing anything can help undo problems. Not having a root shell around can lead to problems if pam has changed too much from your starting point to allow continued authentication after the upgrade.

	This file should be read as a log of events. When a later event changes information of a prior event, the prior event should not be deleted. Instead, a pointer to the entry with the new information should be placed in the old entry. Readers of this file should also sanity check older entries before relying on them blindly. Authors of new entries should write them with this in mind.

	ZFS notes
	---------
	When upgrading the boot ZFS pool to a new version, always follow these two steps:

	1.) recompile and reinstall the ZFS boot loader and boot block (this is part of "make buildworld" and "make installworld")

	2.) update the ZFS boot block on your boot drive

	The following example updates the ZFS boot block on the first partition (freebsd-boot) of a GPT partitioned drive ada0:

		gpart bootcode -p /boot/gptzfsboot -i 1 ada0

	Non-boot pools do not need these updates.

	To build a kernel
	-----------------
	If you are updating from a prior version of FreeBSD (even one just a few days old), you should follow this procedure.
It is the most failsafe as it uses a /usr/obj tree with a fresh mini-buildworld, make kernel-toolchain make -DALWAYS_CHECK_MAKE buildkernel KERNCONF=YOUR_KERNEL_HERE make -DALWAYS_CHECK_MAKE installkernel KERNCONF=YOUR_KERNEL_HERE To test a kernel once --------------------- If you just want to boot a kernel once (because you are not sure if it works, or if you want to boot a known bad kernel to provide debugging information) run make installkernel KERNCONF=YOUR_KERNEL_HERE KODIR=/boot/testkernel nextboot -k testkernel To just build a kernel when you know that it won't mess you up -------------------------------------------------------------- This assumes you are already running a CURRENT system. Replace ${arch} with the architecture of your machine (e.g. "i386", "arm", "amd64", "ia64", "pc98", "sparc64", "powerpc", "mips", etc). cd src/sys/${arch}/conf config KERNEL_NAME_HERE cd ../compile/KERNEL_NAME_HERE make depend make make install If this fails, go to the "To build a kernel" section. To rebuild everything and install it on the current system. ----------------------------------------------------------- # Note: sometimes if you are running current you gotta do more than # is listed here if you are upgrading from a really old current. make buildworld make kernel KERNCONF=YOUR_KERNEL_HERE [1] [3] mergemaster -Fp [5] make installworld mergemaster -Fi [4] make delete-old [6] To cross-install current onto a separate partition -------------------------------------------------- # In this approach we use a separate partition to hold # current's root, 'usr', and 'var' directories. A partition # holding "/", "/usr" and "/var" should be about 2GB in # size. make buildworld make buildkernel KERNCONF=YOUR_KERNEL_HERE make installworld DESTDIR=${CURRENT_ROOT} -DDB_FROM_SRC make distribution DESTDIR=${CURRENT_ROOT} # if newfs'd make installkernel KERNCONF=YOUR_KERNEL_HERE DESTDIR=${CURRENT_ROOT} cp /etc/fstab ${CURRENT_ROOT}/etc/fstab # if newfs'd To upgrade in-place from stable to current ---------------------------------------------- make buildworld [9] make kernel KERNCONF=YOUR_KERNEL_HERE [8] [1] [3] mergemaster -Fp [5] make installworld mergemaster -Fi [4] make delete-old [6] Make sure that you've read the UPDATING file to understand the tweaks to various things you need. At this point in the life cycle of current, things change often and you are on your own to cope. The defaults can also change, so please read ALL of the UPDATING entries. Also, if you are tracking -current, you must be subscribed to freebsd-current@freebsd.org. Make sure that before you update your sources that you have read and understood all the recent messages there. If in doubt, please track -stable which has much fewer pitfalls. [1] If you have third party modules, such as vmware, you should disable them at this point so they don't crash your system on reboot. [3] From the bootblocks, boot -s, and then do fsck -p mount -u / mount -a cd src adjkerntz -i # if CMOS is wall time Also, when doing a major release upgrade, it is required that you boot into single user mode to do the installworld. [4] Note: This step is non-optional. Failure to do this step can result in a significant reduction in the functionality of the system. Attempting to do it by hand is not recommended and those that pursue this avenue should read this file carefully, as well as the archives of freebsd-current and freebsd-hackers mailing lists for potential gotchas. The -U option is also useful to consider. See mergemaster(8) for more information. 
[5] Usually this step is a noop. However, from time to time you may need to do this if you get unknown user in the following step. It never hurts to do it all the time. You may need to install a new mergemaster (cd src/usr.sbin/mergemaster && make install) after the buildworld before this step if you last updated from current before 20130425 or from -stable before 20130430. [6] This only deletes old files and directories. Old libraries can be deleted by "make delete-old-libs", but you have to make sure that no program is using those libraries anymore. [8] In order to have a kernel that can run the 4.x binaries needed to do an installworld, you must include the COMPAT_FREEBSD4 option in your kernel. Failure to do so may leave you with a system that is hard to boot to recover. A similar kernel option COMPAT_FREEBSD5 is required to run the 5.x binaries on more recent kernels. And so on for COMPAT_FREEBSD6 and COMPAT_FREEBSD7. Make sure that you merge any new devices from GENERIC since the last time you updated your kernel config file. [9] When checking out sources, you must include the -P flag to have cvs prune empty directories. If CPUTYPE is defined in your /etc/make.conf, make sure to use the "?=" instead of the "=" assignment operator, so that buildworld can override the CPUTYPE if it needs to. MAKEOBJDIRPREFIX must be defined in an environment variable, and not on the command line, or in /etc/make.conf. buildworld will warn if it is improperly defined. FORMAT: This file contains a list, in reverse chronological order, of major breakages in tracking -current. It is not guaranteed to be a complete list of such breakages, and only contains entries since September 23, 2011. If you need to see UPDATING entries from before that date, you will need to fetch an UPDATING file from an older FreeBSD release. Copyright information: Copyright 1998-2009 M. Warner Losh. All Rights Reserved. Redistribution, publication, translation and use, with or without modification, in full or in part, in any form or format of this document are permitted without further permission from the author. THIS DOCUMENT IS PROVIDED BY WARNER LOSH ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL WARNER LOSH BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. Contact Warner Losh if you have any questions about your use of this document. $FreeBSD$ Index: projects/runtime-coverage/bin/dd/args.c =================================================================== --- projects/runtime-coverage/bin/dd/args.c (revision 322921) +++ projects/runtime-coverage/bin/dd/args.c (revision 322922) @@ -1,508 +1,509 @@ /*- * Copyright (c) 1991, 1993, 1994 * The Regents of the University of California. All rights reserved. * * This code is derived from software contributed to Berkeley by * Keith Muller of the University of California, San Diego and Lance * Visser of Convex Computer Corporation. 
* * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ #ifndef lint #if 0 static char sccsid[] = "@(#)args.c 8.3 (Berkeley) 4/2/94"; #endif #endif /* not lint */ #include __FBSDID("$FreeBSD$"); #include +#include #include #include #include #include #include #include #include #include "dd.h" #include "extern.h" static int c_arg(const void *, const void *); static int c_conv(const void *, const void *); static void f_bs(char *); static void f_cbs(char *); static void f_conv(char *); static void f_count(char *); static void f_files(char *); static void f_fillchar(char *); static void f_ibs(char *); static void f_if(char *); static void f_obs(char *); static void f_of(char *); static void f_seek(char *); static void f_skip(char *); static void f_speed(char *); static void f_status(char *); static uintmax_t get_num(const char *); static off_t get_off_t(const char *); static const struct arg { const char *name; void (*f)(char *); u_int set, noset; } args[] = { { "bs", f_bs, C_BS, C_BS|C_IBS|C_OBS|C_OSYNC }, { "cbs", f_cbs, C_CBS, C_CBS }, { "conv", f_conv, 0, 0 }, { "count", f_count, C_COUNT, C_COUNT }, { "files", f_files, C_FILES, C_FILES }, { "fillchar", f_fillchar, C_FILL, C_FILL }, { "ibs", f_ibs, C_IBS, C_BS|C_IBS }, { "if", f_if, C_IF, C_IF }, { "iseek", f_skip, C_SKIP, C_SKIP }, { "obs", f_obs, C_OBS, C_BS|C_OBS }, { "of", f_of, C_OF, C_OF }, { "oseek", f_seek, C_SEEK, C_SEEK }, { "seek", f_seek, C_SEEK, C_SEEK }, { "skip", f_skip, C_SKIP, C_SKIP }, { "speed", f_speed, 0, 0 }, { "status", f_status, C_STATUS,C_STATUS }, }; static char *oper; /* * args -- parse JCL syntax of dd. 
*/ void jcl(char **argv) { struct arg *ap, tmp; char *arg; in.dbsz = out.dbsz = 512; while ((oper = *++argv) != NULL) { if ((oper = strdup(oper)) == NULL) errx(1, "unable to allocate space for the argument \"%s\"", *argv); if ((arg = strchr(oper, '=')) == NULL) errx(1, "unknown operand %s", oper); *arg++ = '\0'; if (!*arg) errx(1, "no value specified for %s", oper); tmp.name = oper; if (!(ap = (struct arg *)bsearch(&tmp, args, sizeof(args)/sizeof(struct arg), sizeof(struct arg), c_arg))) errx(1, "unknown operand %s", tmp.name); if (ddflags & ap->noset) errx(1, "%s: illegal argument combination or already set", tmp.name); ddflags |= ap->set; ap->f(arg); } /* Final sanity checks. */ if (ddflags & C_BS) { /* * Bs is turned off by any conversion -- we assume the user * just wanted to set both the input and output block sizes * and didn't want the bs semantics, so we don't warn. */ if (ddflags & (C_BLOCK | C_LCASE | C_SWAB | C_UCASE | C_UNBLOCK)) ddflags &= ~C_BS; /* Bs supersedes ibs and obs. */ if (ddflags & C_BS && ddflags & (C_IBS | C_OBS)) warnx("bs supersedes ibs and obs"); } /* * Ascii/ebcdic and cbs implies block/unblock. * Block/unblock requires cbs and vice-versa. */ if (ddflags & (C_BLOCK | C_UNBLOCK)) { if (!(ddflags & C_CBS)) errx(1, "record operations require cbs"); if (cbsz == 0) errx(1, "cbs cannot be zero"); cfunc = ddflags & C_BLOCK ? block : unblock; } else if (ddflags & C_CBS) { if (ddflags & (C_ASCII | C_EBCDIC)) { if (ddflags & C_ASCII) { ddflags |= C_UNBLOCK; cfunc = unblock; } else { ddflags |= C_BLOCK; cfunc = block; } } else errx(1, "cbs meaningless if not doing record operations"); } else cfunc = def; } static int c_arg(const void *a, const void *b) { return (strcmp(((const struct arg *)a)->name, ((const struct arg *)b)->name)); } static void f_bs(char *arg) { uintmax_t res; res = get_num(arg); if (res < 1 || res > SSIZE_MAX) - errx(1, "bs must be between 1 and %jd", (intmax_t)SSIZE_MAX); + errx(1, "bs must be between 1 and %zd", (ssize_t)SSIZE_MAX); in.dbsz = out.dbsz = (size_t)res; } static void f_cbs(char *arg) { uintmax_t res; res = get_num(arg); if (res < 1 || res > SSIZE_MAX) - errx(1, "cbs must be between 1 and %jd", (intmax_t)SSIZE_MAX); + errx(1, "cbs must be between 1 and %zd", (ssize_t)SSIZE_MAX); cbsz = (size_t)res; } static void f_count(char *arg) { - intmax_t res; + uintmax_t res; - res = (intmax_t)get_num(arg); - if (res < 0) - errx(1, "count cannot be negative"); + res = get_num(arg); + if (res == UINTMAX_MAX) + errc(1, ERANGE, "%s", oper); if (res == 0) - cpy_cnt = (uintmax_t)-1; + cpy_cnt = UINTMAX_MAX; else - cpy_cnt = (uintmax_t)res; + cpy_cnt = res; } static void f_files(char *arg) { files_cnt = get_num(arg); if (files_cnt < 1) - errx(1, "files must be between 1 and %jd", (uintmax_t)-1); + errx(1, "files must be between 1 and %zu", SIZE_MAX); } static void f_fillchar(char *arg) { if (strlen(arg) != 1) errx(1, "need exactly one fill char"); fill_char = arg[0]; } static void f_ibs(char *arg) { uintmax_t res; if (!(ddflags & C_BS)) { res = get_num(arg); if (res < 1 || res > SSIZE_MAX) - errx(1, "ibs must be between 1 and %jd", - (intmax_t)SSIZE_MAX); + errx(1, "ibs must be between 1 and %zd", + (ssize_t)SSIZE_MAX); in.dbsz = (size_t)res; } } static void f_if(char *arg) { in.name = arg; } static void f_obs(char *arg) { uintmax_t res; if (!(ddflags & C_BS)) { res = get_num(arg); if (res < 1 || res > SSIZE_MAX) - errx(1, "obs must be between 1 and %jd", - (intmax_t)SSIZE_MAX); + errx(1, "obs must be between 1 and %zd", + (ssize_t)SSIZE_MAX); out.dbsz = 
(size_t)res; } } static void f_of(char *arg) { out.name = arg; } static void f_seek(char *arg) { out.offset = get_off_t(arg); } static void f_skip(char *arg) { in.offset = get_off_t(arg); } static void f_speed(char *arg) { speed = get_num(arg); } static void f_status(char *arg) { if (strcmp(arg, "none") == 0) ddflags |= C_NOINFO; else if (strcmp(arg, "noxfer") == 0) ddflags |= C_NOXFER; else errx(1, "unknown status %s", arg); } static const struct conv { const char *name; u_int set, noset; const u_char *ctab; } clist[] = { { "ascii", C_ASCII, C_EBCDIC, e2a_POSIX }, { "block", C_BLOCK, C_UNBLOCK, NULL }, { "ebcdic", C_EBCDIC, C_ASCII, a2e_POSIX }, { "ibm", C_EBCDIC, C_ASCII, a2ibm_POSIX }, { "lcase", C_LCASE, C_UCASE, NULL }, { "noerror", C_NOERROR, 0, NULL }, { "notrunc", C_NOTRUNC, 0, NULL }, { "oldascii", C_ASCII, C_EBCDIC, e2a_32V }, { "oldebcdic", C_EBCDIC, C_ASCII, a2e_32V }, { "oldibm", C_EBCDIC, C_ASCII, a2ibm_32V }, { "osync", C_OSYNC, C_BS, NULL }, { "pareven", C_PAREVEN, C_PARODD|C_PARSET|C_PARNONE, NULL}, { "parnone", C_PARNONE, C_PARODD|C_PARSET|C_PAREVEN, NULL}, { "parodd", C_PARODD, C_PAREVEN|C_PARSET|C_PARNONE, NULL}, { "parset", C_PARSET, C_PARODD|C_PAREVEN|C_PARNONE, NULL}, { "sparse", C_SPARSE, 0, NULL }, { "swab", C_SWAB, 0, NULL }, { "sync", C_SYNC, 0, NULL }, { "ucase", C_UCASE, C_LCASE, NULL }, { "unblock", C_UNBLOCK, C_BLOCK, NULL }, }; static void f_conv(char *arg) { struct conv *cp, tmp; while (arg != NULL) { tmp.name = strsep(&arg, ","); cp = bsearch(&tmp, clist, sizeof(clist) / sizeof(struct conv), sizeof(struct conv), c_conv); if (cp == NULL) errx(1, "unknown conversion %s", tmp.name); if (ddflags & cp->noset) errx(1, "%s: illegal conversion combination", tmp.name); ddflags |= cp->set; if (cp->ctab) ctab = cp->ctab; } } static int c_conv(const void *a, const void *b) { return (strcmp(((const struct conv *)a)->name, ((const struct conv *)b)->name)); } static intmax_t postfix_to_mult(const char expr) { intmax_t mult; mult = 0; switch (expr) { case 'B': case 'b': mult = 512; break; case 'K': case 'k': mult = 1 << 10; break; case 'M': case 'm': mult = 1 << 20; break; case 'G': case 'g': mult = 1 << 30; break; case 'T': case 't': mult = (uintmax_t)1 << 40; break; case 'P': case 'p': mult = (uintmax_t)1 << 50; break; case 'W': case 'w': mult = sizeof(int); break; } return (mult); } /* * Convert an expression of the following forms to a uintmax_t. * 1) A positive decimal number. * 2) A positive decimal number followed by a 'b' or 'B' (mult by 512). * 3) A positive decimal number followed by a 'k' or 'K' (mult by 1 << 10). * 4) A positive decimal number followed by a 'm' or 'M' (mult by 1 << 20). * 5) A positive decimal number followed by a 'g' or 'G' (mult by 1 << 30). * 6) A positive decimal number followed by a 't' or 'T' (mult by 1 << 40). * 7) A positive decimal number followed by a 'p' or 'P' (mult by 1 << 50). * 8) A positive decimal number followed by a 'w' or 'W' (mult by sizeof int). * 9) Two or more positive decimal numbers (with/without [BbKkMmGgWw]) * separated by 'x' or 'X' (also '*' for backwards compatibility), * specifying the product of the indicated values. */ static uintmax_t get_num(const char *val) { uintmax_t num, mult, prevnum; char *expr; errno = 0; num = strtoumax(val, &expr, 0); if (expr == val) /* No valid digits. */ errx(1, "%s: invalid numeric value", oper); if (errno != 0) err(1, "%s", oper); mult = postfix_to_mult(*expr); if (mult != 0) { prevnum = num; num *= mult; /* Check for overflow. 
*/ if (num / mult != prevnum) goto erange; expr++; } switch (*expr) { case '\0': break; case '*': /* Backward compatible. */ case 'X': case 'x': mult = get_num(expr + 1); prevnum = num; num *= mult; if (num / mult == prevnum) break; erange: errx(1, "%s: %s", oper, strerror(ERANGE)); default: errx(1, "%s: illegal numeric value", oper); } return (num); } /* * Convert an expression of the following forms to an off_t. This is the * same as get_num(), but it uses signed numbers. * * The major problem here is that an off_t may not necessarily be a intmax_t. */ static off_t get_off_t(const char *val) { intmax_t num, mult, prevnum; char *expr; errno = 0; num = strtoimax(val, &expr, 0); if (expr == val) /* No valid digits. */ errx(1, "%s: invalid numeric value", oper); if (errno != 0) err(1, "%s", oper); mult = postfix_to_mult(*expr); if (mult != 0) { prevnum = num; num *= mult; /* Check for overflow. */ if ((prevnum > 0) != (num > 0) || num / mult != prevnum) goto erange; expr++; } switch (*expr) { case '\0': break; case '*': /* Backward compatible. */ case 'X': case 'x': mult = (intmax_t)get_off_t(expr + 1); prevnum = num; num *= mult; if ((prevnum > 0) == (num > 0) && num / mult == prevnum) break; erange: errx(1, "%s: %s", oper, strerror(ERANGE)); default: errx(1, "%s: illegal numeric value", oper); } return (num); } Index: projects/runtime-coverage/bin/dd/conv.c =================================================================== --- projects/runtime-coverage/bin/dd/conv.c (revision 322921) +++ projects/runtime-coverage/bin/dd/conv.c (revision 322922) @@ -1,268 +1,268 @@ /*- * Copyright (c) 1991, 1993, 1994 * The Regents of the University of California. All rights reserved. * * This code is derived from software contributed to Berkeley by * Keith Muller of the University of California, San Diego and Lance * Visser of Convex Computer Corporation. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. 
*/ #ifndef lint #if 0 static char sccsid[] = "@(#)conv.c 8.3 (Berkeley) 4/2/94"; #endif #endif /* not lint */ #include __FBSDID("$FreeBSD$"); #include #include #include #include #include "dd.h" #include "extern.h" /* * def -- * Copy input to output. Input is buffered until reaches obs, and then * output until less than obs remains. Only a single buffer is used. * Worst case buffer calculation is (ibs + obs - 1). */ void def(void) { u_char *inp; const u_char *t; size_t cnt; if ((t = ctab) != NULL) for (inp = in.dbp - (cnt = in.dbrcnt); cnt--; ++inp) *inp = t[*inp]; /* Make the output buffer look right. */ out.dbp = in.dbp; out.dbcnt = in.dbcnt; if (in.dbcnt >= out.dbsz) { /* If the output buffer is full, write it. */ dd_out(0); /* * dd_out copies the leftover output to the beginning of * the buffer and resets the output buffer. Reset the * input buffer to match it. */ in.dbp = out.dbp; in.dbcnt = out.dbcnt; } } void def_close(void) { /* Just update the count, everything is already in the buffer. */ if (in.dbcnt) out.dbcnt = in.dbcnt; } /* * Copy variable length newline terminated records with a max size cbsz * bytes to output. Records less than cbs are padded with spaces. * * max in buffer: MAX(ibs, cbsz) * max out buffer: obs + cbsz */ void block(void) { u_char *inp, *outp; const u_char *t; size_t cnt, maxlen; static int intrunc; int ch; /* * Record truncation can cross block boundaries. If currently in a * truncation state, keep tossing characters until reach a newline. * Start at the beginning of the buffer, as the input buffer is always * left empty. */ if (intrunc) { for (inp = in.db, cnt = in.dbrcnt; cnt && *inp++ != '\n'; --cnt) ; if (!cnt) { in.dbcnt = 0; in.dbp = in.db; return; } intrunc = 0; /* Adjust the input buffer numbers. */ in.dbcnt = cnt - 1; in.dbp = inp + cnt - 1; } /* * Copy records (max cbsz size chunks) into the output buffer. The * translation is done as we copy into the output buffer. */ ch = 0; for (inp = in.dbp - in.dbcnt, outp = out.dbp; in.dbcnt;) { - maxlen = MIN(cbsz, in.dbcnt); + maxlen = MIN(cbsz, (size_t)in.dbcnt); if ((t = ctab) != NULL) for (cnt = 0; cnt < maxlen && (ch = *inp++) != '\n'; ++cnt) *outp++ = t[ch]; else for (cnt = 0; cnt < maxlen && (ch = *inp++) != '\n'; ++cnt) *outp++ = ch; /* * Check for short record without a newline. Reassemble the * input block. */ - if (ch != '\n' && in.dbcnt < cbsz) { + if (ch != '\n' && (size_t)in.dbcnt < cbsz) { (void)memmove(in.db, in.dbp - in.dbcnt, in.dbcnt); break; } /* Adjust the input buffer numbers. */ in.dbcnt -= cnt; if (ch == '\n') --in.dbcnt; /* Pad short records with spaces. */ if (cnt < cbsz) (void)memset(outp, ctab ? ctab[' '] : ' ', cbsz - cnt); else { /* * If the next character wouldn't have ended the * block, it's a truncation. */ if (!in.dbcnt || *inp != '\n') ++st.trunc; /* Toss characters to a newline. */ for (; in.dbcnt && *inp++ != '\n'; --in.dbcnt) ; if (!in.dbcnt) intrunc = 1; else --in.dbcnt; } /* Adjust output buffer numbers. */ out.dbp += cbsz; if ((out.dbcnt += cbsz) >= out.dbsz) dd_out(0); outp = out.dbp; } in.dbp = in.db + in.dbcnt; } void block_close(void) { /* * Copy any remaining data into the output buffer and pad to a record. * Don't worry about truncation or translation, the input buffer is * always empty when truncating, and no characters have been added for * translation. The bottom line is that anything left in the input * buffer is a truncated record. Anything left in the output buffer * just wasn't big enough. 
*/ if (in.dbcnt) { ++st.trunc; (void)memmove(out.dbp, in.dbp - in.dbcnt, in.dbcnt); (void)memset(out.dbp + in.dbcnt, ctab ? ctab[' '] : ' ', cbsz - in.dbcnt); out.dbcnt += cbsz; } } /* * Convert fixed length (cbsz) records to variable length. Deletes any * trailing blanks and appends a newline. * * max in buffer: MAX(ibs, cbsz) + cbsz * max out buffer: obs + cbsz */ void unblock(void) { u_char *inp; const u_char *t; size_t cnt; /* Translation and case conversion. */ if ((t = ctab) != NULL) for (inp = in.dbp - (cnt = in.dbrcnt); cnt--; ++inp) *inp = t[*inp]; /* * Copy records (max cbsz size chunks) into the output buffer. The * translation has to already be done or we might not recognize the * spaces. */ - for (inp = in.db; in.dbcnt >= cbsz; inp += cbsz, in.dbcnt -= cbsz) { + for (inp = in.db; (size_t)in.dbcnt >= cbsz; inp += cbsz, in.dbcnt -= cbsz) { for (t = inp + cbsz - 1; t >= inp && *t == ' '; --t) ; if (t >= inp) { cnt = t - inp + 1; (void)memmove(out.dbp, inp, cnt); out.dbp += cnt; out.dbcnt += cnt; } *out.dbp++ = '\n'; if (++out.dbcnt >= out.dbsz) dd_out(0); } if (in.dbcnt) (void)memmove(in.db, in.dbp - in.dbcnt, in.dbcnt); in.dbp = in.db + in.dbcnt; } void unblock_close(void) { u_char *t; size_t cnt; if (in.dbcnt) { warnx("%s: short input record", in.name); for (t = in.db + in.dbcnt - 1; t >= in.db && *t == ' '; --t) ; if (t >= in.db) { cnt = t - in.db + 1; (void)memmove(out.dbp, in.db, cnt); out.dbp += cnt; out.dbcnt += cnt; } ++out.dbcnt; *out.dbp++ = '\n'; } } Index: projects/runtime-coverage/bin/dd/dd.c =================================================================== --- projects/runtime-coverage/bin/dd/dd.c (revision 322921) +++ projects/runtime-coverage/bin/dd/dd.c (revision 322922) @@ -1,592 +1,592 @@ /*- * Copyright (c) 1991, 1993, 1994 * The Regents of the University of California. All rights reserved. * * This code is derived from software contributed to Berkeley by * Keith Muller of the University of California, San Diego and Lance * Visser of Convex Computer Corporation. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. 
*/ #if 0 #ifndef lint static char const copyright[] = "@(#) Copyright (c) 1991, 1993, 1994\n\ The Regents of the University of California. All rights reserved.\n"; #endif /* not lint */ #ifndef lint static char sccsid[] = "@(#)dd.c 8.5 (Berkeley) 4/2/94"; #endif /* not lint */ #endif #include __FBSDID("$FreeBSD$"); #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include "dd.h" #include "extern.h" static void dd_close(void); static void dd_in(void); static void getfdtype(IO *); static void setup(void); IO in, out; /* input/output state */ STAT st; /* statistics */ void (*cfunc)(void); /* conversion function */ uintmax_t cpy_cnt; /* # of blocks to copy */ static off_t pending = 0; /* pending seek if sparse */ u_int ddflags = 0; /* conversion options */ size_t cbsz; /* conversion block size */ uintmax_t files_cnt = 1; /* # of files to copy */ const u_char *ctab; /* conversion table */ char fill_char; /* Character to fill with if defined */ size_t speed = 0; /* maximum speed, in bytes per second */ volatile sig_atomic_t need_summary; int main(int argc __unused, char *argv[]) { (void)setlocale(LC_CTYPE, ""); jcl(argv); setup(); caph_cache_catpages(); if (cap_enter() == -1 && errno != ENOSYS) err(1, "unable to enter capability mode"); (void)signal(SIGINFO, siginfo_handler); (void)signal(SIGINT, terminate); atexit(summary); while (files_cnt--) dd_in(); dd_close(); /* * Some devices such as cfi(4) may perform significant amounts * of work when a write descriptor is closed. Close the out * descriptor explicitly so that the summary handler (called * from an atexit() hook) includes this work. */ close(out.fd); exit(0); } static int parity(u_char c) { int i; i = c ^ (c >> 1) ^ (c >> 2) ^ (c >> 3) ^ (c >> 4) ^ (c >> 5) ^ (c >> 6) ^ (c >> 7); return (i & 1); } static void setup(void) { u_int cnt; cap_rights_t rights; unsigned long cmds[] = { FIODTYPE, MTIOCTOP }; if (in.name == NULL) { in.name = "stdin"; in.fd = STDIN_FILENO; } else { in.fd = open(in.name, O_RDONLY, 0); if (in.fd == -1) err(1, "%s", in.name); } getfdtype(&in); cap_rights_init(&rights, CAP_READ, CAP_SEEK); if (cap_rights_limit(in.fd, &rights) == -1 && errno != ENOSYS) err(1, "unable to limit capability rights"); if (files_cnt > 1 && !(in.flags & ISTAPE)) errx(1, "files is not supported for non-tape devices"); cap_rights_set(&rights, CAP_FTRUNCATE, CAP_IOCTL, CAP_WRITE); if (out.name == NULL) { /* No way to check for read access here. */ out.fd = STDOUT_FILENO; out.name = "stdout"; } else { #define OFLAGS \ (O_CREAT | (ddflags & (C_SEEK | C_NOTRUNC) ? 0 : O_TRUNC)) out.fd = open(out.name, O_RDWR | OFLAGS, DEFFILEMODE); /* * May not have read access, so try again with write only. * Without read we may have a problem if output also does * not support seeks. 
*/ if (out.fd == -1) { out.fd = open(out.name, O_WRONLY | OFLAGS, DEFFILEMODE); out.flags |= NOREAD; cap_rights_clear(&rights, CAP_READ); } if (out.fd == -1) err(1, "%s", out.name); } getfdtype(&out); if (cap_rights_limit(out.fd, &rights) == -1 && errno != ENOSYS) err(1, "unable to limit capability rights"); if (cap_ioctls_limit(out.fd, cmds, nitems(cmds)) == -1 && errno != ENOSYS) err(1, "unable to limit capability rights"); if (in.fd != STDIN_FILENO && out.fd != STDIN_FILENO) { if (caph_limit_stdin() == -1) err(1, "unable to limit capability rights"); } if (in.fd != STDOUT_FILENO && out.fd != STDOUT_FILENO) { if (caph_limit_stdout() == -1) err(1, "unable to limit capability rights"); } if (in.fd != STDERR_FILENO && out.fd != STDERR_FILENO) { if (caph_limit_stderr() == -1) err(1, "unable to limit capability rights"); } /* * Allocate space for the input and output buffers. If not doing * record oriented I/O, only need a single buffer. */ if (!(ddflags & (C_BLOCK | C_UNBLOCK))) { - if ((in.db = malloc(out.dbsz + in.dbsz - 1)) == NULL) + if ((in.db = malloc((size_t)out.dbsz + in.dbsz - 1)) == NULL) err(1, "input buffer"); out.db = in.db; - } else if ((in.db = malloc(MAX(in.dbsz, cbsz) + cbsz)) == NULL || + } else if ((in.db = malloc(MAX((size_t)in.dbsz, cbsz) + cbsz)) == NULL || (out.db = malloc(out.dbsz + cbsz)) == NULL) err(1, "output buffer"); /* dbp is the first free position in each buffer. */ in.dbp = in.db; out.dbp = out.db; /* Position the input/output streams. */ if (in.offset) pos_in(); if (out.offset) pos_out(); /* * Truncate the output file. If it fails on a type of output file * that it should _not_ fail on, error out. */ if ((ddflags & (C_OF | C_SEEK | C_NOTRUNC)) == (C_OF | C_SEEK) && out.flags & ISTRUNC) if (ftruncate(out.fd, out.offset * out.dbsz) == -1) err(1, "truncating %s", out.name); if (ddflags & (C_LCASE | C_UCASE | C_ASCII | C_EBCDIC | C_PARITY)) { if (ctab != NULL) { for (cnt = 0; cnt <= 0377; ++cnt) casetab[cnt] = ctab[cnt]; } else { for (cnt = 0; cnt <= 0377; ++cnt) casetab[cnt] = cnt; } if ((ddflags & C_PARITY) && !(ddflags & C_ASCII)) { /* * If the input is not EBCDIC, and we do parity * processing, strip input parity. */ for (cnt = 200; cnt <= 0377; ++cnt) casetab[cnt] = casetab[cnt & 0x7f]; } if (ddflags & C_LCASE) { for (cnt = 0; cnt <= 0377; ++cnt) casetab[cnt] = tolower(casetab[cnt]); } else if (ddflags & C_UCASE) { for (cnt = 0; cnt <= 0377; ++cnt) casetab[cnt] = toupper(casetab[cnt]); } if ((ddflags & C_PARITY)) { /* * This should strictly speaking be a no-op, but I * wonder what funny LANG settings could get us. 
*/ for (cnt = 0; cnt <= 0377; ++cnt) casetab[cnt] = casetab[cnt] & 0x7f; } if ((ddflags & C_PARSET)) { for (cnt = 0; cnt <= 0377; ++cnt) casetab[cnt] = casetab[cnt] | 0x80; } if ((ddflags & C_PAREVEN)) { for (cnt = 0; cnt <= 0377; ++cnt) if (parity(casetab[cnt])) casetab[cnt] = casetab[cnt] | 0x80; } if ((ddflags & C_PARODD)) { for (cnt = 0; cnt <= 0377; ++cnt) if (!parity(casetab[cnt])) casetab[cnt] = casetab[cnt] | 0x80; } ctab = casetab; } if (clock_gettime(CLOCK_MONOTONIC, &st.start)) err(1, "clock_gettime"); } static void getfdtype(IO *io) { struct stat sb; int type; if (fstat(io->fd, &sb) == -1) err(1, "%s", io->name); if (S_ISREG(sb.st_mode)) io->flags |= ISTRUNC; if (S_ISCHR(sb.st_mode) || S_ISBLK(sb.st_mode)) { if (ioctl(io->fd, FIODTYPE, &type) == -1) { err(1, "%s", io->name); } else { if (type & D_TAPE) io->flags |= ISTAPE; else if (type & (D_DISK | D_MEM)) io->flags |= ISSEEK; if (S_ISCHR(sb.st_mode) && (type & D_TAPE) == 0) io->flags |= ISCHR; } return; } errno = 0; if (lseek(io->fd, (off_t)0, SEEK_CUR) == -1 && errno == ESPIPE) io->flags |= ISPIPE; else io->flags |= ISSEEK; } /* * Limit the speed by adding a delay before every block read. * The delay (t_usleep) is equal to the time computed from block * size and the specified speed limit (t_target) minus the time * spent on actual read and write operations (t_io). */ static void speed_limit(void) { static double t_prev, t_usleep; double t_now, t_io, t_target; t_now = secs_elapsed(); t_io = t_now - t_prev - t_usleep; t_target = (double)in.dbsz / (double)speed; t_usleep = t_target - t_io; if (t_usleep > 0) usleep(t_usleep * 1000000); else t_usleep = 0; t_prev = t_now; } static void dd_in(void) { ssize_t n; for (;;) { switch (cpy_cnt) { case -1: /* count=0 was specified */ return; case 0: break; default: if (st.in_full + st.in_part >= (uintmax_t)cpy_cnt) return; break; } if (speed > 0) speed_limit(); /* * Zero the buffer first if sync; if doing block operations, * use spaces. */ if (ddflags & C_SYNC) { if (ddflags & C_FILL) memset(in.dbp, fill_char, in.dbsz); else if (ddflags & (C_BLOCK | C_UNBLOCK)) memset(in.dbp, ' ', in.dbsz); else memset(in.dbp, 0, in.dbsz); } n = read(in.fd, in.dbp, in.dbsz); if (n == 0) { in.dbrcnt = 0; return; } /* Read error. */ if (n == -1) { /* * If noerror not specified, die. POSIX requires that * the warning message be followed by an I/O display. */ if (!(ddflags & C_NOERROR)) err(1, "%s", in.name); warn("%s", in.name); summary(); /* * If it's a seekable file descriptor, seek past the * error. If your OS doesn't do the right thing for * raw disks this section should be modified to re-read * in sector size chunks. */ if (in.flags & ISSEEK && lseek(in.fd, (off_t)in.dbsz, SEEK_CUR)) warn("%s", in.name); /* If sync not specified, omit block and continue. */ if (!(ddflags & C_SYNC)) continue; /* Read errors count as full blocks. */ in.dbcnt += in.dbrcnt = in.dbsz; ++st.in_full; /* Handle full input blocks. */ - } else if ((size_t)n == in.dbsz) { + } else if ((size_t)n == (size_t)in.dbsz) { in.dbcnt += in.dbrcnt = n; ++st.in_full; /* Handle partial input blocks. */ } else { /* If sync, use the entire block. */ if (ddflags & C_SYNC) in.dbcnt += in.dbrcnt = in.dbsz; else in.dbcnt += in.dbrcnt = n; ++st.in_part; } /* * POSIX states that if bs is set and no other conversions * than noerror, notrunc or sync are specified, the block * is output without buffering as it is read. 
*/ if ((ddflags & ~(C_NOERROR | C_NOTRUNC | C_SYNC)) == C_BS) { out.dbcnt = in.dbcnt; dd_out(1); in.dbcnt = 0; continue; } if (ddflags & C_SWAB) { if ((n = in.dbrcnt) & 1) { ++st.swab; --n; } swab(in.dbp, in.dbp, (size_t)n); } in.dbp += in.dbrcnt; (*cfunc)(); if (need_summary) { summary(); } } } /* * Clean up any remaining I/O and flush output. If necessary, the output file * is truncated. */ static void dd_close(void) { if (cfunc == def) def_close(); else if (cfunc == block) block_close(); else if (cfunc == unblock) unblock_close(); if (ddflags & C_OSYNC && out.dbcnt && out.dbcnt < out.dbsz) { if (ddflags & C_FILL) memset(out.dbp, fill_char, out.dbsz - out.dbcnt); else if (ddflags & (C_BLOCK | C_UNBLOCK)) memset(out.dbp, ' ', out.dbsz - out.dbcnt); else memset(out.dbp, 0, out.dbsz - out.dbcnt); out.dbcnt = out.dbsz; } if (out.dbcnt || pending) dd_out(1); /* * If the file ends with a hole, ftruncate it to extend its size * up to the end of the hole (without having to write any data). */ if (out.seek_offset > 0 && (out.flags & ISTRUNC)) { if (ftruncate(out.fd, out.seek_offset) == -1) err(1, "truncating %s", out.name); } } void dd_out(int force) { u_char *outp; size_t cnt, i, n; ssize_t nw; static int warned; int sparse; /* * Write one or more blocks out. The common case is writing a full * output block in a single write; increment the full block stats. * Otherwise, we're into partial block writes. If a partial write, * and it's a character device, just warn. If a tape device, quit. * * The partial writes represent two cases. 1: Where the input block * was less than expected so the output block was less than expected. * 2: Where the input block was the right size but we were forced to * write the block in multiple chunks. The original versions of dd(1) * never wrote a block in more than a single write, so the latter case * never happened. * * One special case is if we're forced to do the write -- in that case * we play games with the buffer size, and it's usually a partial write. */ outp = out.db; /* * If force, first try to write all pending data, else try to write * just one block. Subsequently always write data one full block at * a time at most. */ for (n = force ? out.dbcnt : out.dbsz;; n = out.dbsz) { cnt = n; do { sparse = 0; if (ddflags & C_SPARSE) { sparse = 1; /* Is buffer sparse? */ for (i = 0; i < cnt; i++) if (outp[i] != 0) { sparse = 0; break; } } if (sparse && !force) { pending += cnt; nw = cnt; } else { if (pending != 0) { /* * Seek past hole. Note that we need to record the * reached offset, because we might have no more data * to write, in which case we'll need to call * ftruncate to extend the file size. */ out.seek_offset = lseek(out.fd, pending, SEEK_CUR); if (out.seek_offset == -1) err(2, "%s: seek error creating sparse file", out.name); pending = 0; } if (cnt) { nw = write(out.fd, outp, cnt); out.seek_offset = 0; } else { return; } } if (nw <= 0) { if (nw == 0) errx(1, "%s: end of device", out.name); if (errno != EINTR) err(1, "%s", out.name); nw = 0; } outp += nw; st.bytes += nw; - if ((size_t)nw == n && n == out.dbsz) + if ((size_t)nw == n && n == (size_t)out.dbsz) ++st.out_full; else ++st.out_part; if ((size_t) nw != cnt) { if (out.flags & ISTAPE) errx(1, "%s: short write on tape device", out.name); if (out.flags & ISCHR && !warned) { warned = 1; warnx("%s: short write on character device", out.name); } } cnt -= nw; } while (cnt != 0); if ((out.dbcnt -= n) < out.dbsz) break; } /* Reassemble the output block. 
*/ if (out.dbcnt) (void)memmove(out.db, out.dbp - out.dbcnt, out.dbcnt); out.dbp = out.db + out.dbcnt; } Index: projects/runtime-coverage/bin/dd/dd.h =================================================================== --- projects/runtime-coverage/bin/dd/dd.h (revision 322921) +++ projects/runtime-coverage/bin/dd/dd.h (revision 322922) @@ -1,103 +1,102 @@ /*- * Copyright (c) 1991, 1993, 1994 * The Regents of the University of California. All rights reserved. * * This code is derived from software contributed to Berkeley by * Keith Muller of the University of California, San Diego and Lance * Visser of Convex Computer Corporation. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * @(#)dd.h 8.3 (Berkeley) 4/2/94 * $FreeBSD$ */ /* Input/output stream state. */ typedef struct { u_char *db; /* buffer address */ u_char *dbp; /* current buffer I/O address */ - /* XXX ssize_t? */ - size_t dbcnt; /* current buffer byte count */ - size_t dbrcnt; /* last read byte count */ - size_t dbsz; /* block size */ + ssize_t dbcnt; /* current buffer byte count */ + ssize_t dbrcnt; /* last read byte count */ + ssize_t dbsz; /* block size */ #define ISCHR 0x01 /* character device (warn on short) */ #define ISPIPE 0x02 /* pipe-like (see position.c) */ #define ISTAPE 0x04 /* tape */ #define ISSEEK 0x08 /* valid to seek on */ #define NOREAD 0x10 /* not readable */ #define ISTRUNC 0x20 /* valid to ftruncate() */ u_int flags; const char *name; /* name */ int fd; /* file descriptor */ off_t offset; /* # of blocks to skip */ off_t seek_offset; /* offset of last seek past output hole */ } IO; typedef struct { uintmax_t in_full; /* # of full input blocks */ uintmax_t in_part; /* # of partial input blocks */ uintmax_t out_full; /* # of full output blocks */ uintmax_t out_part; /* # of partial output blocks */ uintmax_t trunc; /* # of truncated records */ uintmax_t swab; /* # of odd-length swab blocks */ uintmax_t bytes; /* # of bytes written */ struct timespec start; /* start time of dd */ } STAT; /* Flags (in ddflags). 
*/ #define C_ASCII 0x00000001 #define C_BLOCK 0x00000002 #define C_BS 0x00000004 #define C_CBS 0x00000008 #define C_COUNT 0x00000010 #define C_EBCDIC 0x00000020 #define C_FILES 0x00000040 #define C_IBS 0x00000080 #define C_IF 0x00000100 #define C_LCASE 0x00000200 #define C_NOERROR 0x00000400 #define C_NOTRUNC 0x00000800 #define C_OBS 0x00001000 #define C_OF 0x00002000 #define C_OSYNC 0x00004000 #define C_PAREVEN 0x00008000 #define C_PARNONE 0x00010000 #define C_PARODD 0x00020000 #define C_PARSET 0x00040000 #define C_SEEK 0x00080000 #define C_SKIP 0x00100000 #define C_SPARSE 0x00200000 #define C_SWAB 0x00400000 #define C_SYNC 0x00800000 #define C_UCASE 0x01000000 #define C_UNBLOCK 0x02000000 #define C_FILL 0x04000000 #define C_STATUS 0x08000000 #define C_NOXFER 0x10000000 #define C_NOINFO 0x20000000 #define C_PARITY (C_PAREVEN | C_PARODD | C_PARNONE | C_PARSET) Index: projects/runtime-coverage/bin/dd/position.c =================================================================== --- projects/runtime-coverage/bin/dd/position.c (revision 322921) +++ projects/runtime-coverage/bin/dd/position.c (revision 322922) @@ -1,215 +1,215 @@ /*- * Copyright (c) 1991, 1993, 1994 * The Regents of the University of California. All rights reserved. * * This code is derived from software contributed to Berkeley by * Keith Muller of the University of California, San Diego and Lance * Visser of Convex Computer Corporation. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ #ifndef lint #if 0 static char sccsid[] = "@(#)position.c 8.3 (Berkeley) 4/2/94"; #endif #endif /* not lint */ #include __FBSDID("$FreeBSD$"); #include #include #include #include #include #include #include #include #include "dd.h" #include "extern.h" static off_t seek_offset(IO *io) { off_t n; size_t sz; n = io->offset; sz = io->dbsz; _Static_assert(sizeof(io->offset) == sizeof(int64_t), "64-bit off_t"); /* * If the lseek offset will be negative, verify that this is a special * device file. Some such files (e.g. /dev/kmem) permit "negative" * offsets. * * Bail out if the calculation of a file offset would overflow. 
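 * Ordinary files are limited to OFF_MAX bytes; character devices may
 * address the full unsigned 64-bit range.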
*/ if ((io->flags & ISCHR) == 0 && (n < 0 || n > OFF_MAX / (ssize_t)sz)) errx(1, "seek offsets cannot be larger than %jd", (intmax_t)OFF_MAX); else if ((io->flags & ISCHR) != 0 && (uint64_t)n > UINT64_MAX / sz) errx(1, "seek offsets cannot be larger than %ju", (uintmax_t)UINT64_MAX); return ((off_t)( (uint64_t)n * sz )); } /* * Position input/output data streams before starting the copy. Device type * dependent. Seekable devices use lseek, and the rest position by reading. * Seeking past the end of file can cause null blocks to be written to the * output. */ void pos_in(void) { off_t cnt; int warned; ssize_t nr; size_t bcnt; /* If known to be seekable, try to seek on it. */ if (in.flags & ISSEEK) { errno = 0; if (lseek(in.fd, seek_offset(&in), SEEK_CUR) == -1 && errno != 0) err(1, "%s", in.name); return; } /* Don't try to read a really weird amount (like negative). */ if (in.offset < 0) errx(1, "%s: illegal offset", "iseek/skip"); /* * Read the data. If a pipe, read until satisfy the number of bytes * being skipped. No differentiation for reading complete and partial * blocks for other devices. */ for (bcnt = in.dbsz, cnt = in.offset, warned = 0; cnt;) { if ((nr = read(in.fd, in.db, bcnt)) > 0) { if (in.flags & ISPIPE) { if (!(bcnt -= nr)) { bcnt = in.dbsz; --cnt; } } else --cnt; if (need_summary) summary(); continue; } if (nr == 0) { if (files_cnt > 1) { --files_cnt; continue; } errx(1, "skip reached end of input"); } /* * Input error -- either EOF with no more files, or I/O error. * If noerror not set die. POSIX requires that the warning * message be followed by an I/O display. */ if (ddflags & C_NOERROR) { if (!warned) { warn("%s", in.name); warned = 1; summary(); } continue; } err(1, "%s", in.name); } } void pos_out(void) { struct mtop t_op; off_t cnt; ssize_t n; /* * If not a tape, try seeking on the file. Seeking on a pipe is * going to fail, but don't protect the user -- they shouldn't * have specified the seek operand. */ if (out.flags & (ISSEEK | ISPIPE)) { errno = 0; if (lseek(out.fd, seek_offset(&out), SEEK_CUR) == -1 && errno != 0) err(1, "%s", out.name); return; } /* Don't try to read a really weird amount (like negative). */ if (out.offset < 0) errx(1, "%s: illegal offset", "oseek/seek"); /* If no read access, try using mtio. */ if (out.flags & NOREAD) { t_op.mt_op = MTFSR; t_op.mt_count = out.offset; if (ioctl(out.fd, MTIOCTOP, &t_op) == -1) err(1, "%s", out.name); return; } /* Read it. */ for (cnt = 0; cnt < out.offset; ++cnt) { if ((n = read(out.fd, out.db, out.dbsz)) > 0) continue; if (n == -1) err(1, "%s", out.name); /* * If reach EOF, fill with NUL characters; first, back up over * the EOF mark. Note, cnt has not yet been incremented, so * the EOF read does not count as a seek'd block. 
*/ t_op.mt_op = MTBSR; t_op.mt_count = 1; if (ioctl(out.fd, MTIOCTOP, &t_op) == -1) err(1, "%s", out.name); while (cnt++ < out.offset) { n = write(out.fd, out.db, out.dbsz); if (n == -1) err(1, "%s", out.name); - if ((size_t)n != out.dbsz) + if (n != out.dbsz) errx(1, "%s: write failure", out.name); } break; } } Index: projects/runtime-coverage/contrib/compiler-rt/lib/builtins/int_lib.h =================================================================== --- projects/runtime-coverage/contrib/compiler-rt/lib/builtins/int_lib.h (revision 322921) +++ projects/runtime-coverage/contrib/compiler-rt/lib/builtins/int_lib.h (revision 322922) @@ -1,156 +1,157 @@ /* ===-- int_lib.h - configuration header for compiler-rt -----------------=== * * The LLVM Compiler Infrastructure * * This file is dual licensed under the MIT and the University of Illinois Open * Source Licenses. See LICENSE.TXT for details. * * ===----------------------------------------------------------------------=== * * This file is a configuration header for compiler-rt. * This file is not part of the interface of this library. * * ===----------------------------------------------------------------------=== */ #ifndef INT_LIB_H #define INT_LIB_H /* Assumption: Signed integral is 2's complement. */ /* Assumption: Right shift of signed negative is arithmetic shift. */ /* Assumption: Endianness is little or big (not mixed). */ #if defined(__ELF__) #define FNALIAS(alias_name, original_name) \ void alias_name() __attribute__((alias(#original_name))) #else #define FNALIAS(alias, name) _Pragma("GCC error(\"alias unsupported on this file format\")") #endif /* ABI macro definitions */ #if __ARM_EABI__ # if defined(COMPILER_RT_ARMHF_TARGET) || (!defined(__clang__) && \ defined(__GNUC__) && (__GNUC__ < 4 || __GNUC__ == 4 && __GNUC_MINOR__ < 5)) /* The pcs attribute was introduced in GCC 4.5.0 */ # define COMPILER_RT_ABI # else # define COMPILER_RT_ABI __attribute__((__pcs__("aapcs"))) # endif #else # define COMPILER_RT_ABI #endif #define AEABI_RTABI __attribute__((__pcs__("aapcs"))) #ifdef _MSC_VER #define ALWAYS_INLINE __forceinline #define NOINLINE __declspec(noinline) #define NORETURN __declspec(noreturn) #define UNUSED #else #define ALWAYS_INLINE __attribute__((always_inline)) #define NOINLINE __attribute__((noinline)) #define NORETURN __attribute__((noreturn)) #define UNUSED __attribute__((unused)) #endif #if defined(__NetBSD__) && (defined(_KERNEL) || defined(_STANDALONE)) /* * Kernel and boot environment can't use normal headers, * so use the equivalent system headers. */ # include # include # include #else /* Include the standard compiler builtin headers we use functionality from. */ # include # include # include # include #endif /* Include the commonly used internal type definitions. */ #include "int_types.h" /* Include internal utility function declarations. */ #include "int_util.h" /* * Workaround for LLVM bug 11663. Prevent endless recursion in * __c?zdi2(), where calls to __builtin_c?z() are expanded to * __c?zdi2() instead of __c?zsi2(). * * Instead of placing this workaround in c?zdi2.c, put it in this * global header to prevent other C files from making the detour * through __c?zdi2() as well. * * This problem has been observed on FreeBSD for sparc64 and * mips64 with GCC 4.2.1, and for riscv with GCC 5.2.0. * Presumably it's any version of GCC, and targeting an arch that * does not have dedicated bit counting instructions. 
*/ #if defined(__FreeBSD__) && (defined(__sparc64__) || \ - defined(__mips_n64) || defined(__mips_o64) || defined(__riscv)) + defined(__mips_n32) || defined(__mips_n64) || defined(__mips_o64) || \ + defined(__riscv)) si_int __clzsi2(si_int); si_int __ctzsi2(si_int); #define __builtin_clz __clzsi2 #define __builtin_ctz __ctzsi2 -#endif /* FreeBSD && (sparc64 || mips_n64 || mips_o64) */ +#endif /* FreeBSD && (sparc64 || mips_n32 || mips_n64 || mips_o64 || riscv) */ COMPILER_RT_ABI si_int __paritysi2(si_int a); COMPILER_RT_ABI si_int __paritydi2(di_int a); COMPILER_RT_ABI di_int __divdi3(di_int a, di_int b); COMPILER_RT_ABI si_int __divsi3(si_int a, si_int b); COMPILER_RT_ABI su_int __udivsi3(su_int n, su_int d); COMPILER_RT_ABI su_int __udivmodsi4(su_int a, su_int b, su_int* rem); COMPILER_RT_ABI du_int __udivmoddi4(du_int a, du_int b, du_int* rem); #ifdef CRT_HAS_128BIT COMPILER_RT_ABI si_int __clzti2(ti_int a); COMPILER_RT_ABI tu_int __udivmodti4(tu_int a, tu_int b, tu_int* rem); #endif /* Definitions for builtins unavailable on MSVC */ #if defined(_MSC_VER) && !defined(__clang__) #include uint32_t __inline __builtin_ctz(uint32_t value) { unsigned long trailing_zero = 0; if (_BitScanForward(&trailing_zero, value)) return trailing_zero; return 32; } uint32_t __inline __builtin_clz(uint32_t value) { unsigned long leading_zero = 0; if (_BitScanReverse(&leading_zero, value)) return 31 - leading_zero; return 32; } #if defined(_M_ARM) || defined(_M_X64) uint32_t __inline __builtin_clzll(uint64_t value) { unsigned long leading_zero = 0; if (_BitScanReverse64(&leading_zero, value)) return 63 - leading_zero; return 64; } #else uint32_t __inline __builtin_clzll(uint64_t value) { if (value == 0) return 64; uint32_t msh = (uint32_t)(value >> 32); uint32_t lsh = (uint32_t)(value & 0xFFFFFFFF); if (msh != 0) return __builtin_clz(msh); return 32 + __builtin_clz(lsh); } #endif #define __builtin_clzl __builtin_clzll #endif /* defined(_MSC_VER) && !defined(__clang__) */ #endif /* INT_LIB_H */ Index: projects/runtime-coverage/contrib/compiler-rt =================================================================== --- projects/runtime-coverage/contrib/compiler-rt (revision 322921) +++ projects/runtime-coverage/contrib/compiler-rt (revision 322922) Property changes on: projects/runtime-coverage/contrib/compiler-rt ___________________________________________________________________ Modified: svn:mergeinfo ## -0,0 +0,1 ## Merged /head/contrib/compiler-rt:r322871-322921 Index: projects/runtime-coverage/lib/libc/gen/getmntinfo.c =================================================================== --- projects/runtime-coverage/lib/libc/gen/getmntinfo.c (revision 322921) +++ projects/runtime-coverage/lib/libc/gen/getmntinfo.c (revision 322922) @@ -1,66 +1,70 @@ /* * Copyright (c) 1989, 1993 * The Regents of the University of California. All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. 
Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ #if defined(LIBC_SCCS) && !defined(lint) static char sccsid[] = "@(#)getmntinfo.c 8.1 (Berkeley) 6/4/93"; #endif /* LIBC_SCCS and not lint */ #include __FBSDID("$FreeBSD$"); #include #include #include #include +#define MAX_TRIES 3 +#define SCALING_FACTOR 2 + /* * Return information about mounted filesystems. */ int getmntinfo(struct statfs **mntbufp, int mode) { static struct statfs *mntbuf; static int mntsize; static long bufsize; + unsigned tries = 0; if (mntsize <= 0 && (mntsize = getfsstat(0, 0, MNT_NOWAIT)) < 0) return (0); if (bufsize > 0 && (mntsize = getfsstat(mntbuf, bufsize, mode)) < 0) return (0); - while (bufsize <= mntsize * sizeof(struct statfs)) { - if (mntbuf) - free(mntbuf); - bufsize = (mntsize + 1) * sizeof(struct statfs); - if ((mntbuf = malloc(bufsize)) == NULL) + while (tries++ < MAX_TRIES && bufsize <= mntsize * sizeof(*mntbuf)) { + bufsize = (mntsize * SCALING_FACTOR) * sizeof(*mntbuf); + if ((mntbuf = reallocf(mntbuf, bufsize)) == NULL) return (0); if ((mntsize = getfsstat(mntbuf, bufsize, mode)) < 0) return (0); } *mntbufp = mntbuf; + if (mntsize > (bufsize / sizeof(*mntbuf))) + return (bufsize / sizeof(*mntbuf)); return (mntsize); } Index: projects/runtime-coverage/lib/libc/tests/gen/Makefile =================================================================== --- projects/runtime-coverage/lib/libc/tests/gen/Makefile (revision 322921) +++ projects/runtime-coverage/lib/libc/tests/gen/Makefile (revision 322922) @@ -1,92 +1,93 @@ # $FreeBSD$ .include ATF_TESTS_C+= arc4random_test ATF_TESTS_C+= fmtcheck2_test ATF_TESTS_C+= fmtmsg_test ATF_TESTS_C+= fnmatch2_test ATF_TESTS_C+= fpclassify2_test ATF_TESTS_C+= ftw_test +ATF_TESTS_C+= getmntinfo_test ATF_TESTS_C+= glob2_test ATF_TESTS_C+= popen_test ATF_TESTS_C+= posix_spawn_test ATF_TESTS_C+= wordexp_test ATF_TESTS_C+= dlopen_empty_test ATF_TESTS_C+= realpath2_test # TODO: t_closefrom, t_cpuset, t_fmtcheck, t_randomid, # TODO: t_siginfo (fixes require further inspection) # TODO: t_sethostname_test (consistently screws up the hostname) CFLAGS+= -DTEST_LONG_DOUBLE # Not sure why this isn't defined for all architectures, since most # have long double. 
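The getmntinfo(3) rework earlier in this revision bounds the buffer-growth loop with MAX_TRIES and switches it to reallocf(); the interface seen by callers is unchanged. As an illustrative sketch only (not part of this change), a typical consumer of the documented libc interface looks like the following; the statfs array is owned and reused by libc, so it must not be freed by the caller.

#include <sys/param.h>
#include <sys/ucred.h>
#include <sys/mount.h>
#include <err.h>
#include <stdio.h>

int
main(void)
{
	struct statfs *mntbuf;
	int i, n;

	/* MNT_NOWAIT returns possibly stale statistics without blocking. */
	n = getmntinfo(&mntbuf, MNT_NOWAIT);
	if (n == 0)
		err(1, "getmntinfo");
	for (i = 0; i < n; i++)
		printf("%s on %s (%s)\n", mntbuf[i].f_mntfromname,
		    mntbuf[i].f_mntonname, mntbuf[i].f_fstypename);
	return (0);
}

Because the buffer is reused across calls, a later call to getmntinfo(3) may invalidate pointers obtained from an earlier one.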
.if ${MACHINE_CPUARCH} == "aarch64" || \ ${MACHINE_CPUARCH} == "amd64" || \ ${MACHINE_CPUARCH} == "i386" CFLAGS+= -D__HAVE_LONG_DOUBLE .endif NETBSD_ATF_TESTS_C= alarm_test NETBSD_ATF_TESTS_C+= assert_test NETBSD_ATF_TESTS_C+= basedirname_test NETBSD_ATF_TESTS_C+= dir_test NETBSD_ATF_TESTS_C+= floatunditf_test NETBSD_ATF_TESTS_C+= fnmatch_test NETBSD_ATF_TESTS_C+= fpclassify_test NETBSD_ATF_TESTS_C+= fpsetmask_test NETBSD_ATF_TESTS_C+= fpsetround_test NETBSD_ATF_TESTS_C+= ftok_test NETBSD_ATF_TESTS_C+= getcwd_test NETBSD_ATF_TESTS_C+= getgrent_test NETBSD_ATF_TESTS_C+= glob_test NETBSD_ATF_TESTS_C+= humanize_number_test NETBSD_ATF_TESTS_C+= isnan_test NETBSD_ATF_TESTS_C+= nice_test NETBSD_ATF_TESTS_C+= pause_test NETBSD_ATF_TESTS_C+= raise_test NETBSD_ATF_TESTS_C+= realpath_test NETBSD_ATF_TESTS_C+= setdomainname_test NETBSD_ATF_TESTS_C+= sethostname_test NETBSD_ATF_TESTS_C+= sleep_test NETBSD_ATF_TESTS_C+= syslog_test NETBSD_ATF_TESTS_C+= time_test NETBSD_ATF_TESTS_C+= ttyname_test NETBSD_ATF_TESTS_C+= vis_test .include "../Makefile.netbsd-tests" LIBADD.humanize_number_test+= util LIBADD.fpclassify_test+=m LIBADD.fpsetround_test+=m LIBADD.siginfo_test+= m LIBADD.nice_test+= pthread LIBADD.syslog_test+= pthread CFLAGS+= -I${.CURDIR} SRCS.fmtcheck2_test= fmtcheck_test.c SRCS.fnmatch2_test= fnmatch_test.c TEST_METADATA.setdomainname_test+= is_exclusive=true TESTS_SUBDIRS= execve TESTS_SUBDIRS+= posix_spawn # The old testcase name TEST_FNMATCH= test-fnmatch CLEANFILES+= ${GEN_SH_CASE_TESTCASES} sh-tests: .PHONY .for target in clean obj depend all @cd ${.CURDIR} && ${MAKE} PROG=${TEST_FNMATCH} \ -DNO_SUBDIR ${target} .endfor @cd ${.OBJDIR} && ./${TEST_FNMATCH} -s 1 > \ ${SRCTOP}/bin/sh/tests/builtins/case2.0 @cd ${.OBJDIR} && ./${TEST_FNMATCH} -s 2 > \ ${SRCTOP}/bin/sh/tests/builtins/case3.0 .include Index: projects/runtime-coverage/lib/libc/tests/gen/getmntinfo_test.c =================================================================== --- projects/runtime-coverage/lib/libc/tests/gen/getmntinfo_test.c (nonexistent) +++ projects/runtime-coverage/lib/libc/tests/gen/getmntinfo_test.c (revision 322922) @@ -0,0 +1,86 @@ +/*- + * Copyright (c) 2017 Conrad Meyer + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. 
+ */ + +/* + * Limited test program for getmntinfo(3), a non-standard BSDism. + */ + +#include +__FBSDID("$FreeBSD$"); + +#include +#include +#include + +#include + +#include + +static void +check_mntinfo(struct statfs *mntinfo, int n) +{ + int i; + + for (i = 0; i < n; i++) { + ATF_REQUIRE_MSG(mntinfo[i].f_version == STATFS_VERSION, "%ju", + (uintmax_t)mntinfo[i].f_version); + ATF_REQUIRE(mntinfo[i].f_namemax <= sizeof(mntinfo[0].f_mntonname)); + } +} + +ATF_TC_WITHOUT_HEAD(getmntinfo_test); +ATF_TC_BODY(getmntinfo_test, tc) +{ + int nmnts; + struct statfs *mntinfo; + + /* Test bogus mode */ + nmnts = getmntinfo(&mntinfo, 199); + ATF_REQUIRE_MSG(nmnts == 0 && errno == EINVAL, + "getmntinfo() succeeded; errno=%d", errno); + + /* Valid modes */ + nmnts = getmntinfo(&mntinfo, MNT_NOWAIT); + ATF_REQUIRE_MSG(nmnts != 0, "getmntinfo(MNT_NOWAIT) failed; errno=%d", + errno); + + check_mntinfo(mntinfo, nmnts); + memset(mntinfo, 0xdf, sizeof(*mntinfo) * nmnts); + + nmnts = getmntinfo(&mntinfo, MNT_WAIT); + ATF_REQUIRE_MSG(nmnts != 0, "getmntinfo(MNT_WAIT) failed; errno=%d", + errno); + + check_mntinfo(mntinfo, nmnts); +} + +ATF_TP_ADD_TCS(tp) +{ + + ATF_TP_ADD_TC(tp, getmntinfo_test); + + return (atf_no_error()); +} Property changes on: projects/runtime-coverage/lib/libc/tests/gen/getmntinfo_test.c ___________________________________________________________________ Added: svn:eol-style ## -0,0 +1 ## +native \ No newline at end of property Added: svn:keywords ## -0,0 +1 ## +FreeBSD=%H \ No newline at end of property Added: svn:mime-type ## -0,0 +1 ## +text/plain \ No newline at end of property Index: projects/runtime-coverage/lib/msun/tests/trig_test.c =================================================================== --- projects/runtime-coverage/lib/msun/tests/trig_test.c (revision 322921) +++ projects/runtime-coverage/lib/msun/tests/trig_test.c (revision 322922) @@ -1,285 +1,280 @@ /*- * Copyright (c) 2008 David Schultz * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ /* * Tests for corner cases in trigonometric functions. Some accuracy tests * are included as well, but these are very basic sanity checks, not * intended to be comprehensive. * * The program for generating representable numbers near multiples of pi is * available at http://www.cs.berkeley.edu/~wkahan/testpi/ . 
*/ #include __FBSDID("$FreeBSD$"); #include #include #include #include #include #include #include #include "test-utils.h" #pragma STDC FENV_ACCESS ON /* * Test that a function returns the correct value and sets the * exception flags correctly. The exceptmask specifies which * exceptions we should check. We need to be lenient for several * reasons, but mainly because on some architectures it's impossible * to raise FE_OVERFLOW without raising FE_INEXACT. * * These are macros instead of functions so that assert provides more * meaningful error messages. * * XXX The volatile here is to avoid gcc's bogus constant folding and work * around the lack of support for the FENV_ACCESS pragma. */ #define test(func, x, result, exceptmask, excepts) do { \ volatile long double _d = x; \ ATF_CHECK(feclearexcept(FE_ALL_EXCEPT) == 0); \ ATF_CHECK(fpequal((func)(_d), (result))); \ ATF_CHECK(((void)(func), fetestexcept(exceptmask) == (excepts))); \ } while (0) #define testall(prefix, x, result, exceptmask, excepts) do { \ test(prefix, x, (double)result, exceptmask, excepts); \ test(prefix##f, x, (float)result, exceptmask, excepts); \ test(prefix##l, x, result, exceptmask, excepts); \ } while (0) #define testdf(prefix, x, result, exceptmask, excepts) do { \ test(prefix, x, (double)result, exceptmask, excepts); \ test(prefix##f, x, (float)result, exceptmask, excepts); \ } while (0) ATF_TC(special); ATF_TC_HEAD(special, tc) { atf_tc_set_md_var(tc, "descr", "test special cases in sin(), cos(), and tan()"); } ATF_TC_BODY(special, tc) { /* Values at 0 should be exact. */ testall(tan, 0.0, 0.0, ALL_STD_EXCEPT, 0); testall(tan, -0.0, -0.0, ALL_STD_EXCEPT, 0); testall(cos, 0.0, 1.0, ALL_STD_EXCEPT, 0); testall(cos, -0.0, 1.0, ALL_STD_EXCEPT, 0); testall(sin, 0.0, 0.0, ALL_STD_EXCEPT, 0); testall(sin, -0.0, -0.0, ALL_STD_EXCEPT, 0); /* func(+-Inf) == NaN */ testall(tan, INFINITY, NAN, ALL_STD_EXCEPT, FE_INVALID); testall(sin, INFINITY, NAN, ALL_STD_EXCEPT, FE_INVALID); testall(cos, INFINITY, NAN, ALL_STD_EXCEPT, FE_INVALID); testall(tan, -INFINITY, NAN, ALL_STD_EXCEPT, FE_INVALID); testall(sin, -INFINITY, NAN, ALL_STD_EXCEPT, FE_INVALID); testall(cos, -INFINITY, NAN, ALL_STD_EXCEPT, FE_INVALID); /* func(NaN) == NaN */ testall(tan, NAN, NAN, ALL_STD_EXCEPT, 0); testall(sin, NAN, NAN, ALL_STD_EXCEPT, 0); testall(cos, NAN, NAN, ALL_STD_EXCEPT, 0); } #ifndef __i386__ ATF_TC(reduction); ATF_TC_HEAD(reduction, tc) { atf_tc_set_md_var(tc, "descr", "tests to ensure argument reduction for large arguments is accurate"); } ATF_TC_BODY(reduction, tc) { /* floats very close to odd multiples of pi */ static const float f_pi_odd[] = { 85563208.0f, 43998769152.0f, 9.2763667655669323e+25f, 1.5458357838905804e+29f, }; /* doubles very close to odd multiples of pi */ static const double d_pi_odd[] = { 3.1415926535897931, 91.106186954104004, 642615.9188844458, 3397346.5699258847, 6134899525417045.0, 3.0213551960457761e+43, 1.2646209897993783e+295, 6.2083625380677099e+307, }; /* long doubles very close to odd multiples of pi */ #if LDBL_MANT_DIG == 64 static const long double ld_pi_odd[] = { 1.1891886960373841596e+101L, 1.07999475322710967206e+2087L, 6.522151627890431836e+2147L, 8.9368974898260328229e+2484L, 9.2961044110572205863e+2555L, 4.90208421886578286e+3189L, 1.5275546401232615884e+3317L, 1.7227465626338900093e+3565L, 2.4160090594000745334e+3808L, 9.8477555741888350649e+4314L, 1.6061597222105160737e+4326L, }; #endif -#if defined(__clang__) && \ - ((__clang_major__ >= 5)) - atf_tc_expect_fail("test fails with clang 5.0+ - bug 220989"); 
-#endif - unsigned i; for (i = 0; i < nitems(f_pi_odd); i++) { ATF_CHECK(fabs(sinf(f_pi_odd[i])) < FLT_EPSILON); ATF_CHECK(cosf(f_pi_odd[i]) == -1.0); ATF_CHECK(fabs(tan(f_pi_odd[i])) < FLT_EPSILON); ATF_CHECK(fabs(sinf(-f_pi_odd[i])) < FLT_EPSILON); ATF_CHECK(cosf(-f_pi_odd[i]) == -1.0); ATF_CHECK(fabs(tanf(-f_pi_odd[i])) < FLT_EPSILON); ATF_CHECK(fabs(sinf(f_pi_odd[i] * 2)) < FLT_EPSILON); ATF_CHECK(cosf(f_pi_odd[i] * 2) == 1.0); ATF_CHECK(fabs(tanf(f_pi_odd[i] * 2)) < FLT_EPSILON); ATF_CHECK(fabs(sinf(-f_pi_odd[i] * 2)) < FLT_EPSILON); ATF_CHECK(cosf(-f_pi_odd[i] * 2) == 1.0); ATF_CHECK(fabs(tanf(-f_pi_odd[i] * 2)) < FLT_EPSILON); } for (i = 0; i < nitems(d_pi_odd); i++) { ATF_CHECK(fabs(sin(d_pi_odd[i])) < 2 * DBL_EPSILON); ATF_CHECK(cos(d_pi_odd[i]) == -1.0); ATF_CHECK(fabs(tan(d_pi_odd[i])) < 2 * DBL_EPSILON); ATF_CHECK(fabs(sin(-d_pi_odd[i])) < 2 * DBL_EPSILON); ATF_CHECK(cos(-d_pi_odd[i]) == -1.0); ATF_CHECK(fabs(tan(-d_pi_odd[i])) < 2 * DBL_EPSILON); ATF_CHECK(fabs(sin(d_pi_odd[i] * 2)) < 2 * DBL_EPSILON); ATF_CHECK(cos(d_pi_odd[i] * 2) == 1.0); ATF_CHECK(fabs(tan(d_pi_odd[i] * 2)) < 2 * DBL_EPSILON); ATF_CHECK(fabs(sin(-d_pi_odd[i] * 2)) < 2 * DBL_EPSILON); ATF_CHECK(cos(-d_pi_odd[i] * 2) == 1.0); ATF_CHECK(fabs(tan(-d_pi_odd[i] * 2)) < 2 * DBL_EPSILON); } #if LDBL_MANT_DIG == 64 /* XXX: || LDBL_MANT_DIG == 113 */ for (i = 0; i < nitems(ld_pi_odd); i++) { ATF_CHECK(fabsl(sinl(ld_pi_odd[i])) < LDBL_EPSILON); ATF_CHECK(cosl(ld_pi_odd[i]) == -1.0); ATF_CHECK(fabsl(tanl(ld_pi_odd[i])) < LDBL_EPSILON); ATF_CHECK(fabsl(sinl(-ld_pi_odd[i])) < LDBL_EPSILON); ATF_CHECK(cosl(-ld_pi_odd[i]) == -1.0); ATF_CHECK(fabsl(tanl(-ld_pi_odd[i])) < LDBL_EPSILON); ATF_CHECK(fabsl(sinl(ld_pi_odd[i] * 2)) < LDBL_EPSILON); ATF_CHECK(cosl(ld_pi_odd[i] * 2) == 1.0); ATF_CHECK(fabsl(tanl(ld_pi_odd[i] * 2)) < LDBL_EPSILON); ATF_CHECK(fabsl(sinl(-ld_pi_odd[i] * 2)) < LDBL_EPSILON); ATF_CHECK(cosl(-ld_pi_odd[i] * 2) == 1.0); ATF_CHECK(fabsl(tanl(-ld_pi_odd[i] * 2)) < LDBL_EPSILON); } #endif } ATF_TC(accuracy); ATF_TC_HEAD(accuracy, tc) { atf_tc_set_md_var(tc, "descr", "tests the accuracy of these functions over the primary range"); } ATF_TC_BODY(accuracy, tc) { /* For small args, sin(x) = tan(x) = x, and cos(x) = 1. 
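 * At such magnitudes the result equals the argument (1.0 for cos) in
 * every tested precision, and only the inexact exception is expected.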
*/ testall(sin, 0xd.50ee515fe4aea16p-114L, 0xd.50ee515fe4aea16p-114L, ALL_STD_EXCEPT, FE_INEXACT); testall(tan, 0xd.50ee515fe4aea16p-114L, 0xd.50ee515fe4aea16p-114L, ALL_STD_EXCEPT, FE_INEXACT); testall(cos, 0xd.50ee515fe4aea16p-114L, 1.0, ALL_STD_EXCEPT, FE_INEXACT); /* * These tests should pass for f32, d64, and ld80 as long as * the error is <= 0.75 ulp (round to nearest) */ #if LDBL_MANT_DIG <= 64 #define testacc testall #else #define testacc testdf #endif testacc(sin, 0.17255452780841205174L, 0.17169949801444412683L, ALL_STD_EXCEPT, FE_INEXACT); testacc(sin, -0.75431944555904520893L, -0.68479288156557286353L, ALL_STD_EXCEPT, FE_INEXACT); testacc(cos, 0.70556358769838947292L, 0.76124620693117771850L, ALL_STD_EXCEPT, FE_INEXACT); testacc(cos, -0.34061437849088045332L, 0.94254960031831729956L, ALL_STD_EXCEPT, FE_INEXACT); testacc(tan, -0.15862817413325692897L, -0.15997221861309522115L, ALL_STD_EXCEPT, FE_INEXACT); testacc(tan, 0.38374784931303813530L, 0.40376500259976759951L, ALL_STD_EXCEPT, FE_INEXACT); /* * XXX missing: * - tests for ld128 * - tests for other rounding modes (probably won't pass for now) * - tests for large numbers that get reduced to hi+lo with lo!=0 */ } #endif ATF_TP_ADD_TCS(tp) { ATF_TP_ADD_TC(tp, special); #ifndef __i386__ ATF_TP_ADD_TC(tp, accuracy); ATF_TP_ADD_TC(tp, reduction); #endif return (atf_no_error()); } Index: projects/runtime-coverage/sys/boot/efi/loader/arch/amd64/Makefile.inc =================================================================== --- projects/runtime-coverage/sys/boot/efi/loader/arch/amd64/Makefile.inc (revision 322921) +++ projects/runtime-coverage/sys/boot/efi/loader/arch/amd64/Makefile.inc (revision 322922) @@ -1,15 +1,16 @@ # $FreeBSD$ SRCS+= amd64_tramp.S \ start.S \ framebuffer.c \ elf64_freebsd.c \ trap.c \ exc.S .PATH: ${.CURDIR}/../../i386/libi386 SRCS+= nullconsole.c \ - comconsole.c + comconsole.c \ + spinconsole.c -CFLAGS+= -fPIC +CFLAGS+= -fPIC -DTERM_EMU LDFLAGS+= -Wl,-znocombreloc Index: projects/runtime-coverage/sys/boot/efi/loader/arch/i386/Makefile.inc =================================================================== --- projects/runtime-coverage/sys/boot/efi/loader/arch/i386/Makefile.inc (revision 322921) +++ projects/runtime-coverage/sys/boot/efi/loader/arch/i386/Makefile.inc (revision 322922) @@ -1,13 +1,14 @@ # $FreeBSD$ SRCS+= start.S \ efimd.c \ elf32_freebsd.c \ exec.c .PATH: ${.CURDIR}/../../i386/libi386 SRCS+= nullconsole.c \ - comconsole.c + comconsole.c \ + spinconsole.c -CFLAGS+= -fPIC +CFLAGS+= -fPIC -DTERM_EMU LDFLAGS+= -Wl,-znocombreloc Index: projects/runtime-coverage/sys/boot/efi/loader/conf.c =================================================================== --- projects/runtime-coverage/sys/boot/efi/loader/conf.c (revision 322921) +++ projects/runtime-coverage/sys/boot/efi/loader/conf.c (revision 322922) @@ -1,81 +1,83 @@ /*- * Copyright (c) 2006 Marcel Moolenaar * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. 
* * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. */ #include __FBSDID("$FreeBSD$"); #include #include #include #include #ifdef EFI_ZFS_BOOT #include #endif struct devsw *devsw[] = { &efipart_fddev, &efipart_cddev, &efipart_hddev, &efinet_dev, #ifdef EFI_ZFS_BOOT &zfs_dev, #endif NULL }; struct fs_ops *file_system[] = { #ifdef EFI_ZFS_BOOT &zfs_fsops, #endif &dosfs_fsops, &ufs_fsops, &cd9660_fsops, &tftp_fsops, &nfs_fsops, &gzipfs_fsops, &bzipfs_fsops, NULL }; struct netif_driver *netif_drivers[] = { &efinetif, NULL }; extern struct console efi_console; #if defined(__amd64__) || defined(__i386__) extern struct console comconsole; extern struct console nullconsole; +extern struct console spinconsole; #endif struct console *consoles[] = { &efi_console, #if defined(__amd64__) || defined(__i386__) &comconsole, &nullconsole, + &spinconsole, #endif NULL }; Index: projects/runtime-coverage/sys/boot/i386/libi386/spinconsole.c =================================================================== --- projects/runtime-coverage/sys/boot/i386/libi386/spinconsole.c (revision 322921) +++ projects/runtime-coverage/sys/boot/i386/libi386/spinconsole.c (revision 322922) @@ -1,108 +1,112 @@ /*- * spinconsole.c * * Author: Maksym Sobolyev * Copyright (c) 2009 Sippy Software, Inc. * All rights reserved. * * Subject to the following obligations and disclaimer of warranty, use and * redistribution of this software, in source or object code forms, with or * without modifications are expressly permitted by Whistle Communications; * provided, however, that: * 1. Any and all reproductions of the source or object code must include the * copyright notice above and the following disclaimer of warranties; and * 2. No rights are granted, in any manner or form, to use Whistle * Communications, Inc. trademarks, including the mark "WHISTLE * COMMUNICATIONS" on advertising, endorsements, or otherwise except as * such appears in the above copyright notice or in the software. * * THIS SOFTWARE IS BEING PROVIDED BY WHISTLE COMMUNICATIONS "AS IS", AND * TO THE MAXIMUM EXTENT PERMITTED BY LAW, WHISTLE COMMUNICATIONS MAKES NO * REPRESENTATIONS OR WARRANTIES, EXPRESS OR IMPLIED, REGARDING THIS SOFTWARE, * INCLUDING WITHOUT LIMITATION, ANY AND ALL IMPLIED WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, OR NON-INFRINGEMENT. * WHISTLE COMMUNICATIONS DOES NOT WARRANT, GUARANTEE, OR MAKE ANY * REPRESENTATIONS REGARDING THE USE OF, OR THE RESULTS OF THE USE OF THIS * SOFTWARE IN TERMS OF ITS CORRECTNESS, ACCURACY, RELIABILITY OR OTHERWISE. 
* IN NO EVENT SHALL WHISTLE COMMUNICATIONS BE LIABLE FOR ANY DAMAGES * RESULTING FROM OR ARISING OUT OF ANY USE OF THIS SOFTWARE, INCLUDING * WITHOUT LIMITATION, ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, * PUNITIVE, OR CONSEQUENTIAL DAMAGES, PROCUREMENT OF SUBSTITUTE GOODS OR * SERVICES, LOSS OF USE, DATA OR PROFITS, HOWEVER CAUSED AND UNDER ANY * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF * THIS SOFTWARE, EVEN IF WHISTLE COMMUNICATIONS IS ADVISED OF THE POSSIBILITY * OF SUCH DAMAGE. */ #include __FBSDID("$FreeBSD$"); #include #include -extern void get_pos(int *x, int *y); -extern void curs_move(int *_x, int *_y, int x, int y); -extern void vidc_biosputchar(int c); - static void spinc_probe(struct console *cp); static int spinc_init(int arg); static void spinc_putchar(int c); static int spinc_getchar(void); static int spinc_ischar(void); +extern struct console *consoles[]; + struct console spinconsole = { "spinconsole", "spin port", 0, spinc_probe, spinc_init, spinc_putchar, spinc_getchar, spinc_ischar }; +static struct console *parent = NULL; + static void spinc_probe(struct console *cp) { - cp->c_flags |= (C_PRESENTIN | C_PRESENTOUT); + + if (parent == NULL) + parent = consoles[0]; + parent->c_probe(cp); } static int spinc_init(int arg) { - return(0); + + return(parent->c_init(arg)); } static void spinc_putchar(int c) { - static int curx, cury; static unsigned tw_chars = 0x5C2D2F7C; /* "\-/|" */ - static time_t lasttime; + static time_t lasttime = 0; time_t now; - now = time(NULL); + now = time(0); if (now < (lasttime + 1)) return; - lasttime = now; #ifdef TERM_EMU - get_pos(&curx, &cury); - if (curx > 0) - curs_move(&curx, &cury, curx - 1, cury); + if (lasttime > 0) + parent->c_out('\b'); #endif - vidc_biosputchar((char)tw_chars); + lasttime = now; + parent->c_out((char)tw_chars); tw_chars = (tw_chars >> 8) | ((tw_chars & (unsigned long)0xFF) << 24); } static int spinc_getchar(void) { + return(-1); } static int spinc_ischar(void) { + return(0); } Index: projects/runtime-coverage/sys/compat/cloudabi/cloudabi_fd.c =================================================================== --- projects/runtime-coverage/sys/compat/cloudabi/cloudabi_fd.c (revision 322921) +++ projects/runtime-coverage/sys/compat/cloudabi/cloudabi_fd.c (revision 322922) @@ -1,529 +1,513 @@ /*- * Copyright (c) 2015 Nuxi, https://nuxi.nl/ * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. 
IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ #include __FBSDID("$FreeBSD$"); #include #include #include #include #include #include #include #include #include #include #include #include #include #include /* Translation between CloudABI and Capsicum rights. */ #define RIGHTS_MAPPINGS \ MAPPING(CLOUDABI_RIGHT_FD_DATASYNC, CAP_FSYNC) \ MAPPING(CLOUDABI_RIGHT_FD_READ, CAP_READ) \ MAPPING(CLOUDABI_RIGHT_FD_SEEK, CAP_SEEK) \ MAPPING(CLOUDABI_RIGHT_FD_STAT_PUT_FLAGS, CAP_FCNTL) \ MAPPING(CLOUDABI_RIGHT_FD_SYNC, CAP_FSYNC) \ MAPPING(CLOUDABI_RIGHT_FD_TELL, CAP_SEEK_TELL) \ MAPPING(CLOUDABI_RIGHT_FD_WRITE, CAP_WRITE) \ MAPPING(CLOUDABI_RIGHT_FILE_ADVISE) \ MAPPING(CLOUDABI_RIGHT_FILE_ALLOCATE, CAP_WRITE) \ MAPPING(CLOUDABI_RIGHT_FILE_CREATE_DIRECTORY, CAP_MKDIRAT) \ MAPPING(CLOUDABI_RIGHT_FILE_CREATE_FILE, CAP_CREATE) \ MAPPING(CLOUDABI_RIGHT_FILE_CREATE_FIFO, CAP_MKFIFOAT) \ MAPPING(CLOUDABI_RIGHT_FILE_LINK_SOURCE, CAP_LINKAT_SOURCE) \ MAPPING(CLOUDABI_RIGHT_FILE_LINK_TARGET, CAP_LINKAT_TARGET) \ MAPPING(CLOUDABI_RIGHT_FILE_OPEN, CAP_LOOKUP) \ MAPPING(CLOUDABI_RIGHT_FILE_READDIR, CAP_READ) \ MAPPING(CLOUDABI_RIGHT_FILE_READLINK, CAP_LOOKUP) \ MAPPING(CLOUDABI_RIGHT_FILE_RENAME_SOURCE, CAP_RENAMEAT_SOURCE) \ MAPPING(CLOUDABI_RIGHT_FILE_RENAME_TARGET, CAP_RENAMEAT_TARGET) \ MAPPING(CLOUDABI_RIGHT_FILE_STAT_FGET, CAP_FSTAT) \ MAPPING(CLOUDABI_RIGHT_FILE_STAT_FPUT_SIZE, CAP_FTRUNCATE) \ MAPPING(CLOUDABI_RIGHT_FILE_STAT_FPUT_TIMES, CAP_FUTIMES) \ MAPPING(CLOUDABI_RIGHT_FILE_STAT_GET, CAP_FSTATAT) \ MAPPING(CLOUDABI_RIGHT_FILE_STAT_PUT_TIMES, CAP_FUTIMESAT) \ MAPPING(CLOUDABI_RIGHT_FILE_SYMLINK, CAP_SYMLINKAT) \ MAPPING(CLOUDABI_RIGHT_FILE_UNLINK, CAP_UNLINKAT) \ MAPPING(CLOUDABI_RIGHT_MEM_MAP, CAP_MMAP) \ MAPPING(CLOUDABI_RIGHT_MEM_MAP_EXEC, CAP_MMAP_X) \ MAPPING(CLOUDABI_RIGHT_POLL_FD_READWRITE, CAP_EVENT) \ MAPPING(CLOUDABI_RIGHT_POLL_MODIFY, CAP_KQUEUE_CHANGE) \ MAPPING(CLOUDABI_RIGHT_POLL_PROC_TERMINATE, CAP_EVENT) \ MAPPING(CLOUDABI_RIGHT_POLL_WAIT, CAP_KQUEUE_EVENT) \ MAPPING(CLOUDABI_RIGHT_PROC_EXEC, CAP_FEXECVE) \ MAPPING(CLOUDABI_RIGHT_SOCK_ACCEPT, CAP_ACCEPT) \ - MAPPING(CLOUDABI_RIGHT_SOCK_BIND_DIRECTORY, CAP_BINDAT) \ - MAPPING(CLOUDABI_RIGHT_SOCK_BIND_SOCKET, CAP_BIND) \ - MAPPING(CLOUDABI_RIGHT_SOCK_CONNECT_DIRECTORY, CAP_CONNECTAT) \ - MAPPING(CLOUDABI_RIGHT_SOCK_CONNECT_SOCKET, CAP_CONNECT) \ - MAPPING(CLOUDABI_RIGHT_SOCK_LISTEN, CAP_LISTEN) \ MAPPING(CLOUDABI_RIGHT_SOCK_SHUTDOWN, CAP_SHUTDOWN) \ MAPPING(CLOUDABI_RIGHT_SOCK_STAT_GET, CAP_GETPEERNAME, \ CAP_GETSOCKNAME, CAP_GETSOCKOPT) int cloudabi_sys_fd_close(struct thread *td, struct cloudabi_sys_fd_close_args *uap) { return (kern_close(td, uap->fd)); } int cloudabi_sys_fd_create1(struct thread *td, struct cloudabi_sys_fd_create1_args *uap) { struct filecaps fcaps = {}; switch (uap->type) { case CLOUDABI_FILETYPE_POLL: cap_rights_init(&fcaps.fc_rights, CAP_FSTAT, CAP_KQUEUE); return (kern_kqueue(td, 0, &fcaps)); case CLOUDABI_FILETYPE_SHARED_MEMORY: cap_rights_init(&fcaps.fc_rights, CAP_FSTAT, CAP_FTRUNCATE, CAP_MMAP_RWX); return (kern_shm_open(td, SHM_ANON, O_RDWR, 
0, &fcaps)); - case CLOUDABI_FILETYPE_SOCKET_DGRAM: - return (kern_socket(td, AF_UNIX, SOCK_DGRAM, 0)); - case CLOUDABI_FILETYPE_SOCKET_STREAM: - return (kern_socket(td, AF_UNIX, SOCK_STREAM, 0)); default: return (EINVAL); } } int cloudabi_sys_fd_create2(struct thread *td, struct cloudabi_sys_fd_create2_args *uap) { struct filecaps fcaps1 = {}, fcaps2 = {}; int fds[2]; int error; switch (uap->type) { case CLOUDABI_FILETYPE_FIFO: /* * CloudABI pipes are unidirectional. Restrict rights on * the pipe to simulate this. */ cap_rights_init(&fcaps1.fc_rights, CAP_EVENT, CAP_FCNTL, CAP_FSTAT, CAP_READ); fcaps1.fc_fcntls = CAP_FCNTL_SETFL; cap_rights_init(&fcaps2.fc_rights, CAP_EVENT, CAP_FCNTL, CAP_FSTAT, CAP_WRITE); fcaps2.fc_fcntls = CAP_FCNTL_SETFL; error = kern_pipe(td, fds, 0, &fcaps1, &fcaps2); break; case CLOUDABI_FILETYPE_SOCKET_DGRAM: error = kern_socketpair(td, AF_UNIX, SOCK_DGRAM, 0, fds); break; case CLOUDABI_FILETYPE_SOCKET_STREAM: error = kern_socketpair(td, AF_UNIX, SOCK_STREAM, 0, fds); break; default: return (EINVAL); } if (error == 0) { td->td_retval[0] = fds[0]; td->td_retval[1] = fds[1]; } return (0); } int cloudabi_sys_fd_datasync(struct thread *td, struct cloudabi_sys_fd_datasync_args *uap) { return (kern_fsync(td, uap->fd, false)); } int cloudabi_sys_fd_dup(struct thread *td, struct cloudabi_sys_fd_dup_args *uap) { return (kern_dup(td, FDDUP_NORMAL, 0, uap->from, 0)); } int cloudabi_sys_fd_replace(struct thread *td, struct cloudabi_sys_fd_replace_args *uap) { int error; /* * CloudABI's equivalent to dup2(). CloudABI processes should * not depend on hardcoded file descriptor layouts, but simply * use the file descriptor numbers that are allocated by the * kernel. Duplicating file descriptors to arbitrary numbers * should not be done. * * Invoke kern_dup() with FDDUP_MUSTREPLACE, so that we return * EBADF when duplicating to a nonexistent file descriptor. Also * clear the return value, as this system call yields no return * value. */ error = kern_dup(td, FDDUP_MUSTREPLACE, 0, uap->from, uap->to); td->td_retval[0] = 0; return (error); } int cloudabi_sys_fd_seek(struct thread *td, struct cloudabi_sys_fd_seek_args *uap) { int whence; switch (uap->whence) { case CLOUDABI_WHENCE_CUR: whence = SEEK_CUR; break; case CLOUDABI_WHENCE_END: whence = SEEK_END; break; case CLOUDABI_WHENCE_SET: whence = SEEK_SET; break; default: return (EINVAL); } return (kern_lseek(td, uap->fd, uap->offset, whence)); } /* Converts a file descriptor to a CloudABI file descriptor type. 
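 * Sockets are classified by their so_type and vnodes by their v_type;
 * anything unrecognized is reported as CLOUDABI_FILETYPE_UNKNOWN.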
*/ cloudabi_filetype_t cloudabi_convert_filetype(const struct file *fp) { struct socket *so; struct vnode *vp; switch (fp->f_type) { case DTYPE_FIFO: return (CLOUDABI_FILETYPE_FIFO); case DTYPE_KQUEUE: return (CLOUDABI_FILETYPE_POLL); case DTYPE_PIPE: return (CLOUDABI_FILETYPE_FIFO); case DTYPE_PROCDESC: return (CLOUDABI_FILETYPE_PROCESS); case DTYPE_SHM: return (CLOUDABI_FILETYPE_SHARED_MEMORY); case DTYPE_SOCKET: so = fp->f_data; switch (so->so_type) { case SOCK_DGRAM: return (CLOUDABI_FILETYPE_SOCKET_DGRAM); case SOCK_STREAM: return (CLOUDABI_FILETYPE_SOCKET_STREAM); default: return (CLOUDABI_FILETYPE_UNKNOWN); } case DTYPE_VNODE: vp = fp->f_vnode; switch (vp->v_type) { case VBLK: return (CLOUDABI_FILETYPE_BLOCK_DEVICE); case VCHR: return (CLOUDABI_FILETYPE_CHARACTER_DEVICE); case VDIR: return (CLOUDABI_FILETYPE_DIRECTORY); case VFIFO: return (CLOUDABI_FILETYPE_FIFO); case VLNK: return (CLOUDABI_FILETYPE_SYMBOLIC_LINK); case VREG: return (CLOUDABI_FILETYPE_REGULAR_FILE); case VSOCK: return (CLOUDABI_FILETYPE_SOCKET_STREAM); default: return (CLOUDABI_FILETYPE_UNKNOWN); } default: return (CLOUDABI_FILETYPE_UNKNOWN); } } /* Removes rights that conflict with the file descriptor type. */ void cloudabi_remove_conflicting_rights(cloudabi_filetype_t filetype, cloudabi_rights_t *base, cloudabi_rights_t *inheriting) { /* * CloudABI has a small number of additional rights bits to * disambiguate between multiple purposes. Remove the bits that * don't apply to the type of the file descriptor. * * As file descriptor access modes (O_ACCMODE) has been fully * replaced by rights bits, CloudABI distinguishes between * rights that apply to the file descriptor itself (base) versus * rights of new file descriptors derived from them * (inheriting). The code below approximates the pair by * decomposing depending on the file descriptor type. * * We need to be somewhat accurate about which actions can * actually be performed on the file descriptor, as functions * like fcntl(fd, F_GETFL) are emulated on top of this. 
*/ switch (filetype) { case CLOUDABI_FILETYPE_DIRECTORY: *base &= CLOUDABI_RIGHT_FD_STAT_PUT_FLAGS | CLOUDABI_RIGHT_FD_SYNC | CLOUDABI_RIGHT_FILE_ADVISE | CLOUDABI_RIGHT_FILE_CREATE_DIRECTORY | CLOUDABI_RIGHT_FILE_CREATE_FILE | CLOUDABI_RIGHT_FILE_CREATE_FIFO | CLOUDABI_RIGHT_FILE_LINK_SOURCE | CLOUDABI_RIGHT_FILE_LINK_TARGET | CLOUDABI_RIGHT_FILE_OPEN | CLOUDABI_RIGHT_FILE_READDIR | CLOUDABI_RIGHT_FILE_READLINK | CLOUDABI_RIGHT_FILE_RENAME_SOURCE | CLOUDABI_RIGHT_FILE_RENAME_TARGET | CLOUDABI_RIGHT_FILE_STAT_FGET | CLOUDABI_RIGHT_FILE_STAT_FPUT_TIMES | CLOUDABI_RIGHT_FILE_STAT_GET | CLOUDABI_RIGHT_FILE_STAT_PUT_TIMES | CLOUDABI_RIGHT_FILE_SYMLINK | CLOUDABI_RIGHT_FILE_UNLINK | - CLOUDABI_RIGHT_POLL_FD_READWRITE | - CLOUDABI_RIGHT_SOCK_BIND_DIRECTORY | - CLOUDABI_RIGHT_SOCK_CONNECT_DIRECTORY; + CLOUDABI_RIGHT_POLL_FD_READWRITE; *inheriting &= CLOUDABI_RIGHT_FD_DATASYNC | CLOUDABI_RIGHT_FD_READ | CLOUDABI_RIGHT_FD_SEEK | CLOUDABI_RIGHT_FD_STAT_PUT_FLAGS | CLOUDABI_RIGHT_FD_SYNC | CLOUDABI_RIGHT_FD_TELL | CLOUDABI_RIGHT_FD_WRITE | CLOUDABI_RIGHT_FILE_ADVISE | CLOUDABI_RIGHT_FILE_ALLOCATE | CLOUDABI_RIGHT_FILE_CREATE_DIRECTORY | CLOUDABI_RIGHT_FILE_CREATE_FILE | CLOUDABI_RIGHT_FILE_CREATE_FIFO | CLOUDABI_RIGHT_FILE_LINK_SOURCE | CLOUDABI_RIGHT_FILE_LINK_TARGET | CLOUDABI_RIGHT_FILE_OPEN | CLOUDABI_RIGHT_FILE_READDIR | CLOUDABI_RIGHT_FILE_READLINK | CLOUDABI_RIGHT_FILE_RENAME_SOURCE | CLOUDABI_RIGHT_FILE_RENAME_TARGET | CLOUDABI_RIGHT_FILE_STAT_FGET | CLOUDABI_RIGHT_FILE_STAT_FPUT_SIZE | CLOUDABI_RIGHT_FILE_STAT_FPUT_TIMES | CLOUDABI_RIGHT_FILE_STAT_GET | CLOUDABI_RIGHT_FILE_STAT_PUT_TIMES | CLOUDABI_RIGHT_FILE_SYMLINK | CLOUDABI_RIGHT_FILE_UNLINK | CLOUDABI_RIGHT_MEM_MAP | CLOUDABI_RIGHT_MEM_MAP_EXEC | CLOUDABI_RIGHT_POLL_FD_READWRITE | - CLOUDABI_RIGHT_PROC_EXEC | - CLOUDABI_RIGHT_SOCK_BIND_DIRECTORY | - CLOUDABI_RIGHT_SOCK_CONNECT_DIRECTORY; + CLOUDABI_RIGHT_PROC_EXEC; break; case CLOUDABI_FILETYPE_FIFO: *base &= CLOUDABI_RIGHT_FD_READ | CLOUDABI_RIGHT_FD_STAT_PUT_FLAGS | CLOUDABI_RIGHT_FD_WRITE | CLOUDABI_RIGHT_FILE_STAT_FGET | CLOUDABI_RIGHT_POLL_FD_READWRITE; *inheriting = 0; break; case CLOUDABI_FILETYPE_POLL: *base &= ~CLOUDABI_RIGHT_FILE_ADVISE; *inheriting = 0; break; case CLOUDABI_FILETYPE_PROCESS: *base &= ~(CLOUDABI_RIGHT_FILE_ADVISE | CLOUDABI_RIGHT_POLL_FD_READWRITE); *inheriting = 0; break; case CLOUDABI_FILETYPE_REGULAR_FILE: *base &= CLOUDABI_RIGHT_FD_DATASYNC | CLOUDABI_RIGHT_FD_READ | CLOUDABI_RIGHT_FD_SEEK | CLOUDABI_RIGHT_FD_STAT_PUT_FLAGS | CLOUDABI_RIGHT_FD_SYNC | CLOUDABI_RIGHT_FD_TELL | CLOUDABI_RIGHT_FD_WRITE | CLOUDABI_RIGHT_FILE_ADVISE | CLOUDABI_RIGHT_FILE_ALLOCATE | CLOUDABI_RIGHT_FILE_STAT_FGET | CLOUDABI_RIGHT_FILE_STAT_FPUT_SIZE | CLOUDABI_RIGHT_FILE_STAT_FPUT_TIMES | CLOUDABI_RIGHT_MEM_MAP | CLOUDABI_RIGHT_MEM_MAP_EXEC | CLOUDABI_RIGHT_POLL_FD_READWRITE | CLOUDABI_RIGHT_PROC_EXEC; *inheriting = 0; break; case CLOUDABI_FILETYPE_SHARED_MEMORY: *base &= ~(CLOUDABI_RIGHT_FD_SEEK | CLOUDABI_RIGHT_FD_TELL | CLOUDABI_RIGHT_FILE_ADVISE | CLOUDABI_RIGHT_FILE_ALLOCATE | CLOUDABI_RIGHT_FILE_READDIR); *inheriting = 0; break; case CLOUDABI_FILETYPE_SOCKET_DGRAM: case CLOUDABI_FILETYPE_SOCKET_STREAM: *base &= CLOUDABI_RIGHT_FD_READ | CLOUDABI_RIGHT_FD_STAT_PUT_FLAGS | CLOUDABI_RIGHT_FD_WRITE | CLOUDABI_RIGHT_FILE_STAT_FGET | CLOUDABI_RIGHT_POLL_FD_READWRITE | CLOUDABI_RIGHT_SOCK_ACCEPT | - CLOUDABI_RIGHT_SOCK_BIND_SOCKET | - CLOUDABI_RIGHT_SOCK_CONNECT_SOCKET | - CLOUDABI_RIGHT_SOCK_LISTEN | CLOUDABI_RIGHT_SOCK_SHUTDOWN | CLOUDABI_RIGHT_SOCK_STAT_GET; break; default: 
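		/* Unknown descriptor types: leave the base rights untouched and grant no inheriting rights. */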
*inheriting = 0; break; } } /* Converts FreeBSD's Capsicum rights to CloudABI's set of rights. */ static void convert_capabilities(const cap_rights_t *capabilities, cloudabi_filetype_t filetype, cloudabi_rights_t *base, cloudabi_rights_t *inheriting) { cloudabi_rights_t rights; /* Convert FreeBSD bits to CloudABI bits. */ rights = 0; #define MAPPING(cloudabi, ...) do { \ if (cap_rights_is_set(capabilities, ##__VA_ARGS__)) \ rights |= (cloudabi); \ } while (0); RIGHTS_MAPPINGS #undef MAPPING *base = rights; *inheriting = rights; cloudabi_remove_conflicting_rights(filetype, base, inheriting); } int cloudabi_sys_fd_stat_get(struct thread *td, struct cloudabi_sys_fd_stat_get_args *uap) { cloudabi_fdstat_t fsb = {}; struct file *fp; cap_rights_t rights; struct filecaps fcaps; int error, oflags; /* Obtain file descriptor properties. */ error = fget_cap(td, uap->fd, cap_rights_init(&rights), &fp, &fcaps); if (error != 0) return (error); oflags = OFLAGS(fp->f_flag); fsb.fs_filetype = cloudabi_convert_filetype(fp); fdrop(fp, td); /* Convert file descriptor flags. */ if (oflags & O_APPEND) fsb.fs_flags |= CLOUDABI_FDFLAG_APPEND; if (oflags & O_NONBLOCK) fsb.fs_flags |= CLOUDABI_FDFLAG_NONBLOCK; if (oflags & O_SYNC) fsb.fs_flags |= CLOUDABI_FDFLAG_SYNC; /* Convert capabilities to CloudABI rights. */ convert_capabilities(&fcaps.fc_rights, fsb.fs_filetype, &fsb.fs_rights_base, &fsb.fs_rights_inheriting); filecaps_free(&fcaps); return (copyout(&fsb, (void *)uap->buf, sizeof(fsb))); } /* Converts CloudABI rights to a set of Capsicum capabilities. */ int cloudabi_convert_rights(cloudabi_rights_t in, cap_rights_t *out) { cap_rights_init(out); #define MAPPING(cloudabi, ...) do { \ if (in & (cloudabi)) { \ cap_rights_set(out, ##__VA_ARGS__); \ in &= ~(cloudabi); \ } \ } while (0); RIGHTS_MAPPINGS #undef MAPPING if (in != 0) return (ENOTCAPABLE); return (0); } int cloudabi_sys_fd_stat_put(struct thread *td, struct cloudabi_sys_fd_stat_put_args *uap) { cloudabi_fdstat_t fsb; cap_rights_t rights; int error, oflags; error = copyin(uap->buf, &fsb, sizeof(fsb)); if (error != 0) return (error); if (uap->flags == CLOUDABI_FDSTAT_FLAGS) { /* Convert flags. */ oflags = 0; if (fsb.fs_flags & CLOUDABI_FDFLAG_APPEND) oflags |= O_APPEND; if (fsb.fs_flags & CLOUDABI_FDFLAG_NONBLOCK) oflags |= O_NONBLOCK; if (fsb.fs_flags & (CLOUDABI_FDFLAG_SYNC | CLOUDABI_FDFLAG_DSYNC | CLOUDABI_FDFLAG_RSYNC)) oflags |= O_SYNC; return (kern_fcntl(td, uap->fd, F_SETFL, oflags)); } else if (uap->flags == CLOUDABI_FDSTAT_RIGHTS) { /* Convert rights. */ error = cloudabi_convert_rights( fsb.fs_rights_base | fsb.fs_rights_inheriting, &rights); if (error != 0) return (error); return (kern_cap_rights_limit(td, uap->fd, &rights)); } return (EINVAL); } int cloudabi_sys_fd_sync(struct thread *td, struct cloudabi_sys_fd_sync_args *uap) { return (kern_fsync(td, uap->fd, true)); } Index: projects/runtime-coverage/sys/compat/cloudabi/cloudabi_sock.c =================================================================== --- projects/runtime-coverage/sys/compat/cloudabi/cloudabi_sock.c (revision 322921) +++ projects/runtime-coverage/sys/compat/cloudabi/cloudabi_sock.c (revision 322922) @@ -1,283 +1,222 @@ /*- * Copyright (c) 2015-2017 Nuxi, https://nuxi.nl/ * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. 
Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ #include __FBSDID("$FreeBSD$"); #include #include #include #include #include #include #include #include #include #include #include #include -#include #include #include #include #include #include -/* Copies a pathname into a UNIX socket address structure. */ -static int -copyin_sockaddr_un(const char *path, size_t pathlen, struct sockaddr_un *sun) -{ - int error; - - /* Copy in pathname string if there's enough space. */ - if (pathlen >= sizeof(sun->sun_path)) - return (ENAMETOOLONG); - error = copyin(path, &sun->sun_path, pathlen); - if (error != 0) - return (error); - if (memchr(sun->sun_path, '\0', pathlen) != NULL) - return (EINVAL); - - /* Initialize the rest of the socket address. */ - sun->sun_path[pathlen] = '\0'; - sun->sun_family = AF_UNIX; - sun->sun_len = sizeof(*sun); - return (0); -} - int cloudabi_sys_sock_accept(struct thread *td, struct cloudabi_sys_sock_accept_args *uap) { return (kern_accept(td, uap->sock, NULL, NULL, NULL)); } int -cloudabi_sys_sock_bind(struct thread *td, - struct cloudabi_sys_sock_bind_args *uap) -{ - struct sockaddr_un sun; - int error; - - error = copyin_sockaddr_un(uap->path, uap->path_len, &sun); - if (error != 0) - return (error); - return (kern_bindat(td, uap->fd, uap->sock, (struct sockaddr *)&sun)); -} - -int -cloudabi_sys_sock_connect(struct thread *td, - struct cloudabi_sys_sock_connect_args *uap) -{ - struct sockaddr_un sun; - int error; - - error = copyin_sockaddr_un(uap->path, uap->path_len, &sun); - if (error != 0) - return (error); - return (kern_connectat(td, uap->fd, uap->sock, - (struct sockaddr *)&sun)); -} - -int -cloudabi_sys_sock_listen(struct thread *td, - struct cloudabi_sys_sock_listen_args *uap) -{ - - return (kern_listen(td, uap->sock, uap->backlog)); -} - -int cloudabi_sys_sock_shutdown(struct thread *td, struct cloudabi_sys_sock_shutdown_args *uap) { int how; switch (uap->how) { case CLOUDABI_SHUT_RD: how = SHUT_RD; break; case CLOUDABI_SHUT_WR: how = SHUT_WR; break; case CLOUDABI_SHUT_RD | CLOUDABI_SHUT_WR: how = SHUT_RDWR; break; default: return (EINVAL); } return (kern_shutdown(td, uap->sock, how)); } int cloudabi_sys_sock_stat_get(struct thread *td, struct cloudabi_sys_sock_stat_get_args *uap) { cloudabi_sockstat_t ss = {}; cap_rights_t rights; struct file *fp; struct socket *so; int error; error = getsock_cap(td, uap->sock, cap_rights_init(&rights, CAP_GETSOCKOPT, CAP_GETPEERNAME, CAP_GETSOCKNAME), &fp, NULL, NULL); if (error != 0) return (error); so = fp->f_data; /* Set ss_error. 
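 * Report the socket's pending error, clearing it when
 * CLOUDABI_SOCKSTAT_CLEAR_ERROR is requested.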
*/ SOCK_LOCK(so); ss.ss_error = cloudabi_convert_errno(so->so_error); if ((uap->flags & CLOUDABI_SOCKSTAT_CLEAR_ERROR) != 0) so->so_error = 0; SOCK_UNLOCK(so); /* Set ss_state. */ if ((so->so_options & SO_ACCEPTCONN) != 0) ss.ss_state |= CLOUDABI_SOCKSTATE_ACCEPTCONN; fdrop(fp, td); return (copyout(&ss, uap->buf, sizeof(ss))); } int cloudabi_sock_recv(struct thread *td, cloudabi_fd_t fd, struct iovec *data, size_t datalen, cloudabi_fd_t *fds, size_t fdslen, cloudabi_riflags_t flags, size_t *rdatalen, size_t *rfdslen, cloudabi_roflags_t *rflags) { - struct sockaddr_storage ss; struct msghdr hdr = { - .msg_name = &ss, - .msg_namelen = sizeof(ss), .msg_iov = data, .msg_iovlen = datalen, }; struct mbuf *control; int error; /* Convert flags. */ if (flags & CLOUDABI_SOCK_RECV_PEEK) hdr.msg_flags |= MSG_PEEK; if (flags & CLOUDABI_SOCK_RECV_WAITALL) hdr.msg_flags |= MSG_WAITALL; control = NULL; error = kern_recvit(td, fd, &hdr, UIO_SYSSPACE, fdslen > 0 ? &control : NULL); if (error != 0) return (error); /* Convert return values. */ *rdatalen = td->td_retval[0]; td->td_retval[0] = 0; *rfdslen = 0; *rflags = 0; if (hdr.msg_flags & MSG_TRUNC) *rflags |= CLOUDABI_SOCK_RECV_DATA_TRUNCATED; /* Extract file descriptors from SCM_RIGHTS messages. */ if (control != NULL) { struct cmsghdr *chdr; hdr.msg_control = mtod(control, void *); hdr.msg_controllen = control->m_len; for (chdr = CMSG_FIRSTHDR(&hdr); chdr != NULL; chdr = CMSG_NXTHDR(&hdr, chdr)) { if (chdr->cmsg_level == SOL_SOCKET && chdr->cmsg_type == SCM_RIGHTS) { size_t nfds; nfds = (chdr->cmsg_len - CMSG_LEN(0)) / sizeof(int); if (nfds > fdslen) { /* Unable to store file descriptors. */ nfds = fdslen; *rflags |= CLOUDABI_SOCK_RECV_FDS_TRUNCATED; } error = copyout(CMSG_DATA(chdr), fds, nfds * sizeof(int)); if (error != 0) { m_free(control); return (error); } fds += nfds; fdslen -= nfds; *rfdslen += nfds; } } m_free(control); } return (0); } int cloudabi_sock_send(struct thread *td, cloudabi_fd_t fd, struct iovec *data, size_t datalen, const cloudabi_fd_t *fds, size_t fdslen, size_t *rdatalen) { struct msghdr hdr = { .msg_iov = data, .msg_iovlen = datalen, }; struct mbuf *control; int error; /* Convert file descriptor array to an SCM_RIGHTS message. */ if (fdslen > MCLBYTES || CMSG_SPACE(fdslen * sizeof(int)) > MCLBYTES) { return (EINVAL); } else if (fdslen > 0) { struct cmsghdr *chdr; control = m_get2(CMSG_SPACE(fdslen * sizeof(int)), M_WAITOK, MT_CONTROL, 0); control->m_len = CMSG_SPACE(fdslen * sizeof(int)); chdr = mtod(control, struct cmsghdr *); chdr->cmsg_len = CMSG_LEN(fdslen * sizeof(int)); chdr->cmsg_level = SOL_SOCKET; chdr->cmsg_type = SCM_RIGHTS; error = copyin(fds, CMSG_DATA(chdr), fdslen * sizeof(int)); if (error != 0) { m_free(control); return (error); } } else { control = NULL; } error = kern_sendit(td, fd, &hdr, MSG_NOSIGNAL, control, UIO_USERSPACE); if (error != 0) return (error); *rdatalen = td->td_retval[0]; td->td_retval[0] = 0; return (0); } Index: projects/runtime-coverage/sys/compat/cloudabi32/cloudabi32_proto.h =================================================================== --- projects/runtime-coverage/sys/compat/cloudabi32/cloudabi32_proto.h (revision 322921) +++ projects/runtime-coverage/sys/compat/cloudabi32/cloudabi32_proto.h (revision 322922) @@ -1,458 +1,436 @@ /* * System call prototypes. * * DO NOT EDIT-- this file is automatically generated. 
* $FreeBSD$ */ #ifndef _CLOUDABI32_SYSPROTO_H_ #define _CLOUDABI32_SYSPROTO_H_ #include #include #include #include #include #include #include #include struct proc; struct thread; #define PAD_(t) (sizeof(register_t) <= sizeof(t) ? \ 0 : sizeof(register_t) - sizeof(t)) #if BYTE_ORDER == LITTLE_ENDIAN #define PADL_(t) 0 #define PADR_(t) PAD_(t) #else #define PADL_(t) PAD_(t) #define PADR_(t) 0 #endif struct cloudabi_sys_clock_res_get_args { char clock_id_l_[PADL_(cloudabi_clockid_t)]; cloudabi_clockid_t clock_id; char clock_id_r_[PADR_(cloudabi_clockid_t)]; }; struct cloudabi_sys_clock_time_get_args { char clock_id_l_[PADL_(cloudabi_clockid_t)]; cloudabi_clockid_t clock_id; char clock_id_r_[PADR_(cloudabi_clockid_t)]; char precision_l_[PADL_(cloudabi_timestamp_t)]; cloudabi_timestamp_t precision; char precision_r_[PADR_(cloudabi_timestamp_t)]; }; struct cloudabi_sys_condvar_signal_args { char condvar_l_[PADL_(cloudabi_condvar_t *)]; cloudabi_condvar_t * condvar; char condvar_r_[PADR_(cloudabi_condvar_t *)]; char scope_l_[PADL_(cloudabi_scope_t)]; cloudabi_scope_t scope; char scope_r_[PADR_(cloudabi_scope_t)]; char nwaiters_l_[PADL_(cloudabi_nthreads_t)]; cloudabi_nthreads_t nwaiters; char nwaiters_r_[PADR_(cloudabi_nthreads_t)]; }; struct cloudabi_sys_fd_close_args { char fd_l_[PADL_(cloudabi_fd_t)]; cloudabi_fd_t fd; char fd_r_[PADR_(cloudabi_fd_t)]; }; struct cloudabi_sys_fd_create1_args { char type_l_[PADL_(cloudabi_filetype_t)]; cloudabi_filetype_t type; char type_r_[PADR_(cloudabi_filetype_t)]; }; struct cloudabi_sys_fd_create2_args { char type_l_[PADL_(cloudabi_filetype_t)]; cloudabi_filetype_t type; char type_r_[PADR_(cloudabi_filetype_t)]; }; struct cloudabi_sys_fd_datasync_args { char fd_l_[PADL_(cloudabi_fd_t)]; cloudabi_fd_t fd; char fd_r_[PADR_(cloudabi_fd_t)]; }; struct cloudabi_sys_fd_dup_args { char from_l_[PADL_(cloudabi_fd_t)]; cloudabi_fd_t from; char from_r_[PADR_(cloudabi_fd_t)]; }; struct cloudabi32_sys_fd_pread_args { char fd_l_[PADL_(cloudabi_fd_t)]; cloudabi_fd_t fd; char fd_r_[PADR_(cloudabi_fd_t)]; char iovs_l_[PADL_(const cloudabi32_iovec_t *)]; const cloudabi32_iovec_t * iovs; char iovs_r_[PADR_(const cloudabi32_iovec_t *)]; char iovs_len_l_[PADL_(size_t)]; size_t iovs_len; char iovs_len_r_[PADR_(size_t)]; char offset_l_[PADL_(cloudabi_filesize_t)]; cloudabi_filesize_t offset; char offset_r_[PADR_(cloudabi_filesize_t)]; }; struct cloudabi32_sys_fd_pwrite_args { char fd_l_[PADL_(cloudabi_fd_t)]; cloudabi_fd_t fd; char fd_r_[PADR_(cloudabi_fd_t)]; char iovs_l_[PADL_(const cloudabi32_ciovec_t *)]; const cloudabi32_ciovec_t * iovs; char iovs_r_[PADR_(const cloudabi32_ciovec_t *)]; char iovs_len_l_[PADL_(size_t)]; size_t iovs_len; char iovs_len_r_[PADR_(size_t)]; char offset_l_[PADL_(cloudabi_filesize_t)]; cloudabi_filesize_t offset; char offset_r_[PADR_(cloudabi_filesize_t)]; }; struct cloudabi32_sys_fd_read_args { char fd_l_[PADL_(cloudabi_fd_t)]; cloudabi_fd_t fd; char fd_r_[PADR_(cloudabi_fd_t)]; char iovs_l_[PADL_(const cloudabi32_iovec_t *)]; const cloudabi32_iovec_t * iovs; char iovs_r_[PADR_(const cloudabi32_iovec_t *)]; char iovs_len_l_[PADL_(size_t)]; size_t iovs_len; char iovs_len_r_[PADR_(size_t)]; }; struct cloudabi_sys_fd_replace_args { char from_l_[PADL_(cloudabi_fd_t)]; cloudabi_fd_t from; char from_r_[PADR_(cloudabi_fd_t)]; char to_l_[PADL_(cloudabi_fd_t)]; cloudabi_fd_t to; char to_r_[PADR_(cloudabi_fd_t)]; }; struct cloudabi_sys_fd_seek_args { char fd_l_[PADL_(cloudabi_fd_t)]; cloudabi_fd_t fd; char fd_r_[PADR_(cloudabi_fd_t)]; char 
offset_l_[PADL_(cloudabi_filedelta_t)]; cloudabi_filedelta_t offset; char offset_r_[PADR_(cloudabi_filedelta_t)]; char whence_l_[PADL_(cloudabi_whence_t)]; cloudabi_whence_t whence; char whence_r_[PADR_(cloudabi_whence_t)]; }; struct cloudabi_sys_fd_stat_get_args { char fd_l_[PADL_(cloudabi_fd_t)]; cloudabi_fd_t fd; char fd_r_[PADR_(cloudabi_fd_t)]; char buf_l_[PADL_(cloudabi_fdstat_t *)]; cloudabi_fdstat_t * buf; char buf_r_[PADR_(cloudabi_fdstat_t *)]; }; struct cloudabi_sys_fd_stat_put_args { char fd_l_[PADL_(cloudabi_fd_t)]; cloudabi_fd_t fd; char fd_r_[PADR_(cloudabi_fd_t)]; char buf_l_[PADL_(const cloudabi_fdstat_t *)]; const cloudabi_fdstat_t * buf; char buf_r_[PADR_(const cloudabi_fdstat_t *)]; char flags_l_[PADL_(cloudabi_fdsflags_t)]; cloudabi_fdsflags_t flags; char flags_r_[PADR_(cloudabi_fdsflags_t)]; }; struct cloudabi_sys_fd_sync_args { char fd_l_[PADL_(cloudabi_fd_t)]; cloudabi_fd_t fd; char fd_r_[PADR_(cloudabi_fd_t)]; }; struct cloudabi32_sys_fd_write_args { char fd_l_[PADL_(cloudabi_fd_t)]; cloudabi_fd_t fd; char fd_r_[PADR_(cloudabi_fd_t)]; char iovs_l_[PADL_(const cloudabi32_ciovec_t *)]; const cloudabi32_ciovec_t * iovs; char iovs_r_[PADR_(const cloudabi32_ciovec_t *)]; char iovs_len_l_[PADL_(size_t)]; size_t iovs_len; char iovs_len_r_[PADR_(size_t)]; }; struct cloudabi_sys_file_advise_args { char fd_l_[PADL_(cloudabi_fd_t)]; cloudabi_fd_t fd; char fd_r_[PADR_(cloudabi_fd_t)]; char offset_l_[PADL_(cloudabi_filesize_t)]; cloudabi_filesize_t offset; char offset_r_[PADR_(cloudabi_filesize_t)]; char len_l_[PADL_(cloudabi_filesize_t)]; cloudabi_filesize_t len; char len_r_[PADR_(cloudabi_filesize_t)]; char advice_l_[PADL_(cloudabi_advice_t)]; cloudabi_advice_t advice; char advice_r_[PADR_(cloudabi_advice_t)]; }; struct cloudabi_sys_file_allocate_args { char fd_l_[PADL_(cloudabi_fd_t)]; cloudabi_fd_t fd; char fd_r_[PADR_(cloudabi_fd_t)]; char offset_l_[PADL_(cloudabi_filesize_t)]; cloudabi_filesize_t offset; char offset_r_[PADR_(cloudabi_filesize_t)]; char len_l_[PADL_(cloudabi_filesize_t)]; cloudabi_filesize_t len; char len_r_[PADR_(cloudabi_filesize_t)]; }; struct cloudabi_sys_file_create_args { char fd_l_[PADL_(cloudabi_fd_t)]; cloudabi_fd_t fd; char fd_r_[PADR_(cloudabi_fd_t)]; char path_l_[PADL_(const char *)]; const char * path; char path_r_[PADR_(const char *)]; char path_len_l_[PADL_(size_t)]; size_t path_len; char path_len_r_[PADR_(size_t)]; char type_l_[PADL_(cloudabi_filetype_t)]; cloudabi_filetype_t type; char type_r_[PADR_(cloudabi_filetype_t)]; }; struct cloudabi_sys_file_link_args { char fd1_l_[PADL_(cloudabi_lookup_t)]; cloudabi_lookup_t fd1; char fd1_r_[PADR_(cloudabi_lookup_t)]; char path1_l_[PADL_(const char *)]; const char * path1; char path1_r_[PADR_(const char *)]; char path1_len_l_[PADL_(size_t)]; size_t path1_len; char path1_len_r_[PADR_(size_t)]; char fd2_l_[PADL_(cloudabi_fd_t)]; cloudabi_fd_t fd2; char fd2_r_[PADR_(cloudabi_fd_t)]; char path2_l_[PADL_(const char *)]; const char * path2; char path2_r_[PADR_(const char *)]; char path2_len_l_[PADL_(size_t)]; size_t path2_len; char path2_len_r_[PADR_(size_t)]; }; struct cloudabi_sys_file_open_args { char dirfd_l_[PADL_(cloudabi_lookup_t)]; cloudabi_lookup_t dirfd; char dirfd_r_[PADR_(cloudabi_lookup_t)]; char path_l_[PADL_(const char *)]; const char * path; char path_r_[PADR_(const char *)]; char path_len_l_[PADL_(size_t)]; size_t path_len; char path_len_r_[PADR_(size_t)]; char oflags_l_[PADL_(cloudabi_oflags_t)]; cloudabi_oflags_t oflags; char oflags_r_[PADR_(cloudabi_oflags_t)]; char 
fds_l_[PADL_(const cloudabi_fdstat_t *)]; const cloudabi_fdstat_t * fds; char fds_r_[PADR_(const cloudabi_fdstat_t *)]; }; struct cloudabi_sys_file_readdir_args { char fd_l_[PADL_(cloudabi_fd_t)]; cloudabi_fd_t fd; char fd_r_[PADR_(cloudabi_fd_t)]; char buf_l_[PADL_(void *)]; void * buf; char buf_r_[PADR_(void *)]; char buf_len_l_[PADL_(size_t)]; size_t buf_len; char buf_len_r_[PADR_(size_t)]; char cookie_l_[PADL_(cloudabi_dircookie_t)]; cloudabi_dircookie_t cookie; char cookie_r_[PADR_(cloudabi_dircookie_t)]; }; struct cloudabi_sys_file_readlink_args { char fd_l_[PADL_(cloudabi_fd_t)]; cloudabi_fd_t fd; char fd_r_[PADR_(cloudabi_fd_t)]; char path_l_[PADL_(const char *)]; const char * path; char path_r_[PADR_(const char *)]; char path_len_l_[PADL_(size_t)]; size_t path_len; char path_len_r_[PADR_(size_t)]; char buf_l_[PADL_(char *)]; char * buf; char buf_r_[PADR_(char *)]; char buf_len_l_[PADL_(size_t)]; size_t buf_len; char buf_len_r_[PADR_(size_t)]; }; struct cloudabi_sys_file_rename_args { char fd1_l_[PADL_(cloudabi_fd_t)]; cloudabi_fd_t fd1; char fd1_r_[PADR_(cloudabi_fd_t)]; char path1_l_[PADL_(const char *)]; const char * path1; char path1_r_[PADR_(const char *)]; char path1_len_l_[PADL_(size_t)]; size_t path1_len; char path1_len_r_[PADR_(size_t)]; char fd2_l_[PADL_(cloudabi_fd_t)]; cloudabi_fd_t fd2; char fd2_r_[PADR_(cloudabi_fd_t)]; char path2_l_[PADL_(const char *)]; const char * path2; char path2_r_[PADR_(const char *)]; char path2_len_l_[PADL_(size_t)]; size_t path2_len; char path2_len_r_[PADR_(size_t)]; }; struct cloudabi_sys_file_stat_fget_args { char fd_l_[PADL_(cloudabi_fd_t)]; cloudabi_fd_t fd; char fd_r_[PADR_(cloudabi_fd_t)]; char buf_l_[PADL_(cloudabi_filestat_t *)]; cloudabi_filestat_t * buf; char buf_r_[PADR_(cloudabi_filestat_t *)]; }; struct cloudabi_sys_file_stat_fput_args { char fd_l_[PADL_(cloudabi_fd_t)]; cloudabi_fd_t fd; char fd_r_[PADR_(cloudabi_fd_t)]; char buf_l_[PADL_(const cloudabi_filestat_t *)]; const cloudabi_filestat_t * buf; char buf_r_[PADR_(const cloudabi_filestat_t *)]; char flags_l_[PADL_(cloudabi_fsflags_t)]; cloudabi_fsflags_t flags; char flags_r_[PADR_(cloudabi_fsflags_t)]; }; struct cloudabi_sys_file_stat_get_args { char fd_l_[PADL_(cloudabi_lookup_t)]; cloudabi_lookup_t fd; char fd_r_[PADR_(cloudabi_lookup_t)]; char path_l_[PADL_(const char *)]; const char * path; char path_r_[PADR_(const char *)]; char path_len_l_[PADL_(size_t)]; size_t path_len; char path_len_r_[PADR_(size_t)]; char buf_l_[PADL_(cloudabi_filestat_t *)]; cloudabi_filestat_t * buf; char buf_r_[PADR_(cloudabi_filestat_t *)]; }; struct cloudabi_sys_file_stat_put_args { char fd_l_[PADL_(cloudabi_lookup_t)]; cloudabi_lookup_t fd; char fd_r_[PADR_(cloudabi_lookup_t)]; char path_l_[PADL_(const char *)]; const char * path; char path_r_[PADR_(const char *)]; char path_len_l_[PADL_(size_t)]; size_t path_len; char path_len_r_[PADR_(size_t)]; char buf_l_[PADL_(const cloudabi_filestat_t *)]; const cloudabi_filestat_t * buf; char buf_r_[PADR_(const cloudabi_filestat_t *)]; char flags_l_[PADL_(cloudabi_fsflags_t)]; cloudabi_fsflags_t flags; char flags_r_[PADR_(cloudabi_fsflags_t)]; }; struct cloudabi_sys_file_symlink_args { char path1_l_[PADL_(const char *)]; const char * path1; char path1_r_[PADR_(const char *)]; char path1_len_l_[PADL_(size_t)]; size_t path1_len; char path1_len_r_[PADR_(size_t)]; char fd_l_[PADL_(cloudabi_fd_t)]; cloudabi_fd_t fd; char fd_r_[PADR_(cloudabi_fd_t)]; char path2_l_[PADL_(const char *)]; const char * path2; char path2_r_[PADR_(const char *)]; char 
path2_len_l_[PADL_(size_t)]; size_t path2_len; char path2_len_r_[PADR_(size_t)]; }; struct cloudabi_sys_file_unlink_args { char fd_l_[PADL_(cloudabi_fd_t)]; cloudabi_fd_t fd; char fd_r_[PADR_(cloudabi_fd_t)]; char path_l_[PADL_(const char *)]; const char * path; char path_r_[PADR_(const char *)]; char path_len_l_[PADL_(size_t)]; size_t path_len; char path_len_r_[PADR_(size_t)]; char flags_l_[PADL_(cloudabi_ulflags_t)]; cloudabi_ulflags_t flags; char flags_r_[PADR_(cloudabi_ulflags_t)]; }; struct cloudabi_sys_lock_unlock_args { char lock_l_[PADL_(cloudabi_lock_t *)]; cloudabi_lock_t * lock; char lock_r_[PADR_(cloudabi_lock_t *)]; char scope_l_[PADL_(cloudabi_scope_t)]; cloudabi_scope_t scope; char scope_r_[PADR_(cloudabi_scope_t)]; }; struct cloudabi_sys_mem_advise_args { char mapping_l_[PADL_(void *)]; void * mapping; char mapping_r_[PADR_(void *)]; char mapping_len_l_[PADL_(size_t)]; size_t mapping_len; char mapping_len_r_[PADR_(size_t)]; char advice_l_[PADL_(cloudabi_advice_t)]; cloudabi_advice_t advice; char advice_r_[PADR_(cloudabi_advice_t)]; }; struct cloudabi_sys_mem_map_args { char addr_l_[PADL_(void *)]; void * addr; char addr_r_[PADR_(void *)]; char len_l_[PADL_(size_t)]; size_t len; char len_r_[PADR_(size_t)]; char prot_l_[PADL_(cloudabi_mprot_t)]; cloudabi_mprot_t prot; char prot_r_[PADR_(cloudabi_mprot_t)]; char flags_l_[PADL_(cloudabi_mflags_t)]; cloudabi_mflags_t flags; char flags_r_[PADR_(cloudabi_mflags_t)]; char fd_l_[PADL_(cloudabi_fd_t)]; cloudabi_fd_t fd; char fd_r_[PADR_(cloudabi_fd_t)]; char off_l_[PADL_(cloudabi_filesize_t)]; cloudabi_filesize_t off; char off_r_[PADR_(cloudabi_filesize_t)]; }; struct cloudabi_sys_mem_protect_args { char mapping_l_[PADL_(void *)]; void * mapping; char mapping_r_[PADR_(void *)]; char mapping_len_l_[PADL_(size_t)]; size_t mapping_len; char mapping_len_r_[PADR_(size_t)]; char prot_l_[PADL_(cloudabi_mprot_t)]; cloudabi_mprot_t prot; char prot_r_[PADR_(cloudabi_mprot_t)]; }; struct cloudabi_sys_mem_sync_args { char mapping_l_[PADL_(void *)]; void * mapping; char mapping_r_[PADR_(void *)]; char mapping_len_l_[PADL_(size_t)]; size_t mapping_len; char mapping_len_r_[PADR_(size_t)]; char flags_l_[PADL_(cloudabi_msflags_t)]; cloudabi_msflags_t flags; char flags_r_[PADR_(cloudabi_msflags_t)]; }; struct cloudabi_sys_mem_unmap_args { char mapping_l_[PADL_(void *)]; void * mapping; char mapping_r_[PADR_(void *)]; char mapping_len_l_[PADL_(size_t)]; size_t mapping_len; char mapping_len_r_[PADR_(size_t)]; }; struct cloudabi32_sys_poll_args { char in_l_[PADL_(const cloudabi32_subscription_t *)]; const cloudabi32_subscription_t * in; char in_r_[PADR_(const cloudabi32_subscription_t *)]; char out_l_[PADL_(cloudabi32_event_t *)]; cloudabi32_event_t * out; char out_r_[PADR_(cloudabi32_event_t *)]; char nsubscriptions_l_[PADL_(size_t)]; size_t nsubscriptions; char nsubscriptions_r_[PADR_(size_t)]; }; struct cloudabi32_sys_poll_fd_args { char fd_l_[PADL_(cloudabi_fd_t)]; cloudabi_fd_t fd; char fd_r_[PADR_(cloudabi_fd_t)]; char in_l_[PADL_(const cloudabi32_subscription_t *)]; const cloudabi32_subscription_t * in; char in_r_[PADR_(const cloudabi32_subscription_t *)]; char in_len_l_[PADL_(size_t)]; size_t in_len; char in_len_r_[PADR_(size_t)]; char out_l_[PADL_(cloudabi32_event_t *)]; cloudabi32_event_t * out; char out_r_[PADR_(cloudabi32_event_t *)]; char out_len_l_[PADL_(size_t)]; size_t out_len; char out_len_r_[PADR_(size_t)]; char timeout_l_[PADL_(const cloudabi32_subscription_t *)]; const cloudabi32_subscription_t * timeout; char timeout_r_[PADR_(const 
cloudabi32_subscription_t *)]; }; struct cloudabi_sys_proc_exec_args { char fd_l_[PADL_(cloudabi_fd_t)]; cloudabi_fd_t fd; char fd_r_[PADR_(cloudabi_fd_t)]; char data_l_[PADL_(const void *)]; const void * data; char data_r_[PADR_(const void *)]; char data_len_l_[PADL_(size_t)]; size_t data_len; char data_len_r_[PADR_(size_t)]; char fds_l_[PADL_(const cloudabi_fd_t *)]; const cloudabi_fd_t * fds; char fds_r_[PADR_(const cloudabi_fd_t *)]; char fds_len_l_[PADL_(size_t)]; size_t fds_len; char fds_len_r_[PADR_(size_t)]; }; struct cloudabi_sys_proc_exit_args { char rval_l_[PADL_(cloudabi_exitcode_t)]; cloudabi_exitcode_t rval; char rval_r_[PADR_(cloudabi_exitcode_t)]; }; struct cloudabi_sys_proc_fork_args { register_t dummy; }; struct cloudabi_sys_proc_raise_args { char sig_l_[PADL_(cloudabi_signal_t)]; cloudabi_signal_t sig; char sig_r_[PADR_(cloudabi_signal_t)]; }; struct cloudabi_sys_random_get_args { char buf_l_[PADL_(void *)]; void * buf; char buf_r_[PADR_(void *)]; char buf_len_l_[PADL_(size_t)]; size_t buf_len; char buf_len_r_[PADR_(size_t)]; }; struct cloudabi_sys_sock_accept_args { char sock_l_[PADL_(cloudabi_fd_t)]; cloudabi_fd_t sock; char sock_r_[PADR_(cloudabi_fd_t)]; char unused_l_[PADL_(void *)]; void * unused; char unused_r_[PADR_(void *)]; }; -struct cloudabi_sys_sock_bind_args { - char sock_l_[PADL_(cloudabi_fd_t)]; cloudabi_fd_t sock; char sock_r_[PADR_(cloudabi_fd_t)]; - char fd_l_[PADL_(cloudabi_fd_t)]; cloudabi_fd_t fd; char fd_r_[PADR_(cloudabi_fd_t)]; - char path_l_[PADL_(const char *)]; const char * path; char path_r_[PADR_(const char *)]; - char path_len_l_[PADL_(size_t)]; size_t path_len; char path_len_r_[PADR_(size_t)]; -}; -struct cloudabi_sys_sock_connect_args { - char sock_l_[PADL_(cloudabi_fd_t)]; cloudabi_fd_t sock; char sock_r_[PADR_(cloudabi_fd_t)]; - char fd_l_[PADL_(cloudabi_fd_t)]; cloudabi_fd_t fd; char fd_r_[PADR_(cloudabi_fd_t)]; - char path_l_[PADL_(const char *)]; const char * path; char path_r_[PADR_(const char *)]; - char path_len_l_[PADL_(size_t)]; size_t path_len; char path_len_r_[PADR_(size_t)]; -}; -struct cloudabi_sys_sock_listen_args { - char sock_l_[PADL_(cloudabi_fd_t)]; cloudabi_fd_t sock; char sock_r_[PADR_(cloudabi_fd_t)]; - char backlog_l_[PADL_(cloudabi_backlog_t)]; cloudabi_backlog_t backlog; char backlog_r_[PADR_(cloudabi_backlog_t)]; -}; struct cloudabi32_sys_sock_recv_args { char sock_l_[PADL_(cloudabi_fd_t)]; cloudabi_fd_t sock; char sock_r_[PADR_(cloudabi_fd_t)]; char in_l_[PADL_(const cloudabi32_recv_in_t *)]; const cloudabi32_recv_in_t * in; char in_r_[PADR_(const cloudabi32_recv_in_t *)]; char out_l_[PADL_(cloudabi32_recv_out_t *)]; cloudabi32_recv_out_t * out; char out_r_[PADR_(cloudabi32_recv_out_t *)]; }; struct cloudabi32_sys_sock_send_args { char sock_l_[PADL_(cloudabi_fd_t)]; cloudabi_fd_t sock; char sock_r_[PADR_(cloudabi_fd_t)]; char in_l_[PADL_(const cloudabi32_send_in_t *)]; const cloudabi32_send_in_t * in; char in_r_[PADR_(const cloudabi32_send_in_t *)]; char out_l_[PADL_(cloudabi32_send_out_t *)]; cloudabi32_send_out_t * out; char out_r_[PADR_(cloudabi32_send_out_t *)]; }; struct cloudabi_sys_sock_shutdown_args { char sock_l_[PADL_(cloudabi_fd_t)]; cloudabi_fd_t sock; char sock_r_[PADR_(cloudabi_fd_t)]; char how_l_[PADL_(cloudabi_sdflags_t)]; cloudabi_sdflags_t how; char how_r_[PADR_(cloudabi_sdflags_t)]; }; struct cloudabi_sys_sock_stat_get_args { char sock_l_[PADL_(cloudabi_fd_t)]; cloudabi_fd_t sock; char sock_r_[PADR_(cloudabi_fd_t)]; char buf_l_[PADL_(cloudabi_sockstat_t *)]; cloudabi_sockstat_t * buf; char 
buf_r_[PADR_(cloudabi_sockstat_t *)]; char flags_l_[PADL_(cloudabi_ssflags_t)]; cloudabi_ssflags_t flags; char flags_r_[PADR_(cloudabi_ssflags_t)]; }; struct cloudabi32_sys_thread_create_args { char attr_l_[PADL_(cloudabi32_threadattr_t *)]; cloudabi32_threadattr_t * attr; char attr_r_[PADR_(cloudabi32_threadattr_t *)]; }; struct cloudabi_sys_thread_exit_args { char lock_l_[PADL_(cloudabi_lock_t *)]; cloudabi_lock_t * lock; char lock_r_[PADR_(cloudabi_lock_t *)]; char scope_l_[PADL_(cloudabi_scope_t)]; cloudabi_scope_t scope; char scope_r_[PADR_(cloudabi_scope_t)]; }; struct cloudabi_sys_thread_yield_args { register_t dummy; }; int cloudabi_sys_clock_res_get(struct thread *, struct cloudabi_sys_clock_res_get_args *); int cloudabi_sys_clock_time_get(struct thread *, struct cloudabi_sys_clock_time_get_args *); int cloudabi_sys_condvar_signal(struct thread *, struct cloudabi_sys_condvar_signal_args *); int cloudabi_sys_fd_close(struct thread *, struct cloudabi_sys_fd_close_args *); int cloudabi_sys_fd_create1(struct thread *, struct cloudabi_sys_fd_create1_args *); int cloudabi_sys_fd_create2(struct thread *, struct cloudabi_sys_fd_create2_args *); int cloudabi_sys_fd_datasync(struct thread *, struct cloudabi_sys_fd_datasync_args *); int cloudabi_sys_fd_dup(struct thread *, struct cloudabi_sys_fd_dup_args *); int cloudabi32_sys_fd_pread(struct thread *, struct cloudabi32_sys_fd_pread_args *); int cloudabi32_sys_fd_pwrite(struct thread *, struct cloudabi32_sys_fd_pwrite_args *); int cloudabi32_sys_fd_read(struct thread *, struct cloudabi32_sys_fd_read_args *); int cloudabi_sys_fd_replace(struct thread *, struct cloudabi_sys_fd_replace_args *); int cloudabi_sys_fd_seek(struct thread *, struct cloudabi_sys_fd_seek_args *); int cloudabi_sys_fd_stat_get(struct thread *, struct cloudabi_sys_fd_stat_get_args *); int cloudabi_sys_fd_stat_put(struct thread *, struct cloudabi_sys_fd_stat_put_args *); int cloudabi_sys_fd_sync(struct thread *, struct cloudabi_sys_fd_sync_args *); int cloudabi32_sys_fd_write(struct thread *, struct cloudabi32_sys_fd_write_args *); int cloudabi_sys_file_advise(struct thread *, struct cloudabi_sys_file_advise_args *); int cloudabi_sys_file_allocate(struct thread *, struct cloudabi_sys_file_allocate_args *); int cloudabi_sys_file_create(struct thread *, struct cloudabi_sys_file_create_args *); int cloudabi_sys_file_link(struct thread *, struct cloudabi_sys_file_link_args *); int cloudabi_sys_file_open(struct thread *, struct cloudabi_sys_file_open_args *); int cloudabi_sys_file_readdir(struct thread *, struct cloudabi_sys_file_readdir_args *); int cloudabi_sys_file_readlink(struct thread *, struct cloudabi_sys_file_readlink_args *); int cloudabi_sys_file_rename(struct thread *, struct cloudabi_sys_file_rename_args *); int cloudabi_sys_file_stat_fget(struct thread *, struct cloudabi_sys_file_stat_fget_args *); int cloudabi_sys_file_stat_fput(struct thread *, struct cloudabi_sys_file_stat_fput_args *); int cloudabi_sys_file_stat_get(struct thread *, struct cloudabi_sys_file_stat_get_args *); int cloudabi_sys_file_stat_put(struct thread *, struct cloudabi_sys_file_stat_put_args *); int cloudabi_sys_file_symlink(struct thread *, struct cloudabi_sys_file_symlink_args *); int cloudabi_sys_file_unlink(struct thread *, struct cloudabi_sys_file_unlink_args *); int cloudabi_sys_lock_unlock(struct thread *, struct cloudabi_sys_lock_unlock_args *); int cloudabi_sys_mem_advise(struct thread *, struct cloudabi_sys_mem_advise_args *); int cloudabi_sys_mem_map(struct thread *, struct 
cloudabi_sys_mem_map_args *); int cloudabi_sys_mem_protect(struct thread *, struct cloudabi_sys_mem_protect_args *); int cloudabi_sys_mem_sync(struct thread *, struct cloudabi_sys_mem_sync_args *); int cloudabi_sys_mem_unmap(struct thread *, struct cloudabi_sys_mem_unmap_args *); int cloudabi32_sys_poll(struct thread *, struct cloudabi32_sys_poll_args *); int cloudabi32_sys_poll_fd(struct thread *, struct cloudabi32_sys_poll_fd_args *); int cloudabi_sys_proc_exec(struct thread *, struct cloudabi_sys_proc_exec_args *); int cloudabi_sys_proc_exit(struct thread *, struct cloudabi_sys_proc_exit_args *); int cloudabi_sys_proc_fork(struct thread *, struct cloudabi_sys_proc_fork_args *); int cloudabi_sys_proc_raise(struct thread *, struct cloudabi_sys_proc_raise_args *); int cloudabi_sys_random_get(struct thread *, struct cloudabi_sys_random_get_args *); int cloudabi_sys_sock_accept(struct thread *, struct cloudabi_sys_sock_accept_args *); -int cloudabi_sys_sock_bind(struct thread *, struct cloudabi_sys_sock_bind_args *); -int cloudabi_sys_sock_connect(struct thread *, struct cloudabi_sys_sock_connect_args *); -int cloudabi_sys_sock_listen(struct thread *, struct cloudabi_sys_sock_listen_args *); int cloudabi32_sys_sock_recv(struct thread *, struct cloudabi32_sys_sock_recv_args *); int cloudabi32_sys_sock_send(struct thread *, struct cloudabi32_sys_sock_send_args *); int cloudabi_sys_sock_shutdown(struct thread *, struct cloudabi_sys_sock_shutdown_args *); int cloudabi_sys_sock_stat_get(struct thread *, struct cloudabi_sys_sock_stat_get_args *); int cloudabi32_sys_thread_create(struct thread *, struct cloudabi32_sys_thread_create_args *); int cloudabi_sys_thread_exit(struct thread *, struct cloudabi_sys_thread_exit_args *); int cloudabi_sys_thread_yield(struct thread *, struct cloudabi_sys_thread_yield_args *); #ifdef COMPAT_43 #endif /* COMPAT_43 */ #ifdef COMPAT_FREEBSD4 #endif /* COMPAT_FREEBSD4 */ #ifdef COMPAT_FREEBSD6 #endif /* COMPAT_FREEBSD6 */ #ifdef COMPAT_FREEBSD7 #endif /* COMPAT_FREEBSD7 */ #ifdef COMPAT_FREEBSD10 #endif /* COMPAT_FREEBSD10 */ #ifdef COMPAT_FREEBSD11 #endif /* COMPAT_FREEBSD11 */ #define CLOUDABI32_SYS_AUE_cloudabi_sys_clock_res_get AUE_NULL #define CLOUDABI32_SYS_AUE_cloudabi_sys_clock_time_get AUE_NULL #define CLOUDABI32_SYS_AUE_cloudabi_sys_condvar_signal AUE_NULL #define CLOUDABI32_SYS_AUE_cloudabi_sys_fd_close AUE_NULL #define CLOUDABI32_SYS_AUE_cloudabi_sys_fd_create1 AUE_NULL #define CLOUDABI32_SYS_AUE_cloudabi_sys_fd_create2 AUE_NULL #define CLOUDABI32_SYS_AUE_cloudabi_sys_fd_datasync AUE_NULL #define CLOUDABI32_SYS_AUE_cloudabi_sys_fd_dup AUE_NULL #define CLOUDABI32_SYS_AUE_cloudabi32_sys_fd_pread AUE_NULL #define CLOUDABI32_SYS_AUE_cloudabi32_sys_fd_pwrite AUE_NULL #define CLOUDABI32_SYS_AUE_cloudabi32_sys_fd_read AUE_NULL #define CLOUDABI32_SYS_AUE_cloudabi_sys_fd_replace AUE_NULL #define CLOUDABI32_SYS_AUE_cloudabi_sys_fd_seek AUE_NULL #define CLOUDABI32_SYS_AUE_cloudabi_sys_fd_stat_get AUE_NULL #define CLOUDABI32_SYS_AUE_cloudabi_sys_fd_stat_put AUE_NULL #define CLOUDABI32_SYS_AUE_cloudabi_sys_fd_sync AUE_NULL #define CLOUDABI32_SYS_AUE_cloudabi32_sys_fd_write AUE_NULL #define CLOUDABI32_SYS_AUE_cloudabi_sys_file_advise AUE_NULL #define CLOUDABI32_SYS_AUE_cloudabi_sys_file_allocate AUE_NULL #define CLOUDABI32_SYS_AUE_cloudabi_sys_file_create AUE_NULL #define CLOUDABI32_SYS_AUE_cloudabi_sys_file_link AUE_NULL #define CLOUDABI32_SYS_AUE_cloudabi_sys_file_open AUE_NULL #define CLOUDABI32_SYS_AUE_cloudabi_sys_file_readdir AUE_NULL #define 
CLOUDABI32_SYS_AUE_cloudabi_sys_file_readlink AUE_NULL #define CLOUDABI32_SYS_AUE_cloudabi_sys_file_rename AUE_NULL #define CLOUDABI32_SYS_AUE_cloudabi_sys_file_stat_fget AUE_NULL #define CLOUDABI32_SYS_AUE_cloudabi_sys_file_stat_fput AUE_NULL #define CLOUDABI32_SYS_AUE_cloudabi_sys_file_stat_get AUE_NULL #define CLOUDABI32_SYS_AUE_cloudabi_sys_file_stat_put AUE_NULL #define CLOUDABI32_SYS_AUE_cloudabi_sys_file_symlink AUE_NULL #define CLOUDABI32_SYS_AUE_cloudabi_sys_file_unlink AUE_NULL #define CLOUDABI32_SYS_AUE_cloudabi_sys_lock_unlock AUE_NULL #define CLOUDABI32_SYS_AUE_cloudabi_sys_mem_advise AUE_NULL #define CLOUDABI32_SYS_AUE_cloudabi_sys_mem_map AUE_NULL #define CLOUDABI32_SYS_AUE_cloudabi_sys_mem_protect AUE_NULL #define CLOUDABI32_SYS_AUE_cloudabi_sys_mem_sync AUE_NULL #define CLOUDABI32_SYS_AUE_cloudabi_sys_mem_unmap AUE_NULL #define CLOUDABI32_SYS_AUE_cloudabi32_sys_poll AUE_NULL #define CLOUDABI32_SYS_AUE_cloudabi32_sys_poll_fd AUE_NULL #define CLOUDABI32_SYS_AUE_cloudabi_sys_proc_exec AUE_NULL #define CLOUDABI32_SYS_AUE_cloudabi_sys_proc_exit AUE_NULL #define CLOUDABI32_SYS_AUE_cloudabi_sys_proc_fork AUE_NULL #define CLOUDABI32_SYS_AUE_cloudabi_sys_proc_raise AUE_NULL #define CLOUDABI32_SYS_AUE_cloudabi_sys_random_get AUE_NULL #define CLOUDABI32_SYS_AUE_cloudabi_sys_sock_accept AUE_NULL -#define CLOUDABI32_SYS_AUE_cloudabi_sys_sock_bind AUE_NULL -#define CLOUDABI32_SYS_AUE_cloudabi_sys_sock_connect AUE_NULL -#define CLOUDABI32_SYS_AUE_cloudabi_sys_sock_listen AUE_NULL #define CLOUDABI32_SYS_AUE_cloudabi32_sys_sock_recv AUE_NULL #define CLOUDABI32_SYS_AUE_cloudabi32_sys_sock_send AUE_NULL #define CLOUDABI32_SYS_AUE_cloudabi_sys_sock_shutdown AUE_NULL #define CLOUDABI32_SYS_AUE_cloudabi_sys_sock_stat_get AUE_NULL #define CLOUDABI32_SYS_AUE_cloudabi32_sys_thread_create AUE_NULL #define CLOUDABI32_SYS_AUE_cloudabi_sys_thread_exit AUE_NULL #define CLOUDABI32_SYS_AUE_cloudabi_sys_thread_yield AUE_NULL #undef PAD_ #undef PADL_ #undef PADR_ #endif /* !_CLOUDABI32_SYSPROTO_H_ */ Index: projects/runtime-coverage/sys/compat/cloudabi32/cloudabi32_syscall.h =================================================================== --- projects/runtime-coverage/sys/compat/cloudabi32/cloudabi32_syscall.h (revision 322921) +++ projects/runtime-coverage/sys/compat/cloudabi32/cloudabi32_syscall.h (revision 322922) @@ -1,63 +1,60 @@ /* * System call numbers. * * DO NOT EDIT-- this file is automatically generated. 
* $FreeBSD$ */ #define CLOUDABI32_SYS_cloudabi_sys_clock_res_get 0 #define CLOUDABI32_SYS_cloudabi_sys_clock_time_get 1 #define CLOUDABI32_SYS_cloudabi_sys_condvar_signal 2 #define CLOUDABI32_SYS_cloudabi_sys_fd_close 3 #define CLOUDABI32_SYS_cloudabi_sys_fd_create1 4 #define CLOUDABI32_SYS_cloudabi_sys_fd_create2 5 #define CLOUDABI32_SYS_cloudabi_sys_fd_datasync 6 #define CLOUDABI32_SYS_cloudabi_sys_fd_dup 7 #define CLOUDABI32_SYS_cloudabi32_sys_fd_pread 8 #define CLOUDABI32_SYS_cloudabi32_sys_fd_pwrite 9 #define CLOUDABI32_SYS_cloudabi32_sys_fd_read 10 #define CLOUDABI32_SYS_cloudabi_sys_fd_replace 11 #define CLOUDABI32_SYS_cloudabi_sys_fd_seek 12 #define CLOUDABI32_SYS_cloudabi_sys_fd_stat_get 13 #define CLOUDABI32_SYS_cloudabi_sys_fd_stat_put 14 #define CLOUDABI32_SYS_cloudabi_sys_fd_sync 15 #define CLOUDABI32_SYS_cloudabi32_sys_fd_write 16 #define CLOUDABI32_SYS_cloudabi_sys_file_advise 17 #define CLOUDABI32_SYS_cloudabi_sys_file_allocate 18 #define CLOUDABI32_SYS_cloudabi_sys_file_create 19 #define CLOUDABI32_SYS_cloudabi_sys_file_link 20 #define CLOUDABI32_SYS_cloudabi_sys_file_open 21 #define CLOUDABI32_SYS_cloudabi_sys_file_readdir 22 #define CLOUDABI32_SYS_cloudabi_sys_file_readlink 23 #define CLOUDABI32_SYS_cloudabi_sys_file_rename 24 #define CLOUDABI32_SYS_cloudabi_sys_file_stat_fget 25 #define CLOUDABI32_SYS_cloudabi_sys_file_stat_fput 26 #define CLOUDABI32_SYS_cloudabi_sys_file_stat_get 27 #define CLOUDABI32_SYS_cloudabi_sys_file_stat_put 28 #define CLOUDABI32_SYS_cloudabi_sys_file_symlink 29 #define CLOUDABI32_SYS_cloudabi_sys_file_unlink 30 #define CLOUDABI32_SYS_cloudabi_sys_lock_unlock 31 #define CLOUDABI32_SYS_cloudabi_sys_mem_advise 32 #define CLOUDABI32_SYS_cloudabi_sys_mem_map 33 #define CLOUDABI32_SYS_cloudabi_sys_mem_protect 34 #define CLOUDABI32_SYS_cloudabi_sys_mem_sync 35 #define CLOUDABI32_SYS_cloudabi_sys_mem_unmap 36 #define CLOUDABI32_SYS_cloudabi32_sys_poll 37 #define CLOUDABI32_SYS_cloudabi32_sys_poll_fd 38 #define CLOUDABI32_SYS_cloudabi_sys_proc_exec 39 #define CLOUDABI32_SYS_cloudabi_sys_proc_exit 40 #define CLOUDABI32_SYS_cloudabi_sys_proc_fork 41 #define CLOUDABI32_SYS_cloudabi_sys_proc_raise 42 #define CLOUDABI32_SYS_cloudabi_sys_random_get 43 #define CLOUDABI32_SYS_cloudabi_sys_sock_accept 44 -#define CLOUDABI32_SYS_cloudabi_sys_sock_bind 45 -#define CLOUDABI32_SYS_cloudabi_sys_sock_connect 46 -#define CLOUDABI32_SYS_cloudabi_sys_sock_listen 47 -#define CLOUDABI32_SYS_cloudabi32_sys_sock_recv 48 -#define CLOUDABI32_SYS_cloudabi32_sys_sock_send 49 -#define CLOUDABI32_SYS_cloudabi_sys_sock_shutdown 50 -#define CLOUDABI32_SYS_cloudabi_sys_sock_stat_get 51 -#define CLOUDABI32_SYS_cloudabi32_sys_thread_create 52 -#define CLOUDABI32_SYS_cloudabi_sys_thread_exit 53 -#define CLOUDABI32_SYS_cloudabi_sys_thread_yield 54 -#define CLOUDABI32_SYS_MAXSYSCALL 55 +#define CLOUDABI32_SYS_cloudabi32_sys_sock_recv 45 +#define CLOUDABI32_SYS_cloudabi32_sys_sock_send 46 +#define CLOUDABI32_SYS_cloudabi_sys_sock_shutdown 47 +#define CLOUDABI32_SYS_cloudabi_sys_sock_stat_get 48 +#define CLOUDABI32_SYS_cloudabi32_sys_thread_create 49 +#define CLOUDABI32_SYS_cloudabi_sys_thread_exit 50 +#define CLOUDABI32_SYS_cloudabi_sys_thread_yield 51 +#define CLOUDABI32_SYS_MAXSYSCALL 52 Index: projects/runtime-coverage/sys/compat/cloudabi32/cloudabi32_syscalls.c =================================================================== --- projects/runtime-coverage/sys/compat/cloudabi32/cloudabi32_syscalls.c (revision 322921) +++ 
projects/runtime-coverage/sys/compat/cloudabi32/cloudabi32_syscalls.c (revision 322922) @@ -1,64 +1,61 @@ /* * System call names. * * DO NOT EDIT-- this file is automatically generated. * $FreeBSD$ */ const char *cloudabi32_syscallnames[] = { "cloudabi_sys_clock_res_get", /* 0 = cloudabi_sys_clock_res_get */ "cloudabi_sys_clock_time_get", /* 1 = cloudabi_sys_clock_time_get */ "cloudabi_sys_condvar_signal", /* 2 = cloudabi_sys_condvar_signal */ "cloudabi_sys_fd_close", /* 3 = cloudabi_sys_fd_close */ "cloudabi_sys_fd_create1", /* 4 = cloudabi_sys_fd_create1 */ "cloudabi_sys_fd_create2", /* 5 = cloudabi_sys_fd_create2 */ "cloudabi_sys_fd_datasync", /* 6 = cloudabi_sys_fd_datasync */ "cloudabi_sys_fd_dup", /* 7 = cloudabi_sys_fd_dup */ "cloudabi32_sys_fd_pread", /* 8 = cloudabi32_sys_fd_pread */ "cloudabi32_sys_fd_pwrite", /* 9 = cloudabi32_sys_fd_pwrite */ "cloudabi32_sys_fd_read", /* 10 = cloudabi32_sys_fd_read */ "cloudabi_sys_fd_replace", /* 11 = cloudabi_sys_fd_replace */ "cloudabi_sys_fd_seek", /* 12 = cloudabi_sys_fd_seek */ "cloudabi_sys_fd_stat_get", /* 13 = cloudabi_sys_fd_stat_get */ "cloudabi_sys_fd_stat_put", /* 14 = cloudabi_sys_fd_stat_put */ "cloudabi_sys_fd_sync", /* 15 = cloudabi_sys_fd_sync */ "cloudabi32_sys_fd_write", /* 16 = cloudabi32_sys_fd_write */ "cloudabi_sys_file_advise", /* 17 = cloudabi_sys_file_advise */ "cloudabi_sys_file_allocate", /* 18 = cloudabi_sys_file_allocate */ "cloudabi_sys_file_create", /* 19 = cloudabi_sys_file_create */ "cloudabi_sys_file_link", /* 20 = cloudabi_sys_file_link */ "cloudabi_sys_file_open", /* 21 = cloudabi_sys_file_open */ "cloudabi_sys_file_readdir", /* 22 = cloudabi_sys_file_readdir */ "cloudabi_sys_file_readlink", /* 23 = cloudabi_sys_file_readlink */ "cloudabi_sys_file_rename", /* 24 = cloudabi_sys_file_rename */ "cloudabi_sys_file_stat_fget", /* 25 = cloudabi_sys_file_stat_fget */ "cloudabi_sys_file_stat_fput", /* 26 = cloudabi_sys_file_stat_fput */ "cloudabi_sys_file_stat_get", /* 27 = cloudabi_sys_file_stat_get */ "cloudabi_sys_file_stat_put", /* 28 = cloudabi_sys_file_stat_put */ "cloudabi_sys_file_symlink", /* 29 = cloudabi_sys_file_symlink */ "cloudabi_sys_file_unlink", /* 30 = cloudabi_sys_file_unlink */ "cloudabi_sys_lock_unlock", /* 31 = cloudabi_sys_lock_unlock */ "cloudabi_sys_mem_advise", /* 32 = cloudabi_sys_mem_advise */ "cloudabi_sys_mem_map", /* 33 = cloudabi_sys_mem_map */ "cloudabi_sys_mem_protect", /* 34 = cloudabi_sys_mem_protect */ "cloudabi_sys_mem_sync", /* 35 = cloudabi_sys_mem_sync */ "cloudabi_sys_mem_unmap", /* 36 = cloudabi_sys_mem_unmap */ "cloudabi32_sys_poll", /* 37 = cloudabi32_sys_poll */ "cloudabi32_sys_poll_fd", /* 38 = cloudabi32_sys_poll_fd */ "cloudabi_sys_proc_exec", /* 39 = cloudabi_sys_proc_exec */ "cloudabi_sys_proc_exit", /* 40 = cloudabi_sys_proc_exit */ "cloudabi_sys_proc_fork", /* 41 = cloudabi_sys_proc_fork */ "cloudabi_sys_proc_raise", /* 42 = cloudabi_sys_proc_raise */ "cloudabi_sys_random_get", /* 43 = cloudabi_sys_random_get */ "cloudabi_sys_sock_accept", /* 44 = cloudabi_sys_sock_accept */ - "cloudabi_sys_sock_bind", /* 45 = cloudabi_sys_sock_bind */ - "cloudabi_sys_sock_connect", /* 46 = cloudabi_sys_sock_connect */ - "cloudabi_sys_sock_listen", /* 47 = cloudabi_sys_sock_listen */ - "cloudabi32_sys_sock_recv", /* 48 = cloudabi32_sys_sock_recv */ - "cloudabi32_sys_sock_send", /* 49 = cloudabi32_sys_sock_send */ - "cloudabi_sys_sock_shutdown", /* 50 = cloudabi_sys_sock_shutdown */ - "cloudabi_sys_sock_stat_get", /* 51 = cloudabi_sys_sock_stat_get */ - 
"cloudabi32_sys_thread_create", /* 52 = cloudabi32_sys_thread_create */ - "cloudabi_sys_thread_exit", /* 53 = cloudabi_sys_thread_exit */ - "cloudabi_sys_thread_yield", /* 54 = cloudabi_sys_thread_yield */ + "cloudabi32_sys_sock_recv", /* 45 = cloudabi32_sys_sock_recv */ + "cloudabi32_sys_sock_send", /* 46 = cloudabi32_sys_sock_send */ + "cloudabi_sys_sock_shutdown", /* 47 = cloudabi_sys_sock_shutdown */ + "cloudabi_sys_sock_stat_get", /* 48 = cloudabi_sys_sock_stat_get */ + "cloudabi32_sys_thread_create", /* 49 = cloudabi32_sys_thread_create */ + "cloudabi_sys_thread_exit", /* 50 = cloudabi_sys_thread_exit */ + "cloudabi_sys_thread_yield", /* 51 = cloudabi_sys_thread_yield */ }; Index: projects/runtime-coverage/sys/compat/cloudabi32/cloudabi32_sysent.c =================================================================== --- projects/runtime-coverage/sys/compat/cloudabi32/cloudabi32_sysent.c (revision 322921) +++ projects/runtime-coverage/sys/compat/cloudabi32/cloudabi32_sysent.c (revision 322922) @@ -1,72 +1,69 @@ /* * System call switch table. * * DO NOT EDIT-- this file is automatically generated. * $FreeBSD$ */ #include #include #include #include #define AS(name) (sizeof(struct name) / sizeof(register_t)) /* The casts are bogus but will do for now. */ struct sysent cloudabi32_sysent[] = { { AS(cloudabi_sys_clock_res_get_args), (sy_call_t *)cloudabi_sys_clock_res_get, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 0 = cloudabi_sys_clock_res_get */ { AS(cloudabi_sys_clock_time_get_args), (sy_call_t *)cloudabi_sys_clock_time_get, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 1 = cloudabi_sys_clock_time_get */ { AS(cloudabi_sys_condvar_signal_args), (sy_call_t *)cloudabi_sys_condvar_signal, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 2 = cloudabi_sys_condvar_signal */ { AS(cloudabi_sys_fd_close_args), (sy_call_t *)cloudabi_sys_fd_close, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 3 = cloudabi_sys_fd_close */ { AS(cloudabi_sys_fd_create1_args), (sy_call_t *)cloudabi_sys_fd_create1, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 4 = cloudabi_sys_fd_create1 */ { AS(cloudabi_sys_fd_create2_args), (sy_call_t *)cloudabi_sys_fd_create2, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 5 = cloudabi_sys_fd_create2 */ { AS(cloudabi_sys_fd_datasync_args), (sy_call_t *)cloudabi_sys_fd_datasync, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 6 = cloudabi_sys_fd_datasync */ { AS(cloudabi_sys_fd_dup_args), (sy_call_t *)cloudabi_sys_fd_dup, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 7 = cloudabi_sys_fd_dup */ { AS(cloudabi32_sys_fd_pread_args), (sy_call_t *)cloudabi32_sys_fd_pread, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 8 = cloudabi32_sys_fd_pread */ { AS(cloudabi32_sys_fd_pwrite_args), (sy_call_t *)cloudabi32_sys_fd_pwrite, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 9 = cloudabi32_sys_fd_pwrite */ { AS(cloudabi32_sys_fd_read_args), (sy_call_t *)cloudabi32_sys_fd_read, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 10 = cloudabi32_sys_fd_read */ { AS(cloudabi_sys_fd_replace_args), (sy_call_t *)cloudabi_sys_fd_replace, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 11 = cloudabi_sys_fd_replace */ { AS(cloudabi_sys_fd_seek_args), (sy_call_t *)cloudabi_sys_fd_seek, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 12 = cloudabi_sys_fd_seek */ { AS(cloudabi_sys_fd_stat_get_args), (sy_call_t *)cloudabi_sys_fd_stat_get, AUE_NULL, NULL, 0, 0, 
SYF_CAPENABLED, SY_THR_STATIC }, /* 13 = cloudabi_sys_fd_stat_get */ { AS(cloudabi_sys_fd_stat_put_args), (sy_call_t *)cloudabi_sys_fd_stat_put, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 14 = cloudabi_sys_fd_stat_put */ { AS(cloudabi_sys_fd_sync_args), (sy_call_t *)cloudabi_sys_fd_sync, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 15 = cloudabi_sys_fd_sync */ { AS(cloudabi32_sys_fd_write_args), (sy_call_t *)cloudabi32_sys_fd_write, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 16 = cloudabi32_sys_fd_write */ { AS(cloudabi_sys_file_advise_args), (sy_call_t *)cloudabi_sys_file_advise, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 17 = cloudabi_sys_file_advise */ { AS(cloudabi_sys_file_allocate_args), (sy_call_t *)cloudabi_sys_file_allocate, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 18 = cloudabi_sys_file_allocate */ { AS(cloudabi_sys_file_create_args), (sy_call_t *)cloudabi_sys_file_create, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 19 = cloudabi_sys_file_create */ { AS(cloudabi_sys_file_link_args), (sy_call_t *)cloudabi_sys_file_link, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 20 = cloudabi_sys_file_link */ { AS(cloudabi_sys_file_open_args), (sy_call_t *)cloudabi_sys_file_open, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 21 = cloudabi_sys_file_open */ { AS(cloudabi_sys_file_readdir_args), (sy_call_t *)cloudabi_sys_file_readdir, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 22 = cloudabi_sys_file_readdir */ { AS(cloudabi_sys_file_readlink_args), (sy_call_t *)cloudabi_sys_file_readlink, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 23 = cloudabi_sys_file_readlink */ { AS(cloudabi_sys_file_rename_args), (sy_call_t *)cloudabi_sys_file_rename, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 24 = cloudabi_sys_file_rename */ { AS(cloudabi_sys_file_stat_fget_args), (sy_call_t *)cloudabi_sys_file_stat_fget, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 25 = cloudabi_sys_file_stat_fget */ { AS(cloudabi_sys_file_stat_fput_args), (sy_call_t *)cloudabi_sys_file_stat_fput, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 26 = cloudabi_sys_file_stat_fput */ { AS(cloudabi_sys_file_stat_get_args), (sy_call_t *)cloudabi_sys_file_stat_get, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 27 = cloudabi_sys_file_stat_get */ { AS(cloudabi_sys_file_stat_put_args), (sy_call_t *)cloudabi_sys_file_stat_put, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 28 = cloudabi_sys_file_stat_put */ { AS(cloudabi_sys_file_symlink_args), (sy_call_t *)cloudabi_sys_file_symlink, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 29 = cloudabi_sys_file_symlink */ { AS(cloudabi_sys_file_unlink_args), (sy_call_t *)cloudabi_sys_file_unlink, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 30 = cloudabi_sys_file_unlink */ { AS(cloudabi_sys_lock_unlock_args), (sy_call_t *)cloudabi_sys_lock_unlock, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 31 = cloudabi_sys_lock_unlock */ { AS(cloudabi_sys_mem_advise_args), (sy_call_t *)cloudabi_sys_mem_advise, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 32 = cloudabi_sys_mem_advise */ { AS(cloudabi_sys_mem_map_args), (sy_call_t *)cloudabi_sys_mem_map, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 33 = cloudabi_sys_mem_map */ { AS(cloudabi_sys_mem_protect_args), (sy_call_t *)cloudabi_sys_mem_protect, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 34 = 
cloudabi_sys_mem_protect */ { AS(cloudabi_sys_mem_sync_args), (sy_call_t *)cloudabi_sys_mem_sync, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 35 = cloudabi_sys_mem_sync */ { AS(cloudabi_sys_mem_unmap_args), (sy_call_t *)cloudabi_sys_mem_unmap, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 36 = cloudabi_sys_mem_unmap */ { AS(cloudabi32_sys_poll_args), (sy_call_t *)cloudabi32_sys_poll, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 37 = cloudabi32_sys_poll */ { AS(cloudabi32_sys_poll_fd_args), (sy_call_t *)cloudabi32_sys_poll_fd, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 38 = cloudabi32_sys_poll_fd */ { AS(cloudabi_sys_proc_exec_args), (sy_call_t *)cloudabi_sys_proc_exec, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 39 = cloudabi_sys_proc_exec */ { AS(cloudabi_sys_proc_exit_args), (sy_call_t *)cloudabi_sys_proc_exit, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 40 = cloudabi_sys_proc_exit */ { 0, (sy_call_t *)cloudabi_sys_proc_fork, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 41 = cloudabi_sys_proc_fork */ { AS(cloudabi_sys_proc_raise_args), (sy_call_t *)cloudabi_sys_proc_raise, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 42 = cloudabi_sys_proc_raise */ { AS(cloudabi_sys_random_get_args), (sy_call_t *)cloudabi_sys_random_get, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 43 = cloudabi_sys_random_get */ { AS(cloudabi_sys_sock_accept_args), (sy_call_t *)cloudabi_sys_sock_accept, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 44 = cloudabi_sys_sock_accept */ - { AS(cloudabi_sys_sock_bind_args), (sy_call_t *)cloudabi_sys_sock_bind, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 45 = cloudabi_sys_sock_bind */ - { AS(cloudabi_sys_sock_connect_args), (sy_call_t *)cloudabi_sys_sock_connect, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 46 = cloudabi_sys_sock_connect */ - { AS(cloudabi_sys_sock_listen_args), (sy_call_t *)cloudabi_sys_sock_listen, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 47 = cloudabi_sys_sock_listen */ - { AS(cloudabi32_sys_sock_recv_args), (sy_call_t *)cloudabi32_sys_sock_recv, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 48 = cloudabi32_sys_sock_recv */ - { AS(cloudabi32_sys_sock_send_args), (sy_call_t *)cloudabi32_sys_sock_send, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 49 = cloudabi32_sys_sock_send */ - { AS(cloudabi_sys_sock_shutdown_args), (sy_call_t *)cloudabi_sys_sock_shutdown, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 50 = cloudabi_sys_sock_shutdown */ - { AS(cloudabi_sys_sock_stat_get_args), (sy_call_t *)cloudabi_sys_sock_stat_get, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 51 = cloudabi_sys_sock_stat_get */ - { AS(cloudabi32_sys_thread_create_args), (sy_call_t *)cloudabi32_sys_thread_create, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 52 = cloudabi32_sys_thread_create */ - { AS(cloudabi_sys_thread_exit_args), (sy_call_t *)cloudabi_sys_thread_exit, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 53 = cloudabi_sys_thread_exit */ - { 0, (sy_call_t *)cloudabi_sys_thread_yield, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 54 = cloudabi_sys_thread_yield */ + { AS(cloudabi32_sys_sock_recv_args), (sy_call_t *)cloudabi32_sys_sock_recv, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 45 = cloudabi32_sys_sock_recv */ + { AS(cloudabi32_sys_sock_send_args), (sy_call_t *)cloudabi32_sys_sock_send, AUE_NULL, NULL, 0, 
0, SYF_CAPENABLED, SY_THR_STATIC }, /* 46 = cloudabi32_sys_sock_send */ + { AS(cloudabi_sys_sock_shutdown_args), (sy_call_t *)cloudabi_sys_sock_shutdown, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 47 = cloudabi_sys_sock_shutdown */ + { AS(cloudabi_sys_sock_stat_get_args), (sy_call_t *)cloudabi_sys_sock_stat_get, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 48 = cloudabi_sys_sock_stat_get */ + { AS(cloudabi32_sys_thread_create_args), (sy_call_t *)cloudabi32_sys_thread_create, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 49 = cloudabi32_sys_thread_create */ + { AS(cloudabi_sys_thread_exit_args), (sy_call_t *)cloudabi_sys_thread_exit, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 50 = cloudabi_sys_thread_exit */ + { 0, (sy_call_t *)cloudabi_sys_thread_yield, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 51 = cloudabi_sys_thread_yield */ }; Index: projects/runtime-coverage/sys/compat/cloudabi32/cloudabi32_systrace_args.c =================================================================== --- projects/runtime-coverage/sys/compat/cloudabi32/cloudabi32_systrace_args.c (revision 322921) +++ projects/runtime-coverage/sys/compat/cloudabi32/cloudabi32_systrace_args.c (revision 322922) @@ -1,1650 +1,1556 @@ /* * System call argument to DTrace register array converstion. * * DO NOT EDIT-- this file is automatically generated. * $FreeBSD$ * This file is part of the DTrace syscall provider. */ static void systrace_args(int sysnum, void *params, uint64_t *uarg, int *n_args) { int64_t *iarg = (int64_t *) uarg; switch (sysnum) { /* cloudabi_sys_clock_res_get */ case 0: { struct cloudabi_sys_clock_res_get_args *p = params; iarg[0] = p->clock_id; /* cloudabi_clockid_t */ *n_args = 1; break; } /* cloudabi_sys_clock_time_get */ case 1: { struct cloudabi_sys_clock_time_get_args *p = params; iarg[0] = p->clock_id; /* cloudabi_clockid_t */ iarg[1] = p->precision; /* cloudabi_timestamp_t */ *n_args = 2; break; } /* cloudabi_sys_condvar_signal */ case 2: { struct cloudabi_sys_condvar_signal_args *p = params; uarg[0] = (intptr_t) p->condvar; /* cloudabi_condvar_t * */ iarg[1] = p->scope; /* cloudabi_scope_t */ iarg[2] = p->nwaiters; /* cloudabi_nthreads_t */ *n_args = 3; break; } /* cloudabi_sys_fd_close */ case 3: { struct cloudabi_sys_fd_close_args *p = params; iarg[0] = p->fd; /* cloudabi_fd_t */ *n_args = 1; break; } /* cloudabi_sys_fd_create1 */ case 4: { struct cloudabi_sys_fd_create1_args *p = params; iarg[0] = p->type; /* cloudabi_filetype_t */ *n_args = 1; break; } /* cloudabi_sys_fd_create2 */ case 5: { struct cloudabi_sys_fd_create2_args *p = params; iarg[0] = p->type; /* cloudabi_filetype_t */ *n_args = 1; break; } /* cloudabi_sys_fd_datasync */ case 6: { struct cloudabi_sys_fd_datasync_args *p = params; iarg[0] = p->fd; /* cloudabi_fd_t */ *n_args = 1; break; } /* cloudabi_sys_fd_dup */ case 7: { struct cloudabi_sys_fd_dup_args *p = params; iarg[0] = p->from; /* cloudabi_fd_t */ *n_args = 1; break; } /* cloudabi32_sys_fd_pread */ case 8: { struct cloudabi32_sys_fd_pread_args *p = params; iarg[0] = p->fd; /* cloudabi_fd_t */ uarg[1] = (intptr_t) p->iovs; /* const cloudabi32_iovec_t * */ uarg[2] = p->iovs_len; /* size_t */ iarg[3] = p->offset; /* cloudabi_filesize_t */ *n_args = 4; break; } /* cloudabi32_sys_fd_pwrite */ case 9: { struct cloudabi32_sys_fd_pwrite_args *p = params; iarg[0] = p->fd; /* cloudabi_fd_t */ uarg[1] = (intptr_t) p->iovs; /* const cloudabi32_ciovec_t * */ uarg[2] = p->iovs_len; /* size_t */ iarg[3] = p->offset; /* 
cloudabi_filesize_t */ *n_args = 4; break; } /* cloudabi32_sys_fd_read */ case 10: { struct cloudabi32_sys_fd_read_args *p = params; iarg[0] = p->fd; /* cloudabi_fd_t */ uarg[1] = (intptr_t) p->iovs; /* const cloudabi32_iovec_t * */ uarg[2] = p->iovs_len; /* size_t */ *n_args = 3; break; } /* cloudabi_sys_fd_replace */ case 11: { struct cloudabi_sys_fd_replace_args *p = params; iarg[0] = p->from; /* cloudabi_fd_t */ iarg[1] = p->to; /* cloudabi_fd_t */ *n_args = 2; break; } /* cloudabi_sys_fd_seek */ case 12: { struct cloudabi_sys_fd_seek_args *p = params; iarg[0] = p->fd; /* cloudabi_fd_t */ iarg[1] = p->offset; /* cloudabi_filedelta_t */ iarg[2] = p->whence; /* cloudabi_whence_t */ *n_args = 3; break; } /* cloudabi_sys_fd_stat_get */ case 13: { struct cloudabi_sys_fd_stat_get_args *p = params; iarg[0] = p->fd; /* cloudabi_fd_t */ uarg[1] = (intptr_t) p->buf; /* cloudabi_fdstat_t * */ *n_args = 2; break; } /* cloudabi_sys_fd_stat_put */ case 14: { struct cloudabi_sys_fd_stat_put_args *p = params; iarg[0] = p->fd; /* cloudabi_fd_t */ uarg[1] = (intptr_t) p->buf; /* const cloudabi_fdstat_t * */ iarg[2] = p->flags; /* cloudabi_fdsflags_t */ *n_args = 3; break; } /* cloudabi_sys_fd_sync */ case 15: { struct cloudabi_sys_fd_sync_args *p = params; iarg[0] = p->fd; /* cloudabi_fd_t */ *n_args = 1; break; } /* cloudabi32_sys_fd_write */ case 16: { struct cloudabi32_sys_fd_write_args *p = params; iarg[0] = p->fd; /* cloudabi_fd_t */ uarg[1] = (intptr_t) p->iovs; /* const cloudabi32_ciovec_t * */ uarg[2] = p->iovs_len; /* size_t */ *n_args = 3; break; } /* cloudabi_sys_file_advise */ case 17: { struct cloudabi_sys_file_advise_args *p = params; iarg[0] = p->fd; /* cloudabi_fd_t */ iarg[1] = p->offset; /* cloudabi_filesize_t */ iarg[2] = p->len; /* cloudabi_filesize_t */ iarg[3] = p->advice; /* cloudabi_advice_t */ *n_args = 4; break; } /* cloudabi_sys_file_allocate */ case 18: { struct cloudabi_sys_file_allocate_args *p = params; iarg[0] = p->fd; /* cloudabi_fd_t */ iarg[1] = p->offset; /* cloudabi_filesize_t */ iarg[2] = p->len; /* cloudabi_filesize_t */ *n_args = 3; break; } /* cloudabi_sys_file_create */ case 19: { struct cloudabi_sys_file_create_args *p = params; iarg[0] = p->fd; /* cloudabi_fd_t */ uarg[1] = (intptr_t) p->path; /* const char * */ uarg[2] = p->path_len; /* size_t */ iarg[3] = p->type; /* cloudabi_filetype_t */ *n_args = 4; break; } /* cloudabi_sys_file_link */ case 20: { struct cloudabi_sys_file_link_args *p = params; iarg[0] = p->fd1; /* cloudabi_lookup_t */ uarg[1] = (intptr_t) p->path1; /* const char * */ uarg[2] = p->path1_len; /* size_t */ iarg[3] = p->fd2; /* cloudabi_fd_t */ uarg[4] = (intptr_t) p->path2; /* const char * */ uarg[5] = p->path2_len; /* size_t */ *n_args = 6; break; } /* cloudabi_sys_file_open */ case 21: { struct cloudabi_sys_file_open_args *p = params; iarg[0] = p->dirfd; /* cloudabi_lookup_t */ uarg[1] = (intptr_t) p->path; /* const char * */ uarg[2] = p->path_len; /* size_t */ iarg[3] = p->oflags; /* cloudabi_oflags_t */ uarg[4] = (intptr_t) p->fds; /* const cloudabi_fdstat_t * */ *n_args = 5; break; } /* cloudabi_sys_file_readdir */ case 22: { struct cloudabi_sys_file_readdir_args *p = params; iarg[0] = p->fd; /* cloudabi_fd_t */ uarg[1] = (intptr_t) p->buf; /* void * */ uarg[2] = p->buf_len; /* size_t */ iarg[3] = p->cookie; /* cloudabi_dircookie_t */ *n_args = 4; break; } /* cloudabi_sys_file_readlink */ case 23: { struct cloudabi_sys_file_readlink_args *p = params; iarg[0] = p->fd; /* cloudabi_fd_t */ uarg[1] = (intptr_t) p->path; /* const char * */ 
uarg[2] = p->path_len; /* size_t */ uarg[3] = (intptr_t) p->buf; /* char * */ uarg[4] = p->buf_len; /* size_t */ *n_args = 5; break; } /* cloudabi_sys_file_rename */ case 24: { struct cloudabi_sys_file_rename_args *p = params; iarg[0] = p->fd1; /* cloudabi_fd_t */ uarg[1] = (intptr_t) p->path1; /* const char * */ uarg[2] = p->path1_len; /* size_t */ iarg[3] = p->fd2; /* cloudabi_fd_t */ uarg[4] = (intptr_t) p->path2; /* const char * */ uarg[5] = p->path2_len; /* size_t */ *n_args = 6; break; } /* cloudabi_sys_file_stat_fget */ case 25: { struct cloudabi_sys_file_stat_fget_args *p = params; iarg[0] = p->fd; /* cloudabi_fd_t */ uarg[1] = (intptr_t) p->buf; /* cloudabi_filestat_t * */ *n_args = 2; break; } /* cloudabi_sys_file_stat_fput */ case 26: { struct cloudabi_sys_file_stat_fput_args *p = params; iarg[0] = p->fd; /* cloudabi_fd_t */ uarg[1] = (intptr_t) p->buf; /* const cloudabi_filestat_t * */ iarg[2] = p->flags; /* cloudabi_fsflags_t */ *n_args = 3; break; } /* cloudabi_sys_file_stat_get */ case 27: { struct cloudabi_sys_file_stat_get_args *p = params; iarg[0] = p->fd; /* cloudabi_lookup_t */ uarg[1] = (intptr_t) p->path; /* const char * */ uarg[2] = p->path_len; /* size_t */ uarg[3] = (intptr_t) p->buf; /* cloudabi_filestat_t * */ *n_args = 4; break; } /* cloudabi_sys_file_stat_put */ case 28: { struct cloudabi_sys_file_stat_put_args *p = params; iarg[0] = p->fd; /* cloudabi_lookup_t */ uarg[1] = (intptr_t) p->path; /* const char * */ uarg[2] = p->path_len; /* size_t */ uarg[3] = (intptr_t) p->buf; /* const cloudabi_filestat_t * */ iarg[4] = p->flags; /* cloudabi_fsflags_t */ *n_args = 5; break; } /* cloudabi_sys_file_symlink */ case 29: { struct cloudabi_sys_file_symlink_args *p = params; uarg[0] = (intptr_t) p->path1; /* const char * */ uarg[1] = p->path1_len; /* size_t */ iarg[2] = p->fd; /* cloudabi_fd_t */ uarg[3] = (intptr_t) p->path2; /* const char * */ uarg[4] = p->path2_len; /* size_t */ *n_args = 5; break; } /* cloudabi_sys_file_unlink */ case 30: { struct cloudabi_sys_file_unlink_args *p = params; iarg[0] = p->fd; /* cloudabi_fd_t */ uarg[1] = (intptr_t) p->path; /* const char * */ uarg[2] = p->path_len; /* size_t */ iarg[3] = p->flags; /* cloudabi_ulflags_t */ *n_args = 4; break; } /* cloudabi_sys_lock_unlock */ case 31: { struct cloudabi_sys_lock_unlock_args *p = params; uarg[0] = (intptr_t) p->lock; /* cloudabi_lock_t * */ iarg[1] = p->scope; /* cloudabi_scope_t */ *n_args = 2; break; } /* cloudabi_sys_mem_advise */ case 32: { struct cloudabi_sys_mem_advise_args *p = params; uarg[0] = (intptr_t) p->mapping; /* void * */ uarg[1] = p->mapping_len; /* size_t */ iarg[2] = p->advice; /* cloudabi_advice_t */ *n_args = 3; break; } /* cloudabi_sys_mem_map */ case 33: { struct cloudabi_sys_mem_map_args *p = params; uarg[0] = (intptr_t) p->addr; /* void * */ uarg[1] = p->len; /* size_t */ iarg[2] = p->prot; /* cloudabi_mprot_t */ iarg[3] = p->flags; /* cloudabi_mflags_t */ iarg[4] = p->fd; /* cloudabi_fd_t */ iarg[5] = p->off; /* cloudabi_filesize_t */ *n_args = 6; break; } /* cloudabi_sys_mem_protect */ case 34: { struct cloudabi_sys_mem_protect_args *p = params; uarg[0] = (intptr_t) p->mapping; /* void * */ uarg[1] = p->mapping_len; /* size_t */ iarg[2] = p->prot; /* cloudabi_mprot_t */ *n_args = 3; break; } /* cloudabi_sys_mem_sync */ case 35: { struct cloudabi_sys_mem_sync_args *p = params; uarg[0] = (intptr_t) p->mapping; /* void * */ uarg[1] = p->mapping_len; /* size_t */ iarg[2] = p->flags; /* cloudabi_msflags_t */ *n_args = 3; break; } /* cloudabi_sys_mem_unmap */ case 
36: { struct cloudabi_sys_mem_unmap_args *p = params; uarg[0] = (intptr_t) p->mapping; /* void * */ uarg[1] = p->mapping_len; /* size_t */ *n_args = 2; break; } /* cloudabi32_sys_poll */ case 37: { struct cloudabi32_sys_poll_args *p = params; uarg[0] = (intptr_t) p->in; /* const cloudabi32_subscription_t * */ uarg[1] = (intptr_t) p->out; /* cloudabi32_event_t * */ uarg[2] = p->nsubscriptions; /* size_t */ *n_args = 3; break; } /* cloudabi32_sys_poll_fd */ case 38: { struct cloudabi32_sys_poll_fd_args *p = params; iarg[0] = p->fd; /* cloudabi_fd_t */ uarg[1] = (intptr_t) p->in; /* const cloudabi32_subscription_t * */ uarg[2] = p->in_len; /* size_t */ uarg[3] = (intptr_t) p->out; /* cloudabi32_event_t * */ uarg[4] = p->out_len; /* size_t */ uarg[5] = (intptr_t) p->timeout; /* const cloudabi32_subscription_t * */ *n_args = 6; break; } /* cloudabi_sys_proc_exec */ case 39: { struct cloudabi_sys_proc_exec_args *p = params; iarg[0] = p->fd; /* cloudabi_fd_t */ uarg[1] = (intptr_t) p->data; /* const void * */ uarg[2] = p->data_len; /* size_t */ uarg[3] = (intptr_t) p->fds; /* const cloudabi_fd_t * */ uarg[4] = p->fds_len; /* size_t */ *n_args = 5; break; } /* cloudabi_sys_proc_exit */ case 40: { struct cloudabi_sys_proc_exit_args *p = params; iarg[0] = p->rval; /* cloudabi_exitcode_t */ *n_args = 1; break; } /* cloudabi_sys_proc_fork */ case 41: { *n_args = 0; break; } /* cloudabi_sys_proc_raise */ case 42: { struct cloudabi_sys_proc_raise_args *p = params; iarg[0] = p->sig; /* cloudabi_signal_t */ *n_args = 1; break; } /* cloudabi_sys_random_get */ case 43: { struct cloudabi_sys_random_get_args *p = params; uarg[0] = (intptr_t) p->buf; /* void * */ uarg[1] = p->buf_len; /* size_t */ *n_args = 2; break; } /* cloudabi_sys_sock_accept */ case 44: { struct cloudabi_sys_sock_accept_args *p = params; iarg[0] = p->sock; /* cloudabi_fd_t */ uarg[1] = (intptr_t) p->unused; /* void * */ *n_args = 2; break; } - /* cloudabi_sys_sock_bind */ - case 45: { - struct cloudabi_sys_sock_bind_args *p = params; - iarg[0] = p->sock; /* cloudabi_fd_t */ - iarg[1] = p->fd; /* cloudabi_fd_t */ - uarg[2] = (intptr_t) p->path; /* const char * */ - uarg[3] = p->path_len; /* size_t */ - *n_args = 4; - break; - } - /* cloudabi_sys_sock_connect */ - case 46: { - struct cloudabi_sys_sock_connect_args *p = params; - iarg[0] = p->sock; /* cloudabi_fd_t */ - iarg[1] = p->fd; /* cloudabi_fd_t */ - uarg[2] = (intptr_t) p->path; /* const char * */ - uarg[3] = p->path_len; /* size_t */ - *n_args = 4; - break; - } - /* cloudabi_sys_sock_listen */ - case 47: { - struct cloudabi_sys_sock_listen_args *p = params; - iarg[0] = p->sock; /* cloudabi_fd_t */ - iarg[1] = p->backlog; /* cloudabi_backlog_t */ - *n_args = 2; - break; - } /* cloudabi32_sys_sock_recv */ - case 48: { + case 45: { struct cloudabi32_sys_sock_recv_args *p = params; iarg[0] = p->sock; /* cloudabi_fd_t */ uarg[1] = (intptr_t) p->in; /* const cloudabi32_recv_in_t * */ uarg[2] = (intptr_t) p->out; /* cloudabi32_recv_out_t * */ *n_args = 3; break; } /* cloudabi32_sys_sock_send */ - case 49: { + case 46: { struct cloudabi32_sys_sock_send_args *p = params; iarg[0] = p->sock; /* cloudabi_fd_t */ uarg[1] = (intptr_t) p->in; /* const cloudabi32_send_in_t * */ uarg[2] = (intptr_t) p->out; /* cloudabi32_send_out_t * */ *n_args = 3; break; } /* cloudabi_sys_sock_shutdown */ - case 50: { + case 47: { struct cloudabi_sys_sock_shutdown_args *p = params; iarg[0] = p->sock; /* cloudabi_fd_t */ iarg[1] = p->how; /* cloudabi_sdflags_t */ *n_args = 2; break; } /* cloudabi_sys_sock_stat_get 
*/ - case 51: { + case 48: { struct cloudabi_sys_sock_stat_get_args *p = params; iarg[0] = p->sock; /* cloudabi_fd_t */ uarg[1] = (intptr_t) p->buf; /* cloudabi_sockstat_t * */ iarg[2] = p->flags; /* cloudabi_ssflags_t */ *n_args = 3; break; } /* cloudabi32_sys_thread_create */ - case 52: { + case 49: { struct cloudabi32_sys_thread_create_args *p = params; uarg[0] = (intptr_t) p->attr; /* cloudabi32_threadattr_t * */ *n_args = 1; break; } /* cloudabi_sys_thread_exit */ - case 53: { + case 50: { struct cloudabi_sys_thread_exit_args *p = params; uarg[0] = (intptr_t) p->lock; /* cloudabi_lock_t * */ iarg[1] = p->scope; /* cloudabi_scope_t */ *n_args = 2; break; } /* cloudabi_sys_thread_yield */ - case 54: { + case 51: { *n_args = 0; break; } default: *n_args = 0; break; }; } static void systrace_entry_setargdesc(int sysnum, int ndx, char *desc, size_t descsz) { const char *p = NULL; switch (sysnum) { /* cloudabi_sys_clock_res_get */ case 0: switch(ndx) { case 0: p = "cloudabi_clockid_t"; break; default: break; }; break; /* cloudabi_sys_clock_time_get */ case 1: switch(ndx) { case 0: p = "cloudabi_clockid_t"; break; case 1: p = "cloudabi_timestamp_t"; break; default: break; }; break; /* cloudabi_sys_condvar_signal */ case 2: switch(ndx) { case 0: p = "userland cloudabi_condvar_t *"; break; case 1: p = "cloudabi_scope_t"; break; case 2: p = "cloudabi_nthreads_t"; break; default: break; }; break; /* cloudabi_sys_fd_close */ case 3: switch(ndx) { case 0: p = "cloudabi_fd_t"; break; default: break; }; break; /* cloudabi_sys_fd_create1 */ case 4: switch(ndx) { case 0: p = "cloudabi_filetype_t"; break; default: break; }; break; /* cloudabi_sys_fd_create2 */ case 5: switch(ndx) { case 0: p = "cloudabi_filetype_t"; break; default: break; }; break; /* cloudabi_sys_fd_datasync */ case 6: switch(ndx) { case 0: p = "cloudabi_fd_t"; break; default: break; }; break; /* cloudabi_sys_fd_dup */ case 7: switch(ndx) { case 0: p = "cloudabi_fd_t"; break; default: break; }; break; /* cloudabi32_sys_fd_pread */ case 8: switch(ndx) { case 0: p = "cloudabi_fd_t"; break; case 1: p = "userland const cloudabi32_iovec_t *"; break; case 2: p = "size_t"; break; case 3: p = "cloudabi_filesize_t"; break; default: break; }; break; /* cloudabi32_sys_fd_pwrite */ case 9: switch(ndx) { case 0: p = "cloudabi_fd_t"; break; case 1: p = "userland const cloudabi32_ciovec_t *"; break; case 2: p = "size_t"; break; case 3: p = "cloudabi_filesize_t"; break; default: break; }; break; /* cloudabi32_sys_fd_read */ case 10: switch(ndx) { case 0: p = "cloudabi_fd_t"; break; case 1: p = "userland const cloudabi32_iovec_t *"; break; case 2: p = "size_t"; break; default: break; }; break; /* cloudabi_sys_fd_replace */ case 11: switch(ndx) { case 0: p = "cloudabi_fd_t"; break; case 1: p = "cloudabi_fd_t"; break; default: break; }; break; /* cloudabi_sys_fd_seek */ case 12: switch(ndx) { case 0: p = "cloudabi_fd_t"; break; case 1: p = "cloudabi_filedelta_t"; break; case 2: p = "cloudabi_whence_t"; break; default: break; }; break; /* cloudabi_sys_fd_stat_get */ case 13: switch(ndx) { case 0: p = "cloudabi_fd_t"; break; case 1: p = "userland cloudabi_fdstat_t *"; break; default: break; }; break; /* cloudabi_sys_fd_stat_put */ case 14: switch(ndx) { case 0: p = "cloudabi_fd_t"; break; case 1: p = "userland const cloudabi_fdstat_t *"; break; case 2: p = "cloudabi_fdsflags_t"; break; default: break; }; break; /* cloudabi_sys_fd_sync */ case 15: switch(ndx) { case 0: p = "cloudabi_fd_t"; break; default: break; }; break; /* cloudabi32_sys_fd_write */ case 
16: switch(ndx) { case 0: p = "cloudabi_fd_t"; break; case 1: p = "userland const cloudabi32_ciovec_t *"; break; case 2: p = "size_t"; break; default: break; }; break; /* cloudabi_sys_file_advise */ case 17: switch(ndx) { case 0: p = "cloudabi_fd_t"; break; case 1: p = "cloudabi_filesize_t"; break; case 2: p = "cloudabi_filesize_t"; break; case 3: p = "cloudabi_advice_t"; break; default: break; }; break; /* cloudabi_sys_file_allocate */ case 18: switch(ndx) { case 0: p = "cloudabi_fd_t"; break; case 1: p = "cloudabi_filesize_t"; break; case 2: p = "cloudabi_filesize_t"; break; default: break; }; break; /* cloudabi_sys_file_create */ case 19: switch(ndx) { case 0: p = "cloudabi_fd_t"; break; case 1: p = "userland const char *"; break; case 2: p = "size_t"; break; case 3: p = "cloudabi_filetype_t"; break; default: break; }; break; /* cloudabi_sys_file_link */ case 20: switch(ndx) { case 0: p = "cloudabi_lookup_t"; break; case 1: p = "userland const char *"; break; case 2: p = "size_t"; break; case 3: p = "cloudabi_fd_t"; break; case 4: p = "userland const char *"; break; case 5: p = "size_t"; break; default: break; }; break; /* cloudabi_sys_file_open */ case 21: switch(ndx) { case 0: p = "cloudabi_lookup_t"; break; case 1: p = "userland const char *"; break; case 2: p = "size_t"; break; case 3: p = "cloudabi_oflags_t"; break; case 4: p = "userland const cloudabi_fdstat_t *"; break; default: break; }; break; /* cloudabi_sys_file_readdir */ case 22: switch(ndx) { case 0: p = "cloudabi_fd_t"; break; case 1: p = "userland void *"; break; case 2: p = "size_t"; break; case 3: p = "cloudabi_dircookie_t"; break; default: break; }; break; /* cloudabi_sys_file_readlink */ case 23: switch(ndx) { case 0: p = "cloudabi_fd_t"; break; case 1: p = "userland const char *"; break; case 2: p = "size_t"; break; case 3: p = "userland char *"; break; case 4: p = "size_t"; break; default: break; }; break; /* cloudabi_sys_file_rename */ case 24: switch(ndx) { case 0: p = "cloudabi_fd_t"; break; case 1: p = "userland const char *"; break; case 2: p = "size_t"; break; case 3: p = "cloudabi_fd_t"; break; case 4: p = "userland const char *"; break; case 5: p = "size_t"; break; default: break; }; break; /* cloudabi_sys_file_stat_fget */ case 25: switch(ndx) { case 0: p = "cloudabi_fd_t"; break; case 1: p = "userland cloudabi_filestat_t *"; break; default: break; }; break; /* cloudabi_sys_file_stat_fput */ case 26: switch(ndx) { case 0: p = "cloudabi_fd_t"; break; case 1: p = "userland const cloudabi_filestat_t *"; break; case 2: p = "cloudabi_fsflags_t"; break; default: break; }; break; /* cloudabi_sys_file_stat_get */ case 27: switch(ndx) { case 0: p = "cloudabi_lookup_t"; break; case 1: p = "userland const char *"; break; case 2: p = "size_t"; break; case 3: p = "userland cloudabi_filestat_t *"; break; default: break; }; break; /* cloudabi_sys_file_stat_put */ case 28: switch(ndx) { case 0: p = "cloudabi_lookup_t"; break; case 1: p = "userland const char *"; break; case 2: p = "size_t"; break; case 3: p = "userland const cloudabi_filestat_t *"; break; case 4: p = "cloudabi_fsflags_t"; break; default: break; }; break; /* cloudabi_sys_file_symlink */ case 29: switch(ndx) { case 0: p = "userland const char *"; break; case 1: p = "size_t"; break; case 2: p = "cloudabi_fd_t"; break; case 3: p = "userland const char *"; break; case 4: p = "size_t"; break; default: break; }; break; /* cloudabi_sys_file_unlink */ case 30: switch(ndx) { case 0: p = "cloudabi_fd_t"; break; case 1: p = "userland const char *"; break; case 2: p = 
"size_t"; break; case 3: p = "cloudabi_ulflags_t"; break; default: break; }; break; /* cloudabi_sys_lock_unlock */ case 31: switch(ndx) { case 0: p = "userland cloudabi_lock_t *"; break; case 1: p = "cloudabi_scope_t"; break; default: break; }; break; /* cloudabi_sys_mem_advise */ case 32: switch(ndx) { case 0: p = "userland void *"; break; case 1: p = "size_t"; break; case 2: p = "cloudabi_advice_t"; break; default: break; }; break; /* cloudabi_sys_mem_map */ case 33: switch(ndx) { case 0: p = "userland void *"; break; case 1: p = "size_t"; break; case 2: p = "cloudabi_mprot_t"; break; case 3: p = "cloudabi_mflags_t"; break; case 4: p = "cloudabi_fd_t"; break; case 5: p = "cloudabi_filesize_t"; break; default: break; }; break; /* cloudabi_sys_mem_protect */ case 34: switch(ndx) { case 0: p = "userland void *"; break; case 1: p = "size_t"; break; case 2: p = "cloudabi_mprot_t"; break; default: break; }; break; /* cloudabi_sys_mem_sync */ case 35: switch(ndx) { case 0: p = "userland void *"; break; case 1: p = "size_t"; break; case 2: p = "cloudabi_msflags_t"; break; default: break; }; break; /* cloudabi_sys_mem_unmap */ case 36: switch(ndx) { case 0: p = "userland void *"; break; case 1: p = "size_t"; break; default: break; }; break; /* cloudabi32_sys_poll */ case 37: switch(ndx) { case 0: p = "userland const cloudabi32_subscription_t *"; break; case 1: p = "userland cloudabi32_event_t *"; break; case 2: p = "size_t"; break; default: break; }; break; /* cloudabi32_sys_poll_fd */ case 38: switch(ndx) { case 0: p = "cloudabi_fd_t"; break; case 1: p = "userland const cloudabi32_subscription_t *"; break; case 2: p = "size_t"; break; case 3: p = "userland cloudabi32_event_t *"; break; case 4: p = "size_t"; break; case 5: p = "userland const cloudabi32_subscription_t *"; break; default: break; }; break; /* cloudabi_sys_proc_exec */ case 39: switch(ndx) { case 0: p = "cloudabi_fd_t"; break; case 1: p = "userland const void *"; break; case 2: p = "size_t"; break; case 3: p = "userland const cloudabi_fd_t *"; break; case 4: p = "size_t"; break; default: break; }; break; /* cloudabi_sys_proc_exit */ case 40: switch(ndx) { case 0: p = "cloudabi_exitcode_t"; break; default: break; }; break; /* cloudabi_sys_proc_fork */ case 41: break; /* cloudabi_sys_proc_raise */ case 42: switch(ndx) { case 0: p = "cloudabi_signal_t"; break; default: break; }; break; /* cloudabi_sys_random_get */ case 43: switch(ndx) { case 0: p = "userland void *"; break; case 1: p = "size_t"; break; default: break; }; break; /* cloudabi_sys_sock_accept */ case 44: switch(ndx) { case 0: p = "cloudabi_fd_t"; break; case 1: p = "userland void *"; break; default: break; }; break; - /* cloudabi_sys_sock_bind */ + /* cloudabi32_sys_sock_recv */ case 45: switch(ndx) { case 0: p = "cloudabi_fd_t"; break; case 1: - p = "cloudabi_fd_t"; - break; - case 2: - p = "userland const char *"; - break; - case 3: - p = "size_t"; - break; - default: - break; - }; - break; - /* cloudabi_sys_sock_connect */ - case 46: - switch(ndx) { - case 0: - p = "cloudabi_fd_t"; - break; - case 1: - p = "cloudabi_fd_t"; - break; - case 2: - p = "userland const char *"; - break; - case 3: - p = "size_t"; - break; - default: - break; - }; - break; - /* cloudabi_sys_sock_listen */ - case 47: - switch(ndx) { - case 0: - p = "cloudabi_fd_t"; - break; - case 1: - p = "cloudabi_backlog_t"; - break; - default: - break; - }; - break; - /* cloudabi32_sys_sock_recv */ - case 48: - switch(ndx) { - case 0: - p = "cloudabi_fd_t"; - break; - case 1: p = "userland const 
cloudabi32_recv_in_t *"; break; case 2: p = "userland cloudabi32_recv_out_t *"; break; default: break; }; break; /* cloudabi32_sys_sock_send */ - case 49: + case 46: switch(ndx) { case 0: p = "cloudabi_fd_t"; break; case 1: p = "userland const cloudabi32_send_in_t *"; break; case 2: p = "userland cloudabi32_send_out_t *"; break; default: break; }; break; /* cloudabi_sys_sock_shutdown */ - case 50: + case 47: switch(ndx) { case 0: p = "cloudabi_fd_t"; break; case 1: p = "cloudabi_sdflags_t"; break; default: break; }; break; /* cloudabi_sys_sock_stat_get */ - case 51: + case 48: switch(ndx) { case 0: p = "cloudabi_fd_t"; break; case 1: p = "userland cloudabi_sockstat_t *"; break; case 2: p = "cloudabi_ssflags_t"; break; default: break; }; break; /* cloudabi32_sys_thread_create */ - case 52: + case 49: switch(ndx) { case 0: p = "userland cloudabi32_threadattr_t *"; break; default: break; }; break; /* cloudabi_sys_thread_exit */ - case 53: + case 50: switch(ndx) { case 0: p = "userland cloudabi_lock_t *"; break; case 1: p = "cloudabi_scope_t"; break; default: break; }; break; /* cloudabi_sys_thread_yield */ - case 54: + case 51: break; default: break; }; if (p != NULL) strlcpy(desc, p, descsz); } static void systrace_return_setargdesc(int sysnum, int ndx, char *desc, size_t descsz) { const char *p = NULL; switch (sysnum) { /* cloudabi_sys_clock_res_get */ case 0: if (ndx == 0 || ndx == 1) p = "cloudabi_timestamp_t"; break; /* cloudabi_sys_clock_time_get */ case 1: if (ndx == 0 || ndx == 1) p = "cloudabi_timestamp_t"; break; /* cloudabi_sys_condvar_signal */ case 2: if (ndx == 0 || ndx == 1) p = "void"; break; /* cloudabi_sys_fd_close */ case 3: if (ndx == 0 || ndx == 1) p = "void"; break; /* cloudabi_sys_fd_create1 */ case 4: if (ndx == 0 || ndx == 1) p = "cloudabi_fd_t"; break; /* cloudabi_sys_fd_create2 */ case 5: if (ndx == 0 || ndx == 1) p = "void"; break; /* cloudabi_sys_fd_datasync */ case 6: if (ndx == 0 || ndx == 1) p = "void"; break; /* cloudabi_sys_fd_dup */ case 7: if (ndx == 0 || ndx == 1) p = "cloudabi_fd_t"; break; /* cloudabi32_sys_fd_pread */ case 8: if (ndx == 0 || ndx == 1) p = "size_t"; break; /* cloudabi32_sys_fd_pwrite */ case 9: if (ndx == 0 || ndx == 1) p = "size_t"; break; /* cloudabi32_sys_fd_read */ case 10: if (ndx == 0 || ndx == 1) p = "size_t"; break; /* cloudabi_sys_fd_replace */ case 11: if (ndx == 0 || ndx == 1) p = "void"; break; /* cloudabi_sys_fd_seek */ case 12: if (ndx == 0 || ndx == 1) p = "cloudabi_filesize_t"; break; /* cloudabi_sys_fd_stat_get */ case 13: if (ndx == 0 || ndx == 1) p = "void"; break; /* cloudabi_sys_fd_stat_put */ case 14: if (ndx == 0 || ndx == 1) p = "void"; break; /* cloudabi_sys_fd_sync */ case 15: if (ndx == 0 || ndx == 1) p = "void"; break; /* cloudabi32_sys_fd_write */ case 16: if (ndx == 0 || ndx == 1) p = "size_t"; break; /* cloudabi_sys_file_advise */ case 17: if (ndx == 0 || ndx == 1) p = "void"; break; /* cloudabi_sys_file_allocate */ case 18: if (ndx == 0 || ndx == 1) p = "void"; break; /* cloudabi_sys_file_create */ case 19: if (ndx == 0 || ndx == 1) p = "void"; break; /* cloudabi_sys_file_link */ case 20: if (ndx == 0 || ndx == 1) p = "void"; break; /* cloudabi_sys_file_open */ case 21: if (ndx == 0 || ndx == 1) p = "cloudabi_fd_t"; break; /* cloudabi_sys_file_readdir */ case 22: if (ndx == 0 || ndx == 1) p = "size_t"; break; /* cloudabi_sys_file_readlink */ case 23: if (ndx == 0 || ndx == 1) p = "size_t"; break; /* cloudabi_sys_file_rename */ case 24: if (ndx == 0 || ndx == 1) p = "void"; break; /* 
cloudabi_sys_file_stat_fget */ case 25: if (ndx == 0 || ndx == 1) p = "void"; break; /* cloudabi_sys_file_stat_fput */ case 26: if (ndx == 0 || ndx == 1) p = "void"; break; /* cloudabi_sys_file_stat_get */ case 27: if (ndx == 0 || ndx == 1) p = "void"; break; /* cloudabi_sys_file_stat_put */ case 28: if (ndx == 0 || ndx == 1) p = "void"; break; /* cloudabi_sys_file_symlink */ case 29: if (ndx == 0 || ndx == 1) p = "void"; break; /* cloudabi_sys_file_unlink */ case 30: if (ndx == 0 || ndx == 1) p = "void"; break; /* cloudabi_sys_lock_unlock */ case 31: if (ndx == 0 || ndx == 1) p = "void"; break; /* cloudabi_sys_mem_advise */ case 32: if (ndx == 0 || ndx == 1) p = "void"; break; /* cloudabi_sys_mem_map */ case 33: if (ndx == 0 || ndx == 1) p = "void"; break; /* cloudabi_sys_mem_protect */ case 34: if (ndx == 0 || ndx == 1) p = "void"; break; /* cloudabi_sys_mem_sync */ case 35: if (ndx == 0 || ndx == 1) p = "void"; break; /* cloudabi_sys_mem_unmap */ case 36: if (ndx == 0 || ndx == 1) p = "void"; break; /* cloudabi32_sys_poll */ case 37: if (ndx == 0 || ndx == 1) p = "size_t"; break; /* cloudabi32_sys_poll_fd */ case 38: if (ndx == 0 || ndx == 1) p = "size_t"; break; /* cloudabi_sys_proc_exec */ case 39: if (ndx == 0 || ndx == 1) p = "void"; break; /* cloudabi_sys_proc_exit */ case 40: if (ndx == 0 || ndx == 1) p = "void"; break; /* cloudabi_sys_proc_fork */ case 41: /* cloudabi_sys_proc_raise */ case 42: if (ndx == 0 || ndx == 1) p = "void"; break; /* cloudabi_sys_random_get */ case 43: if (ndx == 0 || ndx == 1) p = "void"; break; /* cloudabi_sys_sock_accept */ case 44: if (ndx == 0 || ndx == 1) p = "cloudabi_fd_t"; break; - /* cloudabi_sys_sock_bind */ + /* cloudabi32_sys_sock_recv */ case 45: if (ndx == 0 || ndx == 1) p = "void"; break; - /* cloudabi_sys_sock_connect */ + /* cloudabi32_sys_sock_send */ case 46: if (ndx == 0 || ndx == 1) p = "void"; break; - /* cloudabi_sys_sock_listen */ + /* cloudabi_sys_sock_shutdown */ case 47: if (ndx == 0 || ndx == 1) p = "void"; break; - /* cloudabi32_sys_sock_recv */ + /* cloudabi_sys_sock_stat_get */ case 48: if (ndx == 0 || ndx == 1) p = "void"; break; - /* cloudabi32_sys_sock_send */ + /* cloudabi32_sys_thread_create */ case 49: if (ndx == 0 || ndx == 1) - p = "void"; - break; - /* cloudabi_sys_sock_shutdown */ - case 50: - if (ndx == 0 || ndx == 1) - p = "void"; - break; - /* cloudabi_sys_sock_stat_get */ - case 51: - if (ndx == 0 || ndx == 1) - p = "void"; - break; - /* cloudabi32_sys_thread_create */ - case 52: - if (ndx == 0 || ndx == 1) p = "cloudabi_tid_t"; break; /* cloudabi_sys_thread_exit */ - case 53: + case 50: if (ndx == 0 || ndx == 1) p = "void"; break; /* cloudabi_sys_thread_yield */ - case 54: + case 51: default: break; }; if (p != NULL) strlcpy(desc, p, descsz); } Index: projects/runtime-coverage/sys/compat/cloudabi64/cloudabi64_proto.h =================================================================== --- projects/runtime-coverage/sys/compat/cloudabi64/cloudabi64_proto.h (revision 322921) +++ projects/runtime-coverage/sys/compat/cloudabi64/cloudabi64_proto.h (revision 322922) @@ -1,458 +1,436 @@ /* * System call prototypes. * * DO NOT EDIT-- this file is automatically generated. * $FreeBSD$ */ #ifndef _CLOUDABI64_SYSPROTO_H_ #define _CLOUDABI64_SYSPROTO_H_ #include #include #include #include #include #include #include #include struct proc; struct thread; #define PAD_(t) (sizeof(register_t) <= sizeof(t) ? 
\ 0 : sizeof(register_t) - sizeof(t)) #if BYTE_ORDER == LITTLE_ENDIAN #define PADL_(t) 0 #define PADR_(t) PAD_(t) #else #define PADL_(t) PAD_(t) #define PADR_(t) 0 #endif struct cloudabi_sys_clock_res_get_args { char clock_id_l_[PADL_(cloudabi_clockid_t)]; cloudabi_clockid_t clock_id; char clock_id_r_[PADR_(cloudabi_clockid_t)]; }; struct cloudabi_sys_clock_time_get_args { char clock_id_l_[PADL_(cloudabi_clockid_t)]; cloudabi_clockid_t clock_id; char clock_id_r_[PADR_(cloudabi_clockid_t)]; char precision_l_[PADL_(cloudabi_timestamp_t)]; cloudabi_timestamp_t precision; char precision_r_[PADR_(cloudabi_timestamp_t)]; }; struct cloudabi_sys_condvar_signal_args { char condvar_l_[PADL_(cloudabi_condvar_t *)]; cloudabi_condvar_t * condvar; char condvar_r_[PADR_(cloudabi_condvar_t *)]; char scope_l_[PADL_(cloudabi_scope_t)]; cloudabi_scope_t scope; char scope_r_[PADR_(cloudabi_scope_t)]; char nwaiters_l_[PADL_(cloudabi_nthreads_t)]; cloudabi_nthreads_t nwaiters; char nwaiters_r_[PADR_(cloudabi_nthreads_t)]; }; struct cloudabi_sys_fd_close_args { char fd_l_[PADL_(cloudabi_fd_t)]; cloudabi_fd_t fd; char fd_r_[PADR_(cloudabi_fd_t)]; }; struct cloudabi_sys_fd_create1_args { char type_l_[PADL_(cloudabi_filetype_t)]; cloudabi_filetype_t type; char type_r_[PADR_(cloudabi_filetype_t)]; }; struct cloudabi_sys_fd_create2_args { char type_l_[PADL_(cloudabi_filetype_t)]; cloudabi_filetype_t type; char type_r_[PADR_(cloudabi_filetype_t)]; }; struct cloudabi_sys_fd_datasync_args { char fd_l_[PADL_(cloudabi_fd_t)]; cloudabi_fd_t fd; char fd_r_[PADR_(cloudabi_fd_t)]; }; struct cloudabi_sys_fd_dup_args { char from_l_[PADL_(cloudabi_fd_t)]; cloudabi_fd_t from; char from_r_[PADR_(cloudabi_fd_t)]; }; struct cloudabi64_sys_fd_pread_args { char fd_l_[PADL_(cloudabi_fd_t)]; cloudabi_fd_t fd; char fd_r_[PADR_(cloudabi_fd_t)]; char iovs_l_[PADL_(const cloudabi64_iovec_t *)]; const cloudabi64_iovec_t * iovs; char iovs_r_[PADR_(const cloudabi64_iovec_t *)]; char iovs_len_l_[PADL_(size_t)]; size_t iovs_len; char iovs_len_r_[PADR_(size_t)]; char offset_l_[PADL_(cloudabi_filesize_t)]; cloudabi_filesize_t offset; char offset_r_[PADR_(cloudabi_filesize_t)]; }; struct cloudabi64_sys_fd_pwrite_args { char fd_l_[PADL_(cloudabi_fd_t)]; cloudabi_fd_t fd; char fd_r_[PADR_(cloudabi_fd_t)]; char iovs_l_[PADL_(const cloudabi64_ciovec_t *)]; const cloudabi64_ciovec_t * iovs; char iovs_r_[PADR_(const cloudabi64_ciovec_t *)]; char iovs_len_l_[PADL_(size_t)]; size_t iovs_len; char iovs_len_r_[PADR_(size_t)]; char offset_l_[PADL_(cloudabi_filesize_t)]; cloudabi_filesize_t offset; char offset_r_[PADR_(cloudabi_filesize_t)]; }; struct cloudabi64_sys_fd_read_args { char fd_l_[PADL_(cloudabi_fd_t)]; cloudabi_fd_t fd; char fd_r_[PADR_(cloudabi_fd_t)]; char iovs_l_[PADL_(const cloudabi64_iovec_t *)]; const cloudabi64_iovec_t * iovs; char iovs_r_[PADR_(const cloudabi64_iovec_t *)]; char iovs_len_l_[PADL_(size_t)]; size_t iovs_len; char iovs_len_r_[PADR_(size_t)]; }; struct cloudabi_sys_fd_replace_args { char from_l_[PADL_(cloudabi_fd_t)]; cloudabi_fd_t from; char from_r_[PADR_(cloudabi_fd_t)]; char to_l_[PADL_(cloudabi_fd_t)]; cloudabi_fd_t to; char to_r_[PADR_(cloudabi_fd_t)]; }; struct cloudabi_sys_fd_seek_args { char fd_l_[PADL_(cloudabi_fd_t)]; cloudabi_fd_t fd; char fd_r_[PADR_(cloudabi_fd_t)]; char offset_l_[PADL_(cloudabi_filedelta_t)]; cloudabi_filedelta_t offset; char offset_r_[PADR_(cloudabi_filedelta_t)]; char whence_l_[PADL_(cloudabi_whence_t)]; cloudabi_whence_t whence; char whence_r_[PADR_(cloudabi_whence_t)]; }; struct 
cloudabi_sys_fd_stat_get_args { char fd_l_[PADL_(cloudabi_fd_t)]; cloudabi_fd_t fd; char fd_r_[PADR_(cloudabi_fd_t)]; char buf_l_[PADL_(cloudabi_fdstat_t *)]; cloudabi_fdstat_t * buf; char buf_r_[PADR_(cloudabi_fdstat_t *)]; }; struct cloudabi_sys_fd_stat_put_args { char fd_l_[PADL_(cloudabi_fd_t)]; cloudabi_fd_t fd; char fd_r_[PADR_(cloudabi_fd_t)]; char buf_l_[PADL_(const cloudabi_fdstat_t *)]; const cloudabi_fdstat_t * buf; char buf_r_[PADR_(const cloudabi_fdstat_t *)]; char flags_l_[PADL_(cloudabi_fdsflags_t)]; cloudabi_fdsflags_t flags; char flags_r_[PADR_(cloudabi_fdsflags_t)]; }; struct cloudabi_sys_fd_sync_args { char fd_l_[PADL_(cloudabi_fd_t)]; cloudabi_fd_t fd; char fd_r_[PADR_(cloudabi_fd_t)]; }; struct cloudabi64_sys_fd_write_args { char fd_l_[PADL_(cloudabi_fd_t)]; cloudabi_fd_t fd; char fd_r_[PADR_(cloudabi_fd_t)]; char iovs_l_[PADL_(const cloudabi64_ciovec_t *)]; const cloudabi64_ciovec_t * iovs; char iovs_r_[PADR_(const cloudabi64_ciovec_t *)]; char iovs_len_l_[PADL_(size_t)]; size_t iovs_len; char iovs_len_r_[PADR_(size_t)]; }; struct cloudabi_sys_file_advise_args { char fd_l_[PADL_(cloudabi_fd_t)]; cloudabi_fd_t fd; char fd_r_[PADR_(cloudabi_fd_t)]; char offset_l_[PADL_(cloudabi_filesize_t)]; cloudabi_filesize_t offset; char offset_r_[PADR_(cloudabi_filesize_t)]; char len_l_[PADL_(cloudabi_filesize_t)]; cloudabi_filesize_t len; char len_r_[PADR_(cloudabi_filesize_t)]; char advice_l_[PADL_(cloudabi_advice_t)]; cloudabi_advice_t advice; char advice_r_[PADR_(cloudabi_advice_t)]; }; struct cloudabi_sys_file_allocate_args { char fd_l_[PADL_(cloudabi_fd_t)]; cloudabi_fd_t fd; char fd_r_[PADR_(cloudabi_fd_t)]; char offset_l_[PADL_(cloudabi_filesize_t)]; cloudabi_filesize_t offset; char offset_r_[PADR_(cloudabi_filesize_t)]; char len_l_[PADL_(cloudabi_filesize_t)]; cloudabi_filesize_t len; char len_r_[PADR_(cloudabi_filesize_t)]; }; struct cloudabi_sys_file_create_args { char fd_l_[PADL_(cloudabi_fd_t)]; cloudabi_fd_t fd; char fd_r_[PADR_(cloudabi_fd_t)]; char path_l_[PADL_(const char *)]; const char * path; char path_r_[PADR_(const char *)]; char path_len_l_[PADL_(size_t)]; size_t path_len; char path_len_r_[PADR_(size_t)]; char type_l_[PADL_(cloudabi_filetype_t)]; cloudabi_filetype_t type; char type_r_[PADR_(cloudabi_filetype_t)]; }; struct cloudabi_sys_file_link_args { char fd1_l_[PADL_(cloudabi_lookup_t)]; cloudabi_lookup_t fd1; char fd1_r_[PADR_(cloudabi_lookup_t)]; char path1_l_[PADL_(const char *)]; const char * path1; char path1_r_[PADR_(const char *)]; char path1_len_l_[PADL_(size_t)]; size_t path1_len; char path1_len_r_[PADR_(size_t)]; char fd2_l_[PADL_(cloudabi_fd_t)]; cloudabi_fd_t fd2; char fd2_r_[PADR_(cloudabi_fd_t)]; char path2_l_[PADL_(const char *)]; const char * path2; char path2_r_[PADR_(const char *)]; char path2_len_l_[PADL_(size_t)]; size_t path2_len; char path2_len_r_[PADR_(size_t)]; }; struct cloudabi_sys_file_open_args { char dirfd_l_[PADL_(cloudabi_lookup_t)]; cloudabi_lookup_t dirfd; char dirfd_r_[PADR_(cloudabi_lookup_t)]; char path_l_[PADL_(const char *)]; const char * path; char path_r_[PADR_(const char *)]; char path_len_l_[PADL_(size_t)]; size_t path_len; char path_len_r_[PADR_(size_t)]; char oflags_l_[PADL_(cloudabi_oflags_t)]; cloudabi_oflags_t oflags; char oflags_r_[PADR_(cloudabi_oflags_t)]; char fds_l_[PADL_(const cloudabi_fdstat_t *)]; const cloudabi_fdstat_t * fds; char fds_r_[PADR_(const cloudabi_fdstat_t *)]; }; struct cloudabi_sys_file_readdir_args { char fd_l_[PADL_(cloudabi_fd_t)]; cloudabi_fd_t fd; char fd_r_[PADR_(cloudabi_fd_t)]; 
char buf_l_[PADL_(void *)]; void * buf; char buf_r_[PADR_(void *)]; char buf_len_l_[PADL_(size_t)]; size_t buf_len; char buf_len_r_[PADR_(size_t)]; char cookie_l_[PADL_(cloudabi_dircookie_t)]; cloudabi_dircookie_t cookie; char cookie_r_[PADR_(cloudabi_dircookie_t)]; }; struct cloudabi_sys_file_readlink_args { char fd_l_[PADL_(cloudabi_fd_t)]; cloudabi_fd_t fd; char fd_r_[PADR_(cloudabi_fd_t)]; char path_l_[PADL_(const char *)]; const char * path; char path_r_[PADR_(const char *)]; char path_len_l_[PADL_(size_t)]; size_t path_len; char path_len_r_[PADR_(size_t)]; char buf_l_[PADL_(char *)]; char * buf; char buf_r_[PADR_(char *)]; char buf_len_l_[PADL_(size_t)]; size_t buf_len; char buf_len_r_[PADR_(size_t)]; }; struct cloudabi_sys_file_rename_args { char fd1_l_[PADL_(cloudabi_fd_t)]; cloudabi_fd_t fd1; char fd1_r_[PADR_(cloudabi_fd_t)]; char path1_l_[PADL_(const char *)]; const char * path1; char path1_r_[PADR_(const char *)]; char path1_len_l_[PADL_(size_t)]; size_t path1_len; char path1_len_r_[PADR_(size_t)]; char fd2_l_[PADL_(cloudabi_fd_t)]; cloudabi_fd_t fd2; char fd2_r_[PADR_(cloudabi_fd_t)]; char path2_l_[PADL_(const char *)]; const char * path2; char path2_r_[PADR_(const char *)]; char path2_len_l_[PADL_(size_t)]; size_t path2_len; char path2_len_r_[PADR_(size_t)]; }; struct cloudabi_sys_file_stat_fget_args { char fd_l_[PADL_(cloudabi_fd_t)]; cloudabi_fd_t fd; char fd_r_[PADR_(cloudabi_fd_t)]; char buf_l_[PADL_(cloudabi_filestat_t *)]; cloudabi_filestat_t * buf; char buf_r_[PADR_(cloudabi_filestat_t *)]; }; struct cloudabi_sys_file_stat_fput_args { char fd_l_[PADL_(cloudabi_fd_t)]; cloudabi_fd_t fd; char fd_r_[PADR_(cloudabi_fd_t)]; char buf_l_[PADL_(const cloudabi_filestat_t *)]; const cloudabi_filestat_t * buf; char buf_r_[PADR_(const cloudabi_filestat_t *)]; char flags_l_[PADL_(cloudabi_fsflags_t)]; cloudabi_fsflags_t flags; char flags_r_[PADR_(cloudabi_fsflags_t)]; }; struct cloudabi_sys_file_stat_get_args { char fd_l_[PADL_(cloudabi_lookup_t)]; cloudabi_lookup_t fd; char fd_r_[PADR_(cloudabi_lookup_t)]; char path_l_[PADL_(const char *)]; const char * path; char path_r_[PADR_(const char *)]; char path_len_l_[PADL_(size_t)]; size_t path_len; char path_len_r_[PADR_(size_t)]; char buf_l_[PADL_(cloudabi_filestat_t *)]; cloudabi_filestat_t * buf; char buf_r_[PADR_(cloudabi_filestat_t *)]; }; struct cloudabi_sys_file_stat_put_args { char fd_l_[PADL_(cloudabi_lookup_t)]; cloudabi_lookup_t fd; char fd_r_[PADR_(cloudabi_lookup_t)]; char path_l_[PADL_(const char *)]; const char * path; char path_r_[PADR_(const char *)]; char path_len_l_[PADL_(size_t)]; size_t path_len; char path_len_r_[PADR_(size_t)]; char buf_l_[PADL_(const cloudabi_filestat_t *)]; const cloudabi_filestat_t * buf; char buf_r_[PADR_(const cloudabi_filestat_t *)]; char flags_l_[PADL_(cloudabi_fsflags_t)]; cloudabi_fsflags_t flags; char flags_r_[PADR_(cloudabi_fsflags_t)]; }; struct cloudabi_sys_file_symlink_args { char path1_l_[PADL_(const char *)]; const char * path1; char path1_r_[PADR_(const char *)]; char path1_len_l_[PADL_(size_t)]; size_t path1_len; char path1_len_r_[PADR_(size_t)]; char fd_l_[PADL_(cloudabi_fd_t)]; cloudabi_fd_t fd; char fd_r_[PADR_(cloudabi_fd_t)]; char path2_l_[PADL_(const char *)]; const char * path2; char path2_r_[PADR_(const char *)]; char path2_len_l_[PADL_(size_t)]; size_t path2_len; char path2_len_r_[PADR_(size_t)]; }; struct cloudabi_sys_file_unlink_args { char fd_l_[PADL_(cloudabi_fd_t)]; cloudabi_fd_t fd; char fd_r_[PADR_(cloudabi_fd_t)]; char path_l_[PADL_(const char *)]; const char * 
path; char path_r_[PADR_(const char *)]; char path_len_l_[PADL_(size_t)]; size_t path_len; char path_len_r_[PADR_(size_t)]; char flags_l_[PADL_(cloudabi_ulflags_t)]; cloudabi_ulflags_t flags; char flags_r_[PADR_(cloudabi_ulflags_t)]; }; struct cloudabi_sys_lock_unlock_args { char lock_l_[PADL_(cloudabi_lock_t *)]; cloudabi_lock_t * lock; char lock_r_[PADR_(cloudabi_lock_t *)]; char scope_l_[PADL_(cloudabi_scope_t)]; cloudabi_scope_t scope; char scope_r_[PADR_(cloudabi_scope_t)]; }; struct cloudabi_sys_mem_advise_args { char mapping_l_[PADL_(void *)]; void * mapping; char mapping_r_[PADR_(void *)]; char mapping_len_l_[PADL_(size_t)]; size_t mapping_len; char mapping_len_r_[PADR_(size_t)]; char advice_l_[PADL_(cloudabi_advice_t)]; cloudabi_advice_t advice; char advice_r_[PADR_(cloudabi_advice_t)]; }; struct cloudabi_sys_mem_map_args { char addr_l_[PADL_(void *)]; void * addr; char addr_r_[PADR_(void *)]; char len_l_[PADL_(size_t)]; size_t len; char len_r_[PADR_(size_t)]; char prot_l_[PADL_(cloudabi_mprot_t)]; cloudabi_mprot_t prot; char prot_r_[PADR_(cloudabi_mprot_t)]; char flags_l_[PADL_(cloudabi_mflags_t)]; cloudabi_mflags_t flags; char flags_r_[PADR_(cloudabi_mflags_t)]; char fd_l_[PADL_(cloudabi_fd_t)]; cloudabi_fd_t fd; char fd_r_[PADR_(cloudabi_fd_t)]; char off_l_[PADL_(cloudabi_filesize_t)]; cloudabi_filesize_t off; char off_r_[PADR_(cloudabi_filesize_t)]; }; struct cloudabi_sys_mem_protect_args { char mapping_l_[PADL_(void *)]; void * mapping; char mapping_r_[PADR_(void *)]; char mapping_len_l_[PADL_(size_t)]; size_t mapping_len; char mapping_len_r_[PADR_(size_t)]; char prot_l_[PADL_(cloudabi_mprot_t)]; cloudabi_mprot_t prot; char prot_r_[PADR_(cloudabi_mprot_t)]; }; struct cloudabi_sys_mem_sync_args { char mapping_l_[PADL_(void *)]; void * mapping; char mapping_r_[PADR_(void *)]; char mapping_len_l_[PADL_(size_t)]; size_t mapping_len; char mapping_len_r_[PADR_(size_t)]; char flags_l_[PADL_(cloudabi_msflags_t)]; cloudabi_msflags_t flags; char flags_r_[PADR_(cloudabi_msflags_t)]; }; struct cloudabi_sys_mem_unmap_args { char mapping_l_[PADL_(void *)]; void * mapping; char mapping_r_[PADR_(void *)]; char mapping_len_l_[PADL_(size_t)]; size_t mapping_len; char mapping_len_r_[PADR_(size_t)]; }; struct cloudabi64_sys_poll_args { char in_l_[PADL_(const cloudabi64_subscription_t *)]; const cloudabi64_subscription_t * in; char in_r_[PADR_(const cloudabi64_subscription_t *)]; char out_l_[PADL_(cloudabi64_event_t *)]; cloudabi64_event_t * out; char out_r_[PADR_(cloudabi64_event_t *)]; char nsubscriptions_l_[PADL_(size_t)]; size_t nsubscriptions; char nsubscriptions_r_[PADR_(size_t)]; }; struct cloudabi64_sys_poll_fd_args { char fd_l_[PADL_(cloudabi_fd_t)]; cloudabi_fd_t fd; char fd_r_[PADR_(cloudabi_fd_t)]; char in_l_[PADL_(const cloudabi64_subscription_t *)]; const cloudabi64_subscription_t * in; char in_r_[PADR_(const cloudabi64_subscription_t *)]; char in_len_l_[PADL_(size_t)]; size_t in_len; char in_len_r_[PADR_(size_t)]; char out_l_[PADL_(cloudabi64_event_t *)]; cloudabi64_event_t * out; char out_r_[PADR_(cloudabi64_event_t *)]; char out_len_l_[PADL_(size_t)]; size_t out_len; char out_len_r_[PADR_(size_t)]; char timeout_l_[PADL_(const cloudabi64_subscription_t *)]; const cloudabi64_subscription_t * timeout; char timeout_r_[PADR_(const cloudabi64_subscription_t *)]; }; struct cloudabi_sys_proc_exec_args { char fd_l_[PADL_(cloudabi_fd_t)]; cloudabi_fd_t fd; char fd_r_[PADR_(cloudabi_fd_t)]; char data_l_[PADL_(const void *)]; const void * data; char data_r_[PADR_(const void *)]; char 
data_len_l_[PADL_(size_t)]; size_t data_len; char data_len_r_[PADR_(size_t)]; char fds_l_[PADL_(const cloudabi_fd_t *)]; const cloudabi_fd_t * fds; char fds_r_[PADR_(const cloudabi_fd_t *)]; char fds_len_l_[PADL_(size_t)]; size_t fds_len; char fds_len_r_[PADR_(size_t)]; }; struct cloudabi_sys_proc_exit_args { char rval_l_[PADL_(cloudabi_exitcode_t)]; cloudabi_exitcode_t rval; char rval_r_[PADR_(cloudabi_exitcode_t)]; }; struct cloudabi_sys_proc_fork_args { register_t dummy; }; struct cloudabi_sys_proc_raise_args { char sig_l_[PADL_(cloudabi_signal_t)]; cloudabi_signal_t sig; char sig_r_[PADR_(cloudabi_signal_t)]; }; struct cloudabi_sys_random_get_args { char buf_l_[PADL_(void *)]; void * buf; char buf_r_[PADR_(void *)]; char buf_len_l_[PADL_(size_t)]; size_t buf_len; char buf_len_r_[PADR_(size_t)]; }; struct cloudabi_sys_sock_accept_args { char sock_l_[PADL_(cloudabi_fd_t)]; cloudabi_fd_t sock; char sock_r_[PADR_(cloudabi_fd_t)]; char unused_l_[PADL_(void *)]; void * unused; char unused_r_[PADR_(void *)]; }; -struct cloudabi_sys_sock_bind_args { - char sock_l_[PADL_(cloudabi_fd_t)]; cloudabi_fd_t sock; char sock_r_[PADR_(cloudabi_fd_t)]; - char fd_l_[PADL_(cloudabi_fd_t)]; cloudabi_fd_t fd; char fd_r_[PADR_(cloudabi_fd_t)]; - char path_l_[PADL_(const char *)]; const char * path; char path_r_[PADR_(const char *)]; - char path_len_l_[PADL_(size_t)]; size_t path_len; char path_len_r_[PADR_(size_t)]; -}; -struct cloudabi_sys_sock_connect_args { - char sock_l_[PADL_(cloudabi_fd_t)]; cloudabi_fd_t sock; char sock_r_[PADR_(cloudabi_fd_t)]; - char fd_l_[PADL_(cloudabi_fd_t)]; cloudabi_fd_t fd; char fd_r_[PADR_(cloudabi_fd_t)]; - char path_l_[PADL_(const char *)]; const char * path; char path_r_[PADR_(const char *)]; - char path_len_l_[PADL_(size_t)]; size_t path_len; char path_len_r_[PADR_(size_t)]; -}; -struct cloudabi_sys_sock_listen_args { - char sock_l_[PADL_(cloudabi_fd_t)]; cloudabi_fd_t sock; char sock_r_[PADR_(cloudabi_fd_t)]; - char backlog_l_[PADL_(cloudabi_backlog_t)]; cloudabi_backlog_t backlog; char backlog_r_[PADR_(cloudabi_backlog_t)]; -}; struct cloudabi64_sys_sock_recv_args { char sock_l_[PADL_(cloudabi_fd_t)]; cloudabi_fd_t sock; char sock_r_[PADR_(cloudabi_fd_t)]; char in_l_[PADL_(const cloudabi64_recv_in_t *)]; const cloudabi64_recv_in_t * in; char in_r_[PADR_(const cloudabi64_recv_in_t *)]; char out_l_[PADL_(cloudabi64_recv_out_t *)]; cloudabi64_recv_out_t * out; char out_r_[PADR_(cloudabi64_recv_out_t *)]; }; struct cloudabi64_sys_sock_send_args { char sock_l_[PADL_(cloudabi_fd_t)]; cloudabi_fd_t sock; char sock_r_[PADR_(cloudabi_fd_t)]; char in_l_[PADL_(const cloudabi64_send_in_t *)]; const cloudabi64_send_in_t * in; char in_r_[PADR_(const cloudabi64_send_in_t *)]; char out_l_[PADL_(cloudabi64_send_out_t *)]; cloudabi64_send_out_t * out; char out_r_[PADR_(cloudabi64_send_out_t *)]; }; struct cloudabi_sys_sock_shutdown_args { char sock_l_[PADL_(cloudabi_fd_t)]; cloudabi_fd_t sock; char sock_r_[PADR_(cloudabi_fd_t)]; char how_l_[PADL_(cloudabi_sdflags_t)]; cloudabi_sdflags_t how; char how_r_[PADR_(cloudabi_sdflags_t)]; }; struct cloudabi_sys_sock_stat_get_args { char sock_l_[PADL_(cloudabi_fd_t)]; cloudabi_fd_t sock; char sock_r_[PADR_(cloudabi_fd_t)]; char buf_l_[PADL_(cloudabi_sockstat_t *)]; cloudabi_sockstat_t * buf; char buf_r_[PADR_(cloudabi_sockstat_t *)]; char flags_l_[PADL_(cloudabi_ssflags_t)]; cloudabi_ssflags_t flags; char flags_r_[PADR_(cloudabi_ssflags_t)]; }; struct cloudabi64_sys_thread_create_args { char attr_l_[PADL_(cloudabi64_threadattr_t *)]; 
cloudabi64_threadattr_t * attr; char attr_r_[PADR_(cloudabi64_threadattr_t *)]; }; struct cloudabi_sys_thread_exit_args { char lock_l_[PADL_(cloudabi_lock_t *)]; cloudabi_lock_t * lock; char lock_r_[PADR_(cloudabi_lock_t *)]; char scope_l_[PADL_(cloudabi_scope_t)]; cloudabi_scope_t scope; char scope_r_[PADR_(cloudabi_scope_t)]; }; struct cloudabi_sys_thread_yield_args { register_t dummy; }; int cloudabi_sys_clock_res_get(struct thread *, struct cloudabi_sys_clock_res_get_args *); int cloudabi_sys_clock_time_get(struct thread *, struct cloudabi_sys_clock_time_get_args *); int cloudabi_sys_condvar_signal(struct thread *, struct cloudabi_sys_condvar_signal_args *); int cloudabi_sys_fd_close(struct thread *, struct cloudabi_sys_fd_close_args *); int cloudabi_sys_fd_create1(struct thread *, struct cloudabi_sys_fd_create1_args *); int cloudabi_sys_fd_create2(struct thread *, struct cloudabi_sys_fd_create2_args *); int cloudabi_sys_fd_datasync(struct thread *, struct cloudabi_sys_fd_datasync_args *); int cloudabi_sys_fd_dup(struct thread *, struct cloudabi_sys_fd_dup_args *); int cloudabi64_sys_fd_pread(struct thread *, struct cloudabi64_sys_fd_pread_args *); int cloudabi64_sys_fd_pwrite(struct thread *, struct cloudabi64_sys_fd_pwrite_args *); int cloudabi64_sys_fd_read(struct thread *, struct cloudabi64_sys_fd_read_args *); int cloudabi_sys_fd_replace(struct thread *, struct cloudabi_sys_fd_replace_args *); int cloudabi_sys_fd_seek(struct thread *, struct cloudabi_sys_fd_seek_args *); int cloudabi_sys_fd_stat_get(struct thread *, struct cloudabi_sys_fd_stat_get_args *); int cloudabi_sys_fd_stat_put(struct thread *, struct cloudabi_sys_fd_stat_put_args *); int cloudabi_sys_fd_sync(struct thread *, struct cloudabi_sys_fd_sync_args *); int cloudabi64_sys_fd_write(struct thread *, struct cloudabi64_sys_fd_write_args *); int cloudabi_sys_file_advise(struct thread *, struct cloudabi_sys_file_advise_args *); int cloudabi_sys_file_allocate(struct thread *, struct cloudabi_sys_file_allocate_args *); int cloudabi_sys_file_create(struct thread *, struct cloudabi_sys_file_create_args *); int cloudabi_sys_file_link(struct thread *, struct cloudabi_sys_file_link_args *); int cloudabi_sys_file_open(struct thread *, struct cloudabi_sys_file_open_args *); int cloudabi_sys_file_readdir(struct thread *, struct cloudabi_sys_file_readdir_args *); int cloudabi_sys_file_readlink(struct thread *, struct cloudabi_sys_file_readlink_args *); int cloudabi_sys_file_rename(struct thread *, struct cloudabi_sys_file_rename_args *); int cloudabi_sys_file_stat_fget(struct thread *, struct cloudabi_sys_file_stat_fget_args *); int cloudabi_sys_file_stat_fput(struct thread *, struct cloudabi_sys_file_stat_fput_args *); int cloudabi_sys_file_stat_get(struct thread *, struct cloudabi_sys_file_stat_get_args *); int cloudabi_sys_file_stat_put(struct thread *, struct cloudabi_sys_file_stat_put_args *); int cloudabi_sys_file_symlink(struct thread *, struct cloudabi_sys_file_symlink_args *); int cloudabi_sys_file_unlink(struct thread *, struct cloudabi_sys_file_unlink_args *); int cloudabi_sys_lock_unlock(struct thread *, struct cloudabi_sys_lock_unlock_args *); int cloudabi_sys_mem_advise(struct thread *, struct cloudabi_sys_mem_advise_args *); int cloudabi_sys_mem_map(struct thread *, struct cloudabi_sys_mem_map_args *); int cloudabi_sys_mem_protect(struct thread *, struct cloudabi_sys_mem_protect_args *); int cloudabi_sys_mem_sync(struct thread *, struct cloudabi_sys_mem_sync_args *); int cloudabi_sys_mem_unmap(struct thread *, struct 
cloudabi_sys_mem_unmap_args *); int cloudabi64_sys_poll(struct thread *, struct cloudabi64_sys_poll_args *); int cloudabi64_sys_poll_fd(struct thread *, struct cloudabi64_sys_poll_fd_args *); int cloudabi_sys_proc_exec(struct thread *, struct cloudabi_sys_proc_exec_args *); int cloudabi_sys_proc_exit(struct thread *, struct cloudabi_sys_proc_exit_args *); int cloudabi_sys_proc_fork(struct thread *, struct cloudabi_sys_proc_fork_args *); int cloudabi_sys_proc_raise(struct thread *, struct cloudabi_sys_proc_raise_args *); int cloudabi_sys_random_get(struct thread *, struct cloudabi_sys_random_get_args *); int cloudabi_sys_sock_accept(struct thread *, struct cloudabi_sys_sock_accept_args *); -int cloudabi_sys_sock_bind(struct thread *, struct cloudabi_sys_sock_bind_args *); -int cloudabi_sys_sock_connect(struct thread *, struct cloudabi_sys_sock_connect_args *); -int cloudabi_sys_sock_listen(struct thread *, struct cloudabi_sys_sock_listen_args *); int cloudabi64_sys_sock_recv(struct thread *, struct cloudabi64_sys_sock_recv_args *); int cloudabi64_sys_sock_send(struct thread *, struct cloudabi64_sys_sock_send_args *); int cloudabi_sys_sock_shutdown(struct thread *, struct cloudabi_sys_sock_shutdown_args *); int cloudabi_sys_sock_stat_get(struct thread *, struct cloudabi_sys_sock_stat_get_args *); int cloudabi64_sys_thread_create(struct thread *, struct cloudabi64_sys_thread_create_args *); int cloudabi_sys_thread_exit(struct thread *, struct cloudabi_sys_thread_exit_args *); int cloudabi_sys_thread_yield(struct thread *, struct cloudabi_sys_thread_yield_args *); #ifdef COMPAT_43 #endif /* COMPAT_43 */ #ifdef COMPAT_FREEBSD4 #endif /* COMPAT_FREEBSD4 */ #ifdef COMPAT_FREEBSD6 #endif /* COMPAT_FREEBSD6 */ #ifdef COMPAT_FREEBSD7 #endif /* COMPAT_FREEBSD7 */ #ifdef COMPAT_FREEBSD10 #endif /* COMPAT_FREEBSD10 */ #ifdef COMPAT_FREEBSD11 #endif /* COMPAT_FREEBSD11 */ #define CLOUDABI64_SYS_AUE_cloudabi_sys_clock_res_get AUE_NULL #define CLOUDABI64_SYS_AUE_cloudabi_sys_clock_time_get AUE_NULL #define CLOUDABI64_SYS_AUE_cloudabi_sys_condvar_signal AUE_NULL #define CLOUDABI64_SYS_AUE_cloudabi_sys_fd_close AUE_NULL #define CLOUDABI64_SYS_AUE_cloudabi_sys_fd_create1 AUE_NULL #define CLOUDABI64_SYS_AUE_cloudabi_sys_fd_create2 AUE_NULL #define CLOUDABI64_SYS_AUE_cloudabi_sys_fd_datasync AUE_NULL #define CLOUDABI64_SYS_AUE_cloudabi_sys_fd_dup AUE_NULL #define CLOUDABI64_SYS_AUE_cloudabi64_sys_fd_pread AUE_NULL #define CLOUDABI64_SYS_AUE_cloudabi64_sys_fd_pwrite AUE_NULL #define CLOUDABI64_SYS_AUE_cloudabi64_sys_fd_read AUE_NULL #define CLOUDABI64_SYS_AUE_cloudabi_sys_fd_replace AUE_NULL #define CLOUDABI64_SYS_AUE_cloudabi_sys_fd_seek AUE_NULL #define CLOUDABI64_SYS_AUE_cloudabi_sys_fd_stat_get AUE_NULL #define CLOUDABI64_SYS_AUE_cloudabi_sys_fd_stat_put AUE_NULL #define CLOUDABI64_SYS_AUE_cloudabi_sys_fd_sync AUE_NULL #define CLOUDABI64_SYS_AUE_cloudabi64_sys_fd_write AUE_NULL #define CLOUDABI64_SYS_AUE_cloudabi_sys_file_advise AUE_NULL #define CLOUDABI64_SYS_AUE_cloudabi_sys_file_allocate AUE_NULL #define CLOUDABI64_SYS_AUE_cloudabi_sys_file_create AUE_NULL #define CLOUDABI64_SYS_AUE_cloudabi_sys_file_link AUE_NULL #define CLOUDABI64_SYS_AUE_cloudabi_sys_file_open AUE_NULL #define CLOUDABI64_SYS_AUE_cloudabi_sys_file_readdir AUE_NULL #define CLOUDABI64_SYS_AUE_cloudabi_sys_file_readlink AUE_NULL #define CLOUDABI64_SYS_AUE_cloudabi_sys_file_rename AUE_NULL #define CLOUDABI64_SYS_AUE_cloudabi_sys_file_stat_fget AUE_NULL #define CLOUDABI64_SYS_AUE_cloudabi_sys_file_stat_fput AUE_NULL #define 
CLOUDABI64_SYS_AUE_cloudabi_sys_file_stat_get AUE_NULL #define CLOUDABI64_SYS_AUE_cloudabi_sys_file_stat_put AUE_NULL #define CLOUDABI64_SYS_AUE_cloudabi_sys_file_symlink AUE_NULL #define CLOUDABI64_SYS_AUE_cloudabi_sys_file_unlink AUE_NULL #define CLOUDABI64_SYS_AUE_cloudabi_sys_lock_unlock AUE_NULL #define CLOUDABI64_SYS_AUE_cloudabi_sys_mem_advise AUE_NULL #define CLOUDABI64_SYS_AUE_cloudabi_sys_mem_map AUE_NULL #define CLOUDABI64_SYS_AUE_cloudabi_sys_mem_protect AUE_NULL #define CLOUDABI64_SYS_AUE_cloudabi_sys_mem_sync AUE_NULL #define CLOUDABI64_SYS_AUE_cloudabi_sys_mem_unmap AUE_NULL #define CLOUDABI64_SYS_AUE_cloudabi64_sys_poll AUE_NULL #define CLOUDABI64_SYS_AUE_cloudabi64_sys_poll_fd AUE_NULL #define CLOUDABI64_SYS_AUE_cloudabi_sys_proc_exec AUE_NULL #define CLOUDABI64_SYS_AUE_cloudabi_sys_proc_exit AUE_NULL #define CLOUDABI64_SYS_AUE_cloudabi_sys_proc_fork AUE_NULL #define CLOUDABI64_SYS_AUE_cloudabi_sys_proc_raise AUE_NULL #define CLOUDABI64_SYS_AUE_cloudabi_sys_random_get AUE_NULL #define CLOUDABI64_SYS_AUE_cloudabi_sys_sock_accept AUE_NULL -#define CLOUDABI64_SYS_AUE_cloudabi_sys_sock_bind AUE_NULL -#define CLOUDABI64_SYS_AUE_cloudabi_sys_sock_connect AUE_NULL -#define CLOUDABI64_SYS_AUE_cloudabi_sys_sock_listen AUE_NULL #define CLOUDABI64_SYS_AUE_cloudabi64_sys_sock_recv AUE_NULL #define CLOUDABI64_SYS_AUE_cloudabi64_sys_sock_send AUE_NULL #define CLOUDABI64_SYS_AUE_cloudabi_sys_sock_shutdown AUE_NULL #define CLOUDABI64_SYS_AUE_cloudabi_sys_sock_stat_get AUE_NULL #define CLOUDABI64_SYS_AUE_cloudabi64_sys_thread_create AUE_NULL #define CLOUDABI64_SYS_AUE_cloudabi_sys_thread_exit AUE_NULL #define CLOUDABI64_SYS_AUE_cloudabi_sys_thread_yield AUE_NULL #undef PAD_ #undef PADL_ #undef PADR_ #endif /* !_CLOUDABI64_SYSPROTO_H_ */ Index: projects/runtime-coverage/sys/compat/cloudabi64/cloudabi64_syscall.h =================================================================== --- projects/runtime-coverage/sys/compat/cloudabi64/cloudabi64_syscall.h (revision 322921) +++ projects/runtime-coverage/sys/compat/cloudabi64/cloudabi64_syscall.h (revision 322922) @@ -1,63 +1,60 @@ /* * System call numbers. * * DO NOT EDIT-- this file is automatically generated. 
 * $FreeBSD$
 */
#define CLOUDABI64_SYS_cloudabi_sys_clock_res_get 0
#define CLOUDABI64_SYS_cloudabi_sys_clock_time_get 1
#define CLOUDABI64_SYS_cloudabi_sys_condvar_signal 2
#define CLOUDABI64_SYS_cloudabi_sys_fd_close 3
#define CLOUDABI64_SYS_cloudabi_sys_fd_create1 4
#define CLOUDABI64_SYS_cloudabi_sys_fd_create2 5
#define CLOUDABI64_SYS_cloudabi_sys_fd_datasync 6
#define CLOUDABI64_SYS_cloudabi_sys_fd_dup 7
#define CLOUDABI64_SYS_cloudabi64_sys_fd_pread 8
#define CLOUDABI64_SYS_cloudabi64_sys_fd_pwrite 9
#define CLOUDABI64_SYS_cloudabi64_sys_fd_read 10
#define CLOUDABI64_SYS_cloudabi_sys_fd_replace 11
#define CLOUDABI64_SYS_cloudabi_sys_fd_seek 12
#define CLOUDABI64_SYS_cloudabi_sys_fd_stat_get 13
#define CLOUDABI64_SYS_cloudabi_sys_fd_stat_put 14
#define CLOUDABI64_SYS_cloudabi_sys_fd_sync 15
#define CLOUDABI64_SYS_cloudabi64_sys_fd_write 16
#define CLOUDABI64_SYS_cloudabi_sys_file_advise 17
#define CLOUDABI64_SYS_cloudabi_sys_file_allocate 18
#define CLOUDABI64_SYS_cloudabi_sys_file_create 19
#define CLOUDABI64_SYS_cloudabi_sys_file_link 20
#define CLOUDABI64_SYS_cloudabi_sys_file_open 21
#define CLOUDABI64_SYS_cloudabi_sys_file_readdir 22
#define CLOUDABI64_SYS_cloudabi_sys_file_readlink 23
#define CLOUDABI64_SYS_cloudabi_sys_file_rename 24
#define CLOUDABI64_SYS_cloudabi_sys_file_stat_fget 25
#define CLOUDABI64_SYS_cloudabi_sys_file_stat_fput 26
#define CLOUDABI64_SYS_cloudabi_sys_file_stat_get 27
#define CLOUDABI64_SYS_cloudabi_sys_file_stat_put 28
#define CLOUDABI64_SYS_cloudabi_sys_file_symlink 29
#define CLOUDABI64_SYS_cloudabi_sys_file_unlink 30
#define CLOUDABI64_SYS_cloudabi_sys_lock_unlock 31
#define CLOUDABI64_SYS_cloudabi_sys_mem_advise 32
#define CLOUDABI64_SYS_cloudabi_sys_mem_map 33
#define CLOUDABI64_SYS_cloudabi_sys_mem_protect 34
#define CLOUDABI64_SYS_cloudabi_sys_mem_sync 35
#define CLOUDABI64_SYS_cloudabi_sys_mem_unmap 36
#define CLOUDABI64_SYS_cloudabi64_sys_poll 37
#define CLOUDABI64_SYS_cloudabi64_sys_poll_fd 38
#define CLOUDABI64_SYS_cloudabi_sys_proc_exec 39
#define CLOUDABI64_SYS_cloudabi_sys_proc_exit 40
#define CLOUDABI64_SYS_cloudabi_sys_proc_fork 41
#define CLOUDABI64_SYS_cloudabi_sys_proc_raise 42
#define CLOUDABI64_SYS_cloudabi_sys_random_get 43
#define CLOUDABI64_SYS_cloudabi_sys_sock_accept 44
-#define CLOUDABI64_SYS_cloudabi_sys_sock_bind 45
-#define CLOUDABI64_SYS_cloudabi_sys_sock_connect 46
-#define CLOUDABI64_SYS_cloudabi_sys_sock_listen 47
-#define CLOUDABI64_SYS_cloudabi64_sys_sock_recv 48
-#define CLOUDABI64_SYS_cloudabi64_sys_sock_send 49
-#define CLOUDABI64_SYS_cloudabi_sys_sock_shutdown 50
-#define CLOUDABI64_SYS_cloudabi_sys_sock_stat_get 51
-#define CLOUDABI64_SYS_cloudabi64_sys_thread_create 52
-#define CLOUDABI64_SYS_cloudabi_sys_thread_exit 53
-#define CLOUDABI64_SYS_cloudabi_sys_thread_yield 54
-#define CLOUDABI64_SYS_MAXSYSCALL 55
+#define CLOUDABI64_SYS_cloudabi64_sys_sock_recv 45
+#define CLOUDABI64_SYS_cloudabi64_sys_sock_send 46
+#define CLOUDABI64_SYS_cloudabi_sys_sock_shutdown 47
+#define CLOUDABI64_SYS_cloudabi_sys_sock_stat_get 48
+#define CLOUDABI64_SYS_cloudabi64_sys_thread_create 49
+#define CLOUDABI64_SYS_cloudabi_sys_thread_exit 50
+#define CLOUDABI64_SYS_cloudabi_sys_thread_yield 51
+#define CLOUDABI64_SYS_MAXSYSCALL 52
Index: projects/runtime-coverage/sys/compat/cloudabi64/cloudabi64_syscall.h
===================================================================
--- projects/runtime-coverage/sys/compat/cloudabi64/cloudabi64_syscalls.c (revision 322921)
+++
projects/runtime-coverage/sys/compat/cloudabi64/cloudabi64_syscalls.c (revision 322922) @@ -1,64 +1,61 @@ /* * System call names. * * DO NOT EDIT-- this file is automatically generated. * $FreeBSD$ */ const char *cloudabi64_syscallnames[] = { "cloudabi_sys_clock_res_get", /* 0 = cloudabi_sys_clock_res_get */ "cloudabi_sys_clock_time_get", /* 1 = cloudabi_sys_clock_time_get */ "cloudabi_sys_condvar_signal", /* 2 = cloudabi_sys_condvar_signal */ "cloudabi_sys_fd_close", /* 3 = cloudabi_sys_fd_close */ "cloudabi_sys_fd_create1", /* 4 = cloudabi_sys_fd_create1 */ "cloudabi_sys_fd_create2", /* 5 = cloudabi_sys_fd_create2 */ "cloudabi_sys_fd_datasync", /* 6 = cloudabi_sys_fd_datasync */ "cloudabi_sys_fd_dup", /* 7 = cloudabi_sys_fd_dup */ "cloudabi64_sys_fd_pread", /* 8 = cloudabi64_sys_fd_pread */ "cloudabi64_sys_fd_pwrite", /* 9 = cloudabi64_sys_fd_pwrite */ "cloudabi64_sys_fd_read", /* 10 = cloudabi64_sys_fd_read */ "cloudabi_sys_fd_replace", /* 11 = cloudabi_sys_fd_replace */ "cloudabi_sys_fd_seek", /* 12 = cloudabi_sys_fd_seek */ "cloudabi_sys_fd_stat_get", /* 13 = cloudabi_sys_fd_stat_get */ "cloudabi_sys_fd_stat_put", /* 14 = cloudabi_sys_fd_stat_put */ "cloudabi_sys_fd_sync", /* 15 = cloudabi_sys_fd_sync */ "cloudabi64_sys_fd_write", /* 16 = cloudabi64_sys_fd_write */ "cloudabi_sys_file_advise", /* 17 = cloudabi_sys_file_advise */ "cloudabi_sys_file_allocate", /* 18 = cloudabi_sys_file_allocate */ "cloudabi_sys_file_create", /* 19 = cloudabi_sys_file_create */ "cloudabi_sys_file_link", /* 20 = cloudabi_sys_file_link */ "cloudabi_sys_file_open", /* 21 = cloudabi_sys_file_open */ "cloudabi_sys_file_readdir", /* 22 = cloudabi_sys_file_readdir */ "cloudabi_sys_file_readlink", /* 23 = cloudabi_sys_file_readlink */ "cloudabi_sys_file_rename", /* 24 = cloudabi_sys_file_rename */ "cloudabi_sys_file_stat_fget", /* 25 = cloudabi_sys_file_stat_fget */ "cloudabi_sys_file_stat_fput", /* 26 = cloudabi_sys_file_stat_fput */ "cloudabi_sys_file_stat_get", /* 27 = cloudabi_sys_file_stat_get */ "cloudabi_sys_file_stat_put", /* 28 = cloudabi_sys_file_stat_put */ "cloudabi_sys_file_symlink", /* 29 = cloudabi_sys_file_symlink */ "cloudabi_sys_file_unlink", /* 30 = cloudabi_sys_file_unlink */ "cloudabi_sys_lock_unlock", /* 31 = cloudabi_sys_lock_unlock */ "cloudabi_sys_mem_advise", /* 32 = cloudabi_sys_mem_advise */ "cloudabi_sys_mem_map", /* 33 = cloudabi_sys_mem_map */ "cloudabi_sys_mem_protect", /* 34 = cloudabi_sys_mem_protect */ "cloudabi_sys_mem_sync", /* 35 = cloudabi_sys_mem_sync */ "cloudabi_sys_mem_unmap", /* 36 = cloudabi_sys_mem_unmap */ "cloudabi64_sys_poll", /* 37 = cloudabi64_sys_poll */ "cloudabi64_sys_poll_fd", /* 38 = cloudabi64_sys_poll_fd */ "cloudabi_sys_proc_exec", /* 39 = cloudabi_sys_proc_exec */ "cloudabi_sys_proc_exit", /* 40 = cloudabi_sys_proc_exit */ "cloudabi_sys_proc_fork", /* 41 = cloudabi_sys_proc_fork */ "cloudabi_sys_proc_raise", /* 42 = cloudabi_sys_proc_raise */ "cloudabi_sys_random_get", /* 43 = cloudabi_sys_random_get */ "cloudabi_sys_sock_accept", /* 44 = cloudabi_sys_sock_accept */ - "cloudabi_sys_sock_bind", /* 45 = cloudabi_sys_sock_bind */ - "cloudabi_sys_sock_connect", /* 46 = cloudabi_sys_sock_connect */ - "cloudabi_sys_sock_listen", /* 47 = cloudabi_sys_sock_listen */ - "cloudabi64_sys_sock_recv", /* 48 = cloudabi64_sys_sock_recv */ - "cloudabi64_sys_sock_send", /* 49 = cloudabi64_sys_sock_send */ - "cloudabi_sys_sock_shutdown", /* 50 = cloudabi_sys_sock_shutdown */ - "cloudabi_sys_sock_stat_get", /* 51 = cloudabi_sys_sock_stat_get */ - 
"cloudabi64_sys_thread_create", /* 52 = cloudabi64_sys_thread_create */ - "cloudabi_sys_thread_exit", /* 53 = cloudabi_sys_thread_exit */ - "cloudabi_sys_thread_yield", /* 54 = cloudabi_sys_thread_yield */ + "cloudabi64_sys_sock_recv", /* 45 = cloudabi64_sys_sock_recv */ + "cloudabi64_sys_sock_send", /* 46 = cloudabi64_sys_sock_send */ + "cloudabi_sys_sock_shutdown", /* 47 = cloudabi_sys_sock_shutdown */ + "cloudabi_sys_sock_stat_get", /* 48 = cloudabi_sys_sock_stat_get */ + "cloudabi64_sys_thread_create", /* 49 = cloudabi64_sys_thread_create */ + "cloudabi_sys_thread_exit", /* 50 = cloudabi_sys_thread_exit */ + "cloudabi_sys_thread_yield", /* 51 = cloudabi_sys_thread_yield */ }; Index: projects/runtime-coverage/sys/compat/cloudabi64/cloudabi64_sysent.c =================================================================== --- projects/runtime-coverage/sys/compat/cloudabi64/cloudabi64_sysent.c (revision 322921) +++ projects/runtime-coverage/sys/compat/cloudabi64/cloudabi64_sysent.c (revision 322922) @@ -1,72 +1,69 @@ /* * System call switch table. * * DO NOT EDIT-- this file is automatically generated. * $FreeBSD$ */ #include #include #include #include #define AS(name) (sizeof(struct name) / sizeof(register_t)) /* The casts are bogus but will do for now. */ struct sysent cloudabi64_sysent[] = { { AS(cloudabi_sys_clock_res_get_args), (sy_call_t *)cloudabi_sys_clock_res_get, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 0 = cloudabi_sys_clock_res_get */ { AS(cloudabi_sys_clock_time_get_args), (sy_call_t *)cloudabi_sys_clock_time_get, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 1 = cloudabi_sys_clock_time_get */ { AS(cloudabi_sys_condvar_signal_args), (sy_call_t *)cloudabi_sys_condvar_signal, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 2 = cloudabi_sys_condvar_signal */ { AS(cloudabi_sys_fd_close_args), (sy_call_t *)cloudabi_sys_fd_close, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 3 = cloudabi_sys_fd_close */ { AS(cloudabi_sys_fd_create1_args), (sy_call_t *)cloudabi_sys_fd_create1, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 4 = cloudabi_sys_fd_create1 */ { AS(cloudabi_sys_fd_create2_args), (sy_call_t *)cloudabi_sys_fd_create2, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 5 = cloudabi_sys_fd_create2 */ { AS(cloudabi_sys_fd_datasync_args), (sy_call_t *)cloudabi_sys_fd_datasync, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 6 = cloudabi_sys_fd_datasync */ { AS(cloudabi_sys_fd_dup_args), (sy_call_t *)cloudabi_sys_fd_dup, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 7 = cloudabi_sys_fd_dup */ { AS(cloudabi64_sys_fd_pread_args), (sy_call_t *)cloudabi64_sys_fd_pread, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 8 = cloudabi64_sys_fd_pread */ { AS(cloudabi64_sys_fd_pwrite_args), (sy_call_t *)cloudabi64_sys_fd_pwrite, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 9 = cloudabi64_sys_fd_pwrite */ { AS(cloudabi64_sys_fd_read_args), (sy_call_t *)cloudabi64_sys_fd_read, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 10 = cloudabi64_sys_fd_read */ { AS(cloudabi_sys_fd_replace_args), (sy_call_t *)cloudabi_sys_fd_replace, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 11 = cloudabi_sys_fd_replace */ { AS(cloudabi_sys_fd_seek_args), (sy_call_t *)cloudabi_sys_fd_seek, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 12 = cloudabi_sys_fd_seek */ { AS(cloudabi_sys_fd_stat_get_args), (sy_call_t *)cloudabi_sys_fd_stat_get, AUE_NULL, NULL, 0, 0, 
SYF_CAPENABLED, SY_THR_STATIC }, /* 13 = cloudabi_sys_fd_stat_get */ { AS(cloudabi_sys_fd_stat_put_args), (sy_call_t *)cloudabi_sys_fd_stat_put, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 14 = cloudabi_sys_fd_stat_put */ { AS(cloudabi_sys_fd_sync_args), (sy_call_t *)cloudabi_sys_fd_sync, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 15 = cloudabi_sys_fd_sync */ { AS(cloudabi64_sys_fd_write_args), (sy_call_t *)cloudabi64_sys_fd_write, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 16 = cloudabi64_sys_fd_write */ { AS(cloudabi_sys_file_advise_args), (sy_call_t *)cloudabi_sys_file_advise, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 17 = cloudabi_sys_file_advise */ { AS(cloudabi_sys_file_allocate_args), (sy_call_t *)cloudabi_sys_file_allocate, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 18 = cloudabi_sys_file_allocate */ { AS(cloudabi_sys_file_create_args), (sy_call_t *)cloudabi_sys_file_create, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 19 = cloudabi_sys_file_create */ { AS(cloudabi_sys_file_link_args), (sy_call_t *)cloudabi_sys_file_link, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 20 = cloudabi_sys_file_link */ { AS(cloudabi_sys_file_open_args), (sy_call_t *)cloudabi_sys_file_open, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 21 = cloudabi_sys_file_open */ { AS(cloudabi_sys_file_readdir_args), (sy_call_t *)cloudabi_sys_file_readdir, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 22 = cloudabi_sys_file_readdir */ { AS(cloudabi_sys_file_readlink_args), (sy_call_t *)cloudabi_sys_file_readlink, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 23 = cloudabi_sys_file_readlink */ { AS(cloudabi_sys_file_rename_args), (sy_call_t *)cloudabi_sys_file_rename, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 24 = cloudabi_sys_file_rename */ { AS(cloudabi_sys_file_stat_fget_args), (sy_call_t *)cloudabi_sys_file_stat_fget, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 25 = cloudabi_sys_file_stat_fget */ { AS(cloudabi_sys_file_stat_fput_args), (sy_call_t *)cloudabi_sys_file_stat_fput, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 26 = cloudabi_sys_file_stat_fput */ { AS(cloudabi_sys_file_stat_get_args), (sy_call_t *)cloudabi_sys_file_stat_get, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 27 = cloudabi_sys_file_stat_get */ { AS(cloudabi_sys_file_stat_put_args), (sy_call_t *)cloudabi_sys_file_stat_put, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 28 = cloudabi_sys_file_stat_put */ { AS(cloudabi_sys_file_symlink_args), (sy_call_t *)cloudabi_sys_file_symlink, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 29 = cloudabi_sys_file_symlink */ { AS(cloudabi_sys_file_unlink_args), (sy_call_t *)cloudabi_sys_file_unlink, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 30 = cloudabi_sys_file_unlink */ { AS(cloudabi_sys_lock_unlock_args), (sy_call_t *)cloudabi_sys_lock_unlock, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 31 = cloudabi_sys_lock_unlock */ { AS(cloudabi_sys_mem_advise_args), (sy_call_t *)cloudabi_sys_mem_advise, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 32 = cloudabi_sys_mem_advise */ { AS(cloudabi_sys_mem_map_args), (sy_call_t *)cloudabi_sys_mem_map, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 33 = cloudabi_sys_mem_map */ { AS(cloudabi_sys_mem_protect_args), (sy_call_t *)cloudabi_sys_mem_protect, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 34 = 
cloudabi_sys_mem_protect */ { AS(cloudabi_sys_mem_sync_args), (sy_call_t *)cloudabi_sys_mem_sync, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 35 = cloudabi_sys_mem_sync */ { AS(cloudabi_sys_mem_unmap_args), (sy_call_t *)cloudabi_sys_mem_unmap, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 36 = cloudabi_sys_mem_unmap */ { AS(cloudabi64_sys_poll_args), (sy_call_t *)cloudabi64_sys_poll, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 37 = cloudabi64_sys_poll */ { AS(cloudabi64_sys_poll_fd_args), (sy_call_t *)cloudabi64_sys_poll_fd, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 38 = cloudabi64_sys_poll_fd */ { AS(cloudabi_sys_proc_exec_args), (sy_call_t *)cloudabi_sys_proc_exec, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 39 = cloudabi_sys_proc_exec */ { AS(cloudabi_sys_proc_exit_args), (sy_call_t *)cloudabi_sys_proc_exit, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 40 = cloudabi_sys_proc_exit */ { 0, (sy_call_t *)cloudabi_sys_proc_fork, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 41 = cloudabi_sys_proc_fork */ { AS(cloudabi_sys_proc_raise_args), (sy_call_t *)cloudabi_sys_proc_raise, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 42 = cloudabi_sys_proc_raise */ { AS(cloudabi_sys_random_get_args), (sy_call_t *)cloudabi_sys_random_get, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 43 = cloudabi_sys_random_get */ { AS(cloudabi_sys_sock_accept_args), (sy_call_t *)cloudabi_sys_sock_accept, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 44 = cloudabi_sys_sock_accept */ - { AS(cloudabi_sys_sock_bind_args), (sy_call_t *)cloudabi_sys_sock_bind, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 45 = cloudabi_sys_sock_bind */ - { AS(cloudabi_sys_sock_connect_args), (sy_call_t *)cloudabi_sys_sock_connect, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 46 = cloudabi_sys_sock_connect */ - { AS(cloudabi_sys_sock_listen_args), (sy_call_t *)cloudabi_sys_sock_listen, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 47 = cloudabi_sys_sock_listen */ - { AS(cloudabi64_sys_sock_recv_args), (sy_call_t *)cloudabi64_sys_sock_recv, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 48 = cloudabi64_sys_sock_recv */ - { AS(cloudabi64_sys_sock_send_args), (sy_call_t *)cloudabi64_sys_sock_send, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 49 = cloudabi64_sys_sock_send */ - { AS(cloudabi_sys_sock_shutdown_args), (sy_call_t *)cloudabi_sys_sock_shutdown, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 50 = cloudabi_sys_sock_shutdown */ - { AS(cloudabi_sys_sock_stat_get_args), (sy_call_t *)cloudabi_sys_sock_stat_get, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 51 = cloudabi_sys_sock_stat_get */ - { AS(cloudabi64_sys_thread_create_args), (sy_call_t *)cloudabi64_sys_thread_create, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 52 = cloudabi64_sys_thread_create */ - { AS(cloudabi_sys_thread_exit_args), (sy_call_t *)cloudabi_sys_thread_exit, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 53 = cloudabi_sys_thread_exit */ - { 0, (sy_call_t *)cloudabi_sys_thread_yield, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 54 = cloudabi_sys_thread_yield */ + { AS(cloudabi64_sys_sock_recv_args), (sy_call_t *)cloudabi64_sys_sock_recv, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 45 = cloudabi64_sys_sock_recv */ + { AS(cloudabi64_sys_sock_send_args), (sy_call_t *)cloudabi64_sys_sock_send, AUE_NULL, NULL, 0, 
0, SYF_CAPENABLED, SY_THR_STATIC }, /* 46 = cloudabi64_sys_sock_send */ + { AS(cloudabi_sys_sock_shutdown_args), (sy_call_t *)cloudabi_sys_sock_shutdown, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 47 = cloudabi_sys_sock_shutdown */ + { AS(cloudabi_sys_sock_stat_get_args), (sy_call_t *)cloudabi_sys_sock_stat_get, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 48 = cloudabi_sys_sock_stat_get */ + { AS(cloudabi64_sys_thread_create_args), (sy_call_t *)cloudabi64_sys_thread_create, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 49 = cloudabi64_sys_thread_create */ + { AS(cloudabi_sys_thread_exit_args), (sy_call_t *)cloudabi_sys_thread_exit, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 50 = cloudabi_sys_thread_exit */ + { 0, (sy_call_t *)cloudabi_sys_thread_yield, AUE_NULL, NULL, 0, 0, SYF_CAPENABLED, SY_THR_STATIC }, /* 51 = cloudabi_sys_thread_yield */ }; Index: projects/runtime-coverage/sys/compat/cloudabi64/cloudabi64_systrace_args.c =================================================================== --- projects/runtime-coverage/sys/compat/cloudabi64/cloudabi64_systrace_args.c (revision 322921) +++ projects/runtime-coverage/sys/compat/cloudabi64/cloudabi64_systrace_args.c (revision 322922) @@ -1,1650 +1,1556 @@ /* * System call argument to DTrace register array converstion. * * DO NOT EDIT-- this file is automatically generated. * $FreeBSD$ * This file is part of the DTrace syscall provider. */ static void systrace_args(int sysnum, void *params, uint64_t *uarg, int *n_args) { int64_t *iarg = (int64_t *) uarg; switch (sysnum) { /* cloudabi_sys_clock_res_get */ case 0: { struct cloudabi_sys_clock_res_get_args *p = params; iarg[0] = p->clock_id; /* cloudabi_clockid_t */ *n_args = 1; break; } /* cloudabi_sys_clock_time_get */ case 1: { struct cloudabi_sys_clock_time_get_args *p = params; iarg[0] = p->clock_id; /* cloudabi_clockid_t */ iarg[1] = p->precision; /* cloudabi_timestamp_t */ *n_args = 2; break; } /* cloudabi_sys_condvar_signal */ case 2: { struct cloudabi_sys_condvar_signal_args *p = params; uarg[0] = (intptr_t) p->condvar; /* cloudabi_condvar_t * */ iarg[1] = p->scope; /* cloudabi_scope_t */ iarg[2] = p->nwaiters; /* cloudabi_nthreads_t */ *n_args = 3; break; } /* cloudabi_sys_fd_close */ case 3: { struct cloudabi_sys_fd_close_args *p = params; iarg[0] = p->fd; /* cloudabi_fd_t */ *n_args = 1; break; } /* cloudabi_sys_fd_create1 */ case 4: { struct cloudabi_sys_fd_create1_args *p = params; iarg[0] = p->type; /* cloudabi_filetype_t */ *n_args = 1; break; } /* cloudabi_sys_fd_create2 */ case 5: { struct cloudabi_sys_fd_create2_args *p = params; iarg[0] = p->type; /* cloudabi_filetype_t */ *n_args = 1; break; } /* cloudabi_sys_fd_datasync */ case 6: { struct cloudabi_sys_fd_datasync_args *p = params; iarg[0] = p->fd; /* cloudabi_fd_t */ *n_args = 1; break; } /* cloudabi_sys_fd_dup */ case 7: { struct cloudabi_sys_fd_dup_args *p = params; iarg[0] = p->from; /* cloudabi_fd_t */ *n_args = 1; break; } /* cloudabi64_sys_fd_pread */ case 8: { struct cloudabi64_sys_fd_pread_args *p = params; iarg[0] = p->fd; /* cloudabi_fd_t */ uarg[1] = (intptr_t) p->iovs; /* const cloudabi64_iovec_t * */ uarg[2] = p->iovs_len; /* size_t */ iarg[3] = p->offset; /* cloudabi_filesize_t */ *n_args = 4; break; } /* cloudabi64_sys_fd_pwrite */ case 9: { struct cloudabi64_sys_fd_pwrite_args *p = params; iarg[0] = p->fd; /* cloudabi_fd_t */ uarg[1] = (intptr_t) p->iovs; /* const cloudabi64_ciovec_t * */ uarg[2] = p->iovs_len; /* size_t */ iarg[3] = p->offset; /* 
cloudabi_filesize_t */ *n_args = 4; break; } /* cloudabi64_sys_fd_read */ case 10: { struct cloudabi64_sys_fd_read_args *p = params; iarg[0] = p->fd; /* cloudabi_fd_t */ uarg[1] = (intptr_t) p->iovs; /* const cloudabi64_iovec_t * */ uarg[2] = p->iovs_len; /* size_t */ *n_args = 3; break; } /* cloudabi_sys_fd_replace */ case 11: { struct cloudabi_sys_fd_replace_args *p = params; iarg[0] = p->from; /* cloudabi_fd_t */ iarg[1] = p->to; /* cloudabi_fd_t */ *n_args = 2; break; } /* cloudabi_sys_fd_seek */ case 12: { struct cloudabi_sys_fd_seek_args *p = params; iarg[0] = p->fd; /* cloudabi_fd_t */ iarg[1] = p->offset; /* cloudabi_filedelta_t */ iarg[2] = p->whence; /* cloudabi_whence_t */ *n_args = 3; break; } /* cloudabi_sys_fd_stat_get */ case 13: { struct cloudabi_sys_fd_stat_get_args *p = params; iarg[0] = p->fd; /* cloudabi_fd_t */ uarg[1] = (intptr_t) p->buf; /* cloudabi_fdstat_t * */ *n_args = 2; break; } /* cloudabi_sys_fd_stat_put */ case 14: { struct cloudabi_sys_fd_stat_put_args *p = params; iarg[0] = p->fd; /* cloudabi_fd_t */ uarg[1] = (intptr_t) p->buf; /* const cloudabi_fdstat_t * */ iarg[2] = p->flags; /* cloudabi_fdsflags_t */ *n_args = 3; break; } /* cloudabi_sys_fd_sync */ case 15: { struct cloudabi_sys_fd_sync_args *p = params; iarg[0] = p->fd; /* cloudabi_fd_t */ *n_args = 1; break; } /* cloudabi64_sys_fd_write */ case 16: { struct cloudabi64_sys_fd_write_args *p = params; iarg[0] = p->fd; /* cloudabi_fd_t */ uarg[1] = (intptr_t) p->iovs; /* const cloudabi64_ciovec_t * */ uarg[2] = p->iovs_len; /* size_t */ *n_args = 3; break; } /* cloudabi_sys_file_advise */ case 17: { struct cloudabi_sys_file_advise_args *p = params; iarg[0] = p->fd; /* cloudabi_fd_t */ iarg[1] = p->offset; /* cloudabi_filesize_t */ iarg[2] = p->len; /* cloudabi_filesize_t */ iarg[3] = p->advice; /* cloudabi_advice_t */ *n_args = 4; break; } /* cloudabi_sys_file_allocate */ case 18: { struct cloudabi_sys_file_allocate_args *p = params; iarg[0] = p->fd; /* cloudabi_fd_t */ iarg[1] = p->offset; /* cloudabi_filesize_t */ iarg[2] = p->len; /* cloudabi_filesize_t */ *n_args = 3; break; } /* cloudabi_sys_file_create */ case 19: { struct cloudabi_sys_file_create_args *p = params; iarg[0] = p->fd; /* cloudabi_fd_t */ uarg[1] = (intptr_t) p->path; /* const char * */ uarg[2] = p->path_len; /* size_t */ iarg[3] = p->type; /* cloudabi_filetype_t */ *n_args = 4; break; } /* cloudabi_sys_file_link */ case 20: { struct cloudabi_sys_file_link_args *p = params; iarg[0] = p->fd1; /* cloudabi_lookup_t */ uarg[1] = (intptr_t) p->path1; /* const char * */ uarg[2] = p->path1_len; /* size_t */ iarg[3] = p->fd2; /* cloudabi_fd_t */ uarg[4] = (intptr_t) p->path2; /* const char * */ uarg[5] = p->path2_len; /* size_t */ *n_args = 6; break; } /* cloudabi_sys_file_open */ case 21: { struct cloudabi_sys_file_open_args *p = params; iarg[0] = p->dirfd; /* cloudabi_lookup_t */ uarg[1] = (intptr_t) p->path; /* const char * */ uarg[2] = p->path_len; /* size_t */ iarg[3] = p->oflags; /* cloudabi_oflags_t */ uarg[4] = (intptr_t) p->fds; /* const cloudabi_fdstat_t * */ *n_args = 5; break; } /* cloudabi_sys_file_readdir */ case 22: { struct cloudabi_sys_file_readdir_args *p = params; iarg[0] = p->fd; /* cloudabi_fd_t */ uarg[1] = (intptr_t) p->buf; /* void * */ uarg[2] = p->buf_len; /* size_t */ iarg[3] = p->cookie; /* cloudabi_dircookie_t */ *n_args = 4; break; } /* cloudabi_sys_file_readlink */ case 23: { struct cloudabi_sys_file_readlink_args *p = params; iarg[0] = p->fd; /* cloudabi_fd_t */ uarg[1] = (intptr_t) p->path; /* const char * */ 
uarg[2] = p->path_len; /* size_t */ uarg[3] = (intptr_t) p->buf; /* char * */ uarg[4] = p->buf_len; /* size_t */ *n_args = 5; break; } /* cloudabi_sys_file_rename */ case 24: { struct cloudabi_sys_file_rename_args *p = params; iarg[0] = p->fd1; /* cloudabi_fd_t */ uarg[1] = (intptr_t) p->path1; /* const char * */ uarg[2] = p->path1_len; /* size_t */ iarg[3] = p->fd2; /* cloudabi_fd_t */ uarg[4] = (intptr_t) p->path2; /* const char * */ uarg[5] = p->path2_len; /* size_t */ *n_args = 6; break; } /* cloudabi_sys_file_stat_fget */ case 25: { struct cloudabi_sys_file_stat_fget_args *p = params; iarg[0] = p->fd; /* cloudabi_fd_t */ uarg[1] = (intptr_t) p->buf; /* cloudabi_filestat_t * */ *n_args = 2; break; } /* cloudabi_sys_file_stat_fput */ case 26: { struct cloudabi_sys_file_stat_fput_args *p = params; iarg[0] = p->fd; /* cloudabi_fd_t */ uarg[1] = (intptr_t) p->buf; /* const cloudabi_filestat_t * */ iarg[2] = p->flags; /* cloudabi_fsflags_t */ *n_args = 3; break; } /* cloudabi_sys_file_stat_get */ case 27: { struct cloudabi_sys_file_stat_get_args *p = params; iarg[0] = p->fd; /* cloudabi_lookup_t */ uarg[1] = (intptr_t) p->path; /* const char * */ uarg[2] = p->path_len; /* size_t */ uarg[3] = (intptr_t) p->buf; /* cloudabi_filestat_t * */ *n_args = 4; break; } /* cloudabi_sys_file_stat_put */ case 28: { struct cloudabi_sys_file_stat_put_args *p = params; iarg[0] = p->fd; /* cloudabi_lookup_t */ uarg[1] = (intptr_t) p->path; /* const char * */ uarg[2] = p->path_len; /* size_t */ uarg[3] = (intptr_t) p->buf; /* const cloudabi_filestat_t * */ iarg[4] = p->flags; /* cloudabi_fsflags_t */ *n_args = 5; break; } /* cloudabi_sys_file_symlink */ case 29: { struct cloudabi_sys_file_symlink_args *p = params; uarg[0] = (intptr_t) p->path1; /* const char * */ uarg[1] = p->path1_len; /* size_t */ iarg[2] = p->fd; /* cloudabi_fd_t */ uarg[3] = (intptr_t) p->path2; /* const char * */ uarg[4] = p->path2_len; /* size_t */ *n_args = 5; break; } /* cloudabi_sys_file_unlink */ case 30: { struct cloudabi_sys_file_unlink_args *p = params; iarg[0] = p->fd; /* cloudabi_fd_t */ uarg[1] = (intptr_t) p->path; /* const char * */ uarg[2] = p->path_len; /* size_t */ iarg[3] = p->flags; /* cloudabi_ulflags_t */ *n_args = 4; break; } /* cloudabi_sys_lock_unlock */ case 31: { struct cloudabi_sys_lock_unlock_args *p = params; uarg[0] = (intptr_t) p->lock; /* cloudabi_lock_t * */ iarg[1] = p->scope; /* cloudabi_scope_t */ *n_args = 2; break; } /* cloudabi_sys_mem_advise */ case 32: { struct cloudabi_sys_mem_advise_args *p = params; uarg[0] = (intptr_t) p->mapping; /* void * */ uarg[1] = p->mapping_len; /* size_t */ iarg[2] = p->advice; /* cloudabi_advice_t */ *n_args = 3; break; } /* cloudabi_sys_mem_map */ case 33: { struct cloudabi_sys_mem_map_args *p = params; uarg[0] = (intptr_t) p->addr; /* void * */ uarg[1] = p->len; /* size_t */ iarg[2] = p->prot; /* cloudabi_mprot_t */ iarg[3] = p->flags; /* cloudabi_mflags_t */ iarg[4] = p->fd; /* cloudabi_fd_t */ iarg[5] = p->off; /* cloudabi_filesize_t */ *n_args = 6; break; } /* cloudabi_sys_mem_protect */ case 34: { struct cloudabi_sys_mem_protect_args *p = params; uarg[0] = (intptr_t) p->mapping; /* void * */ uarg[1] = p->mapping_len; /* size_t */ iarg[2] = p->prot; /* cloudabi_mprot_t */ *n_args = 3; break; } /* cloudabi_sys_mem_sync */ case 35: { struct cloudabi_sys_mem_sync_args *p = params; uarg[0] = (intptr_t) p->mapping; /* void * */ uarg[1] = p->mapping_len; /* size_t */ iarg[2] = p->flags; /* cloudabi_msflags_t */ *n_args = 3; break; } /* cloudabi_sys_mem_unmap */ case 
36: { struct cloudabi_sys_mem_unmap_args *p = params; uarg[0] = (intptr_t) p->mapping; /* void * */ uarg[1] = p->mapping_len; /* size_t */ *n_args = 2; break; } /* cloudabi64_sys_poll */ case 37: { struct cloudabi64_sys_poll_args *p = params; uarg[0] = (intptr_t) p->in; /* const cloudabi64_subscription_t * */ uarg[1] = (intptr_t) p->out; /* cloudabi64_event_t * */ uarg[2] = p->nsubscriptions; /* size_t */ *n_args = 3; break; } /* cloudabi64_sys_poll_fd */ case 38: { struct cloudabi64_sys_poll_fd_args *p = params; iarg[0] = p->fd; /* cloudabi_fd_t */ uarg[1] = (intptr_t) p->in; /* const cloudabi64_subscription_t * */ uarg[2] = p->in_len; /* size_t */ uarg[3] = (intptr_t) p->out; /* cloudabi64_event_t * */ uarg[4] = p->out_len; /* size_t */ uarg[5] = (intptr_t) p->timeout; /* const cloudabi64_subscription_t * */ *n_args = 6; break; } /* cloudabi_sys_proc_exec */ case 39: { struct cloudabi_sys_proc_exec_args *p = params; iarg[0] = p->fd; /* cloudabi_fd_t */ uarg[1] = (intptr_t) p->data; /* const void * */ uarg[2] = p->data_len; /* size_t */ uarg[3] = (intptr_t) p->fds; /* const cloudabi_fd_t * */ uarg[4] = p->fds_len; /* size_t */ *n_args = 5; break; } /* cloudabi_sys_proc_exit */ case 40: { struct cloudabi_sys_proc_exit_args *p = params; iarg[0] = p->rval; /* cloudabi_exitcode_t */ *n_args = 1; break; } /* cloudabi_sys_proc_fork */ case 41: { *n_args = 0; break; } /* cloudabi_sys_proc_raise */ case 42: { struct cloudabi_sys_proc_raise_args *p = params; iarg[0] = p->sig; /* cloudabi_signal_t */ *n_args = 1; break; } /* cloudabi_sys_random_get */ case 43: { struct cloudabi_sys_random_get_args *p = params; uarg[0] = (intptr_t) p->buf; /* void * */ uarg[1] = p->buf_len; /* size_t */ *n_args = 2; break; } /* cloudabi_sys_sock_accept */ case 44: { struct cloudabi_sys_sock_accept_args *p = params; iarg[0] = p->sock; /* cloudabi_fd_t */ uarg[1] = (intptr_t) p->unused; /* void * */ *n_args = 2; break; } - /* cloudabi_sys_sock_bind */ - case 45: { - struct cloudabi_sys_sock_bind_args *p = params; - iarg[0] = p->sock; /* cloudabi_fd_t */ - iarg[1] = p->fd; /* cloudabi_fd_t */ - uarg[2] = (intptr_t) p->path; /* const char * */ - uarg[3] = p->path_len; /* size_t */ - *n_args = 4; - break; - } - /* cloudabi_sys_sock_connect */ - case 46: { - struct cloudabi_sys_sock_connect_args *p = params; - iarg[0] = p->sock; /* cloudabi_fd_t */ - iarg[1] = p->fd; /* cloudabi_fd_t */ - uarg[2] = (intptr_t) p->path; /* const char * */ - uarg[3] = p->path_len; /* size_t */ - *n_args = 4; - break; - } - /* cloudabi_sys_sock_listen */ - case 47: { - struct cloudabi_sys_sock_listen_args *p = params; - iarg[0] = p->sock; /* cloudabi_fd_t */ - iarg[1] = p->backlog; /* cloudabi_backlog_t */ - *n_args = 2; - break; - } /* cloudabi64_sys_sock_recv */ - case 48: { + case 45: { struct cloudabi64_sys_sock_recv_args *p = params; iarg[0] = p->sock; /* cloudabi_fd_t */ uarg[1] = (intptr_t) p->in; /* const cloudabi64_recv_in_t * */ uarg[2] = (intptr_t) p->out; /* cloudabi64_recv_out_t * */ *n_args = 3; break; } /* cloudabi64_sys_sock_send */ - case 49: { + case 46: { struct cloudabi64_sys_sock_send_args *p = params; iarg[0] = p->sock; /* cloudabi_fd_t */ uarg[1] = (intptr_t) p->in; /* const cloudabi64_send_in_t * */ uarg[2] = (intptr_t) p->out; /* cloudabi64_send_out_t * */ *n_args = 3; break; } /* cloudabi_sys_sock_shutdown */ - case 50: { + case 47: { struct cloudabi_sys_sock_shutdown_args *p = params; iarg[0] = p->sock; /* cloudabi_fd_t */ iarg[1] = p->how; /* cloudabi_sdflags_t */ *n_args = 2; break; } /* cloudabi_sys_sock_stat_get 
*/ - case 51: { + case 48: { struct cloudabi_sys_sock_stat_get_args *p = params; iarg[0] = p->sock; /* cloudabi_fd_t */ uarg[1] = (intptr_t) p->buf; /* cloudabi_sockstat_t * */ iarg[2] = p->flags; /* cloudabi_ssflags_t */ *n_args = 3; break; } /* cloudabi64_sys_thread_create */ - case 52: { + case 49: { struct cloudabi64_sys_thread_create_args *p = params; uarg[0] = (intptr_t) p->attr; /* cloudabi64_threadattr_t * */ *n_args = 1; break; } /* cloudabi_sys_thread_exit */ - case 53: { + case 50: { struct cloudabi_sys_thread_exit_args *p = params; uarg[0] = (intptr_t) p->lock; /* cloudabi_lock_t * */ iarg[1] = p->scope; /* cloudabi_scope_t */ *n_args = 2; break; } /* cloudabi_sys_thread_yield */ - case 54: { + case 51: { *n_args = 0; break; } default: *n_args = 0; break; }; } static void systrace_entry_setargdesc(int sysnum, int ndx, char *desc, size_t descsz) { const char *p = NULL; switch (sysnum) { /* cloudabi_sys_clock_res_get */ case 0: switch(ndx) { case 0: p = "cloudabi_clockid_t"; break; default: break; }; break; /* cloudabi_sys_clock_time_get */ case 1: switch(ndx) { case 0: p = "cloudabi_clockid_t"; break; case 1: p = "cloudabi_timestamp_t"; break; default: break; }; break; /* cloudabi_sys_condvar_signal */ case 2: switch(ndx) { case 0: p = "userland cloudabi_condvar_t *"; break; case 1: p = "cloudabi_scope_t"; break; case 2: p = "cloudabi_nthreads_t"; break; default: break; }; break; /* cloudabi_sys_fd_close */ case 3: switch(ndx) { case 0: p = "cloudabi_fd_t"; break; default: break; }; break; /* cloudabi_sys_fd_create1 */ case 4: switch(ndx) { case 0: p = "cloudabi_filetype_t"; break; default: break; }; break; /* cloudabi_sys_fd_create2 */ case 5: switch(ndx) { case 0: p = "cloudabi_filetype_t"; break; default: break; }; break; /* cloudabi_sys_fd_datasync */ case 6: switch(ndx) { case 0: p = "cloudabi_fd_t"; break; default: break; }; break; /* cloudabi_sys_fd_dup */ case 7: switch(ndx) { case 0: p = "cloudabi_fd_t"; break; default: break; }; break; /* cloudabi64_sys_fd_pread */ case 8: switch(ndx) { case 0: p = "cloudabi_fd_t"; break; case 1: p = "userland const cloudabi64_iovec_t *"; break; case 2: p = "size_t"; break; case 3: p = "cloudabi_filesize_t"; break; default: break; }; break; /* cloudabi64_sys_fd_pwrite */ case 9: switch(ndx) { case 0: p = "cloudabi_fd_t"; break; case 1: p = "userland const cloudabi64_ciovec_t *"; break; case 2: p = "size_t"; break; case 3: p = "cloudabi_filesize_t"; break; default: break; }; break; /* cloudabi64_sys_fd_read */ case 10: switch(ndx) { case 0: p = "cloudabi_fd_t"; break; case 1: p = "userland const cloudabi64_iovec_t *"; break; case 2: p = "size_t"; break; default: break; }; break; /* cloudabi_sys_fd_replace */ case 11: switch(ndx) { case 0: p = "cloudabi_fd_t"; break; case 1: p = "cloudabi_fd_t"; break; default: break; }; break; /* cloudabi_sys_fd_seek */ case 12: switch(ndx) { case 0: p = "cloudabi_fd_t"; break; case 1: p = "cloudabi_filedelta_t"; break; case 2: p = "cloudabi_whence_t"; break; default: break; }; break; /* cloudabi_sys_fd_stat_get */ case 13: switch(ndx) { case 0: p = "cloudabi_fd_t"; break; case 1: p = "userland cloudabi_fdstat_t *"; break; default: break; }; break; /* cloudabi_sys_fd_stat_put */ case 14: switch(ndx) { case 0: p = "cloudabi_fd_t"; break; case 1: p = "userland const cloudabi_fdstat_t *"; break; case 2: p = "cloudabi_fdsflags_t"; break; default: break; }; break; /* cloudabi_sys_fd_sync */ case 15: switch(ndx) { case 0: p = "cloudabi_fd_t"; break; default: break; }; break; /* cloudabi64_sys_fd_write */ case 
16: switch(ndx) { case 0: p = "cloudabi_fd_t"; break; case 1: p = "userland const cloudabi64_ciovec_t *"; break; case 2: p = "size_t"; break; default: break; }; break; /* cloudabi_sys_file_advise */ case 17: switch(ndx) { case 0: p = "cloudabi_fd_t"; break; case 1: p = "cloudabi_filesize_t"; break; case 2: p = "cloudabi_filesize_t"; break; case 3: p = "cloudabi_advice_t"; break; default: break; }; break; /* cloudabi_sys_file_allocate */ case 18: switch(ndx) { case 0: p = "cloudabi_fd_t"; break; case 1: p = "cloudabi_filesize_t"; break; case 2: p = "cloudabi_filesize_t"; break; default: break; }; break; /* cloudabi_sys_file_create */ case 19: switch(ndx) { case 0: p = "cloudabi_fd_t"; break; case 1: p = "userland const char *"; break; case 2: p = "size_t"; break; case 3: p = "cloudabi_filetype_t"; break; default: break; }; break; /* cloudabi_sys_file_link */ case 20: switch(ndx) { case 0: p = "cloudabi_lookup_t"; break; case 1: p = "userland const char *"; break; case 2: p = "size_t"; break; case 3: p = "cloudabi_fd_t"; break; case 4: p = "userland const char *"; break; case 5: p = "size_t"; break; default: break; }; break; /* cloudabi_sys_file_open */ case 21: switch(ndx) { case 0: p = "cloudabi_lookup_t"; break; case 1: p = "userland const char *"; break; case 2: p = "size_t"; break; case 3: p = "cloudabi_oflags_t"; break; case 4: p = "userland const cloudabi_fdstat_t *"; break; default: break; }; break; /* cloudabi_sys_file_readdir */ case 22: switch(ndx) { case 0: p = "cloudabi_fd_t"; break; case 1: p = "userland void *"; break; case 2: p = "size_t"; break; case 3: p = "cloudabi_dircookie_t"; break; default: break; }; break; /* cloudabi_sys_file_readlink */ case 23: switch(ndx) { case 0: p = "cloudabi_fd_t"; break; case 1: p = "userland const char *"; break; case 2: p = "size_t"; break; case 3: p = "userland char *"; break; case 4: p = "size_t"; break; default: break; }; break; /* cloudabi_sys_file_rename */ case 24: switch(ndx) { case 0: p = "cloudabi_fd_t"; break; case 1: p = "userland const char *"; break; case 2: p = "size_t"; break; case 3: p = "cloudabi_fd_t"; break; case 4: p = "userland const char *"; break; case 5: p = "size_t"; break; default: break; }; break; /* cloudabi_sys_file_stat_fget */ case 25: switch(ndx) { case 0: p = "cloudabi_fd_t"; break; case 1: p = "userland cloudabi_filestat_t *"; break; default: break; }; break; /* cloudabi_sys_file_stat_fput */ case 26: switch(ndx) { case 0: p = "cloudabi_fd_t"; break; case 1: p = "userland const cloudabi_filestat_t *"; break; case 2: p = "cloudabi_fsflags_t"; break; default: break; }; break; /* cloudabi_sys_file_stat_get */ case 27: switch(ndx) { case 0: p = "cloudabi_lookup_t"; break; case 1: p = "userland const char *"; break; case 2: p = "size_t"; break; case 3: p = "userland cloudabi_filestat_t *"; break; default: break; }; break; /* cloudabi_sys_file_stat_put */ case 28: switch(ndx) { case 0: p = "cloudabi_lookup_t"; break; case 1: p = "userland const char *"; break; case 2: p = "size_t"; break; case 3: p = "userland const cloudabi_filestat_t *"; break; case 4: p = "cloudabi_fsflags_t"; break; default: break; }; break; /* cloudabi_sys_file_symlink */ case 29: switch(ndx) { case 0: p = "userland const char *"; break; case 1: p = "size_t"; break; case 2: p = "cloudabi_fd_t"; break; case 3: p = "userland const char *"; break; case 4: p = "size_t"; break; default: break; }; break; /* cloudabi_sys_file_unlink */ case 30: switch(ndx) { case 0: p = "cloudabi_fd_t"; break; case 1: p = "userland const char *"; break; case 2: p = 
"size_t"; break; case 3: p = "cloudabi_ulflags_t"; break; default: break; }; break; /* cloudabi_sys_lock_unlock */ case 31: switch(ndx) { case 0: p = "userland cloudabi_lock_t *"; break; case 1: p = "cloudabi_scope_t"; break; default: break; }; break; /* cloudabi_sys_mem_advise */ case 32: switch(ndx) { case 0: p = "userland void *"; break; case 1: p = "size_t"; break; case 2: p = "cloudabi_advice_t"; break; default: break; }; break; /* cloudabi_sys_mem_map */ case 33: switch(ndx) { case 0: p = "userland void *"; break; case 1: p = "size_t"; break; case 2: p = "cloudabi_mprot_t"; break; case 3: p = "cloudabi_mflags_t"; break; case 4: p = "cloudabi_fd_t"; break; case 5: p = "cloudabi_filesize_t"; break; default: break; }; break; /* cloudabi_sys_mem_protect */ case 34: switch(ndx) { case 0: p = "userland void *"; break; case 1: p = "size_t"; break; case 2: p = "cloudabi_mprot_t"; break; default: break; }; break; /* cloudabi_sys_mem_sync */ case 35: switch(ndx) { case 0: p = "userland void *"; break; case 1: p = "size_t"; break; case 2: p = "cloudabi_msflags_t"; break; default: break; }; break; /* cloudabi_sys_mem_unmap */ case 36: switch(ndx) { case 0: p = "userland void *"; break; case 1: p = "size_t"; break; default: break; }; break; /* cloudabi64_sys_poll */ case 37: switch(ndx) { case 0: p = "userland const cloudabi64_subscription_t *"; break; case 1: p = "userland cloudabi64_event_t *"; break; case 2: p = "size_t"; break; default: break; }; break; /* cloudabi64_sys_poll_fd */ case 38: switch(ndx) { case 0: p = "cloudabi_fd_t"; break; case 1: p = "userland const cloudabi64_subscription_t *"; break; case 2: p = "size_t"; break; case 3: p = "userland cloudabi64_event_t *"; break; case 4: p = "size_t"; break; case 5: p = "userland const cloudabi64_subscription_t *"; break; default: break; }; break; /* cloudabi_sys_proc_exec */ case 39: switch(ndx) { case 0: p = "cloudabi_fd_t"; break; case 1: p = "userland const void *"; break; case 2: p = "size_t"; break; case 3: p = "userland const cloudabi_fd_t *"; break; case 4: p = "size_t"; break; default: break; }; break; /* cloudabi_sys_proc_exit */ case 40: switch(ndx) { case 0: p = "cloudabi_exitcode_t"; break; default: break; }; break; /* cloudabi_sys_proc_fork */ case 41: break; /* cloudabi_sys_proc_raise */ case 42: switch(ndx) { case 0: p = "cloudabi_signal_t"; break; default: break; }; break; /* cloudabi_sys_random_get */ case 43: switch(ndx) { case 0: p = "userland void *"; break; case 1: p = "size_t"; break; default: break; }; break; /* cloudabi_sys_sock_accept */ case 44: switch(ndx) { case 0: p = "cloudabi_fd_t"; break; case 1: p = "userland void *"; break; default: break; }; break; - /* cloudabi_sys_sock_bind */ + /* cloudabi64_sys_sock_recv */ case 45: switch(ndx) { case 0: p = "cloudabi_fd_t"; break; case 1: - p = "cloudabi_fd_t"; - break; - case 2: - p = "userland const char *"; - break; - case 3: - p = "size_t"; - break; - default: - break; - }; - break; - /* cloudabi_sys_sock_connect */ - case 46: - switch(ndx) { - case 0: - p = "cloudabi_fd_t"; - break; - case 1: - p = "cloudabi_fd_t"; - break; - case 2: - p = "userland const char *"; - break; - case 3: - p = "size_t"; - break; - default: - break; - }; - break; - /* cloudabi_sys_sock_listen */ - case 47: - switch(ndx) { - case 0: - p = "cloudabi_fd_t"; - break; - case 1: - p = "cloudabi_backlog_t"; - break; - default: - break; - }; - break; - /* cloudabi64_sys_sock_recv */ - case 48: - switch(ndx) { - case 0: - p = "cloudabi_fd_t"; - break; - case 1: p = "userland const 
cloudabi64_recv_in_t *"; break; case 2: p = "userland cloudabi64_recv_out_t *"; break; default: break; }; break; /* cloudabi64_sys_sock_send */ - case 49: + case 46: switch(ndx) { case 0: p = "cloudabi_fd_t"; break; case 1: p = "userland const cloudabi64_send_in_t *"; break; case 2: p = "userland cloudabi64_send_out_t *"; break; default: break; }; break; /* cloudabi_sys_sock_shutdown */ - case 50: + case 47: switch(ndx) { case 0: p = "cloudabi_fd_t"; break; case 1: p = "cloudabi_sdflags_t"; break; default: break; }; break; /* cloudabi_sys_sock_stat_get */ - case 51: + case 48: switch(ndx) { case 0: p = "cloudabi_fd_t"; break; case 1: p = "userland cloudabi_sockstat_t *"; break; case 2: p = "cloudabi_ssflags_t"; break; default: break; }; break; /* cloudabi64_sys_thread_create */ - case 52: + case 49: switch(ndx) { case 0: p = "userland cloudabi64_threadattr_t *"; break; default: break; }; break; /* cloudabi_sys_thread_exit */ - case 53: + case 50: switch(ndx) { case 0: p = "userland cloudabi_lock_t *"; break; case 1: p = "cloudabi_scope_t"; break; default: break; }; break; /* cloudabi_sys_thread_yield */ - case 54: + case 51: break; default: break; }; if (p != NULL) strlcpy(desc, p, descsz); } static void systrace_return_setargdesc(int sysnum, int ndx, char *desc, size_t descsz) { const char *p = NULL; switch (sysnum) { /* cloudabi_sys_clock_res_get */ case 0: if (ndx == 0 || ndx == 1) p = "cloudabi_timestamp_t"; break; /* cloudabi_sys_clock_time_get */ case 1: if (ndx == 0 || ndx == 1) p = "cloudabi_timestamp_t"; break; /* cloudabi_sys_condvar_signal */ case 2: if (ndx == 0 || ndx == 1) p = "void"; break; /* cloudabi_sys_fd_close */ case 3: if (ndx == 0 || ndx == 1) p = "void"; break; /* cloudabi_sys_fd_create1 */ case 4: if (ndx == 0 || ndx == 1) p = "cloudabi_fd_t"; break; /* cloudabi_sys_fd_create2 */ case 5: if (ndx == 0 || ndx == 1) p = "void"; break; /* cloudabi_sys_fd_datasync */ case 6: if (ndx == 0 || ndx == 1) p = "void"; break; /* cloudabi_sys_fd_dup */ case 7: if (ndx == 0 || ndx == 1) p = "cloudabi_fd_t"; break; /* cloudabi64_sys_fd_pread */ case 8: if (ndx == 0 || ndx == 1) p = "size_t"; break; /* cloudabi64_sys_fd_pwrite */ case 9: if (ndx == 0 || ndx == 1) p = "size_t"; break; /* cloudabi64_sys_fd_read */ case 10: if (ndx == 0 || ndx == 1) p = "size_t"; break; /* cloudabi_sys_fd_replace */ case 11: if (ndx == 0 || ndx == 1) p = "void"; break; /* cloudabi_sys_fd_seek */ case 12: if (ndx == 0 || ndx == 1) p = "cloudabi_filesize_t"; break; /* cloudabi_sys_fd_stat_get */ case 13: if (ndx == 0 || ndx == 1) p = "void"; break; /* cloudabi_sys_fd_stat_put */ case 14: if (ndx == 0 || ndx == 1) p = "void"; break; /* cloudabi_sys_fd_sync */ case 15: if (ndx == 0 || ndx == 1) p = "void"; break; /* cloudabi64_sys_fd_write */ case 16: if (ndx == 0 || ndx == 1) p = "size_t"; break; /* cloudabi_sys_file_advise */ case 17: if (ndx == 0 || ndx == 1) p = "void"; break; /* cloudabi_sys_file_allocate */ case 18: if (ndx == 0 || ndx == 1) p = "void"; break; /* cloudabi_sys_file_create */ case 19: if (ndx == 0 || ndx == 1) p = "void"; break; /* cloudabi_sys_file_link */ case 20: if (ndx == 0 || ndx == 1) p = "void"; break; /* cloudabi_sys_file_open */ case 21: if (ndx == 0 || ndx == 1) p = "cloudabi_fd_t"; break; /* cloudabi_sys_file_readdir */ case 22: if (ndx == 0 || ndx == 1) p = "size_t"; break; /* cloudabi_sys_file_readlink */ case 23: if (ndx == 0 || ndx == 1) p = "size_t"; break; /* cloudabi_sys_file_rename */ case 24: if (ndx == 0 || ndx == 1) p = "void"; break; /* 
cloudabi_sys_file_stat_fget */ case 25: if (ndx == 0 || ndx == 1) p = "void"; break; /* cloudabi_sys_file_stat_fput */ case 26: if (ndx == 0 || ndx == 1) p = "void"; break; /* cloudabi_sys_file_stat_get */ case 27: if (ndx == 0 || ndx == 1) p = "void"; break; /* cloudabi_sys_file_stat_put */ case 28: if (ndx == 0 || ndx == 1) p = "void"; break; /* cloudabi_sys_file_symlink */ case 29: if (ndx == 0 || ndx == 1) p = "void"; break; /* cloudabi_sys_file_unlink */ case 30: if (ndx == 0 || ndx == 1) p = "void"; break; /* cloudabi_sys_lock_unlock */ case 31: if (ndx == 0 || ndx == 1) p = "void"; break; /* cloudabi_sys_mem_advise */ case 32: if (ndx == 0 || ndx == 1) p = "void"; break; /* cloudabi_sys_mem_map */ case 33: if (ndx == 0 || ndx == 1) p = "void"; break; /* cloudabi_sys_mem_protect */ case 34: if (ndx == 0 || ndx == 1) p = "void"; break; /* cloudabi_sys_mem_sync */ case 35: if (ndx == 0 || ndx == 1) p = "void"; break; /* cloudabi_sys_mem_unmap */ case 36: if (ndx == 0 || ndx == 1) p = "void"; break; /* cloudabi64_sys_poll */ case 37: if (ndx == 0 || ndx == 1) p = "size_t"; break; /* cloudabi64_sys_poll_fd */ case 38: if (ndx == 0 || ndx == 1) p = "size_t"; break; /* cloudabi_sys_proc_exec */ case 39: if (ndx == 0 || ndx == 1) p = "void"; break; /* cloudabi_sys_proc_exit */ case 40: if (ndx == 0 || ndx == 1) p = "void"; break; /* cloudabi_sys_proc_fork */ case 41: /* cloudabi_sys_proc_raise */ case 42: if (ndx == 0 || ndx == 1) p = "void"; break; /* cloudabi_sys_random_get */ case 43: if (ndx == 0 || ndx == 1) p = "void"; break; /* cloudabi_sys_sock_accept */ case 44: if (ndx == 0 || ndx == 1) p = "cloudabi_fd_t"; break; - /* cloudabi_sys_sock_bind */ + /* cloudabi64_sys_sock_recv */ case 45: if (ndx == 0 || ndx == 1) p = "void"; break; - /* cloudabi_sys_sock_connect */ + /* cloudabi64_sys_sock_send */ case 46: if (ndx == 0 || ndx == 1) p = "void"; break; - /* cloudabi_sys_sock_listen */ + /* cloudabi_sys_sock_shutdown */ case 47: if (ndx == 0 || ndx == 1) p = "void"; break; - /* cloudabi64_sys_sock_recv */ + /* cloudabi_sys_sock_stat_get */ case 48: if (ndx == 0 || ndx == 1) p = "void"; break; - /* cloudabi64_sys_sock_send */ + /* cloudabi64_sys_thread_create */ case 49: if (ndx == 0 || ndx == 1) - p = "void"; - break; - /* cloudabi_sys_sock_shutdown */ - case 50: - if (ndx == 0 || ndx == 1) - p = "void"; - break; - /* cloudabi_sys_sock_stat_get */ - case 51: - if (ndx == 0 || ndx == 1) - p = "void"; - break; - /* cloudabi64_sys_thread_create */ - case 52: - if (ndx == 0 || ndx == 1) p = "cloudabi_tid_t"; break; /* cloudabi_sys_thread_exit */ - case 53: + case 50: if (ndx == 0 || ndx == 1) p = "void"; break; /* cloudabi_sys_thread_yield */ - case 54: + case 51: default: break; }; if (p != NULL) strlcpy(desc, p, descsz); } Index: projects/runtime-coverage/sys/contrib/cloudabi/cloudabi_types_common.h =================================================================== --- projects/runtime-coverage/sys/contrib/cloudabi/cloudabi_types_common.h (revision 322921) +++ projects/runtime-coverage/sys/contrib/cloudabi/cloudabi_types_common.h (revision 322922) @@ -1,437 +1,432 @@ // Copyright (c) 2016 Nuxi (https://nuxi.nl/) and contributors. // // Redistribution and use in source and binary forms, with or without // modification, are permitted provided that the following conditions // are met: // 1. Redistributions of source code must retain the above copyright // notice, this list of conditions and the following disclaimer. // 2. 
Redistributions in binary form must reproduce the above copyright // notice, this list of conditions and the following disclaimer in the // documentation and/or other materials provided with the distribution. // // THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND // ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE // IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE // ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE // FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL // DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS // OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) // HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT // LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY // OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF // SUCH DAMAGE. // // This file is automatically generated. Do not edit. // // Source: https://github.com/NuxiNL/cloudabi #ifndef CLOUDABI_TYPES_COMMON_H #define CLOUDABI_TYPES_COMMON_H #if defined(__FreeBSD__) && defined(_KERNEL) #include #elif defined(__linux__) && defined(__KERNEL__) #include #else #include #include #endif typedef uint8_t cloudabi_advice_t; #define CLOUDABI_ADVICE_DONTNEED 1 #define CLOUDABI_ADVICE_NOREUSE 2 #define CLOUDABI_ADVICE_NORMAL 3 #define CLOUDABI_ADVICE_RANDOM 4 #define CLOUDABI_ADVICE_SEQUENTIAL 5 #define CLOUDABI_ADVICE_WILLNEED 6 typedef uint32_t cloudabi_auxtype_t; #define CLOUDABI_AT_ARGDATA 256 #define CLOUDABI_AT_ARGDATALEN 257 #define CLOUDABI_AT_BASE 7 #define CLOUDABI_AT_CANARY 258 #define CLOUDABI_AT_CANARYLEN 259 #define CLOUDABI_AT_NCPUS 260 #define CLOUDABI_AT_NULL 0 #define CLOUDABI_AT_PAGESZ 6 #define CLOUDABI_AT_PHDR 3 #define CLOUDABI_AT_PHNUM 4 #define CLOUDABI_AT_SYSINFO_EHDR 262 #define CLOUDABI_AT_TID 261 typedef uint32_t cloudabi_backlog_t; typedef uint32_t cloudabi_clockid_t; #define CLOUDABI_CLOCK_MONOTONIC 1 #define CLOUDABI_CLOCK_PROCESS_CPUTIME_ID 2 #define CLOUDABI_CLOCK_REALTIME 3 #define CLOUDABI_CLOCK_THREAD_CPUTIME_ID 4 typedef uint32_t cloudabi_condvar_t; #define CLOUDABI_CONDVAR_HAS_NO_WAITERS 0 typedef uint64_t cloudabi_device_t; typedef uint64_t cloudabi_dircookie_t; #define CLOUDABI_DIRCOOKIE_START 0 typedef uint16_t cloudabi_errno_t; #define CLOUDABI_E2BIG 1 #define CLOUDABI_EACCES 2 #define CLOUDABI_EADDRINUSE 3 #define CLOUDABI_EADDRNOTAVAIL 4 #define CLOUDABI_EAFNOSUPPORT 5 #define CLOUDABI_EAGAIN 6 #define CLOUDABI_EALREADY 7 #define CLOUDABI_EBADF 8 #define CLOUDABI_EBADMSG 9 #define CLOUDABI_EBUSY 10 #define CLOUDABI_ECANCELED 11 #define CLOUDABI_ECHILD 12 #define CLOUDABI_ECONNABORTED 13 #define CLOUDABI_ECONNREFUSED 14 #define CLOUDABI_ECONNRESET 15 #define CLOUDABI_EDEADLK 16 #define CLOUDABI_EDESTADDRREQ 17 #define CLOUDABI_EDOM 18 #define CLOUDABI_EDQUOT 19 #define CLOUDABI_EEXIST 20 #define CLOUDABI_EFAULT 21 #define CLOUDABI_EFBIG 22 #define CLOUDABI_EHOSTUNREACH 23 #define CLOUDABI_EIDRM 24 #define CLOUDABI_EILSEQ 25 #define CLOUDABI_EINPROGRESS 26 #define CLOUDABI_EINTR 27 #define CLOUDABI_EINVAL 28 #define CLOUDABI_EIO 29 #define CLOUDABI_EISCONN 30 #define CLOUDABI_EISDIR 31 #define CLOUDABI_ELOOP 32 #define CLOUDABI_EMFILE 33 #define CLOUDABI_EMLINK 34 #define CLOUDABI_EMSGSIZE 35 #define CLOUDABI_EMULTIHOP 36 #define CLOUDABI_ENAMETOOLONG 37 #define CLOUDABI_ENETDOWN 38 #define CLOUDABI_ENETRESET 39 #define CLOUDABI_ENETUNREACH 40 #define CLOUDABI_ENFILE 
41 #define CLOUDABI_ENOBUFS 42 #define CLOUDABI_ENODEV 43 #define CLOUDABI_ENOENT 44 #define CLOUDABI_ENOEXEC 45 #define CLOUDABI_ENOLCK 46 #define CLOUDABI_ENOLINK 47 #define CLOUDABI_ENOMEM 48 #define CLOUDABI_ENOMSG 49 #define CLOUDABI_ENOPROTOOPT 50 #define CLOUDABI_ENOSPC 51 #define CLOUDABI_ENOSYS 52 #define CLOUDABI_ENOTCONN 53 #define CLOUDABI_ENOTDIR 54 #define CLOUDABI_ENOTEMPTY 55 #define CLOUDABI_ENOTRECOVERABLE 56 #define CLOUDABI_ENOTSOCK 57 #define CLOUDABI_ENOTSUP 58 #define CLOUDABI_ENOTTY 59 #define CLOUDABI_ENXIO 60 #define CLOUDABI_EOVERFLOW 61 #define CLOUDABI_EOWNERDEAD 62 #define CLOUDABI_EPERM 63 #define CLOUDABI_EPIPE 64 #define CLOUDABI_EPROTO 65 #define CLOUDABI_EPROTONOSUPPORT 66 #define CLOUDABI_EPROTOTYPE 67 #define CLOUDABI_ERANGE 68 #define CLOUDABI_EROFS 69 #define CLOUDABI_ESPIPE 70 #define CLOUDABI_ESRCH 71 #define CLOUDABI_ESTALE 72 #define CLOUDABI_ETIMEDOUT 73 #define CLOUDABI_ETXTBSY 74 #define CLOUDABI_EXDEV 75 #define CLOUDABI_ENOTCAPABLE 76 typedef uint16_t cloudabi_eventrwflags_t; #define CLOUDABI_EVENT_FD_READWRITE_HANGUP 0x0001 typedef uint8_t cloudabi_eventtype_t; #define CLOUDABI_EVENTTYPE_CLOCK 1 #define CLOUDABI_EVENTTYPE_CONDVAR 2 #define CLOUDABI_EVENTTYPE_FD_READ 3 #define CLOUDABI_EVENTTYPE_FD_WRITE 4 #define CLOUDABI_EVENTTYPE_LOCK_RDLOCK 5 #define CLOUDABI_EVENTTYPE_LOCK_WRLOCK 6 #define CLOUDABI_EVENTTYPE_PROC_TERMINATE 7 typedef uint32_t cloudabi_exitcode_t; typedef uint32_t cloudabi_fd_t; #define CLOUDABI_PROCESS_CHILD 0xffffffff #define CLOUDABI_MAP_ANON_FD 0xffffffff typedef uint16_t cloudabi_fdflags_t; #define CLOUDABI_FDFLAG_APPEND 0x0001 #define CLOUDABI_FDFLAG_DSYNC 0x0002 #define CLOUDABI_FDFLAG_NONBLOCK 0x0004 #define CLOUDABI_FDFLAG_RSYNC 0x0008 #define CLOUDABI_FDFLAG_SYNC 0x0010 typedef uint16_t cloudabi_fdsflags_t; #define CLOUDABI_FDSTAT_FLAGS 0x0001 #define CLOUDABI_FDSTAT_RIGHTS 0x0002 typedef int64_t cloudabi_filedelta_t; typedef uint64_t cloudabi_filesize_t; typedef uint8_t cloudabi_filetype_t; #define CLOUDABI_FILETYPE_UNKNOWN 0 #define CLOUDABI_FILETYPE_BLOCK_DEVICE 16 #define CLOUDABI_FILETYPE_CHARACTER_DEVICE 17 #define CLOUDABI_FILETYPE_DIRECTORY 32 #define CLOUDABI_FILETYPE_FIFO 48 #define CLOUDABI_FILETYPE_POLL 64 #define CLOUDABI_FILETYPE_PROCESS 80 #define CLOUDABI_FILETYPE_REGULAR_FILE 96 #define CLOUDABI_FILETYPE_SHARED_MEMORY 112 #define CLOUDABI_FILETYPE_SOCKET_DGRAM 128 #define CLOUDABI_FILETYPE_SOCKET_STREAM 130 #define CLOUDABI_FILETYPE_SYMBOLIC_LINK 144 typedef uint16_t cloudabi_fsflags_t; #define CLOUDABI_FILESTAT_ATIM 0x0001 #define CLOUDABI_FILESTAT_ATIM_NOW 0x0002 #define CLOUDABI_FILESTAT_MTIM 0x0004 #define CLOUDABI_FILESTAT_MTIM_NOW 0x0008 #define CLOUDABI_FILESTAT_SIZE 0x0010 typedef uint64_t cloudabi_inode_t; typedef uint32_t cloudabi_linkcount_t; typedef uint32_t cloudabi_lock_t; #define CLOUDABI_LOCK_UNLOCKED 0x00000000 #define CLOUDABI_LOCK_WRLOCKED 0x40000000 #define CLOUDABI_LOCK_KERNEL_MANAGED 0x80000000 #define CLOUDABI_LOCK_BOGUS 0x80000000 typedef uint32_t cloudabi_lookupflags_t; #define CLOUDABI_LOOKUP_SYMLINK_FOLLOW 0x00000001 typedef uint8_t cloudabi_mflags_t; #define CLOUDABI_MAP_ANON 0x01 #define CLOUDABI_MAP_FIXED 0x02 #define CLOUDABI_MAP_PRIVATE 0x04 #define CLOUDABI_MAP_SHARED 0x08 typedef uint8_t cloudabi_mprot_t; #define CLOUDABI_PROT_EXEC 0x01 #define CLOUDABI_PROT_WRITE 0x02 #define CLOUDABI_PROT_READ 0x04 typedef uint8_t cloudabi_msflags_t; #define CLOUDABI_MS_ASYNC 0x01 #define CLOUDABI_MS_INVALIDATE 0x02 #define CLOUDABI_MS_SYNC 0x04 typedef uint32_t 
cloudabi_nthreads_t; typedef uint16_t cloudabi_oflags_t; #define CLOUDABI_O_CREAT 0x0001 #define CLOUDABI_O_DIRECTORY 0x0002 #define CLOUDABI_O_EXCL 0x0004 #define CLOUDABI_O_TRUNC 0x0008 typedef uint16_t cloudabi_riflags_t; #define CLOUDABI_SOCK_RECV_PEEK 0x0004 #define CLOUDABI_SOCK_RECV_WAITALL 0x0010 typedef uint64_t cloudabi_rights_t; -#define CLOUDABI_RIGHT_FD_DATASYNC 0x0000000000000001 -#define CLOUDABI_RIGHT_FD_READ 0x0000000000000002 -#define CLOUDABI_RIGHT_FD_SEEK 0x0000000000000004 -#define CLOUDABI_RIGHT_FD_STAT_PUT_FLAGS 0x0000000000000008 -#define CLOUDABI_RIGHT_FD_SYNC 0x0000000000000010 -#define CLOUDABI_RIGHT_FD_TELL 0x0000000000000020 -#define CLOUDABI_RIGHT_FD_WRITE 0x0000000000000040 -#define CLOUDABI_RIGHT_FILE_ADVISE 0x0000000000000080 -#define CLOUDABI_RIGHT_FILE_ALLOCATE 0x0000000000000100 -#define CLOUDABI_RIGHT_FILE_CREATE_DIRECTORY 0x0000000000000200 -#define CLOUDABI_RIGHT_FILE_CREATE_FILE 0x0000000000000400 -#define CLOUDABI_RIGHT_FILE_CREATE_FIFO 0x0000000000000800 -#define CLOUDABI_RIGHT_FILE_LINK_SOURCE 0x0000000000001000 -#define CLOUDABI_RIGHT_FILE_LINK_TARGET 0x0000000000002000 -#define CLOUDABI_RIGHT_FILE_OPEN 0x0000000000004000 -#define CLOUDABI_RIGHT_FILE_READDIR 0x0000000000008000 -#define CLOUDABI_RIGHT_FILE_READLINK 0x0000000000010000 -#define CLOUDABI_RIGHT_FILE_RENAME_SOURCE 0x0000000000020000 -#define CLOUDABI_RIGHT_FILE_RENAME_TARGET 0x0000000000040000 -#define CLOUDABI_RIGHT_FILE_STAT_FGET 0x0000000000080000 -#define CLOUDABI_RIGHT_FILE_STAT_FPUT_SIZE 0x0000000000100000 -#define CLOUDABI_RIGHT_FILE_STAT_FPUT_TIMES 0x0000000000200000 -#define CLOUDABI_RIGHT_FILE_STAT_GET 0x0000000000400000 -#define CLOUDABI_RIGHT_FILE_STAT_PUT_TIMES 0x0000000000800000 -#define CLOUDABI_RIGHT_FILE_SYMLINK 0x0000000001000000 -#define CLOUDABI_RIGHT_FILE_UNLINK 0x0000000002000000 -#define CLOUDABI_RIGHT_MEM_MAP 0x0000000004000000 -#define CLOUDABI_RIGHT_MEM_MAP_EXEC 0x0000000008000000 -#define CLOUDABI_RIGHT_POLL_FD_READWRITE 0x0000000010000000 -#define CLOUDABI_RIGHT_POLL_MODIFY 0x0000000020000000 -#define CLOUDABI_RIGHT_POLL_PROC_TERMINATE 0x0000000040000000 -#define CLOUDABI_RIGHT_POLL_WAIT 0x0000000080000000 -#define CLOUDABI_RIGHT_PROC_EXEC 0x0000000100000000 -#define CLOUDABI_RIGHT_SOCK_ACCEPT 0x0000000200000000 -#define CLOUDABI_RIGHT_SOCK_BIND_DIRECTORY 0x0000000400000000 -#define CLOUDABI_RIGHT_SOCK_BIND_SOCKET 0x0000000800000000 -#define CLOUDABI_RIGHT_SOCK_CONNECT_DIRECTORY 0x0000001000000000 -#define CLOUDABI_RIGHT_SOCK_CONNECT_SOCKET 0x0000002000000000 -#define CLOUDABI_RIGHT_SOCK_LISTEN 0x0000004000000000 -#define CLOUDABI_RIGHT_SOCK_SHUTDOWN 0x0000008000000000 -#define CLOUDABI_RIGHT_SOCK_STAT_GET 0x0000010000000000 +#define CLOUDABI_RIGHT_FD_DATASYNC 0x0000000000000001 +#define CLOUDABI_RIGHT_FD_READ 0x0000000000000002 +#define CLOUDABI_RIGHT_FD_SEEK 0x0000000000000004 +#define CLOUDABI_RIGHT_FD_STAT_PUT_FLAGS 0x0000000000000008 +#define CLOUDABI_RIGHT_FD_SYNC 0x0000000000000010 +#define CLOUDABI_RIGHT_FD_TELL 0x0000000000000020 +#define CLOUDABI_RIGHT_FD_WRITE 0x0000000000000040 +#define CLOUDABI_RIGHT_FILE_ADVISE 0x0000000000000080 +#define CLOUDABI_RIGHT_FILE_ALLOCATE 0x0000000000000100 +#define CLOUDABI_RIGHT_FILE_CREATE_DIRECTORY 0x0000000000000200 +#define CLOUDABI_RIGHT_FILE_CREATE_FILE 0x0000000000000400 +#define CLOUDABI_RIGHT_FILE_CREATE_FIFO 0x0000000000000800 +#define CLOUDABI_RIGHT_FILE_LINK_SOURCE 0x0000000000001000 +#define CLOUDABI_RIGHT_FILE_LINK_TARGET 0x0000000000002000 +#define CLOUDABI_RIGHT_FILE_OPEN 0x0000000000004000 
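(Editorial aside, not part of the diff: the CLOUDABI_RIGHT_* values in this header are single-bit flags packed into the 64-bit cloudabi_rights_t, and this commit drops the sock_bind/connect/listen related bits while leaving the remaining values untouched. The following is only a minimal sketch of how such a rights mask is typically built and tested; the has_rights() helper is hypothetical and for illustration only.)

    #include <stdbool.h>
    #include <stdint.h>

    typedef uint64_t cloudabi_rights_t;

    /* Two of the bit values defined in the header above. */
    #define CLOUDABI_RIGHT_FD_READ   0x0000000000000002
    #define CLOUDABI_RIGHT_FD_WRITE  0x0000000000000040

    /* Hypothetical helper: true only if every requested bit is present. */
    static bool
    has_rights(cloudabi_rights_t base, cloudabi_rights_t needed)
    {
            return ((base & needed) == needed);
    }

    int
    main(void)
    {
            cloudabi_rights_t base = CLOUDABI_RIGHT_FD_READ | CLOUDABI_RIGHT_FD_WRITE;

            /* Exits 0 because the read bit is part of the mask. */
            return (has_rights(base, CLOUDABI_RIGHT_FD_READ) ? 0 : 1);
    }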
+#define CLOUDABI_RIGHT_FILE_READDIR 0x0000000000008000 +#define CLOUDABI_RIGHT_FILE_READLINK 0x0000000000010000 +#define CLOUDABI_RIGHT_FILE_RENAME_SOURCE 0x0000000000020000 +#define CLOUDABI_RIGHT_FILE_RENAME_TARGET 0x0000000000040000 +#define CLOUDABI_RIGHT_FILE_STAT_FGET 0x0000000000080000 +#define CLOUDABI_RIGHT_FILE_STAT_FPUT_SIZE 0x0000000000100000 +#define CLOUDABI_RIGHT_FILE_STAT_FPUT_TIMES 0x0000000000200000 +#define CLOUDABI_RIGHT_FILE_STAT_GET 0x0000000000400000 +#define CLOUDABI_RIGHT_FILE_STAT_PUT_TIMES 0x0000000000800000 +#define CLOUDABI_RIGHT_FILE_SYMLINK 0x0000000001000000 +#define CLOUDABI_RIGHT_FILE_UNLINK 0x0000000002000000 +#define CLOUDABI_RIGHT_MEM_MAP 0x0000000004000000 +#define CLOUDABI_RIGHT_MEM_MAP_EXEC 0x0000000008000000 +#define CLOUDABI_RIGHT_POLL_FD_READWRITE 0x0000000010000000 +#define CLOUDABI_RIGHT_POLL_MODIFY 0x0000000020000000 +#define CLOUDABI_RIGHT_POLL_PROC_TERMINATE 0x0000000040000000 +#define CLOUDABI_RIGHT_POLL_WAIT 0x0000000080000000 +#define CLOUDABI_RIGHT_PROC_EXEC 0x0000000100000000 +#define CLOUDABI_RIGHT_SOCK_ACCEPT 0x0000000200000000 +#define CLOUDABI_RIGHT_SOCK_SHUTDOWN 0x0000008000000000 +#define CLOUDABI_RIGHT_SOCK_STAT_GET 0x0000010000000000 typedef uint16_t cloudabi_roflags_t; #define CLOUDABI_SOCK_RECV_FDS_TRUNCATED 0x0001 #define CLOUDABI_SOCK_RECV_DATA_TRUNCATED 0x0008 typedef uint8_t cloudabi_scope_t; #define CLOUDABI_SCOPE_PRIVATE 4 #define CLOUDABI_SCOPE_SHARED 8 typedef uint8_t cloudabi_sdflags_t; #define CLOUDABI_SHUT_RD 0x01 #define CLOUDABI_SHUT_WR 0x02 typedef uint16_t cloudabi_siflags_t; typedef uint8_t cloudabi_signal_t; #define CLOUDABI_SIGABRT 1 #define CLOUDABI_SIGALRM 2 #define CLOUDABI_SIGBUS 3 #define CLOUDABI_SIGCHLD 4 #define CLOUDABI_SIGCONT 5 #define CLOUDABI_SIGFPE 6 #define CLOUDABI_SIGHUP 7 #define CLOUDABI_SIGILL 8 #define CLOUDABI_SIGINT 9 #define CLOUDABI_SIGKILL 10 #define CLOUDABI_SIGPIPE 11 #define CLOUDABI_SIGQUIT 12 #define CLOUDABI_SIGSEGV 13 #define CLOUDABI_SIGSTOP 14 #define CLOUDABI_SIGSYS 15 #define CLOUDABI_SIGTERM 16 #define CLOUDABI_SIGTRAP 17 #define CLOUDABI_SIGTSTP 18 #define CLOUDABI_SIGTTIN 19 #define CLOUDABI_SIGTTOU 20 #define CLOUDABI_SIGURG 21 #define CLOUDABI_SIGUSR1 22 #define CLOUDABI_SIGUSR2 23 #define CLOUDABI_SIGVTALRM 24 #define CLOUDABI_SIGXCPU 25 #define CLOUDABI_SIGXFSZ 26 typedef uint8_t cloudabi_ssflags_t; #define CLOUDABI_SOCKSTAT_CLEAR_ERROR 0x01 typedef uint32_t cloudabi_sstate_t; #define CLOUDABI_SOCKSTATE_ACCEPTCONN 0x00000001 typedef uint16_t cloudabi_subclockflags_t; #define CLOUDABI_SUBSCRIPTION_CLOCK_ABSTIME 0x0001 typedef uint16_t cloudabi_subflags_t; #define CLOUDABI_SUBSCRIPTION_ADD 0x0001 #define CLOUDABI_SUBSCRIPTION_CLEAR 0x0002 #define CLOUDABI_SUBSCRIPTION_DELETE 0x0004 #define CLOUDABI_SUBSCRIPTION_DISABLE 0x0008 #define CLOUDABI_SUBSCRIPTION_ENABLE 0x0010 #define CLOUDABI_SUBSCRIPTION_ONESHOT 0x0020 typedef uint16_t cloudabi_subrwflags_t; #define CLOUDABI_SUBSCRIPTION_FD_READWRITE_POLL 0x0001 typedef uint32_t cloudabi_tid_t; typedef uint64_t cloudabi_timestamp_t; typedef uint8_t cloudabi_ulflags_t; #define CLOUDABI_UNLINK_REMOVEDIR 0x01 typedef uint64_t cloudabi_userdata_t; typedef uint8_t cloudabi_whence_t; #define CLOUDABI_WHENCE_CUR 1 #define CLOUDABI_WHENCE_END 2 #define CLOUDABI_WHENCE_SET 3 typedef struct { _Alignas(8) cloudabi_dircookie_t d_next; _Alignas(8) cloudabi_inode_t d_ino; _Alignas(4) uint32_t d_namlen; _Alignas(1) cloudabi_filetype_t d_type; } cloudabi_dirent_t; _Static_assert(offsetof(cloudabi_dirent_t, d_next) == 0, "Incorrect 
layout"); _Static_assert(offsetof(cloudabi_dirent_t, d_ino) == 8, "Incorrect layout"); _Static_assert(offsetof(cloudabi_dirent_t, d_namlen) == 16, "Incorrect layout"); _Static_assert(offsetof(cloudabi_dirent_t, d_type) == 20, "Incorrect layout"); _Static_assert(sizeof(cloudabi_dirent_t) == 24, "Incorrect layout"); _Static_assert(_Alignof(cloudabi_dirent_t) == 8, "Incorrect layout"); typedef struct { _Alignas(1) cloudabi_filetype_t fs_filetype; _Alignas(2) cloudabi_fdflags_t fs_flags; _Alignas(8) cloudabi_rights_t fs_rights_base; _Alignas(8) cloudabi_rights_t fs_rights_inheriting; } cloudabi_fdstat_t; _Static_assert(offsetof(cloudabi_fdstat_t, fs_filetype) == 0, "Incorrect layout"); _Static_assert(offsetof(cloudabi_fdstat_t, fs_flags) == 2, "Incorrect layout"); _Static_assert(offsetof(cloudabi_fdstat_t, fs_rights_base) == 8, "Incorrect layout"); _Static_assert(offsetof(cloudabi_fdstat_t, fs_rights_inheriting) == 16, "Incorrect layout"); _Static_assert(sizeof(cloudabi_fdstat_t) == 24, "Incorrect layout"); _Static_assert(_Alignof(cloudabi_fdstat_t) == 8, "Incorrect layout"); typedef struct { _Alignas(8) cloudabi_device_t st_dev; _Alignas(8) cloudabi_inode_t st_ino; _Alignas(1) cloudabi_filetype_t st_filetype; _Alignas(4) cloudabi_linkcount_t st_nlink; _Alignas(8) cloudabi_filesize_t st_size; _Alignas(8) cloudabi_timestamp_t st_atim; _Alignas(8) cloudabi_timestamp_t st_mtim; _Alignas(8) cloudabi_timestamp_t st_ctim; } cloudabi_filestat_t; _Static_assert(offsetof(cloudabi_filestat_t, st_dev) == 0, "Incorrect layout"); _Static_assert(offsetof(cloudabi_filestat_t, st_ino) == 8, "Incorrect layout"); _Static_assert(offsetof(cloudabi_filestat_t, st_filetype) == 16, "Incorrect layout"); _Static_assert(offsetof(cloudabi_filestat_t, st_nlink) == 20, "Incorrect layout"); _Static_assert(offsetof(cloudabi_filestat_t, st_size) == 24, "Incorrect layout"); _Static_assert(offsetof(cloudabi_filestat_t, st_atim) == 32, "Incorrect layout"); _Static_assert(offsetof(cloudabi_filestat_t, st_mtim) == 40, "Incorrect layout"); _Static_assert(offsetof(cloudabi_filestat_t, st_ctim) == 48, "Incorrect layout"); _Static_assert(sizeof(cloudabi_filestat_t) == 56, "Incorrect layout"); _Static_assert(_Alignof(cloudabi_filestat_t) == 8, "Incorrect layout"); typedef struct { _Alignas(4) cloudabi_fd_t fd; _Alignas(4) cloudabi_lookupflags_t flags; } cloudabi_lookup_t; _Static_assert(offsetof(cloudabi_lookup_t, fd) == 0, "Incorrect layout"); _Static_assert(offsetof(cloudabi_lookup_t, flags) == 4, "Incorrect layout"); _Static_assert(sizeof(cloudabi_lookup_t) == 8, "Incorrect layout"); _Static_assert(_Alignof(cloudabi_lookup_t) == 4, "Incorrect layout"); typedef struct { _Alignas(1) char ss_unused[40]; _Alignas(2) cloudabi_errno_t ss_error; _Alignas(4) cloudabi_sstate_t ss_state; } cloudabi_sockstat_t; _Static_assert(offsetof(cloudabi_sockstat_t, ss_unused) == 0, "Incorrect layout"); _Static_assert(offsetof(cloudabi_sockstat_t, ss_error) == 40, "Incorrect layout"); _Static_assert(offsetof(cloudabi_sockstat_t, ss_state) == 44, "Incorrect layout"); _Static_assert(sizeof(cloudabi_sockstat_t) == 48, "Incorrect layout"); _Static_assert(_Alignof(cloudabi_sockstat_t) == 4, "Incorrect layout"); #endif Index: projects/runtime-coverage/sys/contrib/cloudabi/cloudabi_vdso_aarch64.S =================================================================== --- projects/runtime-coverage/sys/contrib/cloudabi/cloudabi_vdso_aarch64.S (revision 322921) +++ projects/runtime-coverage/sys/contrib/cloudabi/cloudabi_vdso_aarch64.S (revision 322922) @@ -1,479 
+1,461 @@ // Copyright (c) 2016 Nuxi (https://nuxi.nl/) and contributors. // // Redistribution and use in source and binary forms, with or without // modification, are permitted provided that the following conditions // are met: // 1. Redistributions of source code must retain the above copyright // notice, this list of conditions and the following disclaimer. // 2. Redistributions in binary form must reproduce the above copyright // notice, this list of conditions and the following disclaimer in the // documentation and/or other materials provided with the distribution. // // THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND // ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE // IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE // ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE // FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL // DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS // OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) // HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT // LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY // OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF // SUCH DAMAGE. // // This file is automatically generated. Do not edit. // // Source: https://github.com/NuxiNL/cloudabi #define ENTRY(name) \ .text; \ .p2align 2; \ .global name; \ .type name, @function; \ name: #define END(name) .size name, . - name ENTRY(cloudabi_sys_clock_res_get) str x1, [sp, #-8] mov w8, #0 svc #0 ldr x2, [sp, #-8] b.cs 1f str x0, [x2] mov w0, wzr 1: ret END(cloudabi_sys_clock_res_get) ENTRY(cloudabi_sys_clock_time_get) str x2, [sp, #-8] mov w8, #1 svc #0 ldr x2, [sp, #-8] b.cs 1f str x0, [x2] mov w0, wzr 1: ret END(cloudabi_sys_clock_time_get) ENTRY(cloudabi_sys_condvar_signal) mov w8, #2 svc #0 ret END(cloudabi_sys_condvar_signal) ENTRY(cloudabi_sys_fd_close) mov w8, #3 svc #0 ret END(cloudabi_sys_fd_close) ENTRY(cloudabi_sys_fd_create1) str x1, [sp, #-8] mov w8, #4 svc #0 ldr x2, [sp, #-8] b.cs 1f str w0, [x2] mov w0, wzr 1: ret END(cloudabi_sys_fd_create1) ENTRY(cloudabi_sys_fd_create2) stp x1, x2, [sp, #-16] mov w8, #5 svc #0 ldp x2, x3, [sp, #-16] b.cs 1f str w0, [x2] str w1, [x3] mov w0, wzr 1: ret END(cloudabi_sys_fd_create2) ENTRY(cloudabi_sys_fd_datasync) mov w8, #6 svc #0 ret END(cloudabi_sys_fd_datasync) ENTRY(cloudabi_sys_fd_dup) str x1, [sp, #-8] mov w8, #7 svc #0 ldr x2, [sp, #-8] b.cs 1f str w0, [x2] mov w0, wzr 1: ret END(cloudabi_sys_fd_dup) ENTRY(cloudabi_sys_fd_pread) str x4, [sp, #-8] mov w8, #8 svc #0 ldr x2, [sp, #-8] b.cs 1f str x0, [x2] mov w0, wzr 1: ret END(cloudabi_sys_fd_pread) ENTRY(cloudabi_sys_fd_pwrite) str x4, [sp, #-8] mov w8, #9 svc #0 ldr x2, [sp, #-8] b.cs 1f str x0, [x2] mov w0, wzr 1: ret END(cloudabi_sys_fd_pwrite) ENTRY(cloudabi_sys_fd_read) str x3, [sp, #-8] mov w8, #10 svc #0 ldr x2, [sp, #-8] b.cs 1f str x0, [x2] mov w0, wzr 1: ret END(cloudabi_sys_fd_read) ENTRY(cloudabi_sys_fd_replace) mov w8, #11 svc #0 ret END(cloudabi_sys_fd_replace) ENTRY(cloudabi_sys_fd_seek) str x3, [sp, #-8] mov w8, #12 svc #0 ldr x2, [sp, #-8] b.cs 1f str x0, [x2] mov w0, wzr 1: ret END(cloudabi_sys_fd_seek) ENTRY(cloudabi_sys_fd_stat_get) mov w8, #13 svc #0 ret END(cloudabi_sys_fd_stat_get) ENTRY(cloudabi_sys_fd_stat_put) mov w8, #14 svc #0 ret END(cloudabi_sys_fd_stat_put) ENTRY(cloudabi_sys_fd_sync) mov w8, #15 svc #0 ret 
END(cloudabi_sys_fd_sync) ENTRY(cloudabi_sys_fd_write) str x3, [sp, #-8] mov w8, #16 svc #0 ldr x2, [sp, #-8] b.cs 1f str x0, [x2] mov w0, wzr 1: ret END(cloudabi_sys_fd_write) ENTRY(cloudabi_sys_file_advise) mov w8, #17 svc #0 ret END(cloudabi_sys_file_advise) ENTRY(cloudabi_sys_file_allocate) mov w8, #18 svc #0 ret END(cloudabi_sys_file_allocate) ENTRY(cloudabi_sys_file_create) mov w8, #19 svc #0 ret END(cloudabi_sys_file_create) ENTRY(cloudabi_sys_file_link) mov w8, #20 svc #0 ret END(cloudabi_sys_file_link) ENTRY(cloudabi_sys_file_open) str x5, [sp, #-8] mov w8, #21 svc #0 ldr x2, [sp, #-8] b.cs 1f str w0, [x2] mov w0, wzr 1: ret END(cloudabi_sys_file_open) ENTRY(cloudabi_sys_file_readdir) str x4, [sp, #-8] mov w8, #22 svc #0 ldr x2, [sp, #-8] b.cs 1f str x0, [x2] mov w0, wzr 1: ret END(cloudabi_sys_file_readdir) ENTRY(cloudabi_sys_file_readlink) str x5, [sp, #-8] mov w8, #23 svc #0 ldr x2, [sp, #-8] b.cs 1f str x0, [x2] mov w0, wzr 1: ret END(cloudabi_sys_file_readlink) ENTRY(cloudabi_sys_file_rename) mov w8, #24 svc #0 ret END(cloudabi_sys_file_rename) ENTRY(cloudabi_sys_file_stat_fget) mov w8, #25 svc #0 ret END(cloudabi_sys_file_stat_fget) ENTRY(cloudabi_sys_file_stat_fput) mov w8, #26 svc #0 ret END(cloudabi_sys_file_stat_fput) ENTRY(cloudabi_sys_file_stat_get) mov w8, #27 svc #0 ret END(cloudabi_sys_file_stat_get) ENTRY(cloudabi_sys_file_stat_put) mov w8, #28 svc #0 ret END(cloudabi_sys_file_stat_put) ENTRY(cloudabi_sys_file_symlink) mov w8, #29 svc #0 ret END(cloudabi_sys_file_symlink) ENTRY(cloudabi_sys_file_unlink) mov w8, #30 svc #0 ret END(cloudabi_sys_file_unlink) ENTRY(cloudabi_sys_lock_unlock) mov w8, #31 svc #0 ret END(cloudabi_sys_lock_unlock) ENTRY(cloudabi_sys_mem_advise) mov w8, #32 svc #0 ret END(cloudabi_sys_mem_advise) ENTRY(cloudabi_sys_mem_map) str x6, [sp, #-8] mov w8, #33 svc #0 ldr x2, [sp, #-8] b.cs 1f str x0, [x2] mov w0, wzr 1: ret END(cloudabi_sys_mem_map) ENTRY(cloudabi_sys_mem_protect) mov w8, #34 svc #0 ret END(cloudabi_sys_mem_protect) ENTRY(cloudabi_sys_mem_sync) mov w8, #35 svc #0 ret END(cloudabi_sys_mem_sync) ENTRY(cloudabi_sys_mem_unmap) mov w8, #36 svc #0 ret END(cloudabi_sys_mem_unmap) ENTRY(cloudabi_sys_poll) str x3, [sp, #-8] mov w8, #37 svc #0 ldr x2, [sp, #-8] b.cs 1f str x0, [x2] mov w0, wzr 1: ret END(cloudabi_sys_poll) ENTRY(cloudabi_sys_poll_fd) str x6, [sp, #-8] mov w8, #38 svc #0 ldr x2, [sp, #-8] b.cs 1f str x0, [x2] mov w0, wzr 1: ret END(cloudabi_sys_poll_fd) ENTRY(cloudabi_sys_proc_exec) mov w8, #39 svc #0 ret END(cloudabi_sys_proc_exec) ENTRY(cloudabi_sys_proc_exit) mov w8, #40 svc #0 END(cloudabi_sys_proc_exit) ENTRY(cloudabi_sys_proc_fork) stp x0, x1, [sp, #-16] mov w8, #41 svc #0 ldp x2, x3, [sp, #-16] b.cs 1f str w0, [x2] str w1, [x3] mov w0, wzr 1: ret END(cloudabi_sys_proc_fork) ENTRY(cloudabi_sys_proc_raise) mov w8, #42 svc #0 ret END(cloudabi_sys_proc_raise) ENTRY(cloudabi_sys_random_get) mov w8, #43 svc #0 ret END(cloudabi_sys_random_get) ENTRY(cloudabi_sys_sock_accept) str x2, [sp, #-8] mov w8, #44 svc #0 ldr x2, [sp, #-8] b.cs 1f str w0, [x2] mov w0, wzr 1: ret END(cloudabi_sys_sock_accept) -ENTRY(cloudabi_sys_sock_bind) +ENTRY(cloudabi_sys_sock_recv) mov w8, #45 svc #0 ret -END(cloudabi_sys_sock_bind) - -ENTRY(cloudabi_sys_sock_connect) - mov w8, #46 - svc #0 - ret -END(cloudabi_sys_sock_connect) - -ENTRY(cloudabi_sys_sock_listen) - mov w8, #47 - svc #0 - ret -END(cloudabi_sys_sock_listen) - -ENTRY(cloudabi_sys_sock_recv) - mov w8, #48 - svc #0 - ret END(cloudabi_sys_sock_recv) ENTRY(cloudabi_sys_sock_send) - mov w8, 
#49 + mov w8, #46 svc #0 ret END(cloudabi_sys_sock_send) ENTRY(cloudabi_sys_sock_shutdown) - mov w8, #50 + mov w8, #47 svc #0 ret END(cloudabi_sys_sock_shutdown) ENTRY(cloudabi_sys_sock_stat_get) - mov w8, #51 + mov w8, #48 svc #0 ret END(cloudabi_sys_sock_stat_get) ENTRY(cloudabi_sys_thread_create) str x1, [sp, #-8] - mov w8, #52 + mov w8, #49 svc #0 ldr x2, [sp, #-8] b.cs 1f str w0, [x2] mov w0, wzr 1: ret END(cloudabi_sys_thread_create) ENTRY(cloudabi_sys_thread_exit) - mov w8, #53 + mov w8, #50 svc #0 END(cloudabi_sys_thread_exit) ENTRY(cloudabi_sys_thread_yield) - mov w8, #54 + mov w8, #51 svc #0 ret END(cloudabi_sys_thread_yield) Index: projects/runtime-coverage/sys/contrib/cloudabi/cloudabi_vdso_armv6.S =================================================================== --- projects/runtime-coverage/sys/contrib/cloudabi/cloudabi_vdso_armv6.S (revision 322921) +++ projects/runtime-coverage/sys/contrib/cloudabi/cloudabi_vdso_armv6.S (revision 322922) @@ -1,439 +1,421 @@ // Copyright (c) 2016 Nuxi (https://nuxi.nl/) and contributors. // // Redistribution and use in source and binary forms, with or without // modification, are permitted provided that the following conditions // are met: // 1. Redistributions of source code must retain the above copyright // notice, this list of conditions and the following disclaimer. // 2. Redistributions in binary form must reproduce the above copyright // notice, this list of conditions and the following disclaimer in the // documentation and/or other materials provided with the distribution. // // THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND // ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE // IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE // ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE // FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL // DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS // OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) // HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT // LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY // OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF // SUCH DAMAGE. // // This file is automatically generated. Do not edit. // // Source: https://github.com/NuxiNL/cloudabi #define ENTRY(name) \ .text; \ .p2align 2; \ .global name; \ .type name, %function; \ name: #define END(name) .size name, . 
- name ENTRY(cloudabi_sys_clock_res_get) str r1, [sp, #-4] mov ip, #0 swi 0 ldrcc r2, [sp, #-4] strcc r0, [r2, 0] strcc r1, [r2, 4] movcc r0, $0 bx lr END(cloudabi_sys_clock_res_get) ENTRY(cloudabi_sys_clock_time_get) mov ip, #1 swi 0 ldrcc r2, [sp, #0] strcc r0, [r2, 0] strcc r1, [r2, 4] movcc r0, $0 bx lr END(cloudabi_sys_clock_time_get) ENTRY(cloudabi_sys_condvar_signal) mov ip, #2 swi 0 bx lr END(cloudabi_sys_condvar_signal) ENTRY(cloudabi_sys_fd_close) mov ip, #3 swi 0 bx lr END(cloudabi_sys_fd_close) ENTRY(cloudabi_sys_fd_create1) str r1, [sp, #-4] mov ip, #4 swi 0 ldrcc r2, [sp, #-4] strcc r0, [r2] movcc r0, $0 bx lr END(cloudabi_sys_fd_create1) ENTRY(cloudabi_sys_fd_create2) str r1, [sp, #-4] str r2, [sp, #-8] mov ip, #5 swi 0 ldrcc r2, [sp, #-4] ldrcc r3, [sp, #-8] strcc r0, [r2] strcc r1, [r3] movcc r0, $0 bx lr END(cloudabi_sys_fd_create2) ENTRY(cloudabi_sys_fd_datasync) mov ip, #6 swi 0 bx lr END(cloudabi_sys_fd_datasync) ENTRY(cloudabi_sys_fd_dup) str r1, [sp, #-4] mov ip, #7 swi 0 ldrcc r2, [sp, #-4] strcc r0, [r2] movcc r0, $0 bx lr END(cloudabi_sys_fd_dup) ENTRY(cloudabi_sys_fd_pread) mov ip, #8 swi 0 ldrcc r2, [sp, #8] strcc r0, [r2] movcc r0, $0 bx lr END(cloudabi_sys_fd_pread) ENTRY(cloudabi_sys_fd_pwrite) mov ip, #9 swi 0 ldrcc r2, [sp, #8] strcc r0, [r2] movcc r0, $0 bx lr END(cloudabi_sys_fd_pwrite) ENTRY(cloudabi_sys_fd_read) str r3, [sp, #-4] mov ip, #10 swi 0 ldrcc r2, [sp, #-4] strcc r0, [r2] movcc r0, $0 bx lr END(cloudabi_sys_fd_read) ENTRY(cloudabi_sys_fd_replace) mov ip, #11 swi 0 bx lr END(cloudabi_sys_fd_replace) ENTRY(cloudabi_sys_fd_seek) mov ip, #12 swi 0 ldrcc r2, [sp, #4] strcc r0, [r2, 0] strcc r1, [r2, 4] movcc r0, $0 bx lr END(cloudabi_sys_fd_seek) ENTRY(cloudabi_sys_fd_stat_get) mov ip, #13 swi 0 bx lr END(cloudabi_sys_fd_stat_get) ENTRY(cloudabi_sys_fd_stat_put) mov ip, #14 swi 0 bx lr END(cloudabi_sys_fd_stat_put) ENTRY(cloudabi_sys_fd_sync) mov ip, #15 swi 0 bx lr END(cloudabi_sys_fd_sync) ENTRY(cloudabi_sys_fd_write) str r3, [sp, #-4] mov ip, #16 swi 0 ldrcc r2, [sp, #-4] strcc r0, [r2] movcc r0, $0 bx lr END(cloudabi_sys_fd_write) ENTRY(cloudabi_sys_file_advise) mov ip, #17 swi 0 bx lr END(cloudabi_sys_file_advise) ENTRY(cloudabi_sys_file_allocate) mov ip, #18 swi 0 bx lr END(cloudabi_sys_file_allocate) ENTRY(cloudabi_sys_file_create) mov ip, #19 swi 0 bx lr END(cloudabi_sys_file_create) ENTRY(cloudabi_sys_file_link) mov ip, #20 swi 0 bx lr END(cloudabi_sys_file_link) ENTRY(cloudabi_sys_file_open) mov ip, #21 swi 0 ldrcc r2, [sp, #8] strcc r0, [r2] movcc r0, $0 bx lr END(cloudabi_sys_file_open) ENTRY(cloudabi_sys_file_readdir) mov ip, #22 swi 0 ldrcc r2, [sp, #8] strcc r0, [r2] movcc r0, $0 bx lr END(cloudabi_sys_file_readdir) ENTRY(cloudabi_sys_file_readlink) mov ip, #23 swi 0 ldrcc r2, [sp, #4] strcc r0, [r2] movcc r0, $0 bx lr END(cloudabi_sys_file_readlink) ENTRY(cloudabi_sys_file_rename) mov ip, #24 swi 0 bx lr END(cloudabi_sys_file_rename) ENTRY(cloudabi_sys_file_stat_fget) mov ip, #25 swi 0 bx lr END(cloudabi_sys_file_stat_fget) ENTRY(cloudabi_sys_file_stat_fput) mov ip, #26 swi 0 bx lr END(cloudabi_sys_file_stat_fput) ENTRY(cloudabi_sys_file_stat_get) mov ip, #27 swi 0 bx lr END(cloudabi_sys_file_stat_get) ENTRY(cloudabi_sys_file_stat_put) mov ip, #28 swi 0 bx lr END(cloudabi_sys_file_stat_put) ENTRY(cloudabi_sys_file_symlink) mov ip, #29 swi 0 bx lr END(cloudabi_sys_file_symlink) ENTRY(cloudabi_sys_file_unlink) mov ip, #30 swi 0 bx lr END(cloudabi_sys_file_unlink) ENTRY(cloudabi_sys_lock_unlock) mov ip, #31 swi 0 bx lr 
END(cloudabi_sys_lock_unlock) ENTRY(cloudabi_sys_mem_advise) mov ip, #32 swi 0 bx lr END(cloudabi_sys_mem_advise) ENTRY(cloudabi_sys_mem_map) mov ip, #33 swi 0 ldrcc r2, [sp, #16] strcc r0, [r2] movcc r0, $0 bx lr END(cloudabi_sys_mem_map) ENTRY(cloudabi_sys_mem_protect) mov ip, #34 swi 0 bx lr END(cloudabi_sys_mem_protect) ENTRY(cloudabi_sys_mem_sync) mov ip, #35 swi 0 bx lr END(cloudabi_sys_mem_sync) ENTRY(cloudabi_sys_mem_unmap) mov ip, #36 swi 0 bx lr END(cloudabi_sys_mem_unmap) ENTRY(cloudabi_sys_poll) str r3, [sp, #-4] mov ip, #37 swi 0 ldrcc r2, [sp, #-4] strcc r0, [r2] movcc r0, $0 bx lr END(cloudabi_sys_poll) ENTRY(cloudabi_sys_poll_fd) mov ip, #38 swi 0 ldrcc r2, [sp, #8] strcc r0, [r2] movcc r0, $0 bx lr END(cloudabi_sys_poll_fd) ENTRY(cloudabi_sys_proc_exec) mov ip, #39 swi 0 bx lr END(cloudabi_sys_proc_exec) ENTRY(cloudabi_sys_proc_exit) mov ip, #40 swi 0 END(cloudabi_sys_proc_exit) ENTRY(cloudabi_sys_proc_fork) str r0, [sp, #-4] str r1, [sp, #-8] mov ip, #41 swi 0 ldrcc r2, [sp, #-4] ldrcc r3, [sp, #-8] strcc r0, [r2] strcc r1, [r3] movcc r0, $0 bx lr END(cloudabi_sys_proc_fork) ENTRY(cloudabi_sys_proc_raise) mov ip, #42 swi 0 bx lr END(cloudabi_sys_proc_raise) ENTRY(cloudabi_sys_random_get) mov ip, #43 swi 0 bx lr END(cloudabi_sys_random_get) ENTRY(cloudabi_sys_sock_accept) str r2, [sp, #-4] mov ip, #44 swi 0 ldrcc r2, [sp, #-4] strcc r0, [r2] movcc r0, $0 bx lr END(cloudabi_sys_sock_accept) -ENTRY(cloudabi_sys_sock_bind) +ENTRY(cloudabi_sys_sock_recv) mov ip, #45 swi 0 bx lr -END(cloudabi_sys_sock_bind) - -ENTRY(cloudabi_sys_sock_connect) - mov ip, #46 - swi 0 - bx lr -END(cloudabi_sys_sock_connect) - -ENTRY(cloudabi_sys_sock_listen) - mov ip, #47 - swi 0 - bx lr -END(cloudabi_sys_sock_listen) - -ENTRY(cloudabi_sys_sock_recv) - mov ip, #48 - swi 0 - bx lr END(cloudabi_sys_sock_recv) ENTRY(cloudabi_sys_sock_send) - mov ip, #49 + mov ip, #46 swi 0 bx lr END(cloudabi_sys_sock_send) ENTRY(cloudabi_sys_sock_shutdown) - mov ip, #50 + mov ip, #47 swi 0 bx lr END(cloudabi_sys_sock_shutdown) ENTRY(cloudabi_sys_sock_stat_get) - mov ip, #51 + mov ip, #48 swi 0 bx lr END(cloudabi_sys_sock_stat_get) ENTRY(cloudabi_sys_thread_create) str r1, [sp, #-4] - mov ip, #52 + mov ip, #49 swi 0 ldrcc r2, [sp, #-4] strcc r0, [r2] movcc r0, $0 bx lr END(cloudabi_sys_thread_create) ENTRY(cloudabi_sys_thread_exit) - mov ip, #53 + mov ip, #50 swi 0 END(cloudabi_sys_thread_exit) ENTRY(cloudabi_sys_thread_yield) - mov ip, #54 + mov ip, #51 swi 0 bx lr END(cloudabi_sys_thread_yield) Index: projects/runtime-coverage/sys/contrib/cloudabi/cloudabi_vdso_i686.S =================================================================== --- projects/runtime-coverage/sys/contrib/cloudabi/cloudabi_vdso_i686.S (revision 322921) +++ projects/runtime-coverage/sys/contrib/cloudabi/cloudabi_vdso_i686.S (revision 322922) @@ -1,465 +1,447 @@ // Copyright (c) 2016 Nuxi (https://nuxi.nl/) and contributors. // // Redistribution and use in source and binary forms, with or without // modification, are permitted provided that the following conditions // are met: // 1. Redistributions of source code must retain the above copyright // notice, this list of conditions and the following disclaimer. // 2. Redistributions in binary form must reproduce the above copyright // notice, this list of conditions and the following disclaimer in the // documentation and/or other materials provided with the distribution. 
// // THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND // ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE // IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE // ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE // FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL // DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS // OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) // HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT // LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY // OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF // SUCH DAMAGE. // // This file is automatically generated. Do not edit. // // Source: https://github.com/NuxiNL/cloudabi #define ENTRY(name) \ .text; \ .p2align 2, 0x90; \ .global name; \ .type name, @function; \ name: #define END(name) .size name, . - name ENTRY(cloudabi_sys_clock_res_get) mov $0, %eax int $0x80 jc 1f mov 8(%esp), %ecx mov %eax, 0(%ecx) mov %edx, 4(%ecx) xor %eax, %eax 1: ret END(cloudabi_sys_clock_res_get) ENTRY(cloudabi_sys_clock_time_get) mov $1, %eax int $0x80 jc 1f mov 16(%esp), %ecx mov %eax, 0(%ecx) mov %edx, 4(%ecx) xor %eax, %eax 1: ret END(cloudabi_sys_clock_time_get) ENTRY(cloudabi_sys_condvar_signal) mov $2, %eax int $0x80 ret END(cloudabi_sys_condvar_signal) ENTRY(cloudabi_sys_fd_close) mov $3, %eax int $0x80 ret END(cloudabi_sys_fd_close) ENTRY(cloudabi_sys_fd_create1) mov $4, %eax int $0x80 jc 1f mov 8(%esp), %ecx mov %eax, (%ecx) xor %eax, %eax 1: ret END(cloudabi_sys_fd_create1) ENTRY(cloudabi_sys_fd_create2) mov $5, %eax int $0x80 jc 1f mov 8(%esp), %ecx mov %eax, (%ecx) mov 12(%esp), %ecx mov %edx, (%ecx) xor %eax, %eax 1: ret END(cloudabi_sys_fd_create2) ENTRY(cloudabi_sys_fd_datasync) mov $6, %eax int $0x80 ret END(cloudabi_sys_fd_datasync) ENTRY(cloudabi_sys_fd_dup) mov $7, %eax int $0x80 jc 1f mov 8(%esp), %ecx mov %eax, (%ecx) xor %eax, %eax 1: ret END(cloudabi_sys_fd_dup) ENTRY(cloudabi_sys_fd_pread) mov $8, %eax int $0x80 jc 1f mov 24(%esp), %ecx mov %eax, (%ecx) xor %eax, %eax 1: ret END(cloudabi_sys_fd_pread) ENTRY(cloudabi_sys_fd_pwrite) mov $9, %eax int $0x80 jc 1f mov 24(%esp), %ecx mov %eax, (%ecx) xor %eax, %eax 1: ret END(cloudabi_sys_fd_pwrite) ENTRY(cloudabi_sys_fd_read) mov $10, %eax int $0x80 jc 1f mov 16(%esp), %ecx mov %eax, (%ecx) xor %eax, %eax 1: ret END(cloudabi_sys_fd_read) ENTRY(cloudabi_sys_fd_replace) mov $11, %eax int $0x80 ret END(cloudabi_sys_fd_replace) ENTRY(cloudabi_sys_fd_seek) mov $12, %eax int $0x80 jc 1f mov 20(%esp), %ecx mov %eax, 0(%ecx) mov %edx, 4(%ecx) xor %eax, %eax 1: ret END(cloudabi_sys_fd_seek) ENTRY(cloudabi_sys_fd_stat_get) mov $13, %eax int $0x80 ret END(cloudabi_sys_fd_stat_get) ENTRY(cloudabi_sys_fd_stat_put) mov $14, %eax int $0x80 ret END(cloudabi_sys_fd_stat_put) ENTRY(cloudabi_sys_fd_sync) mov $15, %eax int $0x80 ret END(cloudabi_sys_fd_sync) ENTRY(cloudabi_sys_fd_write) mov $16, %eax int $0x80 jc 1f mov 16(%esp), %ecx mov %eax, (%ecx) xor %eax, %eax 1: ret END(cloudabi_sys_fd_write) ENTRY(cloudabi_sys_file_advise) mov $17, %eax int $0x80 ret END(cloudabi_sys_file_advise) ENTRY(cloudabi_sys_file_allocate) mov $18, %eax int $0x80 ret END(cloudabi_sys_file_allocate) ENTRY(cloudabi_sys_file_create) mov $19, %eax int $0x80 ret END(cloudabi_sys_file_create) ENTRY(cloudabi_sys_file_link) mov $20, %eax int $0x80 ret END(cloudabi_sys_file_link) 
ENTRY(cloudabi_sys_file_open) mov $21, %eax int $0x80 jc 1f mov 28(%esp), %ecx mov %eax, (%ecx) xor %eax, %eax 1: ret END(cloudabi_sys_file_open) ENTRY(cloudabi_sys_file_readdir) mov $22, %eax int $0x80 jc 1f mov 24(%esp), %ecx mov %eax, (%ecx) xor %eax, %eax 1: ret END(cloudabi_sys_file_readdir) ENTRY(cloudabi_sys_file_readlink) mov $23, %eax int $0x80 jc 1f mov 24(%esp), %ecx mov %eax, (%ecx) xor %eax, %eax 1: ret END(cloudabi_sys_file_readlink) ENTRY(cloudabi_sys_file_rename) mov $24, %eax int $0x80 ret END(cloudabi_sys_file_rename) ENTRY(cloudabi_sys_file_stat_fget) mov $25, %eax int $0x80 ret END(cloudabi_sys_file_stat_fget) ENTRY(cloudabi_sys_file_stat_fput) mov $26, %eax int $0x80 ret END(cloudabi_sys_file_stat_fput) ENTRY(cloudabi_sys_file_stat_get) mov $27, %eax int $0x80 ret END(cloudabi_sys_file_stat_get) ENTRY(cloudabi_sys_file_stat_put) mov $28, %eax int $0x80 ret END(cloudabi_sys_file_stat_put) ENTRY(cloudabi_sys_file_symlink) mov $29, %eax int $0x80 ret END(cloudabi_sys_file_symlink) ENTRY(cloudabi_sys_file_unlink) mov $30, %eax int $0x80 ret END(cloudabi_sys_file_unlink) ENTRY(cloudabi_sys_lock_unlock) mov $31, %eax int $0x80 ret END(cloudabi_sys_lock_unlock) ENTRY(cloudabi_sys_mem_advise) mov $32, %eax int $0x80 ret END(cloudabi_sys_mem_advise) ENTRY(cloudabi_sys_mem_map) mov $33, %eax int $0x80 jc 1f mov 32(%esp), %ecx mov %eax, (%ecx) xor %eax, %eax 1: ret END(cloudabi_sys_mem_map) ENTRY(cloudabi_sys_mem_protect) mov $34, %eax int $0x80 ret END(cloudabi_sys_mem_protect) ENTRY(cloudabi_sys_mem_sync) mov $35, %eax int $0x80 ret END(cloudabi_sys_mem_sync) ENTRY(cloudabi_sys_mem_unmap) mov $36, %eax int $0x80 ret END(cloudabi_sys_mem_unmap) ENTRY(cloudabi_sys_poll) mov $37, %eax int $0x80 jc 1f mov 16(%esp), %ecx mov %eax, (%ecx) xor %eax, %eax 1: ret END(cloudabi_sys_poll) ENTRY(cloudabi_sys_poll_fd) mov $38, %eax int $0x80 jc 1f mov 28(%esp), %ecx mov %eax, (%ecx) xor %eax, %eax 1: ret END(cloudabi_sys_poll_fd) ENTRY(cloudabi_sys_proc_exec) mov $39, %eax int $0x80 ret END(cloudabi_sys_proc_exec) ENTRY(cloudabi_sys_proc_exit) mov $40, %eax int $0x80 END(cloudabi_sys_proc_exit) ENTRY(cloudabi_sys_proc_fork) mov $41, %eax int $0x80 jc 1f mov 4(%esp), %ecx mov %eax, (%ecx) mov 8(%esp), %ecx mov %edx, (%ecx) xor %eax, %eax 1: ret END(cloudabi_sys_proc_fork) ENTRY(cloudabi_sys_proc_raise) mov $42, %eax int $0x80 ret END(cloudabi_sys_proc_raise) ENTRY(cloudabi_sys_random_get) mov $43, %eax int $0x80 ret END(cloudabi_sys_random_get) ENTRY(cloudabi_sys_sock_accept) mov $44, %eax int $0x80 jc 1f mov 12(%esp), %ecx mov %eax, (%ecx) xor %eax, %eax 1: ret END(cloudabi_sys_sock_accept) -ENTRY(cloudabi_sys_sock_bind) +ENTRY(cloudabi_sys_sock_recv) mov $45, %eax int $0x80 ret -END(cloudabi_sys_sock_bind) - -ENTRY(cloudabi_sys_sock_connect) - mov $46, %eax - int $0x80 - ret -END(cloudabi_sys_sock_connect) - -ENTRY(cloudabi_sys_sock_listen) - mov $47, %eax - int $0x80 - ret -END(cloudabi_sys_sock_listen) - -ENTRY(cloudabi_sys_sock_recv) - mov $48, %eax - int $0x80 - ret END(cloudabi_sys_sock_recv) ENTRY(cloudabi_sys_sock_send) - mov $49, %eax + mov $46, %eax int $0x80 ret END(cloudabi_sys_sock_send) ENTRY(cloudabi_sys_sock_shutdown) - mov $50, %eax + mov $47, %eax int $0x80 ret END(cloudabi_sys_sock_shutdown) ENTRY(cloudabi_sys_sock_stat_get) - mov $51, %eax + mov $48, %eax int $0x80 ret END(cloudabi_sys_sock_stat_get) ENTRY(cloudabi_sys_thread_create) - mov $52, %eax + mov $49, %eax int $0x80 jc 1f mov 8(%esp), %ecx mov %eax, (%ecx) xor %eax, %eax 1: ret END(cloudabi_sys_thread_create) 
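// Note: cloudabi_sys_thread_exit below, like cloudabi_sys_proc_exit earlier
// in this file, does not return to its caller, so its stub only loads the
// system call number and traps; there is no trailing ret after int $0x80.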
ENTRY(cloudabi_sys_thread_exit) - mov $53, %eax + mov $50, %eax int $0x80 END(cloudabi_sys_thread_exit) ENTRY(cloudabi_sys_thread_yield) - mov $54, %eax + mov $51, %eax int $0x80 ret END(cloudabi_sys_thread_yield) Index: projects/runtime-coverage/sys/contrib/cloudabi/cloudabi_vdso_i686_on_64bit.S =================================================================== --- projects/runtime-coverage/sys/contrib/cloudabi/cloudabi_vdso_i686_on_64bit.S (revision 322921) +++ projects/runtime-coverage/sys/contrib/cloudabi/cloudabi_vdso_i686_on_64bit.S (revision 322922) @@ -1,1189 +1,1132 @@ // Copyright (c) 2016 Nuxi (https://nuxi.nl/) and contributors. // // Redistribution and use in source and binary forms, with or without // modification, are permitted provided that the following conditions // are met: // 1. Redistributions of source code must retain the above copyright // notice, this list of conditions and the following disclaimer. // 2. Redistributions in binary form must reproduce the above copyright // notice, this list of conditions and the following disclaimer in the // documentation and/or other materials provided with the distribution. // // THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND // ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE // IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE // ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE // FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL // DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS // OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) // HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT // LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY // OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF // SUCH DAMAGE. // // This file is automatically generated. Do not edit. // // Source: https://github.com/NuxiNL/cloudabi #define ENTRY(name) \ .text; \ .p2align 2, 0x90; \ .global name; \ .type name, @function; \ name: #define END(name) .size name, . 
- name ENTRY(cloudabi_sys_clock_res_get) push %ebp mov %esp, %ebp mov 8(%ebp), %ecx mov %ecx, -16(%ebp) mov $0, %eax mov %ebp, %ecx sub $16, %ecx int $0x80 test %eax, %eax jnz 1f mov 12(%ebp), %ecx mov -16(%ebp), %edx mov %edx, 0(%ecx) mov -12(%ebp), %edx mov %edx, 4(%ecx) 1: pop %ebp ret END(cloudabi_sys_clock_res_get) ENTRY(cloudabi_sys_clock_time_get) push %ebp mov %esp, %ebp mov 8(%ebp), %ecx mov %ecx, -16(%ebp) mov 12(%ebp), %ecx mov %ecx, -8(%ebp) mov 16(%ebp), %ecx mov %ecx, -4(%ebp) mov $1, %eax mov %ebp, %ecx sub $16, %ecx int $0x80 test %eax, %eax jnz 1f mov 20(%ebp), %ecx mov -16(%ebp), %edx mov %edx, 0(%ecx) mov -12(%ebp), %edx mov %edx, 4(%ecx) 1: pop %ebp ret END(cloudabi_sys_clock_time_get) ENTRY(cloudabi_sys_condvar_signal) push %ebp mov %esp, %ebp mov 8(%ebp), %ecx mov %ecx, -24(%ebp) movl $0, -20(%ebp) mov 12(%ebp), %ecx mov %ecx, -16(%ebp) mov 16(%ebp), %ecx mov %ecx, -8(%ebp) mov $2, %eax mov %ebp, %ecx sub $24, %ecx int $0x80 pop %ebp ret END(cloudabi_sys_condvar_signal) ENTRY(cloudabi_sys_fd_close) push %ebp mov %esp, %ebp mov 8(%ebp), %ecx mov %ecx, -16(%ebp) mov $3, %eax mov %ebp, %ecx sub $16, %ecx int $0x80 pop %ebp ret END(cloudabi_sys_fd_close) ENTRY(cloudabi_sys_fd_create1) push %ebp mov %esp, %ebp mov 8(%ebp), %ecx mov %ecx, -16(%ebp) mov $4, %eax mov %ebp, %ecx sub $16, %ecx int $0x80 test %eax, %eax jnz 1f mov 12(%ebp), %ecx mov -16(%ebp), %edx mov %edx, 0(%ecx) 1: pop %ebp ret END(cloudabi_sys_fd_create1) ENTRY(cloudabi_sys_fd_create2) push %ebp mov %esp, %ebp mov 8(%ebp), %ecx mov %ecx, -16(%ebp) mov $5, %eax mov %ebp, %ecx sub $16, %ecx int $0x80 test %eax, %eax jnz 1f mov 12(%ebp), %ecx mov -16(%ebp), %edx mov %edx, 0(%ecx) mov 16(%ebp), %ecx mov -8(%ebp), %edx mov %edx, 0(%ecx) 1: pop %ebp ret END(cloudabi_sys_fd_create2) ENTRY(cloudabi_sys_fd_datasync) push %ebp mov %esp, %ebp mov 8(%ebp), %ecx mov %ecx, -16(%ebp) mov $6, %eax mov %ebp, %ecx sub $16, %ecx int $0x80 pop %ebp ret END(cloudabi_sys_fd_datasync) ENTRY(cloudabi_sys_fd_dup) push %ebp mov %esp, %ebp mov 8(%ebp), %ecx mov %ecx, -16(%ebp) mov $7, %eax mov %ebp, %ecx sub $16, %ecx int $0x80 test %eax, %eax jnz 1f mov 12(%ebp), %ecx mov -16(%ebp), %edx mov %edx, 0(%ecx) 1: pop %ebp ret END(cloudabi_sys_fd_dup) ENTRY(cloudabi_sys_fd_pread) push %ebp mov %esp, %ebp mov 8(%ebp), %ecx mov %ecx, -32(%ebp) mov 12(%ebp), %ecx mov %ecx, -24(%ebp) movl $0, -20(%ebp) mov 16(%ebp), %ecx mov %ecx, -16(%ebp) movl $0, -12(%ebp) mov 20(%ebp), %ecx mov %ecx, -8(%ebp) mov 24(%ebp), %ecx mov %ecx, -4(%ebp) mov $8, %eax mov %ebp, %ecx sub $32, %ecx int $0x80 test %eax, %eax jnz 1f mov 28(%ebp), %ecx mov -32(%ebp), %edx mov %edx, 0(%ecx) 1: pop %ebp ret END(cloudabi_sys_fd_pread) ENTRY(cloudabi_sys_fd_pwrite) push %ebp mov %esp, %ebp mov 8(%ebp), %ecx mov %ecx, -32(%ebp) mov 12(%ebp), %ecx mov %ecx, -24(%ebp) movl $0, -20(%ebp) mov 16(%ebp), %ecx mov %ecx, -16(%ebp) movl $0, -12(%ebp) mov 20(%ebp), %ecx mov %ecx, -8(%ebp) mov 24(%ebp), %ecx mov %ecx, -4(%ebp) mov $9, %eax mov %ebp, %ecx sub $32, %ecx int $0x80 test %eax, %eax jnz 1f mov 28(%ebp), %ecx mov -32(%ebp), %edx mov %edx, 0(%ecx) 1: pop %ebp ret END(cloudabi_sys_fd_pwrite) ENTRY(cloudabi_sys_fd_read) push %ebp mov %esp, %ebp mov 8(%ebp), %ecx mov %ecx, -24(%ebp) mov 12(%ebp), %ecx mov %ecx, -16(%ebp) movl $0, -12(%ebp) mov 16(%ebp), %ecx mov %ecx, -8(%ebp) movl $0, -4(%ebp) mov $10, %eax mov %ebp, %ecx sub $24, %ecx int $0x80 test %eax, %eax jnz 1f mov 20(%ebp), %ecx mov -24(%ebp), %edx mov %edx, 0(%ecx) 1: pop %ebp ret END(cloudabi_sys_fd_read) 
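// The stubs in this file share the pattern visible above: each 32-bit
// argument on the caller's stack is copied into a scratch block below %ebp,
// with pointer and size arguments widened to 64-bit slots whose upper halves
// are cleared (movl $0, ...). The address of that block is left in %ecx and
// the system call number in %eax before the int $0x80 trap; the stub then
// treats a zero %eax as success and copies any results from the scratch
// block back through the caller's 32-bit output pointers.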
ENTRY(cloudabi_sys_fd_replace) push %ebp mov %esp, %ebp mov 8(%ebp), %ecx mov %ecx, -16(%ebp) mov 12(%ebp), %ecx mov %ecx, -8(%ebp) mov $11, %eax mov %ebp, %ecx sub $16, %ecx int $0x80 pop %ebp ret END(cloudabi_sys_fd_replace) ENTRY(cloudabi_sys_fd_seek) push %ebp mov %esp, %ebp mov 8(%ebp), %ecx mov %ecx, -24(%ebp) mov 12(%ebp), %ecx mov %ecx, -16(%ebp) mov 16(%ebp), %ecx mov %ecx, -12(%ebp) mov 20(%ebp), %ecx mov %ecx, -8(%ebp) mov $12, %eax mov %ebp, %ecx sub $24, %ecx int $0x80 test %eax, %eax jnz 1f mov 24(%ebp), %ecx mov -24(%ebp), %edx mov %edx, 0(%ecx) mov -20(%ebp), %edx mov %edx, 4(%ecx) 1: pop %ebp ret END(cloudabi_sys_fd_seek) ENTRY(cloudabi_sys_fd_stat_get) push %ebp mov %esp, %ebp mov 8(%ebp), %ecx mov %ecx, -16(%ebp) mov 12(%ebp), %ecx mov %ecx, -8(%ebp) movl $0, -4(%ebp) mov $13, %eax mov %ebp, %ecx sub $16, %ecx int $0x80 pop %ebp ret END(cloudabi_sys_fd_stat_get) ENTRY(cloudabi_sys_fd_stat_put) push %ebp mov %esp, %ebp mov 8(%ebp), %ecx mov %ecx, -24(%ebp) mov 12(%ebp), %ecx mov %ecx, -16(%ebp) movl $0, -12(%ebp) mov 16(%ebp), %ecx mov %ecx, -8(%ebp) mov $14, %eax mov %ebp, %ecx sub $24, %ecx int $0x80 pop %ebp ret END(cloudabi_sys_fd_stat_put) ENTRY(cloudabi_sys_fd_sync) push %ebp mov %esp, %ebp mov 8(%ebp), %ecx mov %ecx, -16(%ebp) mov $15, %eax mov %ebp, %ecx sub $16, %ecx int $0x80 pop %ebp ret END(cloudabi_sys_fd_sync) ENTRY(cloudabi_sys_fd_write) push %ebp mov %esp, %ebp mov 8(%ebp), %ecx mov %ecx, -24(%ebp) mov 12(%ebp), %ecx mov %ecx, -16(%ebp) movl $0, -12(%ebp) mov 16(%ebp), %ecx mov %ecx, -8(%ebp) movl $0, -4(%ebp) mov $16, %eax mov %ebp, %ecx sub $24, %ecx int $0x80 test %eax, %eax jnz 1f mov 20(%ebp), %ecx mov -24(%ebp), %edx mov %edx, 0(%ecx) 1: pop %ebp ret END(cloudabi_sys_fd_write) ENTRY(cloudabi_sys_file_advise) push %ebp mov %esp, %ebp mov 8(%ebp), %ecx mov %ecx, -32(%ebp) mov 12(%ebp), %ecx mov %ecx, -24(%ebp) mov 16(%ebp), %ecx mov %ecx, -20(%ebp) mov 20(%ebp), %ecx mov %ecx, -16(%ebp) mov 24(%ebp), %ecx mov %ecx, -12(%ebp) mov 28(%ebp), %ecx mov %ecx, -8(%ebp) mov $17, %eax mov %ebp, %ecx sub $32, %ecx int $0x80 pop %ebp ret END(cloudabi_sys_file_advise) ENTRY(cloudabi_sys_file_allocate) push %ebp mov %esp, %ebp mov 8(%ebp), %ecx mov %ecx, -24(%ebp) mov 12(%ebp), %ecx mov %ecx, -16(%ebp) mov 16(%ebp), %ecx mov %ecx, -12(%ebp) mov 20(%ebp), %ecx mov %ecx, -8(%ebp) mov 24(%ebp), %ecx mov %ecx, -4(%ebp) mov $18, %eax mov %ebp, %ecx sub $24, %ecx int $0x80 pop %ebp ret END(cloudabi_sys_file_allocate) ENTRY(cloudabi_sys_file_create) push %ebp mov %esp, %ebp mov 8(%ebp), %ecx mov %ecx, -32(%ebp) mov 12(%ebp), %ecx mov %ecx, -24(%ebp) movl $0, -20(%ebp) mov 16(%ebp), %ecx mov %ecx, -16(%ebp) movl $0, -12(%ebp) mov 20(%ebp), %ecx mov %ecx, -8(%ebp) mov $19, %eax mov %ebp, %ecx sub $32, %ecx int $0x80 pop %ebp ret END(cloudabi_sys_file_create) ENTRY(cloudabi_sys_file_link) push %ebp mov %esp, %ebp mov 8(%ebp), %ecx mov %ecx, -48(%ebp) mov 12(%ebp), %ecx mov %ecx, -44(%ebp) mov 16(%ebp), %ecx mov %ecx, -40(%ebp) movl $0, -36(%ebp) mov 20(%ebp), %ecx mov %ecx, -32(%ebp) movl $0, -28(%ebp) mov 24(%ebp), %ecx mov %ecx, -24(%ebp) mov 28(%ebp), %ecx mov %ecx, -16(%ebp) movl $0, -12(%ebp) mov 32(%ebp), %ecx mov %ecx, -8(%ebp) movl $0, -4(%ebp) mov $20, %eax mov %ebp, %ecx sub $48, %ecx int $0x80 pop %ebp ret END(cloudabi_sys_file_link) ENTRY(cloudabi_sys_file_open) push %ebp mov %esp, %ebp mov 8(%ebp), %ecx mov %ecx, -40(%ebp) mov 12(%ebp), %ecx mov %ecx, -36(%ebp) mov 16(%ebp), %ecx mov %ecx, -32(%ebp) movl $0, -28(%ebp) mov 20(%ebp), %ecx mov %ecx, -24(%ebp) movl 
$0, -20(%ebp) mov 24(%ebp), %ecx mov %ecx, -16(%ebp) mov 28(%ebp), %ecx mov %ecx, -8(%ebp) movl $0, -4(%ebp) mov $21, %eax mov %ebp, %ecx sub $40, %ecx int $0x80 test %eax, %eax jnz 1f mov 32(%ebp), %ecx mov -40(%ebp), %edx mov %edx, 0(%ecx) 1: pop %ebp ret END(cloudabi_sys_file_open) ENTRY(cloudabi_sys_file_readdir) push %ebp mov %esp, %ebp mov 8(%ebp), %ecx mov %ecx, -32(%ebp) mov 12(%ebp), %ecx mov %ecx, -24(%ebp) movl $0, -20(%ebp) mov 16(%ebp), %ecx mov %ecx, -16(%ebp) movl $0, -12(%ebp) mov 20(%ebp), %ecx mov %ecx, -8(%ebp) mov 24(%ebp), %ecx mov %ecx, -4(%ebp) mov $22, %eax mov %ebp, %ecx sub $32, %ecx int $0x80 test %eax, %eax jnz 1f mov 28(%ebp), %ecx mov -32(%ebp), %edx mov %edx, 0(%ecx) 1: pop %ebp ret END(cloudabi_sys_file_readdir) ENTRY(cloudabi_sys_file_readlink) push %ebp mov %esp, %ebp mov 8(%ebp), %ecx mov %ecx, -40(%ebp) mov 12(%ebp), %ecx mov %ecx, -32(%ebp) movl $0, -28(%ebp) mov 16(%ebp), %ecx mov %ecx, -24(%ebp) movl $0, -20(%ebp) mov 20(%ebp), %ecx mov %ecx, -16(%ebp) movl $0, -12(%ebp) mov 24(%ebp), %ecx mov %ecx, -8(%ebp) movl $0, -4(%ebp) mov $23, %eax mov %ebp, %ecx sub $40, %ecx int $0x80 test %eax, %eax jnz 1f mov 28(%ebp), %ecx mov -40(%ebp), %edx mov %edx, 0(%ecx) 1: pop %ebp ret END(cloudabi_sys_file_readlink) ENTRY(cloudabi_sys_file_rename) push %ebp mov %esp, %ebp mov 8(%ebp), %ecx mov %ecx, -48(%ebp) mov 12(%ebp), %ecx mov %ecx, -40(%ebp) movl $0, -36(%ebp) mov 16(%ebp), %ecx mov %ecx, -32(%ebp) movl $0, -28(%ebp) mov 20(%ebp), %ecx mov %ecx, -24(%ebp) mov 24(%ebp), %ecx mov %ecx, -16(%ebp) movl $0, -12(%ebp) mov 28(%ebp), %ecx mov %ecx, -8(%ebp) movl $0, -4(%ebp) mov $24, %eax mov %ebp, %ecx sub $48, %ecx int $0x80 pop %ebp ret END(cloudabi_sys_file_rename) ENTRY(cloudabi_sys_file_stat_fget) push %ebp mov %esp, %ebp mov 8(%ebp), %ecx mov %ecx, -16(%ebp) mov 12(%ebp), %ecx mov %ecx, -8(%ebp) movl $0, -4(%ebp) mov $25, %eax mov %ebp, %ecx sub $16, %ecx int $0x80 pop %ebp ret END(cloudabi_sys_file_stat_fget) ENTRY(cloudabi_sys_file_stat_fput) push %ebp mov %esp, %ebp mov 8(%ebp), %ecx mov %ecx, -24(%ebp) mov 12(%ebp), %ecx mov %ecx, -16(%ebp) movl $0, -12(%ebp) mov 16(%ebp), %ecx mov %ecx, -8(%ebp) mov $26, %eax mov %ebp, %ecx sub $24, %ecx int $0x80 pop %ebp ret END(cloudabi_sys_file_stat_fput) ENTRY(cloudabi_sys_file_stat_get) push %ebp mov %esp, %ebp mov 8(%ebp), %ecx mov %ecx, -32(%ebp) mov 12(%ebp), %ecx mov %ecx, -28(%ebp) mov 16(%ebp), %ecx mov %ecx, -24(%ebp) movl $0, -20(%ebp) mov 20(%ebp), %ecx mov %ecx, -16(%ebp) movl $0, -12(%ebp) mov 24(%ebp), %ecx mov %ecx, -8(%ebp) movl $0, -4(%ebp) mov $27, %eax mov %ebp, %ecx sub $32, %ecx int $0x80 pop %ebp ret END(cloudabi_sys_file_stat_get) ENTRY(cloudabi_sys_file_stat_put) push %ebp mov %esp, %ebp mov 8(%ebp), %ecx mov %ecx, -40(%ebp) mov 12(%ebp), %ecx mov %ecx, -36(%ebp) mov 16(%ebp), %ecx mov %ecx, -32(%ebp) movl $0, -28(%ebp) mov 20(%ebp), %ecx mov %ecx, -24(%ebp) movl $0, -20(%ebp) mov 24(%ebp), %ecx mov %ecx, -16(%ebp) movl $0, -12(%ebp) mov 28(%ebp), %ecx mov %ecx, -8(%ebp) mov $28, %eax mov %ebp, %ecx sub $40, %ecx int $0x80 pop %ebp ret END(cloudabi_sys_file_stat_put) ENTRY(cloudabi_sys_file_symlink) push %ebp mov %esp, %ebp mov 8(%ebp), %ecx mov %ecx, -40(%ebp) movl $0, -36(%ebp) mov 12(%ebp), %ecx mov %ecx, -32(%ebp) movl $0, -28(%ebp) mov 16(%ebp), %ecx mov %ecx, -24(%ebp) mov 20(%ebp), %ecx mov %ecx, -16(%ebp) movl $0, -12(%ebp) mov 24(%ebp), %ecx mov %ecx, -8(%ebp) movl $0, -4(%ebp) mov $29, %eax mov %ebp, %ecx sub $40, %ecx int $0x80 pop %ebp ret END(cloudabi_sys_file_symlink) 
ENTRY(cloudabi_sys_file_unlink) push %ebp mov %esp, %ebp mov 8(%ebp), %ecx mov %ecx, -32(%ebp) mov 12(%ebp), %ecx mov %ecx, -24(%ebp) movl $0, -20(%ebp) mov 16(%ebp), %ecx mov %ecx, -16(%ebp) movl $0, -12(%ebp) mov 20(%ebp), %ecx mov %ecx, -8(%ebp) mov $30, %eax mov %ebp, %ecx sub $32, %ecx int $0x80 pop %ebp ret END(cloudabi_sys_file_unlink) ENTRY(cloudabi_sys_lock_unlock) push %ebp mov %esp, %ebp mov 8(%ebp), %ecx mov %ecx, -16(%ebp) movl $0, -12(%ebp) mov 12(%ebp), %ecx mov %ecx, -8(%ebp) mov $31, %eax mov %ebp, %ecx sub $16, %ecx int $0x80 pop %ebp ret END(cloudabi_sys_lock_unlock) ENTRY(cloudabi_sys_mem_advise) push %ebp mov %esp, %ebp mov 8(%ebp), %ecx mov %ecx, -24(%ebp) movl $0, -20(%ebp) mov 12(%ebp), %ecx mov %ecx, -16(%ebp) movl $0, -12(%ebp) mov 16(%ebp), %ecx mov %ecx, -8(%ebp) mov $32, %eax mov %ebp, %ecx sub $24, %ecx int $0x80 pop %ebp ret END(cloudabi_sys_mem_advise) ENTRY(cloudabi_sys_mem_map) push %ebp mov %esp, %ebp mov 8(%ebp), %ecx mov %ecx, -48(%ebp) movl $0, -44(%ebp) mov 12(%ebp), %ecx mov %ecx, -40(%ebp) movl $0, -36(%ebp) mov 16(%ebp), %ecx mov %ecx, -32(%ebp) mov 20(%ebp), %ecx mov %ecx, -24(%ebp) mov 24(%ebp), %ecx mov %ecx, -16(%ebp) mov 28(%ebp), %ecx mov %ecx, -8(%ebp) mov 32(%ebp), %ecx mov %ecx, -4(%ebp) mov $33, %eax mov %ebp, %ecx sub $48, %ecx int $0x80 test %eax, %eax jnz 1f mov 36(%ebp), %ecx mov -48(%ebp), %edx mov %edx, 0(%ecx) 1: pop %ebp ret END(cloudabi_sys_mem_map) ENTRY(cloudabi_sys_mem_protect) push %ebp mov %esp, %ebp mov 8(%ebp), %ecx mov %ecx, -24(%ebp) movl $0, -20(%ebp) mov 12(%ebp), %ecx mov %ecx, -16(%ebp) movl $0, -12(%ebp) mov 16(%ebp), %ecx mov %ecx, -8(%ebp) mov $34, %eax mov %ebp, %ecx sub $24, %ecx int $0x80 pop %ebp ret END(cloudabi_sys_mem_protect) ENTRY(cloudabi_sys_mem_sync) push %ebp mov %esp, %ebp mov 8(%ebp), %ecx mov %ecx, -24(%ebp) movl $0, -20(%ebp) mov 12(%ebp), %ecx mov %ecx, -16(%ebp) movl $0, -12(%ebp) mov 16(%ebp), %ecx mov %ecx, -8(%ebp) mov $35, %eax mov %ebp, %ecx sub $24, %ecx int $0x80 pop %ebp ret END(cloudabi_sys_mem_sync) ENTRY(cloudabi_sys_mem_unmap) push %ebp mov %esp, %ebp mov 8(%ebp), %ecx mov %ecx, -16(%ebp) movl $0, -12(%ebp) mov 12(%ebp), %ecx mov %ecx, -8(%ebp) movl $0, -4(%ebp) mov $36, %eax mov %ebp, %ecx sub $16, %ecx int $0x80 pop %ebp ret END(cloudabi_sys_mem_unmap) ENTRY(cloudabi_sys_poll) push %ebp mov %esp, %ebp mov 8(%ebp), %ecx mov %ecx, -24(%ebp) movl $0, -20(%ebp) mov 12(%ebp), %ecx mov %ecx, -16(%ebp) movl $0, -12(%ebp) mov 16(%ebp), %ecx mov %ecx, -8(%ebp) movl $0, -4(%ebp) mov $37, %eax mov %ebp, %ecx sub $24, %ecx int $0x80 test %eax, %eax jnz 1f mov 20(%ebp), %ecx mov -24(%ebp), %edx mov %edx, 0(%ecx) 1: pop %ebp ret END(cloudabi_sys_poll) ENTRY(cloudabi_sys_poll_fd) push %ebp mov %esp, %ebp mov 8(%ebp), %ecx mov %ecx, -48(%ebp) mov 12(%ebp), %ecx mov %ecx, -40(%ebp) movl $0, -36(%ebp) mov 16(%ebp), %ecx mov %ecx, -32(%ebp) movl $0, -28(%ebp) mov 20(%ebp), %ecx mov %ecx, -24(%ebp) movl $0, -20(%ebp) mov 24(%ebp), %ecx mov %ecx, -16(%ebp) movl $0, -12(%ebp) mov 28(%ebp), %ecx mov %ecx, -8(%ebp) movl $0, -4(%ebp) mov $38, %eax mov %ebp, %ecx sub $48, %ecx int $0x80 test %eax, %eax jnz 1f mov 32(%ebp), %ecx mov -48(%ebp), %edx mov %edx, 0(%ecx) 1: pop %ebp ret END(cloudabi_sys_poll_fd) ENTRY(cloudabi_sys_proc_exec) push %ebp mov %esp, %ebp mov 8(%ebp), %ecx mov %ecx, -40(%ebp) mov 12(%ebp), %ecx mov %ecx, -32(%ebp) movl $0, -28(%ebp) mov 16(%ebp), %ecx mov %ecx, -24(%ebp) movl $0, -20(%ebp) mov 20(%ebp), %ecx mov %ecx, -16(%ebp) movl $0, -12(%ebp) mov 24(%ebp), %ecx mov %ecx, -8(%ebp) 
movl $0, -4(%ebp) mov $39, %eax mov %ebp, %ecx sub $40, %ecx int $0x80 pop %ebp ret END(cloudabi_sys_proc_exec) ENTRY(cloudabi_sys_proc_exit) push %ebp mov %esp, %ebp mov 8(%ebp), %ecx mov %ecx, -16(%ebp) mov $40, %eax mov %ebp, %ecx sub $16, %ecx int $0x80 END(cloudabi_sys_proc_exit) ENTRY(cloudabi_sys_proc_fork) push %ebp mov %esp, %ebp mov $41, %eax mov %ebp, %ecx sub $16, %ecx int $0x80 test %eax, %eax jnz 1f mov 8(%ebp), %ecx mov -16(%ebp), %edx mov %edx, 0(%ecx) mov 12(%ebp), %ecx mov -8(%ebp), %edx mov %edx, 0(%ecx) 1: pop %ebp ret END(cloudabi_sys_proc_fork) ENTRY(cloudabi_sys_proc_raise) push %ebp mov %esp, %ebp mov 8(%ebp), %ecx mov %ecx, -16(%ebp) mov $42, %eax mov %ebp, %ecx sub $16, %ecx int $0x80 pop %ebp ret END(cloudabi_sys_proc_raise) ENTRY(cloudabi_sys_random_get) push %ebp mov %esp, %ebp mov 8(%ebp), %ecx mov %ecx, -16(%ebp) movl $0, -12(%ebp) mov 12(%ebp), %ecx mov %ecx, -8(%ebp) movl $0, -4(%ebp) mov $43, %eax mov %ebp, %ecx sub $16, %ecx int $0x80 pop %ebp ret END(cloudabi_sys_random_get) ENTRY(cloudabi_sys_sock_accept) push %ebp mov %esp, %ebp mov 8(%ebp), %ecx mov %ecx, -16(%ebp) mov 12(%ebp), %ecx mov %ecx, -8(%ebp) movl $0, -4(%ebp) mov $44, %eax mov %ebp, %ecx sub $16, %ecx int $0x80 test %eax, %eax jnz 1f mov 16(%ebp), %ecx mov -16(%ebp), %edx mov %edx, 0(%ecx) 1: pop %ebp ret END(cloudabi_sys_sock_accept) -ENTRY(cloudabi_sys_sock_bind) - push %ebp - mov %esp, %ebp - mov 8(%ebp), %ecx - mov %ecx, -32(%ebp) - mov 12(%ebp), %ecx - mov %ecx, -24(%ebp) - mov 16(%ebp), %ecx - mov %ecx, -16(%ebp) - movl $0, -12(%ebp) - mov 20(%ebp), %ecx - mov %ecx, -8(%ebp) - movl $0, -4(%ebp) - mov $45, %eax - mov %ebp, %ecx - sub $32, %ecx - int $0x80 - pop %ebp - ret -END(cloudabi_sys_sock_bind) - -ENTRY(cloudabi_sys_sock_connect) - push %ebp - mov %esp, %ebp - mov 8(%ebp), %ecx - mov %ecx, -32(%ebp) - mov 12(%ebp), %ecx - mov %ecx, -24(%ebp) - mov 16(%ebp), %ecx - mov %ecx, -16(%ebp) - movl $0, -12(%ebp) - mov 20(%ebp), %ecx - mov %ecx, -8(%ebp) - movl $0, -4(%ebp) - mov $46, %eax - mov %ebp, %ecx - sub $32, %ecx - int $0x80 - pop %ebp - ret -END(cloudabi_sys_sock_connect) - -ENTRY(cloudabi_sys_sock_listen) - push %ebp - mov %esp, %ebp - mov 8(%ebp), %ecx - mov %ecx, -16(%ebp) - mov 12(%ebp), %ecx - mov %ecx, -8(%ebp) - mov $47, %eax - mov %ebp, %ecx - sub $16, %ecx - int $0x80 - pop %ebp - ret -END(cloudabi_sys_sock_listen) - ENTRY(cloudabi_sys_sock_recv) push %ebp mov %esp, %ebp mov 8(%ebp), %ecx mov %ecx, -24(%ebp) mov 12(%ebp), %ecx mov %ecx, -16(%ebp) movl $0, -12(%ebp) mov 16(%ebp), %ecx mov %ecx, -8(%ebp) movl $0, -4(%ebp) - mov $48, %eax + mov $45, %eax mov %ebp, %ecx sub $24, %ecx int $0x80 pop %ebp ret END(cloudabi_sys_sock_recv) ENTRY(cloudabi_sys_sock_send) push %ebp mov %esp, %ebp mov 8(%ebp), %ecx mov %ecx, -24(%ebp) mov 12(%ebp), %ecx mov %ecx, -16(%ebp) movl $0, -12(%ebp) mov 16(%ebp), %ecx mov %ecx, -8(%ebp) movl $0, -4(%ebp) - mov $49, %eax + mov $46, %eax mov %ebp, %ecx sub $24, %ecx int $0x80 pop %ebp ret END(cloudabi_sys_sock_send) ENTRY(cloudabi_sys_sock_shutdown) push %ebp mov %esp, %ebp mov 8(%ebp), %ecx mov %ecx, -16(%ebp) mov 12(%ebp), %ecx mov %ecx, -8(%ebp) - mov $50, %eax + mov $47, %eax mov %ebp, %ecx sub $16, %ecx int $0x80 pop %ebp ret END(cloudabi_sys_sock_shutdown) ENTRY(cloudabi_sys_sock_stat_get) push %ebp mov %esp, %ebp mov 8(%ebp), %ecx mov %ecx, -24(%ebp) mov 12(%ebp), %ecx mov %ecx, -16(%ebp) movl $0, -12(%ebp) mov 16(%ebp), %ecx mov %ecx, -8(%ebp) - mov $51, %eax + mov $48, %eax mov %ebp, %ecx sub $24, %ecx int $0x80 pop %ebp ret 
END(cloudabi_sys_sock_stat_get) ENTRY(cloudabi_sys_thread_create) push %ebp mov %esp, %ebp mov 8(%ebp), %ecx mov %ecx, -16(%ebp) movl $0, -12(%ebp) - mov $52, %eax + mov $49, %eax mov %ebp, %ecx sub $16, %ecx int $0x80 test %eax, %eax jnz 1f mov 12(%ebp), %ecx mov -16(%ebp), %edx mov %edx, 0(%ecx) 1: pop %ebp ret END(cloudabi_sys_thread_create) ENTRY(cloudabi_sys_thread_exit) push %ebp mov %esp, %ebp mov 8(%ebp), %ecx mov %ecx, -16(%ebp) movl $0, -12(%ebp) mov 12(%ebp), %ecx mov %ecx, -8(%ebp) - mov $53, %eax + mov $50, %eax mov %ebp, %ecx sub $16, %ecx int $0x80 END(cloudabi_sys_thread_exit) ENTRY(cloudabi_sys_thread_yield) push %ebp mov %esp, %ebp - mov $54, %eax + mov $51, %eax mov %ebp, %ecx sub $16, %ecx int $0x80 pop %ebp ret END(cloudabi_sys_thread_yield) Index: projects/runtime-coverage/sys/contrib/cloudabi/cloudabi_vdso_x86_64.S =================================================================== --- projects/runtime-coverage/sys/contrib/cloudabi/cloudabi_vdso_x86_64.S (revision 322921) +++ projects/runtime-coverage/sys/contrib/cloudabi/cloudabi_vdso_x86_64.S (revision 322922) @@ -1,499 +1,479 @@ // Copyright (c) 2016 Nuxi (https://nuxi.nl/) and contributors. // // Redistribution and use in source and binary forms, with or without // modification, are permitted provided that the following conditions // are met: // 1. Redistributions of source code must retain the above copyright // notice, this list of conditions and the following disclaimer. // 2. Redistributions in binary form must reproduce the above copyright // notice, this list of conditions and the following disclaimer in the // documentation and/or other materials provided with the distribution. // // THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND // ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE // IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE // ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE // FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL // DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS // OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) // HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT // LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY // OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF // SUCH DAMAGE. // // This file is automatically generated. Do not edit. // // Source: https://github.com/NuxiNL/cloudabi #define ENTRY(name) \ .text; \ .p2align 4, 0x90; \ .global name; \ .type name, @function; \ name: #define END(name) .size name, . 
- name ENTRY(cloudabi_sys_clock_res_get) push %rsi mov $0, %eax syscall pop %rcx jc 1f mov %rax, (%rcx) xor %eax, %eax 1: ret END(cloudabi_sys_clock_res_get) ENTRY(cloudabi_sys_clock_time_get) push %rdx mov $1, %eax syscall pop %rcx jc 1f mov %rax, (%rcx) xor %eax, %eax 1: ret END(cloudabi_sys_clock_time_get) ENTRY(cloudabi_sys_condvar_signal) mov $2, %eax syscall ret END(cloudabi_sys_condvar_signal) ENTRY(cloudabi_sys_fd_close) mov $3, %eax syscall ret END(cloudabi_sys_fd_close) ENTRY(cloudabi_sys_fd_create1) push %rsi mov $4, %eax syscall pop %rcx jc 1f mov %eax, (%rcx) xor %eax, %eax 1: ret END(cloudabi_sys_fd_create1) ENTRY(cloudabi_sys_fd_create2) push %rsi push %rdx mov $5, %eax syscall pop %rsi pop %rcx jc 1f mov %eax, (%rcx) mov %edx, (%rsi) xor %eax, %eax 1: ret END(cloudabi_sys_fd_create2) ENTRY(cloudabi_sys_fd_datasync) mov $6, %eax syscall ret END(cloudabi_sys_fd_datasync) ENTRY(cloudabi_sys_fd_dup) push %rsi mov $7, %eax syscall pop %rcx jc 1f mov %eax, (%rcx) xor %eax, %eax 1: ret END(cloudabi_sys_fd_dup) ENTRY(cloudabi_sys_fd_pread) mov %rcx, %r10 push %r8 mov $8, %eax syscall pop %rcx jc 1f mov %rax, (%rcx) xor %eax, %eax 1: ret END(cloudabi_sys_fd_pread) ENTRY(cloudabi_sys_fd_pwrite) mov %rcx, %r10 push %r8 mov $9, %eax syscall pop %rcx jc 1f mov %rax, (%rcx) xor %eax, %eax 1: ret END(cloudabi_sys_fd_pwrite) ENTRY(cloudabi_sys_fd_read) push %rcx mov $10, %eax syscall pop %rcx jc 1f mov %rax, (%rcx) xor %eax, %eax 1: ret END(cloudabi_sys_fd_read) ENTRY(cloudabi_sys_fd_replace) mov $11, %eax syscall ret END(cloudabi_sys_fd_replace) ENTRY(cloudabi_sys_fd_seek) push %rcx mov $12, %eax syscall pop %rcx jc 1f mov %rax, (%rcx) xor %eax, %eax 1: ret END(cloudabi_sys_fd_seek) ENTRY(cloudabi_sys_fd_stat_get) mov $13, %eax syscall ret END(cloudabi_sys_fd_stat_get) ENTRY(cloudabi_sys_fd_stat_put) mov $14, %eax syscall ret END(cloudabi_sys_fd_stat_put) ENTRY(cloudabi_sys_fd_sync) mov $15, %eax syscall ret END(cloudabi_sys_fd_sync) ENTRY(cloudabi_sys_fd_write) push %rcx mov $16, %eax syscall pop %rcx jc 1f mov %rax, (%rcx) xor %eax, %eax 1: ret END(cloudabi_sys_fd_write) ENTRY(cloudabi_sys_file_advise) mov %rcx, %r10 mov $17, %eax syscall ret END(cloudabi_sys_file_advise) ENTRY(cloudabi_sys_file_allocate) mov $18, %eax syscall ret END(cloudabi_sys_file_allocate) ENTRY(cloudabi_sys_file_create) mov %rcx, %r10 mov $19, %eax syscall ret END(cloudabi_sys_file_create) ENTRY(cloudabi_sys_file_link) mov %rcx, %r10 mov $20, %eax syscall ret END(cloudabi_sys_file_link) ENTRY(cloudabi_sys_file_open) mov %rcx, %r10 push %r9 mov $21, %eax syscall pop %rcx jc 1f mov %eax, (%rcx) xor %eax, %eax 1: ret END(cloudabi_sys_file_open) ENTRY(cloudabi_sys_file_readdir) mov %rcx, %r10 push %r8 mov $22, %eax syscall pop %rcx jc 1f mov %rax, (%rcx) xor %eax, %eax 1: ret END(cloudabi_sys_file_readdir) ENTRY(cloudabi_sys_file_readlink) mov %rcx, %r10 push %r9 mov $23, %eax syscall pop %rcx jc 1f mov %rax, (%rcx) xor %eax, %eax 1: ret END(cloudabi_sys_file_readlink) ENTRY(cloudabi_sys_file_rename) mov %rcx, %r10 mov $24, %eax syscall ret END(cloudabi_sys_file_rename) ENTRY(cloudabi_sys_file_stat_fget) mov $25, %eax syscall ret END(cloudabi_sys_file_stat_fget) ENTRY(cloudabi_sys_file_stat_fput) mov $26, %eax syscall ret END(cloudabi_sys_file_stat_fput) ENTRY(cloudabi_sys_file_stat_get) mov %rcx, %r10 mov $27, %eax syscall ret END(cloudabi_sys_file_stat_get) ENTRY(cloudabi_sys_file_stat_put) mov %rcx, %r10 mov $28, %eax syscall ret END(cloudabi_sys_file_stat_put) ENTRY(cloudabi_sys_file_symlink) mov %rcx, %r10 mov 
$29, %eax syscall ret END(cloudabi_sys_file_symlink) ENTRY(cloudabi_sys_file_unlink) mov %rcx, %r10 mov $30, %eax syscall ret END(cloudabi_sys_file_unlink) ENTRY(cloudabi_sys_lock_unlock) mov $31, %eax syscall ret END(cloudabi_sys_lock_unlock) ENTRY(cloudabi_sys_mem_advise) mov $32, %eax syscall ret END(cloudabi_sys_mem_advise) ENTRY(cloudabi_sys_mem_map) mov %rcx, %r10 mov $33, %eax syscall jc 1f mov 8(%rsp), %rcx mov %rax, (%rcx) xor %eax, %eax 1: ret END(cloudabi_sys_mem_map) ENTRY(cloudabi_sys_mem_protect) mov $34, %eax syscall ret END(cloudabi_sys_mem_protect) ENTRY(cloudabi_sys_mem_sync) mov $35, %eax syscall ret END(cloudabi_sys_mem_sync) ENTRY(cloudabi_sys_mem_unmap) mov $36, %eax syscall ret END(cloudabi_sys_mem_unmap) ENTRY(cloudabi_sys_poll) push %rcx mov $37, %eax syscall pop %rcx jc 1f mov %rax, (%rcx) xor %eax, %eax 1: ret END(cloudabi_sys_poll) ENTRY(cloudabi_sys_poll_fd) mov %rcx, %r10 mov $38, %eax syscall jc 1f mov 8(%rsp), %rcx mov %rax, (%rcx) xor %eax, %eax 1: ret END(cloudabi_sys_poll_fd) ENTRY(cloudabi_sys_proc_exec) mov %rcx, %r10 mov $39, %eax syscall ret END(cloudabi_sys_proc_exec) ENTRY(cloudabi_sys_proc_exit) mov $40, %eax syscall END(cloudabi_sys_proc_exit) ENTRY(cloudabi_sys_proc_fork) push %rdi push %rsi mov $41, %eax syscall pop %rsi pop %rcx jc 1f mov %eax, (%rcx) mov %edx, (%rsi) xor %eax, %eax 1: ret END(cloudabi_sys_proc_fork) ENTRY(cloudabi_sys_proc_raise) mov $42, %eax syscall ret END(cloudabi_sys_proc_raise) ENTRY(cloudabi_sys_random_get) mov $43, %eax syscall ret END(cloudabi_sys_random_get) ENTRY(cloudabi_sys_sock_accept) push %rdx mov $44, %eax syscall pop %rcx jc 1f mov %eax, (%rcx) xor %eax, %eax 1: ret END(cloudabi_sys_sock_accept) -ENTRY(cloudabi_sys_sock_bind) - mov %rcx, %r10 +ENTRY(cloudabi_sys_sock_recv) mov $45, %eax syscall ret -END(cloudabi_sys_sock_bind) - -ENTRY(cloudabi_sys_sock_connect) - mov %rcx, %r10 - mov $46, %eax - syscall - ret -END(cloudabi_sys_sock_connect) - -ENTRY(cloudabi_sys_sock_listen) - mov $47, %eax - syscall - ret -END(cloudabi_sys_sock_listen) - -ENTRY(cloudabi_sys_sock_recv) - mov $48, %eax - syscall - ret END(cloudabi_sys_sock_recv) ENTRY(cloudabi_sys_sock_send) - mov $49, %eax + mov $46, %eax syscall ret END(cloudabi_sys_sock_send) ENTRY(cloudabi_sys_sock_shutdown) - mov $50, %eax + mov $47, %eax syscall ret END(cloudabi_sys_sock_shutdown) ENTRY(cloudabi_sys_sock_stat_get) - mov $51, %eax + mov $48, %eax syscall ret END(cloudabi_sys_sock_stat_get) ENTRY(cloudabi_sys_thread_create) push %rsi - mov $52, %eax + mov $49, %eax syscall pop %rcx jc 1f mov %eax, (%rcx) xor %eax, %eax 1: ret END(cloudabi_sys_thread_create) ENTRY(cloudabi_sys_thread_exit) - mov $53, %eax + mov $50, %eax syscall END(cloudabi_sys_thread_exit) ENTRY(cloudabi_sys_thread_yield) - mov $54, %eax + mov $51, %eax syscall ret END(cloudabi_sys_thread_yield) Index: projects/runtime-coverage/sys/contrib/cloudabi/syscalls32.master =================================================================== --- projects/runtime-coverage/sys/contrib/cloudabi/syscalls32.master (revision 322921) +++ projects/runtime-coverage/sys/contrib/cloudabi/syscalls32.master (revision 322922) @@ -1,307 +1,291 @@ $FreeBSD$ ; Copyright (c) 2016 Nuxi (https://nuxi.nl/) and contributors. ; ; Redistribution and use in source and binary forms, with or without ; modification, are permitted provided that the following conditions ; are met: ; 1. Redistributions of source code must retain the above copyright ; notice, this list of conditions and the following disclaimer. ; 2. 
Redistributions in binary form must reproduce the above copyright ; notice, this list of conditions and the following disclaimer in the ; documentation and/or other materials provided with the distribution. ; ; THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND ; ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE ; IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ; ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE ; FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL ; DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS ; OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) ; HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT ; LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY ; OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF ; SUCH DAMAGE. ; ; This file is automatically generated. Do not edit. ; ; Source: https://github.com/NuxiNL/cloudabi #include #include #include #include 0 AUE_NULL STD { cloudabi_timestamp_t \ cloudabi_sys_clock_res_get( \ cloudabi_clockid_t clock_id); } 1 AUE_NULL STD { cloudabi_timestamp_t \ cloudabi_sys_clock_time_get( \ cloudabi_clockid_t clock_id, \ cloudabi_timestamp_t precision); } 2 AUE_NULL STD { void cloudabi_sys_condvar_signal( \ cloudabi_condvar_t *condvar, \ cloudabi_scope_t scope, \ cloudabi_nthreads_t nwaiters); } 3 AUE_NULL STD { void cloudabi_sys_fd_close( \ cloudabi_fd_t fd); } 4 AUE_NULL STD { cloudabi_fd_t cloudabi_sys_fd_create1( \ cloudabi_filetype_t type); } 5 AUE_NULL STD { void cloudabi_sys_fd_create2( \ cloudabi_filetype_t type); } 6 AUE_NULL STD { void cloudabi_sys_fd_datasync( \ cloudabi_fd_t fd); } 7 AUE_NULL STD { cloudabi_fd_t cloudabi_sys_fd_dup( \ cloudabi_fd_t from); } 8 AUE_NULL STD { size_t cloudabi32_sys_fd_pread( \ cloudabi_fd_t fd, \ const cloudabi32_iovec_t *iovs, \ size_t iovs_len, \ cloudabi_filesize_t offset); } 9 AUE_NULL STD { size_t cloudabi32_sys_fd_pwrite( \ cloudabi_fd_t fd, \ const cloudabi32_ciovec_t *iovs, \ size_t iovs_len, \ cloudabi_filesize_t offset); } 10 AUE_NULL STD { size_t cloudabi32_sys_fd_read( \ cloudabi_fd_t fd, \ const cloudabi32_iovec_t *iovs, \ size_t iovs_len); } 11 AUE_NULL STD { void cloudabi_sys_fd_replace( \ cloudabi_fd_t from, \ cloudabi_fd_t to); } 12 AUE_NULL STD { cloudabi_filesize_t \ cloudabi_sys_fd_seek( \ cloudabi_fd_t fd, \ cloudabi_filedelta_t offset, \ cloudabi_whence_t whence); } 13 AUE_NULL STD { void cloudabi_sys_fd_stat_get( \ cloudabi_fd_t fd, \ cloudabi_fdstat_t *buf); } 14 AUE_NULL STD { void cloudabi_sys_fd_stat_put( \ cloudabi_fd_t fd, \ const cloudabi_fdstat_t *buf, \ cloudabi_fdsflags_t flags); } 15 AUE_NULL STD { void cloudabi_sys_fd_sync( \ cloudabi_fd_t fd); } 16 AUE_NULL STD { size_t cloudabi32_sys_fd_write( \ cloudabi_fd_t fd, \ const cloudabi32_ciovec_t *iovs, \ size_t iovs_len); } 17 AUE_NULL STD { void cloudabi_sys_file_advise( \ cloudabi_fd_t fd, \ cloudabi_filesize_t offset, \ cloudabi_filesize_t len, \ cloudabi_advice_t advice); } 18 AUE_NULL STD { void cloudabi_sys_file_allocate( \ cloudabi_fd_t fd, \ cloudabi_filesize_t offset, \ cloudabi_filesize_t len); } 19 AUE_NULL STD { void cloudabi_sys_file_create( \ cloudabi_fd_t fd, \ const char *path, \ size_t path_len, \ cloudabi_filetype_t type); } 20 AUE_NULL STD { void cloudabi_sys_file_link( \ cloudabi_lookup_t fd1, \ const char *path1, \ size_t path1_len, \ cloudabi_fd_t fd2, \ const char *path2, \ 
size_t path2_len); } 21 AUE_NULL STD { cloudabi_fd_t cloudabi_sys_file_open( \ cloudabi_lookup_t dirfd, \ const char *path, \ size_t path_len, \ cloudabi_oflags_t oflags, \ const cloudabi_fdstat_t *fds); } 22 AUE_NULL STD { size_t cloudabi_sys_file_readdir( \ cloudabi_fd_t fd, \ void *buf, \ size_t buf_len, \ cloudabi_dircookie_t cookie); } 23 AUE_NULL STD { size_t cloudabi_sys_file_readlink( \ cloudabi_fd_t fd, \ const char *path, \ size_t path_len, \ char *buf, \ size_t buf_len); } 24 AUE_NULL STD { void cloudabi_sys_file_rename( \ cloudabi_fd_t fd1, \ const char *path1, \ size_t path1_len, \ cloudabi_fd_t fd2, \ const char *path2, \ size_t path2_len); } 25 AUE_NULL STD { void cloudabi_sys_file_stat_fget( \ cloudabi_fd_t fd, \ cloudabi_filestat_t *buf); } 26 AUE_NULL STD { void cloudabi_sys_file_stat_fput( \ cloudabi_fd_t fd, \ const cloudabi_filestat_t *buf, \ cloudabi_fsflags_t flags); } 27 AUE_NULL STD { void cloudabi_sys_file_stat_get( \ cloudabi_lookup_t fd, \ const char *path, \ size_t path_len, \ cloudabi_filestat_t *buf); } 28 AUE_NULL STD { void cloudabi_sys_file_stat_put( \ cloudabi_lookup_t fd, \ const char *path, \ size_t path_len, \ const cloudabi_filestat_t *buf, \ cloudabi_fsflags_t flags); } 29 AUE_NULL STD { void cloudabi_sys_file_symlink( \ const char *path1, \ size_t path1_len, \ cloudabi_fd_t fd, \ const char *path2, \ size_t path2_len); } 30 AUE_NULL STD { void cloudabi_sys_file_unlink( \ cloudabi_fd_t fd, \ const char *path, \ size_t path_len, \ cloudabi_ulflags_t flags); } 31 AUE_NULL STD { void cloudabi_sys_lock_unlock( \ cloudabi_lock_t *lock, \ cloudabi_scope_t scope); } 32 AUE_NULL STD { void cloudabi_sys_mem_advise( \ void *mapping, \ size_t mapping_len, \ cloudabi_advice_t advice); } 33 AUE_NULL STD { void cloudabi_sys_mem_map( \ void *addr, \ size_t len, \ cloudabi_mprot_t prot, \ cloudabi_mflags_t flags, \ cloudabi_fd_t fd, \ cloudabi_filesize_t off); } 34 AUE_NULL STD { void cloudabi_sys_mem_protect( \ void *mapping, \ size_t mapping_len, \ cloudabi_mprot_t prot); } 35 AUE_NULL STD { void cloudabi_sys_mem_sync( \ void *mapping, \ size_t mapping_len, \ cloudabi_msflags_t flags); } 36 AUE_NULL STD { void cloudabi_sys_mem_unmap( \ void *mapping, \ size_t mapping_len); } 37 AUE_NULL STD { size_t cloudabi32_sys_poll( \ const cloudabi32_subscription_t *in, \ cloudabi32_event_t *out, \ size_t nsubscriptions); } 38 AUE_NULL STD { size_t cloudabi32_sys_poll_fd( \ cloudabi_fd_t fd, \ const cloudabi32_subscription_t *in, \ size_t in_len, \ cloudabi32_event_t *out, \ size_t out_len, \ const cloudabi32_subscription_t *timeout); } 39 AUE_NULL STD { void cloudabi_sys_proc_exec( \ cloudabi_fd_t fd, \ const void *data, \ size_t data_len, \ const cloudabi_fd_t *fds, \ size_t fds_len); } 40 AUE_NULL STD { void cloudabi_sys_proc_exit( \ cloudabi_exitcode_t rval); } 41 AUE_NULL STD { void cloudabi_sys_proc_fork(); } 42 AUE_NULL STD { void cloudabi_sys_proc_raise( \ cloudabi_signal_t sig); } 43 AUE_NULL STD { void cloudabi_sys_random_get( \ void *buf, \ size_t buf_len); } 44 AUE_NULL STD { cloudabi_fd_t cloudabi_sys_sock_accept( \ cloudabi_fd_t sock, \ void *unused); } -45 AUE_NULL STD { void cloudabi_sys_sock_bind( \ +45 AUE_NULL STD { void cloudabi32_sys_sock_recv( \ cloudabi_fd_t sock, \ - cloudabi_fd_t fd, \ - const char *path, \ - size_t path_len); } - -46 AUE_NULL STD { void cloudabi_sys_sock_connect( \ - cloudabi_fd_t sock, \ - cloudabi_fd_t fd, \ - const char *path, \ - size_t path_len); } - -47 AUE_NULL STD { void cloudabi_sys_sock_listen( \ - cloudabi_fd_t sock, \ - 
		cloudabi_backlog_t backlog); }
-
-48	AUE_NULL	STD	{ void cloudabi32_sys_sock_recv( \
-				cloudabi_fd_t sock, \
				const cloudabi32_recv_in_t *in, \
				cloudabi32_recv_out_t *out); }

-49	AUE_NULL	STD	{ void cloudabi32_sys_sock_send( \
+46	AUE_NULL	STD	{ void cloudabi32_sys_sock_send( \
				cloudabi_fd_t sock, \
				const cloudabi32_send_in_t *in, \
				cloudabi32_send_out_t *out); }

-50	AUE_NULL	STD	{ void cloudabi_sys_sock_shutdown( \
+47	AUE_NULL	STD	{ void cloudabi_sys_sock_shutdown( \
				cloudabi_fd_t sock, \
				cloudabi_sdflags_t how); }

-51	AUE_NULL	STD	{ void cloudabi_sys_sock_stat_get( \
+48	AUE_NULL	STD	{ void cloudabi_sys_sock_stat_get( \
				cloudabi_fd_t sock, \
				cloudabi_sockstat_t *buf, \
				cloudabi_ssflags_t flags); }

-52	AUE_NULL	STD	{ cloudabi_tid_t cloudabi32_sys_thread_create( \
+49	AUE_NULL	STD	{ cloudabi_tid_t cloudabi32_sys_thread_create( \
				cloudabi32_threadattr_t *attr); }

-53	AUE_NULL	STD	{ void cloudabi_sys_thread_exit( \
+50	AUE_NULL	STD	{ void cloudabi_sys_thread_exit( \
				cloudabi_lock_t *lock, \
				cloudabi_scope_t scope); }

-54	AUE_NULL	STD	{ void cloudabi_sys_thread_yield(); }
+51	AUE_NULL	STD	{ void cloudabi_sys_thread_yield(); }
Index: projects/runtime-coverage/sys/contrib/cloudabi/syscalls64.master
===================================================================
--- projects/runtime-coverage/sys/contrib/cloudabi/syscalls64.master	(revision 322921)
+++ projects/runtime-coverage/sys/contrib/cloudabi/syscalls64.master	(revision 322922)
@@ -1,307 +1,291 @@
$FreeBSD$
; Copyright (c) 2016 Nuxi (https://nuxi.nl/) and contributors.
;
; Redistribution and use in source and binary forms, with or without
; modification, are permitted provided that the following conditions
; are met:
; 1. Redistributions of source code must retain the above copyright
;    notice, this list of conditions and the following disclaimer.
; 2. Redistributions in binary form must reproduce the above copyright
;    notice, this list of conditions and the following disclaimer in the
;    documentation and/or other materials provided with the distribution.
;
; THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
; ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
; IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
; ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
; FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
; DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
; OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
; HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
; LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
; OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
; SUCH DAMAGE.
;
; This file is automatically generated. Do not edit.
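;
; With this revision the sock_bind, sock_connect and sock_listen entries
; (formerly syscalls 45-47) are dropped and the remaining socket and thread
; syscalls are renumbered, in this table exactly as in syscalls32.master
; above:
;
;	sock_recv	48 -> 45	sock_stat_get	51 -> 48
;	sock_send	49 -> 46	thread_create	52 -> 49
;	sock_shutdown	50 -> 47	thread_exit	53 -> 50
;					thread_yield	54 -> 51
;
; Renumbering the table presumably breaks the ABI for CloudABI binaries
; built against the old numbers, so kernel and userland need to be updated
; in lockstep.
;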
; ; Source: https://github.com/NuxiNL/cloudabi #include #include #include #include 0 AUE_NULL STD { cloudabi_timestamp_t \ cloudabi_sys_clock_res_get( \ cloudabi_clockid_t clock_id); } 1 AUE_NULL STD { cloudabi_timestamp_t \ cloudabi_sys_clock_time_get( \ cloudabi_clockid_t clock_id, \ cloudabi_timestamp_t precision); } 2 AUE_NULL STD { void cloudabi_sys_condvar_signal( \ cloudabi_condvar_t *condvar, \ cloudabi_scope_t scope, \ cloudabi_nthreads_t nwaiters); } 3 AUE_NULL STD { void cloudabi_sys_fd_close( \ cloudabi_fd_t fd); } 4 AUE_NULL STD { cloudabi_fd_t cloudabi_sys_fd_create1( \ cloudabi_filetype_t type); } 5 AUE_NULL STD { void cloudabi_sys_fd_create2( \ cloudabi_filetype_t type); } 6 AUE_NULL STD { void cloudabi_sys_fd_datasync( \ cloudabi_fd_t fd); } 7 AUE_NULL STD { cloudabi_fd_t cloudabi_sys_fd_dup( \ cloudabi_fd_t from); } 8 AUE_NULL STD { size_t cloudabi64_sys_fd_pread( \ cloudabi_fd_t fd, \ const cloudabi64_iovec_t *iovs, \ size_t iovs_len, \ cloudabi_filesize_t offset); } 9 AUE_NULL STD { size_t cloudabi64_sys_fd_pwrite( \ cloudabi_fd_t fd, \ const cloudabi64_ciovec_t *iovs, \ size_t iovs_len, \ cloudabi_filesize_t offset); } 10 AUE_NULL STD { size_t cloudabi64_sys_fd_read( \ cloudabi_fd_t fd, \ const cloudabi64_iovec_t *iovs, \ size_t iovs_len); } 11 AUE_NULL STD { void cloudabi_sys_fd_replace( \ cloudabi_fd_t from, \ cloudabi_fd_t to); } 12 AUE_NULL STD { cloudabi_filesize_t \ cloudabi_sys_fd_seek( \ cloudabi_fd_t fd, \ cloudabi_filedelta_t offset, \ cloudabi_whence_t whence); } 13 AUE_NULL STD { void cloudabi_sys_fd_stat_get( \ cloudabi_fd_t fd, \ cloudabi_fdstat_t *buf); } 14 AUE_NULL STD { void cloudabi_sys_fd_stat_put( \ cloudabi_fd_t fd, \ const cloudabi_fdstat_t *buf, \ cloudabi_fdsflags_t flags); } 15 AUE_NULL STD { void cloudabi_sys_fd_sync( \ cloudabi_fd_t fd); } 16 AUE_NULL STD { size_t cloudabi64_sys_fd_write( \ cloudabi_fd_t fd, \ const cloudabi64_ciovec_t *iovs, \ size_t iovs_len); } 17 AUE_NULL STD { void cloudabi_sys_file_advise( \ cloudabi_fd_t fd, \ cloudabi_filesize_t offset, \ cloudabi_filesize_t len, \ cloudabi_advice_t advice); } 18 AUE_NULL STD { void cloudabi_sys_file_allocate( \ cloudabi_fd_t fd, \ cloudabi_filesize_t offset, \ cloudabi_filesize_t len); } 19 AUE_NULL STD { void cloudabi_sys_file_create( \ cloudabi_fd_t fd, \ const char *path, \ size_t path_len, \ cloudabi_filetype_t type); } 20 AUE_NULL STD { void cloudabi_sys_file_link( \ cloudabi_lookup_t fd1, \ const char *path1, \ size_t path1_len, \ cloudabi_fd_t fd2, \ const char *path2, \ size_t path2_len); } 21 AUE_NULL STD { cloudabi_fd_t cloudabi_sys_file_open( \ cloudabi_lookup_t dirfd, \ const char *path, \ size_t path_len, \ cloudabi_oflags_t oflags, \ const cloudabi_fdstat_t *fds); } 22 AUE_NULL STD { size_t cloudabi_sys_file_readdir( \ cloudabi_fd_t fd, \ void *buf, \ size_t buf_len, \ cloudabi_dircookie_t cookie); } 23 AUE_NULL STD { size_t cloudabi_sys_file_readlink( \ cloudabi_fd_t fd, \ const char *path, \ size_t path_len, \ char *buf, \ size_t buf_len); } 24 AUE_NULL STD { void cloudabi_sys_file_rename( \ cloudabi_fd_t fd1, \ const char *path1, \ size_t path1_len, \ cloudabi_fd_t fd2, \ const char *path2, \ size_t path2_len); } 25 AUE_NULL STD { void cloudabi_sys_file_stat_fget( \ cloudabi_fd_t fd, \ cloudabi_filestat_t *buf); } 26 AUE_NULL STD { void cloudabi_sys_file_stat_fput( \ cloudabi_fd_t fd, \ const cloudabi_filestat_t *buf, \ cloudabi_fsflags_t flags); } 27 AUE_NULL STD { void cloudabi_sys_file_stat_get( \ cloudabi_lookup_t fd, \ const char *path, \ size_t path_len, \ 
cloudabi_filestat_t *buf); } 28 AUE_NULL STD { void cloudabi_sys_file_stat_put( \ cloudabi_lookup_t fd, \ const char *path, \ size_t path_len, \ const cloudabi_filestat_t *buf, \ cloudabi_fsflags_t flags); } 29 AUE_NULL STD { void cloudabi_sys_file_symlink( \ const char *path1, \ size_t path1_len, \ cloudabi_fd_t fd, \ const char *path2, \ size_t path2_len); } 30 AUE_NULL STD { void cloudabi_sys_file_unlink( \ cloudabi_fd_t fd, \ const char *path, \ size_t path_len, \ cloudabi_ulflags_t flags); } 31 AUE_NULL STD { void cloudabi_sys_lock_unlock( \ cloudabi_lock_t *lock, \ cloudabi_scope_t scope); } 32 AUE_NULL STD { void cloudabi_sys_mem_advise( \ void *mapping, \ size_t mapping_len, \ cloudabi_advice_t advice); } 33 AUE_NULL STD { void cloudabi_sys_mem_map( \ void *addr, \ size_t len, \ cloudabi_mprot_t prot, \ cloudabi_mflags_t flags, \ cloudabi_fd_t fd, \ cloudabi_filesize_t off); } 34 AUE_NULL STD { void cloudabi_sys_mem_protect( \ void *mapping, \ size_t mapping_len, \ cloudabi_mprot_t prot); } 35 AUE_NULL STD { void cloudabi_sys_mem_sync( \ void *mapping, \ size_t mapping_len, \ cloudabi_msflags_t flags); } 36 AUE_NULL STD { void cloudabi_sys_mem_unmap( \ void *mapping, \ size_t mapping_len); } 37 AUE_NULL STD { size_t cloudabi64_sys_poll( \ const cloudabi64_subscription_t *in, \ cloudabi64_event_t *out, \ size_t nsubscriptions); } 38 AUE_NULL STD { size_t cloudabi64_sys_poll_fd( \ cloudabi_fd_t fd, \ const cloudabi64_subscription_t *in, \ size_t in_len, \ cloudabi64_event_t *out, \ size_t out_len, \ const cloudabi64_subscription_t *timeout); } 39 AUE_NULL STD { void cloudabi_sys_proc_exec( \ cloudabi_fd_t fd, \ const void *data, \ size_t data_len, \ const cloudabi_fd_t *fds, \ size_t fds_len); } 40 AUE_NULL STD { void cloudabi_sys_proc_exit( \ cloudabi_exitcode_t rval); } 41 AUE_NULL STD { void cloudabi_sys_proc_fork(); } 42 AUE_NULL STD { void cloudabi_sys_proc_raise( \ cloudabi_signal_t sig); } 43 AUE_NULL STD { void cloudabi_sys_random_get( \ void *buf, \ size_t buf_len); } 44 AUE_NULL STD { cloudabi_fd_t cloudabi_sys_sock_accept( \ cloudabi_fd_t sock, \ void *unused); } -45 AUE_NULL STD { void cloudabi_sys_sock_bind( \ +45 AUE_NULL STD { void cloudabi64_sys_sock_recv( \ cloudabi_fd_t sock, \ - cloudabi_fd_t fd, \ - const char *path, \ - size_t path_len); } - -46 AUE_NULL STD { void cloudabi_sys_sock_connect( \ - cloudabi_fd_t sock, \ - cloudabi_fd_t fd, \ - const char *path, \ - size_t path_len); } - -47 AUE_NULL STD { void cloudabi_sys_sock_listen( \ - cloudabi_fd_t sock, \ - cloudabi_backlog_t backlog); } - -48 AUE_NULL STD { void cloudabi64_sys_sock_recv( \ - cloudabi_fd_t sock, \ const cloudabi64_recv_in_t *in, \ cloudabi64_recv_out_t *out); } -49 AUE_NULL STD { void cloudabi64_sys_sock_send( \ +46 AUE_NULL STD { void cloudabi64_sys_sock_send( \ cloudabi_fd_t sock, \ const cloudabi64_send_in_t *in, \ cloudabi64_send_out_t *out); } -50 AUE_NULL STD { void cloudabi_sys_sock_shutdown( \ +47 AUE_NULL STD { void cloudabi_sys_sock_shutdown( \ cloudabi_fd_t sock, \ cloudabi_sdflags_t how); } -51 AUE_NULL STD { void cloudabi_sys_sock_stat_get( \ +48 AUE_NULL STD { void cloudabi_sys_sock_stat_get( \ cloudabi_fd_t sock, \ cloudabi_sockstat_t *buf, \ cloudabi_ssflags_t flags); } -52 AUE_NULL STD { cloudabi_tid_t cloudabi64_sys_thread_create( \ +49 AUE_NULL STD { cloudabi_tid_t cloudabi64_sys_thread_create( \ cloudabi64_threadattr_t *attr); } -53 AUE_NULL STD { void cloudabi_sys_thread_exit( \ +50 AUE_NULL STD { void cloudabi_sys_thread_exit( \ cloudabi_lock_t *lock, \ cloudabi_scope_t 
scope); } -54 AUE_NULL STD { void cloudabi_sys_thread_yield(); } +51 AUE_NULL STD { void cloudabi_sys_thread_yield(); } Index: projects/runtime-coverage/sys/dev/cxgbe/common/t4_hw.c =================================================================== --- projects/runtime-coverage/sys/dev/cxgbe/common/t4_hw.c (revision 322921) +++ projects/runtime-coverage/sys/dev/cxgbe/common/t4_hw.c (revision 322922) @@ -1,9747 +1,9747 @@ /*- * Copyright (c) 2012, 2016 Chelsio Communications, Inc. * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ #include __FBSDID("$FreeBSD$"); #include "opt_inet.h" #include #include #include "common.h" #include "t4_regs.h" #include "t4_regs_values.h" #include "firmware/t4fw_interface.h" #undef msleep #define msleep(x) do { \ if (cold) \ DELAY((x) * 1000); \ else \ pause("t4hw", (x) * hz / 1000); \ } while (0) /** * t4_wait_op_done_val - wait until an operation is completed * @adapter: the adapter performing the operation * @reg: the register to check for completion * @mask: a single-bit field within @reg that indicates completion * @polarity: the value of the field when the operation is completed * @attempts: number of check iterations * @delay: delay in usecs between iterations * @valp: where to store the value of the register at completion time * * Wait until an operation is completed by checking a bit in a register * up to @attempts times. If @valp is not NULL the value of the register * at the time it indicated completion is stored there. Returns 0 if the * operation completes and -EAGAIN otherwise. 
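 *
 * For example, t4_mc_read() and t4_edc_read() later in this file use the
 * t4_wait_op_done() wrapper to poll F_START_BIST in the BIST command
 * register, allowing up to 10 attempts spaced 1 microsecond apart.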
*/ static int t4_wait_op_done_val(struct adapter *adapter, int reg, u32 mask, int polarity, int attempts, int delay, u32 *valp) { while (1) { u32 val = t4_read_reg(adapter, reg); if (!!(val & mask) == polarity) { if (valp) *valp = val; return 0; } if (--attempts == 0) return -EAGAIN; if (delay) udelay(delay); } } static inline int t4_wait_op_done(struct adapter *adapter, int reg, u32 mask, int polarity, int attempts, int delay) { return t4_wait_op_done_val(adapter, reg, mask, polarity, attempts, delay, NULL); } /** * t4_set_reg_field - set a register field to a value * @adapter: the adapter to program * @addr: the register address * @mask: specifies the portion of the register to modify * @val: the new value for the register field * * Sets a register field specified by the supplied mask to the * given value. */ void t4_set_reg_field(struct adapter *adapter, unsigned int addr, u32 mask, u32 val) { u32 v = t4_read_reg(adapter, addr) & ~mask; t4_write_reg(adapter, addr, v | val); (void) t4_read_reg(adapter, addr); /* flush */ } /** * t4_read_indirect - read indirectly addressed registers * @adap: the adapter * @addr_reg: register holding the indirect address * @data_reg: register holding the value of the indirect register * @vals: where the read register values are stored * @nregs: how many indirect registers to read * @start_idx: index of first indirect register to read * * Reads registers that are accessed indirectly through an address/data * register pair. */ void t4_read_indirect(struct adapter *adap, unsigned int addr_reg, unsigned int data_reg, u32 *vals, unsigned int nregs, unsigned int start_idx) { while (nregs--) { t4_write_reg(adap, addr_reg, start_idx); *vals++ = t4_read_reg(adap, data_reg); start_idx++; } } /** * t4_write_indirect - write indirectly addressed registers * @adap: the adapter * @addr_reg: register holding the indirect addresses * @data_reg: register holding the value for the indirect registers * @vals: values to write * @nregs: how many indirect registers to write * @start_idx: address of first indirect register to write * * Writes a sequential block of registers that are accessed indirectly * through an address/data register pair. */ void t4_write_indirect(struct adapter *adap, unsigned int addr_reg, unsigned int data_reg, const u32 *vals, unsigned int nregs, unsigned int start_idx) { while (nregs--) { t4_write_reg(adap, addr_reg, start_idx++); t4_write_reg(adap, data_reg, *vals++); } } /* * Read a 32-bit PCI Configuration Space register via the PCI-E backdoor * mechanism. This guarantees that we get the real value even if we're * operating within a Virtual Machine and the Hypervisor is trapping our * Configuration Space accesses. * * N.B. This routine should only be used as a last resort: the firmware uses * the backdoor registers on a regular basis and we can end up * conflicting with it's uses! */ u32 t4_hw_pci_read_cfg4(adapter_t *adap, int reg) { u32 req = V_FUNCTION(adap->pf) | V_REGISTER(reg); u32 val; if (chip_id(adap) <= CHELSIO_T5) req |= F_ENABLE; else req |= F_T6_ENABLE; if (is_t4(adap)) req |= F_LOCALCFG; t4_write_reg(adap, A_PCIE_CFG_SPACE_REQ, req); val = t4_read_reg(adap, A_PCIE_CFG_SPACE_DATA); /* * Reset F_ENABLE to 0 so reads of PCIE_CFG_SPACE_DATA won't cause a * Configuration Space read. (None of the other fields matter when * F_ENABLE is 0 so a simple register write is easier than a * read-modify-write via t4_set_reg_field().) 
*/ t4_write_reg(adap, A_PCIE_CFG_SPACE_REQ, 0); return val; } /* * t4_report_fw_error - report firmware error * @adap: the adapter * * The adapter firmware can indicate error conditions to the host. * If the firmware has indicated an error, print out the reason for * the firmware error. */ static void t4_report_fw_error(struct adapter *adap) { static const char *const reason[] = { "Crash", /* PCIE_FW_EVAL_CRASH */ "During Device Preparation", /* PCIE_FW_EVAL_PREP */ "During Device Configuration", /* PCIE_FW_EVAL_CONF */ "During Device Initialization", /* PCIE_FW_EVAL_INIT */ "Unexpected Event", /* PCIE_FW_EVAL_UNEXPECTEDEVENT */ "Insufficient Airflow", /* PCIE_FW_EVAL_OVERHEAT */ "Device Shutdown", /* PCIE_FW_EVAL_DEVICESHUTDOWN */ "Reserved", /* reserved */ }; u32 pcie_fw; pcie_fw = t4_read_reg(adap, A_PCIE_FW); if (pcie_fw & F_PCIE_FW_ERR) CH_ERR(adap, "Firmware reports adapter error: %s\n", reason[G_PCIE_FW_EVAL(pcie_fw)]); } /* * Get the reply to a mailbox command and store it in @rpl in big-endian order. */ static void get_mbox_rpl(struct adapter *adap, __be64 *rpl, int nflit, u32 mbox_addr) { for ( ; nflit; nflit--, mbox_addr += 8) *rpl++ = cpu_to_be64(t4_read_reg64(adap, mbox_addr)); } /* * Handle a FW assertion reported in a mailbox. */ static void fw_asrt(struct adapter *adap, struct fw_debug_cmd *asrt) { CH_ALERT(adap, "FW assertion at %.16s:%u, val0 %#x, val1 %#x\n", asrt->u.assert.filename_0_7, be32_to_cpu(asrt->u.assert.line), be32_to_cpu(asrt->u.assert.x), be32_to_cpu(asrt->u.assert.y)); } #define X_CIM_PF_NOACCESS 0xeeeeeeee /** * t4_wr_mbox_meat_timeout - send a command to FW through the given mailbox * @adap: the adapter * @mbox: index of the mailbox to use * @cmd: the command to write * @size: command length in bytes * @rpl: where to optionally store the reply * @sleep_ok: if true we may sleep while awaiting command completion * @timeout: time to wait for command to finish before timing out * (negative implies @sleep_ok=false) * * Sends the given command to FW through the selected mailbox and waits * for the FW to execute the command. If @rpl is not %NULL it is used to * store the FW's reply to the command. The command and its optional * reply are of the same length. Some FW commands like RESET and * INITIALIZE can take a considerable amount of time to execute. * @sleep_ok determines whether we may sleep while awaiting the response. * If sleeping is allowed we use progressive backoff otherwise we spin. * Note that passing in a negative @timeout is an alternate mechanism * for specifying @sleep_ok=false. This is useful when a higher level * interface allows for specification of @timeout but not @sleep_ok ... * * The return value is 0 on success or a negative errno on failure. A * failure can happen either because we are not able to execute the * command or FW executes it but signals an error. In the latter case * the return value is the error code indicated by FW (negated). */ int t4_wr_mbox_meat_timeout(struct adapter *adap, int mbox, const void *cmd, int size, void *rpl, bool sleep_ok, int timeout) { /* * We delay in small increments at first in an effort to maintain * responsiveness for simple, fast executing commands but then back * off to larger delays to a maximum retry delay. 
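 *
 * With the delay table below and @sleep_ok set, the first nine polls are
 * spread over roughly 200 ms (1 + 1 + 3 + 5 + 10 + 10 + 20 + 50 + 100),
 * after which each further retry waits the 100 ms maximum until @timeout
 * milliseconds have elapsed.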
*/ static const int delay[] = { 1, 1, 3, 5, 10, 10, 20, 50, 100 }; u32 v; u64 res; int i, ms, delay_idx, ret; const __be64 *p = cmd; u32 data_reg = PF_REG(mbox, A_CIM_PF_MAILBOX_DATA); u32 ctl_reg = PF_REG(mbox, A_CIM_PF_MAILBOX_CTRL); u32 ctl; __be64 cmd_rpl[MBOX_LEN/8]; u32 pcie_fw; if ((size & 15) || size > MBOX_LEN) return -EINVAL; if (adap->flags & IS_VF) { if (is_t6(adap)) data_reg = FW_T6VF_MBDATA_BASE_ADDR; else data_reg = FW_T4VF_MBDATA_BASE_ADDR; ctl_reg = VF_CIM_REG(A_CIM_VF_EXT_MAILBOX_CTRL); } /* * If we have a negative timeout, that implies that we can't sleep. */ if (timeout < 0) { sleep_ok = false; timeout = -timeout; } /* * Attempt to gain access to the mailbox. */ for (i = 0; i < 4; i++) { ctl = t4_read_reg(adap, ctl_reg); v = G_MBOWNER(ctl); if (v != X_MBOWNER_NONE) break; } /* * If we were unable to gain access, dequeue ourselves from the * mailbox atomic access list and report the error to our caller. */ if (v != X_MBOWNER_PL) { t4_report_fw_error(adap); ret = (v == X_MBOWNER_FW) ? -EBUSY : -ETIMEDOUT; return ret; } /* * If we gain ownership of the mailbox and there's a "valid" message * in it, this is likely an asynchronous error message from the * firmware. So we'll report that and then proceed on with attempting * to issue our own command ... which may well fail if the error * presaged the firmware crashing ... */ if (ctl & F_MBMSGVALID) { - CH_ERR(adap, "found VALID command in mbox %u: " - "%llx %llx %llx %llx %llx %llx %llx %llx\n", mbox, - (unsigned long long)t4_read_reg64(adap, data_reg), + CH_ERR(adap, "found VALID command in mbox %u: %016llx %016llx " + "%016llx %016llx %016llx %016llx %016llx %016llx\n", + mbox, (unsigned long long)t4_read_reg64(adap, data_reg), (unsigned long long)t4_read_reg64(adap, data_reg + 8), (unsigned long long)t4_read_reg64(adap, data_reg + 16), (unsigned long long)t4_read_reg64(adap, data_reg + 24), (unsigned long long)t4_read_reg64(adap, data_reg + 32), (unsigned long long)t4_read_reg64(adap, data_reg + 40), (unsigned long long)t4_read_reg64(adap, data_reg + 48), (unsigned long long)t4_read_reg64(adap, data_reg + 56)); } /* * Copy in the new mailbox command and send it on its way ... */ for (i = 0; i < size; i += 8, p++) t4_write_reg64(adap, data_reg + i, be64_to_cpu(*p)); if (adap->flags & IS_VF) { /* * For the VFs, the Mailbox Data "registers" are * actually backed by T4's "MA" interface rather than * PL Registers (as is the case for the PFs). Because * these are in different coherency domains, the write * to the VF's PL-register-backed Mailbox Control can * race in front of the writes to the MA-backed VF * Mailbox Data "registers". So we need to do a * read-back on at least one byte of the VF Mailbox * Data registers before doing the write to the VF * Mailbox Control register. */ t4_read_reg(adap, data_reg); } CH_DUMP_MBOX(adap, mbox, data_reg); t4_write_reg(adap, ctl_reg, F_MBMSGVALID | V_MBOWNER(X_MBOWNER_FW)); t4_read_reg(adap, ctl_reg); /* flush write */ delay_idx = 0; ms = delay[0]; /* * Loop waiting for the reply; bail out if we time out or the firmware * reports an error. 
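 *
 * Note that the A_PCIE_FW error check is only made when running on a PF;
 * a VF (IS_VF set) relies on the timeout alone to notice a firmware error.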
*/ pcie_fw = 0; for (i = 0; i < timeout; i += ms) { if (!(adap->flags & IS_VF)) { pcie_fw = t4_read_reg(adap, A_PCIE_FW); if (pcie_fw & F_PCIE_FW_ERR) break; } if (sleep_ok) { ms = delay[delay_idx]; /* last element may repeat */ if (delay_idx < ARRAY_SIZE(delay) - 1) delay_idx++; msleep(ms); } else { mdelay(ms); } v = t4_read_reg(adap, ctl_reg); if (v == X_CIM_PF_NOACCESS) continue; if (G_MBOWNER(v) == X_MBOWNER_PL) { if (!(v & F_MBMSGVALID)) { t4_write_reg(adap, ctl_reg, V_MBOWNER(X_MBOWNER_NONE)); continue; } /* * Retrieve the command reply and release the mailbox. */ get_mbox_rpl(adap, cmd_rpl, MBOX_LEN/8, data_reg); t4_write_reg(adap, ctl_reg, V_MBOWNER(X_MBOWNER_NONE)); CH_DUMP_MBOX(adap, mbox, data_reg); res = be64_to_cpu(cmd_rpl[0]); if (G_FW_CMD_OP(res >> 32) == FW_DEBUG_CMD) { fw_asrt(adap, (struct fw_debug_cmd *)cmd_rpl); res = V_FW_CMD_RETVAL(EIO); } else if (rpl) memcpy(rpl, cmd_rpl, size); return -G_FW_CMD_RETVAL((int)res); } } /* * We timed out waiting for a reply to our mailbox command. Report * the error and also check to see if the firmware reported any * errors ... */ ret = (pcie_fw & F_PCIE_FW_ERR) ? -ENXIO : -ETIMEDOUT; CH_ERR(adap, "command %#x in mailbox %d timed out\n", *(const u8 *)cmd, mbox); /* If DUMP_MBOX is set the mbox has already been dumped */ if ((adap->debug_flags & DF_DUMP_MBOX) == 0) { p = cmd; CH_ERR(adap, "mbox: %016llx %016llx %016llx %016llx " "%016llx %016llx %016llx %016llx\n", (unsigned long long)be64_to_cpu(p[0]), (unsigned long long)be64_to_cpu(p[1]), (unsigned long long)be64_to_cpu(p[2]), (unsigned long long)be64_to_cpu(p[3]), (unsigned long long)be64_to_cpu(p[4]), (unsigned long long)be64_to_cpu(p[5]), (unsigned long long)be64_to_cpu(p[6]), (unsigned long long)be64_to_cpu(p[7])); } t4_report_fw_error(adap); t4_fatal_err(adap); return ret; } int t4_wr_mbox_meat(struct adapter *adap, int mbox, const void *cmd, int size, void *rpl, bool sleep_ok) { return t4_wr_mbox_meat_timeout(adap, mbox, cmd, size, rpl, sleep_ok, FW_CMD_MAX_TIMEOUT); } static int t4_edc_err_read(struct adapter *adap, int idx) { u32 edc_ecc_err_addr_reg; u32 edc_bist_status_rdata_reg; if (is_t4(adap)) { CH_WARN(adap, "%s: T4 NOT supported.\n", __func__); return 0; } if (idx != MEM_EDC0 && idx != MEM_EDC1) { CH_WARN(adap, "%s: idx %d NOT supported.\n", __func__, idx); return 0; } edc_ecc_err_addr_reg = EDC_T5_REG(A_EDC_H_ECC_ERR_ADDR, idx); edc_bist_status_rdata_reg = EDC_T5_REG(A_EDC_H_BIST_STATUS_RDATA, idx); CH_WARN(adap, "edc%d err addr 0x%x: 0x%x.\n", idx, edc_ecc_err_addr_reg, t4_read_reg(adap, edc_ecc_err_addr_reg)); CH_WARN(adap, "bist: 0x%x, status %llx %llx %llx %llx %llx %llx %llx %llx %llx.\n", edc_bist_status_rdata_reg, (unsigned long long)t4_read_reg64(adap, edc_bist_status_rdata_reg), (unsigned long long)t4_read_reg64(adap, edc_bist_status_rdata_reg + 8), (unsigned long long)t4_read_reg64(adap, edc_bist_status_rdata_reg + 16), (unsigned long long)t4_read_reg64(adap, edc_bist_status_rdata_reg + 24), (unsigned long long)t4_read_reg64(adap, edc_bist_status_rdata_reg + 32), (unsigned long long)t4_read_reg64(adap, edc_bist_status_rdata_reg + 40), (unsigned long long)t4_read_reg64(adap, edc_bist_status_rdata_reg + 48), (unsigned long long)t4_read_reg64(adap, edc_bist_status_rdata_reg + 56), (unsigned long long)t4_read_reg64(adap, edc_bist_status_rdata_reg + 64)); return 0; } /** * t4_mc_read - read from MC through backdoor accesses * @adap: the adapter * @idx: which MC to access * @addr: address of first byte requested * @data: 64 bytes of data containing the requested 
address * @ecc: where to store the corresponding 64-bit ECC word * * Read 64 bytes of data from MC starting at a 64-byte-aligned address * that covers the requested address @addr. If @parity is not %NULL it * is assigned the 64-bit ECC word for the read data. */ int t4_mc_read(struct adapter *adap, int idx, u32 addr, __be32 *data, u64 *ecc) { int i; u32 mc_bist_cmd_reg, mc_bist_cmd_addr_reg, mc_bist_cmd_len_reg; u32 mc_bist_status_rdata_reg, mc_bist_data_pattern_reg; if (is_t4(adap)) { mc_bist_cmd_reg = A_MC_BIST_CMD; mc_bist_cmd_addr_reg = A_MC_BIST_CMD_ADDR; mc_bist_cmd_len_reg = A_MC_BIST_CMD_LEN; mc_bist_status_rdata_reg = A_MC_BIST_STATUS_RDATA; mc_bist_data_pattern_reg = A_MC_BIST_DATA_PATTERN; } else { mc_bist_cmd_reg = MC_REG(A_MC_P_BIST_CMD, idx); mc_bist_cmd_addr_reg = MC_REG(A_MC_P_BIST_CMD_ADDR, idx); mc_bist_cmd_len_reg = MC_REG(A_MC_P_BIST_CMD_LEN, idx); mc_bist_status_rdata_reg = MC_REG(A_MC_P_BIST_STATUS_RDATA, idx); mc_bist_data_pattern_reg = MC_REG(A_MC_P_BIST_DATA_PATTERN, idx); } if (t4_read_reg(adap, mc_bist_cmd_reg) & F_START_BIST) return -EBUSY; t4_write_reg(adap, mc_bist_cmd_addr_reg, addr & ~0x3fU); t4_write_reg(adap, mc_bist_cmd_len_reg, 64); t4_write_reg(adap, mc_bist_data_pattern_reg, 0xc); t4_write_reg(adap, mc_bist_cmd_reg, V_BIST_OPCODE(1) | F_START_BIST | V_BIST_CMD_GAP(1)); i = t4_wait_op_done(adap, mc_bist_cmd_reg, F_START_BIST, 0, 10, 1); if (i) return i; #define MC_DATA(i) MC_BIST_STATUS_REG(mc_bist_status_rdata_reg, i) for (i = 15; i >= 0; i--) *data++ = ntohl(t4_read_reg(adap, MC_DATA(i))); if (ecc) *ecc = t4_read_reg64(adap, MC_DATA(16)); #undef MC_DATA return 0; } /** * t4_edc_read - read from EDC through backdoor accesses * @adap: the adapter * @idx: which EDC to access * @addr: address of first byte requested * @data: 64 bytes of data containing the requested address * @ecc: where to store the corresponding 64-bit ECC word * * Read 64 bytes of data from EDC starting at a 64-byte-aligned address * that covers the requested address @addr. If @parity is not %NULL it * is assigned the 64-bit ECC word for the read data. */ int t4_edc_read(struct adapter *adap, int idx, u32 addr, __be32 *data, u64 *ecc) { int i; u32 edc_bist_cmd_reg, edc_bist_cmd_addr_reg, edc_bist_cmd_len_reg; u32 edc_bist_cmd_data_pattern, edc_bist_status_rdata_reg; if (is_t4(adap)) { edc_bist_cmd_reg = EDC_REG(A_EDC_BIST_CMD, idx); edc_bist_cmd_addr_reg = EDC_REG(A_EDC_BIST_CMD_ADDR, idx); edc_bist_cmd_len_reg = EDC_REG(A_EDC_BIST_CMD_LEN, idx); edc_bist_cmd_data_pattern = EDC_REG(A_EDC_BIST_DATA_PATTERN, idx); edc_bist_status_rdata_reg = EDC_REG(A_EDC_BIST_STATUS_RDATA, idx); } else { /* * These macro are missing in t4_regs.h file. * Added temporarily for testing. 
*/ #define EDC_STRIDE_T5 (EDC_T51_BASE_ADDR - EDC_T50_BASE_ADDR) #define EDC_REG_T5(reg, idx) (reg + EDC_STRIDE_T5 * idx) edc_bist_cmd_reg = EDC_REG_T5(A_EDC_H_BIST_CMD, idx); edc_bist_cmd_addr_reg = EDC_REG_T5(A_EDC_H_BIST_CMD_ADDR, idx); edc_bist_cmd_len_reg = EDC_REG_T5(A_EDC_H_BIST_CMD_LEN, idx); edc_bist_cmd_data_pattern = EDC_REG_T5(A_EDC_H_BIST_DATA_PATTERN, idx); edc_bist_status_rdata_reg = EDC_REG_T5(A_EDC_H_BIST_STATUS_RDATA, idx); #undef EDC_REG_T5 #undef EDC_STRIDE_T5 } if (t4_read_reg(adap, edc_bist_cmd_reg) & F_START_BIST) return -EBUSY; t4_write_reg(adap, edc_bist_cmd_addr_reg, addr & ~0x3fU); t4_write_reg(adap, edc_bist_cmd_len_reg, 64); t4_write_reg(adap, edc_bist_cmd_data_pattern, 0xc); t4_write_reg(adap, edc_bist_cmd_reg, V_BIST_OPCODE(1) | V_BIST_CMD_GAP(1) | F_START_BIST); i = t4_wait_op_done(adap, edc_bist_cmd_reg, F_START_BIST, 0, 10, 1); if (i) return i; #define EDC_DATA(i) EDC_BIST_STATUS_REG(edc_bist_status_rdata_reg, i) for (i = 15; i >= 0; i--) *data++ = ntohl(t4_read_reg(adap, EDC_DATA(i))); if (ecc) *ecc = t4_read_reg64(adap, EDC_DATA(16)); #undef EDC_DATA return 0; } /** * t4_mem_read - read EDC 0, EDC 1 or MC into buffer * @adap: the adapter * @mtype: memory type: MEM_EDC0, MEM_EDC1 or MEM_MC * @addr: address within indicated memory type * @len: amount of memory to read * @buf: host memory buffer * * Reads an [almost] arbitrary memory region in the firmware: the * firmware memory address, length and host buffer must be aligned on * 32-bit boudaries. The memory is returned as a raw byte sequence from * the firmware's memory. If this memory contains data structures which * contain multi-byte integers, it's the callers responsibility to * perform appropriate byte order conversions. */ int t4_mem_read(struct adapter *adap, int mtype, u32 addr, u32 len, __be32 *buf) { u32 pos, start, end, offset; int ret; /* * Argument sanity checks ... */ if ((addr & 0x3) || (len & 0x3)) return -EINVAL; /* * The underlaying EDC/MC read routines read 64 bytes at a time so we * need to round down the start and round up the end. We'll start * copying out of the first line at (addr - start) a word at a time. */ start = rounddown2(addr, 64); end = roundup2(addr + len, 64); offset = (addr - start)/sizeof(__be32); for (pos = start; pos < end; pos += 64, offset = 0) { __be32 data[16]; /* * Read the chip's memory block and bail if there's an error. */ if ((mtype == MEM_MC) || (mtype == MEM_MC1)) ret = t4_mc_read(adap, mtype - MEM_MC, pos, data, NULL); else ret = t4_edc_read(adap, mtype, pos, data, NULL); if (ret) return ret; /* * Copy the data into the caller's memory buffer. */ while (offset < 16 && len > 0) { *buf++ = data[offset++]; len -= sizeof(__be32); } } return 0; } /* * Return the specified PCI-E Configuration Space register from our Physical * Function. We try first via a Firmware LDST Command (if fw_attach != 0) * since we prefer to let the firmware own all of these registers, but if that * fails we go for it directly ourselves. */ u32 t4_read_pcie_cfg4(struct adapter *adap, int reg, int drv_fw_attach) { /* * If fw_attach != 0, construct and send the Firmware LDST Command to * retrieve the specified PCI-E Configuration Space register. 
*/ if (drv_fw_attach != 0) { struct fw_ldst_cmd ldst_cmd; int ret; memset(&ldst_cmd, 0, sizeof(ldst_cmd)); ldst_cmd.op_to_addrspace = cpu_to_be32(V_FW_CMD_OP(FW_LDST_CMD) | F_FW_CMD_REQUEST | F_FW_CMD_READ | V_FW_LDST_CMD_ADDRSPACE(FW_LDST_ADDRSPC_FUNC_PCIE)); ldst_cmd.cycles_to_len16 = cpu_to_be32(FW_LEN16(ldst_cmd)); ldst_cmd.u.pcie.select_naccess = V_FW_LDST_CMD_NACCESS(1); ldst_cmd.u.pcie.ctrl_to_fn = (F_FW_LDST_CMD_LC | V_FW_LDST_CMD_FN(adap->pf)); ldst_cmd.u.pcie.r = reg; /* * If the LDST Command succeeds, return the result, otherwise * fall through to reading it directly ourselves ... */ ret = t4_wr_mbox(adap, adap->mbox, &ldst_cmd, sizeof(ldst_cmd), &ldst_cmd); if (ret == 0) return be32_to_cpu(ldst_cmd.u.pcie.data[0]); CH_WARN(adap, "Firmware failed to return " "Configuration Space register %d, err = %d\n", reg, -ret); } /* * Read the desired Configuration Space register via the PCI-E * Backdoor mechanism. */ return t4_hw_pci_read_cfg4(adap, reg); } /** * t4_get_regs_len - return the size of the chips register set * @adapter: the adapter * * Returns the size of the chip's BAR0 register space. */ unsigned int t4_get_regs_len(struct adapter *adapter) { unsigned int chip_version = chip_id(adapter); switch (chip_version) { case CHELSIO_T4: if (adapter->flags & IS_VF) return FW_T4VF_REGMAP_SIZE; return T4_REGMAP_SIZE; case CHELSIO_T5: case CHELSIO_T6: if (adapter->flags & IS_VF) return FW_T4VF_REGMAP_SIZE; return T5_REGMAP_SIZE; } CH_ERR(adapter, "Unsupported chip version %d\n", chip_version); return 0; } /** * t4_get_regs - read chip registers into provided buffer * @adap: the adapter * @buf: register buffer * @buf_size: size (in bytes) of register buffer * * If the provided register buffer isn't large enough for the chip's * full register range, the register dump will be truncated to the * register buffer's size. 
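 *
 * (The per-chip reg_ranges tables below appear to be flat arrays of
 * consecutive {first, last} register address pairs, i.e. inclusive ranges
 * of 32-bit registers to include in the dump.)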
*/ void t4_get_regs(struct adapter *adap, u8 *buf, size_t buf_size) { static const unsigned int t4_reg_ranges[] = { 0x1008, 0x1108, 0x1180, 0x1184, 0x1190, 0x1194, 0x11a0, 0x11a4, 0x11b0, 0x11b4, 0x11fc, 0x123c, 0x1300, 0x173c, 0x1800, 0x18fc, 0x3000, 0x30d8, 0x30e0, 0x30e4, 0x30ec, 0x5910, 0x5920, 0x5924, 0x5960, 0x5960, 0x5968, 0x5968, 0x5970, 0x5970, 0x5978, 0x5978, 0x5980, 0x5980, 0x5988, 0x5988, 0x5990, 0x5990, 0x5998, 0x5998, 0x59a0, 0x59d4, 0x5a00, 0x5ae0, 0x5ae8, 0x5ae8, 0x5af0, 0x5af0, 0x5af8, 0x5af8, 0x6000, 0x6098, 0x6100, 0x6150, 0x6200, 0x6208, 0x6240, 0x6248, 0x6280, 0x62b0, 0x62c0, 0x6338, 0x6370, 0x638c, 0x6400, 0x643c, 0x6500, 0x6524, 0x6a00, 0x6a04, 0x6a14, 0x6a38, 0x6a60, 0x6a70, 0x6a78, 0x6a78, 0x6b00, 0x6b0c, 0x6b1c, 0x6b84, 0x6bf0, 0x6bf8, 0x6c00, 0x6c0c, 0x6c1c, 0x6c84, 0x6cf0, 0x6cf8, 0x6d00, 0x6d0c, 0x6d1c, 0x6d84, 0x6df0, 0x6df8, 0x6e00, 0x6e0c, 0x6e1c, 0x6e84, 0x6ef0, 0x6ef8, 0x6f00, 0x6f0c, 0x6f1c, 0x6f84, 0x6ff0, 0x6ff8, 0x7000, 0x700c, 0x701c, 0x7084, 0x70f0, 0x70f8, 0x7100, 0x710c, 0x711c, 0x7184, 0x71f0, 0x71f8, 0x7200, 0x720c, 0x721c, 0x7284, 0x72f0, 0x72f8, 0x7300, 0x730c, 0x731c, 0x7384, 0x73f0, 0x73f8, 0x7400, 0x7450, 0x7500, 0x7530, 0x7600, 0x760c, 0x7614, 0x761c, 0x7680, 0x76cc, 0x7700, 0x7798, 0x77c0, 0x77fc, 0x7900, 0x79fc, 0x7b00, 0x7b58, 0x7b60, 0x7b84, 0x7b8c, 0x7c38, 0x7d00, 0x7d38, 0x7d40, 0x7d80, 0x7d8c, 0x7ddc, 0x7de4, 0x7e04, 0x7e10, 0x7e1c, 0x7e24, 0x7e38, 0x7e40, 0x7e44, 0x7e4c, 0x7e78, 0x7e80, 0x7ea4, 0x7eac, 0x7edc, 0x7ee8, 0x7efc, 0x8dc0, 0x8e04, 0x8e10, 0x8e1c, 0x8e30, 0x8e78, 0x8ea0, 0x8eb8, 0x8ec0, 0x8f6c, 0x8fc0, 0x9008, 0x9010, 0x9058, 0x9060, 0x9060, 0x9068, 0x9074, 0x90fc, 0x90fc, 0x9400, 0x9408, 0x9410, 0x9458, 0x9600, 0x9600, 0x9608, 0x9638, 0x9640, 0x96bc, 0x9800, 0x9808, 0x9820, 0x983c, 0x9850, 0x9864, 0x9c00, 0x9c6c, 0x9c80, 0x9cec, 0x9d00, 0x9d6c, 0x9d80, 0x9dec, 0x9e00, 0x9e6c, 0x9e80, 0x9eec, 0x9f00, 0x9f6c, 0x9f80, 0x9fec, 0xd004, 0xd004, 0xd010, 0xd03c, 0xdfc0, 0xdfe0, 0xe000, 0xea7c, 0xf000, 0x11110, 0x11118, 0x11190, 0x19040, 0x1906c, 0x19078, 0x19080, 0x1908c, 0x190e4, 0x190f0, 0x190f8, 0x19100, 0x19110, 0x19120, 0x19124, 0x19150, 0x19194, 0x1919c, 0x191b0, 0x191d0, 0x191e8, 0x19238, 0x1924c, 0x193f8, 0x1943c, 0x1944c, 0x19474, 0x19490, 0x194e0, 0x194f0, 0x194f8, 0x19800, 0x19c08, 0x19c10, 0x19c90, 0x19ca0, 0x19ce4, 0x19cf0, 0x19d40, 0x19d50, 0x19d94, 0x19da0, 0x19de8, 0x19df0, 0x19e40, 0x19e50, 0x19e90, 0x19ea0, 0x19f4c, 0x1a000, 0x1a004, 0x1a010, 0x1a06c, 0x1a0b0, 0x1a0e4, 0x1a0ec, 0x1a0f4, 0x1a100, 0x1a108, 0x1a114, 0x1a120, 0x1a128, 0x1a130, 0x1a138, 0x1a138, 0x1a190, 0x1a1c4, 0x1a1fc, 0x1a1fc, 0x1e040, 0x1e04c, 0x1e284, 0x1e28c, 0x1e2c0, 0x1e2c0, 0x1e2e0, 0x1e2e0, 0x1e300, 0x1e384, 0x1e3c0, 0x1e3c8, 0x1e440, 0x1e44c, 0x1e684, 0x1e68c, 0x1e6c0, 0x1e6c0, 0x1e6e0, 0x1e6e0, 0x1e700, 0x1e784, 0x1e7c0, 0x1e7c8, 0x1e840, 0x1e84c, 0x1ea84, 0x1ea8c, 0x1eac0, 0x1eac0, 0x1eae0, 0x1eae0, 0x1eb00, 0x1eb84, 0x1ebc0, 0x1ebc8, 0x1ec40, 0x1ec4c, 0x1ee84, 0x1ee8c, 0x1eec0, 0x1eec0, 0x1eee0, 0x1eee0, 0x1ef00, 0x1ef84, 0x1efc0, 0x1efc8, 0x1f040, 0x1f04c, 0x1f284, 0x1f28c, 0x1f2c0, 0x1f2c0, 0x1f2e0, 0x1f2e0, 0x1f300, 0x1f384, 0x1f3c0, 0x1f3c8, 0x1f440, 0x1f44c, 0x1f684, 0x1f68c, 0x1f6c0, 0x1f6c0, 0x1f6e0, 0x1f6e0, 0x1f700, 0x1f784, 0x1f7c0, 0x1f7c8, 0x1f840, 0x1f84c, 0x1fa84, 0x1fa8c, 0x1fac0, 0x1fac0, 0x1fae0, 0x1fae0, 0x1fb00, 0x1fb84, 0x1fbc0, 0x1fbc8, 0x1fc40, 0x1fc4c, 0x1fe84, 0x1fe8c, 0x1fec0, 0x1fec0, 0x1fee0, 0x1fee0, 0x1ff00, 0x1ff84, 0x1ffc0, 0x1ffc8, 0x20000, 0x2002c, 0x20100, 0x2013c, 0x20190, 0x201a0, 0x201a8, 0x201b8, 
0x201c4, 0x201c8, 0x20200, 0x20318, 0x20400, 0x204b4, 0x204c0, 0x20528, 0x20540, 0x20614, 0x21000, 0x21040, 0x2104c, 0x21060, 0x210c0, 0x210ec, 0x21200, 0x21268, 0x21270, 0x21284, 0x212fc, 0x21388, 0x21400, 0x21404, 0x21500, 0x21500, 0x21510, 0x21518, 0x2152c, 0x21530, 0x2153c, 0x2153c, 0x21550, 0x21554, 0x21600, 0x21600, 0x21608, 0x2161c, 0x21624, 0x21628, 0x21630, 0x21634, 0x2163c, 0x2163c, 0x21700, 0x2171c, 0x21780, 0x2178c, 0x21800, 0x21818, 0x21820, 0x21828, 0x21830, 0x21848, 0x21850, 0x21854, 0x21860, 0x21868, 0x21870, 0x21870, 0x21878, 0x21898, 0x218a0, 0x218a8, 0x218b0, 0x218c8, 0x218d0, 0x218d4, 0x218e0, 0x218e8, 0x218f0, 0x218f0, 0x218f8, 0x21a18, 0x21a20, 0x21a28, 0x21a30, 0x21a48, 0x21a50, 0x21a54, 0x21a60, 0x21a68, 0x21a70, 0x21a70, 0x21a78, 0x21a98, 0x21aa0, 0x21aa8, 0x21ab0, 0x21ac8, 0x21ad0, 0x21ad4, 0x21ae0, 0x21ae8, 0x21af0, 0x21af0, 0x21af8, 0x21c18, 0x21c20, 0x21c20, 0x21c28, 0x21c30, 0x21c38, 0x21c38, 0x21c80, 0x21c98, 0x21ca0, 0x21ca8, 0x21cb0, 0x21cc8, 0x21cd0, 0x21cd4, 0x21ce0, 0x21ce8, 0x21cf0, 0x21cf0, 0x21cf8, 0x21d7c, 0x21e00, 0x21e04, 0x22000, 0x2202c, 0x22100, 0x2213c, 0x22190, 0x221a0, 0x221a8, 0x221b8, 0x221c4, 0x221c8, 0x22200, 0x22318, 0x22400, 0x224b4, 0x224c0, 0x22528, 0x22540, 0x22614, 0x23000, 0x23040, 0x2304c, 0x23060, 0x230c0, 0x230ec, 0x23200, 0x23268, 0x23270, 0x23284, 0x232fc, 0x23388, 0x23400, 0x23404, 0x23500, 0x23500, 0x23510, 0x23518, 0x2352c, 0x23530, 0x2353c, 0x2353c, 0x23550, 0x23554, 0x23600, 0x23600, 0x23608, 0x2361c, 0x23624, 0x23628, 0x23630, 0x23634, 0x2363c, 0x2363c, 0x23700, 0x2371c, 0x23780, 0x2378c, 0x23800, 0x23818, 0x23820, 0x23828, 0x23830, 0x23848, 0x23850, 0x23854, 0x23860, 0x23868, 0x23870, 0x23870, 0x23878, 0x23898, 0x238a0, 0x238a8, 0x238b0, 0x238c8, 0x238d0, 0x238d4, 0x238e0, 0x238e8, 0x238f0, 0x238f0, 0x238f8, 0x23a18, 0x23a20, 0x23a28, 0x23a30, 0x23a48, 0x23a50, 0x23a54, 0x23a60, 0x23a68, 0x23a70, 0x23a70, 0x23a78, 0x23a98, 0x23aa0, 0x23aa8, 0x23ab0, 0x23ac8, 0x23ad0, 0x23ad4, 0x23ae0, 0x23ae8, 0x23af0, 0x23af0, 0x23af8, 0x23c18, 0x23c20, 0x23c20, 0x23c28, 0x23c30, 0x23c38, 0x23c38, 0x23c80, 0x23c98, 0x23ca0, 0x23ca8, 0x23cb0, 0x23cc8, 0x23cd0, 0x23cd4, 0x23ce0, 0x23ce8, 0x23cf0, 0x23cf0, 0x23cf8, 0x23d7c, 0x23e00, 0x23e04, 0x24000, 0x2402c, 0x24100, 0x2413c, 0x24190, 0x241a0, 0x241a8, 0x241b8, 0x241c4, 0x241c8, 0x24200, 0x24318, 0x24400, 0x244b4, 0x244c0, 0x24528, 0x24540, 0x24614, 0x25000, 0x25040, 0x2504c, 0x25060, 0x250c0, 0x250ec, 0x25200, 0x25268, 0x25270, 0x25284, 0x252fc, 0x25388, 0x25400, 0x25404, 0x25500, 0x25500, 0x25510, 0x25518, 0x2552c, 0x25530, 0x2553c, 0x2553c, 0x25550, 0x25554, 0x25600, 0x25600, 0x25608, 0x2561c, 0x25624, 0x25628, 0x25630, 0x25634, 0x2563c, 0x2563c, 0x25700, 0x2571c, 0x25780, 0x2578c, 0x25800, 0x25818, 0x25820, 0x25828, 0x25830, 0x25848, 0x25850, 0x25854, 0x25860, 0x25868, 0x25870, 0x25870, 0x25878, 0x25898, 0x258a0, 0x258a8, 0x258b0, 0x258c8, 0x258d0, 0x258d4, 0x258e0, 0x258e8, 0x258f0, 0x258f0, 0x258f8, 0x25a18, 0x25a20, 0x25a28, 0x25a30, 0x25a48, 0x25a50, 0x25a54, 0x25a60, 0x25a68, 0x25a70, 0x25a70, 0x25a78, 0x25a98, 0x25aa0, 0x25aa8, 0x25ab0, 0x25ac8, 0x25ad0, 0x25ad4, 0x25ae0, 0x25ae8, 0x25af0, 0x25af0, 0x25af8, 0x25c18, 0x25c20, 0x25c20, 0x25c28, 0x25c30, 0x25c38, 0x25c38, 0x25c80, 0x25c98, 0x25ca0, 0x25ca8, 0x25cb0, 0x25cc8, 0x25cd0, 0x25cd4, 0x25ce0, 0x25ce8, 0x25cf0, 0x25cf0, 0x25cf8, 0x25d7c, 0x25e00, 0x25e04, 0x26000, 0x2602c, 0x26100, 0x2613c, 0x26190, 0x261a0, 0x261a8, 0x261b8, 0x261c4, 0x261c8, 0x26200, 0x26318, 0x26400, 0x264b4, 0x264c0, 0x26528, 0x26540, 0x26614, 0x27000, 
0x27040, 0x2704c, 0x27060, 0x270c0, 0x270ec, 0x27200, 0x27268, 0x27270, 0x27284, 0x272fc, 0x27388, 0x27400, 0x27404, 0x27500, 0x27500, 0x27510, 0x27518, 0x2752c, 0x27530, 0x2753c, 0x2753c, 0x27550, 0x27554, 0x27600, 0x27600, 0x27608, 0x2761c, 0x27624, 0x27628, 0x27630, 0x27634, 0x2763c, 0x2763c, 0x27700, 0x2771c, 0x27780, 0x2778c, 0x27800, 0x27818, 0x27820, 0x27828, 0x27830, 0x27848, 0x27850, 0x27854, 0x27860, 0x27868, 0x27870, 0x27870, 0x27878, 0x27898, 0x278a0, 0x278a8, 0x278b0, 0x278c8, 0x278d0, 0x278d4, 0x278e0, 0x278e8, 0x278f0, 0x278f0, 0x278f8, 0x27a18, 0x27a20, 0x27a28, 0x27a30, 0x27a48, 0x27a50, 0x27a54, 0x27a60, 0x27a68, 0x27a70, 0x27a70, 0x27a78, 0x27a98, 0x27aa0, 0x27aa8, 0x27ab0, 0x27ac8, 0x27ad0, 0x27ad4, 0x27ae0, 0x27ae8, 0x27af0, 0x27af0, 0x27af8, 0x27c18, 0x27c20, 0x27c20, 0x27c28, 0x27c30, 0x27c38, 0x27c38, 0x27c80, 0x27c98, 0x27ca0, 0x27ca8, 0x27cb0, 0x27cc8, 0x27cd0, 0x27cd4, 0x27ce0, 0x27ce8, 0x27cf0, 0x27cf0, 0x27cf8, 0x27d7c, 0x27e00, 0x27e04, }; static const unsigned int t4vf_reg_ranges[] = { VF_SGE_REG(A_SGE_VF_KDOORBELL), VF_SGE_REG(A_SGE_VF_GTS), VF_MPS_REG(A_MPS_VF_CTL), VF_MPS_REG(A_MPS_VF_STAT_RX_VF_ERR_FRAMES_H), VF_PL_REG(A_PL_VF_WHOAMI), VF_PL_REG(A_PL_VF_WHOAMI), VF_CIM_REG(A_CIM_VF_EXT_MAILBOX_CTRL), VF_CIM_REG(A_CIM_VF_EXT_MAILBOX_STATUS), FW_T4VF_MBDATA_BASE_ADDR, FW_T4VF_MBDATA_BASE_ADDR + ((NUM_CIM_PF_MAILBOX_DATA_INSTANCES - 1) * 4), }; static const unsigned int t5_reg_ranges[] = { 0x1008, 0x10c0, 0x10cc, 0x10f8, 0x1100, 0x1100, 0x110c, 0x1148, 0x1180, 0x1184, 0x1190, 0x1194, 0x11a0, 0x11a4, 0x11b0, 0x11b4, 0x11fc, 0x123c, 0x1280, 0x173c, 0x1800, 0x18fc, 0x3000, 0x3028, 0x3060, 0x30b0, 0x30b8, 0x30d8, 0x30e0, 0x30fc, 0x3140, 0x357c, 0x35a8, 0x35cc, 0x35ec, 0x35ec, 0x3600, 0x5624, 0x56cc, 0x56ec, 0x56f4, 0x5720, 0x5728, 0x575c, 0x580c, 0x5814, 0x5890, 0x589c, 0x58a4, 0x58ac, 0x58b8, 0x58bc, 0x5940, 0x59c8, 0x59d0, 0x59dc, 0x59fc, 0x5a18, 0x5a60, 0x5a70, 0x5a80, 0x5a9c, 0x5b94, 0x5bfc, 0x6000, 0x6020, 0x6028, 0x6040, 0x6058, 0x609c, 0x60a8, 0x614c, 0x7700, 0x7798, 0x77c0, 0x78fc, 0x7b00, 0x7b58, 0x7b60, 0x7b84, 0x7b8c, 0x7c54, 0x7d00, 0x7d38, 0x7d40, 0x7d80, 0x7d8c, 0x7ddc, 0x7de4, 0x7e04, 0x7e10, 0x7e1c, 0x7e24, 0x7e38, 0x7e40, 0x7e44, 0x7e4c, 0x7e78, 0x7e80, 0x7edc, 0x7ee8, 0x7efc, 0x8dc0, 0x8de0, 0x8df8, 0x8e04, 0x8e10, 0x8e84, 0x8ea0, 0x8f84, 0x8fc0, 0x9058, 0x9060, 0x9060, 0x9068, 0x90f8, 0x9400, 0x9408, 0x9410, 0x9470, 0x9600, 0x9600, 0x9608, 0x9638, 0x9640, 0x96f4, 0x9800, 0x9808, 0x9820, 0x983c, 0x9850, 0x9864, 0x9c00, 0x9c6c, 0x9c80, 0x9cec, 0x9d00, 0x9d6c, 0x9d80, 0x9dec, 0x9e00, 0x9e6c, 0x9e80, 0x9eec, 0x9f00, 0x9f6c, 0x9f80, 0xa020, 0xd004, 0xd004, 0xd010, 0xd03c, 0xdfc0, 0xdfe0, 0xe000, 0x1106c, 0x11074, 0x11088, 0x1109c, 0x1117c, 0x11190, 0x11204, 0x19040, 0x1906c, 0x19078, 0x19080, 0x1908c, 0x190e8, 0x190f0, 0x190f8, 0x19100, 0x19110, 0x19120, 0x19124, 0x19150, 0x19194, 0x1919c, 0x191b0, 0x191d0, 0x191e8, 0x19238, 0x19290, 0x193f8, 0x19428, 0x19430, 0x19444, 0x1944c, 0x1946c, 0x19474, 0x19474, 0x19490, 0x194cc, 0x194f0, 0x194f8, 0x19c00, 0x19c08, 0x19c10, 0x19c60, 0x19c94, 0x19ce4, 0x19cf0, 0x19d40, 0x19d50, 0x19d94, 0x19da0, 0x19de8, 0x19df0, 0x19e10, 0x19e50, 0x19e90, 0x19ea0, 0x19f24, 0x19f34, 0x19f34, 0x19f40, 0x19f50, 0x19f90, 0x19fb4, 0x19fc4, 0x19fe4, 0x1a000, 0x1a004, 0x1a010, 0x1a06c, 0x1a0b0, 0x1a0e4, 0x1a0ec, 0x1a0f8, 0x1a100, 0x1a108, 0x1a114, 0x1a120, 0x1a128, 0x1a130, 0x1a138, 0x1a138, 0x1a190, 0x1a1c4, 0x1a1fc, 0x1a1fc, 0x1e008, 0x1e00c, 0x1e040, 0x1e044, 0x1e04c, 0x1e04c, 0x1e284, 0x1e290, 0x1e2c0, 0x1e2c0, 0x1e2e0, 
0x1e2e0, 0x1e300, 0x1e384, 0x1e3c0, 0x1e3c8, 0x1e408, 0x1e40c, 0x1e440, 0x1e444, 0x1e44c, 0x1e44c, 0x1e684, 0x1e690, 0x1e6c0, 0x1e6c0, 0x1e6e0, 0x1e6e0, 0x1e700, 0x1e784, 0x1e7c0, 0x1e7c8, 0x1e808, 0x1e80c, 0x1e840, 0x1e844, 0x1e84c, 0x1e84c, 0x1ea84, 0x1ea90, 0x1eac0, 0x1eac0, 0x1eae0, 0x1eae0, 0x1eb00, 0x1eb84, 0x1ebc0, 0x1ebc8, 0x1ec08, 0x1ec0c, 0x1ec40, 0x1ec44, 0x1ec4c, 0x1ec4c, 0x1ee84, 0x1ee90, 0x1eec0, 0x1eec0, 0x1eee0, 0x1eee0, 0x1ef00, 0x1ef84, 0x1efc0, 0x1efc8, 0x1f008, 0x1f00c, 0x1f040, 0x1f044, 0x1f04c, 0x1f04c, 0x1f284, 0x1f290, 0x1f2c0, 0x1f2c0, 0x1f2e0, 0x1f2e0, 0x1f300, 0x1f384, 0x1f3c0, 0x1f3c8, 0x1f408, 0x1f40c, 0x1f440, 0x1f444, 0x1f44c, 0x1f44c, 0x1f684, 0x1f690, 0x1f6c0, 0x1f6c0, 0x1f6e0, 0x1f6e0, 0x1f700, 0x1f784, 0x1f7c0, 0x1f7c8, 0x1f808, 0x1f80c, 0x1f840, 0x1f844, 0x1f84c, 0x1f84c, 0x1fa84, 0x1fa90, 0x1fac0, 0x1fac0, 0x1fae0, 0x1fae0, 0x1fb00, 0x1fb84, 0x1fbc0, 0x1fbc8, 0x1fc08, 0x1fc0c, 0x1fc40, 0x1fc44, 0x1fc4c, 0x1fc4c, 0x1fe84, 0x1fe90, 0x1fec0, 0x1fec0, 0x1fee0, 0x1fee0, 0x1ff00, 0x1ff84, 0x1ffc0, 0x1ffc8, 0x30000, 0x30030, 0x30100, 0x30144, 0x30190, 0x301a0, 0x301a8, 0x301b8, 0x301c4, 0x301c8, 0x301d0, 0x301d0, 0x30200, 0x30318, 0x30400, 0x304b4, 0x304c0, 0x3052c, 0x30540, 0x3061c, 0x30800, 0x30828, 0x30834, 0x30834, 0x308c0, 0x30908, 0x30910, 0x309ac, 0x30a00, 0x30a14, 0x30a1c, 0x30a2c, 0x30a44, 0x30a50, 0x30a74, 0x30a74, 0x30a7c, 0x30afc, 0x30b08, 0x30c24, 0x30d00, 0x30d00, 0x30d08, 0x30d14, 0x30d1c, 0x30d20, 0x30d3c, 0x30d3c, 0x30d48, 0x30d50, 0x31200, 0x3120c, 0x31220, 0x31220, 0x31240, 0x31240, 0x31600, 0x3160c, 0x31a00, 0x31a1c, 0x31e00, 0x31e20, 0x31e38, 0x31e3c, 0x31e80, 0x31e80, 0x31e88, 0x31ea8, 0x31eb0, 0x31eb4, 0x31ec8, 0x31ed4, 0x31fb8, 0x32004, 0x32200, 0x32200, 0x32208, 0x32240, 0x32248, 0x32280, 0x32288, 0x322c0, 0x322c8, 0x322fc, 0x32600, 0x32630, 0x32a00, 0x32abc, 0x32b00, 0x32b10, 0x32b20, 0x32b30, 0x32b40, 0x32b50, 0x32b60, 0x32b70, 0x33000, 0x33028, 0x33030, 0x33048, 0x33060, 0x33068, 0x33070, 0x3309c, 0x330f0, 0x33128, 0x33130, 0x33148, 0x33160, 0x33168, 0x33170, 0x3319c, 0x331f0, 0x33238, 0x33240, 0x33240, 0x33248, 0x33250, 0x3325c, 0x33264, 0x33270, 0x332b8, 0x332c0, 0x332e4, 0x332f8, 0x33338, 0x33340, 0x33340, 0x33348, 0x33350, 0x3335c, 0x33364, 0x33370, 0x333b8, 0x333c0, 0x333e4, 0x333f8, 0x33428, 0x33430, 0x33448, 0x33460, 0x33468, 0x33470, 0x3349c, 0x334f0, 0x33528, 0x33530, 0x33548, 0x33560, 0x33568, 0x33570, 0x3359c, 0x335f0, 0x33638, 0x33640, 0x33640, 0x33648, 0x33650, 0x3365c, 0x33664, 0x33670, 0x336b8, 0x336c0, 0x336e4, 0x336f8, 0x33738, 0x33740, 0x33740, 0x33748, 0x33750, 0x3375c, 0x33764, 0x33770, 0x337b8, 0x337c0, 0x337e4, 0x337f8, 0x337fc, 0x33814, 0x33814, 0x3382c, 0x3382c, 0x33880, 0x3388c, 0x338e8, 0x338ec, 0x33900, 0x33928, 0x33930, 0x33948, 0x33960, 0x33968, 0x33970, 0x3399c, 0x339f0, 0x33a38, 0x33a40, 0x33a40, 0x33a48, 0x33a50, 0x33a5c, 0x33a64, 0x33a70, 0x33ab8, 0x33ac0, 0x33ae4, 0x33af8, 0x33b10, 0x33b28, 0x33b28, 0x33b3c, 0x33b50, 0x33bf0, 0x33c10, 0x33c28, 0x33c28, 0x33c3c, 0x33c50, 0x33cf0, 0x33cfc, 0x34000, 0x34030, 0x34100, 0x34144, 0x34190, 0x341a0, 0x341a8, 0x341b8, 0x341c4, 0x341c8, 0x341d0, 0x341d0, 0x34200, 0x34318, 0x34400, 0x344b4, 0x344c0, 0x3452c, 0x34540, 0x3461c, 0x34800, 0x34828, 0x34834, 0x34834, 0x348c0, 0x34908, 0x34910, 0x349ac, 0x34a00, 0x34a14, 0x34a1c, 0x34a2c, 0x34a44, 0x34a50, 0x34a74, 0x34a74, 0x34a7c, 0x34afc, 0x34b08, 0x34c24, 0x34d00, 0x34d00, 0x34d08, 0x34d14, 0x34d1c, 0x34d20, 0x34d3c, 0x34d3c, 0x34d48, 0x34d50, 0x35200, 0x3520c, 0x35220, 0x35220, 0x35240, 0x35240, 0x35600, 0x3560c, 
0x35a00, 0x35a1c, 0x35e00, 0x35e20, 0x35e38, 0x35e3c, 0x35e80, 0x35e80, 0x35e88, 0x35ea8, 0x35eb0, 0x35eb4, 0x35ec8, 0x35ed4, 0x35fb8, 0x36004, 0x36200, 0x36200, 0x36208, 0x36240, 0x36248, 0x36280, 0x36288, 0x362c0, 0x362c8, 0x362fc, 0x36600, 0x36630, 0x36a00, 0x36abc, 0x36b00, 0x36b10, 0x36b20, 0x36b30, 0x36b40, 0x36b50, 0x36b60, 0x36b70, 0x37000, 0x37028, 0x37030, 0x37048, 0x37060, 0x37068, 0x37070, 0x3709c, 0x370f0, 0x37128, 0x37130, 0x37148, 0x37160, 0x37168, 0x37170, 0x3719c, 0x371f0, 0x37238, 0x37240, 0x37240, 0x37248, 0x37250, 0x3725c, 0x37264, 0x37270, 0x372b8, 0x372c0, 0x372e4, 0x372f8, 0x37338, 0x37340, 0x37340, 0x37348, 0x37350, 0x3735c, 0x37364, 0x37370, 0x373b8, 0x373c0, 0x373e4, 0x373f8, 0x37428, 0x37430, 0x37448, 0x37460, 0x37468, 0x37470, 0x3749c, 0x374f0, 0x37528, 0x37530, 0x37548, 0x37560, 0x37568, 0x37570, 0x3759c, 0x375f0, 0x37638, 0x37640, 0x37640, 0x37648, 0x37650, 0x3765c, 0x37664, 0x37670, 0x376b8, 0x376c0, 0x376e4, 0x376f8, 0x37738, 0x37740, 0x37740, 0x37748, 0x37750, 0x3775c, 0x37764, 0x37770, 0x377b8, 0x377c0, 0x377e4, 0x377f8, 0x377fc, 0x37814, 0x37814, 0x3782c, 0x3782c, 0x37880, 0x3788c, 0x378e8, 0x378ec, 0x37900, 0x37928, 0x37930, 0x37948, 0x37960, 0x37968, 0x37970, 0x3799c, 0x379f0, 0x37a38, 0x37a40, 0x37a40, 0x37a48, 0x37a50, 0x37a5c, 0x37a64, 0x37a70, 0x37ab8, 0x37ac0, 0x37ae4, 0x37af8, 0x37b10, 0x37b28, 0x37b28, 0x37b3c, 0x37b50, 0x37bf0, 0x37c10, 0x37c28, 0x37c28, 0x37c3c, 0x37c50, 0x37cf0, 0x37cfc, 0x38000, 0x38030, 0x38100, 0x38144, 0x38190, 0x381a0, 0x381a8, 0x381b8, 0x381c4, 0x381c8, 0x381d0, 0x381d0, 0x38200, 0x38318, 0x38400, 0x384b4, 0x384c0, 0x3852c, 0x38540, 0x3861c, 0x38800, 0x38828, 0x38834, 0x38834, 0x388c0, 0x38908, 0x38910, 0x389ac, 0x38a00, 0x38a14, 0x38a1c, 0x38a2c, 0x38a44, 0x38a50, 0x38a74, 0x38a74, 0x38a7c, 0x38afc, 0x38b08, 0x38c24, 0x38d00, 0x38d00, 0x38d08, 0x38d14, 0x38d1c, 0x38d20, 0x38d3c, 0x38d3c, 0x38d48, 0x38d50, 0x39200, 0x3920c, 0x39220, 0x39220, 0x39240, 0x39240, 0x39600, 0x3960c, 0x39a00, 0x39a1c, 0x39e00, 0x39e20, 0x39e38, 0x39e3c, 0x39e80, 0x39e80, 0x39e88, 0x39ea8, 0x39eb0, 0x39eb4, 0x39ec8, 0x39ed4, 0x39fb8, 0x3a004, 0x3a200, 0x3a200, 0x3a208, 0x3a240, 0x3a248, 0x3a280, 0x3a288, 0x3a2c0, 0x3a2c8, 0x3a2fc, 0x3a600, 0x3a630, 0x3aa00, 0x3aabc, 0x3ab00, 0x3ab10, 0x3ab20, 0x3ab30, 0x3ab40, 0x3ab50, 0x3ab60, 0x3ab70, 0x3b000, 0x3b028, 0x3b030, 0x3b048, 0x3b060, 0x3b068, 0x3b070, 0x3b09c, 0x3b0f0, 0x3b128, 0x3b130, 0x3b148, 0x3b160, 0x3b168, 0x3b170, 0x3b19c, 0x3b1f0, 0x3b238, 0x3b240, 0x3b240, 0x3b248, 0x3b250, 0x3b25c, 0x3b264, 0x3b270, 0x3b2b8, 0x3b2c0, 0x3b2e4, 0x3b2f8, 0x3b338, 0x3b340, 0x3b340, 0x3b348, 0x3b350, 0x3b35c, 0x3b364, 0x3b370, 0x3b3b8, 0x3b3c0, 0x3b3e4, 0x3b3f8, 0x3b428, 0x3b430, 0x3b448, 0x3b460, 0x3b468, 0x3b470, 0x3b49c, 0x3b4f0, 0x3b528, 0x3b530, 0x3b548, 0x3b560, 0x3b568, 0x3b570, 0x3b59c, 0x3b5f0, 0x3b638, 0x3b640, 0x3b640, 0x3b648, 0x3b650, 0x3b65c, 0x3b664, 0x3b670, 0x3b6b8, 0x3b6c0, 0x3b6e4, 0x3b6f8, 0x3b738, 0x3b740, 0x3b740, 0x3b748, 0x3b750, 0x3b75c, 0x3b764, 0x3b770, 0x3b7b8, 0x3b7c0, 0x3b7e4, 0x3b7f8, 0x3b7fc, 0x3b814, 0x3b814, 0x3b82c, 0x3b82c, 0x3b880, 0x3b88c, 0x3b8e8, 0x3b8ec, 0x3b900, 0x3b928, 0x3b930, 0x3b948, 0x3b960, 0x3b968, 0x3b970, 0x3b99c, 0x3b9f0, 0x3ba38, 0x3ba40, 0x3ba40, 0x3ba48, 0x3ba50, 0x3ba5c, 0x3ba64, 0x3ba70, 0x3bab8, 0x3bac0, 0x3bae4, 0x3baf8, 0x3bb10, 0x3bb28, 0x3bb28, 0x3bb3c, 0x3bb50, 0x3bbf0, 0x3bc10, 0x3bc28, 0x3bc28, 0x3bc3c, 0x3bc50, 0x3bcf0, 0x3bcfc, 0x3c000, 0x3c030, 0x3c100, 0x3c144, 0x3c190, 0x3c1a0, 0x3c1a8, 0x3c1b8, 0x3c1c4, 0x3c1c8, 0x3c1d0, 0x3c1d0, 0x3c200, 
0x3c318, 0x3c400, 0x3c4b4, 0x3c4c0, 0x3c52c, 0x3c540, 0x3c61c, 0x3c800, 0x3c828, 0x3c834, 0x3c834, 0x3c8c0, 0x3c908, 0x3c910, 0x3c9ac, 0x3ca00, 0x3ca14, 0x3ca1c, 0x3ca2c, 0x3ca44, 0x3ca50, 0x3ca74, 0x3ca74, 0x3ca7c, 0x3cafc, 0x3cb08, 0x3cc24, 0x3cd00, 0x3cd00, 0x3cd08, 0x3cd14, 0x3cd1c, 0x3cd20, 0x3cd3c, 0x3cd3c, 0x3cd48, 0x3cd50, 0x3d200, 0x3d20c, 0x3d220, 0x3d220, 0x3d240, 0x3d240, 0x3d600, 0x3d60c, 0x3da00, 0x3da1c, 0x3de00, 0x3de20, 0x3de38, 0x3de3c, 0x3de80, 0x3de80, 0x3de88, 0x3dea8, 0x3deb0, 0x3deb4, 0x3dec8, 0x3ded4, 0x3dfb8, 0x3e004, 0x3e200, 0x3e200, 0x3e208, 0x3e240, 0x3e248, 0x3e280, 0x3e288, 0x3e2c0, 0x3e2c8, 0x3e2fc, 0x3e600, 0x3e630, 0x3ea00, 0x3eabc, 0x3eb00, 0x3eb10, 0x3eb20, 0x3eb30, 0x3eb40, 0x3eb50, 0x3eb60, 0x3eb70, 0x3f000, 0x3f028, 0x3f030, 0x3f048, 0x3f060, 0x3f068, 0x3f070, 0x3f09c, 0x3f0f0, 0x3f128, 0x3f130, 0x3f148, 0x3f160, 0x3f168, 0x3f170, 0x3f19c, 0x3f1f0, 0x3f238, 0x3f240, 0x3f240, 0x3f248, 0x3f250, 0x3f25c, 0x3f264, 0x3f270, 0x3f2b8, 0x3f2c0, 0x3f2e4, 0x3f2f8, 0x3f338, 0x3f340, 0x3f340, 0x3f348, 0x3f350, 0x3f35c, 0x3f364, 0x3f370, 0x3f3b8, 0x3f3c0, 0x3f3e4, 0x3f3f8, 0x3f428, 0x3f430, 0x3f448, 0x3f460, 0x3f468, 0x3f470, 0x3f49c, 0x3f4f0, 0x3f528, 0x3f530, 0x3f548, 0x3f560, 0x3f568, 0x3f570, 0x3f59c, 0x3f5f0, 0x3f638, 0x3f640, 0x3f640, 0x3f648, 0x3f650, 0x3f65c, 0x3f664, 0x3f670, 0x3f6b8, 0x3f6c0, 0x3f6e4, 0x3f6f8, 0x3f738, 0x3f740, 0x3f740, 0x3f748, 0x3f750, 0x3f75c, 0x3f764, 0x3f770, 0x3f7b8, 0x3f7c0, 0x3f7e4, 0x3f7f8, 0x3f7fc, 0x3f814, 0x3f814, 0x3f82c, 0x3f82c, 0x3f880, 0x3f88c, 0x3f8e8, 0x3f8ec, 0x3f900, 0x3f928, 0x3f930, 0x3f948, 0x3f960, 0x3f968, 0x3f970, 0x3f99c, 0x3f9f0, 0x3fa38, 0x3fa40, 0x3fa40, 0x3fa48, 0x3fa50, 0x3fa5c, 0x3fa64, 0x3fa70, 0x3fab8, 0x3fac0, 0x3fae4, 0x3faf8, 0x3fb10, 0x3fb28, 0x3fb28, 0x3fb3c, 0x3fb50, 0x3fbf0, 0x3fc10, 0x3fc28, 0x3fc28, 0x3fc3c, 0x3fc50, 0x3fcf0, 0x3fcfc, 0x40000, 0x4000c, 0x40040, 0x40050, 0x40060, 0x40068, 0x4007c, 0x4008c, 0x40094, 0x400b0, 0x400c0, 0x40144, 0x40180, 0x4018c, 0x40200, 0x40254, 0x40260, 0x40264, 0x40270, 0x40288, 0x40290, 0x40298, 0x402ac, 0x402c8, 0x402d0, 0x402e0, 0x402f0, 0x402f0, 0x40300, 0x4033c, 0x403f8, 0x403fc, 0x41304, 0x413c4, 0x41400, 0x4140c, 0x41414, 0x4141c, 0x41480, 0x414d0, 0x44000, 0x44054, 0x4405c, 0x44078, 0x440c0, 0x44174, 0x44180, 0x441ac, 0x441b4, 0x441b8, 0x441c0, 0x44254, 0x4425c, 0x44278, 0x442c0, 0x44374, 0x44380, 0x443ac, 0x443b4, 0x443b8, 0x443c0, 0x44454, 0x4445c, 0x44478, 0x444c0, 0x44574, 0x44580, 0x445ac, 0x445b4, 0x445b8, 0x445c0, 0x44654, 0x4465c, 0x44678, 0x446c0, 0x44774, 0x44780, 0x447ac, 0x447b4, 0x447b8, 0x447c0, 0x44854, 0x4485c, 0x44878, 0x448c0, 0x44974, 0x44980, 0x449ac, 0x449b4, 0x449b8, 0x449c0, 0x449fc, 0x45000, 0x45004, 0x45010, 0x45030, 0x45040, 0x45060, 0x45068, 0x45068, 0x45080, 0x45084, 0x450a0, 0x450b0, 0x45200, 0x45204, 0x45210, 0x45230, 0x45240, 0x45260, 0x45268, 0x45268, 0x45280, 0x45284, 0x452a0, 0x452b0, 0x460c0, 0x460e4, 0x47000, 0x4703c, 0x47044, 0x4708c, 0x47200, 0x47250, 0x47400, 0x47408, 0x47414, 0x47420, 0x47600, 0x47618, 0x47800, 0x47814, 0x48000, 0x4800c, 0x48040, 0x48050, 0x48060, 0x48068, 0x4807c, 0x4808c, 0x48094, 0x480b0, 0x480c0, 0x48144, 0x48180, 0x4818c, 0x48200, 0x48254, 0x48260, 0x48264, 0x48270, 0x48288, 0x48290, 0x48298, 0x482ac, 0x482c8, 0x482d0, 0x482e0, 0x482f0, 0x482f0, 0x48300, 0x4833c, 0x483f8, 0x483fc, 0x49304, 0x493c4, 0x49400, 0x4940c, 0x49414, 0x4941c, 0x49480, 0x494d0, 0x4c000, 0x4c054, 0x4c05c, 0x4c078, 0x4c0c0, 0x4c174, 0x4c180, 0x4c1ac, 0x4c1b4, 0x4c1b8, 0x4c1c0, 0x4c254, 0x4c25c, 0x4c278, 0x4c2c0, 0x4c374, 
0x4c380, 0x4c3ac, 0x4c3b4, 0x4c3b8, 0x4c3c0, 0x4c454, 0x4c45c, 0x4c478, 0x4c4c0, 0x4c574, 0x4c580, 0x4c5ac, 0x4c5b4, 0x4c5b8, 0x4c5c0, 0x4c654, 0x4c65c, 0x4c678, 0x4c6c0, 0x4c774, 0x4c780, 0x4c7ac, 0x4c7b4, 0x4c7b8, 0x4c7c0, 0x4c854, 0x4c85c, 0x4c878, 0x4c8c0, 0x4c974, 0x4c980, 0x4c9ac, 0x4c9b4, 0x4c9b8, 0x4c9c0, 0x4c9fc, 0x4d000, 0x4d004, 0x4d010, 0x4d030, 0x4d040, 0x4d060, 0x4d068, 0x4d068, 0x4d080, 0x4d084, 0x4d0a0, 0x4d0b0, 0x4d200, 0x4d204, 0x4d210, 0x4d230, 0x4d240, 0x4d260, 0x4d268, 0x4d268, 0x4d280, 0x4d284, 0x4d2a0, 0x4d2b0, 0x4e0c0, 0x4e0e4, 0x4f000, 0x4f03c, 0x4f044, 0x4f08c, 0x4f200, 0x4f250, 0x4f400, 0x4f408, 0x4f414, 0x4f420, 0x4f600, 0x4f618, 0x4f800, 0x4f814, 0x50000, 0x50084, 0x50090, 0x500cc, 0x50400, 0x50400, 0x50800, 0x50884, 0x50890, 0x508cc, 0x50c00, 0x50c00, 0x51000, 0x5101c, 0x51300, 0x51308, }; static const unsigned int t5vf_reg_ranges[] = { VF_SGE_REG(A_SGE_VF_KDOORBELL), VF_SGE_REG(A_SGE_VF_GTS), VF_MPS_REG(A_MPS_VF_CTL), VF_MPS_REG(A_MPS_VF_STAT_RX_VF_ERR_FRAMES_H), VF_PL_REG(A_PL_VF_WHOAMI), VF_PL_REG(A_PL_VF_REVISION), VF_CIM_REG(A_CIM_VF_EXT_MAILBOX_CTRL), VF_CIM_REG(A_CIM_VF_EXT_MAILBOX_STATUS), FW_T4VF_MBDATA_BASE_ADDR, FW_T4VF_MBDATA_BASE_ADDR + ((NUM_CIM_PF_MAILBOX_DATA_INSTANCES - 1) * 4), }; static const unsigned int t6_reg_ranges[] = { 0x1008, 0x101c, 0x1024, 0x10a8, 0x10b4, 0x10f8, 0x1100, 0x1114, 0x111c, 0x112c, 0x1138, 0x113c, 0x1144, 0x114c, 0x1180, 0x1184, 0x1190, 0x1194, 0x11a0, 0x11a4, 0x11b0, 0x11b4, 0x11fc, 0x1274, 0x1280, 0x133c, 0x1800, 0x18fc, 0x3000, 0x302c, 0x3060, 0x30b0, 0x30b8, 0x30d8, 0x30e0, 0x30fc, 0x3140, 0x357c, 0x35a8, 0x35cc, 0x35ec, 0x35ec, 0x3600, 0x5624, 0x56cc, 0x56ec, 0x56f4, 0x5720, 0x5728, 0x575c, 0x580c, 0x5814, 0x5890, 0x589c, 0x58a4, 0x58ac, 0x58b8, 0x58bc, 0x5940, 0x595c, 0x5980, 0x598c, 0x59b0, 0x59c8, 0x59d0, 0x59dc, 0x59fc, 0x5a18, 0x5a60, 0x5a6c, 0x5a80, 0x5a8c, 0x5a94, 0x5a9c, 0x5b94, 0x5bfc, 0x5c10, 0x5e48, 0x5e50, 0x5e94, 0x5ea0, 0x5eb0, 0x5ec0, 0x5ec0, 0x5ec8, 0x5ed0, 0x5ee0, 0x5ee0, 0x5ef0, 0x5ef0, 0x5f00, 0x5f00, 0x6000, 0x6020, 0x6028, 0x6040, 0x6058, 0x609c, 0x60a8, 0x619c, 0x7700, 0x7798, 0x77c0, 0x7880, 0x78cc, 0x78fc, 0x7b00, 0x7b58, 0x7b60, 0x7b84, 0x7b8c, 0x7c54, 0x7d00, 0x7d38, 0x7d40, 0x7d84, 0x7d8c, 0x7ddc, 0x7de4, 0x7e04, 0x7e10, 0x7e1c, 0x7e24, 0x7e38, 0x7e40, 0x7e44, 0x7e4c, 0x7e78, 0x7e80, 0x7edc, 0x7ee8, 0x7efc, 0x8dc0, 0x8de4, 0x8df8, 0x8e04, 0x8e10, 0x8e84, 0x8ea0, 0x8f88, 0x8fb8, 0x9058, 0x9060, 0x9060, 0x9068, 0x90f8, 0x9100, 0x9124, 0x9400, 0x9470, 0x9600, 0x9600, 0x9608, 0x9638, 0x9640, 0x9704, 0x9710, 0x971c, 0x9800, 0x9808, 0x9820, 0x983c, 0x9850, 0x9864, 0x9c00, 0x9c6c, 0x9c80, 0x9cec, 0x9d00, 0x9d6c, 0x9d80, 0x9dec, 0x9e00, 0x9e6c, 0x9e80, 0x9eec, 0x9f00, 0x9f6c, 0x9f80, 0xa020, 0xd004, 0xd03c, 0xd100, 0xd118, 0xd200, 0xd214, 0xd220, 0xd234, 0xd240, 0xd254, 0xd260, 0xd274, 0xd280, 0xd294, 0xd2a0, 0xd2b4, 0xd2c0, 0xd2d4, 0xd2e0, 0xd2f4, 0xd300, 0xd31c, 0xdfc0, 0xdfe0, 0xe000, 0xf008, 0xf010, 0xf018, 0xf020, 0xf028, 0x11000, 0x11014, 0x11048, 0x1106c, 0x11074, 0x11088, 0x11098, 0x11120, 0x1112c, 0x1117c, 0x11190, 0x112e0, 0x11300, 0x1130c, 0x12000, 0x1206c, 0x19040, 0x1906c, 0x19078, 0x19080, 0x1908c, 0x190e8, 0x190f0, 0x190f8, 0x19100, 0x19110, 0x19120, 0x19124, 0x19150, 0x19194, 0x1919c, 0x191b0, 0x191d0, 0x191e8, 0x19238, 0x19290, 0x192a4, 0x192b0, 0x192bc, 0x192bc, 0x19348, 0x1934c, 0x193f8, 0x19418, 0x19420, 0x19428, 0x19430, 0x19444, 0x1944c, 0x1946c, 0x19474, 0x19474, 0x19490, 0x194cc, 0x194f0, 0x194f8, 0x19c00, 0x19c48, 0x19c50, 0x19c80, 0x19c94, 0x19c98, 0x19ca0, 0x19cbc, 
0x19ce4, 0x19ce4, 0x19cf0, 0x19cf8, 0x19d00, 0x19d28, 0x19d50, 0x19d78, 0x19d94, 0x19d98, 0x19da0, 0x19dc8, 0x19df0, 0x19e10, 0x19e50, 0x19e6c, 0x19ea0, 0x19ebc, 0x19ec4, 0x19ef4, 0x19f04, 0x19f2c, 0x19f34, 0x19f34, 0x19f40, 0x19f50, 0x19f90, 0x19fac, 0x19fc4, 0x19fc8, 0x19fd0, 0x19fe4, 0x1a000, 0x1a004, 0x1a010, 0x1a06c, 0x1a0b0, 0x1a0e4, 0x1a0ec, 0x1a0f8, 0x1a100, 0x1a108, 0x1a114, 0x1a120, 0x1a128, 0x1a130, 0x1a138, 0x1a138, 0x1a190, 0x1a1c4, 0x1a1fc, 0x1a1fc, 0x1e008, 0x1e00c, 0x1e040, 0x1e044, 0x1e04c, 0x1e04c, 0x1e284, 0x1e290, 0x1e2c0, 0x1e2c0, 0x1e2e0, 0x1e2e0, 0x1e300, 0x1e384, 0x1e3c0, 0x1e3c8, 0x1e408, 0x1e40c, 0x1e440, 0x1e444, 0x1e44c, 0x1e44c, 0x1e684, 0x1e690, 0x1e6c0, 0x1e6c0, 0x1e6e0, 0x1e6e0, 0x1e700, 0x1e784, 0x1e7c0, 0x1e7c8, 0x1e808, 0x1e80c, 0x1e840, 0x1e844, 0x1e84c, 0x1e84c, 0x1ea84, 0x1ea90, 0x1eac0, 0x1eac0, 0x1eae0, 0x1eae0, 0x1eb00, 0x1eb84, 0x1ebc0, 0x1ebc8, 0x1ec08, 0x1ec0c, 0x1ec40, 0x1ec44, 0x1ec4c, 0x1ec4c, 0x1ee84, 0x1ee90, 0x1eec0, 0x1eec0, 0x1eee0, 0x1eee0, 0x1ef00, 0x1ef84, 0x1efc0, 0x1efc8, 0x1f008, 0x1f00c, 0x1f040, 0x1f044, 0x1f04c, 0x1f04c, 0x1f284, 0x1f290, 0x1f2c0, 0x1f2c0, 0x1f2e0, 0x1f2e0, 0x1f300, 0x1f384, 0x1f3c0, 0x1f3c8, 0x1f408, 0x1f40c, 0x1f440, 0x1f444, 0x1f44c, 0x1f44c, 0x1f684, 0x1f690, 0x1f6c0, 0x1f6c0, 0x1f6e0, 0x1f6e0, 0x1f700, 0x1f784, 0x1f7c0, 0x1f7c8, 0x1f808, 0x1f80c, 0x1f840, 0x1f844, 0x1f84c, 0x1f84c, 0x1fa84, 0x1fa90, 0x1fac0, 0x1fac0, 0x1fae0, 0x1fae0, 0x1fb00, 0x1fb84, 0x1fbc0, 0x1fbc8, 0x1fc08, 0x1fc0c, 0x1fc40, 0x1fc44, 0x1fc4c, 0x1fc4c, 0x1fe84, 0x1fe90, 0x1fec0, 0x1fec0, 0x1fee0, 0x1fee0, 0x1ff00, 0x1ff84, 0x1ffc0, 0x1ffc8, 0x30000, 0x30030, 0x30100, 0x30168, 0x30190, 0x301a0, 0x301a8, 0x301b8, 0x301c4, 0x301c8, 0x301d0, 0x301d0, 0x30200, 0x30320, 0x30400, 0x304b4, 0x304c0, 0x3052c, 0x30540, 0x3061c, 0x30800, 0x308a0, 0x308c0, 0x30908, 0x30910, 0x309b8, 0x30a00, 0x30a04, 0x30a0c, 0x30a14, 0x30a1c, 0x30a2c, 0x30a44, 0x30a50, 0x30a74, 0x30a74, 0x30a7c, 0x30afc, 0x30b08, 0x30c24, 0x30d00, 0x30d14, 0x30d1c, 0x30d3c, 0x30d44, 0x30d4c, 0x30d54, 0x30d74, 0x30d7c, 0x30d7c, 0x30de0, 0x30de0, 0x30e00, 0x30ed4, 0x30f00, 0x30fa4, 0x30fc0, 0x30fc4, 0x31000, 0x31004, 0x31080, 0x310fc, 0x31208, 0x31220, 0x3123c, 0x31254, 0x31300, 0x31300, 0x31308, 0x3131c, 0x31338, 0x3133c, 0x31380, 0x31380, 0x31388, 0x313a8, 0x313b4, 0x313b4, 0x31400, 0x31420, 0x31438, 0x3143c, 0x31480, 0x31480, 0x314a8, 0x314a8, 0x314b0, 0x314b4, 0x314c8, 0x314d4, 0x31a40, 0x31a4c, 0x31af0, 0x31b20, 0x31b38, 0x31b3c, 0x31b80, 0x31b80, 0x31ba8, 0x31ba8, 0x31bb0, 0x31bb4, 0x31bc8, 0x31bd4, 0x32140, 0x3218c, 0x321f0, 0x321f4, 0x32200, 0x32200, 0x32218, 0x32218, 0x32400, 0x32400, 0x32408, 0x3241c, 0x32618, 0x32620, 0x32664, 0x32664, 0x326a8, 0x326a8, 0x326ec, 0x326ec, 0x32a00, 0x32abc, 0x32b00, 0x32b18, 0x32b20, 0x32b38, 0x32b40, 0x32b58, 0x32b60, 0x32b78, 0x32c00, 0x32c00, 0x32c08, 0x32c3c, 0x33000, 0x3302c, 0x33034, 0x33050, 0x33058, 0x33058, 0x33060, 0x3308c, 0x3309c, 0x330ac, 0x330c0, 0x330c0, 0x330c8, 0x330d0, 0x330d8, 0x330e0, 0x330ec, 0x3312c, 0x33134, 0x33150, 0x33158, 0x33158, 0x33160, 0x3318c, 0x3319c, 0x331ac, 0x331c0, 0x331c0, 0x331c8, 0x331d0, 0x331d8, 0x331e0, 0x331ec, 0x33290, 0x33298, 0x332c4, 0x332e4, 0x33390, 0x33398, 0x333c4, 0x333e4, 0x3342c, 0x33434, 0x33450, 0x33458, 0x33458, 0x33460, 0x3348c, 0x3349c, 0x334ac, 0x334c0, 0x334c0, 0x334c8, 0x334d0, 0x334d8, 0x334e0, 0x334ec, 0x3352c, 0x33534, 0x33550, 0x33558, 0x33558, 0x33560, 0x3358c, 0x3359c, 0x335ac, 0x335c0, 0x335c0, 0x335c8, 0x335d0, 0x335d8, 0x335e0, 0x335ec, 0x33690, 0x33698, 0x336c4, 0x336e4, 
0x33790, 0x33798, 0x337c4, 0x337e4, 0x337fc, 0x33814, 0x33814, 0x33854, 0x33868, 0x33880, 0x3388c, 0x338c0, 0x338d0, 0x338e8, 0x338ec, 0x33900, 0x3392c, 0x33934, 0x33950, 0x33958, 0x33958, 0x33960, 0x3398c, 0x3399c, 0x339ac, 0x339c0, 0x339c0, 0x339c8, 0x339d0, 0x339d8, 0x339e0, 0x339ec, 0x33a90, 0x33a98, 0x33ac4, 0x33ae4, 0x33b10, 0x33b24, 0x33b28, 0x33b38, 0x33b50, 0x33bf0, 0x33c10, 0x33c24, 0x33c28, 0x33c38, 0x33c50, 0x33cf0, 0x33cfc, 0x34000, 0x34030, 0x34100, 0x34168, 0x34190, 0x341a0, 0x341a8, 0x341b8, 0x341c4, 0x341c8, 0x341d0, 0x341d0, 0x34200, 0x34320, 0x34400, 0x344b4, 0x344c0, 0x3452c, 0x34540, 0x3461c, 0x34800, 0x348a0, 0x348c0, 0x34908, 0x34910, 0x349b8, 0x34a00, 0x34a04, 0x34a0c, 0x34a14, 0x34a1c, 0x34a2c, 0x34a44, 0x34a50, 0x34a74, 0x34a74, 0x34a7c, 0x34afc, 0x34b08, 0x34c24, 0x34d00, 0x34d14, 0x34d1c, 0x34d3c, 0x34d44, 0x34d4c, 0x34d54, 0x34d74, 0x34d7c, 0x34d7c, 0x34de0, 0x34de0, 0x34e00, 0x34ed4, 0x34f00, 0x34fa4, 0x34fc0, 0x34fc4, 0x35000, 0x35004, 0x35080, 0x350fc, 0x35208, 0x35220, 0x3523c, 0x35254, 0x35300, 0x35300, 0x35308, 0x3531c, 0x35338, 0x3533c, 0x35380, 0x35380, 0x35388, 0x353a8, 0x353b4, 0x353b4, 0x35400, 0x35420, 0x35438, 0x3543c, 0x35480, 0x35480, 0x354a8, 0x354a8, 0x354b0, 0x354b4, 0x354c8, 0x354d4, 0x35a40, 0x35a4c, 0x35af0, 0x35b20, 0x35b38, 0x35b3c, 0x35b80, 0x35b80, 0x35ba8, 0x35ba8, 0x35bb0, 0x35bb4, 0x35bc8, 0x35bd4, 0x36140, 0x3618c, 0x361f0, 0x361f4, 0x36200, 0x36200, 0x36218, 0x36218, 0x36400, 0x36400, 0x36408, 0x3641c, 0x36618, 0x36620, 0x36664, 0x36664, 0x366a8, 0x366a8, 0x366ec, 0x366ec, 0x36a00, 0x36abc, 0x36b00, 0x36b18, 0x36b20, 0x36b38, 0x36b40, 0x36b58, 0x36b60, 0x36b78, 0x36c00, 0x36c00, 0x36c08, 0x36c3c, 0x37000, 0x3702c, 0x37034, 0x37050, 0x37058, 0x37058, 0x37060, 0x3708c, 0x3709c, 0x370ac, 0x370c0, 0x370c0, 0x370c8, 0x370d0, 0x370d8, 0x370e0, 0x370ec, 0x3712c, 0x37134, 0x37150, 0x37158, 0x37158, 0x37160, 0x3718c, 0x3719c, 0x371ac, 0x371c0, 0x371c0, 0x371c8, 0x371d0, 0x371d8, 0x371e0, 0x371ec, 0x37290, 0x37298, 0x372c4, 0x372e4, 0x37390, 0x37398, 0x373c4, 0x373e4, 0x3742c, 0x37434, 0x37450, 0x37458, 0x37458, 0x37460, 0x3748c, 0x3749c, 0x374ac, 0x374c0, 0x374c0, 0x374c8, 0x374d0, 0x374d8, 0x374e0, 0x374ec, 0x3752c, 0x37534, 0x37550, 0x37558, 0x37558, 0x37560, 0x3758c, 0x3759c, 0x375ac, 0x375c0, 0x375c0, 0x375c8, 0x375d0, 0x375d8, 0x375e0, 0x375ec, 0x37690, 0x37698, 0x376c4, 0x376e4, 0x37790, 0x37798, 0x377c4, 0x377e4, 0x377fc, 0x37814, 0x37814, 0x37854, 0x37868, 0x37880, 0x3788c, 0x378c0, 0x378d0, 0x378e8, 0x378ec, 0x37900, 0x3792c, 0x37934, 0x37950, 0x37958, 0x37958, 0x37960, 0x3798c, 0x3799c, 0x379ac, 0x379c0, 0x379c0, 0x379c8, 0x379d0, 0x379d8, 0x379e0, 0x379ec, 0x37a90, 0x37a98, 0x37ac4, 0x37ae4, 0x37b10, 0x37b24, 0x37b28, 0x37b38, 0x37b50, 0x37bf0, 0x37c10, 0x37c24, 0x37c28, 0x37c38, 0x37c50, 0x37cf0, 0x37cfc, 0x40040, 0x40040, 0x40080, 0x40084, 0x40100, 0x40100, 0x40140, 0x401bc, 0x40200, 0x40214, 0x40228, 0x40228, 0x40240, 0x40258, 0x40280, 0x40280, 0x40304, 0x40304, 0x40330, 0x4033c, 0x41304, 0x413c8, 0x413d0, 0x413dc, 0x413f0, 0x413f0, 0x41400, 0x4140c, 0x41414, 0x4141c, 0x41480, 0x414d0, 0x44000, 0x4407c, 0x440c0, 0x441ac, 0x441b4, 0x4427c, 0x442c0, 0x443ac, 0x443b4, 0x4447c, 0x444c0, 0x445ac, 0x445b4, 0x4467c, 0x446c0, 0x447ac, 0x447b4, 0x4487c, 0x448c0, 0x449ac, 0x449b4, 0x44a7c, 0x44ac0, 0x44bac, 0x44bb4, 0x44c7c, 0x44cc0, 0x44dac, 0x44db4, 0x44e7c, 0x44ec0, 0x44fac, 0x44fb4, 0x4507c, 0x450c0, 0x451ac, 0x451b4, 0x451fc, 0x45800, 0x45804, 0x45810, 0x45830, 0x45840, 0x45860, 0x45868, 0x45868, 0x45880, 0x45884, 0x458a0, 0x458b0, 
0x45a00, 0x45a04, 0x45a10, 0x45a30, 0x45a40, 0x45a60, 0x45a68, 0x45a68, 0x45a80, 0x45a84, 0x45aa0, 0x45ab0, 0x460c0, 0x460e4, 0x47000, 0x4703c, 0x47044, 0x4708c, 0x47200, 0x47250, 0x47400, 0x47408, 0x47414, 0x47420, 0x47600, 0x47618, 0x47800, 0x47814, 0x47820, 0x4782c, 0x50000, 0x50084, 0x50090, 0x500cc, 0x50300, 0x50384, 0x50400, 0x50400, 0x50800, 0x50884, 0x50890, 0x508cc, 0x50b00, 0x50b84, 0x50c00, 0x50c00, 0x51000, 0x51020, 0x51028, 0x510b0, 0x51300, 0x51324, }; static const unsigned int t6vf_reg_ranges[] = { VF_SGE_REG(A_SGE_VF_KDOORBELL), VF_SGE_REG(A_SGE_VF_GTS), VF_MPS_REG(A_MPS_VF_CTL), VF_MPS_REG(A_MPS_VF_STAT_RX_VF_ERR_FRAMES_H), VF_PL_REG(A_PL_VF_WHOAMI), VF_PL_REG(A_PL_VF_REVISION), VF_CIM_REG(A_CIM_VF_EXT_MAILBOX_CTRL), VF_CIM_REG(A_CIM_VF_EXT_MAILBOX_STATUS), FW_T6VF_MBDATA_BASE_ADDR, FW_T6VF_MBDATA_BASE_ADDR + ((NUM_CIM_PF_MAILBOX_DATA_INSTANCES - 1) * 4), }; u32 *buf_end = (u32 *)(buf + buf_size); const unsigned int *reg_ranges; int reg_ranges_size, range; unsigned int chip_version = chip_id(adap); /* * Select the right set of register ranges to dump depending on the * adapter chip type. */ switch (chip_version) { case CHELSIO_T4: if (adap->flags & IS_VF) { reg_ranges = t4vf_reg_ranges; reg_ranges_size = ARRAY_SIZE(t4vf_reg_ranges); } else { reg_ranges = t4_reg_ranges; reg_ranges_size = ARRAY_SIZE(t4_reg_ranges); } break; case CHELSIO_T5: if (adap->flags & IS_VF) { reg_ranges = t5vf_reg_ranges; reg_ranges_size = ARRAY_SIZE(t5vf_reg_ranges); } else { reg_ranges = t5_reg_ranges; reg_ranges_size = ARRAY_SIZE(t5_reg_ranges); } break; case CHELSIO_T6: if (adap->flags & IS_VF) { reg_ranges = t6vf_reg_ranges; reg_ranges_size = ARRAY_SIZE(t6vf_reg_ranges); } else { reg_ranges = t6_reg_ranges; reg_ranges_size = ARRAY_SIZE(t6_reg_ranges); } break; default: CH_ERR(adap, "Unsupported chip version %d\n", chip_version); return; } /* * Clear the register buffer and insert the appropriate register * values selected by the above register ranges. */ memset(buf, 0, buf_size); for (range = 0; range < reg_ranges_size; range += 2) { unsigned int reg = reg_ranges[range]; unsigned int last_reg = reg_ranges[range + 1]; u32 *bufp = (u32 *)(buf + reg); /* * Iterate across the register range filling in the register * buffer but don't write past the end of the register buffer. */ while (reg <= last_reg && bufp < buf_end) { *bufp++ = t4_read_reg(adap, reg); reg += sizeof(u32); } } } /* * Partial EEPROM Vital Product Data structure. Includes only the ID and * VPD-R sections. */ struct t4_vpd_hdr { u8 id_tag; u8 id_len[2]; u8 id_data[ID_LEN]; u8 vpdr_tag; u8 vpdr_len[2]; }; /* * EEPROM reads take a few tens of us while writes can take a bit over 5 ms. */ #define EEPROM_DELAY 10 /* 10us per poll spin */ #define EEPROM_MAX_POLL 5000 /* x 5000 == 50ms */ #define EEPROM_STAT_ADDR 0x7bfc #define VPD_SIZE 0x800 #define VPD_BASE 0x400 #define VPD_BASE_OLD 0 #define VPD_LEN 1024 #define VPD_INFO_FLD_HDR_SIZE 3 #define CHELSIO_VPD_UNIQUE_ID 0x82 /* * Small utility function to wait till any outstanding VPD Access is complete. * We have a per-adapter state variable "VPD Busy" to indicate when we have a * VPD Access in flight. This allows us to handle the problem of having a * previous VPD Access time out and prevent an attempt to inject a new VPD * Request before any in-flight VPD reguest has completed. */ static int t4_seeprom_wait(struct adapter *adapter) { unsigned int base = adapter->params.pci.vpd_cap_addr; int max_poll; /* * If no VPD Access is in flight, we can just return success right * away. 
*/ if (!adapter->vpd_busy) return 0; /* * Poll the VPD Capability Address/Flag register waiting for it * to indicate that the operation is complete. */ max_poll = EEPROM_MAX_POLL; do { u16 val; udelay(EEPROM_DELAY); t4_os_pci_read_cfg2(adapter, base + PCI_VPD_ADDR, &val); /* * If the operation is complete, mark the VPD as no longer * busy and return success. */ if ((val & PCI_VPD_ADDR_F) == adapter->vpd_flag) { adapter->vpd_busy = 0; return 0; } } while (--max_poll); /* * Failure! Note that we leave the VPD Busy status set in order to * avoid pushing a new VPD Access request into the VPD Capability till * the current operation eventually succeeds. It's a bug to issue a * new request when an existing request is in flight and will result * in corrupt hardware state. */ return -ETIMEDOUT; } /** * t4_seeprom_read - read a serial EEPROM location * @adapter: adapter to read * @addr: EEPROM virtual address * @data: where to store the read data * * Read a 32-bit word from a location in serial EEPROM using the card's PCI * VPD capability. Note that this function must be called with a virtual * address. */ int t4_seeprom_read(struct adapter *adapter, u32 addr, u32 *data) { unsigned int base = adapter->params.pci.vpd_cap_addr; int ret; /* * VPD Accesses must always be 4-byte aligned! */ if (addr >= EEPROMVSIZE || (addr & 3)) return -EINVAL; /* * Wait for any previous operation which may still be in flight to * complete. */ ret = t4_seeprom_wait(adapter); if (ret) { CH_ERR(adapter, "VPD still busy from previous operation\n"); return ret; } /* * Issue our new VPD Read request, mark the VPD as being busy and wait * for our request to complete. If it doesn't complete, note the * error and return it to our caller. Note that we do not reset the * VPD Busy status! */ t4_os_pci_write_cfg2(adapter, base + PCI_VPD_ADDR, (u16)addr); adapter->vpd_busy = 1; adapter->vpd_flag = PCI_VPD_ADDR_F; ret = t4_seeprom_wait(adapter); if (ret) { CH_ERR(adapter, "VPD read of address %#x failed\n", addr); return ret; } /* * Grab the returned data, swizzle it into our endianness and * return success. */ t4_os_pci_read_cfg4(adapter, base + PCI_VPD_DATA, data); *data = le32_to_cpu(*data); return 0; } /** * t4_seeprom_write - write a serial EEPROM location * @adapter: adapter to write * @addr: virtual EEPROM address * @data: value to write * * Write a 32-bit word to a location in serial EEPROM using the card's PCI * VPD capability. Note that this function must be called with a virtual * address. */ int t4_seeprom_write(struct adapter *adapter, u32 addr, u32 data) { unsigned int base = adapter->params.pci.vpd_cap_addr; int ret; u32 stats_reg; int max_poll; /* * VPD Accesses must always be 4-byte aligned! */ if (addr >= EEPROMVSIZE || (addr & 3)) return -EINVAL; /* * Wait for any previous operation which may still be in flight to * complete. */ ret = t4_seeprom_wait(adapter); if (ret) { CH_ERR(adapter, "VPD still busy from previous operation\n"); return ret; } /* * Issue our new VPD Write request, mark the VPD as being busy and wait * for our request to complete. If it doesn't complete, note the * error and return it to our caller. Note that we do not reset the * VPD Busy status! 
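* (For a write the request is issued with PCI_VPD_ADDR_F set and the hardware clears the flag on completion, so vpd_flag is left at 0 below; the read path above does the opposite and waits for the flag to become set.)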
*/ t4_os_pci_write_cfg4(adapter, base + PCI_VPD_DATA, cpu_to_le32(data)); t4_os_pci_write_cfg2(adapter, base + PCI_VPD_ADDR, (u16)addr | PCI_VPD_ADDR_F); adapter->vpd_busy = 1; adapter->vpd_flag = 0; ret = t4_seeprom_wait(adapter); if (ret) { CH_ERR(adapter, "VPD write of address %#x failed\n", addr); return ret; } /* * Reset PCI_VPD_DATA register after a transaction and wait for our * request to complete. If it doesn't complete, return error. */ t4_os_pci_write_cfg4(adapter, base + PCI_VPD_DATA, 0); max_poll = EEPROM_MAX_POLL; do { udelay(EEPROM_DELAY); t4_seeprom_read(adapter, EEPROM_STAT_ADDR, &stats_reg); } while ((stats_reg & 0x1) && --max_poll); if (!max_poll) return -ETIMEDOUT; /* Return success! */ return 0; } /** * t4_eeprom_ptov - translate a physical EEPROM address to virtual * @phys_addr: the physical EEPROM address * @fn: the PCI function number * @sz: size of function-specific area * * Translate a physical EEPROM address to virtual. The first 1K is * accessed through virtual addresses starting at 31K, the rest is * accessed through virtual addresses starting at 0. * * The mapping is as follows: * [0..1K) -> [31K..32K) * [1K..1K+A) -> [ES-A..ES) * [1K+A..ES) -> [0..ES-A-1K) * * where A = @fn * @sz, and ES = EEPROM size. */ int t4_eeprom_ptov(unsigned int phys_addr, unsigned int fn, unsigned int sz) { fn *= sz; if (phys_addr < 1024) return phys_addr + (31 << 10); if (phys_addr < 1024 + fn) return EEPROMSIZE - fn + phys_addr - 1024; if (phys_addr < EEPROMSIZE) return phys_addr - 1024 - fn; return -EINVAL; } /** * t4_seeprom_wp - enable/disable EEPROM write protection * @adapter: the adapter * @enable: whether to enable or disable write protection * * Enables or disables write protection on the serial EEPROM. */ int t4_seeprom_wp(struct adapter *adapter, int enable) { return t4_seeprom_write(adapter, EEPROM_STAT_ADDR, enable ? 0xc : 0); } /** * get_vpd_keyword_val - Locates an information field keyword in the VPD * @v: Pointer to buffered vpd data structure * @kw: The keyword to search for * * Returns the value of the information field keyword or * -ENOENT otherwise. */ static int get_vpd_keyword_val(const struct t4_vpd_hdr *v, const char *kw) { int i; unsigned int offset , len; const u8 *buf = (const u8 *)v; const u8 *vpdr_len = &v->vpdr_len[0]; offset = sizeof(struct t4_vpd_hdr); len = (u16)vpdr_len[0] + ((u16)vpdr_len[1] << 8); if (len + sizeof(struct t4_vpd_hdr) > VPD_LEN) { return -ENOENT; } for (i = offset; i + VPD_INFO_FLD_HDR_SIZE <= offset + len;) { if(memcmp(buf + i , kw , 2) == 0){ i += VPD_INFO_FLD_HDR_SIZE; return i; } i += VPD_INFO_FLD_HDR_SIZE + buf[i+2]; } return -ENOENT; } /** * get_vpd_params - read VPD parameters from VPD EEPROM * @adapter: adapter to read * @p: where to store the parameters * @vpd: caller provided temporary space to read the VPD into * * Reads card parameters stored in VPD EEPROM. */ static int get_vpd_params(struct adapter *adapter, struct vpd_params *p, u8 *vpd) { int i, ret, addr; int ec, sn, pn, na; u8 csum; const struct t4_vpd_hdr *v; /* * Card information normally starts at VPD_BASE but early cards had * it at 0. */ ret = t4_seeprom_read(adapter, VPD_BASE, (u32 *)(vpd)); if (ret) return (ret); /* * The VPD shall have a unique identifier specified by the PCI SIG. * For chelsio adapters, the identifier is 0x82. The first byte of a VPD * shall be CHELSIO_VPD_UNIQUE_ID (0x82). The VPD programming software * is expected to automatically put this entry at the * beginning of the VPD. */ addr = *vpd == CHELSIO_VPD_UNIQUE_ID ? 
VPD_BASE : VPD_BASE_OLD; for (i = 0; i < VPD_LEN; i += 4) { ret = t4_seeprom_read(adapter, addr + i, (u32 *)(vpd + i)); if (ret) return ret; } v = (const struct t4_vpd_hdr *)vpd; #define FIND_VPD_KW(var,name) do { \ var = get_vpd_keyword_val(v , name); \ if (var < 0) { \ CH_ERR(adapter, "missing VPD keyword " name "\n"); \ return -EINVAL; \ } \ } while (0) FIND_VPD_KW(i, "RV"); for (csum = 0; i >= 0; i--) csum += vpd[i]; if (csum) { CH_ERR(adapter, "corrupted VPD EEPROM, actual csum %u\n", csum); return -EINVAL; } FIND_VPD_KW(ec, "EC"); FIND_VPD_KW(sn, "SN"); FIND_VPD_KW(pn, "PN"); FIND_VPD_KW(na, "NA"); #undef FIND_VPD_KW memcpy(p->id, v->id_data, ID_LEN); strstrip(p->id); memcpy(p->ec, vpd + ec, EC_LEN); strstrip(p->ec); i = vpd[sn - VPD_INFO_FLD_HDR_SIZE + 2]; memcpy(p->sn, vpd + sn, min(i, SERNUM_LEN)); strstrip(p->sn); i = vpd[pn - VPD_INFO_FLD_HDR_SIZE + 2]; memcpy(p->pn, vpd + pn, min(i, PN_LEN)); strstrip((char *)p->pn); i = vpd[na - VPD_INFO_FLD_HDR_SIZE + 2]; memcpy(p->na, vpd + na, min(i, MACADDR_LEN)); strstrip((char *)p->na); return 0; } /* serial flash and firmware constants and flash config file constants */ enum { SF_ATTEMPTS = 10, /* max retries for SF operations */ /* flash command opcodes */ SF_PROG_PAGE = 2, /* program 256B page */ SF_WR_DISABLE = 4, /* disable writes */ SF_RD_STATUS = 5, /* read status register */ SF_WR_ENABLE = 6, /* enable writes */ SF_RD_DATA_FAST = 0xb, /* read flash */ SF_RD_ID = 0x9f, /* read ID */ SF_ERASE_SECTOR = 0xd8, /* erase 64KB sector */ }; /** * sf1_read - read data from the serial flash * @adapter: the adapter * @byte_cnt: number of bytes to read * @cont: whether another operation will be chained * @lock: whether to lock SF for PL access only * @valp: where to store the read data * * Reads up to 4 bytes of data from the serial flash. The location of * the read needs to be specified prior to calling this by issuing the * appropriate commands to the serial flash. */ static int sf1_read(struct adapter *adapter, unsigned int byte_cnt, int cont, int lock, u32 *valp) { int ret; if (!byte_cnt || byte_cnt > 4) return -EINVAL; if (t4_read_reg(adapter, A_SF_OP) & F_BUSY) return -EBUSY; t4_write_reg(adapter, A_SF_OP, V_SF_LOCK(lock) | V_CONT(cont) | V_BYTECNT(byte_cnt - 1)); ret = t4_wait_op_done(adapter, A_SF_OP, F_BUSY, 0, SF_ATTEMPTS, 5); if (!ret) *valp = t4_read_reg(adapter, A_SF_DATA); return ret; } /** * sf1_write - write data to the serial flash * @adapter: the adapter * @byte_cnt: number of bytes to write * @cont: whether another operation will be chained * @lock: whether to lock SF for PL access only * @val: value to write * * Writes up to 4 bytes of data to the serial flash. The location of * the write needs to be specified prior to calling this by issuing the * appropriate commands to the serial flash. */ static int sf1_write(struct adapter *adapter, unsigned int byte_cnt, int cont, int lock, u32 val) { if (!byte_cnt || byte_cnt > 4) return -EINVAL; if (t4_read_reg(adapter, A_SF_OP) & F_BUSY) return -EBUSY; t4_write_reg(adapter, A_SF_DATA, val); t4_write_reg(adapter, A_SF_OP, V_SF_LOCK(lock) | V_CONT(cont) | V_BYTECNT(byte_cnt - 1) | V_OP(1)); return t4_wait_op_done(adapter, A_SF_OP, F_BUSY, 0, SF_ATTEMPTS, 5); } /** * flash_wait_op - wait for a flash operation to complete * @adapter: the adapter * @attempts: max number of polls of the status register * @delay: delay between polls in ms * * Wait for a flash operation to complete by polling the status register. 
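* The callers in this file use a short budget for page programs (8 polls of 1 ms in t4_write_flash) and a much longer one for sector erases (14 polls of 500 ms in t4_flash_erase_sectors).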
*/ static int flash_wait_op(struct adapter *adapter, int attempts, int delay) { int ret; u32 status; while (1) { if ((ret = sf1_write(adapter, 1, 1, 1, SF_RD_STATUS)) != 0 || (ret = sf1_read(adapter, 1, 0, 1, &status)) != 0) return ret; if (!(status & 1)) return 0; if (--attempts == 0) return -EAGAIN; if (delay) msleep(delay); } } /** * t4_read_flash - read words from serial flash * @adapter: the adapter * @addr: the start address for the read * @nwords: how many 32-bit words to read * @data: where to store the read data * @byte_oriented: whether to store data as bytes or as words * * Read the specified number of 32-bit words from the serial flash. * If @byte_oriented is set the read data is stored as a byte array * (i.e., big-endian), otherwise as 32-bit words in the platform's * natural endianness. */ int t4_read_flash(struct adapter *adapter, unsigned int addr, unsigned int nwords, u32 *data, int byte_oriented) { int ret; if (addr + nwords * sizeof(u32) > adapter->params.sf_size || (addr & 3)) return -EINVAL; addr = swab32(addr) | SF_RD_DATA_FAST; if ((ret = sf1_write(adapter, 4, 1, 0, addr)) != 0 || (ret = sf1_read(adapter, 1, 1, 0, data)) != 0) return ret; for ( ; nwords; nwords--, data++) { ret = sf1_read(adapter, 4, nwords > 1, nwords == 1, data); if (nwords == 1) t4_write_reg(adapter, A_SF_OP, 0); /* unlock SF */ if (ret) return ret; if (byte_oriented) *data = (__force __u32)(cpu_to_be32(*data)); } return 0; } /** * t4_write_flash - write up to a page of data to the serial flash * @adapter: the adapter * @addr: the start address to write * @n: length of data to write in bytes * @data: the data to write * @byte_oriented: whether to store data as bytes or as words * * Writes up to a page of data (256 bytes) to the serial flash starting * at the given address. All the data must be written to the same page. * If @byte_oriented is set the write data is stored as byte stream * (i.e. matches what on disk), otherwise in big-endian. */ int t4_write_flash(struct adapter *adapter, unsigned int addr, unsigned int n, const u8 *data, int byte_oriented) { int ret; u32 buf[SF_PAGE_SIZE / 4]; unsigned int i, c, left, val, offset = addr & 0xff; if (addr >= adapter->params.sf_size || offset + n > SF_PAGE_SIZE) return -EINVAL; val = swab32(addr) | SF_PROG_PAGE; if ((ret = sf1_write(adapter, 1, 0, 1, SF_WR_ENABLE)) != 0 || (ret = sf1_write(adapter, 4, 1, 1, val)) != 0) goto unlock; for (left = n; left; left -= c) { c = min(left, 4U); for (val = 0, i = 0; i < c; ++i) val = (val << 8) + *data++; if (!byte_oriented) val = cpu_to_be32(val); ret = sf1_write(adapter, c, c != left, 1, val); if (ret) goto unlock; } ret = flash_wait_op(adapter, 8, 1); if (ret) goto unlock; t4_write_reg(adapter, A_SF_OP, 0); /* unlock SF */ /* Read the page to verify the write succeeded */ ret = t4_read_flash(adapter, addr & ~0xff, ARRAY_SIZE(buf), buf, byte_oriented); if (ret) return ret; if (memcmp(data - n, (u8 *)buf + offset, n)) { CH_ERR(adapter, "failed to correctly write the flash page at %#x\n", addr); return -EIO; } return 0; unlock: t4_write_reg(adapter, A_SF_OP, 0); /* unlock SF */ return ret; } /** * t4_get_fw_version - read the firmware version * @adapter: the adapter * @vers: where to place the version * * Reads the FW version from flash. 
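* The version word comes from the fw_ver field of the fw_hdr at FLASH_FW_START and uses the same major/minor/micro/build packing that t4_get_exprom_version below assembles with the V_FW_HDR_FW_VER_* macros.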
*/ int t4_get_fw_version(struct adapter *adapter, u32 *vers) { return t4_read_flash(adapter, FLASH_FW_START + offsetof(struct fw_hdr, fw_ver), 1, vers, 0); } /** * t4_get_bs_version - read the firmware bootstrap version * @adapter: the adapter * @vers: where to place the version * * Reads the FW Bootstrap version from flash. */ int t4_get_bs_version(struct adapter *adapter, u32 *vers) { return t4_read_flash(adapter, FLASH_FWBOOTSTRAP_START + offsetof(struct fw_hdr, fw_ver), 1, vers, 0); } /** * t4_get_tp_version - read the TP microcode version * @adapter: the adapter * @vers: where to place the version * * Reads the TP microcode version from flash. */ int t4_get_tp_version(struct adapter *adapter, u32 *vers) { return t4_read_flash(adapter, FLASH_FW_START + offsetof(struct fw_hdr, tp_microcode_ver), 1, vers, 0); } /** * t4_get_exprom_version - return the Expansion ROM version (if any) * @adapter: the adapter * @vers: where to place the version * * Reads the Expansion ROM header from FLASH and returns the version * number (if present) through the @vers return value pointer. We return * this in the Firmware Version Format since it's convenient. Return * 0 on success, -ENOENT if no Expansion ROM is present. */ int t4_get_exprom_version(struct adapter *adap, u32 *vers) { struct exprom_header { unsigned char hdr_arr[16]; /* must start with 0x55aa */ unsigned char hdr_ver[4]; /* Expansion ROM version */ } *hdr; u32 exprom_header_buf[DIV_ROUND_UP(sizeof(struct exprom_header), sizeof(u32))]; int ret; ret = t4_read_flash(adap, FLASH_EXP_ROM_START, ARRAY_SIZE(exprom_header_buf), exprom_header_buf, 0); if (ret) return ret; hdr = (struct exprom_header *)exprom_header_buf; if (hdr->hdr_arr[0] != 0x55 || hdr->hdr_arr[1] != 0xaa) return -ENOENT; *vers = (V_FW_HDR_FW_VER_MAJOR(hdr->hdr_ver[0]) | V_FW_HDR_FW_VER_MINOR(hdr->hdr_ver[1]) | V_FW_HDR_FW_VER_MICRO(hdr->hdr_ver[2]) | V_FW_HDR_FW_VER_BUILD(hdr->hdr_ver[3])); return 0; } /** * t4_get_scfg_version - return the Serial Configuration version * @adapter: the adapter * @vers: where to place the version * * Reads the Serial Configuration Version via the Firmware interface * (thus this can only be called once we're ready to issue Firmware * commands). The format of the Serial Configuration version is * adapter specific. Returns 0 on success, an error on failure. * * Note that early versions of the Firmware didn't include the ability * to retrieve the Serial Configuration version, so we zero-out the * return-value parameter in that case to avoid leaving it with * garbage in it. * * Also note that the Firmware will return its cached copy of the Serial * Initialization Revision ID, not the actual Revision ID as written in * the Serial EEPROM. This is only an issue if a new VPD has been written * and the Firmware/Chip haven't yet gone through a RESET sequence. So * it's best to defer calling this routine till after a FW_RESET_CMD has * been issued if the Host Driver will be performing a full adapter * initialization. */ int t4_get_scfg_version(struct adapter *adapter, u32 *vers) { u32 scfgrev_param; int ret; scfgrev_param = (V_FW_PARAMS_MNEM(FW_PARAMS_MNEM_DEV) | V_FW_PARAMS_PARAM_X(FW_PARAMS_PARAM_DEV_SCFGREV)); ret = t4_query_params(adapter, adapter->mbox, adapter->pf, 0, 1, &scfgrev_param, vers); if (ret) *vers = 0; return ret; } /** * t4_get_vpd_version - return the VPD version * @adapter: the adapter * @vers: where to place the version * * Reads the VPD via the Firmware interface (thus this can only be called * once we're ready to issue Firmware commands). 
The format of the * VPD version is adapter specific. Returns 0 on success, an error on * failure. * * Note that early versions of the Firmware didn't include the ability * to retrieve the VPD version, so we zero-out the return-value parameter * in that case to avoid leaving it with garbage in it. * * Also note that the Firmware will return its cached copy of the VPD * Revision ID, not the actual Revision ID as written in the Serial * EEPROM. This is only an issue if a new VPD has been written and the * Firmware/Chip haven't yet gone through a RESET sequence. So it's best * to defer calling this routine till after a FW_RESET_CMD has been issued * if the Host Driver will be performing a full adapter initialization. */ int t4_get_vpd_version(struct adapter *adapter, u32 *vers) { u32 vpdrev_param; int ret; vpdrev_param = (V_FW_PARAMS_MNEM(FW_PARAMS_MNEM_DEV) | V_FW_PARAMS_PARAM_X(FW_PARAMS_PARAM_DEV_VPDREV)); ret = t4_query_params(adapter, adapter->mbox, adapter->pf, 0, 1, &vpdrev_param, vers); if (ret) *vers = 0; return ret; } /** * t4_get_version_info - extract various chip/firmware version information * @adapter: the adapter * * Reads various chip/firmware version numbers and stores them into the * adapter Adapter Parameters structure. If any of the efforts fails * the first failure will be returned, but all of the version numbers * will be read. */ int t4_get_version_info(struct adapter *adapter) { int ret = 0; #define FIRST_RET(__getvinfo) \ do { \ int __ret = __getvinfo; \ if (__ret && !ret) \ ret = __ret; \ } while (0) FIRST_RET(t4_get_fw_version(adapter, &adapter->params.fw_vers)); FIRST_RET(t4_get_bs_version(adapter, &adapter->params.bs_vers)); FIRST_RET(t4_get_tp_version(adapter, &adapter->params.tp_vers)); FIRST_RET(t4_get_exprom_version(adapter, &adapter->params.er_vers)); FIRST_RET(t4_get_scfg_version(adapter, &adapter->params.scfg_vers)); FIRST_RET(t4_get_vpd_version(adapter, &adapter->params.vpd_vers)); #undef FIRST_RET return ret; } /** * t4_flash_erase_sectors - erase a range of flash sectors * @adapter: the adapter * @start: the first sector to erase * @end: the last sector to erase * * Erases the sectors in the given inclusive range. */ int t4_flash_erase_sectors(struct adapter *adapter, int start, int end) { int ret = 0; if (end >= adapter->params.sf_nsec) return -EINVAL; while (start <= end) { if ((ret = sf1_write(adapter, 1, 0, 1, SF_WR_ENABLE)) != 0 || (ret = sf1_write(adapter, 4, 0, 1, SF_ERASE_SECTOR | (start << 8))) != 0 || (ret = flash_wait_op(adapter, 14, 500)) != 0) { CH_ERR(adapter, "erase of flash sector %d failed, error %d\n", start, ret); break; } start++; } t4_write_reg(adapter, A_SF_OP, 0); /* unlock SF */ return ret; } /** * t4_flash_cfg_addr - return the address of the flash configuration file * @adapter: the adapter * * Return the address within the flash where the Firmware Configuration * File is stored, or an error if the device FLASH is too small to contain * a Firmware Configuration File. */ int t4_flash_cfg_addr(struct adapter *adapter) { /* * If the device FLASH isn't large enough to hold a Firmware * Configuration File, return an error. */ if (adapter->params.sf_size < FLASH_CFG_START + FLASH_CFG_MAX_SIZE) return -ENOSPC; return FLASH_CFG_START; } /* * Return TRUE if the specified firmware matches the adapter. I.e. T4 * firmware for T4 adapters, T5 firmware for T5 adapters, etc. We go ahead * and emit an error message for mismatched firmware to save our caller the * effort ... 
*/ static int t4_fw_matches_chip(struct adapter *adap, const struct fw_hdr *hdr) { /* * The expression below will return FALSE for any unsupported adapter * which will keep us "honest" in the future ... */ if ((is_t4(adap) && hdr->chip == FW_HDR_CHIP_T4) || (is_t5(adap) && hdr->chip == FW_HDR_CHIP_T5) || (is_t6(adap) && hdr->chip == FW_HDR_CHIP_T6)) return 1; CH_ERR(adap, "FW image (%d) is not suitable for this adapter (%d)\n", hdr->chip, chip_id(adap)); return 0; } /** * t4_load_fw - download firmware * @adap: the adapter * @fw_data: the firmware image to write * @size: image size * * Write the supplied firmware image to the card's serial flash. */ int t4_load_fw(struct adapter *adap, const u8 *fw_data, unsigned int size) { u32 csum; int ret, addr; unsigned int i; u8 first_page[SF_PAGE_SIZE]; const u32 *p = (const u32 *)fw_data; const struct fw_hdr *hdr = (const struct fw_hdr *)fw_data; unsigned int sf_sec_size = adap->params.sf_size / adap->params.sf_nsec; unsigned int fw_start_sec; unsigned int fw_start; unsigned int fw_size; if (ntohl(hdr->magic) == FW_HDR_MAGIC_BOOTSTRAP) { fw_start_sec = FLASH_FWBOOTSTRAP_START_SEC; fw_start = FLASH_FWBOOTSTRAP_START; fw_size = FLASH_FWBOOTSTRAP_MAX_SIZE; } else { fw_start_sec = FLASH_FW_START_SEC; fw_start = FLASH_FW_START; fw_size = FLASH_FW_MAX_SIZE; } if (!size) { CH_ERR(adap, "FW image has no data\n"); return -EINVAL; } if (size & 511) { CH_ERR(adap, "FW image size not multiple of 512 bytes\n"); return -EINVAL; } if ((unsigned int) be16_to_cpu(hdr->len512) * 512 != size) { CH_ERR(adap, "FW image size differs from size in FW header\n"); return -EINVAL; } if (size > fw_size) { CH_ERR(adap, "FW image too large, max is %u bytes\n", fw_size); return -EFBIG; } if (!t4_fw_matches_chip(adap, hdr)) return -EINVAL; for (csum = 0, i = 0; i < size / sizeof(csum); i++) csum += be32_to_cpu(p[i]); if (csum != 0xffffffff) { CH_ERR(adap, "corrupted firmware image, checksum %#x\n", csum); return -EINVAL; } i = DIV_ROUND_UP(size, sf_sec_size); /* # of sectors spanned */ ret = t4_flash_erase_sectors(adap, fw_start_sec, fw_start_sec + i - 1); if (ret) goto out; /* * We write the correct version at the end so the driver can see a bad * version if the FW write fails. Start by writing a copy of the * first page with a bad version. 
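* If the download is interrupted, the 0xffffffff left in fw_ver keeps the partially written image from looking valid; the real version is written back in the final step below.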
*/ memcpy(first_page, fw_data, SF_PAGE_SIZE); ((struct fw_hdr *)first_page)->fw_ver = cpu_to_be32(0xffffffff); ret = t4_write_flash(adap, fw_start, SF_PAGE_SIZE, first_page, 1); if (ret) goto out; addr = fw_start; for (size -= SF_PAGE_SIZE; size; size -= SF_PAGE_SIZE) { addr += SF_PAGE_SIZE; fw_data += SF_PAGE_SIZE; ret = t4_write_flash(adap, addr, SF_PAGE_SIZE, fw_data, 1); if (ret) goto out; } ret = t4_write_flash(adap, fw_start + offsetof(struct fw_hdr, fw_ver), sizeof(hdr->fw_ver), (const u8 *)&hdr->fw_ver, 1); out: if (ret) CH_ERR(adap, "firmware download failed, error %d\n", ret); return ret; } /** * t4_fwcache - firmware cache operation * @adap: the adapter * @op : the operation (flush or flush and invalidate) */ int t4_fwcache(struct adapter *adap, enum fw_params_param_dev_fwcache op) { struct fw_params_cmd c; memset(&c, 0, sizeof(c)); c.op_to_vfn = cpu_to_be32(V_FW_CMD_OP(FW_PARAMS_CMD) | F_FW_CMD_REQUEST | F_FW_CMD_WRITE | V_FW_PARAMS_CMD_PFN(adap->pf) | V_FW_PARAMS_CMD_VFN(0)); c.retval_len16 = cpu_to_be32(FW_LEN16(c)); c.param[0].mnem = cpu_to_be32(V_FW_PARAMS_MNEM(FW_PARAMS_MNEM_DEV) | V_FW_PARAMS_PARAM_X(FW_PARAMS_PARAM_DEV_FWCACHE)); c.param[0].val = (__force __be32)op; return t4_wr_mbox(adap, adap->mbox, &c, sizeof(c), NULL); } void t4_cim_read_pif_la(struct adapter *adap, u32 *pif_req, u32 *pif_rsp, unsigned int *pif_req_wrptr, unsigned int *pif_rsp_wrptr) { int i, j; u32 cfg, val, req, rsp; cfg = t4_read_reg(adap, A_CIM_DEBUGCFG); if (cfg & F_LADBGEN) t4_write_reg(adap, A_CIM_DEBUGCFG, cfg ^ F_LADBGEN); val = t4_read_reg(adap, A_CIM_DEBUGSTS); req = G_POLADBGWRPTR(val); rsp = G_PILADBGWRPTR(val); if (pif_req_wrptr) *pif_req_wrptr = req; if (pif_rsp_wrptr) *pif_rsp_wrptr = rsp; for (i = 0; i < CIM_PIFLA_SIZE; i++) { for (j = 0; j < 6; j++) { t4_write_reg(adap, A_CIM_DEBUGCFG, V_POLADBGRDPTR(req) | V_PILADBGRDPTR(rsp)); *pif_req++ = t4_read_reg(adap, A_CIM_PO_LA_DEBUGDATA); *pif_rsp++ = t4_read_reg(adap, A_CIM_PI_LA_DEBUGDATA); req++; rsp++; } req = (req + 2) & M_POLADBGRDPTR; rsp = (rsp + 2) & M_PILADBGRDPTR; } t4_write_reg(adap, A_CIM_DEBUGCFG, cfg); } void t4_cim_read_ma_la(struct adapter *adap, u32 *ma_req, u32 *ma_rsp) { u32 cfg; int i, j, idx; cfg = t4_read_reg(adap, A_CIM_DEBUGCFG); if (cfg & F_LADBGEN) t4_write_reg(adap, A_CIM_DEBUGCFG, cfg ^ F_LADBGEN); for (i = 0; i < CIM_MALA_SIZE; i++) { for (j = 0; j < 5; j++) { idx = 8 * i + j; t4_write_reg(adap, A_CIM_DEBUGCFG, V_POLADBGRDPTR(idx) | V_PILADBGRDPTR(idx)); *ma_req++ = t4_read_reg(adap, A_CIM_PO_LA_MADEBUGDATA); *ma_rsp++ = t4_read_reg(adap, A_CIM_PI_LA_MADEBUGDATA); } } t4_write_reg(adap, A_CIM_DEBUGCFG, cfg); } void t4_ulprx_read_la(struct adapter *adap, u32 *la_buf) { unsigned int i, j; for (i = 0; i < 8; i++) { u32 *p = la_buf + i; t4_write_reg(adap, A_ULP_RX_LA_CTL, i); j = t4_read_reg(adap, A_ULP_RX_LA_WRPTR); t4_write_reg(adap, A_ULP_RX_LA_RDPTR, j); for (j = 0; j < ULPRX_LA_SIZE; j++, p += 8) *p = t4_read_reg(adap, A_ULP_RX_LA_RDDATA); } } /** * t4_link_l1cfg - apply link configuration to MAC/PHY * @phy: the PHY to setup * @mac: the MAC to setup * @lc: the requested link configuration * * Set up a port's MAC and PHY according to a desired link configuration. * - If the PHY can auto-negotiate first decide what to advertise, then * enable/disable auto-negotiation as desired, and reset. * - If the PHY does not auto-negotiate just reset it. * - If auto-negotiation is off set the MAC to the proper speed/duplex/FC, * otherwise do it later based on the outcome of auto-negotiation. 
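* With auto-negotiation disabled the requested speed is mapped to a single FW_PORT_CAP_SPEED_* bit (100/40/25/10/1 Gb/s); otherwise all speeds in lc->supported are advertised together with FW_PORT_CAP_ANEG.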
*/ int t4_link_l1cfg(struct adapter *adap, unsigned int mbox, unsigned int port, struct link_config *lc) { struct fw_port_cmd c; unsigned int mdi = V_FW_PORT_CAP_MDI(FW_PORT_CAP_MDI_AUTO); unsigned int aneg, fc, fec, speed; fc = 0; if (lc->requested_fc & PAUSE_RX) fc |= FW_PORT_CAP_FC_RX; if (lc->requested_fc & PAUSE_TX) fc |= FW_PORT_CAP_FC_TX; fec = 0; if (lc->requested_fec & FEC_RS) fec = FW_PORT_CAP_FEC_RS; else if (lc->requested_fec & FEC_BASER_RS) fec = FW_PORT_CAP_FEC_BASER_RS; else if (lc->requested_fec & FEC_RESERVED) fec = FW_PORT_CAP_FEC_RESERVED; if (!(lc->supported & FW_PORT_CAP_ANEG) || lc->requested_aneg == AUTONEG_DISABLE) { aneg = 0; switch (lc->requested_speed) { case 100: speed = FW_PORT_CAP_SPEED_100G; break; case 40: speed = FW_PORT_CAP_SPEED_40G; break; case 25: speed = FW_PORT_CAP_SPEED_25G; break; case 10: speed = FW_PORT_CAP_SPEED_10G; break; case 1: speed = FW_PORT_CAP_SPEED_1G; break; default: return -EINVAL; break; } } else { aneg = FW_PORT_CAP_ANEG; speed = lc->supported & V_FW_PORT_CAP_SPEED(M_FW_PORT_CAP_SPEED); } memset(&c, 0, sizeof(c)); c.op_to_portid = cpu_to_be32(V_FW_CMD_OP(FW_PORT_CMD) | F_FW_CMD_REQUEST | F_FW_CMD_EXEC | V_FW_PORT_CMD_PORTID(port)); c.action_to_len16 = cpu_to_be32(V_FW_PORT_CMD_ACTION(FW_PORT_ACTION_L1_CFG) | FW_LEN16(c)); c.u.l1cfg.rcap = cpu_to_be32(aneg | speed | fc | fec | mdi); return t4_wr_mbox_ns(adap, mbox, &c, sizeof(c), NULL); } /** * t4_restart_aneg - restart autonegotiation * @adap: the adapter * @mbox: mbox to use for the FW command * @port: the port id * * Restarts autonegotiation for the selected port. */ int t4_restart_aneg(struct adapter *adap, unsigned int mbox, unsigned int port) { struct fw_port_cmd c; memset(&c, 0, sizeof(c)); c.op_to_portid = cpu_to_be32(V_FW_CMD_OP(FW_PORT_CMD) | F_FW_CMD_REQUEST | F_FW_CMD_EXEC | V_FW_PORT_CMD_PORTID(port)); c.action_to_len16 = cpu_to_be32(V_FW_PORT_CMD_ACTION(FW_PORT_ACTION_L1_CFG) | FW_LEN16(c)); c.u.l1cfg.rcap = cpu_to_be32(FW_PORT_CAP_ANEG); return t4_wr_mbox(adap, mbox, &c, sizeof(c), NULL); } typedef void (*int_handler_t)(struct adapter *adap); struct intr_info { unsigned int mask; /* bits to check in interrupt status */ const char *msg; /* message to print or NULL */ short stat_idx; /* stat counter to increment or -1 */ unsigned short fatal; /* whether the condition reported is fatal */ int_handler_t int_handler; /* platform-specific int handler */ }; /** * t4_handle_intr_status - table driven interrupt handler * @adapter: the adapter that generated the interrupt * @reg: the interrupt status register to process * @acts: table of interrupt actions * * A table driven interrupt handler that applies a set of masks to an * interrupt status word and performs the corresponding actions if the * interrupts described by the mask have occurred. The actions include * optionally emitting a warning or alert message. The table is terminated * by an entry specifying mask 0. Returns the number of fatal interrupt * conditions. 
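* Fatal entries are counted and reported via CH_ALERT(), non-fatal entries with a message produce a rate-limited warning, and an entry's int_handler (if any) is invoked whenever its mask matches; for example { F_PARITYERR, "LE parity error", -1, 1 } in le_intr_info below is treated as fatal.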
*/ static int t4_handle_intr_status(struct adapter *adapter, unsigned int reg, const struct intr_info *acts) { int fatal = 0; unsigned int mask = 0; unsigned int status = t4_read_reg(adapter, reg); for ( ; acts->mask; ++acts) { if (!(status & acts->mask)) continue; if (acts->fatal) { fatal++; CH_ALERT(adapter, "%s (0x%x)\n", acts->msg, status & acts->mask); } else if (acts->msg) CH_WARN_RATELIMIT(adapter, "%s (0x%x)\n", acts->msg, status & acts->mask); if (acts->int_handler) acts->int_handler(adapter); mask |= acts->mask; } status &= mask; if (status) /* clear processed interrupts */ t4_write_reg(adapter, reg, status); return fatal; } /* * Interrupt handler for the PCIE module. */ static void pcie_intr_handler(struct adapter *adapter) { static const struct intr_info sysbus_intr_info[] = { { F_RNPP, "RXNP array parity error", -1, 1 }, { F_RPCP, "RXPC array parity error", -1, 1 }, { F_RCIP, "RXCIF array parity error", -1, 1 }, { F_RCCP, "Rx completions control array parity error", -1, 1 }, { F_RFTP, "RXFT array parity error", -1, 1 }, { 0 } }; static const struct intr_info pcie_port_intr_info[] = { { F_TPCP, "TXPC array parity error", -1, 1 }, { F_TNPP, "TXNP array parity error", -1, 1 }, { F_TFTP, "TXFT array parity error", -1, 1 }, { F_TCAP, "TXCA array parity error", -1, 1 }, { F_TCIP, "TXCIF array parity error", -1, 1 }, { F_RCAP, "RXCA array parity error", -1, 1 }, { F_OTDD, "outbound request TLP discarded", -1, 1 }, { F_RDPE, "Rx data parity error", -1, 1 }, { F_TDUE, "Tx uncorrectable data error", -1, 1 }, { 0 } }; static const struct intr_info pcie_intr_info[] = { { F_MSIADDRLPERR, "MSI AddrL parity error", -1, 1 }, { F_MSIADDRHPERR, "MSI AddrH parity error", -1, 1 }, { F_MSIDATAPERR, "MSI data parity error", -1, 1 }, { F_MSIXADDRLPERR, "MSI-X AddrL parity error", -1, 1 }, { F_MSIXADDRHPERR, "MSI-X AddrH parity error", -1, 1 }, { F_MSIXDATAPERR, "MSI-X data parity error", -1, 1 }, { F_MSIXDIPERR, "MSI-X DI parity error", -1, 1 }, { F_PIOCPLPERR, "PCI PIO completion FIFO parity error", -1, 1 }, { F_PIOREQPERR, "PCI PIO request FIFO parity error", -1, 1 }, { F_TARTAGPERR, "PCI PCI target tag FIFO parity error", -1, 1 }, { F_CCNTPERR, "PCI CMD channel count parity error", -1, 1 }, { F_CREQPERR, "PCI CMD channel request parity error", -1, 1 }, { F_CRSPPERR, "PCI CMD channel response parity error", -1, 1 }, { F_DCNTPERR, "PCI DMA channel count parity error", -1, 1 }, { F_DREQPERR, "PCI DMA channel request parity error", -1, 1 }, { F_DRSPPERR, "PCI DMA channel response parity error", -1, 1 }, { F_HCNTPERR, "PCI HMA channel count parity error", -1, 1 }, { F_HREQPERR, "PCI HMA channel request parity error", -1, 1 }, { F_HRSPPERR, "PCI HMA channel response parity error", -1, 1 }, { F_CFGSNPPERR, "PCI config snoop FIFO parity error", -1, 1 }, { F_FIDPERR, "PCI FID parity error", -1, 1 }, { F_INTXCLRPERR, "PCI INTx clear parity error", -1, 1 }, { F_MATAGPERR, "PCI MA tag parity error", -1, 1 }, { F_PIOTAGPERR, "PCI PIO tag parity error", -1, 1 }, { F_RXCPLPERR, "PCI Rx completion parity error", -1, 1 }, { F_RXWRPERR, "PCI Rx write parity error", -1, 1 }, { F_RPLPERR, "PCI replay buffer parity error", -1, 1 }, { F_PCIESINT, "PCI core secondary fault", -1, 1 }, { F_PCIEPINT, "PCI core primary fault", -1, 1 }, { F_UNXSPLCPLERR, "PCI unexpected split completion error", -1, 0 }, { 0 } }; static const struct intr_info t5_pcie_intr_info[] = { { F_MSTGRPPERR, "Master Response Read Queue parity error", -1, 1 }, { F_MSTTIMEOUTPERR, "Master Timeout FIFO parity error", -1, 1 }, { F_MSIXSTIPERR, "MSI-X STI 
SRAM parity error", -1, 1 }, { F_MSIXADDRLPERR, "MSI-X AddrL parity error", -1, 1 }, { F_MSIXADDRHPERR, "MSI-X AddrH parity error", -1, 1 }, { F_MSIXDATAPERR, "MSI-X data parity error", -1, 1 }, { F_MSIXDIPERR, "MSI-X DI parity error", -1, 1 }, { F_PIOCPLGRPPERR, "PCI PIO completion Group FIFO parity error", -1, 1 }, { F_PIOREQGRPPERR, "PCI PIO request Group FIFO parity error", -1, 1 }, { F_TARTAGPERR, "PCI PCI target tag FIFO parity error", -1, 1 }, { F_MSTTAGQPERR, "PCI master tag queue parity error", -1, 1 }, { F_CREQPERR, "PCI CMD channel request parity error", -1, 1 }, { F_CRSPPERR, "PCI CMD channel response parity error", -1, 1 }, { F_DREQWRPERR, "PCI DMA channel write request parity error", -1, 1 }, { F_DREQPERR, "PCI DMA channel request parity error", -1, 1 }, { F_DRSPPERR, "PCI DMA channel response parity error", -1, 1 }, { F_HREQWRPERR, "PCI HMA channel count parity error", -1, 1 }, { F_HREQPERR, "PCI HMA channel request parity error", -1, 1 }, { F_HRSPPERR, "PCI HMA channel response parity error", -1, 1 }, { F_CFGSNPPERR, "PCI config snoop FIFO parity error", -1, 1 }, { F_FIDPERR, "PCI FID parity error", -1, 1 }, { F_VFIDPERR, "PCI INTx clear parity error", -1, 1 }, { F_MAGRPPERR, "PCI MA group FIFO parity error", -1, 1 }, { F_PIOTAGPERR, "PCI PIO tag parity error", -1, 1 }, { F_IPRXHDRGRPPERR, "PCI IP Rx header group parity error", -1, 1 }, { F_IPRXDATAGRPPERR, "PCI IP Rx data group parity error", -1, 1 }, { F_RPLPERR, "PCI IP replay buffer parity error", -1, 1 }, { F_IPSOTPERR, "PCI IP SOT buffer parity error", -1, 1 }, { F_TRGT1GRPPERR, "PCI TRGT1 group FIFOs parity error", -1, 1 }, { F_READRSPERR, "Outbound read error", -1, 0 }, { 0 } }; int fat; if (is_t4(adapter)) fat = t4_handle_intr_status(adapter, A_PCIE_CORE_UTL_SYSTEM_BUS_AGENT_STATUS, sysbus_intr_info) + t4_handle_intr_status(adapter, A_PCIE_CORE_UTL_PCI_EXPRESS_PORT_STATUS, pcie_port_intr_info) + t4_handle_intr_status(adapter, A_PCIE_INT_CAUSE, pcie_intr_info); else fat = t4_handle_intr_status(adapter, A_PCIE_INT_CAUSE, t5_pcie_intr_info); if (fat) t4_fatal_err(adapter); } /* * TP interrupt handler. */ static void tp_intr_handler(struct adapter *adapter) { static const struct intr_info tp_intr_info[] = { { 0x3fffffff, "TP parity error", -1, 1 }, { F_FLMTXFLSTEMPTY, "TP out of Tx pages", -1, 1 }, { 0 } }; if (t4_handle_intr_status(adapter, A_TP_INT_CAUSE, tp_intr_info)) t4_fatal_err(adapter); } /* * SGE interrupt handler. 
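* Parity errors latched in A_SGE_INT_CAUSE1/2 and any fatal cause in A_SGE_INT_CAUSE3 lead to t4_fatal_err(); the extra T4/T5-only and T6-only cause tables are selected by chip revision.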
*/ static void sge_intr_handler(struct adapter *adapter) { u64 v; u32 err; static const struct intr_info sge_intr_info[] = { { F_ERR_CPL_EXCEED_IQE_SIZE, "SGE received CPL exceeding IQE size", -1, 1 }, { F_ERR_INVALID_CIDX_INC, "SGE GTS CIDX increment too large", -1, 0 }, { F_ERR_CPL_OPCODE_0, "SGE received 0-length CPL", -1, 0 }, { F_DBFIFO_LP_INT, NULL, -1, 0, t4_db_full }, { F_ERR_DATA_CPL_ON_HIGH_QID1 | F_ERR_DATA_CPL_ON_HIGH_QID0, "SGE IQID > 1023 received CPL for FL", -1, 0 }, { F_ERR_BAD_DB_PIDX3, "SGE DBP 3 pidx increment too large", -1, 0 }, { F_ERR_BAD_DB_PIDX2, "SGE DBP 2 pidx increment too large", -1, 0 }, { F_ERR_BAD_DB_PIDX1, "SGE DBP 1 pidx increment too large", -1, 0 }, { F_ERR_BAD_DB_PIDX0, "SGE DBP 0 pidx increment too large", -1, 0 }, { F_ERR_ING_CTXT_PRIO, "SGE too many priority ingress contexts", -1, 0 }, { F_INGRESS_SIZE_ERR, "SGE illegal ingress QID", -1, 0 }, { F_EGRESS_SIZE_ERR, "SGE illegal egress QID", -1, 0 }, { F_ERR_PCIE_ERROR0 | F_ERR_PCIE_ERROR1 | F_ERR_PCIE_ERROR2 | F_ERR_PCIE_ERROR3, "SGE PCIe error for a DBP thread", -1, 0 }, { 0 } }; static const struct intr_info t4t5_sge_intr_info[] = { { F_ERR_DROPPED_DB, NULL, -1, 0, t4_db_dropped }, { F_DBFIFO_HP_INT, NULL, -1, 0, t4_db_full }, { F_ERR_EGR_CTXT_PRIO, "SGE too many priority egress contexts", -1, 0 }, { 0 } }; /* * For now, treat below interrupts as fatal so that we disable SGE and * get better debug */ static const struct intr_info t6_sge_intr_info[] = { { F_FATAL_WRE_LEN, "SGE Actual WRE packet is less than advertized length", -1, 1 }, { 0 } }; v = (u64)t4_read_reg(adapter, A_SGE_INT_CAUSE1) | ((u64)t4_read_reg(adapter, A_SGE_INT_CAUSE2) << 32); if (v) { CH_ALERT(adapter, "SGE parity error (%#llx)\n", (unsigned long long)v); t4_write_reg(adapter, A_SGE_INT_CAUSE1, v); t4_write_reg(adapter, A_SGE_INT_CAUSE2, v >> 32); } v |= t4_handle_intr_status(adapter, A_SGE_INT_CAUSE3, sge_intr_info); if (chip_id(adapter) <= CHELSIO_T5) v |= t4_handle_intr_status(adapter, A_SGE_INT_CAUSE3, t4t5_sge_intr_info); else v |= t4_handle_intr_status(adapter, A_SGE_INT_CAUSE3, t6_sge_intr_info); err = t4_read_reg(adapter, A_SGE_ERROR_STATS); if (err & F_ERROR_QID_VALID) { CH_ERR(adapter, "SGE error for queue %u\n", G_ERROR_QID(err)); if (err & F_UNCAPTURED_ERROR) CH_ERR(adapter, "SGE UNCAPTURED_ERROR set (clearing)\n"); t4_write_reg(adapter, A_SGE_ERROR_STATS, F_ERROR_QID_VALID | F_UNCAPTURED_ERROR); } if (v != 0) t4_fatal_err(adapter); } #define CIM_OBQ_INTR (F_OBQULP0PARERR | F_OBQULP1PARERR | F_OBQULP2PARERR |\ F_OBQULP3PARERR | F_OBQSGEPARERR | F_OBQNCSIPARERR) #define CIM_IBQ_INTR (F_IBQTP0PARERR | F_IBQTP1PARERR | F_IBQULPPARERR |\ F_IBQSGEHIPARERR | F_IBQSGELOPARERR | F_IBQNCSIPARERR) /* * CIM interrupt handler. 
*/ static void cim_intr_handler(struct adapter *adapter) { static const struct intr_info cim_intr_info[] = { { F_PREFDROPINT, "CIM control register prefetch drop", -1, 1 }, { CIM_OBQ_INTR, "CIM OBQ parity error", -1, 1 }, { CIM_IBQ_INTR, "CIM IBQ parity error", -1, 1 }, { F_MBUPPARERR, "CIM mailbox uP parity error", -1, 1 }, { F_MBHOSTPARERR, "CIM mailbox host parity error", -1, 1 }, { F_TIEQINPARERRINT, "CIM TIEQ outgoing parity error", -1, 1 }, { F_TIEQOUTPARERRINT, "CIM TIEQ incoming parity error", -1, 1 }, { F_TIMER0INT, "CIM TIMER0 interrupt", -1, 1 }, { 0 } }; static const struct intr_info cim_upintr_info[] = { { F_RSVDSPACEINT, "CIM reserved space access", -1, 1 }, { F_ILLTRANSINT, "CIM illegal transaction", -1, 1 }, { F_ILLWRINT, "CIM illegal write", -1, 1 }, { F_ILLRDINT, "CIM illegal read", -1, 1 }, { F_ILLRDBEINT, "CIM illegal read BE", -1, 1 }, { F_ILLWRBEINT, "CIM illegal write BE", -1, 1 }, { F_SGLRDBOOTINT, "CIM single read from boot space", -1, 1 }, { F_SGLWRBOOTINT, "CIM single write to boot space", -1, 1 }, { F_BLKWRBOOTINT, "CIM block write to boot space", -1, 1 }, { F_SGLRDFLASHINT, "CIM single read from flash space", -1, 1 }, { F_SGLWRFLASHINT, "CIM single write to flash space", -1, 1 }, { F_BLKWRFLASHINT, "CIM block write to flash space", -1, 1 }, { F_SGLRDEEPROMINT, "CIM single EEPROM read", -1, 1 }, { F_SGLWREEPROMINT, "CIM single EEPROM write", -1, 1 }, { F_BLKRDEEPROMINT, "CIM block EEPROM read", -1, 1 }, { F_BLKWREEPROMINT, "CIM block EEPROM write", -1, 1 }, { F_SGLRDCTLINT , "CIM single read from CTL space", -1, 1 }, { F_SGLWRCTLINT , "CIM single write to CTL space", -1, 1 }, { F_BLKRDCTLINT , "CIM block read from CTL space", -1, 1 }, { F_BLKWRCTLINT , "CIM block write to CTL space", -1, 1 }, { F_SGLRDPLINT , "CIM single read from PL space", -1, 1 }, { F_SGLWRPLINT , "CIM single write to PL space", -1, 1 }, { F_BLKRDPLINT , "CIM block read from PL space", -1, 1 }, { F_BLKWRPLINT , "CIM block write to PL space", -1, 1 }, { F_REQOVRLOOKUPINT , "CIM request FIFO overwrite", -1, 1 }, { F_RSPOVRLOOKUPINT , "CIM response FIFO overwrite", -1, 1 }, { F_TIMEOUTINT , "CIM PIF timeout", -1, 1 }, { F_TIMEOUTMAINT , "CIM PIF MA timeout", -1, 1 }, { 0 } }; u32 val, fw_err; int fat; fw_err = t4_read_reg(adapter, A_PCIE_FW); if (fw_err & F_PCIE_FW_ERR) t4_report_fw_error(adapter); /* When the Firmware detects an internal error which normally wouldn't * raise a Host Interrupt, it forces a CIM Timer0 interrupt in order * to make sure the Host sees the Firmware Crash. So if we have a * Timer0 interrupt and don't see a Firmware Crash, ignore the Timer0 * interrupt. */ val = t4_read_reg(adapter, A_CIM_HOST_INT_CAUSE); if (val & F_TIMER0INT) if (!(fw_err & F_PCIE_FW_ERR) || (G_PCIE_FW_EVAL(fw_err) != PCIE_FW_EVAL_CRASH)) t4_write_reg(adapter, A_CIM_HOST_INT_CAUSE, F_TIMER0INT); fat = t4_handle_intr_status(adapter, A_CIM_HOST_INT_CAUSE, cim_intr_info) + t4_handle_intr_status(adapter, A_CIM_HOST_UPACC_INT_CAUSE, cim_upintr_info); if (fat) t4_fatal_err(adapter); } /* * ULP RX interrupt handler. */ static void ulprx_intr_handler(struct adapter *adapter) { static const struct intr_info ulprx_intr_info[] = { { F_CAUSE_CTX_1, "ULPRX channel 1 context error", -1, 1 }, { F_CAUSE_CTX_0, "ULPRX channel 0 context error", -1, 1 }, { 0x7fffff, "ULPRX parity error", -1, 1 }, { 0 } }; if (t4_handle_intr_status(adapter, A_ULP_RX_INT_CAUSE, ulprx_intr_info)) t4_fatal_err(adapter); } /* * ULP TX interrupt handler. 
*/ static void ulptx_intr_handler(struct adapter *adapter) { static const struct intr_info ulptx_intr_info[] = { { F_PBL_BOUND_ERR_CH3, "ULPTX channel 3 PBL out of bounds", -1, 0 }, { F_PBL_BOUND_ERR_CH2, "ULPTX channel 2 PBL out of bounds", -1, 0 }, { F_PBL_BOUND_ERR_CH1, "ULPTX channel 1 PBL out of bounds", -1, 0 }, { F_PBL_BOUND_ERR_CH0, "ULPTX channel 0 PBL out of bounds", -1, 0 }, { 0xfffffff, "ULPTX parity error", -1, 1 }, { 0 } }; if (t4_handle_intr_status(adapter, A_ULP_TX_INT_CAUSE, ulptx_intr_info)) t4_fatal_err(adapter); } /* * PM TX interrupt handler. */ static void pmtx_intr_handler(struct adapter *adapter) { static const struct intr_info pmtx_intr_info[] = { { F_PCMD_LEN_OVFL0, "PMTX channel 0 pcmd too large", -1, 1 }, { F_PCMD_LEN_OVFL1, "PMTX channel 1 pcmd too large", -1, 1 }, { F_PCMD_LEN_OVFL2, "PMTX channel 2 pcmd too large", -1, 1 }, { F_ZERO_C_CMD_ERROR, "PMTX 0-length pcmd", -1, 1 }, { 0xffffff0, "PMTX framing error", -1, 1 }, { F_OESPI_PAR_ERROR, "PMTX oespi parity error", -1, 1 }, { F_DB_OPTIONS_PAR_ERROR, "PMTX db_options parity error", -1, 1 }, { F_ICSPI_PAR_ERROR, "PMTX icspi parity error", -1, 1 }, { F_C_PCMD_PAR_ERROR, "PMTX c_pcmd parity error", -1, 1}, { 0 } }; if (t4_handle_intr_status(adapter, A_PM_TX_INT_CAUSE, pmtx_intr_info)) t4_fatal_err(adapter); } /* * PM RX interrupt handler. */ static void pmrx_intr_handler(struct adapter *adapter) { static const struct intr_info pmrx_intr_info[] = { { F_ZERO_E_CMD_ERROR, "PMRX 0-length pcmd", -1, 1 }, { 0x3ffff0, "PMRX framing error", -1, 1 }, { F_OCSPI_PAR_ERROR, "PMRX ocspi parity error", -1, 1 }, { F_DB_OPTIONS_PAR_ERROR, "PMRX db_options parity error", -1, 1 }, { F_IESPI_PAR_ERROR, "PMRX iespi parity error", -1, 1 }, { F_E_PCMD_PAR_ERROR, "PMRX e_pcmd parity error", -1, 1}, { 0 } }; if (t4_handle_intr_status(adapter, A_PM_RX_INT_CAUSE, pmrx_intr_info)) t4_fatal_err(adapter); } /* * CPL switch interrupt handler. */ static void cplsw_intr_handler(struct adapter *adapter) { static const struct intr_info cplsw_intr_info[] = { { F_CIM_OP_MAP_PERR, "CPLSW CIM op_map parity error", -1, 1 }, { F_CIM_OVFL_ERROR, "CPLSW CIM overflow", -1, 1 }, { F_TP_FRAMING_ERROR, "CPLSW TP framing error", -1, 1 }, { F_SGE_FRAMING_ERROR, "CPLSW SGE framing error", -1, 1 }, { F_CIM_FRAMING_ERROR, "CPLSW CIM framing error", -1, 1 }, { F_ZERO_SWITCH_ERROR, "CPLSW no-switch error", -1, 1 }, { 0 } }; if (t4_handle_intr_status(adapter, A_CPL_INTR_CAUSE, cplsw_intr_info)) t4_fatal_err(adapter); } /* * LE interrupt handler. */ static void le_intr_handler(struct adapter *adap) { unsigned int chip_ver = chip_id(adap); static const struct intr_info le_intr_info[] = { { F_LIPMISS, "LE LIP miss", -1, 0 }, { F_LIP0, "LE 0 LIP error", -1, 0 }, { F_PARITYERR, "LE parity error", -1, 1 }, { F_UNKNOWNCMD, "LE unknown command", -1, 1 }, { F_REQQPARERR, "LE request queue parity error", -1, 1 }, { 0 } }; static const struct intr_info t6_le_intr_info[] = { { F_T6_LIPMISS, "LE LIP miss", -1, 0 }, { F_T6_LIP0, "LE 0 LIP error", -1, 0 }, { F_TCAMINTPERR, "LE parity error", -1, 1 }, { F_T6_UNKNOWNCMD, "LE unknown command", -1, 1 }, { F_SSRAMINTPERR, "LE request queue parity error", -1, 1 }, { 0 } }; if (t4_handle_intr_status(adap, A_LE_DB_INT_CAUSE, (chip_ver <= CHELSIO_T5) ? le_intr_info : t6_le_intr_info)) t4_fatal_err(adap); } /* * MPS interrupt handler. 
*/ static void mps_intr_handler(struct adapter *adapter) { static const struct intr_info mps_rx_intr_info[] = { { 0xffffff, "MPS Rx parity error", -1, 1 }, { 0 } }; static const struct intr_info mps_tx_intr_info[] = { { V_TPFIFO(M_TPFIFO), "MPS Tx TP FIFO parity error", -1, 1 }, { F_NCSIFIFO, "MPS Tx NC-SI FIFO parity error", -1, 1 }, { V_TXDATAFIFO(M_TXDATAFIFO), "MPS Tx data FIFO parity error", -1, 1 }, { V_TXDESCFIFO(M_TXDESCFIFO), "MPS Tx desc FIFO parity error", -1, 1 }, { F_BUBBLE, "MPS Tx underflow", -1, 1 }, { F_SECNTERR, "MPS Tx SOP/EOP error", -1, 1 }, { F_FRMERR, "MPS Tx framing error", -1, 1 }, { 0 } }; static const struct intr_info mps_trc_intr_info[] = { { V_FILTMEM(M_FILTMEM), "MPS TRC filter parity error", -1, 1 }, { V_PKTFIFO(M_PKTFIFO), "MPS TRC packet FIFO parity error", -1, 1 }, { F_MISCPERR, "MPS TRC misc parity error", -1, 1 }, { 0 } }; static const struct intr_info mps_stat_sram_intr_info[] = { { 0x1fffff, "MPS statistics SRAM parity error", -1, 1 }, { 0 } }; static const struct intr_info mps_stat_tx_intr_info[] = { { 0xfffff, "MPS statistics Tx FIFO parity error", -1, 1 }, { 0 } }; static const struct intr_info mps_stat_rx_intr_info[] = { { 0xffffff, "MPS statistics Rx FIFO parity error", -1, 1 }, { 0 } }; static const struct intr_info mps_cls_intr_info[] = { { F_MATCHSRAM, "MPS match SRAM parity error", -1, 1 }, { F_MATCHTCAM, "MPS match TCAM parity error", -1, 1 }, { F_HASHSRAM, "MPS hash SRAM parity error", -1, 1 }, { 0 } }; int fat; fat = t4_handle_intr_status(adapter, A_MPS_RX_PERR_INT_CAUSE, mps_rx_intr_info) + t4_handle_intr_status(adapter, A_MPS_TX_INT_CAUSE, mps_tx_intr_info) + t4_handle_intr_status(adapter, A_MPS_TRC_INT_CAUSE, mps_trc_intr_info) + t4_handle_intr_status(adapter, A_MPS_STAT_PERR_INT_CAUSE_SRAM, mps_stat_sram_intr_info) + t4_handle_intr_status(adapter, A_MPS_STAT_PERR_INT_CAUSE_TX_FIFO, mps_stat_tx_intr_info) + t4_handle_intr_status(adapter, A_MPS_STAT_PERR_INT_CAUSE_RX_FIFO, mps_stat_rx_intr_info) + t4_handle_intr_status(adapter, A_MPS_CLS_INT_CAUSE, mps_cls_intr_info); t4_write_reg(adapter, A_MPS_INT_CAUSE, 0); t4_read_reg(adapter, A_MPS_INT_CAUSE); /* flush */ if (fat) t4_fatal_err(adapter); } #define MEM_INT_MASK (F_PERR_INT_CAUSE | F_ECC_CE_INT_CAUSE | \ F_ECC_UE_INT_CAUSE) /* * EDC/MC interrupt handler. */ static void mem_intr_handler(struct adapter *adapter, int idx) { static const char name[4][7] = { "EDC0", "EDC1", "MC/MC0", "MC1" }; unsigned int addr, cnt_addr, v; if (idx <= MEM_EDC1) { addr = EDC_REG(A_EDC_INT_CAUSE, idx); cnt_addr = EDC_REG(A_EDC_ECC_STATUS, idx); } else if (idx == MEM_MC) { if (is_t4(adapter)) { addr = A_MC_INT_CAUSE; cnt_addr = A_MC_ECC_STATUS; } else { addr = A_MC_P_INT_CAUSE; cnt_addr = A_MC_P_ECC_STATUS; } } else { addr = MC_REG(A_MC_P_INT_CAUSE, 1); cnt_addr = MC_REG(A_MC_P_ECC_STATUS, 1); } v = t4_read_reg(adapter, addr) & MEM_INT_MASK; if (v & F_PERR_INT_CAUSE) CH_ALERT(adapter, "%s FIFO parity error\n", name[idx]); if (v & F_ECC_CE_INT_CAUSE) { u32 cnt = G_ECC_CECNT(t4_read_reg(adapter, cnt_addr)); if (idx <= MEM_EDC1) t4_edc_err_read(adapter, idx); t4_write_reg(adapter, cnt_addr, V_ECC_CECNT(M_ECC_CECNT)); CH_WARN_RATELIMIT(adapter, "%u %s correctable ECC data error%s\n", cnt, name[idx], cnt > 1 ? "s" : ""); } if (v & F_ECC_UE_INT_CAUSE) CH_ALERT(adapter, "%s uncorrectable ECC data error\n", name[idx]); t4_write_reg(adapter, addr, v); if (v & (F_PERR_INT_CAUSE | F_ECC_UE_INT_CAUSE)) t4_fatal_err(adapter); } /* * MA interrupt handler. 
*/ static void ma_intr_handler(struct adapter *adapter) { u32 v, status = t4_read_reg(adapter, A_MA_INT_CAUSE); if (status & F_MEM_PERR_INT_CAUSE) { CH_ALERT(adapter, "MA parity error, parity status %#x\n", t4_read_reg(adapter, A_MA_PARITY_ERROR_STATUS1)); if (is_t5(adapter)) CH_ALERT(adapter, "MA parity error, parity status %#x\n", t4_read_reg(adapter, A_MA_PARITY_ERROR_STATUS2)); } if (status & F_MEM_WRAP_INT_CAUSE) { v = t4_read_reg(adapter, A_MA_INT_WRAP_STATUS); CH_ALERT(adapter, "MA address wrap-around error by " "client %u to address %#x\n", G_MEM_WRAP_CLIENT_NUM(v), G_MEM_WRAP_ADDRESS(v) << 4); } t4_write_reg(adapter, A_MA_INT_CAUSE, status); t4_fatal_err(adapter); } /* * SMB interrupt handler. */ static void smb_intr_handler(struct adapter *adap) { static const struct intr_info smb_intr_info[] = { { F_MSTTXFIFOPARINT, "SMB master Tx FIFO parity error", -1, 1 }, { F_MSTRXFIFOPARINT, "SMB master Rx FIFO parity error", -1, 1 }, { F_SLVFIFOPARINT, "SMB slave FIFO parity error", -1, 1 }, { 0 } }; if (t4_handle_intr_status(adap, A_SMB_INT_CAUSE, smb_intr_info)) t4_fatal_err(adap); } /* * NC-SI interrupt handler. */ static void ncsi_intr_handler(struct adapter *adap) { static const struct intr_info ncsi_intr_info[] = { { F_CIM_DM_PRTY_ERR, "NC-SI CIM parity error", -1, 1 }, { F_MPS_DM_PRTY_ERR, "NC-SI MPS parity error", -1, 1 }, { F_TXFIFO_PRTY_ERR, "NC-SI Tx FIFO parity error", -1, 1 }, { F_RXFIFO_PRTY_ERR, "NC-SI Rx FIFO parity error", -1, 1 }, { 0 } }; if (t4_handle_intr_status(adap, A_NCSI_INT_CAUSE, ncsi_intr_info)) t4_fatal_err(adap); } /* * XGMAC interrupt handler. */ static void xgmac_intr_handler(struct adapter *adap, int port) { u32 v, int_cause_reg; if (is_t4(adap)) int_cause_reg = PORT_REG(port, A_XGMAC_PORT_INT_CAUSE); else int_cause_reg = T5_PORT_REG(port, A_MAC_PORT_INT_CAUSE); v = t4_read_reg(adap, int_cause_reg); v &= (F_TXFIFO_PRTY_ERR | F_RXFIFO_PRTY_ERR); if (!v) return; if (v & F_TXFIFO_PRTY_ERR) CH_ALERT(adap, "XGMAC %d Tx FIFO parity error\n", port); if (v & F_RXFIFO_PRTY_ERR) CH_ALERT(adap, "XGMAC %d Rx FIFO parity error\n", port); t4_write_reg(adap, int_cause_reg, v); t4_fatal_err(adap); } /* * PL interrupt handler. */ static void pl_intr_handler(struct adapter *adap) { static const struct intr_info pl_intr_info[] = { { F_FATALPERR, "Fatal parity error", -1, 1 }, { F_PERRVFID, "PL VFID_MAP parity error", -1, 1 }, { 0 } }; static const struct intr_info t5_pl_intr_info[] = { { F_FATALPERR, "Fatal parity error", -1, 1 }, { 0 } }; if (t4_handle_intr_status(adap, A_PL_PL_INT_CAUSE, is_t4(adap) ? pl_intr_info : t5_pl_intr_info)) t4_fatal_err(adap); } #define PF_INTR_MASK (F_PFSW | F_PFCIM) /** * t4_slow_intr_handler - control path interrupt handler * @adapter: the adapter * * T4 interrupt handler for non-data global interrupt events, e.g., errors. * The designation 'slow' is because it involves register reads, while * data interrupts typically don't involve any MMIOs. 
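 *
 * Returns 0 if none of the causes in GLBL_INTR_MASK are pending, and 1
 * otherwise, i.e. if at least one per-module handler was invoked.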
*/ int t4_slow_intr_handler(struct adapter *adapter) { u32 cause = t4_read_reg(adapter, A_PL_INT_CAUSE); if (!(cause & GLBL_INTR_MASK)) return 0; if (cause & F_CIM) cim_intr_handler(adapter); if (cause & F_MPS) mps_intr_handler(adapter); if (cause & F_NCSI) ncsi_intr_handler(adapter); if (cause & F_PL) pl_intr_handler(adapter); if (cause & F_SMB) smb_intr_handler(adapter); if (cause & F_MAC0) xgmac_intr_handler(adapter, 0); if (cause & F_MAC1) xgmac_intr_handler(adapter, 1); if (cause & F_MAC2) xgmac_intr_handler(adapter, 2); if (cause & F_MAC3) xgmac_intr_handler(adapter, 3); if (cause & F_PCIE) pcie_intr_handler(adapter); if (cause & F_MC0) mem_intr_handler(adapter, MEM_MC); if (is_t5(adapter) && (cause & F_MC1)) mem_intr_handler(adapter, MEM_MC1); if (cause & F_EDC0) mem_intr_handler(adapter, MEM_EDC0); if (cause & F_EDC1) mem_intr_handler(adapter, MEM_EDC1); if (cause & F_LE) le_intr_handler(adapter); if (cause & F_TP) tp_intr_handler(adapter); if (cause & F_MA) ma_intr_handler(adapter); if (cause & F_PM_TX) pmtx_intr_handler(adapter); if (cause & F_PM_RX) pmrx_intr_handler(adapter); if (cause & F_ULP_RX) ulprx_intr_handler(adapter); if (cause & F_CPL_SWITCH) cplsw_intr_handler(adapter); if (cause & F_SGE) sge_intr_handler(adapter); if (cause & F_ULP_TX) ulptx_intr_handler(adapter); /* Clear the interrupts just processed for which we are the master. */ t4_write_reg(adapter, A_PL_INT_CAUSE, cause & GLBL_INTR_MASK); (void)t4_read_reg(adapter, A_PL_INT_CAUSE); /* flush */ return 1; } /** * t4_intr_enable - enable interrupts * @adapter: the adapter whose interrupts should be enabled * * Enable PF-specific interrupts for the calling function and the top-level * interrupt concentrator for global interrupts. Interrupts are already * enabled at each module, here we just enable the roots of the interrupt * hierarchies. * * Note: this function should be called only when the driver manages * non PF-specific interrupts from the various HW modules. Only one PCI * function at a time should be doing this. */ void t4_intr_enable(struct adapter *adapter) { u32 val = 0; u32 whoami = t4_read_reg(adapter, A_PL_WHOAMI); u32 pf = (chip_id(adapter) <= CHELSIO_T5 ? G_SOURCEPF(whoami) : G_T6_SOURCEPF(whoami)); if (chip_id(adapter) <= CHELSIO_T5) val = F_ERR_DROPPED_DB | F_ERR_EGR_CTXT_PRIO | F_DBFIFO_HP_INT; else val = F_ERR_PCIE_ERROR0 | F_ERR_PCIE_ERROR1 | F_FATAL_WRE_LEN; t4_write_reg(adapter, A_SGE_INT_ENABLE3, F_ERR_CPL_EXCEED_IQE_SIZE | F_ERR_INVALID_CIDX_INC | F_ERR_CPL_OPCODE_0 | F_ERR_DATA_CPL_ON_HIGH_QID1 | F_INGRESS_SIZE_ERR | F_ERR_DATA_CPL_ON_HIGH_QID0 | F_ERR_BAD_DB_PIDX3 | F_ERR_BAD_DB_PIDX2 | F_ERR_BAD_DB_PIDX1 | F_ERR_BAD_DB_PIDX0 | F_ERR_ING_CTXT_PRIO | F_DBFIFO_LP_INT | F_EGRESS_SIZE_ERR | val); t4_write_reg(adapter, MYPF_REG(A_PL_PF_INT_ENABLE), PF_INTR_MASK); t4_set_reg_field(adapter, A_PL_INT_MAP0, 0, 1 << pf); } /** * t4_intr_disable - disable interrupts * @adapter: the adapter whose interrupts should be disabled * * Disable interrupts. We only disable the top-level interrupt * concentrators. The caller must be a PCI function managing global * interrupts. */ void t4_intr_disable(struct adapter *adapter) { u32 whoami = t4_read_reg(adapter, A_PL_WHOAMI); u32 pf = (chip_id(adapter) <= CHELSIO_T5 ? G_SOURCEPF(whoami) : G_T6_SOURCEPF(whoami)); t4_write_reg(adapter, MYPF_REG(A_PL_PF_INT_ENABLE), 0); t4_set_reg_field(adapter, A_PL_INT_MAP0, 1 << pf, 0); } /** * t4_intr_clear - clear all interrupts * @adapter: the adapter whose interrupts should be cleared * * Clears all interrupts. 
The caller must be a PCI function managing * global interrupts. */ void t4_intr_clear(struct adapter *adapter) { static const unsigned int cause_reg[] = { A_SGE_INT_CAUSE1, A_SGE_INT_CAUSE2, A_SGE_INT_CAUSE3, A_PCIE_NONFAT_ERR, A_PCIE_INT_CAUSE, A_MA_INT_WRAP_STATUS, A_MA_PARITY_ERROR_STATUS1, A_MA_INT_CAUSE, A_EDC_INT_CAUSE, EDC_REG(A_EDC_INT_CAUSE, 1), A_CIM_HOST_INT_CAUSE, A_CIM_HOST_UPACC_INT_CAUSE, MYPF_REG(A_CIM_PF_HOST_INT_CAUSE), A_TP_INT_CAUSE, A_ULP_RX_INT_CAUSE, A_ULP_TX_INT_CAUSE, A_PM_RX_INT_CAUSE, A_PM_TX_INT_CAUSE, A_MPS_RX_PERR_INT_CAUSE, A_CPL_INTR_CAUSE, MYPF_REG(A_PL_PF_INT_CAUSE), A_PL_PL_INT_CAUSE, A_LE_DB_INT_CAUSE, }; unsigned int i; for (i = 0; i < ARRAY_SIZE(cause_reg); ++i) t4_write_reg(adapter, cause_reg[i], 0xffffffff); t4_write_reg(adapter, is_t4(adapter) ? A_MC_INT_CAUSE : A_MC_P_INT_CAUSE, 0xffffffff); if (is_t4(adapter)) { t4_write_reg(adapter, A_PCIE_CORE_UTL_SYSTEM_BUS_AGENT_STATUS, 0xffffffff); t4_write_reg(adapter, A_PCIE_CORE_UTL_PCI_EXPRESS_PORT_STATUS, 0xffffffff); } else t4_write_reg(adapter, A_MA_PARITY_ERROR_STATUS2, 0xffffffff); t4_write_reg(adapter, A_PL_INT_CAUSE, GLBL_INTR_MASK); (void) t4_read_reg(adapter, A_PL_INT_CAUSE); /* flush */ } /** * hash_mac_addr - return the hash value of a MAC address * @addr: the 48-bit Ethernet MAC address * * Hashes a MAC address according to the hash function used by HW inexact * (hash) address matching. */ static int hash_mac_addr(const u8 *addr) { u32 a = ((u32)addr[0] << 16) | ((u32)addr[1] << 8) | addr[2]; u32 b = ((u32)addr[3] << 16) | ((u32)addr[4] << 8) | addr[5]; a ^= b; a ^= (a >> 12); a ^= (a >> 6); return a & 0x3f; } /** * t4_config_rss_range - configure a portion of the RSS mapping table * @adapter: the adapter * @mbox: mbox to use for the FW command * @viid: virtual interface whose RSS subtable is to be written * @start: start entry in the table to write * @n: how many table entries to write * @rspq: values for the "response queue" (Ingress Queue) lookup table * @nrspq: number of values in @rspq * * Programs the selected part of the VI's RSS mapping table with the * provided values. If @nrspq < @n the supplied values are used repeatedly * until the full table range is populated. * * The caller must ensure the values in @rspq are in the range allowed for * @viid. */ int t4_config_rss_range(struct adapter *adapter, int mbox, unsigned int viid, int start, int n, const u16 *rspq, unsigned int nrspq) { int ret; const u16 *rsp = rspq; const u16 *rsp_end = rspq + nrspq; struct fw_rss_ind_tbl_cmd cmd; memset(&cmd, 0, sizeof(cmd)); cmd.op_to_viid = cpu_to_be32(V_FW_CMD_OP(FW_RSS_IND_TBL_CMD) | F_FW_CMD_REQUEST | F_FW_CMD_WRITE | V_FW_RSS_IND_TBL_CMD_VIID(viid)); cmd.retval_len16 = cpu_to_be32(FW_LEN16(cmd)); /* * Each firmware RSS command can accommodate up to 32 RSS Ingress * Queue Identifiers. These Ingress Queue IDs are packed three to * a 32-bit word as 10-bit values with the upper remaining 2 bits * reserved. */ while (n > 0) { int nq = min(n, 32); int nq_packed = 0; __be32 *qp = &cmd.iq0_to_iq2; /* * Set up the firmware RSS command header to send the next * "nq" Ingress Queue IDs to the firmware. */ cmd.niqid = cpu_to_be16(nq); cmd.startidx = cpu_to_be16(start); /* * "nq" more done for the start of the next loop. */ start += nq; n -= nq; /* * While there are still Ingress Queue IDs to stuff into the * current firmware RSS command, retrieve them from the * Ingress Queue ID array and insert them into the command. 
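		 * For example (illustrative), with @nrspq == 2 the two supplied
		 * queue IDs are reused with wrap-around and packed three to a
		 * word as {q0, q1, q0}, {q1, q0, q1}, ... until @n entries of
		 * the table range have been covered.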
*/ while (nq > 0) { /* * Grab up to the next 3 Ingress Queue IDs (wrapping * around the Ingress Queue ID array if necessary) and * insert them into the firmware RSS command at the * current 3-tuple position within the commad. */ u16 qbuf[3]; u16 *qbp = qbuf; int nqbuf = min(3, nq); nq -= nqbuf; qbuf[0] = qbuf[1] = qbuf[2] = 0; while (nqbuf && nq_packed < 32) { nqbuf--; nq_packed++; *qbp++ = *rsp++; if (rsp >= rsp_end) rsp = rspq; } *qp++ = cpu_to_be32(V_FW_RSS_IND_TBL_CMD_IQ0(qbuf[0]) | V_FW_RSS_IND_TBL_CMD_IQ1(qbuf[1]) | V_FW_RSS_IND_TBL_CMD_IQ2(qbuf[2])); } /* * Send this portion of the RRS table update to the firmware; * bail out on any errors. */ ret = t4_wr_mbox(adapter, mbox, &cmd, sizeof(cmd), NULL); if (ret) return ret; } return 0; } /** * t4_config_glbl_rss - configure the global RSS mode * @adapter: the adapter * @mbox: mbox to use for the FW command * @mode: global RSS mode * @flags: mode-specific flags * * Sets the global RSS mode. */ int t4_config_glbl_rss(struct adapter *adapter, int mbox, unsigned int mode, unsigned int flags) { struct fw_rss_glb_config_cmd c; memset(&c, 0, sizeof(c)); c.op_to_write = cpu_to_be32(V_FW_CMD_OP(FW_RSS_GLB_CONFIG_CMD) | F_FW_CMD_REQUEST | F_FW_CMD_WRITE); c.retval_len16 = cpu_to_be32(FW_LEN16(c)); if (mode == FW_RSS_GLB_CONFIG_CMD_MODE_MANUAL) { c.u.manual.mode_pkd = cpu_to_be32(V_FW_RSS_GLB_CONFIG_CMD_MODE(mode)); } else if (mode == FW_RSS_GLB_CONFIG_CMD_MODE_BASICVIRTUAL) { c.u.basicvirtual.mode_keymode = cpu_to_be32(V_FW_RSS_GLB_CONFIG_CMD_MODE(mode)); c.u.basicvirtual.synmapen_to_hashtoeplitz = cpu_to_be32(flags); } else return -EINVAL; return t4_wr_mbox(adapter, mbox, &c, sizeof(c), NULL); } /** * t4_config_vi_rss - configure per VI RSS settings * @adapter: the adapter * @mbox: mbox to use for the FW command * @viid: the VI id * @flags: RSS flags * @defq: id of the default RSS queue for the VI. * @skeyidx: RSS secret key table index for non-global mode * @skey: RSS vf_scramble key for VI. * * Configures VI-specific RSS properties. */ int t4_config_vi_rss(struct adapter *adapter, int mbox, unsigned int viid, unsigned int flags, unsigned int defq, unsigned int skeyidx, unsigned int skey) { struct fw_rss_vi_config_cmd c; memset(&c, 0, sizeof(c)); c.op_to_viid = cpu_to_be32(V_FW_CMD_OP(FW_RSS_VI_CONFIG_CMD) | F_FW_CMD_REQUEST | F_FW_CMD_WRITE | V_FW_RSS_VI_CONFIG_CMD_VIID(viid)); c.retval_len16 = cpu_to_be32(FW_LEN16(c)); c.u.basicvirtual.defaultq_to_udpen = cpu_to_be32(flags | V_FW_RSS_VI_CONFIG_CMD_DEFAULTQ(defq)); c.u.basicvirtual.secretkeyidx_pkd = cpu_to_be32( V_FW_RSS_VI_CONFIG_CMD_SECRETKEYIDX(skeyidx)); c.u.basicvirtual.secretkeyxor = cpu_to_be32(skey); return t4_wr_mbox(adapter, mbox, &c, sizeof(c), NULL); } /* Read an RSS table row */ static int rd_rss_row(struct adapter *adap, int row, u32 *val) { t4_write_reg(adap, A_TP_RSS_LKP_TABLE, 0xfff00000 | row); return t4_wait_op_done_val(adap, A_TP_RSS_LKP_TABLE, F_LKPTBLROWVLD, 1, 5, 0, val); } /** * t4_read_rss - read the contents of the RSS mapping table * @adapter: the adapter * @map: holds the contents of the RSS mapping table * * Reads the contents of the RSS hash->queue mapping table. 
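 * Each 32-bit lookup table row yields two queue IDs, so @map must have
 * room for all RSS_NENTRIES entries.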
*/ int t4_read_rss(struct adapter *adapter, u16 *map) { u32 val; int i, ret; for (i = 0; i < RSS_NENTRIES / 2; ++i) { ret = rd_rss_row(adapter, i, &val); if (ret) return ret; *map++ = G_LKPTBLQUEUE0(val); *map++ = G_LKPTBLQUEUE1(val); } return 0; } /** * t4_tp_fw_ldst_rw - Access TP indirect register through LDST * @adap: the adapter * @cmd: TP fw ldst address space type * @vals: where the indirect register values are stored/written * @nregs: how many indirect registers to read/write * @start_idx: index of first indirect register to read/write * @rw: Read (1) or Write (0) * @sleep_ok: if true we may sleep while awaiting command completion * * Access TP indirect registers through LDST **/ static int t4_tp_fw_ldst_rw(struct adapter *adap, int cmd, u32 *vals, unsigned int nregs, unsigned int start_index, unsigned int rw, bool sleep_ok) { int ret = 0; unsigned int i; struct fw_ldst_cmd c; for (i = 0; i < nregs; i++) { memset(&c, 0, sizeof(c)); c.op_to_addrspace = cpu_to_be32(V_FW_CMD_OP(FW_LDST_CMD) | F_FW_CMD_REQUEST | (rw ? F_FW_CMD_READ : F_FW_CMD_WRITE) | V_FW_LDST_CMD_ADDRSPACE(cmd)); c.cycles_to_len16 = cpu_to_be32(FW_LEN16(c)); c.u.addrval.addr = cpu_to_be32(start_index + i); c.u.addrval.val = rw ? 0 : cpu_to_be32(vals[i]); ret = t4_wr_mbox_meat(adap, adap->mbox, &c, sizeof(c), &c, sleep_ok); if (ret) return ret; if (rw) vals[i] = be32_to_cpu(c.u.addrval.val); } return 0; } /** * t4_tp_indirect_rw - Read/Write TP indirect register through LDST or backdoor * @adap: the adapter * @reg_addr: Address Register * @reg_data: Data register * @buff: where the indirect register values are stored/written * @nregs: how many indirect registers to read/write * @start_index: index of first indirect register to read/write * @rw: READ(1) or WRITE(0) * @sleep_ok: if true we may sleep while awaiting command completion * * Read/Write TP indirect registers through LDST if possible. 
* Else, use backdoor access **/ static void t4_tp_indirect_rw(struct adapter *adap, u32 reg_addr, u32 reg_data, u32 *buff, u32 nregs, u32 start_index, int rw, bool sleep_ok) { int rc = -EINVAL; int cmd; switch (reg_addr) { case A_TP_PIO_ADDR: cmd = FW_LDST_ADDRSPC_TP_PIO; break; case A_TP_TM_PIO_ADDR: cmd = FW_LDST_ADDRSPC_TP_TM_PIO; break; case A_TP_MIB_INDEX: cmd = FW_LDST_ADDRSPC_TP_MIB; break; default: goto indirect_access; } if (t4_use_ldst(adap)) rc = t4_tp_fw_ldst_rw(adap, cmd, buff, nregs, start_index, rw, sleep_ok); indirect_access: if (rc) { if (rw) t4_read_indirect(adap, reg_addr, reg_data, buff, nregs, start_index); else t4_write_indirect(adap, reg_addr, reg_data, buff, nregs, start_index); } } /** * t4_tp_pio_read - Read TP PIO registers * @adap: the adapter * @buff: where the indirect register values are written * @nregs: how many indirect registers to read * @start_index: index of first indirect register to read * @sleep_ok: if true we may sleep while awaiting command completion * * Read TP PIO Registers **/ void t4_tp_pio_read(struct adapter *adap, u32 *buff, u32 nregs, u32 start_index, bool sleep_ok) { t4_tp_indirect_rw(adap, A_TP_PIO_ADDR, A_TP_PIO_DATA, buff, nregs, start_index, 1, sleep_ok); } /** * t4_tp_pio_write - Write TP PIO registers * @adap: the adapter * @buff: where the indirect register values are stored * @nregs: how many indirect registers to write * @start_index: index of first indirect register to write * @sleep_ok: if true we may sleep while awaiting command completion * * Write TP PIO Registers **/ void t4_tp_pio_write(struct adapter *adap, const u32 *buff, u32 nregs, u32 start_index, bool sleep_ok) { t4_tp_indirect_rw(adap, A_TP_PIO_ADDR, A_TP_PIO_DATA, __DECONST(u32 *, buff), nregs, start_index, 0, sleep_ok); } /** * t4_tp_tm_pio_read - Read TP TM PIO registers * @adap: the adapter * @buff: where the indirect register values are written * @nregs: how many indirect registers to read * @start_index: index of first indirect register to read * @sleep_ok: if true we may sleep while awaiting command completion * * Read TP TM PIO Registers **/ void t4_tp_tm_pio_read(struct adapter *adap, u32 *buff, u32 nregs, u32 start_index, bool sleep_ok) { t4_tp_indirect_rw(adap, A_TP_TM_PIO_ADDR, A_TP_TM_PIO_DATA, buff, nregs, start_index, 1, sleep_ok); } /** * t4_tp_mib_read - Read TP MIB registers * @adap: the adapter * @buff: where the indirect register values are written * @nregs: how many indirect registers to read * @start_index: index of first indirect register to read * @sleep_ok: if true we may sleep while awaiting command completion * * Read TP MIB Registers **/ void t4_tp_mib_read(struct adapter *adap, u32 *buff, u32 nregs, u32 start_index, bool sleep_ok) { t4_tp_indirect_rw(adap, A_TP_MIB_INDEX, A_TP_MIB_DATA, buff, nregs, start_index, 1, sleep_ok); } /** * t4_read_rss_key - read the global RSS key * @adap: the adapter * @key: 10-entry array holding the 320-bit RSS key * @sleep_ok: if true we may sleep while awaiting command completion * * Reads the global 320-bit RSS key. */ void t4_read_rss_key(struct adapter *adap, u32 *key, bool sleep_ok) { t4_tp_pio_read(adap, key, 10, A_TP_RSS_SECRET_KEY0, sleep_ok); } /** * t4_write_rss_key - program one of the RSS keys * @adap: the adapter * @key: 10-entry array holding the 320-bit RSS key * @idx: which RSS key to write * @sleep_ok: if true we may sleep while awaiting command completion * * Writes one of the RSS keys with the given 320-bit value. 
If @idx is * 0..15 the corresponding entry in the RSS key table is written, * otherwise the global RSS key is written. */ void t4_write_rss_key(struct adapter *adap, const u32 *key, int idx, bool sleep_ok) { u8 rss_key_addr_cnt = 16; u32 vrt = t4_read_reg(adap, A_TP_RSS_CONFIG_VRT); /* * T6 and later: for KeyMode 3 (per-vf and per-vf scramble), * allows access to key addresses 16-63 by using KeyWrAddrX * as index[5:4](upper 2) into key table */ if ((chip_id(adap) > CHELSIO_T5) && (vrt & F_KEYEXTEND) && (G_KEYMODE(vrt) == 3)) rss_key_addr_cnt = 32; t4_tp_pio_write(adap, key, 10, A_TP_RSS_SECRET_KEY0, sleep_ok); if (idx >= 0 && idx < rss_key_addr_cnt) { if (rss_key_addr_cnt > 16) t4_write_reg(adap, A_TP_RSS_CONFIG_VRT, vrt | V_KEYWRADDRX(idx >> 4) | V_T6_VFWRADDR(idx) | F_KEYWREN); else t4_write_reg(adap, A_TP_RSS_CONFIG_VRT, vrt| V_KEYWRADDR(idx) | F_KEYWREN); } } /** * t4_read_rss_pf_config - read PF RSS Configuration Table * @adapter: the adapter * @index: the entry in the PF RSS table to read * @valp: where to store the returned value * @sleep_ok: if true we may sleep while awaiting command completion * * Reads the PF RSS Configuration Table at the specified index and returns * the value found there. */ void t4_read_rss_pf_config(struct adapter *adapter, unsigned int index, u32 *valp, bool sleep_ok) { t4_tp_pio_read(adapter, valp, 1, A_TP_RSS_PF0_CONFIG + index, sleep_ok); } /** * t4_write_rss_pf_config - write PF RSS Configuration Table * @adapter: the adapter * @index: the entry in the VF RSS table to read * @val: the value to store * @sleep_ok: if true we may sleep while awaiting command completion * * Writes the PF RSS Configuration Table at the specified index with the * specified value. */ void t4_write_rss_pf_config(struct adapter *adapter, unsigned int index, u32 val, bool sleep_ok) { t4_tp_pio_write(adapter, &val, 1, A_TP_RSS_PF0_CONFIG + index, sleep_ok); } /** * t4_read_rss_vf_config - read VF RSS Configuration Table * @adapter: the adapter * @index: the entry in the VF RSS table to read * @vfl: where to store the returned VFL * @vfh: where to store the returned VFH * @sleep_ok: if true we may sleep while awaiting command completion * * Reads the VF RSS Configuration Table at the specified index and returns * the (VFL, VFH) values found there. */ void t4_read_rss_vf_config(struct adapter *adapter, unsigned int index, u32 *vfl, u32 *vfh, bool sleep_ok) { u32 vrt, mask, data; if (chip_id(adapter) <= CHELSIO_T5) { mask = V_VFWRADDR(M_VFWRADDR); data = V_VFWRADDR(index); } else { mask = V_T6_VFWRADDR(M_T6_VFWRADDR); data = V_T6_VFWRADDR(index); } /* * Request that the index'th VF Table values be read into VFL/VFH. */ vrt = t4_read_reg(adapter, A_TP_RSS_CONFIG_VRT); vrt &= ~(F_VFRDRG | F_VFWREN | F_KEYWREN | mask); vrt |= data | F_VFRDEN; t4_write_reg(adapter, A_TP_RSS_CONFIG_VRT, vrt); /* * Grab the VFL/VFH values ... */ t4_tp_pio_read(adapter, vfl, 1, A_TP_RSS_VFL_CONFIG, sleep_ok); t4_tp_pio_read(adapter, vfh, 1, A_TP_RSS_VFH_CONFIG, sleep_ok); } /** * t4_write_rss_vf_config - write VF RSS Configuration Table * * @adapter: the adapter * @index: the entry in the VF RSS table to write * @vfl: the VFL to store * @vfh: the VFH to store * * Writes the VF RSS Configuration Table at the specified index with the * specified (VFL, VFH) values. 
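 * The @sleep_ok argument behaves as in t4_read_rss_vf_config(): if true,
 * the underlying TP PIO accesses may sleep awaiting command completion.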
*/ void t4_write_rss_vf_config(struct adapter *adapter, unsigned int index, u32 vfl, u32 vfh, bool sleep_ok) { u32 vrt, mask, data; if (chip_id(adapter) <= CHELSIO_T5) { mask = V_VFWRADDR(M_VFWRADDR); data = V_VFWRADDR(index); } else { mask = V_T6_VFWRADDR(M_T6_VFWRADDR); data = V_T6_VFWRADDR(index); } /* * Load up VFL/VFH with the values to be written ... */ t4_tp_pio_write(adapter, &vfl, 1, A_TP_RSS_VFL_CONFIG, sleep_ok); t4_tp_pio_write(adapter, &vfh, 1, A_TP_RSS_VFH_CONFIG, sleep_ok); /* * Write the VFL/VFH into the VF Table at index'th location. */ vrt = t4_read_reg(adapter, A_TP_RSS_CONFIG_VRT); vrt &= ~(F_VFRDRG | F_VFWREN | F_KEYWREN | mask); vrt |= data | F_VFRDEN; t4_write_reg(adapter, A_TP_RSS_CONFIG_VRT, vrt); } /** * t4_read_rss_pf_map - read PF RSS Map * @adapter: the adapter * @sleep_ok: if true we may sleep while awaiting command completion * * Reads the PF RSS Map register and returns its value. */ u32 t4_read_rss_pf_map(struct adapter *adapter, bool sleep_ok) { u32 pfmap; t4_tp_pio_read(adapter, &pfmap, 1, A_TP_RSS_PF_MAP, sleep_ok); return pfmap; } /** * t4_write_rss_pf_map - write PF RSS Map * @adapter: the adapter * @pfmap: PF RSS Map value * * Writes the specified value to the PF RSS Map register. */ void t4_write_rss_pf_map(struct adapter *adapter, u32 pfmap, bool sleep_ok) { t4_tp_pio_write(adapter, &pfmap, 1, A_TP_RSS_PF_MAP, sleep_ok); } /** * t4_read_rss_pf_mask - read PF RSS Mask * @adapter: the adapter * @sleep_ok: if true we may sleep while awaiting command completion * * Reads the PF RSS Mask register and returns its value. */ u32 t4_read_rss_pf_mask(struct adapter *adapter, bool sleep_ok) { u32 pfmask; t4_tp_pio_read(adapter, &pfmask, 1, A_TP_RSS_PF_MSK, sleep_ok); return pfmask; } /** * t4_write_rss_pf_mask - write PF RSS Mask * @adapter: the adapter * @pfmask: PF RSS Mask value * * Writes the specified value to the PF RSS Mask register. */ void t4_write_rss_pf_mask(struct adapter *adapter, u32 pfmask, bool sleep_ok) { t4_tp_pio_write(adapter, &pfmask, 1, A_TP_RSS_PF_MSK, sleep_ok); } /** * t4_tp_get_tcp_stats - read TP's TCP MIB counters * @adap: the adapter * @v4: holds the TCP/IP counter values * @v6: holds the TCP/IPv6 counter values * @sleep_ok: if true we may sleep while awaiting command completion * * Returns the values of TP's TCP/IP and TCP/IPv6 MIB counters. * Either @v4 or @v6 may be %NULL to skip the corresponding stats. */ void t4_tp_get_tcp_stats(struct adapter *adap, struct tp_tcp_stats *v4, struct tp_tcp_stats *v6, bool sleep_ok) { u32 val[A_TP_MIB_TCP_RXT_SEG_LO - A_TP_MIB_TCP_OUT_RST + 1]; #define STAT_IDX(x) ((A_TP_MIB_TCP_##x) - A_TP_MIB_TCP_OUT_RST) #define STAT(x) val[STAT_IDX(x)] #define STAT64(x) (((u64)STAT(x##_HI) << 32) | STAT(x##_LO)) if (v4) { t4_tp_mib_read(adap, val, ARRAY_SIZE(val), A_TP_MIB_TCP_OUT_RST, sleep_ok); v4->tcp_out_rsts = STAT(OUT_RST); v4->tcp_in_segs = STAT64(IN_SEG); v4->tcp_out_segs = STAT64(OUT_SEG); v4->tcp_retrans_segs = STAT64(RXT_SEG); } if (v6) { t4_tp_mib_read(adap, val, ARRAY_SIZE(val), A_TP_MIB_TCP_V6OUT_RST, sleep_ok); v6->tcp_out_rsts = STAT(OUT_RST); v6->tcp_in_segs = STAT64(IN_SEG); v6->tcp_out_segs = STAT64(OUT_SEG); v6->tcp_retrans_segs = STAT64(RXT_SEG); } #undef STAT64 #undef STAT #undef STAT_IDX } /** * t4_tp_get_err_stats - read TP's error MIB counters * @adap: the adapter * @st: holds the counter values * @sleep_ok: if true we may sleep while awaiting command completion * * Returns the values of TP's error counters. 
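 * The per-channel counter arrays in @st are filled for all
 * adap->chip_params->nchan channels.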
*/ void t4_tp_get_err_stats(struct adapter *adap, struct tp_err_stats *st, bool sleep_ok) { int nchan = adap->chip_params->nchan; t4_tp_mib_read(adap, st->mac_in_errs, nchan, A_TP_MIB_MAC_IN_ERR_0, sleep_ok); t4_tp_mib_read(adap, st->hdr_in_errs, nchan, A_TP_MIB_HDR_IN_ERR_0, sleep_ok); t4_tp_mib_read(adap, st->tcp_in_errs, nchan, A_TP_MIB_TCP_IN_ERR_0, sleep_ok); t4_tp_mib_read(adap, st->tnl_cong_drops, nchan, A_TP_MIB_TNL_CNG_DROP_0, sleep_ok); t4_tp_mib_read(adap, st->ofld_chan_drops, nchan, A_TP_MIB_OFD_CHN_DROP_0, sleep_ok); t4_tp_mib_read(adap, st->tnl_tx_drops, nchan, A_TP_MIB_TNL_DROP_0, sleep_ok); t4_tp_mib_read(adap, st->ofld_vlan_drops, nchan, A_TP_MIB_OFD_VLN_DROP_0, sleep_ok); t4_tp_mib_read(adap, st->tcp6_in_errs, nchan, A_TP_MIB_TCP_V6IN_ERR_0, sleep_ok); t4_tp_mib_read(adap, &st->ofld_no_neigh, 2, A_TP_MIB_OFD_ARP_DROP, sleep_ok); } /** * t4_tp_get_proxy_stats - read TP's proxy MIB counters * @adap: the adapter * @st: holds the counter values * * Returns the values of TP's proxy counters. */ void t4_tp_get_proxy_stats(struct adapter *adap, struct tp_proxy_stats *st, bool sleep_ok) { int nchan = adap->chip_params->nchan; t4_tp_mib_read(adap, st->proxy, nchan, A_TP_MIB_TNL_LPBK_0, sleep_ok); } /** * t4_tp_get_cpl_stats - read TP's CPL MIB counters * @adap: the adapter * @st: holds the counter values * @sleep_ok: if true we may sleep while awaiting command completion * * Returns the values of TP's CPL counters. */ void t4_tp_get_cpl_stats(struct adapter *adap, struct tp_cpl_stats *st, bool sleep_ok) { int nchan = adap->chip_params->nchan; t4_tp_mib_read(adap, st->req, nchan, A_TP_MIB_CPL_IN_REQ_0, sleep_ok); t4_tp_mib_read(adap, st->rsp, nchan, A_TP_MIB_CPL_OUT_RSP_0, sleep_ok); } /** * t4_tp_get_rdma_stats - read TP's RDMA MIB counters * @adap: the adapter * @st: holds the counter values * * Returns the values of TP's RDMA counters. */ void t4_tp_get_rdma_stats(struct adapter *adap, struct tp_rdma_stats *st, bool sleep_ok) { t4_tp_mib_read(adap, &st->rqe_dfr_pkt, 2, A_TP_MIB_RQE_DFR_PKT, sleep_ok); } /** * t4_get_fcoe_stats - read TP's FCoE MIB counters for a port * @adap: the adapter * @idx: the port index * @st: holds the counter values * @sleep_ok: if true we may sleep while awaiting command completion * * Returns the values of TP's FCoE counters for the selected port. */ void t4_get_fcoe_stats(struct adapter *adap, unsigned int idx, struct tp_fcoe_stats *st, bool sleep_ok) { u32 val[2]; t4_tp_mib_read(adap, &st->frames_ddp, 1, A_TP_MIB_FCOE_DDP_0 + idx, sleep_ok); t4_tp_mib_read(adap, &st->frames_drop, 1, A_TP_MIB_FCOE_DROP_0 + idx, sleep_ok); t4_tp_mib_read(adap, val, 2, A_TP_MIB_FCOE_BYTE_0_HI + 2 * idx, sleep_ok); st->octets_ddp = ((u64)val[0] << 32) | val[1]; } /** * t4_get_usm_stats - read TP's non-TCP DDP MIB counters * @adap: the adapter * @st: holds the counter values * @sleep_ok: if true we may sleep while awaiting command completion * * Returns the values of TP's counters for non-TCP directly-placed packets. */ void t4_get_usm_stats(struct adapter *adap, struct tp_usm_stats *st, bool sleep_ok) { u32 val[4]; t4_tp_mib_read(adap, val, 4, A_TP_MIB_USM_PKTS, sleep_ok); st->frames = val[0]; st->drops = val[1]; st->octets = ((u64)val[2] << 32) | val[3]; } /** * t4_read_mtu_tbl - returns the values in the HW path MTU table * @adap: the adapter * @mtus: where to store the MTU values * @mtu_log: where to store the MTU base-2 log (may be %NULL) * * Reads the HW path MTU table. 
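 * Both @mtus and, when supplied, @mtu_log must have room for NMTUS
 * entries.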
*/ void t4_read_mtu_tbl(struct adapter *adap, u16 *mtus, u8 *mtu_log) { u32 v; int i; for (i = 0; i < NMTUS; ++i) { t4_write_reg(adap, A_TP_MTU_TABLE, V_MTUINDEX(0xff) | V_MTUVALUE(i)); v = t4_read_reg(adap, A_TP_MTU_TABLE); mtus[i] = G_MTUVALUE(v); if (mtu_log) mtu_log[i] = G_MTUWIDTH(v); } } /** * t4_read_cong_tbl - reads the congestion control table * @adap: the adapter * @incr: where to store the alpha values * * Reads the additive increments programmed into the HW congestion * control table. */ void t4_read_cong_tbl(struct adapter *adap, u16 incr[NMTUS][NCCTRL_WIN]) { unsigned int mtu, w; for (mtu = 0; mtu < NMTUS; ++mtu) for (w = 0; w < NCCTRL_WIN; ++w) { t4_write_reg(adap, A_TP_CCTRL_TABLE, V_ROWINDEX(0xffff) | (mtu << 5) | w); incr[mtu][w] = (u16)t4_read_reg(adap, A_TP_CCTRL_TABLE) & 0x1fff; } } /** * t4_tp_wr_bits_indirect - set/clear bits in an indirect TP register * @adap: the adapter * @addr: the indirect TP register address * @mask: specifies the field within the register to modify * @val: new value for the field * * Sets a field of an indirect TP register to the given value. */ void t4_tp_wr_bits_indirect(struct adapter *adap, unsigned int addr, unsigned int mask, unsigned int val) { t4_write_reg(adap, A_TP_PIO_ADDR, addr); val |= t4_read_reg(adap, A_TP_PIO_DATA) & ~mask; t4_write_reg(adap, A_TP_PIO_DATA, val); } /** * init_cong_ctrl - initialize congestion control parameters * @a: the alpha values for congestion control * @b: the beta values for congestion control * * Initialize the congestion control parameters. */ static void init_cong_ctrl(unsigned short *a, unsigned short *b) { a[0] = a[1] = a[2] = a[3] = a[4] = a[5] = a[6] = a[7] = a[8] = 1; a[9] = 2; a[10] = 3; a[11] = 4; a[12] = 5; a[13] = 6; a[14] = 7; a[15] = 8; a[16] = 9; a[17] = 10; a[18] = 14; a[19] = 17; a[20] = 21; a[21] = 25; a[22] = 30; a[23] = 35; a[24] = 45; a[25] = 60; a[26] = 80; a[27] = 100; a[28] = 200; a[29] = 300; a[30] = 400; a[31] = 500; b[0] = b[1] = b[2] = b[3] = b[4] = b[5] = b[6] = b[7] = b[8] = 0; b[9] = b[10] = 1; b[11] = b[12] = 2; b[13] = b[14] = b[15] = b[16] = 3; b[17] = b[18] = b[19] = b[20] = b[21] = 4; b[22] = b[23] = b[24] = b[25] = b[26] = b[27] = 5; b[28] = b[29] = 6; b[30] = b[31] = 7; } /* The minimum additive increment value for the congestion control table */ #define CC_MIN_INCR 2U /** * t4_load_mtus - write the MTU and congestion control HW tables * @adap: the adapter * @mtus: the values for the MTU table * @alpha: the values for the congestion control alpha parameter * @beta: the values for the congestion control beta parameter * * Write the HW MTU table with the supplied MTUs and the high-speed * congestion control table with the supplied alpha, beta, and MTUs. * We write the two tables together because the additive increments * depend on the MTUs. 
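 * For example (illustrative values): with an MTU of 1500, an alpha of 2,
 * and an average window of 56 packets, the programmed increment is
 * max((1500 - 40) * 2 / 56, CC_MIN_INCR) = 52.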
*/ void t4_load_mtus(struct adapter *adap, const unsigned short *mtus, const unsigned short *alpha, const unsigned short *beta) { static const unsigned int avg_pkts[NCCTRL_WIN] = { 2, 6, 10, 14, 20, 28, 40, 56, 80, 112, 160, 224, 320, 448, 640, 896, 1281, 1792, 2560, 3584, 5120, 7168, 10240, 14336, 20480, 28672, 40960, 57344, 81920, 114688, 163840, 229376 }; unsigned int i, w; for (i = 0; i < NMTUS; ++i) { unsigned int mtu = mtus[i]; unsigned int log2 = fls(mtu); if (!(mtu & ((1 << log2) >> 2))) /* round */ log2--; t4_write_reg(adap, A_TP_MTU_TABLE, V_MTUINDEX(i) | V_MTUWIDTH(log2) | V_MTUVALUE(mtu)); for (w = 0; w < NCCTRL_WIN; ++w) { unsigned int inc; inc = max(((mtu - 40) * alpha[w]) / avg_pkts[w], CC_MIN_INCR); t4_write_reg(adap, A_TP_CCTRL_TABLE, (i << 21) | (w << 16) | (beta[w] << 13) | inc); } } } /** * t4_set_pace_tbl - set the pace table * @adap: the adapter * @pace_vals: the pace values in microseconds * @start: index of the first entry in the HW pace table to set * @n: how many entries to set * * Sets (a subset of the) HW pace table. */ int t4_set_pace_tbl(struct adapter *adap, const unsigned int *pace_vals, unsigned int start, unsigned int n) { unsigned int vals[NTX_SCHED], i; unsigned int tick_ns = dack_ticks_to_usec(adap, 1000); if (n > NTX_SCHED) return -ERANGE; /* convert values from us to dack ticks, rounding to closest value */ for (i = 0; i < n; i++, pace_vals++) { vals[i] = (1000 * *pace_vals + tick_ns / 2) / tick_ns; if (vals[i] > 0x7ff) return -ERANGE; if (*pace_vals && vals[i] == 0) return -ERANGE; } for (i = 0; i < n; i++, start++) t4_write_reg(adap, A_TP_PACE_TABLE, (start << 16) | vals[i]); return 0; } /** * t4_set_sched_bps - set the bit rate for a HW traffic scheduler * @adap: the adapter * @kbps: target rate in Kbps * @sched: the scheduler index * * Configure a Tx HW scheduler for the target rate. */ int t4_set_sched_bps(struct adapter *adap, int sched, unsigned int kbps) { unsigned int v, tps, cpt, bpt, delta, mindelta = ~0; unsigned int clk = adap->params.vpd.cclk * 1000; unsigned int selected_cpt = 0, selected_bpt = 0; if (kbps > 0) { kbps *= 125; /* -> bytes */ for (cpt = 1; cpt <= 255; cpt++) { tps = clk / cpt; bpt = (kbps + tps / 2) / tps; if (bpt > 0 && bpt <= 255) { v = bpt * tps; delta = v >= kbps ? v - kbps : kbps - v; if (delta < mindelta) { mindelta = delta; selected_cpt = cpt; selected_bpt = bpt; } } else if (selected_cpt) break; } if (!selected_cpt) return -EINVAL; } t4_write_reg(adap, A_TP_TM_PIO_ADDR, A_TP_TX_MOD_Q1_Q0_RATE_LIMIT - sched / 2); v = t4_read_reg(adap, A_TP_TM_PIO_DATA); if (sched & 1) v = (v & 0xffff) | (selected_cpt << 16) | (selected_bpt << 24); else v = (v & 0xffff0000) | selected_cpt | (selected_bpt << 8); t4_write_reg(adap, A_TP_TM_PIO_DATA, v); return 0; } /** * t4_set_sched_ipg - set the IPG for a Tx HW packet rate scheduler * @adap: the adapter * @sched: the scheduler index * @ipg: the interpacket delay in tenths of nanoseconds * * Set the interpacket delay for a HW packet rate scheduler. 
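 * For example, assuming a 250 MHz core clock, an @ipg of 1000 (100 ns)
 * converts to (1000 * 250 + 5000) / 10000 = 25 core clocks.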
*/ int t4_set_sched_ipg(struct adapter *adap, int sched, unsigned int ipg) { unsigned int v, addr = A_TP_TX_MOD_Q1_Q0_TIMER_SEPARATOR - sched / 2; /* convert ipg to nearest number of core clocks */ ipg *= core_ticks_per_usec(adap); ipg = (ipg + 5000) / 10000; if (ipg > M_TXTIMERSEPQ0) return -EINVAL; t4_write_reg(adap, A_TP_TM_PIO_ADDR, addr); v = t4_read_reg(adap, A_TP_TM_PIO_DATA); if (sched & 1) v = (v & V_TXTIMERSEPQ0(M_TXTIMERSEPQ0)) | V_TXTIMERSEPQ1(ipg); else v = (v & V_TXTIMERSEPQ1(M_TXTIMERSEPQ1)) | V_TXTIMERSEPQ0(ipg); t4_write_reg(adap, A_TP_TM_PIO_DATA, v); t4_read_reg(adap, A_TP_TM_PIO_DATA); return 0; } /* * Calculates a rate in bytes/s given the number of 256-byte units per 4K core * clocks. The formula is * * bytes/s = bytes256 * 256 * ClkFreq / 4096 * * which is equivalent to * * bytes/s = 62.5 * bytes256 * ClkFreq_ms */ static u64 chan_rate(struct adapter *adap, unsigned int bytes256) { u64 v = bytes256 * adap->params.vpd.cclk; return v * 62 + v / 2; } /** * t4_get_chan_txrate - get the current per channel Tx rates * @adap: the adapter * @nic_rate: rates for NIC traffic * @ofld_rate: rates for offloaded traffic * * Return the current Tx rates in bytes/s for NIC and offloaded traffic * for each channel. */ void t4_get_chan_txrate(struct adapter *adap, u64 *nic_rate, u64 *ofld_rate) { u32 v; v = t4_read_reg(adap, A_TP_TX_TRATE); nic_rate[0] = chan_rate(adap, G_TNLRATE0(v)); nic_rate[1] = chan_rate(adap, G_TNLRATE1(v)); if (adap->chip_params->nchan > 2) { nic_rate[2] = chan_rate(adap, G_TNLRATE2(v)); nic_rate[3] = chan_rate(adap, G_TNLRATE3(v)); } v = t4_read_reg(adap, A_TP_TX_ORATE); ofld_rate[0] = chan_rate(adap, G_OFDRATE0(v)); ofld_rate[1] = chan_rate(adap, G_OFDRATE1(v)); if (adap->chip_params->nchan > 2) { ofld_rate[2] = chan_rate(adap, G_OFDRATE2(v)); ofld_rate[3] = chan_rate(adap, G_OFDRATE3(v)); } } /** * t4_set_trace_filter - configure one of the tracing filters * @adap: the adapter * @tp: the desired trace filter parameters * @idx: which filter to configure * @enable: whether to enable or disable the filter * * Configures one of the tracing filters available in HW. If @tp is %NULL * it indicates that the filter is already written in the register and it * just needs to be enabled or disabled. */ int t4_set_trace_filter(struct adapter *adap, const struct trace_params *tp, int idx, int enable) { int i, ofst = idx * 4; u32 data_reg, mask_reg, cfg; u32 multitrc = F_TRCMULTIFILTER; u32 en = is_t4(adap) ? F_TFEN : F_T5_TFEN; if (idx < 0 || idx >= NTRACE) return -EINVAL; if (tp == NULL || !enable) { t4_set_reg_field(adap, A_MPS_TRC_FILTER_MATCH_CTL_A + ofst, en, enable ? en : 0); return 0; } /* * TODO - After T4 data book is updated, specify the exact * section below. * * See T4 data book - MPS section for a complete description * of the below if..else handling of A_MPS_TRC_CFG register * value. */ cfg = t4_read_reg(adap, A_MPS_TRC_CFG); if (cfg & F_TRCMULTIFILTER) { /* * If multiple tracers are enabled, then maximum * capture size is 2.5KB (FIFO size of a single channel) * minus 2 flits for CPL_TRACE_PKT header. */ if (tp->snap_len > ((10 * 1024 / 4) - (2 * 8))) return -EINVAL; } else { /* * If multiple tracers are disabled, to avoid deadlocks * maximum packet capture size of 9600 bytes is recommended. * Also in this mode, only trace0 can be enabled and running. */ multitrc = 0; if (tp->snap_len > 9600 || idx) return -EINVAL; } if (tp->port > (is_t4(adap) ? 
11 : 19) || tp->invert > 1 || tp->skip_len > M_TFLENGTH || tp->skip_ofst > M_TFOFFSET || tp->min_len > M_TFMINPKTSIZE) return -EINVAL; /* stop the tracer we'll be changing */ t4_set_reg_field(adap, A_MPS_TRC_FILTER_MATCH_CTL_A + ofst, en, 0); idx *= (A_MPS_TRC_FILTER1_MATCH - A_MPS_TRC_FILTER0_MATCH); data_reg = A_MPS_TRC_FILTER0_MATCH + idx; mask_reg = A_MPS_TRC_FILTER0_DONT_CARE + idx; for (i = 0; i < TRACE_LEN / 4; i++, data_reg += 4, mask_reg += 4) { t4_write_reg(adap, data_reg, tp->data[i]); t4_write_reg(adap, mask_reg, ~tp->mask[i]); } t4_write_reg(adap, A_MPS_TRC_FILTER_MATCH_CTL_B + ofst, V_TFCAPTUREMAX(tp->snap_len) | V_TFMINPKTSIZE(tp->min_len)); t4_write_reg(adap, A_MPS_TRC_FILTER_MATCH_CTL_A + ofst, V_TFOFFSET(tp->skip_ofst) | V_TFLENGTH(tp->skip_len) | en | (is_t4(adap) ? V_TFPORT(tp->port) | V_TFINVERTMATCH(tp->invert) : V_T5_TFPORT(tp->port) | V_T5_TFINVERTMATCH(tp->invert))); return 0; } /** * t4_get_trace_filter - query one of the tracing filters * @adap: the adapter * @tp: the current trace filter parameters * @idx: which trace filter to query * @enabled: non-zero if the filter is enabled * * Returns the current settings of one of the HW tracing filters. */ void t4_get_trace_filter(struct adapter *adap, struct trace_params *tp, int idx, int *enabled) { u32 ctla, ctlb; int i, ofst = idx * 4; u32 data_reg, mask_reg; ctla = t4_read_reg(adap, A_MPS_TRC_FILTER_MATCH_CTL_A + ofst); ctlb = t4_read_reg(adap, A_MPS_TRC_FILTER_MATCH_CTL_B + ofst); if (is_t4(adap)) { *enabled = !!(ctla & F_TFEN); tp->port = G_TFPORT(ctla); tp->invert = !!(ctla & F_TFINVERTMATCH); } else { *enabled = !!(ctla & F_T5_TFEN); tp->port = G_T5_TFPORT(ctla); tp->invert = !!(ctla & F_T5_TFINVERTMATCH); } tp->snap_len = G_TFCAPTUREMAX(ctlb); tp->min_len = G_TFMINPKTSIZE(ctlb); tp->skip_ofst = G_TFOFFSET(ctla); tp->skip_len = G_TFLENGTH(ctla); ofst = (A_MPS_TRC_FILTER1_MATCH - A_MPS_TRC_FILTER0_MATCH) * idx; data_reg = A_MPS_TRC_FILTER0_MATCH + ofst; mask_reg = A_MPS_TRC_FILTER0_DONT_CARE + ofst; for (i = 0; i < TRACE_LEN / 4; i++, data_reg += 4, mask_reg += 4) { tp->mask[i] = ~t4_read_reg(adap, mask_reg); tp->data[i] = t4_read_reg(adap, data_reg) & tp->mask[i]; } } /** * t4_pmtx_get_stats - returns the HW stats from PMTX * @adap: the adapter * @cnt: where to store the count statistics * @cycles: where to store the cycle statistics * * Returns performance statistics from PMTX. */ void t4_pmtx_get_stats(struct adapter *adap, u32 cnt[], u64 cycles[]) { int i; u32 data[2]; for (i = 0; i < adap->chip_params->pm_stats_cnt; i++) { t4_write_reg(adap, A_PM_TX_STAT_CONFIG, i + 1); cnt[i] = t4_read_reg(adap, A_PM_TX_STAT_COUNT); if (is_t4(adap)) cycles[i] = t4_read_reg64(adap, A_PM_TX_STAT_LSB); else { t4_read_indirect(adap, A_PM_TX_DBG_CTRL, A_PM_TX_DBG_DATA, data, 2, A_PM_TX_DBG_STAT_MSB); cycles[i] = (((u64)data[0] << 32) | data[1]); } } } /** * t4_pmrx_get_stats - returns the HW stats from PMRX * @adap: the adapter * @cnt: where to store the count statistics * @cycles: where to store the cycle statistics * * Returns performance statistics from PMRX. 
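 * Both @cnt and @cycles must have room for
 * adap->chip_params->pm_stats_cnt entries.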
*/ void t4_pmrx_get_stats(struct adapter *adap, u32 cnt[], u64 cycles[]) { int i; u32 data[2]; for (i = 0; i < adap->chip_params->pm_stats_cnt; i++) { t4_write_reg(adap, A_PM_RX_STAT_CONFIG, i + 1); cnt[i] = t4_read_reg(adap, A_PM_RX_STAT_COUNT); if (is_t4(adap)) { cycles[i] = t4_read_reg64(adap, A_PM_RX_STAT_LSB); } else { t4_read_indirect(adap, A_PM_RX_DBG_CTRL, A_PM_RX_DBG_DATA, data, 2, A_PM_RX_DBG_STAT_MSB); cycles[i] = (((u64)data[0] << 32) | data[1]); } } } /** * t4_get_mps_bg_map - return the buffer groups associated with a port * @adap: the adapter * @idx: the port index * * Returns a bitmap indicating which MPS buffer groups are associated * with the given port. Bit i is set if buffer group i is used by the * port. */ static unsigned int t4_get_mps_bg_map(struct adapter *adap, int idx) { u32 n = G_NUMPORTS(t4_read_reg(adap, A_MPS_CMN_CTL)); if (n == 0) return idx == 0 ? 0xf : 0; if (n == 1 && chip_id(adap) <= CHELSIO_T5) return idx < 2 ? (3 << (2 * idx)) : 0; return 1 << idx; } /** * t4_get_port_type_description - return Port Type string description * @port_type: firmware Port Type enumeration */ const char *t4_get_port_type_description(enum fw_port_type port_type) { static const char *const port_type_description[] = { "Fiber_XFI", "Fiber_XAUI", "BT_SGMII", "BT_XFI", "BT_XAUI", "KX4", "CX4", "KX", "KR", "SFP", "BP_AP", "BP4_AP", "QSFP_10G", "QSA", "QSFP", "BP40_BA", "KR4_100G", "CR4_QSFP", "CR_QSFP", "CR2_QSFP", "SFP28", "KR_SFP28", }; if (port_type < ARRAY_SIZE(port_type_description)) return port_type_description[port_type]; return "UNKNOWN"; } /** * t4_get_port_stats_offset - collect port stats relative to a previous * snapshot * @adap: The adapter * @idx: The port * @stats: Current stats to fill * @offset: Previous stats snapshot */ void t4_get_port_stats_offset(struct adapter *adap, int idx, struct port_stats *stats, struct port_stats *offset) { u64 *s, *o; int i; t4_get_port_stats(adap, idx, stats); for (i = 0, s = (u64 *)stats, o = (u64 *)offset ; i < (sizeof(struct port_stats)/sizeof(u64)) ; i++, s++, o++) *s -= *o; } /** * t4_get_port_stats - collect port statistics * @adap: the adapter * @idx: the port index * @p: the stats structure to fill * * Collect statistics related to the given port from HW. */ void t4_get_port_stats(struct adapter *adap, int idx, struct port_stats *p) { u32 bgmap = t4_get_mps_bg_map(adap, idx); u32 stat_ctl = t4_read_reg(adap, A_MPS_STAT_CTL); #define GET_STAT(name) \ t4_read_reg64(adap, \ (is_t4(adap) ? 
PORT_REG(idx, A_MPS_PORT_STAT_##name##_L) : \ T5_PORT_REG(idx, A_MPS_PORT_STAT_##name##_L))) #define GET_STAT_COM(name) t4_read_reg64(adap, A_MPS_STAT_##name##_L) p->tx_pause = GET_STAT(TX_PORT_PAUSE); p->tx_octets = GET_STAT(TX_PORT_BYTES); p->tx_frames = GET_STAT(TX_PORT_FRAMES); p->tx_bcast_frames = GET_STAT(TX_PORT_BCAST); p->tx_mcast_frames = GET_STAT(TX_PORT_MCAST); p->tx_ucast_frames = GET_STAT(TX_PORT_UCAST); p->tx_error_frames = GET_STAT(TX_PORT_ERROR); p->tx_frames_64 = GET_STAT(TX_PORT_64B); p->tx_frames_65_127 = GET_STAT(TX_PORT_65B_127B); p->tx_frames_128_255 = GET_STAT(TX_PORT_128B_255B); p->tx_frames_256_511 = GET_STAT(TX_PORT_256B_511B); p->tx_frames_512_1023 = GET_STAT(TX_PORT_512B_1023B); p->tx_frames_1024_1518 = GET_STAT(TX_PORT_1024B_1518B); p->tx_frames_1519_max = GET_STAT(TX_PORT_1519B_MAX); p->tx_drop = GET_STAT(TX_PORT_DROP); p->tx_ppp0 = GET_STAT(TX_PORT_PPP0); p->tx_ppp1 = GET_STAT(TX_PORT_PPP1); p->tx_ppp2 = GET_STAT(TX_PORT_PPP2); p->tx_ppp3 = GET_STAT(TX_PORT_PPP3); p->tx_ppp4 = GET_STAT(TX_PORT_PPP4); p->tx_ppp5 = GET_STAT(TX_PORT_PPP5); p->tx_ppp6 = GET_STAT(TX_PORT_PPP6); p->tx_ppp7 = GET_STAT(TX_PORT_PPP7); if (chip_id(adap) >= CHELSIO_T5) { if (stat_ctl & F_COUNTPAUSESTATTX) { p->tx_frames -= p->tx_pause; p->tx_octets -= p->tx_pause * 64; } if (stat_ctl & F_COUNTPAUSEMCTX) p->tx_mcast_frames -= p->tx_pause; } p->rx_pause = GET_STAT(RX_PORT_PAUSE); p->rx_octets = GET_STAT(RX_PORT_BYTES); p->rx_frames = GET_STAT(RX_PORT_FRAMES); p->rx_bcast_frames = GET_STAT(RX_PORT_BCAST); p->rx_mcast_frames = GET_STAT(RX_PORT_MCAST); p->rx_ucast_frames = GET_STAT(RX_PORT_UCAST); p->rx_too_long = GET_STAT(RX_PORT_MTU_ERROR); p->rx_jabber = GET_STAT(RX_PORT_MTU_CRC_ERROR); p->rx_fcs_err = GET_STAT(RX_PORT_CRC_ERROR); p->rx_len_err = GET_STAT(RX_PORT_LEN_ERROR); p->rx_symbol_err = GET_STAT(RX_PORT_SYM_ERROR); p->rx_runt = GET_STAT(RX_PORT_LESS_64B); p->rx_frames_64 = GET_STAT(RX_PORT_64B); p->rx_frames_65_127 = GET_STAT(RX_PORT_65B_127B); p->rx_frames_128_255 = GET_STAT(RX_PORT_128B_255B); p->rx_frames_256_511 = GET_STAT(RX_PORT_256B_511B); p->rx_frames_512_1023 = GET_STAT(RX_PORT_512B_1023B); p->rx_frames_1024_1518 = GET_STAT(RX_PORT_1024B_1518B); p->rx_frames_1519_max = GET_STAT(RX_PORT_1519B_MAX); p->rx_ppp0 = GET_STAT(RX_PORT_PPP0); p->rx_ppp1 = GET_STAT(RX_PORT_PPP1); p->rx_ppp2 = GET_STAT(RX_PORT_PPP2); p->rx_ppp3 = GET_STAT(RX_PORT_PPP3); p->rx_ppp4 = GET_STAT(RX_PORT_PPP4); p->rx_ppp5 = GET_STAT(RX_PORT_PPP5); p->rx_ppp6 = GET_STAT(RX_PORT_PPP6); p->rx_ppp7 = GET_STAT(RX_PORT_PPP7); if (chip_id(adap) >= CHELSIO_T5) { if (stat_ctl & F_COUNTPAUSESTATRX) { p->rx_frames -= p->rx_pause; p->rx_octets -= p->rx_pause * 64; } if (stat_ctl & F_COUNTPAUSEMCRX) p->rx_mcast_frames -= p->rx_pause; } p->rx_ovflow0 = (bgmap & 1) ? GET_STAT_COM(RX_BG_0_MAC_DROP_FRAME) : 0; p->rx_ovflow1 = (bgmap & 2) ? GET_STAT_COM(RX_BG_1_MAC_DROP_FRAME) : 0; p->rx_ovflow2 = (bgmap & 4) ? GET_STAT_COM(RX_BG_2_MAC_DROP_FRAME) : 0; p->rx_ovflow3 = (bgmap & 8) ? GET_STAT_COM(RX_BG_3_MAC_DROP_FRAME) : 0; p->rx_trunc0 = (bgmap & 1) ? GET_STAT_COM(RX_BG_0_MAC_TRUNC_FRAME) : 0; p->rx_trunc1 = (bgmap & 2) ? GET_STAT_COM(RX_BG_1_MAC_TRUNC_FRAME) : 0; p->rx_trunc2 = (bgmap & 4) ? GET_STAT_COM(RX_BG_2_MAC_TRUNC_FRAME) : 0; p->rx_trunc3 = (bgmap & 8) ? 
GET_STAT_COM(RX_BG_3_MAC_TRUNC_FRAME) : 0; #undef GET_STAT #undef GET_STAT_COM } /** * t4_get_lb_stats - collect loopback port statistics * @adap: the adapter * @idx: the loopback port index * @p: the stats structure to fill * * Return HW statistics for the given loopback port. */ void t4_get_lb_stats(struct adapter *adap, int idx, struct lb_port_stats *p) { u32 bgmap = t4_get_mps_bg_map(adap, idx); #define GET_STAT(name) \ t4_read_reg64(adap, \ (is_t4(adap) ? \ PORT_REG(idx, A_MPS_PORT_STAT_LB_PORT_##name##_L) : \ T5_PORT_REG(idx, A_MPS_PORT_STAT_LB_PORT_##name##_L))) #define GET_STAT_COM(name) t4_read_reg64(adap, A_MPS_STAT_##name##_L) p->octets = GET_STAT(BYTES); p->frames = GET_STAT(FRAMES); p->bcast_frames = GET_STAT(BCAST); p->mcast_frames = GET_STAT(MCAST); p->ucast_frames = GET_STAT(UCAST); p->error_frames = GET_STAT(ERROR); p->frames_64 = GET_STAT(64B); p->frames_65_127 = GET_STAT(65B_127B); p->frames_128_255 = GET_STAT(128B_255B); p->frames_256_511 = GET_STAT(256B_511B); p->frames_512_1023 = GET_STAT(512B_1023B); p->frames_1024_1518 = GET_STAT(1024B_1518B); p->frames_1519_max = GET_STAT(1519B_MAX); p->drop = GET_STAT(DROP_FRAMES); p->ovflow0 = (bgmap & 1) ? GET_STAT_COM(RX_BG_0_LB_DROP_FRAME) : 0; p->ovflow1 = (bgmap & 2) ? GET_STAT_COM(RX_BG_1_LB_DROP_FRAME) : 0; p->ovflow2 = (bgmap & 4) ? GET_STAT_COM(RX_BG_2_LB_DROP_FRAME) : 0; p->ovflow3 = (bgmap & 8) ? GET_STAT_COM(RX_BG_3_LB_DROP_FRAME) : 0; p->trunc0 = (bgmap & 1) ? GET_STAT_COM(RX_BG_0_LB_TRUNC_FRAME) : 0; p->trunc1 = (bgmap & 2) ? GET_STAT_COM(RX_BG_1_LB_TRUNC_FRAME) : 0; p->trunc2 = (bgmap & 4) ? GET_STAT_COM(RX_BG_2_LB_TRUNC_FRAME) : 0; p->trunc3 = (bgmap & 8) ? GET_STAT_COM(RX_BG_3_LB_TRUNC_FRAME) : 0; #undef GET_STAT #undef GET_STAT_COM } /** * t4_wol_magic_enable - enable/disable magic packet WoL * @adap: the adapter * @port: the physical port index * @addr: MAC address expected in magic packets, %NULL to disable * * Enables/disables magic packet wake-on-LAN for the selected port. */ void t4_wol_magic_enable(struct adapter *adap, unsigned int port, const u8 *addr) { u32 mag_id_reg_l, mag_id_reg_h, port_cfg_reg; if (is_t4(adap)) { mag_id_reg_l = PORT_REG(port, A_XGMAC_PORT_MAGIC_MACID_LO); mag_id_reg_h = PORT_REG(port, A_XGMAC_PORT_MAGIC_MACID_HI); port_cfg_reg = PORT_REG(port, A_XGMAC_PORT_CFG2); } else { mag_id_reg_l = T5_PORT_REG(port, A_MAC_PORT_MAGIC_MACID_LO); mag_id_reg_h = T5_PORT_REG(port, A_MAC_PORT_MAGIC_MACID_HI); port_cfg_reg = T5_PORT_REG(port, A_MAC_PORT_CFG2); } if (addr) { t4_write_reg(adap, mag_id_reg_l, (addr[2] << 24) | (addr[3] << 16) | (addr[4] << 8) | addr[5]); t4_write_reg(adap, mag_id_reg_h, (addr[0] << 8) | addr[1]); } t4_set_reg_field(adap, port_cfg_reg, F_MAGICEN, V_MAGICEN(addr != NULL)); } /** * t4_wol_pat_enable - enable/disable pattern-based WoL * @adap: the adapter * @port: the physical port index * @map: bitmap of which HW pattern filters to set * @mask0: byte mask for bytes 0-63 of a packet * @mask1: byte mask for bytes 64-127 of a packet * @crc: Ethernet CRC for selected bytes * @enable: enable/disable switch * * Sets the pattern filters indicated in @map to mask out the bytes * specified in @mask0/@mask1 in received packets and compare the CRC of * the resulting packet against @crc. If @enable is %true pattern-based * WoL is enabled, otherwise disabled. 
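 * For example, @map == 0x3 programs pattern filters 0 and 1 and leaves
 * the remaining filters untouched.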
*/ int t4_wol_pat_enable(struct adapter *adap, unsigned int port, unsigned int map, u64 mask0, u64 mask1, unsigned int crc, bool enable) { int i; u32 port_cfg_reg; if (is_t4(adap)) port_cfg_reg = PORT_REG(port, A_XGMAC_PORT_CFG2); else port_cfg_reg = T5_PORT_REG(port, A_MAC_PORT_CFG2); if (!enable) { t4_set_reg_field(adap, port_cfg_reg, F_PATEN, 0); return 0; } if (map > 0xff) return -EINVAL; #define EPIO_REG(name) \ (is_t4(adap) ? PORT_REG(port, A_XGMAC_PORT_EPIO_##name) : \ T5_PORT_REG(port, A_MAC_PORT_EPIO_##name)) t4_write_reg(adap, EPIO_REG(DATA1), mask0 >> 32); t4_write_reg(adap, EPIO_REG(DATA2), mask1); t4_write_reg(adap, EPIO_REG(DATA3), mask1 >> 32); for (i = 0; i < NWOL_PAT; i++, map >>= 1) { if (!(map & 1)) continue; /* write byte masks */ t4_write_reg(adap, EPIO_REG(DATA0), mask0); t4_write_reg(adap, EPIO_REG(OP), V_ADDRESS(i) | F_EPIOWR); t4_read_reg(adap, EPIO_REG(OP)); /* flush */ if (t4_read_reg(adap, EPIO_REG(OP)) & F_BUSY) return -ETIMEDOUT; /* write CRC */ t4_write_reg(adap, EPIO_REG(DATA0), crc); t4_write_reg(adap, EPIO_REG(OP), V_ADDRESS(i + 32) | F_EPIOWR); t4_read_reg(adap, EPIO_REG(OP)); /* flush */ if (t4_read_reg(adap, EPIO_REG(OP)) & F_BUSY) return -ETIMEDOUT; } #undef EPIO_REG t4_set_reg_field(adap, port_cfg_reg, 0, F_PATEN); return 0; } /* t4_mk_filtdelwr - create a delete filter WR * @ftid: the filter ID * @wr: the filter work request to populate * @qid: ingress queue to receive the delete notification * * Creates a filter work request to delete the supplied filter. If @qid is * negative the delete notification is suppressed. */ void t4_mk_filtdelwr(unsigned int ftid, struct fw_filter_wr *wr, int qid) { memset(wr, 0, sizeof(*wr)); wr->op_pkd = cpu_to_be32(V_FW_WR_OP(FW_FILTER_WR)); wr->len16_pkd = cpu_to_be32(V_FW_WR_LEN16(sizeof(*wr) / 16)); wr->tid_to_iq = cpu_to_be32(V_FW_FILTER_WR_TID(ftid) | V_FW_FILTER_WR_NOREPLY(qid < 0)); wr->del_filter_to_l2tix = cpu_to_be32(F_FW_FILTER_WR_DEL_FILTER); if (qid >= 0) wr->rx_chan_rx_rpl_iq = cpu_to_be16(V_FW_FILTER_WR_RX_RPL_IQ(qid)); } #define INIT_CMD(var, cmd, rd_wr) do { \ (var).op_to_write = cpu_to_be32(V_FW_CMD_OP(FW_##cmd##_CMD) | \ F_FW_CMD_REQUEST | \ F_FW_CMD_##rd_wr); \ (var).retval_len16 = cpu_to_be32(FW_LEN16(var)); \ } while (0) int t4_fwaddrspace_write(struct adapter *adap, unsigned int mbox, u32 addr, u32 val) { u32 ldst_addrspace; struct fw_ldst_cmd c; memset(&c, 0, sizeof(c)); ldst_addrspace = V_FW_LDST_CMD_ADDRSPACE(FW_LDST_ADDRSPC_FIRMWARE); c.op_to_addrspace = cpu_to_be32(V_FW_CMD_OP(FW_LDST_CMD) | F_FW_CMD_REQUEST | F_FW_CMD_WRITE | ldst_addrspace); c.cycles_to_len16 = cpu_to_be32(FW_LEN16(c)); c.u.addrval.addr = cpu_to_be32(addr); c.u.addrval.val = cpu_to_be32(val); return t4_wr_mbox(adap, mbox, &c, sizeof(c), NULL); } /** * t4_mdio_rd - read a PHY register through MDIO * @adap: the adapter * @mbox: mailbox to use for the FW command * @phy_addr: the PHY address * @mmd: the PHY MMD to access (0 for clause 22 PHYs) * @reg: the register to read * @valp: where to store the value * * Issues a FW command through the given mailbox to read a PHY register. 
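 * On success the 16-bit register value read is stored in @valp.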
*/ int t4_mdio_rd(struct adapter *adap, unsigned int mbox, unsigned int phy_addr, unsigned int mmd, unsigned int reg, unsigned int *valp) { int ret; u32 ldst_addrspace; struct fw_ldst_cmd c; memset(&c, 0, sizeof(c)); ldst_addrspace = V_FW_LDST_CMD_ADDRSPACE(FW_LDST_ADDRSPC_MDIO); c.op_to_addrspace = cpu_to_be32(V_FW_CMD_OP(FW_LDST_CMD) | F_FW_CMD_REQUEST | F_FW_CMD_READ | ldst_addrspace); c.cycles_to_len16 = cpu_to_be32(FW_LEN16(c)); c.u.mdio.paddr_mmd = cpu_to_be16(V_FW_LDST_CMD_PADDR(phy_addr) | V_FW_LDST_CMD_MMD(mmd)); c.u.mdio.raddr = cpu_to_be16(reg); ret = t4_wr_mbox(adap, mbox, &c, sizeof(c), &c); if (ret == 0) *valp = be16_to_cpu(c.u.mdio.rval); return ret; } /** * t4_mdio_wr - write a PHY register through MDIO * @adap: the adapter * @mbox: mailbox to use for the FW command * @phy_addr: the PHY address * @mmd: the PHY MMD to access (0 for clause 22 PHYs) * @reg: the register to write * @valp: value to write * * Issues a FW command through the given mailbox to write a PHY register. */ int t4_mdio_wr(struct adapter *adap, unsigned int mbox, unsigned int phy_addr, unsigned int mmd, unsigned int reg, unsigned int val) { u32 ldst_addrspace; struct fw_ldst_cmd c; memset(&c, 0, sizeof(c)); ldst_addrspace = V_FW_LDST_CMD_ADDRSPACE(FW_LDST_ADDRSPC_MDIO); c.op_to_addrspace = cpu_to_be32(V_FW_CMD_OP(FW_LDST_CMD) | F_FW_CMD_REQUEST | F_FW_CMD_WRITE | ldst_addrspace); c.cycles_to_len16 = cpu_to_be32(FW_LEN16(c)); c.u.mdio.paddr_mmd = cpu_to_be16(V_FW_LDST_CMD_PADDR(phy_addr) | V_FW_LDST_CMD_MMD(mmd)); c.u.mdio.raddr = cpu_to_be16(reg); c.u.mdio.rval = cpu_to_be16(val); return t4_wr_mbox(adap, mbox, &c, sizeof(c), NULL); } /** * * t4_sge_decode_idma_state - decode the idma state * @adap: the adapter * @state: the state idma is stuck in */ void t4_sge_decode_idma_state(struct adapter *adapter, int state) { static const char * const t4_decode[] = { "IDMA_IDLE", "IDMA_PUSH_MORE_CPL_FIFO", "IDMA_PUSH_CPL_MSG_HEADER_TO_FIFO", "Not used", "IDMA_PHYSADDR_SEND_PCIEHDR", "IDMA_PHYSADDR_SEND_PAYLOAD_FIRST", "IDMA_PHYSADDR_SEND_PAYLOAD", "IDMA_SEND_FIFO_TO_IMSG", "IDMA_FL_REQ_DATA_FL_PREP", "IDMA_FL_REQ_DATA_FL", "IDMA_FL_DROP", "IDMA_FL_H_REQ_HEADER_FL", "IDMA_FL_H_SEND_PCIEHDR", "IDMA_FL_H_PUSH_CPL_FIFO", "IDMA_FL_H_SEND_CPL", "IDMA_FL_H_SEND_IP_HDR_FIRST", "IDMA_FL_H_SEND_IP_HDR", "IDMA_FL_H_REQ_NEXT_HEADER_FL", "IDMA_FL_H_SEND_NEXT_PCIEHDR", "IDMA_FL_H_SEND_IP_HDR_PADDING", "IDMA_FL_D_SEND_PCIEHDR", "IDMA_FL_D_SEND_CPL_AND_IP_HDR", "IDMA_FL_D_REQ_NEXT_DATA_FL", "IDMA_FL_SEND_PCIEHDR", "IDMA_FL_PUSH_CPL_FIFO", "IDMA_FL_SEND_CPL", "IDMA_FL_SEND_PAYLOAD_FIRST", "IDMA_FL_SEND_PAYLOAD", "IDMA_FL_REQ_NEXT_DATA_FL", "IDMA_FL_SEND_NEXT_PCIEHDR", "IDMA_FL_SEND_PADDING", "IDMA_FL_SEND_COMPLETION_TO_IMSG", "IDMA_FL_SEND_FIFO_TO_IMSG", "IDMA_FL_REQ_DATAFL_DONE", "IDMA_FL_REQ_HEADERFL_DONE", }; static const char * const t5_decode[] = { "IDMA_IDLE", "IDMA_ALMOST_IDLE", "IDMA_PUSH_MORE_CPL_FIFO", "IDMA_PUSH_CPL_MSG_HEADER_TO_FIFO", "IDMA_SGEFLRFLUSH_SEND_PCIEHDR", "IDMA_PHYSADDR_SEND_PCIEHDR", "IDMA_PHYSADDR_SEND_PAYLOAD_FIRST", "IDMA_PHYSADDR_SEND_PAYLOAD", "IDMA_SEND_FIFO_TO_IMSG", "IDMA_FL_REQ_DATA_FL", "IDMA_FL_DROP", "IDMA_FL_DROP_SEND_INC", "IDMA_FL_H_REQ_HEADER_FL", "IDMA_FL_H_SEND_PCIEHDR", "IDMA_FL_H_PUSH_CPL_FIFO", "IDMA_FL_H_SEND_CPL", "IDMA_FL_H_SEND_IP_HDR_FIRST", "IDMA_FL_H_SEND_IP_HDR", "IDMA_FL_H_REQ_NEXT_HEADER_FL", "IDMA_FL_H_SEND_NEXT_PCIEHDR", "IDMA_FL_H_SEND_IP_HDR_PADDING", "IDMA_FL_D_SEND_PCIEHDR", "IDMA_FL_D_SEND_CPL_AND_IP_HDR", "IDMA_FL_D_REQ_NEXT_DATA_FL", "IDMA_FL_SEND_PCIEHDR", 
"IDMA_FL_PUSH_CPL_FIFO", "IDMA_FL_SEND_CPL", "IDMA_FL_SEND_PAYLOAD_FIRST", "IDMA_FL_SEND_PAYLOAD", "IDMA_FL_REQ_NEXT_DATA_FL", "IDMA_FL_SEND_NEXT_PCIEHDR", "IDMA_FL_SEND_PADDING", "IDMA_FL_SEND_COMPLETION_TO_IMSG", }; static const char * const t6_decode[] = { "IDMA_IDLE", "IDMA_PUSH_MORE_CPL_FIFO", "IDMA_PUSH_CPL_MSG_HEADER_TO_FIFO", "IDMA_SGEFLRFLUSH_SEND_PCIEHDR", "IDMA_PHYSADDR_SEND_PCIEHDR", "IDMA_PHYSADDR_SEND_PAYLOAD_FIRST", "IDMA_PHYSADDR_SEND_PAYLOAD", "IDMA_FL_REQ_DATA_FL", "IDMA_FL_DROP", "IDMA_FL_DROP_SEND_INC", "IDMA_FL_H_REQ_HEADER_FL", "IDMA_FL_H_SEND_PCIEHDR", "IDMA_FL_H_PUSH_CPL_FIFO", "IDMA_FL_H_SEND_CPL", "IDMA_FL_H_SEND_IP_HDR_FIRST", "IDMA_FL_H_SEND_IP_HDR", "IDMA_FL_H_REQ_NEXT_HEADER_FL", "IDMA_FL_H_SEND_NEXT_PCIEHDR", "IDMA_FL_H_SEND_IP_HDR_PADDING", "IDMA_FL_D_SEND_PCIEHDR", "IDMA_FL_D_SEND_CPL_AND_IP_HDR", "IDMA_FL_D_REQ_NEXT_DATA_FL", "IDMA_FL_SEND_PCIEHDR", "IDMA_FL_PUSH_CPL_FIFO", "IDMA_FL_SEND_CPL", "IDMA_FL_SEND_PAYLOAD_FIRST", "IDMA_FL_SEND_PAYLOAD", "IDMA_FL_REQ_NEXT_DATA_FL", "IDMA_FL_SEND_NEXT_PCIEHDR", "IDMA_FL_SEND_PADDING", "IDMA_FL_SEND_COMPLETION_TO_IMSG", }; static const u32 sge_regs[] = { A_SGE_DEBUG_DATA_LOW_INDEX_2, A_SGE_DEBUG_DATA_LOW_INDEX_3, A_SGE_DEBUG_DATA_HIGH_INDEX_10, }; const char * const *sge_idma_decode; int sge_idma_decode_nstates; int i; unsigned int chip_version = chip_id(adapter); /* Select the right set of decode strings to dump depending on the * adapter chip type. */ switch (chip_version) { case CHELSIO_T4: sge_idma_decode = (const char * const *)t4_decode; sge_idma_decode_nstates = ARRAY_SIZE(t4_decode); break; case CHELSIO_T5: sge_idma_decode = (const char * const *)t5_decode; sge_idma_decode_nstates = ARRAY_SIZE(t5_decode); break; case CHELSIO_T6: sge_idma_decode = (const char * const *)t6_decode; sge_idma_decode_nstates = ARRAY_SIZE(t6_decode); break; default: CH_ERR(adapter, "Unsupported chip version %d\n", chip_version); return; } if (state < sge_idma_decode_nstates) CH_WARN(adapter, "idma state %s\n", sge_idma_decode[state]); else CH_WARN(adapter, "idma state %d unknown\n", state); for (i = 0; i < ARRAY_SIZE(sge_regs); i++) CH_WARN(adapter, "SGE register %#x value %#x\n", sge_regs[i], t4_read_reg(adapter, sge_regs[i])); } /** * t4_sge_ctxt_flush - flush the SGE context cache * @adap: the adapter * @mbox: mailbox to use for the FW command * * Issues a FW command through the given mailbox to flush the * SGE context cache. */ int t4_sge_ctxt_flush(struct adapter *adap, unsigned int mbox) { int ret; u32 ldst_addrspace; struct fw_ldst_cmd c; memset(&c, 0, sizeof(c)); ldst_addrspace = V_FW_LDST_CMD_ADDRSPACE(FW_LDST_ADDRSPC_SGE_EGRC); c.op_to_addrspace = cpu_to_be32(V_FW_CMD_OP(FW_LDST_CMD) | F_FW_CMD_REQUEST | F_FW_CMD_READ | ldst_addrspace); c.cycles_to_len16 = cpu_to_be32(FW_LEN16(c)); c.u.idctxt.msg_ctxtflush = cpu_to_be32(F_FW_LDST_CMD_CTXTFLUSH); ret = t4_wr_mbox(adap, mbox, &c, sizeof(c), &c); return ret; } /** * t4_fw_hello - establish communication with FW * @adap: the adapter * @mbox: mailbox to use for the FW command * @evt_mbox: mailbox to receive async FW events * @master: specifies the caller's willingness to be the device master * @state: returns the current device state (if non-NULL) * * Issues a command to establish communication with FW. Returns either * an error (negative integer) or the mailbox of the Master PF. 
*/ int t4_fw_hello(struct adapter *adap, unsigned int mbox, unsigned int evt_mbox, enum dev_master master, enum dev_state *state) { int ret; struct fw_hello_cmd c; u32 v; unsigned int master_mbox; int retries = FW_CMD_HELLO_RETRIES; retry: memset(&c, 0, sizeof(c)); INIT_CMD(c, HELLO, WRITE); c.err_to_clearinit = cpu_to_be32( V_FW_HELLO_CMD_MASTERDIS(master == MASTER_CANT) | V_FW_HELLO_CMD_MASTERFORCE(master == MASTER_MUST) | V_FW_HELLO_CMD_MBMASTER(master == MASTER_MUST ? mbox : M_FW_HELLO_CMD_MBMASTER) | V_FW_HELLO_CMD_MBASYNCNOT(evt_mbox) | V_FW_HELLO_CMD_STAGE(FW_HELLO_CMD_STAGE_OS) | F_FW_HELLO_CMD_CLEARINIT); /* * Issue the HELLO command to the firmware. If it's not successful * but indicates that we got a "busy" or "timeout" condition, retry * the HELLO until we exhaust our retry limit. If we do exceed our * retry limit, check to see if the firmware left us any error * information and report that if so ... */ ret = t4_wr_mbox(adap, mbox, &c, sizeof(c), &c); if (ret != FW_SUCCESS) { if ((ret == -EBUSY || ret == -ETIMEDOUT) && retries-- > 0) goto retry; if (t4_read_reg(adap, A_PCIE_FW) & F_PCIE_FW_ERR) t4_report_fw_error(adap); return ret; } v = be32_to_cpu(c.err_to_clearinit); master_mbox = G_FW_HELLO_CMD_MBMASTER(v); if (state) { if (v & F_FW_HELLO_CMD_ERR) *state = DEV_STATE_ERR; else if (v & F_FW_HELLO_CMD_INIT) *state = DEV_STATE_INIT; else *state = DEV_STATE_UNINIT; } /* * If we're not the Master PF then we need to wait around for the * Master PF Driver to finish setting up the adapter. * * Note that we also do this wait if we're a non-Master-capable PF and * there is no current Master PF; a Master PF may show up momentarily * and we wouldn't want to fail pointlessly. (This can happen when an * OS loads lots of different drivers rapidly at the same time). In * this case, the Master PF returned by the firmware will be * M_PCIE_FW_MASTER so the test below will work ... */ if ((v & (F_FW_HELLO_CMD_ERR|F_FW_HELLO_CMD_INIT)) == 0 && master_mbox != mbox) { int waiting = FW_CMD_HELLO_TIMEOUT; /* * Wait for the firmware to either indicate an error or * initialized state. If we see either of these we bail out * and report the issue to the caller. If we exhaust the * "hello timeout" and we haven't exhausted our retries, try * again. Otherwise bail with a timeout error. */ for (;;) { u32 pcie_fw; msleep(50); waiting -= 50; /* * If neither Error nor Initialialized are indicated * by the firmware keep waiting till we exhaust our * timeout ... and then retry if we haven't exhausted * our retries ... */ pcie_fw = t4_read_reg(adap, A_PCIE_FW); if (!(pcie_fw & (F_PCIE_FW_ERR|F_PCIE_FW_INIT))) { if (waiting <= 0) { if (retries-- > 0) goto retry; return -ETIMEDOUT; } continue; } /* * We either have an Error or Initialized condition * report errors preferentially. */ if (state) { if (pcie_fw & F_PCIE_FW_ERR) *state = DEV_STATE_ERR; else if (pcie_fw & F_PCIE_FW_INIT) *state = DEV_STATE_INIT; } /* * If we arrived before a Master PF was selected and * there's not a valid Master PF, grab its identity * for our caller. */ if (master_mbox == M_PCIE_FW_MASTER && (pcie_fw & F_PCIE_FW_MASTER_VLD)) master_mbox = G_PCIE_FW_MASTER(pcie_fw); break; } } return master_mbox; } /** * t4_fw_bye - end communication with FW * @adap: the adapter * @mbox: mailbox to use for the FW command * * Issues a command to terminate communication with FW. 
*/ int t4_fw_bye(struct adapter *adap, unsigned int mbox) { struct fw_bye_cmd c; memset(&c, 0, sizeof(c)); INIT_CMD(c, BYE, WRITE); return t4_wr_mbox(adap, mbox, &c, sizeof(c), NULL); } /** * t4_fw_reset - issue a reset to FW * @adap: the adapter * @mbox: mailbox to use for the FW command * @reset: specifies the type of reset to perform * * Issues a reset command of the specified type to FW. */ int t4_fw_reset(struct adapter *adap, unsigned int mbox, int reset) { struct fw_reset_cmd c; memset(&c, 0, sizeof(c)); INIT_CMD(c, RESET, WRITE); c.val = cpu_to_be32(reset); return t4_wr_mbox(adap, mbox, &c, sizeof(c), NULL); } /** * t4_fw_halt - issue a reset/halt to FW and put uP into RESET * @adap: the adapter * @mbox: mailbox to use for the FW RESET command (if desired) * @force: force uP into RESET even if FW RESET command fails * * Issues a RESET command to firmware (if desired) with a HALT indication * and then puts the microprocessor into RESET state. The RESET command * will only be issued if a legitimate mailbox is provided (mbox <= * M_PCIE_FW_MASTER). * * This is generally used in order for the host to safely manipulate the * adapter without fear of conflicting with whatever the firmware might * be doing. The only way out of this state is to RESTART the firmware * ... */ int t4_fw_halt(struct adapter *adap, unsigned int mbox, int force) { int ret = 0; /* * If a legitimate mailbox is provided, issue a RESET command * with a HALT indication. */ if (mbox <= M_PCIE_FW_MASTER) { struct fw_reset_cmd c; memset(&c, 0, sizeof(c)); INIT_CMD(c, RESET, WRITE); c.val = cpu_to_be32(F_PIORST | F_PIORSTMODE); c.halt_pkd = cpu_to_be32(F_FW_RESET_CMD_HALT); ret = t4_wr_mbox(adap, mbox, &c, sizeof(c), NULL); } /* * Normally we won't complete the operation if the firmware RESET * command fails but if our caller insists we'll go ahead and put the * uP into RESET. This can be useful if the firmware is hung or even * missing ... We'll have to take the risk of putting the uP into * RESET without the cooperation of firmware in that case. * * We also force the firmware's HALT flag to be on in case we bypassed * the firmware RESET command above or we're dealing with old firmware * which doesn't have the HALT capability. This will serve as a flag * for the incoming firmware to know that it's coming out of a HALT * rather than a RESET ... if it's new enough to understand that ... */ if (ret == 0 || force) { t4_set_reg_field(adap, A_CIM_BOOT_CFG, F_UPCRST, F_UPCRST); t4_set_reg_field(adap, A_PCIE_FW, F_PCIE_FW_HALT, F_PCIE_FW_HALT); } /* * And we always return the result of the firmware RESET command * even when we force the uP into RESET ... */ return ret; } /** * t4_fw_restart - restart the firmware by taking the uP out of RESET * @adap: the adapter * @reset: if we want to do a RESET to restart things * * Restart firmware previously halted by t4_fw_halt(). On successful * return the previous PF Master remains as the new PF Master and there * is no need to issue a new HELLO command, etc. * * We do this in two ways: * * 1. If we're dealing with newer firmware we'll simply want to take * the chip's microprocessor out of RESET. This will cause the * firmware to start up from its start vector. And then we'll loop * until the firmware indicates it's started again (PCIE_FW.HALT * reset to 0) or we timeout. * * 2. If we're dealing with older firmware then we'll need to RESET * the chip since older firmware won't recognize the PCIE_FW.HALT * flag and automatically RESET itself on startup. 
*/ int t4_fw_restart(struct adapter *adap, unsigned int mbox, int reset) { if (reset) { /* * Since we're directing the RESET instead of the firmware * doing it automatically, we need to clear the PCIE_FW.HALT * bit. */ t4_set_reg_field(adap, A_PCIE_FW, F_PCIE_FW_HALT, 0); /* * If we've been given a valid mailbox, first try to get the * firmware to do the RESET. If that works, great and we can * return success. Otherwise, if we haven't been given a * valid mailbox or the RESET command failed, fall back to * hitting the chip with a hammer. */ if (mbox <= M_PCIE_FW_MASTER) { t4_set_reg_field(adap, A_CIM_BOOT_CFG, F_UPCRST, 0); msleep(100); if (t4_fw_reset(adap, mbox, F_PIORST | F_PIORSTMODE) == 0) return 0; } t4_write_reg(adap, A_PL_RST, F_PIORST | F_PIORSTMODE); msleep(2000); } else { int ms; t4_set_reg_field(adap, A_CIM_BOOT_CFG, F_UPCRST, 0); for (ms = 0; ms < FW_CMD_MAX_TIMEOUT; ) { if (!(t4_read_reg(adap, A_PCIE_FW) & F_PCIE_FW_HALT)) return FW_SUCCESS; msleep(100); ms += 100; } return -ETIMEDOUT; } return 0; } /** * t4_fw_upgrade - perform all of the steps necessary to upgrade FW * @adap: the adapter * @mbox: mailbox to use for the FW RESET command (if desired) * @fw_data: the firmware image to write * @size: image size * @force: force upgrade even if firmware doesn't cooperate * * Perform all of the steps necessary for upgrading an adapter's * firmware image. Normally this requires the cooperation of the * existing firmware in order to halt all existing activities * but if an invalid mailbox token is passed in we skip that step * (though we'll still put the adapter microprocessor into RESET in * that case). * * On successful return the new firmware will have been loaded and * the adapter will have been fully RESET losing all previous setup * state. On unsuccessful return the adapter may be completely hosed ... * positive errno indicates that the adapter is ~probably~ intact, a * negative errno indicates that things are looking bad ... */ int t4_fw_upgrade(struct adapter *adap, unsigned int mbox, const u8 *fw_data, unsigned int size, int force) { const struct fw_hdr *fw_hdr = (const struct fw_hdr *)fw_data; unsigned int bootstrap = be32_to_cpu(fw_hdr->magic) == FW_HDR_MAGIC_BOOTSTRAP; int reset, ret; if (!t4_fw_matches_chip(adap, fw_hdr)) return -EINVAL; if (!bootstrap) { ret = t4_fw_halt(adap, mbox, force); if (ret < 0 && !force) return ret; } ret = t4_load_fw(adap, fw_data, size); if (ret < 0 || bootstrap) return ret; /* * Older versions of the firmware don't understand the new * PCIE_FW.HALT flag and so won't know to perform a RESET when they * restart. So for newly loaded older firmware we'll have to do the * RESET for it so it starts up on a clean slate. We can tell if * the newly loaded firmware will handle this right by checking * its header flags to see if it advertises the capability. */ reset = ((be32_to_cpu(fw_hdr->flags) & FW_HDR_FLAGS_RESET_HALT) == 0); return t4_fw_restart(adap, mbox, reset); } /* * Card doesn't have a firmware, install one. 
*/ int t4_fw_forceinstall(struct adapter *adap, const u8 *fw_data, unsigned int size) { const struct fw_hdr *fw_hdr = (const struct fw_hdr *)fw_data; unsigned int bootstrap = be32_to_cpu(fw_hdr->magic) == FW_HDR_MAGIC_BOOTSTRAP; int ret; if (!t4_fw_matches_chip(adap, fw_hdr) || bootstrap) return -EINVAL; t4_set_reg_field(adap, A_CIM_BOOT_CFG, F_UPCRST, F_UPCRST); t4_write_reg(adap, A_PCIE_FW, 0); /* Clobber internal state */ ret = t4_load_fw(adap, fw_data, size); if (ret < 0) return ret; t4_write_reg(adap, A_PL_RST, F_PIORST | F_PIORSTMODE); msleep(1000); return (0); } /** * t4_fw_initialize - ask FW to initialize the device * @adap: the adapter * @mbox: mailbox to use for the FW command * * Issues a command to FW to partially initialize the device. This * performs initialization that generally doesn't depend on user input. */ int t4_fw_initialize(struct adapter *adap, unsigned int mbox) { struct fw_initialize_cmd c; memset(&c, 0, sizeof(c)); INIT_CMD(c, INITIALIZE, WRITE); return t4_wr_mbox(adap, mbox, &c, sizeof(c), NULL); } /** * t4_query_params_rw - query FW or device parameters * @adap: the adapter * @mbox: mailbox to use for the FW command * @pf: the PF * @vf: the VF * @nparams: the number of parameters * @params: the parameter names * @val: the parameter values * @rw: Write and read flag * * Reads the value of FW or device parameters. Up to 7 parameters can be * queried at once. */ int t4_query_params_rw(struct adapter *adap, unsigned int mbox, unsigned int pf, unsigned int vf, unsigned int nparams, const u32 *params, u32 *val, int rw) { int i, ret; struct fw_params_cmd c; __be32 *p = &c.param[0].mnem; if (nparams > 7) return -EINVAL; memset(&c, 0, sizeof(c)); c.op_to_vfn = cpu_to_be32(V_FW_CMD_OP(FW_PARAMS_CMD) | F_FW_CMD_REQUEST | F_FW_CMD_READ | V_FW_PARAMS_CMD_PFN(pf) | V_FW_PARAMS_CMD_VFN(vf)); c.retval_len16 = cpu_to_be32(FW_LEN16(c)); for (i = 0; i < nparams; i++) { *p++ = cpu_to_be32(*params++); if (rw) *p = cpu_to_be32(*(val + i)); p++; } ret = t4_wr_mbox(adap, mbox, &c, sizeof(c), &c); if (ret == 0) for (i = 0, p = &c.param[0].val; i < nparams; i++, p += 2) *val++ = be32_to_cpu(*p); return ret; } int t4_query_params(struct adapter *adap, unsigned int mbox, unsigned int pf, unsigned int vf, unsigned int nparams, const u32 *params, u32 *val) { return t4_query_params_rw(adap, mbox, pf, vf, nparams, params, val, 0); } /** * t4_set_params_timeout - sets FW or device parameters * @adap: the adapter * @mbox: mailbox to use for the FW command * @pf: the PF * @vf: the VF * @nparams: the number of parameters * @params: the parameter names * @val: the parameter values * @timeout: the timeout time * * Sets the value of FW or device parameters. Up to 7 parameters can be * specified at once. 
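 *
 * Shape of a typical call; p0/p1 and v0/v1 below are placeholders, not real
 * parameter encodings:
 *
 *	u32 params[2] = { p0, p1 };	(each built from the FW_PARAMS_MNEM /
 *					 FW_PARAMS_PARAM_X / _YZ fields)
 *	u32 vals[2] = { v0, v1 };	(vals[i] is written to params[i])
 *
 *	ret = t4_set_params_timeout(adap, mbox, pf, vf, 2, params, vals,
 *	    FW_CMD_MAX_TIMEOUT);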
*/ int t4_set_params_timeout(struct adapter *adap, unsigned int mbox, unsigned int pf, unsigned int vf, unsigned int nparams, const u32 *params, const u32 *val, int timeout) { struct fw_params_cmd c; __be32 *p = &c.param[0].mnem; if (nparams > 7) return -EINVAL; memset(&c, 0, sizeof(c)); c.op_to_vfn = cpu_to_be32(V_FW_CMD_OP(FW_PARAMS_CMD) | F_FW_CMD_REQUEST | F_FW_CMD_WRITE | V_FW_PARAMS_CMD_PFN(pf) | V_FW_PARAMS_CMD_VFN(vf)); c.retval_len16 = cpu_to_be32(FW_LEN16(c)); while (nparams--) { *p++ = cpu_to_be32(*params++); *p++ = cpu_to_be32(*val++); } return t4_wr_mbox_timeout(adap, mbox, &c, sizeof(c), NULL, timeout); } /** * t4_set_params - sets FW or device parameters * @adap: the adapter * @mbox: mailbox to use for the FW command * @pf: the PF * @vf: the VF * @nparams: the number of parameters * @params: the parameter names * @val: the parameter values * * Sets the value of FW or device parameters. Up to 7 parameters can be * specified at once. */ int t4_set_params(struct adapter *adap, unsigned int mbox, unsigned int pf, unsigned int vf, unsigned int nparams, const u32 *params, const u32 *val) { return t4_set_params_timeout(adap, mbox, pf, vf, nparams, params, val, FW_CMD_MAX_TIMEOUT); } /** * t4_cfg_pfvf - configure PF/VF resource limits * @adap: the adapter * @mbox: mailbox to use for the FW command * @pf: the PF being configured * @vf: the VF being configured * @txq: the max number of egress queues * @txq_eth_ctrl: the max number of egress Ethernet or control queues * @rxqi: the max number of interrupt-capable ingress queues * @rxq: the max number of interruptless ingress queues * @tc: the PCI traffic class * @vi: the max number of virtual interfaces * @cmask: the channel access rights mask for the PF/VF * @pmask: the port access rights mask for the PF/VF * @nexact: the maximum number of exact MPS filters * @rcaps: read capabilities * @wxcaps: write/execute capabilities * * Configures resource limits and capabilities for a physical or virtual * function. 
*/ int t4_cfg_pfvf(struct adapter *adap, unsigned int mbox, unsigned int pf, unsigned int vf, unsigned int txq, unsigned int txq_eth_ctrl, unsigned int rxqi, unsigned int rxq, unsigned int tc, unsigned int vi, unsigned int cmask, unsigned int pmask, unsigned int nexact, unsigned int rcaps, unsigned int wxcaps) { struct fw_pfvf_cmd c; memset(&c, 0, sizeof(c)); c.op_to_vfn = cpu_to_be32(V_FW_CMD_OP(FW_PFVF_CMD) | F_FW_CMD_REQUEST | F_FW_CMD_WRITE | V_FW_PFVF_CMD_PFN(pf) | V_FW_PFVF_CMD_VFN(vf)); c.retval_len16 = cpu_to_be32(FW_LEN16(c)); c.niqflint_niq = cpu_to_be32(V_FW_PFVF_CMD_NIQFLINT(rxqi) | V_FW_PFVF_CMD_NIQ(rxq)); c.type_to_neq = cpu_to_be32(V_FW_PFVF_CMD_CMASK(cmask) | V_FW_PFVF_CMD_PMASK(pmask) | V_FW_PFVF_CMD_NEQ(txq)); c.tc_to_nexactf = cpu_to_be32(V_FW_PFVF_CMD_TC(tc) | V_FW_PFVF_CMD_NVI(vi) | V_FW_PFVF_CMD_NEXACTF(nexact)); c.r_caps_to_nethctrl = cpu_to_be32(V_FW_PFVF_CMD_R_CAPS(rcaps) | V_FW_PFVF_CMD_WX_CAPS(wxcaps) | V_FW_PFVF_CMD_NETHCTRL(txq_eth_ctrl)); return t4_wr_mbox(adap, mbox, &c, sizeof(c), NULL); } /** * t4_alloc_vi_func - allocate a virtual interface * @adap: the adapter * @mbox: mailbox to use for the FW command * @port: physical port associated with the VI * @pf: the PF owning the VI * @vf: the VF owning the VI * @nmac: number of MAC addresses needed (1 to 5) * @mac: the MAC addresses of the VI * @rss_size: size of RSS table slice associated with this VI * @portfunc: which Port Application Function MAC Address is desired * @idstype: Intrusion Detection Type * * Allocates a virtual interface for the given physical port. If @mac is * not %NULL it contains the MAC addresses of the VI as assigned by FW. * If @rss_size is %NULL the VI is not assigned any RSS slice by FW. * @mac should be large enough to hold @nmac Ethernet addresses, they are * stored consecutively so the space needed is @nmac * 6 bytes. * Returns a negative error number or the non-negative VI id. 
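 *
 * For example, a caller asking for three addresses provides at least
 * 3 * 6 = 18 bytes of @mac storage (sketch only, error handling elided):
 *
 *	u8 macs[3 * 6];
 *	u16 rss_size;
 *	int viid;
 *
 *	viid = t4_alloc_vi_func(adap, mbox, port, pf, vf, 3, macs, &rss_size,
 *	    FW_VI_FUNC_ETH, 0);
 *	(macs[0..5], macs[6..11] and macs[12..17] then hold the addresses;
 *	 a negative viid indicates an error)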
 */
int t4_alloc_vi_func(struct adapter *adap, unsigned int mbox,
		     unsigned int port, unsigned int pf, unsigned int vf,
		     unsigned int nmac, u8 *mac, u16 *rss_size,
		     unsigned int portfunc, unsigned int idstype)
{
	int ret;
	struct fw_vi_cmd c;

	memset(&c, 0, sizeof(c));
	c.op_to_vfn = cpu_to_be32(V_FW_CMD_OP(FW_VI_CMD) | F_FW_CMD_REQUEST |
				  F_FW_CMD_WRITE | F_FW_CMD_EXEC |
				  V_FW_VI_CMD_PFN(pf) | V_FW_VI_CMD_VFN(vf));
	c.alloc_to_len16 = cpu_to_be32(F_FW_VI_CMD_ALLOC | FW_LEN16(c));
	c.type_to_viid = cpu_to_be16(V_FW_VI_CMD_TYPE(idstype) |
				     V_FW_VI_CMD_FUNC(portfunc));
	c.portid_pkd = V_FW_VI_CMD_PORTID(port);
	c.nmac = nmac - 1;
	if (!rss_size)
		c.norss_rsssize = F_FW_VI_CMD_NORSS;

	ret = t4_wr_mbox(adap, mbox, &c, sizeof(c), &c);
	if (ret)
		return ret;

	if (mac) {
		memcpy(mac, c.mac, sizeof(c.mac));
		switch (nmac) {
		case 5:
			memcpy(mac + 24, c.nmac3, sizeof(c.nmac3));
		case 4:
			memcpy(mac + 18, c.nmac2, sizeof(c.nmac2));
		case 3:
			memcpy(mac + 12, c.nmac1, sizeof(c.nmac1));
		case 2:
			memcpy(mac + 6, c.nmac0, sizeof(c.nmac0));
		}
	}
	if (rss_size)
		*rss_size = G_FW_VI_CMD_RSSSIZE(be16_to_cpu(c.norss_rsssize));
	return G_FW_VI_CMD_VIID(be16_to_cpu(c.type_to_viid));
}

/**
 * t4_alloc_vi - allocate an [Ethernet Function] virtual interface
 * @adap: the adapter
 * @mbox: mailbox to use for the FW command
 * @port: physical port associated with the VI
 * @pf: the PF owning the VI
 * @vf: the VF owning the VI
 * @nmac: number of MAC addresses needed (1 to 5)
 * @mac: the MAC addresses of the VI
 * @rss_size: size of RSS table slice associated with this VI
 *
 * Backwards-compatible convenience routine to allocate a Virtual
 * Interface with an Ethernet Port Application Function and Intrusion
 * Detection System disabled.
 */
int t4_alloc_vi(struct adapter *adap, unsigned int mbox, unsigned int port,
		unsigned int pf, unsigned int vf, unsigned int nmac, u8 *mac,
		u16 *rss_size)
{
	return t4_alloc_vi_func(adap, mbox, port, pf, vf, nmac, mac, rss_size,
				FW_VI_FUNC_ETH, 0);
}

/**
 * t4_free_vi - free a virtual interface
 * @adap: the adapter
 * @mbox: mailbox to use for the FW command
 * @pf: the PF owning the VI
 * @vf: the VF owning the VI
 * @viid: virtual interface identifier
 *
 * Free a previously allocated virtual interface.
 */
int t4_free_vi(struct adapter *adap, unsigned int mbox, unsigned int pf,
	       unsigned int vf, unsigned int viid)
{
	struct fw_vi_cmd c;

	memset(&c, 0, sizeof(c));
	c.op_to_vfn = cpu_to_be32(V_FW_CMD_OP(FW_VI_CMD) | F_FW_CMD_REQUEST |
				  F_FW_CMD_EXEC | V_FW_VI_CMD_PFN(pf) |
				  V_FW_VI_CMD_VFN(vf));
	c.alloc_to_len16 = cpu_to_be32(F_FW_VI_CMD_FREE | FW_LEN16(c));
	c.type_to_viid = cpu_to_be16(V_FW_VI_CMD_VIID(viid));

	return t4_wr_mbox(adap, mbox, &c, sizeof(c), &c);
}

/**
 * t4_set_rxmode - set Rx properties of a virtual interface
 * @adap: the adapter
 * @mbox: mailbox to use for the FW command
 * @viid: the VI id
 * @mtu: the new MTU or -1
 * @promisc: 1 to enable promiscuous mode, 0 to disable it, -1 no change
 * @all_multi: 1 to enable all-multi mode, 0 to disable it, -1 no change
 * @bcast: 1 to enable broadcast Rx, 0 to disable it, -1 no change
 * @vlanex: 1 to enable HW VLAN extraction, 0 to disable it, -1 no change
 * @sleep_ok: if true we may sleep while awaiting command completion
 *
 * Sets Rx properties of a virtual interface.
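 *
 * For instance, to enable promiscuous mode while leaving the MTU and all
 * other properties unchanged (illustrative call):
 *
 *	ret = t4_set_rxmode(adap, mbox, viid, -1, 1, -1, -1, -1, true);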
*/ int t4_set_rxmode(struct adapter *adap, unsigned int mbox, unsigned int viid, int mtu, int promisc, int all_multi, int bcast, int vlanex, bool sleep_ok) { struct fw_vi_rxmode_cmd c; /* convert to FW values */ if (mtu < 0) mtu = M_FW_VI_RXMODE_CMD_MTU; if (promisc < 0) promisc = M_FW_VI_RXMODE_CMD_PROMISCEN; if (all_multi < 0) all_multi = M_FW_VI_RXMODE_CMD_ALLMULTIEN; if (bcast < 0) bcast = M_FW_VI_RXMODE_CMD_BROADCASTEN; if (vlanex < 0) vlanex = M_FW_VI_RXMODE_CMD_VLANEXEN; memset(&c, 0, sizeof(c)); c.op_to_viid = cpu_to_be32(V_FW_CMD_OP(FW_VI_RXMODE_CMD) | F_FW_CMD_REQUEST | F_FW_CMD_WRITE | V_FW_VI_RXMODE_CMD_VIID(viid)); c.retval_len16 = cpu_to_be32(FW_LEN16(c)); c.mtu_to_vlanexen = cpu_to_be32(V_FW_VI_RXMODE_CMD_MTU(mtu) | V_FW_VI_RXMODE_CMD_PROMISCEN(promisc) | V_FW_VI_RXMODE_CMD_ALLMULTIEN(all_multi) | V_FW_VI_RXMODE_CMD_BROADCASTEN(bcast) | V_FW_VI_RXMODE_CMD_VLANEXEN(vlanex)); return t4_wr_mbox_meat(adap, mbox, &c, sizeof(c), NULL, sleep_ok); } /** * t4_alloc_mac_filt - allocates exact-match filters for MAC addresses * @adap: the adapter * @mbox: mailbox to use for the FW command * @viid: the VI id * @free: if true any existing filters for this VI id are first removed * @naddr: the number of MAC addresses to allocate filters for (up to 7) * @addr: the MAC address(es) * @idx: where to store the index of each allocated filter * @hash: pointer to hash address filter bitmap * @sleep_ok: call is allowed to sleep * * Allocates an exact-match filter for each of the supplied addresses and * sets it to the corresponding address. If @idx is not %NULL it should * have at least @naddr entries, each of which will be set to the index of * the filter allocated for the corresponding MAC address. If a filter * could not be allocated for an address its index is set to 0xffff. * If @hash is not %NULL addresses that fail to allocate an exact filter * are hashed and update the hash filter bitmap pointed at by @hash. * * Returns a negative error number or the number of filters allocated. */ int t4_alloc_mac_filt(struct adapter *adap, unsigned int mbox, unsigned int viid, bool free, unsigned int naddr, const u8 **addr, u16 *idx, u64 *hash, bool sleep_ok) { int offset, ret = 0; struct fw_vi_mac_cmd c; unsigned int nfilters = 0; unsigned int max_naddr = adap->chip_params->mps_tcam_size; unsigned int rem = naddr; if (naddr > max_naddr) return -EINVAL; for (offset = 0; offset < naddr ; /**/) { unsigned int fw_naddr = (rem < ARRAY_SIZE(c.u.exact) ? rem : ARRAY_SIZE(c.u.exact)); size_t len16 = DIV_ROUND_UP(offsetof(struct fw_vi_mac_cmd, u.exact[fw_naddr]), 16); struct fw_vi_mac_exact *p; int i; memset(&c, 0, sizeof(c)); c.op_to_viid = cpu_to_be32(V_FW_CMD_OP(FW_VI_MAC_CMD) | F_FW_CMD_REQUEST | F_FW_CMD_WRITE | V_FW_CMD_EXEC(free) | V_FW_VI_MAC_CMD_VIID(viid)); c.freemacs_to_len16 = cpu_to_be32(V_FW_VI_MAC_CMD_FREEMACS(free) | V_FW_CMD_LEN16(len16)); for (i = 0, p = c.u.exact; i < fw_naddr; i++, p++) { p->valid_to_idx = cpu_to_be16(F_FW_VI_MAC_CMD_VALID | V_FW_VI_MAC_CMD_IDX(FW_VI_MAC_ADD_MAC)); memcpy(p->macaddr, addr[offset+i], sizeof(p->macaddr)); } /* * It's okay if we run out of space in our MAC address arena. * Some of the addresses we submit may get stored so we need * to run through the reply to see what the results were ... */ ret = t4_wr_mbox_meat(adap, mbox, &c, sizeof(c), &c, sleep_ok); if (ret && ret != -FW_ENOMEM) break; for (i = 0, p = c.u.exact; i < fw_naddr; i++, p++) { u16 index = G_FW_VI_MAC_CMD_IDX( be16_to_cpu(p->valid_to_idx)); if (idx) idx[offset+i] = (index >= max_naddr ? 
0xffff : index); if (index < max_naddr) nfilters++; else if (hash) *hash |= (1ULL << hash_mac_addr(addr[offset+i])); } free = false; offset += fw_naddr; rem -= fw_naddr; } if (ret == 0 || ret == -FW_ENOMEM) ret = nfilters; return ret; } /** * t4_change_mac - modifies the exact-match filter for a MAC address * @adap: the adapter * @mbox: mailbox to use for the FW command * @viid: the VI id * @idx: index of existing filter for old value of MAC address, or -1 * @addr: the new MAC address value * @persist: whether a new MAC allocation should be persistent * @add_smt: if true also add the address to the HW SMT * * Modifies an exact-match filter and sets it to the new MAC address if * @idx >= 0, or adds the MAC address to a new filter if @idx < 0. In the * latter case the address is added persistently if @persist is %true. * * Note that in general it is not possible to modify the value of a given * filter so the generic way to modify an address filter is to free the one * being used by the old address value and allocate a new filter for the * new address value. * * Returns a negative error number or the index of the filter with the new * MAC value. Note that this index may differ from @idx. */ int t4_change_mac(struct adapter *adap, unsigned int mbox, unsigned int viid, int idx, const u8 *addr, bool persist, bool add_smt) { int ret, mode; struct fw_vi_mac_cmd c; struct fw_vi_mac_exact *p = c.u.exact; unsigned int max_mac_addr = adap->chip_params->mps_tcam_size; if (idx < 0) /* new allocation */ idx = persist ? FW_VI_MAC_ADD_PERSIST_MAC : FW_VI_MAC_ADD_MAC; mode = add_smt ? FW_VI_MAC_SMT_AND_MPSTCAM : FW_VI_MAC_MPS_TCAM_ENTRY; memset(&c, 0, sizeof(c)); c.op_to_viid = cpu_to_be32(V_FW_CMD_OP(FW_VI_MAC_CMD) | F_FW_CMD_REQUEST | F_FW_CMD_WRITE | V_FW_VI_MAC_CMD_VIID(viid)); c.freemacs_to_len16 = cpu_to_be32(V_FW_CMD_LEN16(1)); p->valid_to_idx = cpu_to_be16(F_FW_VI_MAC_CMD_VALID | V_FW_VI_MAC_CMD_SMAC_RESULT(mode) | V_FW_VI_MAC_CMD_IDX(idx)); memcpy(p->macaddr, addr, sizeof(p->macaddr)); ret = t4_wr_mbox(adap, mbox, &c, sizeof(c), &c); if (ret == 0) { ret = G_FW_VI_MAC_CMD_IDX(be16_to_cpu(p->valid_to_idx)); if (ret >= max_mac_addr) ret = -ENOMEM; } return ret; } /** * t4_set_addr_hash - program the MAC inexact-match hash filter * @adap: the adapter * @mbox: mailbox to use for the FW command * @viid: the VI id * @ucast: whether the hash filter should also match unicast addresses * @vec: the value to be written to the hash filter * @sleep_ok: call is allowed to sleep * * Sets the 64-bit inexact-match hash filter for a virtual interface. */ int t4_set_addr_hash(struct adapter *adap, unsigned int mbox, unsigned int viid, bool ucast, u64 vec, bool sleep_ok) { struct fw_vi_mac_cmd c; u32 val; memset(&c, 0, sizeof(c)); c.op_to_viid = cpu_to_be32(V_FW_CMD_OP(FW_VI_MAC_CMD) | F_FW_CMD_REQUEST | F_FW_CMD_WRITE | V_FW_VI_ENABLE_CMD_VIID(viid)); val = V_FW_VI_MAC_CMD_ENTRY_TYPE(FW_VI_MAC_TYPE_HASHVEC) | V_FW_VI_MAC_CMD_HASHUNIEN(ucast) | V_FW_CMD_LEN16(1); c.freemacs_to_len16 = cpu_to_be32(val); c.u.hash.hashvec = cpu_to_be64(vec); return t4_wr_mbox_meat(adap, mbox, &c, sizeof(c), NULL, sleep_ok); } /** * t4_enable_vi_params - enable/disable a virtual interface * @adap: the adapter * @mbox: mailbox to use for the FW command * @viid: the VI id * @rx_en: 1=enable Rx, 0=disable Rx * @tx_en: 1=enable Tx, 0=disable Tx * @dcb_en: 1=enable delivery of Data Center Bridging messages. * * Enables/disables a virtual interface. Note that setting DCB Enable * only makes sense when enabling a Virtual Interface ... 
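 *
 * For example, bringing a VI up with both directions enabled and DCB
 * message delivery left off (illustrative call):
 *
 *	ret = t4_enable_vi_params(adap, mbox, viid, true, true, false);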
*/ int t4_enable_vi_params(struct adapter *adap, unsigned int mbox, unsigned int viid, bool rx_en, bool tx_en, bool dcb_en) { struct fw_vi_enable_cmd c; memset(&c, 0, sizeof(c)); c.op_to_viid = cpu_to_be32(V_FW_CMD_OP(FW_VI_ENABLE_CMD) | F_FW_CMD_REQUEST | F_FW_CMD_EXEC | V_FW_VI_ENABLE_CMD_VIID(viid)); c.ien_to_len16 = cpu_to_be32(V_FW_VI_ENABLE_CMD_IEN(rx_en) | V_FW_VI_ENABLE_CMD_EEN(tx_en) | V_FW_VI_ENABLE_CMD_DCB_INFO(dcb_en) | FW_LEN16(c)); return t4_wr_mbox_ns(adap, mbox, &c, sizeof(c), NULL); } /** * t4_enable_vi - enable/disable a virtual interface * @adap: the adapter * @mbox: mailbox to use for the FW command * @viid: the VI id * @rx_en: 1=enable Rx, 0=disable Rx * @tx_en: 1=enable Tx, 0=disable Tx * * Enables/disables a virtual interface. Note that setting DCB Enable * only makes sense when enabling a Virtual Interface ... */ int t4_enable_vi(struct adapter *adap, unsigned int mbox, unsigned int viid, bool rx_en, bool tx_en) { return t4_enable_vi_params(adap, mbox, viid, rx_en, tx_en, 0); } /** * t4_identify_port - identify a VI's port by blinking its LED * @adap: the adapter * @mbox: mailbox to use for the FW command * @viid: the VI id * @nblinks: how many times to blink LED at 2.5 Hz * * Identifies a VI's port by blinking its LED. */ int t4_identify_port(struct adapter *adap, unsigned int mbox, unsigned int viid, unsigned int nblinks) { struct fw_vi_enable_cmd c; memset(&c, 0, sizeof(c)); c.op_to_viid = cpu_to_be32(V_FW_CMD_OP(FW_VI_ENABLE_CMD) | F_FW_CMD_REQUEST | F_FW_CMD_EXEC | V_FW_VI_ENABLE_CMD_VIID(viid)); c.ien_to_len16 = cpu_to_be32(F_FW_VI_ENABLE_CMD_LED | FW_LEN16(c)); c.blinkdur = cpu_to_be16(nblinks); return t4_wr_mbox(adap, mbox, &c, sizeof(c), NULL); } /** * t4_iq_stop - stop an ingress queue and its FLs * @adap: the adapter * @mbox: mailbox to use for the FW command * @pf: the PF owning the queues * @vf: the VF owning the queues * @iqtype: the ingress queue type (FW_IQ_TYPE_FL_INT_CAP, etc.) * @iqid: ingress queue id * @fl0id: FL0 queue id or 0xffff if no attached FL0 * @fl1id: FL1 queue id or 0xffff if no attached FL1 * * Stops an ingress queue and its associated FLs, if any. This causes * any current or future data/messages destined for these queues to be * tossed. */ int t4_iq_stop(struct adapter *adap, unsigned int mbox, unsigned int pf, unsigned int vf, unsigned int iqtype, unsigned int iqid, unsigned int fl0id, unsigned int fl1id) { struct fw_iq_cmd c; memset(&c, 0, sizeof(c)); c.op_to_vfn = cpu_to_be32(V_FW_CMD_OP(FW_IQ_CMD) | F_FW_CMD_REQUEST | F_FW_CMD_EXEC | V_FW_IQ_CMD_PFN(pf) | V_FW_IQ_CMD_VFN(vf)); c.alloc_to_len16 = cpu_to_be32(F_FW_IQ_CMD_IQSTOP | FW_LEN16(c)); c.type_to_iqandstindex = cpu_to_be32(V_FW_IQ_CMD_TYPE(iqtype)); c.iqid = cpu_to_be16(iqid); c.fl0id = cpu_to_be16(fl0id); c.fl1id = cpu_to_be16(fl1id); return t4_wr_mbox(adap, mbox, &c, sizeof(c), NULL); } /** * t4_iq_free - free an ingress queue and its FLs * @adap: the adapter * @mbox: mailbox to use for the FW command * @pf: the PF owning the queues * @vf: the VF owning the queues * @iqtype: the ingress queue type (FW_IQ_TYPE_FL_INT_CAP, etc.) * @iqid: ingress queue id * @fl0id: FL0 queue id or 0xffff if no attached FL0 * @fl1id: FL1 queue id or 0xffff if no attached FL1 * * Frees an ingress queue and its associated FLs, if any. 
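 *
 * For example, freeing an ingress queue that has only FL0 attached
 * (illustrative call; 0xffff marks the absent FL1):
 *
 *	ret = t4_iq_free(adap, mbox, pf, vf, FW_IQ_TYPE_FL_INT_CAP, iqid,
 *	    fl0id, 0xffff);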
*/ int t4_iq_free(struct adapter *adap, unsigned int mbox, unsigned int pf, unsigned int vf, unsigned int iqtype, unsigned int iqid, unsigned int fl0id, unsigned int fl1id) { struct fw_iq_cmd c; memset(&c, 0, sizeof(c)); c.op_to_vfn = cpu_to_be32(V_FW_CMD_OP(FW_IQ_CMD) | F_FW_CMD_REQUEST | F_FW_CMD_EXEC | V_FW_IQ_CMD_PFN(pf) | V_FW_IQ_CMD_VFN(vf)); c.alloc_to_len16 = cpu_to_be32(F_FW_IQ_CMD_FREE | FW_LEN16(c)); c.type_to_iqandstindex = cpu_to_be32(V_FW_IQ_CMD_TYPE(iqtype)); c.iqid = cpu_to_be16(iqid); c.fl0id = cpu_to_be16(fl0id); c.fl1id = cpu_to_be16(fl1id); return t4_wr_mbox(adap, mbox, &c, sizeof(c), NULL); } /** * t4_eth_eq_free - free an Ethernet egress queue * @adap: the adapter * @mbox: mailbox to use for the FW command * @pf: the PF owning the queue * @vf: the VF owning the queue * @eqid: egress queue id * * Frees an Ethernet egress queue. */ int t4_eth_eq_free(struct adapter *adap, unsigned int mbox, unsigned int pf, unsigned int vf, unsigned int eqid) { struct fw_eq_eth_cmd c; memset(&c, 0, sizeof(c)); c.op_to_vfn = cpu_to_be32(V_FW_CMD_OP(FW_EQ_ETH_CMD) | F_FW_CMD_REQUEST | F_FW_CMD_EXEC | V_FW_EQ_ETH_CMD_PFN(pf) | V_FW_EQ_ETH_CMD_VFN(vf)); c.alloc_to_len16 = cpu_to_be32(F_FW_EQ_ETH_CMD_FREE | FW_LEN16(c)); c.eqid_pkd = cpu_to_be32(V_FW_EQ_ETH_CMD_EQID(eqid)); return t4_wr_mbox(adap, mbox, &c, sizeof(c), NULL); } /** * t4_ctrl_eq_free - free a control egress queue * @adap: the adapter * @mbox: mailbox to use for the FW command * @pf: the PF owning the queue * @vf: the VF owning the queue * @eqid: egress queue id * * Frees a control egress queue. */ int t4_ctrl_eq_free(struct adapter *adap, unsigned int mbox, unsigned int pf, unsigned int vf, unsigned int eqid) { struct fw_eq_ctrl_cmd c; memset(&c, 0, sizeof(c)); c.op_to_vfn = cpu_to_be32(V_FW_CMD_OP(FW_EQ_CTRL_CMD) | F_FW_CMD_REQUEST | F_FW_CMD_EXEC | V_FW_EQ_CTRL_CMD_PFN(pf) | V_FW_EQ_CTRL_CMD_VFN(vf)); c.alloc_to_len16 = cpu_to_be32(F_FW_EQ_CTRL_CMD_FREE | FW_LEN16(c)); c.cmpliqid_eqid = cpu_to_be32(V_FW_EQ_CTRL_CMD_EQID(eqid)); return t4_wr_mbox(adap, mbox, &c, sizeof(c), NULL); } /** * t4_ofld_eq_free - free an offload egress queue * @adap: the adapter * @mbox: mailbox to use for the FW command * @pf: the PF owning the queue * @vf: the VF owning the queue * @eqid: egress queue id * * Frees a control egress queue. */ int t4_ofld_eq_free(struct adapter *adap, unsigned int mbox, unsigned int pf, unsigned int vf, unsigned int eqid) { struct fw_eq_ofld_cmd c; memset(&c, 0, sizeof(c)); c.op_to_vfn = cpu_to_be32(V_FW_CMD_OP(FW_EQ_OFLD_CMD) | F_FW_CMD_REQUEST | F_FW_CMD_EXEC | V_FW_EQ_OFLD_CMD_PFN(pf) | V_FW_EQ_OFLD_CMD_VFN(vf)); c.alloc_to_len16 = cpu_to_be32(F_FW_EQ_OFLD_CMD_FREE | FW_LEN16(c)); c.eqid_pkd = cpu_to_be32(V_FW_EQ_OFLD_CMD_EQID(eqid)); return t4_wr_mbox(adap, mbox, &c, sizeof(c), NULL); } /** * t4_link_down_rc_str - return a string for a Link Down Reason Code * @link_down_rc: Link Down Reason Code * * Returns a string representation of the Link Down Reason Code. */ const char *t4_link_down_rc_str(unsigned char link_down_rc) { static const char *reason[] = { "Link Down", "Remote Fault", "Auto-negotiation Failure", "Reserved3", "Insufficient Airflow", "Unable To Determine Reason", "No RX Signal Detected", "Reserved7", }; if (link_down_rc >= ARRAY_SIZE(reason)) return "Bad Reason Code"; return reason[link_down_rc]; } /* * Updates all fields owned by the common code in port_info and link_config * based on information provided by the firmware. Does not touch any * requested_* field. 
*/ static void handle_port_info(struct port_info *pi, const struct fw_port_info *p) { struct link_config *lc = &pi->link_cfg; int speed; unsigned char fc, fec; u32 stat = be32_to_cpu(p->lstatus_to_modtype); pi->port_type = G_FW_PORT_CMD_PTYPE(stat); pi->mod_type = G_FW_PORT_CMD_MODTYPE(stat); pi->mdio_addr = stat & F_FW_PORT_CMD_MDIOCAP ? G_FW_PORT_CMD_MDIOADDR(stat) : -1; lc->supported = be16_to_cpu(p->pcap); lc->advertising = be16_to_cpu(p->acap); lc->lp_advertising = be16_to_cpu(p->lpacap); lc->link_ok = (stat & F_FW_PORT_CMD_LSTATUS) != 0; lc->link_down_rc = G_FW_PORT_CMD_LINKDNRC(stat); speed = 0; if (stat & V_FW_PORT_CMD_LSPEED(FW_PORT_CAP_SPEED_100M)) speed = 100; else if (stat & V_FW_PORT_CMD_LSPEED(FW_PORT_CAP_SPEED_1G)) speed = 1000; else if (stat & V_FW_PORT_CMD_LSPEED(FW_PORT_CAP_SPEED_10G)) speed = 10000; else if (stat & V_FW_PORT_CMD_LSPEED(FW_PORT_CAP_SPEED_25G)) speed = 25000; else if (stat & V_FW_PORT_CMD_LSPEED(FW_PORT_CAP_SPEED_40G)) speed = 40000; else if (stat & V_FW_PORT_CMD_LSPEED(FW_PORT_CAP_SPEED_100G)) speed = 100000; lc->speed = speed; fc = 0; if (stat & F_FW_PORT_CMD_RXPAUSE) fc |= PAUSE_RX; if (stat & F_FW_PORT_CMD_TXPAUSE) fc |= PAUSE_TX; lc->fc = fc; fec = 0; if (lc->advertising & FW_PORT_CAP_FEC_RS) fec |= FEC_RS; if (lc->advertising & FW_PORT_CAP_FEC_BASER_RS) fec |= FEC_BASER_RS; if (lc->advertising & FW_PORT_CAP_FEC_RESERVED) fec |= FEC_RESERVED; lc->fec = fec; } /** * t4_update_port_info - retrieve and update port information if changed * @pi: the port_info * * We issue a Get Port Information Command to the Firmware and, if * successful, we check to see if anything is different from what we * last recorded and update things accordingly. */ int t4_update_port_info(struct port_info *pi) { struct fw_port_cmd port_cmd; int ret; memset(&port_cmd, 0, sizeof port_cmd); port_cmd.op_to_portid = cpu_to_be32(V_FW_CMD_OP(FW_PORT_CMD) | F_FW_CMD_REQUEST | F_FW_CMD_READ | V_FW_PORT_CMD_PORTID(pi->tx_chan)); port_cmd.action_to_len16 = cpu_to_be32( V_FW_PORT_CMD_ACTION(FW_PORT_ACTION_GET_PORT_INFO) | FW_LEN16(port_cmd)); ret = t4_wr_mbox_ns(pi->adapter, pi->adapter->mbox, &port_cmd, sizeof(port_cmd), &port_cmd); if (ret) return ret; handle_port_info(pi, &port_cmd.u.info); return 0; } /** * t4_handle_fw_rpl - process a FW reply message * @adap: the adapter * @rpl: start of the FW message * * Processes a FW message, such as link state change messages. 
*/ int t4_handle_fw_rpl(struct adapter *adap, const __be64 *rpl) { u8 opcode = *(const u8 *)rpl; const struct fw_port_cmd *p = (const void *)rpl; unsigned int action = G_FW_PORT_CMD_ACTION(be32_to_cpu(p->action_to_len16)); if (opcode == FW_PORT_CMD && action == FW_PORT_ACTION_GET_PORT_INFO) { /* link/module state change message */ int i, old_ptype, old_mtype; int chan = G_FW_PORT_CMD_PORTID(be32_to_cpu(p->op_to_portid)); struct port_info *pi = NULL; struct link_config *lc, *old_lc; for_each_port(adap, i) { pi = adap2pinfo(adap, i); if (pi->tx_chan == chan) break; } lc = &pi->link_cfg; old_lc = &pi->old_link_cfg; old_ptype = pi->port_type; old_mtype = pi->mod_type; handle_port_info(pi, &p->u.info); if (old_ptype != pi->port_type || old_mtype != pi->mod_type) { t4_os_portmod_changed(pi); } if (old_lc->link_ok != lc->link_ok || old_lc->speed != lc->speed || old_lc->fec != lc->fec || old_lc->fc != lc->fc) { t4_os_link_changed(pi); *old_lc = *lc; } } else { CH_WARN_RATELIMIT(adap, "Unknown firmware reply %d\n", opcode); return -EINVAL; } return 0; } /** * get_pci_mode - determine a card's PCI mode * @adapter: the adapter * @p: where to store the PCI settings * * Determines a card's PCI mode and associated parameters, such as speed * and width. */ static void get_pci_mode(struct adapter *adapter, struct pci_params *p) { u16 val; u32 pcie_cap; pcie_cap = t4_os_find_pci_capability(adapter, PCI_CAP_ID_EXP); if (pcie_cap) { t4_os_pci_read_cfg2(adapter, pcie_cap + PCI_EXP_LNKSTA, &val); p->speed = val & PCI_EXP_LNKSTA_CLS; p->width = (val & PCI_EXP_LNKSTA_NLW) >> 4; } } struct flash_desc { u32 vendor_and_model_id; u32 size_mb; }; int t4_get_flash_params(struct adapter *adapter) { /* * Table for non-standard supported Flash parts. Note, all Flash * parts must have 64KB sectors. */ static struct flash_desc supported_flash[] = { { 0x00150201, 4 << 20 }, /* Spansion 4MB S25FL032P */ }; int ret; u32 flashid = 0; unsigned int part, manufacturer; unsigned int density, size; /* * Issue a Read ID Command to the Flash part. We decode supported * Flash parts and their sizes from this. There's a newer Query * Command which can retrieve detailed geometry information but many * Flash parts don't support it. */ ret = sf1_write(adapter, 1, 1, 0, SF_RD_ID); if (!ret) ret = sf1_read(adapter, 3, 0, 1, &flashid); t4_write_reg(adapter, A_SF_OP, 0); /* unlock SF */ if (ret < 0) return ret; /* * Check to see if it's one of our non-standard supported Flash parts. */ for (part = 0; part < ARRAY_SIZE(supported_flash); part++) if (supported_flash[part].vendor_and_model_id == flashid) { adapter->params.sf_size = supported_flash[part].size_mb; adapter->params.sf_nsec = adapter->params.sf_size / SF_SEC_SIZE; goto found; } /* * Decode Flash part size. The code below looks repetative with * common encodings, but that's not guaranteed in the JEDEC * specification for the Read JADEC ID command. The only thing that * we're guaranteed by the JADEC specification is where the * Manufacturer ID is in the returned result. After that each * Manufacturer ~could~ encode things completely differently. * Note, all Flash parts must have 64KB sectors. */ manufacturer = flashid & 0xff; switch (manufacturer) { case 0x20: { /* Micron/Numonix */ /* * This Density -> Size decoding table is taken from Micron * Data Sheets. 
*/ density = (flashid >> 16) & 0xff; switch (density) { case 0x14: size = 1 << 20; break; /* 1MB */ case 0x15: size = 1 << 21; break; /* 2MB */ case 0x16: size = 1 << 22; break; /* 4MB */ case 0x17: size = 1 << 23; break; /* 8MB */ case 0x18: size = 1 << 24; break; /* 16MB */ case 0x19: size = 1 << 25; break; /* 32MB */ case 0x20: size = 1 << 26; break; /* 64MB */ case 0x21: size = 1 << 27; break; /* 128MB */ case 0x22: size = 1 << 28; break; /* 256MB */ default: CH_ERR(adapter, "Micron Flash Part has bad size, " "ID = %#x, Density code = %#x\n", flashid, density); return -EINVAL; } break; } case 0xef: { /* Winbond */ /* * This Density -> Size decoding table is taken from Winbond * Data Sheets. */ density = (flashid >> 16) & 0xff; switch (density) { case 0x17: size = 1 << 23; break; /* 8MB */ case 0x18: size = 1 << 24; break; /* 16MB */ default: CH_ERR(adapter, "Winbond Flash Part has bad size, " "ID = %#x, Density code = %#x\n", flashid, density); return -EINVAL; } break; } default: CH_ERR(adapter, "Unsupported Flash Part, ID = %#x\n", flashid); return -EINVAL; } /* * Store decoded Flash size and fall through into vetting code. */ adapter->params.sf_size = size; adapter->params.sf_nsec = size / SF_SEC_SIZE; found: /* * We should ~probably~ reject adapters with FLASHes which are too * small but we have some legacy FPGAs with small FLASHes that we'd * still like to use. So instead we emit a scary message ... */ if (adapter->params.sf_size < FLASH_MIN_SIZE) CH_WARN(adapter, "WARNING: Flash Part ID %#x, size %#x < %#x\n", flashid, adapter->params.sf_size, FLASH_MIN_SIZE); return 0; } static void set_pcie_completion_timeout(struct adapter *adapter, u8 range) { u16 val; u32 pcie_cap; pcie_cap = t4_os_find_pci_capability(adapter, PCI_CAP_ID_EXP); if (pcie_cap) { t4_os_pci_read_cfg2(adapter, pcie_cap + PCI_EXP_DEVCTL2, &val); val &= 0xfff0; val |= range ; t4_os_pci_write_cfg2(adapter, pcie_cap + PCI_EXP_DEVCTL2, val); } } const struct chip_params *t4_get_chip_params(int chipid) { static const struct chip_params chip_params[] = { { /* T4 */ .nchan = NCHAN, .pm_stats_cnt = PM_NSTATS, .cng_ch_bits_log = 2, .nsched_cls = 15, .cim_num_obq = CIM_NUM_OBQ, .mps_rplc_size = 128, .vfcount = 128, .sge_fl_db = F_DBPRIO, .mps_tcam_size = NUM_MPS_CLS_SRAM_L_INSTANCES, }, { /* T5 */ .nchan = NCHAN, .pm_stats_cnt = PM_NSTATS, .cng_ch_bits_log = 2, .nsched_cls = 16, .cim_num_obq = CIM_NUM_OBQ_T5, .mps_rplc_size = 128, .vfcount = 128, .sge_fl_db = F_DBPRIO | F_DBTYPE, .mps_tcam_size = NUM_MPS_T5_CLS_SRAM_L_INSTANCES, }, { /* T6 */ .nchan = T6_NCHAN, .pm_stats_cnt = T6_PM_NSTATS, .cng_ch_bits_log = 3, .nsched_cls = 16, .cim_num_obq = CIM_NUM_OBQ_T5, .mps_rplc_size = 256, .vfcount = 256, .sge_fl_db = 0, .mps_tcam_size = NUM_MPS_T5_CLS_SRAM_L_INSTANCES, }, }; chipid -= CHELSIO_T4; if (chipid < 0 || chipid >= ARRAY_SIZE(chip_params)) return NULL; return &chip_params[chipid]; } /** * t4_prep_adapter - prepare SW and HW for operation * @adapter: the adapter * @buf: temporary space of at least VPD_LEN size provided by the caller. * * Initialize adapter SW state for the various HW modules, set initial * values for some adapter tunables, take PHYs out of reset, and * initialize the MDIO interface. 
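 *
 * Sketch of the expected call; the scratch buffer only needs to live
 * across the call, and the allocation style shown here is illustrative
 * rather than what any particular attach routine does:
 *
 *	u8 *buf = malloc(VPD_LEN, M_DEVBUF, M_WAITOK | M_ZERO);
 *	ret = t4_prep_adapter(adapter, buf);
 *	free(buf, M_DEVBUF);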
*/ int t4_prep_adapter(struct adapter *adapter, u8 *buf) { int ret; uint16_t device_id; uint32_t pl_rev; get_pci_mode(adapter, &adapter->params.pci); pl_rev = t4_read_reg(adapter, A_PL_REV); adapter->params.chipid = G_CHIPID(pl_rev); adapter->params.rev = G_REV(pl_rev); if (adapter->params.chipid == 0) { /* T4 did not have chipid in PL_REV (T5 onwards do) */ adapter->params.chipid = CHELSIO_T4; /* T4A1 chip is not supported */ if (adapter->params.rev == 1) { CH_ALERT(adapter, "T4 rev 1 chip is not supported.\n"); return -EINVAL; } } adapter->chip_params = t4_get_chip_params(chip_id(adapter)); if (adapter->chip_params == NULL) return -EINVAL; adapter->params.pci.vpd_cap_addr = t4_os_find_pci_capability(adapter, PCI_CAP_ID_VPD); ret = t4_get_flash_params(adapter); if (ret < 0) return ret; ret = get_vpd_params(adapter, &adapter->params.vpd, buf); if (ret < 0) return ret; /* Cards with real ASICs have the chipid in the PCIe device id */ t4_os_pci_read_cfg2(adapter, PCI_DEVICE_ID, &device_id); if (device_id >> 12 == chip_id(adapter)) adapter->params.cim_la_size = CIMLA_SIZE; else { /* FPGA */ adapter->params.fpga = 1; adapter->params.cim_la_size = 2 * CIMLA_SIZE; } init_cong_ctrl(adapter->params.a_wnd, adapter->params.b_wnd); /* * Default port and clock for debugging in case we can't reach FW. */ adapter->params.nports = 1; adapter->params.portvec = 1; adapter->params.vpd.cclk = 50000; /* Set pci completion timeout value to 4 seconds. */ set_pcie_completion_timeout(adapter, 0xd); return 0; } /** * t4_shutdown_adapter - shut down adapter, host & wire * @adapter: the adapter * * Perform an emergency shutdown of the adapter and stop it from * continuing any further communication on the ports or DMA to the * host. This is typically used when the adapter and/or firmware * have crashed and we want to prevent any further accidental * communication with the rest of the world. This will also force * the port Link Status to go down -- if register writes work -- * which should help our peers figure out that we're down. */ int t4_shutdown_adapter(struct adapter *adapter) { int port; t4_intr_disable(adapter); t4_write_reg(adapter, A_DBG_GPIO_EN, 0); for_each_port(adapter, port) { u32 a_port_cfg = is_t4(adapter) ? PORT_REG(port, A_XGMAC_PORT_CFG) : T5_PORT_REG(port, A_MAC_PORT_CFG); t4_write_reg(adapter, a_port_cfg, t4_read_reg(adapter, a_port_cfg) & ~V_SIGNAL_DET(1)); } t4_set_reg_field(adapter, A_SGE_CONTROL, F_GLOBALENABLE, 0); return 0; } /** * t4_init_devlog_params - initialize adapter->params.devlog * @adap: the adapter * @fw_attach: whether we can talk to the firmware * * Initialize various fields of the adapter's Firmware Device Log * Parameters structure. */ int t4_init_devlog_params(struct adapter *adap, int fw_attach) { struct devlog_params *dparams = &adap->params.devlog; u32 pf_dparams; unsigned int devlog_meminfo; struct fw_devlog_cmd devlog_cmd; int ret; /* If we're dealing with newer firmware, the Device Log Paramerters * are stored in a designated register which allows us to access the * Device Log even if we can't talk to the firmware. */ pf_dparams = t4_read_reg(adap, PCIE_FW_REG(A_PCIE_FW_PF, PCIE_FW_PF_DEVLOG)); if (pf_dparams) { unsigned int nentries, nentries128; dparams->memtype = G_PCIE_FW_PF_DEVLOG_MEMTYPE(pf_dparams); dparams->start = G_PCIE_FW_PF_DEVLOG_ADDR16(pf_dparams) << 4; nentries128 = G_PCIE_FW_PF_DEVLOG_NENTRIES128(pf_dparams); nentries = (nentries128 + 1) * 128; dparams->size = nentries * sizeof(struct fw_devlog_e); return 0; } /* * For any failing returns ... 
*/ memset(dparams, 0, sizeof *dparams); /* * If we can't talk to the firmware, there's really nothing we can do * at this point. */ if (!fw_attach) return -ENXIO; /* Otherwise, ask the firmware for it's Device Log Parameters. */ memset(&devlog_cmd, 0, sizeof devlog_cmd); devlog_cmd.op_to_write = cpu_to_be32(V_FW_CMD_OP(FW_DEVLOG_CMD) | F_FW_CMD_REQUEST | F_FW_CMD_READ); devlog_cmd.retval_len16 = cpu_to_be32(FW_LEN16(devlog_cmd)); ret = t4_wr_mbox(adap, adap->mbox, &devlog_cmd, sizeof(devlog_cmd), &devlog_cmd); if (ret) return ret; devlog_meminfo = be32_to_cpu(devlog_cmd.memtype_devlog_memaddr16_devlog); dparams->memtype = G_FW_DEVLOG_CMD_MEMTYPE_DEVLOG(devlog_meminfo); dparams->start = G_FW_DEVLOG_CMD_MEMADDR16_DEVLOG(devlog_meminfo) << 4; dparams->size = be32_to_cpu(devlog_cmd.memsize_devlog); return 0; } /** * t4_init_sge_params - initialize adap->params.sge * @adapter: the adapter * * Initialize various fields of the adapter's SGE Parameters structure. */ int t4_init_sge_params(struct adapter *adapter) { u32 r; struct sge_params *sp = &adapter->params.sge; unsigned i, tscale = 1; r = t4_read_reg(adapter, A_SGE_INGRESS_RX_THRESHOLD); sp->counter_val[0] = G_THRESHOLD_0(r); sp->counter_val[1] = G_THRESHOLD_1(r); sp->counter_val[2] = G_THRESHOLD_2(r); sp->counter_val[3] = G_THRESHOLD_3(r); if (chip_id(adapter) >= CHELSIO_T6) { r = t4_read_reg(adapter, A_SGE_ITP_CONTROL); tscale = G_TSCALE(r); if (tscale == 0) tscale = 1; else tscale += 2; } r = t4_read_reg(adapter, A_SGE_TIMER_VALUE_0_AND_1); sp->timer_val[0] = core_ticks_to_us(adapter, G_TIMERVALUE0(r)) * tscale; sp->timer_val[1] = core_ticks_to_us(adapter, G_TIMERVALUE1(r)) * tscale; r = t4_read_reg(adapter, A_SGE_TIMER_VALUE_2_AND_3); sp->timer_val[2] = core_ticks_to_us(adapter, G_TIMERVALUE2(r)) * tscale; sp->timer_val[3] = core_ticks_to_us(adapter, G_TIMERVALUE3(r)) * tscale; r = t4_read_reg(adapter, A_SGE_TIMER_VALUE_4_AND_5); sp->timer_val[4] = core_ticks_to_us(adapter, G_TIMERVALUE4(r)) * tscale; sp->timer_val[5] = core_ticks_to_us(adapter, G_TIMERVALUE5(r)) * tscale; r = t4_read_reg(adapter, A_SGE_CONM_CTRL); sp->fl_starve_threshold = G_EGRTHRESHOLD(r) * 2 + 1; if (is_t4(adapter)) sp->fl_starve_threshold2 = sp->fl_starve_threshold; else if (is_t5(adapter)) sp->fl_starve_threshold2 = G_EGRTHRESHOLDPACKING(r) * 2 + 1; else sp->fl_starve_threshold2 = G_T6_EGRTHRESHOLDPACKING(r) * 2 + 1; /* egress queues: log2 of # of doorbells per BAR2 page */ r = t4_read_reg(adapter, A_SGE_EGRESS_QUEUES_PER_PAGE_PF); r >>= S_QUEUESPERPAGEPF0 + (S_QUEUESPERPAGEPF1 - S_QUEUESPERPAGEPF0) * adapter->pf; sp->eq_s_qpp = r & M_QUEUESPERPAGEPF0; /* ingress queues: log2 of # of doorbells per BAR2 page */ r = t4_read_reg(adapter, A_SGE_INGRESS_QUEUES_PER_PAGE_PF); r >>= S_QUEUESPERPAGEPF0 + (S_QUEUESPERPAGEPF1 - S_QUEUESPERPAGEPF0) * adapter->pf; sp->iq_s_qpp = r & M_QUEUESPERPAGEPF0; r = t4_read_reg(adapter, A_SGE_HOST_PAGE_SIZE); r >>= S_HOSTPAGESIZEPF0 + (S_HOSTPAGESIZEPF1 - S_HOSTPAGESIZEPF0) * adapter->pf; sp->page_shift = (r & M_HOSTPAGESIZEPF0) + 10; r = t4_read_reg(adapter, A_SGE_CONTROL); sp->sge_control = r; sp->spg_len = r & F_EGRSTATUSPAGESIZE ? 
128 : 64; sp->fl_pktshift = G_PKTSHIFT(r); if (chip_id(adapter) <= CHELSIO_T5) { sp->pad_boundary = 1 << (G_INGPADBOUNDARY(r) + X_INGPADBOUNDARY_SHIFT); } else { sp->pad_boundary = 1 << (G_INGPADBOUNDARY(r) + X_T6_INGPADBOUNDARY_SHIFT); } if (is_t4(adapter)) sp->pack_boundary = sp->pad_boundary; else { r = t4_read_reg(adapter, A_SGE_CONTROL2); if (G_INGPACKBOUNDARY(r) == 0) sp->pack_boundary = 16; else sp->pack_boundary = 1 << (G_INGPACKBOUNDARY(r) + 5); } for (i = 0; i < SGE_FLBUF_SIZES; i++) sp->sge_fl_buffer_size[i] = t4_read_reg(adapter, A_SGE_FL_BUFFER_SIZE0 + (4 * i)); return 0; } /* * Read and cache the adapter's compressed filter mode and ingress config. */ static void read_filter_mode_and_ingress_config(struct adapter *adap, bool sleep_ok) { struct tp_params *tpp = &adap->params.tp; t4_tp_pio_read(adap, &tpp->vlan_pri_map, 1, A_TP_VLAN_PRI_MAP, sleep_ok); t4_tp_pio_read(adap, &tpp->ingress_config, 1, A_TP_INGRESS_CONFIG, sleep_ok); /* * Now that we have TP_VLAN_PRI_MAP cached, we can calculate the field * shift positions of several elements of the Compressed Filter Tuple * for this adapter which we need frequently ... */ tpp->fcoe_shift = t4_filter_field_shift(adap, F_FCOE); tpp->port_shift = t4_filter_field_shift(adap, F_PORT); tpp->vnic_shift = t4_filter_field_shift(adap, F_VNIC_ID); tpp->vlan_shift = t4_filter_field_shift(adap, F_VLAN); tpp->tos_shift = t4_filter_field_shift(adap, F_TOS); tpp->protocol_shift = t4_filter_field_shift(adap, F_PROTOCOL); tpp->ethertype_shift = t4_filter_field_shift(adap, F_ETHERTYPE); tpp->macmatch_shift = t4_filter_field_shift(adap, F_MACMATCH); tpp->matchtype_shift = t4_filter_field_shift(adap, F_MPSHITTYPE); tpp->frag_shift = t4_filter_field_shift(adap, F_FRAGMENTATION); /* * If TP_INGRESS_CONFIG.VNID == 0, then TP_VLAN_PRI_MAP.VNIC_ID * represents the presence of an Outer VLAN instead of a VNIC ID. */ if ((tpp->ingress_config & F_VNIC) == 0) tpp->vnic_shift = -1; } /** * t4_init_tp_params - initialize adap->params.tp * @adap: the adapter * * Initialize various fields of the adapter's TP Parameters structure. */ int t4_init_tp_params(struct adapter *adap, bool sleep_ok) { int chan; u32 v; struct tp_params *tpp = &adap->params.tp; v = t4_read_reg(adap, A_TP_TIMER_RESOLUTION); tpp->tre = G_TIMERRESOLUTION(v); tpp->dack_re = G_DELAYEDACKRESOLUTION(v); /* MODQ_REQ_MAP defaults to setting queues 0-3 to chan 0-3 */ for (chan = 0; chan < MAX_NCHAN; chan++) tpp->tx_modq[chan] = chan; read_filter_mode_and_ingress_config(adap, sleep_ok); /* * Cache a mask of the bits that represent the error vector portion of * rx_pkt.err_vec. T6+ can use a compressed error vector to make room * for information about outer encapsulation (GENEVE/VXLAN/NVGRE). */ tpp->err_vec_mask = htobe16(0xffff); if (chip_id(adap) > CHELSIO_T5) { v = t4_read_reg(adap, A_TP_OUT_CONFIG); if (v & F_CRXPKTENC) { tpp->err_vec_mask = htobe16(V_T6_COMPR_RXERR_VEC(M_T6_COMPR_RXERR_VEC)); } } return 0; } /** * t4_filter_field_shift - calculate filter field shift * @adap: the adapter * @filter_sel: the desired field (from TP_VLAN_PRI_MAP bits) * * Return the shift position of a filter field within the Compressed * Filter Tuple. The filter field is specified via its selection bit * within TP_VLAN_PRI_MAL (filter mode). E.g. F_VLAN. 
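 *
 * Worked example (assuming F_PORT occupies a lower selection bit than
 * F_VLAN, as the decode order in the function suggests): with a filter
 * mode that enables only F_PORT and F_VLAN,
 *
 *	t4_filter_field_shift(adap, F_PORT) returns 0
 *	t4_filter_field_shift(adap, F_VLAN) returns W_FT_PORT
 *
 * i.e. each field is shifted past the combined width of the enabled
 * fields below it.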
*/ int t4_filter_field_shift(const struct adapter *adap, int filter_sel) { unsigned int filter_mode = adap->params.tp.vlan_pri_map; unsigned int sel; int field_shift; if ((filter_mode & filter_sel) == 0) return -1; for (sel = 1, field_shift = 0; sel < filter_sel; sel <<= 1) { switch (filter_mode & sel) { case F_FCOE: field_shift += W_FT_FCOE; break; case F_PORT: field_shift += W_FT_PORT; break; case F_VNIC_ID: field_shift += W_FT_VNIC_ID; break; case F_VLAN: field_shift += W_FT_VLAN; break; case F_TOS: field_shift += W_FT_TOS; break; case F_PROTOCOL: field_shift += W_FT_PROTOCOL; break; case F_ETHERTYPE: field_shift += W_FT_ETHERTYPE; break; case F_MACMATCH: field_shift += W_FT_MACMATCH; break; case F_MPSHITTYPE: field_shift += W_FT_MPSHITTYPE; break; case F_FRAGMENTATION: field_shift += W_FT_FRAGMENTATION; break; } } return field_shift; } int t4_port_init(struct adapter *adap, int mbox, int pf, int vf, int port_id) { u8 addr[6]; int ret, i, j; struct fw_port_cmd c; u16 rss_size; struct port_info *p = adap2pinfo(adap, port_id); u32 param, val; memset(&c, 0, sizeof(c)); for (i = 0, j = -1; i <= p->port_id; i++) { do { j++; } while ((adap->params.portvec & (1 << j)) == 0); } if (!(adap->flags & IS_VF) || adap->params.vfres.r_caps & FW_CMD_CAP_PORT) { t4_update_port_info(p); } ret = t4_alloc_vi(adap, mbox, j, pf, vf, 1, addr, &rss_size); if (ret < 0) return ret; p->vi[0].viid = ret; if (chip_id(adap) <= CHELSIO_T5) p->vi[0].smt_idx = (ret & 0x7f) << 1; else p->vi[0].smt_idx = (ret & 0x7f); p->tx_chan = j; p->rx_chan_map = t4_get_mps_bg_map(adap, j); p->lport = j; p->vi[0].rss_size = rss_size; t4_os_set_hw_addr(p, addr); param = V_FW_PARAMS_MNEM(FW_PARAMS_MNEM_DEV) | V_FW_PARAMS_PARAM_X(FW_PARAMS_PARAM_DEV_RSSINFO) | V_FW_PARAMS_PARAM_YZ(p->vi[0].viid); ret = t4_query_params(adap, mbox, pf, vf, 1, ¶m, &val); if (ret) p->vi[0].rss_base = 0xffff; else { /* MPASS((val >> 16) == rss_size); */ p->vi[0].rss_base = val & 0xffff; } return 0; } /** * t4_read_cimq_cfg - read CIM queue configuration * @adap: the adapter * @base: holds the queue base addresses in bytes * @size: holds the queue sizes in bytes * @thres: holds the queue full thresholds in bytes * * Returns the current configuration of the CIM queues, starting with * the IBQs, then the OBQs. */ void t4_read_cimq_cfg(struct adapter *adap, u16 *base, u16 *size, u16 *thres) { unsigned int i, v; int cim_num_obq = adap->chip_params->cim_num_obq; for (i = 0; i < CIM_NUM_IBQ; i++) { t4_write_reg(adap, A_CIM_QUEUE_CONFIG_REF, F_IBQSELECT | V_QUENUMSELECT(i)); v = t4_read_reg(adap, A_CIM_QUEUE_CONFIG_CTRL); /* value is in 256-byte units */ *base++ = G_CIMQBASE(v) * 256; *size++ = G_CIMQSIZE(v) * 256; *thres++ = G_QUEFULLTHRSH(v) * 8; /* 8-byte unit */ } for (i = 0; i < cim_num_obq; i++) { t4_write_reg(adap, A_CIM_QUEUE_CONFIG_REF, F_OBQSELECT | V_QUENUMSELECT(i)); v = t4_read_reg(adap, A_CIM_QUEUE_CONFIG_CTRL); /* value is in 256-byte units */ *base++ = G_CIMQBASE(v) * 256; *size++ = G_CIMQSIZE(v) * 256; } } /** * t4_read_cim_ibq - read the contents of a CIM inbound queue * @adap: the adapter * @qid: the queue index * @data: where to store the queue contents * @n: capacity of @data in 32-bit words * * Reads the contents of the selected CIM queue starting at address 0 up * to the capacity of @data. @n must be a multiple of 4. Returns < 0 on * error and the number of 32-bit words actually read on success. 
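 *
 * Example of dumping IBQ 0 into a buffer sized for a full queue
 * (illustrative only):
 *
 *	u32 buf[CIM_IBQ_SIZE * 4];
 *	int n = t4_read_cim_ibq(adap, 0, buf, ARRAY_SIZE(buf));
 *	(n < 0 is an error; otherwise n 32-bit words were copied into buf)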
 */
int t4_read_cim_ibq(struct adapter *adap, unsigned int qid, u32 *data,
		    size_t n)
{
	int i, err, attempts;
	unsigned int addr;
	const unsigned int nwords = CIM_IBQ_SIZE * 4;

	if (qid > 5 || (n & 3))
		return -EINVAL;

	addr = qid * nwords;
	if (n > nwords)
		n = nwords;

	/* It might take 3-10ms before the IBQ debug read access is allowed.
	 * Wait for 1 second with a delay of 1 usec.
	 */
	attempts = 1000000;

	for (i = 0; i < n; i++, addr++) {
		t4_write_reg(adap, A_CIM_IBQ_DBG_CFG, V_IBQDBGADDR(addr) |
			     F_IBQDBGEN);
		err = t4_wait_op_done(adap, A_CIM_IBQ_DBG_CFG, F_IBQDBGBUSY, 0,
				      attempts, 1);
		if (err)
			return err;
		*data++ = t4_read_reg(adap, A_CIM_IBQ_DBG_DATA);
	}
	t4_write_reg(adap, A_CIM_IBQ_DBG_CFG, 0);
	return i;
}

/**
 * t4_read_cim_obq - read the contents of a CIM outbound queue
 * @adap: the adapter
 * @qid: the queue index
 * @data: where to store the queue contents
 * @n: capacity of @data in 32-bit words
 *
 * Reads the contents of the selected CIM queue starting at address 0 up
 * to the capacity of @data. @n must be a multiple of 4. Returns < 0 on
 * error and the number of 32-bit words actually read on success.
 */
int t4_read_cim_obq(struct adapter *adap, unsigned int qid, u32 *data,
		    size_t n)
{
	int i, err;
	unsigned int addr, v, nwords;
	int cim_num_obq = adap->chip_params->cim_num_obq;

	if ((qid > (cim_num_obq - 1)) || (n & 3))
		return -EINVAL;

	t4_write_reg(adap, A_CIM_QUEUE_CONFIG_REF, F_OBQSELECT |
		     V_QUENUMSELECT(qid));
	v = t4_read_reg(adap, A_CIM_QUEUE_CONFIG_CTRL);

	addr = G_CIMQBASE(v) * 64;	/* multiple of 256 -> multiple of 4 */
	nwords = G_CIMQSIZE(v) * 64;	/* same */
	if (n > nwords)
		n = nwords;

	for (i = 0; i < n; i++, addr++) {
		t4_write_reg(adap, A_CIM_OBQ_DBG_CFG, V_OBQDBGADDR(addr) |
			     F_OBQDBGEN);
		err = t4_wait_op_done(adap, A_CIM_OBQ_DBG_CFG, F_OBQDBGBUSY, 0,
				      2, 1);
		if (err)
			return err;
		*data++ = t4_read_reg(adap, A_CIM_OBQ_DBG_DATA);
	}
	t4_write_reg(adap, A_CIM_OBQ_DBG_CFG, 0);
	return i;
}

enum {
	CIM_QCTL_BASE     = 0,
	CIM_CTL_BASE      = 0x2000,
	CIM_PBT_ADDR_BASE = 0x2800,
	CIM_PBT_LRF_BASE  = 0x3000,
	CIM_PBT_DATA_BASE = 0x3800
};

/**
 * t4_cim_read - read a block from CIM internal address space
 * @adap: the adapter
 * @addr: the start address within the CIM address space
 * @n: number of words to read
 * @valp: where to store the result
 *
 * Reads a block of 4-byte words from the CIM internal address space.
 */
int t4_cim_read(struct adapter *adap, unsigned int addr, unsigned int n,
		unsigned int *valp)
{
	int ret = 0;

	if (t4_read_reg(adap, A_CIM_HOST_ACC_CTRL) & F_HOSTBUSY)
		return -EBUSY;

	for ( ; !ret && n--; addr += 4) {
		t4_write_reg(adap, A_CIM_HOST_ACC_CTRL, addr);
		ret = t4_wait_op_done(adap, A_CIM_HOST_ACC_CTRL, F_HOSTBUSY,
				      0, 5, 2);
		if (!ret)
			*valp++ = t4_read_reg(adap, A_CIM_HOST_ACC_DATA);
	}
	return ret;
}

/**
 * t4_cim_write - write a block into CIM internal address space
 * @adap: the adapter
 * @addr: the start address within the CIM address space
 * @n: number of words to write
 * @valp: set of values to write
 *
 * Writes a block of 4-byte words into the CIM internal address space.
*/ int t4_cim_write(struct adapter *adap, unsigned int addr, unsigned int n, const unsigned int *valp) { int ret = 0; if (t4_read_reg(adap, A_CIM_HOST_ACC_CTRL) & F_HOSTBUSY) return -EBUSY; for ( ; !ret && n--; addr += 4) { t4_write_reg(adap, A_CIM_HOST_ACC_DATA, *valp++); t4_write_reg(adap, A_CIM_HOST_ACC_CTRL, addr | F_HOSTWRITE); ret = t4_wait_op_done(adap, A_CIM_HOST_ACC_CTRL, F_HOSTBUSY, 0, 5, 2); } return ret; } static int t4_cim_write1(struct adapter *adap, unsigned int addr, unsigned int val) { return t4_cim_write(adap, addr, 1, &val); } /** * t4_cim_ctl_read - read a block from CIM control region * @adap: the adapter * @addr: the start address within the CIM control region * @n: number of words to read * @valp: where to store the result * * Reads a block of 4-byte words from the CIM control region. */ int t4_cim_ctl_read(struct adapter *adap, unsigned int addr, unsigned int n, unsigned int *valp) { return t4_cim_read(adap, addr + CIM_CTL_BASE, n, valp); } /** * t4_cim_read_la - read CIM LA capture buffer * @adap: the adapter * @la_buf: where to store the LA data * @wrptr: the HW write pointer within the capture buffer * * Reads the contents of the CIM LA buffer with the most recent entry at * the end of the returned data and with the entry at @wrptr first. * We try to leave the LA in the running state we find it in. */ int t4_cim_read_la(struct adapter *adap, u32 *la_buf, unsigned int *wrptr) { int i, ret; unsigned int cfg, val, idx; ret = t4_cim_read(adap, A_UP_UP_DBG_LA_CFG, 1, &cfg); if (ret) return ret; if (cfg & F_UPDBGLAEN) { /* LA is running, freeze it */ ret = t4_cim_write1(adap, A_UP_UP_DBG_LA_CFG, 0); if (ret) return ret; } ret = t4_cim_read(adap, A_UP_UP_DBG_LA_CFG, 1, &val); if (ret) goto restart; idx = G_UPDBGLAWRPTR(val); if (wrptr) *wrptr = idx; for (i = 0; i < adap->params.cim_la_size; i++) { ret = t4_cim_write1(adap, A_UP_UP_DBG_LA_CFG, V_UPDBGLARDPTR(idx) | F_UPDBGLARDEN); if (ret) break; ret = t4_cim_read(adap, A_UP_UP_DBG_LA_CFG, 1, &val); if (ret) break; if (val & F_UPDBGLARDEN) { ret = -ETIMEDOUT; break; } ret = t4_cim_read(adap, A_UP_UP_DBG_LA_DATA, 1, &la_buf[i]); if (ret) break; /* address can't exceed 0xfff (UpDbgLaRdPtr is of 12-bits) */ idx = (idx + 1) & M_UPDBGLARDPTR; /* * Bits 0-3 of UpDbgLaRdPtr can be between 0000 to 1001 to * identify the 32-bit portion of the full 312-bit data */ if (is_t6(adap)) while ((idx & 0xf) > 9) idx = (idx + 1) % M_UPDBGLARDPTR; } restart: if (cfg & F_UPDBGLAEN) { int r = t4_cim_write1(adap, A_UP_UP_DBG_LA_CFG, cfg & ~F_UPDBGLARDEN); if (!ret) ret = r; } return ret; } /** * t4_tp_read_la - read TP LA capture buffer * @adap: the adapter * @la_buf: where to store the LA data * @wrptr: the HW write pointer within the capture buffer * * Reads the contents of the TP LA buffer with the most recent entry at * the end of the returned data and with the entry at @wrptr first. * We leave the LA in the running state we find it in. 
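 *
 * @la_buf is expected to have room for TPLA_SIZE 64-bit entries, e.g.
 * (hypothetical caller, "sc" assumed to be a valid struct adapter):
 *
 *	u64 la[TPLA_SIZE];
 *	unsigned int wptr;
 *	t4_tp_read_la(sc, la, &wptr);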
*/ void t4_tp_read_la(struct adapter *adap, u64 *la_buf, unsigned int *wrptr) { bool last_incomplete; unsigned int i, cfg, val, idx; cfg = t4_read_reg(adap, A_TP_DBG_LA_CONFIG) & 0xffff; if (cfg & F_DBGLAENABLE) /* freeze LA */ t4_write_reg(adap, A_TP_DBG_LA_CONFIG, adap->params.tp.la_mask | (cfg ^ F_DBGLAENABLE)); val = t4_read_reg(adap, A_TP_DBG_LA_CONFIG); idx = G_DBGLAWPTR(val); last_incomplete = G_DBGLAMODE(val) >= 2 && (val & F_DBGLAWHLF) == 0; if (last_incomplete) idx = (idx + 1) & M_DBGLARPTR; if (wrptr) *wrptr = idx; val &= 0xffff; val &= ~V_DBGLARPTR(M_DBGLARPTR); val |= adap->params.tp.la_mask; for (i = 0; i < TPLA_SIZE; i++) { t4_write_reg(adap, A_TP_DBG_LA_CONFIG, V_DBGLARPTR(idx) | val); la_buf[i] = t4_read_reg64(adap, A_TP_DBG_LA_DATAL); idx = (idx + 1) & M_DBGLARPTR; } /* Wipe out last entry if it isn't valid */ if (last_incomplete) la_buf[TPLA_SIZE - 1] = ~0ULL; if (cfg & F_DBGLAENABLE) /* restore running state */ t4_write_reg(adap, A_TP_DBG_LA_CONFIG, cfg | adap->params.tp.la_mask); } /* * SGE Hung Ingress DMA Warning Threshold time and Warning Repeat Rate (in * seconds). If we find one of the SGE Ingress DMA State Machines in the same * state for more than the Warning Threshold then we'll issue a warning about * a potential hang. We'll repeat the warning as the SGE Ingress DMA Channel * appears to be hung every Warning Repeat second till the situation clears. * If the situation clears, we'll note that as well. */ #define SGE_IDMA_WARN_THRESH 1 #define SGE_IDMA_WARN_REPEAT 300 /** * t4_idma_monitor_init - initialize SGE Ingress DMA Monitor * @adapter: the adapter * @idma: the adapter IDMA Monitor state * * Initialize the state of an SGE Ingress DMA Monitor. */ void t4_idma_monitor_init(struct adapter *adapter, struct sge_idma_monitor_state *idma) { /* Initialize the state variables for detecting an SGE Ingress DMA * hang. The SGE has internal counters which count up on each clock * tick whenever the SGE finds its Ingress DMA State Engines in the * same state they were on the previous clock tick. The clock used is * the Core Clock so we have a limit on the maximum "time" they can * record; typically a very small number of seconds. For instance, * with a 600MHz Core Clock, we can only count up to a bit more than * 7s. So we'll synthesize a larger counter in order to not run the * risk of having the "timers" overflow and give us the flexibility to * maintain a Hung SGE State Machine of our own which operates across * a longer time frame. */ idma->idma_1s_thresh = core_ticks_per_usec(adapter) * 1000000; /* 1s */ idma->idma_stalled[0] = idma->idma_stalled[1] = 0; } /** * t4_idma_monitor - monitor SGE Ingress DMA state * @adapter: the adapter * @idma: the adapter IDMA Monitor state * @hz: number of ticks/second * @ticks: number of ticks since the last IDMA Monitor call */ void t4_idma_monitor(struct adapter *adapter, struct sge_idma_monitor_state *idma, int hz, int ticks) { int i, idma_same_state_cnt[2]; /* Read the SGE Debug Ingress DMA Same State Count registers. These * are counters inside the SGE which count up on each clock when the * SGE finds its Ingress DMA State Engines in the same states they * were in the previous clock. The counters will peg out at * 0xffffffff without wrapping around so once they pass the 1s * threshold they'll stay above that till the IDMA state changes. 
*/ t4_write_reg(adapter, A_SGE_DEBUG_INDEX, 13); idma_same_state_cnt[0] = t4_read_reg(adapter, A_SGE_DEBUG_DATA_HIGH); idma_same_state_cnt[1] = t4_read_reg(adapter, A_SGE_DEBUG_DATA_LOW); for (i = 0; i < 2; i++) { u32 debug0, debug11; /* If the Ingress DMA Same State Counter ("timer") is less * than 1s, then we can reset our synthesized Stall Timer and * continue. If we have previously emitted warnings about a * potential stalled Ingress Queue, issue a note indicating * that the Ingress Queue has resumed forward progress. */ if (idma_same_state_cnt[i] < idma->idma_1s_thresh) { if (idma->idma_stalled[i] >= SGE_IDMA_WARN_THRESH*hz) CH_WARN(adapter, "SGE idma%d, queue %u, " "resumed after %d seconds\n", i, idma->idma_qid[i], idma->idma_stalled[i]/hz); idma->idma_stalled[i] = 0; continue; } /* Synthesize an SGE Ingress DMA Same State Timer in the Hz * domain. The first time we get here it'll be because we * passed the 1s Threshold; each additional time it'll be * because the RX Timer Callback is being fired on its regular * schedule. * * If the stall is below our Potential Hung Ingress Queue * Warning Threshold, continue. */ if (idma->idma_stalled[i] == 0) { idma->idma_stalled[i] = hz; idma->idma_warn[i] = 0; } else { idma->idma_stalled[i] += ticks; idma->idma_warn[i] -= ticks; } if (idma->idma_stalled[i] < SGE_IDMA_WARN_THRESH*hz) continue; /* We'll issue a warning every SGE_IDMA_WARN_REPEAT seconds. */ if (idma->idma_warn[i] > 0) continue; idma->idma_warn[i] = SGE_IDMA_WARN_REPEAT*hz; /* Read and save the SGE IDMA State and Queue ID information. * We do this every time in case it changes across time ... * can't be too careful ... */ t4_write_reg(adapter, A_SGE_DEBUG_INDEX, 0); debug0 = t4_read_reg(adapter, A_SGE_DEBUG_DATA_LOW); idma->idma_state[i] = (debug0 >> (i * 9)) & 0x3f; t4_write_reg(adapter, A_SGE_DEBUG_INDEX, 11); debug11 = t4_read_reg(adapter, A_SGE_DEBUG_DATA_LOW); idma->idma_qid[i] = (debug11 >> (i * 16)) & 0xffff; CH_WARN(adapter, "SGE idma%u, queue %u, potentially stuck in " " state %u for %d seconds (debug0=%#x, debug11=%#x)\n", i, idma->idma_qid[i], idma->idma_state[i], idma->idma_stalled[i]/hz, debug0, debug11); t4_sge_decode_idma_state(adapter, idma->idma_state[i]); } } /** * t4_read_pace_tbl - read the pace table * @adap: the adapter * @pace_vals: holds the returned values * * Returns the values of TP's pace table in microseconds. */ void t4_read_pace_tbl(struct adapter *adap, unsigned int pace_vals[NTX_SCHED]) { unsigned int i, v; for (i = 0; i < NTX_SCHED; i++) { t4_write_reg(adap, A_TP_PACE_TABLE, 0xffff0000 + i); v = t4_read_reg(adap, A_TP_PACE_TABLE); pace_vals[i] = dack_ticks_to_usec(adap, v); } } /** * t4_get_tx_sched - get the configuration of a Tx HW traffic scheduler * @adap: the adapter * @sched: the scheduler index * @kbps: the byte rate in Kbps * @ipg: the interpacket delay in tenths of nanoseconds * * Return the current configuration of a HW Tx scheduler. 
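 *
 * The rate computation below works in bytes and core-clock ticks:
 * cclk (kHz) * 1000 / cpt gives scheduler intervals per second, and
 * multiplying by bpt bytes per interval then dividing by 125 converts
 * bytes/sec into kbit/sec (1 kbit/sec == 125 bytes/sec).  With purely
 * illustrative values cclk = 200000 (200 MHz), cpt = 100 and bpt = 64,
 * that yields 200000 * 1000 / 100 * 64 / 125 = 1024000 kbps (~1 Gb/s).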
*/ void t4_get_tx_sched(struct adapter *adap, unsigned int sched, unsigned int *kbps, unsigned int *ipg, bool sleep_ok) { unsigned int v, addr, bpt, cpt; if (kbps) { addr = A_TP_TX_MOD_Q1_Q0_RATE_LIMIT - sched / 2; t4_tp_tm_pio_read(adap, &v, 1, addr, sleep_ok); if (sched & 1) v >>= 16; bpt = (v >> 8) & 0xff; cpt = v & 0xff; if (!cpt) *kbps = 0; /* scheduler disabled */ else { v = (adap->params.vpd.cclk * 1000) / cpt; /* ticks/s */ *kbps = (v * bpt) / 125; } } if (ipg) { addr = A_TP_TX_MOD_Q1_Q0_TIMER_SEPARATOR - sched / 2; t4_tp_tm_pio_read(adap, &v, 1, addr, sleep_ok); if (sched & 1) v >>= 16; v &= 0xffff; *ipg = (10000 * v) / core_ticks_per_usec(adap); } } /** * t4_load_cfg - download config file * @adap: the adapter * @cfg_data: the cfg text file to write * @size: text file size * * Write the supplied config text file to the card's serial flash. */ int t4_load_cfg(struct adapter *adap, const u8 *cfg_data, unsigned int size) { int ret, i, n, cfg_addr; unsigned int addr; unsigned int flash_cfg_start_sec; unsigned int sf_sec_size = adap->params.sf_size / adap->params.sf_nsec; cfg_addr = t4_flash_cfg_addr(adap); if (cfg_addr < 0) return cfg_addr; addr = cfg_addr; flash_cfg_start_sec = addr / SF_SEC_SIZE; if (size > FLASH_CFG_MAX_SIZE) { CH_ERR(adap, "cfg file too large, max is %u bytes\n", FLASH_CFG_MAX_SIZE); return -EFBIG; } i = DIV_ROUND_UP(FLASH_CFG_MAX_SIZE, /* # of sectors spanned */ sf_sec_size); ret = t4_flash_erase_sectors(adap, flash_cfg_start_sec, flash_cfg_start_sec + i - 1); /* * If size == 0 then we're simply erasing the FLASH sectors associated * with the on-adapter Firmware Configuration File. */ if (ret || size == 0) goto out; /* this will write to the flash up to SF_PAGE_SIZE at a time */ for (i = 0; i< size; i+= SF_PAGE_SIZE) { if ( (size - i) < SF_PAGE_SIZE) n = size - i; else n = SF_PAGE_SIZE; ret = t4_write_flash(adap, addr, n, cfg_data, 1); if (ret) goto out; addr += SF_PAGE_SIZE; cfg_data += SF_PAGE_SIZE; } out: if (ret) CH_ERR(adap, "config file %s failed %d\n", (size == 0 ? "clear" : "download"), ret); return ret; } /** * t5_fw_init_extern_mem - initialize the external memory * @adap: the adapter * * Initializes the external memory on T5. */ int t5_fw_init_extern_mem(struct adapter *adap) { u32 params[1], val[1]; int ret; if (!is_t5(adap)) return 0; val[0] = 0xff; /* Initialize all MCs */ params[0] = (V_FW_PARAMS_MNEM(FW_PARAMS_MNEM_DEV) | V_FW_PARAMS_PARAM_X(FW_PARAMS_PARAM_DEV_MCINIT)); ret = t4_set_params_timeout(adap, adap->mbox, adap->pf, 0, 1, params, val, FW_CMD_MAX_TIMEOUT); return ret; } /* BIOS boot headers */ typedef struct pci_expansion_rom_header { u8 signature[2]; /* ROM Signature. Should be 0xaa55 */ u8 reserved[22]; /* Reserved per processor Architecture data */ u8 pcir_offset[2]; /* Offset to PCI Data Structure */ } pci_exp_rom_header_t; /* PCI_EXPANSION_ROM_HEADER */ /* Legacy PCI Expansion ROM Header */ typedef struct legacy_pci_expansion_rom_header { u8 signature[2]; /* ROM Signature. Should be 0xaa55 */ u8 size512; /* Current Image Size in units of 512 bytes */ u8 initentry_point[4]; u8 cksum; /* Checksum computed on the entire Image */ u8 reserved[16]; /* Reserved */ u8 pcir_offset[2]; /* Offset to PCI Data Struture */ } legacy_pci_exp_rom_header_t; /* LEGACY_PCI_EXPANSION_ROM_HEADER */ /* EFI PCI Expansion ROM Header */ typedef struct efi_pci_expansion_rom_header { u8 signature[2]; // ROM signature. The value 0xaa55 u8 initialization_size[2]; /* Units 512. Includes this header */ u8 efi_signature[4]; /* Signature from EFI image header. 
0x0EF1 */ u8 efi_subsystem[2]; /* Subsystem value for EFI image header */ u8 efi_machine_type[2]; /* Machine type from EFI image header */ u8 compression_type[2]; /* Compression type. */ /* * Compression type definition * 0x0: uncompressed * 0x1: Compressed * 0x2-0xFFFF: Reserved */ u8 reserved[8]; /* Reserved */ u8 efi_image_header_offset[2]; /* Offset to EFI Image */ u8 pcir_offset[2]; /* Offset to PCI Data Structure */ } efi_pci_exp_rom_header_t; /* EFI PCI Expansion ROM Header */ /* PCI Data Structure Format */ typedef struct pcir_data_structure { /* PCI Data Structure */ u8 signature[4]; /* Signature. The string "PCIR" */ u8 vendor_id[2]; /* Vendor Identification */ u8 device_id[2]; /* Device Identification */ u8 vital_product[2]; /* Pointer to Vital Product Data */ u8 length[2]; /* PCIR Data Structure Length */ u8 revision; /* PCIR Data Structure Revision */ u8 class_code[3]; /* Class Code */ u8 image_length[2]; /* Image Length. Multiple of 512B */ u8 code_revision[2]; /* Revision Level of Code/Data */ u8 code_type; /* Code Type. */ /* * PCI Expansion ROM Code Types * 0x00: Intel IA-32, PC-AT compatible. Legacy * 0x01: Open Firmware standard for PCI. FCODE * 0x02: Hewlett-Packard PA RISC. HP reserved * 0x03: EFI Image. EFI * 0x04-0xFF: Reserved. */ u8 indicator; /* Indicator. Identifies the last image in the ROM */ u8 reserved[2]; /* Reserved */ } pcir_data_t; /* PCI__DATA_STRUCTURE */ /* BOOT constants */ enum { BOOT_FLASH_BOOT_ADDR = 0x0,/* start address of boot image in flash */ BOOT_SIGNATURE = 0xaa55, /* signature of BIOS boot ROM */ BOOT_SIZE_INC = 512, /* image size measured in 512B chunks */ BOOT_MIN_SIZE = sizeof(pci_exp_rom_header_t), /* basic header */ BOOT_MAX_SIZE = 1024*BOOT_SIZE_INC, /* 1 byte * length increment */ VENDOR_ID = 0x1425, /* Vendor ID */ PCIR_SIGNATURE = 0x52494350 /* PCIR signature */ }; /* * modify_device_id - Modifies the device ID of the Boot BIOS image * @adatper: the device ID to write. * @boot_data: the boot image to modify. * * Write the supplied device ID to the boot BIOS image. */ static void modify_device_id(int device_id, u8 *boot_data) { legacy_pci_exp_rom_header_t *header; pcir_data_t *pcir_header; u32 cur_header = 0; /* * Loop through all chained images and change the device ID's */ while (1) { header = (legacy_pci_exp_rom_header_t *) &boot_data[cur_header]; pcir_header = (pcir_data_t *) &boot_data[cur_header + le16_to_cpu(*(u16*)header->pcir_offset)]; /* * Only modify the Device ID if code type is Legacy or HP. * 0x00: Okay to modify * 0x01: FCODE. Do not be modify * 0x03: Okay to modify * 0x04-0xFF: Do not modify */ if (pcir_header->code_type == 0x00) { u8 csum = 0; int i; /* * Modify Device ID to match current adatper */ *(u16*) pcir_header->device_id = device_id; /* * Set checksum temporarily to 0. * We will recalculate it later. */ header->cksum = 0x0; /* * Calculate and update checksum */ for (i = 0; i < (header->size512 * 512); i++) csum += (u8)boot_data[cur_header + i]; /* * Invert summed value to create the checksum * Writing new checksum value directly to the boot data */ boot_data[cur_header + 7] = -csum; } else if (pcir_header->code_type == 0x03) { /* * Modify Device ID to match current adatper */ *(u16*) pcir_header->device_id = device_id; } /* * Check indicator element to identify if this is the last * image in the ROM. */ if (pcir_header->indicator & 0x80) break; /* * Move header pointer up to the next image in the ROM. 
*/ cur_header += header->size512 * 512; } } /* * t4_load_boot - download boot flash * @adapter: the adapter * @boot_data: the boot image to write * @boot_addr: offset in flash to write boot_data * @size: image size * * Write the supplied boot image to the card's serial flash. * The boot image has the following sections: a 28-byte header and the * boot image. */ int t4_load_boot(struct adapter *adap, u8 *boot_data, unsigned int boot_addr, unsigned int size) { pci_exp_rom_header_t *header; int pcir_offset ; pcir_data_t *pcir_header; int ret, addr; uint16_t device_id; unsigned int i; unsigned int boot_sector = (boot_addr * 1024 ); unsigned int sf_sec_size = adap->params.sf_size / adap->params.sf_nsec; /* * Make sure the boot image does not encroach on the firmware region */ if ((boot_sector + size) >> 16 > FLASH_FW_START_SEC) { CH_ERR(adap, "boot image encroaching on firmware region\n"); return -EFBIG; } /* * The boot sector is comprised of the Expansion-ROM boot, iSCSI boot, * and Boot configuration data sections. These 3 boot sections span * sectors 0 to 7 in flash and live right before the FW image location. */ i = DIV_ROUND_UP(size ? size : FLASH_FW_START, sf_sec_size); ret = t4_flash_erase_sectors(adap, boot_sector >> 16, (boot_sector >> 16) + i - 1); /* * If size == 0 then we're simply erasing the FLASH sectors associated * with the on-adapter option ROM file */ if (ret || (size == 0)) goto out; /* Get boot header */ header = (pci_exp_rom_header_t *)boot_data; pcir_offset = le16_to_cpu(*(u16 *)header->pcir_offset); /* PCIR Data Structure */ pcir_header = (pcir_data_t *) &boot_data[pcir_offset]; /* * Perform some primitive sanity testing to avoid accidentally * writing garbage over the boot sectors. We ought to check for * more but it's not worth it for now ... */ if (size < BOOT_MIN_SIZE || size > BOOT_MAX_SIZE) { CH_ERR(adap, "boot image too small/large\n"); return -EFBIG; } #ifndef CHELSIO_T4_DIAGS /* * Check BOOT ROM header signature */ if (le16_to_cpu(*(u16*)header->signature) != BOOT_SIGNATURE ) { CH_ERR(adap, "Boot image missing signature\n"); return -EINVAL; } /* * Check PCI header signature */ if (le32_to_cpu(*(u32*)pcir_header->signature) != PCIR_SIGNATURE) { CH_ERR(adap, "PCI header missing signature\n"); return -EINVAL; } /* * Check Vendor ID matches Chelsio ID */ if (le16_to_cpu(*(u16*)pcir_header->vendor_id) != VENDOR_ID) { CH_ERR(adap, "Vendor ID missing signature\n"); return -EINVAL; } #endif /* * Retrieve adapter's device ID */ t4_os_pci_read_cfg2(adap, PCI_DEVICE_ID, &device_id); /* Want to deal with PF 0 so I strip off PF 4 indicator */ device_id = device_id & 0xf0ff; /* * Check PCIE Device ID */ if (le16_to_cpu(*(u16*)pcir_header->device_id) != device_id) { /* * Change the device ID in the Boot BIOS image to match * the Device ID of the current adapter. */ modify_device_id(device_id, boot_data); } /* * Skip over the first SF_PAGE_SIZE worth of data and write it after * we finish copying the rest of the boot image. This will ensure * that the BIOS boot header will only be written if the boot image * was written in full. 
*/ addr = boot_sector; for (size -= SF_PAGE_SIZE; size; size -= SF_PAGE_SIZE) { addr += SF_PAGE_SIZE; boot_data += SF_PAGE_SIZE; ret = t4_write_flash(adap, addr, SF_PAGE_SIZE, boot_data, 0); if (ret) goto out; } ret = t4_write_flash(adap, boot_sector, SF_PAGE_SIZE, (const u8 *)header, 0); out: if (ret) CH_ERR(adap, "boot image download failed, error %d\n", ret); return ret; } /* * t4_flash_bootcfg_addr - return the address of the flash optionrom configuration * @adapter: the adapter * * Return the address within the flash where the OptionROM Configuration * is stored, or an error if the device FLASH is too small to contain * a OptionROM Configuration. */ static int t4_flash_bootcfg_addr(struct adapter *adapter) { /* * If the device FLASH isn't large enough to hold a Firmware * Configuration File, return an error. */ if (adapter->params.sf_size < FLASH_BOOTCFG_START + FLASH_BOOTCFG_MAX_SIZE) return -ENOSPC; return FLASH_BOOTCFG_START; } int t4_load_bootcfg(struct adapter *adap,const u8 *cfg_data, unsigned int size) { int ret, i, n, cfg_addr; unsigned int addr; unsigned int flash_cfg_start_sec; unsigned int sf_sec_size = adap->params.sf_size / adap->params.sf_nsec; cfg_addr = t4_flash_bootcfg_addr(adap); if (cfg_addr < 0) return cfg_addr; addr = cfg_addr; flash_cfg_start_sec = addr / SF_SEC_SIZE; if (size > FLASH_BOOTCFG_MAX_SIZE) { CH_ERR(adap, "bootcfg file too large, max is %u bytes\n", FLASH_BOOTCFG_MAX_SIZE); return -EFBIG; } i = DIV_ROUND_UP(FLASH_BOOTCFG_MAX_SIZE,/* # of sectors spanned */ sf_sec_size); ret = t4_flash_erase_sectors(adap, flash_cfg_start_sec, flash_cfg_start_sec + i - 1); /* * If size == 0 then we're simply erasing the FLASH sectors associated * with the on-adapter OptionROM Configuration File. */ if (ret || size == 0) goto out; /* this will write to the flash up to SF_PAGE_SIZE at a time */ for (i = 0; i< size; i+= SF_PAGE_SIZE) { if ( (size - i) < SF_PAGE_SIZE) n = size - i; else n = SF_PAGE_SIZE; ret = t4_write_flash(adap, addr, n, cfg_data, 0); if (ret) goto out; addr += SF_PAGE_SIZE; cfg_data += SF_PAGE_SIZE; } out: if (ret) CH_ERR(adap, "boot config data %s failed %d\n", (size == 0 ? "clear" : "download"), ret); return ret; } /** * t4_set_filter_mode - configure the optional components of filter tuples * @adap: the adapter * @mode_map: a bitmap selcting which optional filter components to enable * @sleep_ok: if true we may sleep while awaiting command completion * * Sets the filter mode by selecting the optional components to enable * in filter tuples. Returns 0 on success and a negative error if the * requested mode needs more bits than are available for optional * components. */ int t4_set_filter_mode(struct adapter *adap, unsigned int mode_map, bool sleep_ok) { static u8 width[] = { 1, 3, 17, 17, 8, 8, 16, 9, 3, 1 }; int i, nbits = 0; for (i = S_FCOE; i <= S_FRAGMENTATION; i++) if (mode_map & (1 << i)) nbits += width[i]; if (nbits > FILTER_OPT_LEN) return -EINVAL; t4_tp_pio_write(adap, &mode_map, 1, A_TP_VLAN_PRI_MAP, sleep_ok); read_filter_mode_and_ingress_config(adap, sleep_ok); return 0; } /** * t4_clr_port_stats - clear port statistics * @adap: the adapter * @idx: the port index * * Clear HW statistics for the given port. 
*/ void t4_clr_port_stats(struct adapter *adap, int idx) { unsigned int i; u32 bgmap = t4_get_mps_bg_map(adap, idx); u32 port_base_addr; if (is_t4(adap)) port_base_addr = PORT_BASE(idx); else port_base_addr = T5_PORT_BASE(idx); for (i = A_MPS_PORT_STAT_TX_PORT_BYTES_L; i <= A_MPS_PORT_STAT_TX_PORT_PPP7_H; i += 8) t4_write_reg(adap, port_base_addr + i, 0); for (i = A_MPS_PORT_STAT_RX_PORT_BYTES_L; i <= A_MPS_PORT_STAT_RX_PORT_LESS_64B_H; i += 8) t4_write_reg(adap, port_base_addr + i, 0); for (i = 0; i < 4; i++) if (bgmap & (1 << i)) { t4_write_reg(adap, A_MPS_STAT_RX_BG_0_MAC_DROP_FRAME_L + i * 8, 0); t4_write_reg(adap, A_MPS_STAT_RX_BG_0_MAC_TRUNC_FRAME_L + i * 8, 0); } } /** * t4_i2c_rd - read I2C data from adapter * @adap: the adapter * @port: Port number if per-port device; <0 if not * @devid: per-port device ID or absolute device ID * @offset: byte offset into device I2C space * @len: byte length of I2C space data * @buf: buffer in which to return I2C data * * Reads the I2C data from the indicated device and location. */ int t4_i2c_rd(struct adapter *adap, unsigned int mbox, int port, unsigned int devid, unsigned int offset, unsigned int len, u8 *buf) { u32 ldst_addrspace; struct fw_ldst_cmd ldst; int ret; if (port >= 4 || devid >= 256 || offset >= 256 || len > sizeof ldst.u.i2c.data) return -EINVAL; memset(&ldst, 0, sizeof ldst); ldst_addrspace = V_FW_LDST_CMD_ADDRSPACE(FW_LDST_ADDRSPC_I2C); ldst.op_to_addrspace = cpu_to_be32(V_FW_CMD_OP(FW_LDST_CMD) | F_FW_CMD_REQUEST | F_FW_CMD_READ | ldst_addrspace); ldst.cycles_to_len16 = cpu_to_be32(FW_LEN16(ldst)); ldst.u.i2c.pid = (port < 0 ? 0xff : port); ldst.u.i2c.did = devid; ldst.u.i2c.boffset = offset; ldst.u.i2c.blen = len; ret = t4_wr_mbox(adap, mbox, &ldst, sizeof ldst, &ldst); if (!ret) memcpy(buf, ldst.u.i2c.data, len); return ret; } /** * t4_i2c_wr - write I2C data to adapter * @adap: the adapter * @port: Port number if per-port device; <0 if not * @devid: per-port device ID or absolute device ID * @offset: byte offset into device I2C space * @len: byte length of I2C space data * @buf: buffer containing new I2C data * * Write the I2C data to the indicated device and location. */ int t4_i2c_wr(struct adapter *adap, unsigned int mbox, int port, unsigned int devid, unsigned int offset, unsigned int len, u8 *buf) { u32 ldst_addrspace; struct fw_ldst_cmd ldst; if (port >= 4 || devid >= 256 || offset >= 256 || len > sizeof ldst.u.i2c.data) return -EINVAL; memset(&ldst, 0, sizeof ldst); ldst_addrspace = V_FW_LDST_CMD_ADDRSPACE(FW_LDST_ADDRSPC_I2C); ldst.op_to_addrspace = cpu_to_be32(V_FW_CMD_OP(FW_LDST_CMD) | F_FW_CMD_REQUEST | F_FW_CMD_WRITE | ldst_addrspace); ldst.cycles_to_len16 = cpu_to_be32(FW_LEN16(ldst)); ldst.u.i2c.pid = (port < 0 ? 0xff : port); ldst.u.i2c.did = devid; ldst.u.i2c.boffset = offset; ldst.u.i2c.blen = len; memcpy(ldst.u.i2c.data, buf, len); return t4_wr_mbox(adap, mbox, &ldst, sizeof ldst, &ldst); } /** * t4_sge_ctxt_rd - read an SGE context through FW * @adap: the adapter * @mbox: mailbox to use for the FW command * @cid: the context id * @ctype: the context type * @data: where to store the context data * * Issues a FW command through the given mailbox to read an SGE context. 
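 *
 * @data must have room for at least 6 32-bit words.  A hypothetical
 * caller (with "cid" an existing egress context id) might do:
 *
 *	u32 ctxt[6];
 *	int rc = t4_sge_ctxt_rd(sc, sc->mbox, cid, CTXT_EGRESS, ctxt);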
*/ int t4_sge_ctxt_rd(struct adapter *adap, unsigned int mbox, unsigned int cid, enum ctxt_type ctype, u32 *data) { int ret; struct fw_ldst_cmd c; if (ctype == CTXT_EGRESS) ret = FW_LDST_ADDRSPC_SGE_EGRC; else if (ctype == CTXT_INGRESS) ret = FW_LDST_ADDRSPC_SGE_INGC; else if (ctype == CTXT_FLM) ret = FW_LDST_ADDRSPC_SGE_FLMC; else ret = FW_LDST_ADDRSPC_SGE_CONMC; memset(&c, 0, sizeof(c)); c.op_to_addrspace = cpu_to_be32(V_FW_CMD_OP(FW_LDST_CMD) | F_FW_CMD_REQUEST | F_FW_CMD_READ | V_FW_LDST_CMD_ADDRSPACE(ret)); c.cycles_to_len16 = cpu_to_be32(FW_LEN16(c)); c.u.idctxt.physid = cpu_to_be32(cid); ret = t4_wr_mbox(adap, mbox, &c, sizeof(c), &c); if (ret == 0) { data[0] = be32_to_cpu(c.u.idctxt.ctxt_data0); data[1] = be32_to_cpu(c.u.idctxt.ctxt_data1); data[2] = be32_to_cpu(c.u.idctxt.ctxt_data2); data[3] = be32_to_cpu(c.u.idctxt.ctxt_data3); data[4] = be32_to_cpu(c.u.idctxt.ctxt_data4); data[5] = be32_to_cpu(c.u.idctxt.ctxt_data5); } return ret; } /** * t4_sge_ctxt_rd_bd - read an SGE context bypassing FW * @adap: the adapter * @cid: the context id * @ctype: the context type * @data: where to store the context data * * Reads an SGE context directly, bypassing FW. This is only for * debugging when FW is unavailable. */ int t4_sge_ctxt_rd_bd(struct adapter *adap, unsigned int cid, enum ctxt_type ctype, u32 *data) { int i, ret; t4_write_reg(adap, A_SGE_CTXT_CMD, V_CTXTQID(cid) | V_CTXTTYPE(ctype)); ret = t4_wait_op_done(adap, A_SGE_CTXT_CMD, F_BUSY, 0, 3, 1); if (!ret) for (i = A_SGE_CTXT_DATA0; i <= A_SGE_CTXT_DATA5; i += 4) *data++ = t4_read_reg(adap, i); return ret; } int t4_sched_config(struct adapter *adapter, int type, int minmaxen, int sleep_ok) { struct fw_sched_cmd cmd; memset(&cmd, 0, sizeof(cmd)); cmd.op_to_write = cpu_to_be32(V_FW_CMD_OP(FW_SCHED_CMD) | F_FW_CMD_REQUEST | F_FW_CMD_WRITE); cmd.retval_len16 = cpu_to_be32(FW_LEN16(cmd)); cmd.u.config.sc = FW_SCHED_SC_CONFIG; cmd.u.config.type = type; cmd.u.config.minmaxen = minmaxen; return t4_wr_mbox_meat(adapter,adapter->mbox, &cmd, sizeof(cmd), NULL, sleep_ok); } int t4_sched_params(struct adapter *adapter, int type, int level, int mode, int rateunit, int ratemode, int channel, int cl, int minrate, int maxrate, int weight, int pktsize, int sleep_ok) { struct fw_sched_cmd cmd; memset(&cmd, 0, sizeof(cmd)); cmd.op_to_write = cpu_to_be32(V_FW_CMD_OP(FW_SCHED_CMD) | F_FW_CMD_REQUEST | F_FW_CMD_WRITE); cmd.retval_len16 = cpu_to_be32(FW_LEN16(cmd)); cmd.u.params.sc = FW_SCHED_SC_PARAMS; cmd.u.params.type = type; cmd.u.params.level = level; cmd.u.params.mode = mode; cmd.u.params.ch = channel; cmd.u.params.cl = cl; cmd.u.params.unit = rateunit; cmd.u.params.rate = ratemode; cmd.u.params.min = cpu_to_be32(minrate); cmd.u.params.max = cpu_to_be32(maxrate); cmd.u.params.weight = cpu_to_be16(weight); cmd.u.params.pktsize = cpu_to_be16(pktsize); return t4_wr_mbox_meat(adapter,adapter->mbox, &cmd, sizeof(cmd), NULL, sleep_ok); } int t4_sched_params_ch_rl(struct adapter *adapter, int channel, int ratemode, unsigned int maxrate, int sleep_ok) { struct fw_sched_cmd cmd; memset(&cmd, 0, sizeof(cmd)); cmd.op_to_write = cpu_to_be32(V_FW_CMD_OP(FW_SCHED_CMD) | F_FW_CMD_REQUEST | F_FW_CMD_WRITE); cmd.retval_len16 = cpu_to_be32(FW_LEN16(cmd)); cmd.u.params.sc = FW_SCHED_SC_PARAMS; cmd.u.params.type = FW_SCHED_TYPE_PKTSCHED; cmd.u.params.level = FW_SCHED_PARAMS_LEVEL_CH_RL; cmd.u.params.ch = channel; cmd.u.params.rate = ratemode; /* REL or ABS */ cmd.u.params.max = cpu_to_be32(maxrate);/* % or kbps */ return t4_wr_mbox_meat(adapter,adapter->mbox, &cmd, 
sizeof(cmd), NULL, sleep_ok); } int t4_sched_params_cl_wrr(struct adapter *adapter, int channel, int cl, int weight, int sleep_ok) { struct fw_sched_cmd cmd; if (weight < 0 || weight > 100) return -EINVAL; memset(&cmd, 0, sizeof(cmd)); cmd.op_to_write = cpu_to_be32(V_FW_CMD_OP(FW_SCHED_CMD) | F_FW_CMD_REQUEST | F_FW_CMD_WRITE); cmd.retval_len16 = cpu_to_be32(FW_LEN16(cmd)); cmd.u.params.sc = FW_SCHED_SC_PARAMS; cmd.u.params.type = FW_SCHED_TYPE_PKTSCHED; cmd.u.params.level = FW_SCHED_PARAMS_LEVEL_CL_WRR; cmd.u.params.ch = channel; cmd.u.params.cl = cl; cmd.u.params.weight = cpu_to_be16(weight); return t4_wr_mbox_meat(adapter,adapter->mbox, &cmd, sizeof(cmd), NULL, sleep_ok); } int t4_sched_params_cl_rl_kbps(struct adapter *adapter, int channel, int cl, int mode, unsigned int maxrate, int pktsize, int sleep_ok) { struct fw_sched_cmd cmd; memset(&cmd, 0, sizeof(cmd)); cmd.op_to_write = cpu_to_be32(V_FW_CMD_OP(FW_SCHED_CMD) | F_FW_CMD_REQUEST | F_FW_CMD_WRITE); cmd.retval_len16 = cpu_to_be32(FW_LEN16(cmd)); cmd.u.params.sc = FW_SCHED_SC_PARAMS; cmd.u.params.type = FW_SCHED_TYPE_PKTSCHED; cmd.u.params.level = FW_SCHED_PARAMS_LEVEL_CL_RL; cmd.u.params.mode = mode; cmd.u.params.ch = channel; cmd.u.params.cl = cl; cmd.u.params.unit = FW_SCHED_PARAMS_UNIT_BITRATE; cmd.u.params.rate = FW_SCHED_PARAMS_RATE_ABS; cmd.u.params.max = cpu_to_be32(maxrate); cmd.u.params.pktsize = cpu_to_be16(pktsize); return t4_wr_mbox_meat(adapter,adapter->mbox, &cmd, sizeof(cmd), NULL, sleep_ok); } /* * t4_config_watchdog - configure (enable/disable) a watchdog timer * @adapter: the adapter * @mbox: mailbox to use for the FW command * @pf: the PF owning the queue * @vf: the VF owning the queue * @timeout: watchdog timeout in ms * @action: watchdog timer / action * * There are separate watchdog timers for each possible watchdog * action. Configure one of the watchdog timers by setting a non-zero * timeout. Disable a watchdog timer by using a timeout of zero. */ int t4_config_watchdog(struct adapter *adapter, unsigned int mbox, unsigned int pf, unsigned int vf, unsigned int timeout, unsigned int action) { struct fw_watchdog_cmd wdog; unsigned int ticks; /* * The watchdog command expects a timeout in units of 10ms so we need * to convert it here (via rounding) and force a minimum of one 10ms * "tick" if the timeout is non-zero but the conversion results in 0 * ticks. 
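	 * For example, a 25ms request becomes (25 + 5) / 10 = 3 ticks (30ms),
	 * while a 4ms request rounds down to 0 and is then forced up to the
	 * 1-tick (10ms) minimum.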
*/ ticks = (timeout + 5)/10; if (timeout && !ticks) ticks = 1; memset(&wdog, 0, sizeof wdog); wdog.op_to_vfn = cpu_to_be32(V_FW_CMD_OP(FW_WATCHDOG_CMD) | F_FW_CMD_REQUEST | F_FW_CMD_WRITE | V_FW_PARAMS_CMD_PFN(pf) | V_FW_PARAMS_CMD_VFN(vf)); wdog.retval_len16 = cpu_to_be32(FW_LEN16(wdog)); wdog.timeout = cpu_to_be32(ticks); wdog.action = cpu_to_be32(action); return t4_wr_mbox(adapter, mbox, &wdog, sizeof wdog, NULL); } int t4_get_devlog_level(struct adapter *adapter, unsigned int *level) { struct fw_devlog_cmd devlog_cmd; int ret; memset(&devlog_cmd, 0, sizeof(devlog_cmd)); devlog_cmd.op_to_write = cpu_to_be32(V_FW_CMD_OP(FW_DEVLOG_CMD) | F_FW_CMD_REQUEST | F_FW_CMD_READ); devlog_cmd.retval_len16 = cpu_to_be32(FW_LEN16(devlog_cmd)); ret = t4_wr_mbox(adapter, adapter->mbox, &devlog_cmd, sizeof(devlog_cmd), &devlog_cmd); if (ret) return ret; *level = devlog_cmd.level; return 0; } int t4_set_devlog_level(struct adapter *adapter, unsigned int level) { struct fw_devlog_cmd devlog_cmd; memset(&devlog_cmd, 0, sizeof(devlog_cmd)); devlog_cmd.op_to_write = cpu_to_be32(V_FW_CMD_OP(FW_DEVLOG_CMD) | F_FW_CMD_REQUEST | F_FW_CMD_WRITE); devlog_cmd.level = level; devlog_cmd.retval_len16 = cpu_to_be32(FW_LEN16(devlog_cmd)); return t4_wr_mbox(adapter, adapter->mbox, &devlog_cmd, sizeof(devlog_cmd), &devlog_cmd); } Index: projects/runtime-coverage/sys/dev/e1000/if_em.c =================================================================== --- projects/runtime-coverage/sys/dev/e1000/if_em.c (revision 322921) +++ projects/runtime-coverage/sys/dev/e1000/if_em.c (revision 322922) @@ -1,4529 +1,4529 @@ /*- * Copyright (c) 2016 Matt Macy * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. 
*/ /* $FreeBSD$ */ #include "if_em.h" #include #include #define em_mac_min e1000_82547 #define igb_mac_min e1000_82575 /********************************************************************* * Driver version: *********************************************************************/ char em_driver_version[] = "7.6.1-k"; /********************************************************************* * PCI Device ID Table * * Used by probe to select devices to load on * Last field stores an index into e1000_strings * Last entry must be all 0s * * { Vendor ID, Device ID, SubVendor ID, SubDevice ID, String Index } *********************************************************************/ static pci_vendor_info_t em_vendor_info_array[] = { /* Intel(R) PRO/1000 Network Connection - Legacy em*/ PVID(0x8086, E1000_DEV_ID_82540EM, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_82540EM_LOM, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_82540EP, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_82540EP_LOM, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_82540EP_LP, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_82541EI, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_82541ER, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_82541ER_LOM, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_82541EI_MOBILE, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_82541GI, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_82541GI_LF, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_82541GI_MOBILE, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_82542, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_82543GC_FIBER, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_82543GC_COPPER, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_82544EI_COPPER, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_82544EI_FIBER, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_82544GC_COPPER, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_82544GC_LOM, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_82545EM_COPPER, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_82545EM_FIBER, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_82545GM_COPPER, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_82545GM_FIBER, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_82545GM_SERDES, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_82546EB_COPPER, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_82546EB_FIBER, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_82546EB_QUAD_COPPER, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_82546GB_COPPER, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_82546GB_FIBER, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_82546GB_SERDES, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_82546GB_PCIE, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_82546GB_QUAD_COPPER, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_82546GB_QUAD_COPPER_KSP3, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_82547EI, "Intel(R) PRO/1000 Network Connection"), 
PVID(0x8086, E1000_DEV_ID_82547EI_MOBILE, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_82547GI, "Intel(R) PRO/1000 Network Connection"), /* Intel(R) PRO/1000 Network Connection - em */ PVID(0x8086, E1000_DEV_ID_82571EB_COPPER, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_82571EB_FIBER, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_82571EB_SERDES, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_82571EB_SERDES_DUAL, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_82571EB_SERDES_QUAD, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_82571EB_QUAD_COPPER, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_82571EB_QUAD_COPPER_LP, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_82571EB_QUAD_FIBER, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_82571PT_QUAD_COPPER, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_82572EI, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_82572EI_COPPER, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_82572EI_FIBER, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_82572EI_SERDES, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_82573E, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_82573E_IAMT, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_82573L, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_82583V, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_80003ES2LAN_COPPER_SPT, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_80003ES2LAN_SERDES_SPT, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_80003ES2LAN_COPPER_DPT, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_80003ES2LAN_SERDES_DPT, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_ICH8_IGP_M_AMT, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_ICH8_IGP_AMT, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_ICH8_IGP_C, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_ICH8_IFE, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_ICH8_IFE_GT, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_ICH8_IFE_G, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_ICH8_IGP_M, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_ICH8_82567V_3, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_ICH9_IGP_M_AMT, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_ICH9_IGP_AMT, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_ICH9_IGP_C, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_ICH9_IGP_M, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_ICH9_IGP_M_V, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_ICH9_IFE, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_ICH9_IFE_GT, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_ICH9_IFE_G, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_ICH9_BM, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_82574L, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_82574LA, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_ICH10_R_BM_LM, "Intel(R) PRO/1000 Network 
Connection"), PVID(0x8086, E1000_DEV_ID_ICH10_R_BM_LF, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_ICH10_R_BM_V, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_ICH10_D_BM_LM, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_ICH10_D_BM_LF, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_ICH10_D_BM_V, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_PCH_M_HV_LM, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_PCH_M_HV_LC, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_PCH_D_HV_DM, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_PCH_D_HV_DC, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_PCH2_LV_LM, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_PCH2_LV_V, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_PCH_LPT_I217_LM, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_PCH_LPT_I217_V, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_PCH_LPTLP_I218_LM, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_PCH_LPTLP_I218_V, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_PCH_I218_LM2, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_PCH_I218_V2, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_PCH_I218_LM3, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_PCH_I218_V3, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_PCH_SPT_I219_LM, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_PCH_SPT_I219_V, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_PCH_SPT_I219_LM2, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_PCH_SPT_I219_V2, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_PCH_LBG_I219_LM3, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_PCH_SPT_I219_LM4, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_PCH_SPT_I219_V4, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_PCH_SPT_I219_LM5, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_PCH_SPT_I219_V5, "Intel(R) PRO/1000 Network Connection"), /* required last entry */ PVID_END }; static pci_vendor_info_t igb_vendor_info_array[] = { /* Intel(R) PRO/1000 Network Connection - igb */ PVID(0x8086, E1000_DEV_ID_82575EB_COPPER, "Intel(R) PRO/1000 PCI-Express Network Driver"), PVID(0x8086, E1000_DEV_ID_82575EB_FIBER_SERDES, "Intel(R) PRO/1000 PCI-Express Network Driver"), PVID(0x8086, E1000_DEV_ID_82575GB_QUAD_COPPER, "Intel(R) PRO/1000 PCI-Express Network Driver"), PVID(0x8086, E1000_DEV_ID_82576, "Intel(R) PRO/1000 PCI-Express Network Driver"), PVID(0x8086, E1000_DEV_ID_82576_NS, "Intel(R) PRO/1000 PCI-Express Network Driver"), PVID(0x8086, E1000_DEV_ID_82576_NS_SERDES, "Intel(R) PRO/1000 PCI-Express Network Driver"), PVID(0x8086, E1000_DEV_ID_82576_FIBER, "Intel(R) PRO/1000 PCI-Express Network Driver"), PVID(0x8086, E1000_DEV_ID_82576_SERDES, "Intel(R) PRO/1000 PCI-Express Network Driver"), PVID(0x8086, E1000_DEV_ID_82576_SERDES_QUAD, "Intel(R) PRO/1000 PCI-Express Network Driver"), PVID(0x8086, E1000_DEV_ID_82576_QUAD_COPPER, "Intel(R) PRO/1000 PCI-Express Network Driver"), PVID(0x8086, E1000_DEV_ID_82576_QUAD_COPPER_ET2, "Intel(R) PRO/1000 PCI-Express Network Driver"), PVID(0x8086, E1000_DEV_ID_82576_VF, "Intel(R) PRO/1000 PCI-Express Network Driver"), 
PVID(0x8086, E1000_DEV_ID_82580_COPPER, "Intel(R) PRO/1000 PCI-Express Network Driver"), PVID(0x8086, E1000_DEV_ID_82580_FIBER, "Intel(R) PRO/1000 PCI-Express Network Driver"), PVID(0x8086, E1000_DEV_ID_82580_SERDES, "Intel(R) PRO/1000 PCI-Express Network Driver"), PVID(0x8086, E1000_DEV_ID_82580_SGMII, "Intel(R) PRO/1000 PCI-Express Network Driver"), PVID(0x8086, E1000_DEV_ID_82580_COPPER_DUAL, "Intel(R) PRO/1000 PCI-Express Network Driver"), PVID(0x8086, E1000_DEV_ID_82580_QUAD_FIBER, "Intel(R) PRO/1000 PCI-Express Network Driver"), PVID(0x8086, E1000_DEV_ID_DH89XXCC_SERDES, "Intel(R) PRO/1000 PCI-Express Network Driver"), PVID(0x8086, E1000_DEV_ID_DH89XXCC_SGMII, "Intel(R) PRO/1000 PCI-Express Network Driver"), PVID(0x8086, E1000_DEV_ID_DH89XXCC_SFP, "Intel(R) PRO/1000 PCI-Express Network Driver"), PVID(0x8086, E1000_DEV_ID_DH89XXCC_BACKPLANE, "Intel(R) PRO/1000 PCI-Express Network Driver"), PVID(0x8086, E1000_DEV_ID_I350_COPPER, "Intel(R) PRO/1000 PCI-Express Network Driver"), PVID(0x8086, E1000_DEV_ID_I350_FIBER, "Intel(R) PRO/1000 PCI-Express Network Driver"), PVID(0x8086, E1000_DEV_ID_I350_SERDES, "Intel(R) PRO/1000 PCI-Express Network Driver"), PVID(0x8086, E1000_DEV_ID_I350_SGMII, "Intel(R) PRO/1000 PCI-Express Network Driver"), PVID(0x8086, E1000_DEV_ID_I350_VF, "Intel(R) PRO/1000 PCI-Express Network Driver"), PVID(0x8086, E1000_DEV_ID_I210_COPPER, "Intel(R) PRO/1000 PCI-Express Network Driver"), PVID(0x8086, E1000_DEV_ID_I210_COPPER_IT, "Intel(R) PRO/1000 PCI-Express Network Driver"), PVID(0x8086, E1000_DEV_ID_I210_COPPER_OEM1, "Intel(R) PRO/1000 PCI-Express Network Driver"), PVID(0x8086, E1000_DEV_ID_I210_COPPER_FLASHLESS, "Intel(R) PRO/1000 PCI-Express Network Driver"), PVID(0x8086, E1000_DEV_ID_I210_SERDES_FLASHLESS, "Intel(R) PRO/1000 PCI-Express Network Driver"), PVID(0x8086, E1000_DEV_ID_I210_FIBER, "Intel(R) PRO/1000 PCI-Express Network Driver"), PVID(0x8086, E1000_DEV_ID_I210_SERDES, "Intel(R) PRO/1000 PCI-Express Network Driver"), PVID(0x8086, E1000_DEV_ID_I210_SGMII, "Intel(R) PRO/1000 PCI-Express Network Driver"), PVID(0x8086, E1000_DEV_ID_I211_COPPER, "Intel(R) PRO/1000 PCI-Express Network Driver"), PVID(0x8086, E1000_DEV_ID_I354_BACKPLANE_1GBPS, "Intel(R) PRO/1000 PCI-Express Network Driver"), PVID(0x8086, E1000_DEV_ID_I354_BACKPLANE_2_5GBPS, "Intel(R) PRO/1000 PCI-Express Network Driver"), PVID(0x8086, E1000_DEV_ID_I354_SGMII, "Intel(R) PRO/1000 PCI-Express Network Driver"), /* required last entry */ PVID_END }; /********************************************************************* * Function prototypes *********************************************************************/ static void *em_register(device_t dev); static void *igb_register(device_t dev); static int em_if_attach_pre(if_ctx_t ctx); static int em_if_attach_post(if_ctx_t ctx); static int em_if_detach(if_ctx_t ctx); static int em_if_shutdown(if_ctx_t ctx); static int em_if_suspend(if_ctx_t ctx); static int em_if_resume(if_ctx_t ctx); static int em_if_tx_queues_alloc(if_ctx_t ctx, caddr_t *vaddrs, uint64_t *paddrs, int ntxqs, int ntxqsets); static int em_if_rx_queues_alloc(if_ctx_t ctx, caddr_t *vaddrs, uint64_t *paddrs, int nrxqs, int nrxqsets); static void em_if_queues_free(if_ctx_t ctx); static uint64_t em_if_get_counter(if_ctx_t, ift_counter); static void em_if_init(if_ctx_t ctx); static void em_if_stop(if_ctx_t ctx); static void em_if_media_status(if_ctx_t, struct ifmediareq *); static int em_if_media_change(if_ctx_t ctx); static int em_if_mtu_set(if_ctx_t ctx, uint32_t mtu); static void 
em_if_timer(if_ctx_t ctx, uint16_t qid); static void em_if_vlan_register(if_ctx_t ctx, u16 vtag); static void em_if_vlan_unregister(if_ctx_t ctx, u16 vtag); static void em_identify_hardware(if_ctx_t ctx); static int em_allocate_pci_resources(if_ctx_t ctx); static void em_free_pci_resources(if_ctx_t ctx); static void em_reset(if_ctx_t ctx); static int em_setup_interface(if_ctx_t ctx); static int em_setup_msix(if_ctx_t ctx); static void em_initialize_transmit_unit(if_ctx_t ctx); static void em_initialize_receive_unit(if_ctx_t ctx); static void em_if_enable_intr(if_ctx_t ctx); static void em_if_disable_intr(if_ctx_t ctx); static int em_if_rx_queue_intr_enable(if_ctx_t ctx, uint16_t rxqid); static int em_if_tx_queue_intr_enable(if_ctx_t ctx, uint16_t txqid); static void em_if_multi_set(if_ctx_t ctx); static void em_if_update_admin_status(if_ctx_t ctx); static void em_if_debug(if_ctx_t ctx); static void em_update_stats_counters(struct adapter *); static void em_add_hw_stats(struct adapter *adapter); static int em_if_set_promisc(if_ctx_t ctx, int flags); static void em_setup_vlan_hw_support(struct adapter *); static int em_sysctl_nvm_info(SYSCTL_HANDLER_ARGS); static void em_print_nvm_info(struct adapter *); static int em_sysctl_debug_info(SYSCTL_HANDLER_ARGS); static int em_get_rs(SYSCTL_HANDLER_ARGS); static void em_print_debug_info(struct adapter *); static int em_is_valid_ether_addr(u8 *); static int em_sysctl_int_delay(SYSCTL_HANDLER_ARGS); static void em_add_int_delay_sysctl(struct adapter *, const char *, const char *, struct em_int_delay_info *, int, int); /* Management and WOL Support */ static void em_init_manageability(struct adapter *); static void em_release_manageability(struct adapter *); static void em_get_hw_control(struct adapter *); static void em_release_hw_control(struct adapter *); static void em_get_wakeup(if_ctx_t ctx); static void em_enable_wakeup(if_ctx_t ctx); static int em_enable_phy_wakeup(struct adapter *); static void em_disable_aspm(struct adapter *); int em_intr(void *arg); static void em_disable_promisc(if_ctx_t ctx); /* MSIX handlers */ static int em_if_msix_intr_assign(if_ctx_t, int); static int em_msix_link(void *); static void em_handle_link(void *context); static void em_enable_vectors_82574(if_ctx_t); static int em_set_flowcntl(SYSCTL_HANDLER_ARGS); static int em_sysctl_eee(SYSCTL_HANDLER_ARGS); static void em_if_led_func(if_ctx_t ctx, int onoff); static int em_get_regs(SYSCTL_HANDLER_ARGS); static void lem_smartspeed(struct adapter *adapter); static void igb_configure_queues(struct adapter *adapter); /********************************************************************* * FreeBSD Device Interface Entry Points *********************************************************************/ static device_method_t em_methods[] = { /* Device interface */ DEVMETHOD(device_register, em_register), DEVMETHOD(device_probe, iflib_device_probe), DEVMETHOD(device_attach, iflib_device_attach), DEVMETHOD(device_detach, iflib_device_detach), DEVMETHOD(device_shutdown, iflib_device_shutdown), DEVMETHOD(device_suspend, iflib_device_suspend), DEVMETHOD(device_resume, iflib_device_resume), DEVMETHOD_END }; static device_method_t igb_methods[] = { /* Device interface */ DEVMETHOD(device_register, igb_register), DEVMETHOD(device_probe, iflib_device_probe), DEVMETHOD(device_attach, iflib_device_attach), DEVMETHOD(device_detach, iflib_device_detach), DEVMETHOD(device_shutdown, iflib_device_shutdown), DEVMETHOD(device_suspend, iflib_device_suspend), DEVMETHOD(device_resume, 
iflib_device_resume), DEVMETHOD_END }; static driver_t em_driver = { "em", em_methods, sizeof(struct adapter), }; static devclass_t em_devclass; DRIVER_MODULE(em, pci, em_driver, em_devclass, 0, 0); MODULE_DEPEND(em, pci, 1, 1, 1); MODULE_DEPEND(em, ether, 1, 1, 1); MODULE_DEPEND(em, iflib, 1, 1, 1); static driver_t igb_driver = { "igb", igb_methods, sizeof(struct adapter), }; static devclass_t igb_devclass; DRIVER_MODULE(igb, pci, igb_driver, igb_devclass, 0, 0); MODULE_DEPEND(igb, pci, 1, 1, 1); MODULE_DEPEND(igb, ether, 1, 1, 1); MODULE_DEPEND(igb, iflib, 1, 1, 1); static device_method_t em_if_methods[] = { DEVMETHOD(ifdi_attach_pre, em_if_attach_pre), DEVMETHOD(ifdi_attach_post, em_if_attach_post), DEVMETHOD(ifdi_detach, em_if_detach), DEVMETHOD(ifdi_shutdown, em_if_shutdown), DEVMETHOD(ifdi_suspend, em_if_suspend), DEVMETHOD(ifdi_resume, em_if_resume), DEVMETHOD(ifdi_init, em_if_init), DEVMETHOD(ifdi_stop, em_if_stop), DEVMETHOD(ifdi_msix_intr_assign, em_if_msix_intr_assign), DEVMETHOD(ifdi_intr_enable, em_if_enable_intr), DEVMETHOD(ifdi_intr_disable, em_if_disable_intr), DEVMETHOD(ifdi_tx_queues_alloc, em_if_tx_queues_alloc), DEVMETHOD(ifdi_rx_queues_alloc, em_if_rx_queues_alloc), DEVMETHOD(ifdi_queues_free, em_if_queues_free), DEVMETHOD(ifdi_update_admin_status, em_if_update_admin_status), DEVMETHOD(ifdi_multi_set, em_if_multi_set), DEVMETHOD(ifdi_media_status, em_if_media_status), DEVMETHOD(ifdi_media_change, em_if_media_change), DEVMETHOD(ifdi_mtu_set, em_if_mtu_set), DEVMETHOD(ifdi_promisc_set, em_if_set_promisc), DEVMETHOD(ifdi_timer, em_if_timer), DEVMETHOD(ifdi_vlan_register, em_if_vlan_register), DEVMETHOD(ifdi_vlan_unregister, em_if_vlan_unregister), DEVMETHOD(ifdi_get_counter, em_if_get_counter), DEVMETHOD(ifdi_led_func, em_if_led_func), DEVMETHOD(ifdi_rx_queue_intr_enable, em_if_rx_queue_intr_enable), DEVMETHOD(ifdi_tx_queue_intr_enable, em_if_tx_queue_intr_enable), DEVMETHOD(ifdi_debug, em_if_debug), DEVMETHOD_END }; /* * note that if (adapter->msix_mem) is replaced by: * if (adapter->intr_type == IFLIB_INTR_MSIX) */ static driver_t em_if_driver = { "em_if", em_if_methods, sizeof(struct adapter) }; /********************************************************************* * Tunable default values. 
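 *
 *  The hardware interrupt-delay timers count in 1.024 usec units; the
 *  EM_TICKS_TO_USECS()/EM_USECS_TO_TICKS() macros below convert between
 *  those ticks and microseconds with rounding, e.g. EM_USECS_TO_TICKS(100)
 *  = (1000 * 100 + 512) / 1024 = 98 ticks, i.e. roughly 100.4 usec.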
*********************************************************************/ #define EM_TICKS_TO_USECS(ticks) ((1024 * (ticks) + 500) / 1000) #define EM_USECS_TO_TICKS(usecs) ((1000 * (usecs) + 512) / 1024) #define M_TSO_LEN 66 #define MAX_INTS_PER_SEC 8000 #define DEFAULT_ITR (1000000000/(MAX_INTS_PER_SEC * 256)) /* Allow common code without TSO */ #ifndef CSUM_TSO #define CSUM_TSO 0 #endif #define TSO_WORKAROUND 4 static SYSCTL_NODE(_hw, OID_AUTO, em, CTLFLAG_RD, 0, "EM driver parameters"); static int em_disable_crc_stripping = 0; SYSCTL_INT(_hw_em, OID_AUTO, disable_crc_stripping, CTLFLAG_RDTUN, &em_disable_crc_stripping, 0, "Disable CRC Stripping"); static int em_tx_int_delay_dflt = EM_TICKS_TO_USECS(EM_TIDV); static int em_rx_int_delay_dflt = EM_TICKS_TO_USECS(EM_RDTR); SYSCTL_INT(_hw_em, OID_AUTO, tx_int_delay, CTLFLAG_RDTUN, &em_tx_int_delay_dflt, 0, "Default transmit interrupt delay in usecs"); SYSCTL_INT(_hw_em, OID_AUTO, rx_int_delay, CTLFLAG_RDTUN, &em_rx_int_delay_dflt, 0, "Default receive interrupt delay in usecs"); static int em_tx_abs_int_delay_dflt = EM_TICKS_TO_USECS(EM_TADV); static int em_rx_abs_int_delay_dflt = EM_TICKS_TO_USECS(EM_RADV); SYSCTL_INT(_hw_em, OID_AUTO, tx_abs_int_delay, CTLFLAG_RDTUN, &em_tx_abs_int_delay_dflt, 0, "Default transmit interrupt delay limit in usecs"); SYSCTL_INT(_hw_em, OID_AUTO, rx_abs_int_delay, CTLFLAG_RDTUN, &em_rx_abs_int_delay_dflt, 0, "Default receive interrupt delay limit in usecs"); static int em_smart_pwr_down = FALSE; SYSCTL_INT(_hw_em, OID_AUTO, smart_pwr_down, CTLFLAG_RDTUN, &em_smart_pwr_down, 0, "Set to true to leave smart power down enabled on newer adapters"); /* Controls whether promiscuous also shows bad packets */ static int em_debug_sbp = TRUE; SYSCTL_INT(_hw_em, OID_AUTO, sbp, CTLFLAG_RDTUN, &em_debug_sbp, 0, "Show bad packets in promiscuous mode"); /* How many packets rxeof tries to clean at a time */ static int em_rx_process_limit = 100; SYSCTL_INT(_hw_em, OID_AUTO, rx_process_limit, CTLFLAG_RDTUN, &em_rx_process_limit, 0, "Maximum number of received packets to process " "at a time, -1 means unlimited"); /* Energy efficient ethernet - default to OFF */ static int eee_setting = 1; SYSCTL_INT(_hw_em, OID_AUTO, eee_setting, CTLFLAG_RDTUN, &eee_setting, 0, "Enable Energy Efficient Ethernet"); /* ** Tuneable Interrupt rate */ static int em_max_interrupt_rate = 8000; SYSCTL_INT(_hw_em, OID_AUTO, max_interrupt_rate, CTLFLAG_RDTUN, &em_max_interrupt_rate, 0, "Maximum interrupts per second"); /* Global used in WOL setup with multiport cards */ static int global_quad_port_a = 0; extern struct if_txrx igb_txrx; extern struct if_txrx em_txrx; extern struct if_txrx lem_txrx; static struct if_shared_ctx em_sctx_init = { .isc_magic = IFLIB_MAGIC, .isc_q_align = PAGE_SIZE, .isc_tx_maxsize = EM_TSO_SIZE, .isc_tx_maxsegsize = PAGE_SIZE, .isc_rx_maxsize = MJUM9BYTES, .isc_rx_nsegments = 1, .isc_rx_maxsegsize = MJUM9BYTES, .isc_nfl = 1, .isc_nrxqs = 1, .isc_ntxqs = 1, .isc_admin_intrcnt = 1, .isc_vendor_info = em_vendor_info_array, .isc_driver_version = em_driver_version, .isc_driver = &em_if_driver, .isc_flags = IFLIB_NEED_SCRATCH | IFLIB_TSO_INIT_IP, .isc_nrxd_min = {EM_MIN_RXD}, .isc_ntxd_min = {EM_MIN_TXD}, .isc_nrxd_max = {EM_MAX_RXD}, .isc_ntxd_max = {EM_MAX_TXD}, .isc_nrxd_default = {EM_DEFAULT_RXD}, .isc_ntxd_default = {EM_DEFAULT_TXD}, }; if_shared_ctx_t em_sctx = &em_sctx_init; static struct if_shared_ctx igb_sctx_init = { .isc_magic = IFLIB_MAGIC, .isc_q_align = PAGE_SIZE, .isc_tx_maxsize = EM_TSO_SIZE, .isc_tx_maxsegsize = 
	    PAGE_SIZE,
	.isc_rx_maxsize = MJUM9BYTES,
	.isc_rx_nsegments = 1,
	.isc_rx_maxsegsize = MJUM9BYTES,
	.isc_nfl = 1,
	.isc_nrxqs = 1,
	.isc_ntxqs = 1,
	.isc_admin_intrcnt = 1,
	.isc_vendor_info = igb_vendor_info_array,
	.isc_driver_version = em_driver_version,
	.isc_driver = &em_if_driver,
	.isc_flags = IFLIB_NEED_SCRATCH | IFLIB_TSO_INIT_IP,

	.isc_nrxd_min = {EM_MIN_RXD},
	.isc_ntxd_min = {EM_MIN_TXD},
-	.isc_nrxd_max = {EM_MAX_RXD},
-	.isc_ntxd_max = {EM_MAX_TXD},
+	.isc_nrxd_max = {IGB_MAX_RXD},
+	.isc_ntxd_max = {IGB_MAX_TXD},
	.isc_nrxd_default = {EM_DEFAULT_RXD},
	.isc_ntxd_default = {EM_DEFAULT_TXD},
};

if_shared_ctx_t igb_sctx = &igb_sctx_init;

/*****************************************************************
 *
 *  Dump Registers
 *
 ****************************************************************/
#define IGB_REGS_LEN 739

static int em_get_regs(SYSCTL_HANDLER_ARGS)
{
	struct adapter *adapter = (struct adapter *)arg1;
	struct e1000_hw *hw = &adapter->hw;
	struct sbuf *sb;
	u32 *regs_buff = (u32 *)malloc(sizeof(u32) * IGB_REGS_LEN, M_DEVBUF, M_NOWAIT);
	int rc;

	/* M_NOWAIT allocations can fail; bail out before touching the buffer */
	if (regs_buff == NULL)
		return (ENOMEM);
	memset(regs_buff, 0, IGB_REGS_LEN * sizeof(u32));

	rc = sysctl_wire_old_buffer(req, 0);
	MPASS(rc == 0);
	if (rc != 0) {
		free(regs_buff, M_DEVBUF);
		return (rc);
	}

	sb = sbuf_new_for_sysctl(NULL, NULL, 32*400, req);
	MPASS(sb != NULL);
	if (sb == NULL) {
		free(regs_buff, M_DEVBUF);
		return (ENOMEM);
	}

	/* General Registers */
	regs_buff[0] = E1000_READ_REG(hw, E1000_CTRL);
	regs_buff[1] = E1000_READ_REG(hw, E1000_STATUS);
	regs_buff[2] = E1000_READ_REG(hw, E1000_CTRL_EXT);
	regs_buff[3] = E1000_READ_REG(hw, E1000_ICR);
	regs_buff[4] = E1000_READ_REG(hw, E1000_RCTL);
	regs_buff[5] = E1000_READ_REG(hw, E1000_RDLEN(0));
	regs_buff[6] = E1000_READ_REG(hw, E1000_RDH(0));
	regs_buff[7] = E1000_READ_REG(hw, E1000_RDT(0));
	regs_buff[8] = E1000_READ_REG(hw, E1000_RXDCTL(0));
	regs_buff[9] = E1000_READ_REG(hw, E1000_RDBAL(0));
	regs_buff[10] = E1000_READ_REG(hw, E1000_RDBAH(0));
	regs_buff[11] = E1000_READ_REG(hw, E1000_TCTL);
	regs_buff[12] = E1000_READ_REG(hw, E1000_TDBAL(0));
	regs_buff[13] = E1000_READ_REG(hw, E1000_TDBAH(0));
	regs_buff[14] = E1000_READ_REG(hw, E1000_TDLEN(0));
	regs_buff[15] = E1000_READ_REG(hw, E1000_TDH(0));
	regs_buff[16] = E1000_READ_REG(hw, E1000_TDT(0));
	regs_buff[17] = E1000_READ_REG(hw, E1000_TXDCTL(0));
	regs_buff[18] = E1000_READ_REG(hw, E1000_TDFH);
	regs_buff[19] = E1000_READ_REG(hw, E1000_TDFT);
	regs_buff[20] = E1000_READ_REG(hw, E1000_TDFHS);
	regs_buff[21] = E1000_READ_REG(hw, E1000_TDFPC);

	sbuf_printf(sb, "General Registers\n");
	sbuf_printf(sb, "\tCTRL\t %08x\n", regs_buff[0]);
	sbuf_printf(sb, "\tSTATUS\t %08x\n", regs_buff[1]);
	sbuf_printf(sb, "\tCTRL_EXT\t %08x\n\n", regs_buff[2]);
	sbuf_printf(sb, "Interrupt Registers\n");
	sbuf_printf(sb, "\tICR\t %08x\n\n", regs_buff[3]);
	sbuf_printf(sb, "RX Registers\n");
	sbuf_printf(sb, "\tRCTL\t %08x\n", regs_buff[4]);
	sbuf_printf(sb, "\tRDLEN\t %08x\n", regs_buff[5]);
	sbuf_printf(sb, "\tRDH\t %08x\n", regs_buff[6]);
	sbuf_printf(sb, "\tRDT\t %08x\n", regs_buff[7]);
	sbuf_printf(sb, "\tRXDCTL\t %08x\n", regs_buff[8]);
	sbuf_printf(sb, "\tRDBAL\t %08x\n", regs_buff[9]);
	sbuf_printf(sb, "\tRDBAH\t %08x\n\n", regs_buff[10]);
	sbuf_printf(sb, "TX Registers\n");
	sbuf_printf(sb, "\tTCTL\t %08x\n", regs_buff[11]);
	sbuf_printf(sb, "\tTDBAL\t %08x\n", regs_buff[12]);
	sbuf_printf(sb, "\tTDBAH\t %08x\n", regs_buff[13]);
	sbuf_printf(sb, "\tTDLEN\t %08x\n", regs_buff[14]);
	sbuf_printf(sb, "\tTDH\t %08x\n", regs_buff[15]);
	sbuf_printf(sb, "\tTDT\t %08x\n", regs_buff[16]);
	sbuf_printf(sb, "\tTXDCTL\t %08x\n", regs_buff[17]);
	sbuf_printf(sb, "\tTDFH\t %08x\n", regs_buff[18]);
	sbuf_printf(sb, "\tTDFT\t %08x\n", regs_buff[19]);
	sbuf_printf(sb, "\tTDFHS\t %08x\n", regs_buff[20]);
	sbuf_printf(sb, "\tTDFPC\t %08x\n\n", regs_buff[21]);

#ifdef DUMP_DESCS
	{
		if_softc_ctx_t scctx = adapter->shared;
		struct rx_ring *rxr = &rx_que->rxr;
		struct tx_ring *txr = &tx_que->txr;
		int ntxd = scctx->isc_ntxd[0];
		int nrxd = scctx->isc_nrxd[0];
		int j;

		for (j = 0; j < nrxd; j++) {
			u32 staterr = le32toh(rxr->rx_base[j].wb.upper.status_error);
			u32 length = le32toh(rxr->rx_base[j].wb.upper.length);
			sbuf_printf(sb, "\tReceive Descriptor Address %d: %08" PRIx64 " Error:%d Length:%d\n",
			    j, rxr->rx_base[j].read.buffer_addr, staterr, length);
		}

		for (j = 0; j < min(ntxd, 256); j++) {
			unsigned int *ptr = (unsigned int *)&txr->tx_base[j];

			sbuf_printf(sb, "\tTXD[%03d] [0]: %08x [1]: %08x [2]: %08x [3]: %08x eop: %d DD=%d\n",
			    j, ptr[0], ptr[1], ptr[2], ptr[3], buf->eop,
			    buf->eop != -1 ? txr->tx_base[buf->eop].upper.fields.status & E1000_TXD_STAT_DD : 0);
		}
	}
#endif

	rc = sbuf_finish(sb);
	sbuf_delete(sb);
	free(regs_buff, M_DEVBUF);
	return(rc);
}

static void *
em_register(device_t dev)
{
	return (em_sctx);
}

static void *
igb_register(device_t dev)
{
	return (igb_sctx);
}

static int
em_set_num_queues(if_ctx_t ctx)
{
	struct adapter *adapter = iflib_get_softc(ctx);
	int maxqueues;

	/* Sanity check based on HW */
	switch (adapter->hw.mac.type) {
	case e1000_82576:
	case e1000_82580:
	case e1000_i350:
	case e1000_i354:
		maxqueues = 8;
		break;
	case e1000_i210:
	case e1000_82575:
		maxqueues = 4;
		break;
	case e1000_i211:
	case e1000_82574:
		maxqueues = 2;
		break;
	default:
		maxqueues = 1;
		break;
	}

	return (maxqueues);
}

#define EM_CAPS \
    IFCAP_TSO4 | IFCAP_TXCSUM | IFCAP_LRO | IFCAP_RXCSUM | IFCAP_VLAN_HWFILTER | IFCAP_WOL_MAGIC | \
    IFCAP_WOL_MCAST | IFCAP_WOL | IFCAP_VLAN_HWTSO | IFCAP_HWCSUM | IFCAP_VLAN_HWTAGGING | \
    IFCAP_VLAN_HWCSUM | IFCAP_VLAN_HWTSO | IFCAP_VLAN_MTU;

#define IGB_CAPS \
    IFCAP_TSO4 | IFCAP_TXCSUM | IFCAP_LRO | IFCAP_RXCSUM | IFCAP_VLAN_HWFILTER | IFCAP_WOL_MAGIC | \
    IFCAP_WOL_MCAST | IFCAP_WOL | IFCAP_VLAN_HWTSO | IFCAP_HWCSUM | IFCAP_VLAN_HWTAGGING | IFCAP_VLAN_HWCSUM | \
    IFCAP_VLAN_HWTSO | IFCAP_VLAN_MTU | IFCAP_TXCSUM_IPV6 | IFCAP_HWCSUM_IPV6 | IFCAP_JUMBO_MTU;

/*********************************************************************
 *  Device initialization routine
 *
 *  The attach entry point is called when the driver is being loaded.
 *  This routine identifies the type of hardware, allocates all resources
 *  and initializes the hardware.
* * return 0 on success, positive on failure *********************************************************************/ static int em_if_attach_pre(if_ctx_t ctx) { struct adapter *adapter; if_softc_ctx_t scctx; device_t dev; struct e1000_hw *hw; int error = 0; INIT_DEBUGOUT("em_if_attach_pre begin"); dev = iflib_get_dev(ctx); adapter = iflib_get_softc(ctx); if (resource_disabled("em", device_get_unit(dev))) { device_printf(dev, "Disabled by device hint\n"); return (ENXIO); } adapter->ctx = ctx; adapter->dev = adapter->osdep.dev = dev; scctx = adapter->shared = iflib_get_softc_ctx(ctx); adapter->media = iflib_get_media(ctx); hw = &adapter->hw; adapter->tx_process_limit = scctx->isc_ntxd[0]; /* SYSCTL stuff */ SYSCTL_ADD_PROC(device_get_sysctl_ctx(dev), SYSCTL_CHILDREN(device_get_sysctl_tree(dev)), OID_AUTO, "nvm", CTLTYPE_INT|CTLFLAG_RW, adapter, 0, em_sysctl_nvm_info, "I", "NVM Information"); SYSCTL_ADD_PROC(device_get_sysctl_ctx(dev), SYSCTL_CHILDREN(device_get_sysctl_tree(dev)), OID_AUTO, "debug", CTLTYPE_INT|CTLFLAG_RW, adapter, 0, em_sysctl_debug_info, "I", "Debug Information"); SYSCTL_ADD_PROC(device_get_sysctl_ctx(dev), SYSCTL_CHILDREN(device_get_sysctl_tree(dev)), OID_AUTO, "fc", CTLTYPE_INT|CTLFLAG_RW, adapter, 0, em_set_flowcntl, "I", "Flow Control"); SYSCTL_ADD_PROC(device_get_sysctl_ctx(dev), SYSCTL_CHILDREN(device_get_sysctl_tree(dev)), OID_AUTO, "reg_dump", CTLTYPE_STRING | CTLFLAG_RD, adapter, 0, em_get_regs, "A", "Dump Registers"); SYSCTL_ADD_PROC(device_get_sysctl_ctx(dev), SYSCTL_CHILDREN(device_get_sysctl_tree(dev)), OID_AUTO, "rs_dump", CTLTYPE_INT | CTLFLAG_RW, adapter, 0, em_get_rs, "I", "Dump RS indexes"); /* Determine hardware and mac info */ em_identify_hardware(ctx); /* Set isc_msix_bar */ scctx->isc_msix_bar = PCIR_BAR(EM_MSIX_BAR); scctx->isc_tx_nsegments = EM_MAX_SCATTER; scctx->isc_tx_tso_segments_max = scctx->isc_tx_nsegments; scctx->isc_tx_tso_size_max = EM_TSO_SIZE; scctx->isc_tx_tso_segsize_max = EM_TSO_SEG_SIZE; scctx->isc_nrxqsets_max = scctx->isc_ntxqsets_max = em_set_num_queues(ctx); device_printf(dev, "attach_pre capping queues at %d\n", scctx->isc_ntxqsets_max); scctx->isc_tx_csum_flags = CSUM_TCP | CSUM_UDP | CSUM_IP_TSO; if (adapter->hw.mac.type >= igb_mac_min) { int try_second_bar; scctx->isc_txqsizes[0] = roundup2(scctx->isc_ntxd[0] * sizeof(union e1000_adv_tx_desc), EM_DBA_ALIGN); scctx->isc_rxqsizes[0] = roundup2(scctx->isc_nrxd[0] * sizeof(union e1000_adv_rx_desc), EM_DBA_ALIGN); scctx->isc_txd_size[0] = sizeof(union e1000_adv_tx_desc); scctx->isc_rxd_size[0] = sizeof(union e1000_adv_rx_desc); scctx->isc_txrx = &igb_txrx; scctx->isc_capenable = IGB_CAPS; scctx->isc_tx_csum_flags = CSUM_TCP | CSUM_UDP | CSUM_TSO | CSUM_IP6_TCP \ | CSUM_IP6_UDP | CSUM_IP6_TCP; if (adapter->hw.mac.type != e1000_82575) scctx->isc_tx_csum_flags |= CSUM_SCTP | CSUM_IP6_SCTP; /* ** Some new devices, as with ixgbe, now may ** use a different BAR, so we need to keep ** track of which is used. 
*/ try_second_bar = pci_read_config(dev, scctx->isc_msix_bar, 4); if (try_second_bar == 0) scctx->isc_msix_bar += 4; } else if (adapter->hw.mac.type >= em_mac_min) { scctx->isc_txqsizes[0] = roundup2(scctx->isc_ntxd[0]* sizeof(struct e1000_tx_desc), EM_DBA_ALIGN); scctx->isc_rxqsizes[0] = roundup2(scctx->isc_nrxd[0] * sizeof(union e1000_rx_desc_extended), EM_DBA_ALIGN); scctx->isc_txd_size[0] = sizeof(struct e1000_tx_desc); scctx->isc_rxd_size[0] = sizeof(union e1000_rx_desc_extended); scctx->isc_txrx = &em_txrx; scctx->isc_capenable = EM_CAPS; scctx->isc_tx_csum_flags = CSUM_TCP | CSUM_UDP | CSUM_IP_TSO; } else { scctx->isc_txqsizes[0] = roundup2((scctx->isc_ntxd[0] + 1) * sizeof(struct e1000_tx_desc), EM_DBA_ALIGN); scctx->isc_rxqsizes[0] = roundup2((scctx->isc_nrxd[0] + 1) * sizeof(struct e1000_rx_desc), EM_DBA_ALIGN); scctx->isc_txd_size[0] = sizeof(struct e1000_tx_desc); scctx->isc_rxd_size[0] = sizeof(struct e1000_rx_desc); scctx->isc_tx_csum_flags = CSUM_TCP | CSUM_UDP | CSUM_IP_TSO; scctx->isc_txrx = &lem_txrx; scctx->isc_capenable = EM_CAPS; if (adapter->hw.mac.type < e1000_82543) scctx->isc_capenable &= ~(IFCAP_HWCSUM|IFCAP_VLAN_HWCSUM); scctx->isc_tx_csum_flags = CSUM_TCP | CSUM_UDP | CSUM_IP_TSO; scctx->isc_msix_bar = 0; } /* Setup PCI resources */ if (em_allocate_pci_resources(ctx)) { device_printf(dev, "Allocation of PCI resources failed\n"); error = ENXIO; goto err_pci; } /* ** For ICH8 and family we need to ** map the flash memory, and this ** must happen after the MAC is ** identified */ if ((hw->mac.type == e1000_ich8lan) || (hw->mac.type == e1000_ich9lan) || (hw->mac.type == e1000_ich10lan) || (hw->mac.type == e1000_pchlan) || (hw->mac.type == e1000_pch2lan) || (hw->mac.type == e1000_pch_lpt)) { int rid = EM_BAR_TYPE_FLASH; adapter->flash = bus_alloc_resource_any(dev, SYS_RES_MEMORY, &rid, RF_ACTIVE); if (adapter->flash == NULL) { device_printf(dev, "Mapping of Flash failed\n"); error = ENXIO; goto err_pci; } /* This is used in the shared code */ hw->flash_address = (u8 *)adapter->flash; adapter->osdep.flash_bus_space_tag = rman_get_bustag(adapter->flash); adapter->osdep.flash_bus_space_handle = rman_get_bushandle(adapter->flash); } /* ** In the new SPT device flash is not a ** separate BAR, rather it is also in BAR0, ** so use the same tag and an offset handle for the ** FLASH read/write macros in the shared code. 
*/ else if (hw->mac.type == e1000_pch_spt) { adapter->osdep.flash_bus_space_tag = adapter->osdep.mem_bus_space_tag; adapter->osdep.flash_bus_space_handle = adapter->osdep.mem_bus_space_handle + E1000_FLASH_BASE_ADDR; } /* Do Shared Code initialization */ error = e1000_setup_init_funcs(hw, TRUE); if (error) { device_printf(dev, "Setup of Shared code failed, error %d\n", error); error = ENXIO; goto err_pci; } em_setup_msix(ctx); e1000_get_bus_info(hw); /* Set up some sysctls for the tunable interrupt delays */ em_add_int_delay_sysctl(adapter, "rx_int_delay", "receive interrupt delay in usecs", &adapter->rx_int_delay, E1000_REGISTER(hw, E1000_RDTR), em_rx_int_delay_dflt); em_add_int_delay_sysctl(adapter, "tx_int_delay", "transmit interrupt delay in usecs", &adapter->tx_int_delay, E1000_REGISTER(hw, E1000_TIDV), em_tx_int_delay_dflt); em_add_int_delay_sysctl(adapter, "rx_abs_int_delay", "receive interrupt delay limit in usecs", &adapter->rx_abs_int_delay, E1000_REGISTER(hw, E1000_RADV), em_rx_abs_int_delay_dflt); em_add_int_delay_sysctl(adapter, "tx_abs_int_delay", "transmit interrupt delay limit in usecs", &adapter->tx_abs_int_delay, E1000_REGISTER(hw, E1000_TADV), em_tx_abs_int_delay_dflt); em_add_int_delay_sysctl(adapter, "itr", "interrupt delay limit in usecs/4", &adapter->tx_itr, E1000_REGISTER(hw, E1000_ITR), DEFAULT_ITR); hw->mac.autoneg = DO_AUTO_NEG; hw->phy.autoneg_wait_to_complete = FALSE; hw->phy.autoneg_advertised = AUTONEG_ADV_DEFAULT; if (adapter->hw.mac.type < em_mac_min) { e1000_init_script_state_82541(&adapter->hw, TRUE); e1000_set_tbi_compatibility_82543(&adapter->hw, TRUE); } /* Copper options */ if (hw->phy.media_type == e1000_media_type_copper) { hw->phy.mdix = AUTO_ALL_MODES; hw->phy.disable_polarity_correction = FALSE; hw->phy.ms_type = EM_MASTER_SLAVE; } /* * Set the frame limits assuming * standard ethernet sized frames. */ scctx->isc_max_frame_size = adapter->hw.mac.max_frame_size = ETHERMTU + ETHER_HDR_LEN + ETHERNET_FCS_SIZE; /* * This controls when hardware reports transmit completion * status. */ hw->mac.report_tx_early = 1; /* Allocate multicast array memory. */ adapter->mta = malloc(sizeof(u8) * ETH_ADDR_LEN * MAX_NUM_MULTICAST_ADDRESSES, M_DEVBUF, M_NOWAIT); if (adapter->mta == NULL) { device_printf(dev, "Can not allocate multicast setup array\n"); error = ENOMEM; goto err_late; } /* Check SOL/IDER usage */ if (e1000_check_reset_block(hw)) device_printf(dev, "PHY reset is blocked" " due to SOL/IDER session.\n"); /* Sysctl for setting Energy Efficient Ethernet */ hw->dev_spec.ich8lan.eee_disable = eee_setting; SYSCTL_ADD_PROC(device_get_sysctl_ctx(dev), SYSCTL_CHILDREN(device_get_sysctl_tree(dev)), OID_AUTO, "eee_control", CTLTYPE_INT|CTLFLAG_RW, adapter, 0, em_sysctl_eee, "I", "Disable Energy Efficient Ethernet"); /* ** Start from a known state, this is ** important in reading the nvm and ** mac from that. */ e1000_reset_hw(hw); /* Make sure we have a good EEPROM before we read from it */ if (e1000_validate_nvm_checksum(hw) < 0) { /* ** Some PCI-E parts fail the first check due to ** the link being in sleep state, call it again, ** if it fails a second time its a real issue. 
*/ if (e1000_validate_nvm_checksum(hw) < 0) { device_printf(dev, "The EEPROM Checksum Is Not Valid\n"); error = EIO; goto err_late; } } /* Copy the permanent MAC address out of the EEPROM */ if (e1000_read_mac_addr(hw) < 0) { device_printf(dev, "EEPROM read error while reading MAC" " address\n"); error = EIO; goto err_late; } if (!em_is_valid_ether_addr(hw->mac.addr)) { device_printf(dev, "Invalid MAC address\n"); error = EIO; goto err_late; } /* Disable ULP support */ e1000_disable_ulp_lpt_lp(hw, TRUE); /* * Get Wake-on-Lan and Management info for later use */ em_get_wakeup(ctx); iflib_set_mac(ctx, hw->mac.addr); return (0); err_late: em_release_hw_control(adapter); err_pci: em_free_pci_resources(ctx); free(adapter->mta, M_DEVBUF); return (error); } static int em_if_attach_post(if_ctx_t ctx) { struct adapter *adapter = iflib_get_softc(ctx); struct e1000_hw *hw = &adapter->hw; int error = 0; /* Setup OS specific network interface */ error = em_setup_interface(ctx); if (error != 0) { goto err_late; } em_reset(ctx); /* Initialize statistics */ em_update_stats_counters(adapter); hw->mac.get_link_status = 1; em_if_update_admin_status(ctx); em_add_hw_stats(adapter); /* Non-AMT based hardware can now take control from firmware */ if (adapter->has_manage && !adapter->has_amt) em_get_hw_control(adapter); INIT_DEBUGOUT("em_if_attach_post: end"); return (error); err_late: em_release_hw_control(adapter); em_free_pci_resources(ctx); em_if_queues_free(ctx); free(adapter->mta, M_DEVBUF); return (error); } /********************************************************************* * Device removal routine * * The detach entry point is called when the driver is being removed. * This routine stops the adapter and deallocates all the resources * that were allocated for driver operation. * * return 0 on success, positive on failure *********************************************************************/ static int em_if_detach(if_ctx_t ctx) { struct adapter *adapter = iflib_get_softc(ctx); INIT_DEBUGOUT("em_detach: begin"); e1000_phy_hw_reset(&adapter->hw); em_release_manageability(adapter); em_release_hw_control(adapter); em_free_pci_resources(ctx); return (0); } /********************************************************************* * * Shutdown entry point * **********************************************************************/ static int em_if_shutdown(if_ctx_t ctx) { return em_if_suspend(ctx); } /* * Suspend/resume device methods. 
*/ static int em_if_suspend(if_ctx_t ctx) { struct adapter *adapter = iflib_get_softc(ctx); em_release_manageability(adapter); em_release_hw_control(adapter); em_enable_wakeup(ctx); return (0); } static int em_if_resume(if_ctx_t ctx) { struct adapter *adapter = iflib_get_softc(ctx); if (adapter->hw.mac.type == e1000_pch2lan) e1000_resume_workarounds_pchlan(&adapter->hw); em_if_init(ctx); em_init_manageability(adapter); return(0); } static int em_if_mtu_set(if_ctx_t ctx, uint32_t mtu) { int max_frame_size; struct adapter *adapter = iflib_get_softc(ctx); if_softc_ctx_t scctx = iflib_get_softc_ctx(ctx); IOCTL_DEBUGOUT("ioctl rcv'd: SIOCSIFMTU (Set Interface MTU)"); switch (adapter->hw.mac.type) { case e1000_82571: case e1000_82572: case e1000_ich9lan: case e1000_ich10lan: case e1000_pch2lan: case e1000_pch_lpt: case e1000_pch_spt: case e1000_82574: case e1000_82583: case e1000_80003es2lan: /* 9K Jumbo Frame size */ max_frame_size = 9234; break; case e1000_pchlan: max_frame_size = 4096; break; case e1000_82542: case e1000_ich8lan: /* Adapters that do not support jumbo frames */ max_frame_size = ETHER_MAX_LEN; break; default: if (adapter->hw.mac.type >= igb_mac_min) max_frame_size = 9234; else /* lem */ max_frame_size = MAX_JUMBO_FRAME_SIZE; } if (mtu > max_frame_size - ETHER_HDR_LEN - ETHER_CRC_LEN) { return (EINVAL); } scctx->isc_max_frame_size = adapter->hw.mac.max_frame_size = mtu + ETHER_HDR_LEN + ETHER_CRC_LEN; return (0); } /********************************************************************* * Init entry point * * This routine is used in two ways. It is used by the stack as * init entry point in network interface structure. It is also used * by the driver as a hw/sw initialization routine to get to a * consistent state. * * return 0 on success, positive on failure **********************************************************************/ static void em_if_init(if_ctx_t ctx) { struct adapter *adapter = iflib_get_softc(ctx); struct ifnet *ifp = iflib_get_ifp(ctx); struct em_tx_queue *tx_que; int i; INIT_DEBUGOUT("em_if_init: begin"); /* Get the latest mac address, User can use a LAA */ bcopy(if_getlladdr(ifp), adapter->hw.mac.addr, ETHER_ADDR_LEN); /* Put the address into the Receive Address Array */ e1000_rar_set(&adapter->hw, adapter->hw.mac.addr, 0); /* * With the 82571 adapter, RAR[0] may be overwritten * when the other port is reset, we make a duplicate * in RAR[14] for that eventuality, this assures * the interface continues to function. 
*/ if (adapter->hw.mac.type == e1000_82571) { e1000_set_laa_state_82571(&adapter->hw, TRUE); e1000_rar_set(&adapter->hw, adapter->hw.mac.addr, E1000_RAR_ENTRIES - 1); } /* Initialize the hardware */ em_reset(ctx); em_if_update_admin_status(ctx); for (i = 0, tx_que = adapter->tx_queues; i < adapter->tx_num_queues; i++, tx_que++) { struct tx_ring *txr = &tx_que->txr; txr->tx_rs_cidx = txr->tx_rs_pidx = txr->tx_cidx_processed = 0; } /* Setup VLAN support, basic and offload if available */ E1000_WRITE_REG(&adapter->hw, E1000_VET, ETHERTYPE_VLAN); /* Clear bad data from Rx FIFOs */ if (adapter->hw.mac.type >= igb_mac_min) e1000_rx_fifo_flush_82575(&adapter->hw); /* Configure for OS presence */ em_init_manageability(adapter); /* Prepare transmit descriptors and buffers */ em_initialize_transmit_unit(ctx); /* Setup Multicast table */ em_if_multi_set(ctx); /* * Figure out the desired mbuf * pool for doing jumbos */ if (adapter->hw.mac.max_frame_size <= 2048) adapter->rx_mbuf_sz = MCLBYTES; #ifndef CONTIGMALLOC_WORKS else adapter->rx_mbuf_sz = MJUMPAGESIZE; #else else if (adapter->hw.mac.max_frame_size <= 4096) adapter->rx_mbuf_sz = MJUMPAGESIZE; else adapter->rx_mbuf_sz = MJUM9BYTES; #endif em_initialize_receive_unit(ctx); /* Use real VLAN Filter support? */ if (if_getcapenable(ifp) & IFCAP_VLAN_HWTAGGING) { if (if_getcapenable(ifp) & IFCAP_VLAN_HWFILTER) /* Use real VLAN Filter support */ em_setup_vlan_hw_support(adapter); else { u32 ctrl; ctrl = E1000_READ_REG(&adapter->hw, E1000_CTRL); ctrl |= E1000_CTRL_VME; E1000_WRITE_REG(&adapter->hw, E1000_CTRL, ctrl); } } /* Don't lose promiscuous settings */ em_if_set_promisc(ctx, IFF_PROMISC); e1000_clear_hw_cntrs_base_generic(&adapter->hw); /* MSI/X configuration for 82574 */ if (adapter->hw.mac.type == e1000_82574) { int tmp = E1000_READ_REG(&adapter->hw, E1000_CTRL_EXT); tmp |= E1000_CTRL_EXT_PBA_CLR; E1000_WRITE_REG(&adapter->hw, E1000_CTRL_EXT, tmp); /* Set the IVAR - interrupt vector routing. */ E1000_WRITE_REG(&adapter->hw, E1000_IVAR, adapter->ivars); } else if (adapter->intr_type == IFLIB_INTR_MSIX) /* Set up queue routing */ igb_configure_queues(adapter); /* this clears any pending interrupts */ E1000_READ_REG(&adapter->hw, E1000_ICR); E1000_WRITE_REG(&adapter->hw, E1000_ICS, E1000_ICS_LSC); /* AMT based hardware can now take control from firmware */ if (adapter->has_manage && adapter->has_amt) em_get_hw_control(adapter); /* Set Energy Efficient Ethernet */ if (adapter->hw.mac.type >= igb_mac_min && adapter->hw.phy.media_type == e1000_media_type_copper) { if (adapter->hw.mac.type == e1000_i354) e1000_set_eee_i354(&adapter->hw, TRUE, TRUE); else e1000_set_eee_i350(&adapter->hw, TRUE, TRUE); } } /********************************************************************* * * Fast Legacy/MSI Combined Interrupt Service routine * *********************************************************************/ int em_intr(void *arg) { struct adapter *adapter = arg; if_ctx_t ctx = adapter->ctx; u32 reg_icr; reg_icr = E1000_READ_REG(&adapter->hw, E1000_ICR); if (adapter->intr_type != IFLIB_INTR_LEGACY) goto skip_stray; /* Hot eject? */ if (reg_icr == 0xffffffff) return FILTER_STRAY; /* Definitely not our interrupt. */ if (reg_icr == 0x0) return FILTER_STRAY; /* * Starting with the 82571 chip, bit 31 should be used to * determine whether the interrupt belongs to us. 
*/ if (adapter->hw.mac.type >= e1000_82571 && (reg_icr & E1000_ICR_INT_ASSERTED) == 0) return FILTER_STRAY; skip_stray: /* Link status change */ if (reg_icr & (E1000_ICR_RXSEQ | E1000_ICR_LSC)) { adapter->hw.mac.get_link_status = 1; iflib_admin_intr_deferred(ctx); } if (reg_icr & E1000_ICR_RXO) adapter->rx_overruns++; return (FILTER_SCHEDULE_THREAD); } static void igb_rx_enable_queue(struct adapter *adapter, struct em_rx_queue *rxq) { E1000_WRITE_REG(&adapter->hw, E1000_EIMS, rxq->eims); } static void em_rx_enable_queue(struct adapter *adapter, struct em_rx_queue *rxq) { E1000_WRITE_REG(&adapter->hw, E1000_IMS, rxq->eims); } static void igb_tx_enable_queue(struct adapter *adapter, struct em_tx_queue *txq) { E1000_WRITE_REG(&adapter->hw, E1000_EIMS, txq->eims); } static void em_tx_enable_queue(struct adapter *adapter, struct em_tx_queue *txq) { E1000_WRITE_REG(&adapter->hw, E1000_IMS, txq->eims); } static int em_if_rx_queue_intr_enable(if_ctx_t ctx, uint16_t rxqid) { struct adapter *adapter = iflib_get_softc(ctx); struct em_rx_queue *rxq = &adapter->rx_queues[rxqid]; if (adapter->hw.mac.type >= igb_mac_min) igb_rx_enable_queue(adapter, rxq); else em_rx_enable_queue(adapter, rxq); return (0); } static int em_if_tx_queue_intr_enable(if_ctx_t ctx, uint16_t txqid) { struct adapter *adapter = iflib_get_softc(ctx); struct em_tx_queue *txq = &adapter->tx_queues[txqid]; if (adapter->hw.mac.type >= igb_mac_min) igb_tx_enable_queue(adapter, txq); else em_tx_enable_queue(adapter, txq); return (0); } /********************************************************************* * * MSIX RX Interrupt Service routine * **********************************************************************/ static int em_msix_que(void *arg) { struct em_rx_queue *que = arg; ++que->irqs; return (FILTER_SCHEDULE_THREAD); } /********************************************************************* * * MSIX Link Fast Interrupt Service routine * **********************************************************************/ static int em_msix_link(void *arg) { struct adapter *adapter = arg; u32 reg_icr; ++adapter->link_irq; MPASS(adapter->hw.back != NULL); reg_icr = E1000_READ_REG(&adapter->hw, E1000_ICR); if (reg_icr & E1000_ICR_RXO) adapter->rx_overruns++; if (reg_icr & (E1000_ICR_RXSEQ | E1000_ICR_LSC)) { em_handle_link(adapter->ctx); } else { E1000_WRITE_REG(&adapter->hw, E1000_IMS, EM_MSIX_LINK | E1000_IMS_LSC); if (adapter->hw.mac.type >= igb_mac_min) E1000_WRITE_REG(&adapter->hw, E1000_EIMS, adapter->link_mask); } /* * Because we must read the ICR for this interrupt * it may clear other causes using autoclear, for * this reason we simply create a soft interrupt * for all these vectors. */ if (reg_icr && adapter->hw.mac.type < igb_mac_min) { E1000_WRITE_REG(&adapter->hw, E1000_ICS, adapter->ims); } return (FILTER_HANDLED); } static void em_handle_link(void *context) { if_ctx_t ctx = context; struct adapter *adapter = iflib_get_softc(ctx); adapter->hw.mac.get_link_status = 1; iflib_admin_intr_deferred(ctx); } /********************************************************************* * * Media Ioctl callback * * This routine is called whenever the user queries the status of * the interface using ifconfig. 
* **********************************************************************/ static void em_if_media_status(if_ctx_t ctx, struct ifmediareq *ifmr) { struct adapter *adapter = iflib_get_softc(ctx); u_char fiber_type = IFM_1000_SX; INIT_DEBUGOUT("em_if_media_status: begin"); iflib_admin_intr_deferred(ctx); ifmr->ifm_status = IFM_AVALID; ifmr->ifm_active = IFM_ETHER; if (!adapter->link_active) { return; } ifmr->ifm_status |= IFM_ACTIVE; if ((adapter->hw.phy.media_type == e1000_media_type_fiber) || (adapter->hw.phy.media_type == e1000_media_type_internal_serdes)) { if (adapter->hw.mac.type == e1000_82545) fiber_type = IFM_1000_LX; ifmr->ifm_active |= fiber_type | IFM_FDX; } else { switch (adapter->link_speed) { case 10: ifmr->ifm_active |= IFM_10_T; break; case 100: ifmr->ifm_active |= IFM_100_TX; break; case 1000: ifmr->ifm_active |= IFM_1000_T; break; } if (adapter->link_duplex == FULL_DUPLEX) ifmr->ifm_active |= IFM_FDX; else ifmr->ifm_active |= IFM_HDX; } } /********************************************************************* * * Media Ioctl callback * * This routine is called when the user changes speed/duplex using * media/mediopt option with ifconfig. * **********************************************************************/ static int em_if_media_change(if_ctx_t ctx) { struct adapter *adapter = iflib_get_softc(ctx); struct ifmedia *ifm = iflib_get_media(ctx); INIT_DEBUGOUT("em_if_media_change: begin"); if (IFM_TYPE(ifm->ifm_media) != IFM_ETHER) return (EINVAL); switch (IFM_SUBTYPE(ifm->ifm_media)) { case IFM_AUTO: adapter->hw.mac.autoneg = DO_AUTO_NEG; adapter->hw.phy.autoneg_advertised = AUTONEG_ADV_DEFAULT; break; case IFM_1000_LX: case IFM_1000_SX: case IFM_1000_T: adapter->hw.mac.autoneg = DO_AUTO_NEG; adapter->hw.phy.autoneg_advertised = ADVERTISE_1000_FULL; break; case IFM_100_TX: adapter->hw.mac.autoneg = FALSE; adapter->hw.phy.autoneg_advertised = 0; if ((ifm->ifm_media & IFM_GMASK) == IFM_FDX) adapter->hw.mac.forced_speed_duplex = ADVERTISE_100_FULL; else adapter->hw.mac.forced_speed_duplex = ADVERTISE_100_HALF; break; case IFM_10_T: adapter->hw.mac.autoneg = FALSE; adapter->hw.phy.autoneg_advertised = 0; if ((ifm->ifm_media & IFM_GMASK) == IFM_FDX) adapter->hw.mac.forced_speed_duplex = ADVERTISE_10_FULL; else adapter->hw.mac.forced_speed_duplex = ADVERTISE_10_HALF; break; default: device_printf(adapter->dev, "Unsupported media type\n"); } em_if_init(ctx); return (0); } static int em_if_set_promisc(if_ctx_t ctx, int flags) { struct adapter *adapter = iflib_get_softc(ctx); u32 reg_rctl; em_disable_promisc(ctx); reg_rctl = E1000_READ_REG(&adapter->hw, E1000_RCTL); if (flags & IFF_PROMISC) { reg_rctl |= (E1000_RCTL_UPE | E1000_RCTL_MPE); /* Turn this on if you want to see bad packets */ if (em_debug_sbp) reg_rctl |= E1000_RCTL_SBP; E1000_WRITE_REG(&adapter->hw, E1000_RCTL, reg_rctl); } else if (flags & IFF_ALLMULTI) { reg_rctl |= E1000_RCTL_MPE; reg_rctl &= ~E1000_RCTL_UPE; E1000_WRITE_REG(&adapter->hw, E1000_RCTL, reg_rctl); } return (0); } static void em_disable_promisc(if_ctx_t ctx) { struct adapter *adapter = iflib_get_softc(ctx); struct ifnet *ifp = iflib_get_ifp(ctx); u32 reg_rctl; int mcnt = 0; reg_rctl = E1000_READ_REG(&adapter->hw, E1000_RCTL); reg_rctl &= (~E1000_RCTL_UPE); if (if_getflags(ifp) & IFF_ALLMULTI) mcnt = MAX_NUM_MULTICAST_ADDRESSES; else mcnt = if_multiaddr_count(ifp, MAX_NUM_MULTICAST_ADDRESSES); /* Don't disable if in MAX groups */ if (mcnt < MAX_NUM_MULTICAST_ADDRESSES) reg_rctl &= (~E1000_RCTL_MPE); reg_rctl &= (~E1000_RCTL_SBP); 
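	/* Write the updated receive-control value back to the RCTL register */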
E1000_WRITE_REG(&adapter->hw, E1000_RCTL, reg_rctl); } /********************************************************************* * Multicast Update * * This routine is called whenever multicast address list is updated. * **********************************************************************/ static void em_if_multi_set(if_ctx_t ctx) { struct adapter *adapter = iflib_get_softc(ctx); struct ifnet *ifp = iflib_get_ifp(ctx); u32 reg_rctl = 0; u8 *mta; /* Multicast array memory */ int mcnt = 0; IOCTL_DEBUGOUT("em_set_multi: begin"); mta = adapter->mta; bzero(mta, sizeof(u8) * ETH_ADDR_LEN * MAX_NUM_MULTICAST_ADDRESSES); if (adapter->hw.mac.type == e1000_82542 && adapter->hw.revision_id == E1000_REVISION_2) { reg_rctl = E1000_READ_REG(&adapter->hw, E1000_RCTL); if (adapter->hw.bus.pci_cmd_word & CMD_MEM_WRT_INVALIDATE) e1000_pci_clear_mwi(&adapter->hw); reg_rctl |= E1000_RCTL_RST; E1000_WRITE_REG(&adapter->hw, E1000_RCTL, reg_rctl); msec_delay(5); } if_multiaddr_array(ifp, mta, &mcnt, MAX_NUM_MULTICAST_ADDRESSES); if (mcnt >= MAX_NUM_MULTICAST_ADDRESSES) { reg_rctl = E1000_READ_REG(&adapter->hw, E1000_RCTL); reg_rctl |= E1000_RCTL_MPE; E1000_WRITE_REG(&adapter->hw, E1000_RCTL, reg_rctl); } else e1000_update_mc_addr_list(&adapter->hw, mta, mcnt); if (adapter->hw.mac.type == e1000_82542 && adapter->hw.revision_id == E1000_REVISION_2) { reg_rctl = E1000_READ_REG(&adapter->hw, E1000_RCTL); reg_rctl &= ~E1000_RCTL_RST; E1000_WRITE_REG(&adapter->hw, E1000_RCTL, reg_rctl); msec_delay(5); if (adapter->hw.bus.pci_cmd_word & CMD_MEM_WRT_INVALIDATE) e1000_pci_set_mwi(&adapter->hw); } } /********************************************************************* * Timer routine * * This routine checks for link status and updates statistics. * **********************************************************************/ static void em_if_timer(if_ctx_t ctx, uint16_t qid) { struct adapter *adapter = iflib_get_softc(ctx); struct em_rx_queue *que; int i; int trigger = 0; if (qid != 0) return; iflib_admin_intr_deferred(ctx); /* Reset LAA into RAR[0] on 82571 */ if ((adapter->hw.mac.type == e1000_82571) && e1000_get_laa_state_82571(&adapter->hw)) e1000_rar_set(&adapter->hw, adapter->hw.mac.addr, 0); if (adapter->hw.mac.type < em_mac_min) lem_smartspeed(adapter); /* Mask to use in the irq trigger */ if (adapter->intr_type == IFLIB_INTR_MSIX) { for (i = 0, que = adapter->rx_queues; i < adapter->rx_num_queues; i++, que++) trigger |= que->eims; } else { trigger = E1000_ICS_RXDMT0; } } static void em_if_update_admin_status(if_ctx_t ctx) { struct adapter *adapter = iflib_get_softc(ctx); struct e1000_hw *hw = &adapter->hw; struct ifnet *ifp = iflib_get_ifp(ctx); device_t dev = iflib_get_dev(ctx); u32 link_check, thstat, ctrl; link_check = thstat = ctrl = 0; /* Get the cached link value or read phy for real */ switch (hw->phy.media_type) { case e1000_media_type_copper: if (hw->mac.get_link_status) { if (hw->mac.type == e1000_pch_spt) msec_delay(50); /* Do the work to read phy */ e1000_check_for_link(hw); link_check = !hw->mac.get_link_status; if (link_check) /* ESB2 fix */ e1000_cfg_on_link_up(hw); } else { link_check = TRUE; } break; case e1000_media_type_fiber: e1000_check_for_link(hw); link_check = (E1000_READ_REG(hw, E1000_STATUS) & E1000_STATUS_LU); break; case e1000_media_type_internal_serdes: e1000_check_for_link(hw); link_check = adapter->hw.mac.serdes_has_link; break; /* VF device is type_unknown */ case e1000_media_type_unknown: e1000_check_for_link(hw); link_check = !hw->mac.get_link_status; /* FALLTHROUGH */ default: break; 
} /* Check for thermal downshift or shutdown */ if (hw->mac.type == e1000_i350) { thstat = E1000_READ_REG(hw, E1000_THSTAT); ctrl = E1000_READ_REG(hw, E1000_CTRL_EXT); } /* Now check for a transition */ if (link_check && (adapter->link_active == 0)) { e1000_get_speed_and_duplex(hw, &adapter->link_speed, &adapter->link_duplex); /* Check if we must disable SPEED_MODE bit on PCI-E */ if ((adapter->link_speed != SPEED_1000) && ((hw->mac.type == e1000_82571) || (hw->mac.type == e1000_82572))) { int tarc0; tarc0 = E1000_READ_REG(hw, E1000_TARC(0)); tarc0 &= ~TARC_SPEED_MODE_BIT; E1000_WRITE_REG(hw, E1000_TARC(0), tarc0); } if (bootverbose) device_printf(dev, "Link is up %d Mbps %s\n", adapter->link_speed, ((adapter->link_duplex == FULL_DUPLEX) ? "Full Duplex" : "Half Duplex")); adapter->link_active = 1; adapter->smartspeed = 0; if_setbaudrate(ifp, adapter->link_speed * 1000000); if ((ctrl & E1000_CTRL_EXT_LINK_MODE_GMII) && (thstat & E1000_THSTAT_LINK_THROTTLE)) device_printf(dev, "Link: thermal downshift\n"); /* Delay Link Up for Phy update */ if (((hw->mac.type == e1000_i210) || (hw->mac.type == e1000_i211)) && (hw->phy.id == I210_I_PHY_ID)) msec_delay(I210_LINK_DELAY); /* Reset if the media type changed. */ if ((hw->dev_spec._82575.media_changed) && (adapter->hw.mac.type >= igb_mac_min)) { hw->dev_spec._82575.media_changed = false; adapter->flags |= IGB_MEDIA_RESET; em_reset(ctx); } iflib_link_state_change(ctx, LINK_STATE_UP, ifp->if_baudrate); printf("Link state changed to up\n"); } else if (!link_check && (adapter->link_active == 1)) { if_setbaudrate(ifp, 0); adapter->link_speed = 0; adapter->link_duplex = 0; if (bootverbose) device_printf(dev, "Link is Down\n"); adapter->link_active = 0; iflib_link_state_change(ctx, LINK_STATE_DOWN, ifp->if_baudrate); printf("link state changed to down\n"); } em_update_stats_counters(adapter); E1000_WRITE_REG(&adapter->hw, E1000_IMS, EM_MSIX_LINK | E1000_IMS_LSC); } /********************************************************************* * * This routine disables all traffic on the adapter by issuing a * global reset on the MAC and deallocates TX/RX buffers. * * This routine should always be called with BOTH the CORE * and TX locks. **********************************************************************/ static void em_if_stop(if_ctx_t ctx) { struct adapter *adapter = iflib_get_softc(ctx); INIT_DEBUGOUT("em_stop: begin"); e1000_reset_hw(&adapter->hw); if (adapter->hw.mac.type >= e1000_82544) E1000_WRITE_REG(&adapter->hw, E1000_WUFC, 0); e1000_led_off(&adapter->hw); e1000_cleanup_led(&adapter->hw); } /********************************************************************* * * Determine hardware revision. 
* **********************************************************************/ static void em_identify_hardware(if_ctx_t ctx) { device_t dev = iflib_get_dev(ctx); struct adapter *adapter = iflib_get_softc(ctx); /* Make sure our PCI config space has the necessary stuff set */ adapter->hw.bus.pci_cmd_word = pci_read_config(dev, PCIR_COMMAND, 2); /* Save off the information about this board */ adapter->hw.vendor_id = pci_get_vendor(dev); adapter->hw.device_id = pci_get_device(dev); adapter->hw.revision_id = pci_read_config(dev, PCIR_REVID, 1); adapter->hw.subsystem_vendor_id = pci_read_config(dev, PCIR_SUBVEND_0, 2); adapter->hw.subsystem_device_id = pci_read_config(dev, PCIR_SUBDEV_0, 2); /* Do Shared Code Init and Setup */ if (e1000_set_mac_type(&adapter->hw)) { device_printf(dev, "Setup init failure\n"); return; } } static int em_allocate_pci_resources(if_ctx_t ctx) { struct adapter *adapter = iflib_get_softc(ctx); device_t dev = iflib_get_dev(ctx); int rid, val; rid = PCIR_BAR(0); adapter->memory = bus_alloc_resource_any(dev, SYS_RES_MEMORY, &rid, RF_ACTIVE); if (adapter->memory == NULL) { device_printf(dev, "Unable to allocate bus resource: memory\n"); return (ENXIO); } adapter->osdep.mem_bus_space_tag = rman_get_bustag(adapter->memory); adapter->osdep.mem_bus_space_handle = rman_get_bushandle(adapter->memory); adapter->hw.hw_addr = (u8 *)&adapter->osdep.mem_bus_space_handle; /* Only older adapters use IO mapping */ if (adapter->hw.mac.type < em_mac_min && adapter->hw.mac.type > e1000_82543) { /* Figure our where our IO BAR is ? */ for (rid = PCIR_BAR(0); rid < PCIR_CIS;) { val = pci_read_config(dev, rid, 4); if (EM_BAR_TYPE(val) == EM_BAR_TYPE_IO) { adapter->io_rid = rid; break; } rid += 4; /* check for 64bit BAR */ if (EM_BAR_MEM_TYPE(val) == EM_BAR_MEM_TYPE_64BIT) rid += 4; } if (rid >= PCIR_CIS) { device_printf(dev, "Unable to locate IO BAR\n"); return (ENXIO); } adapter->ioport = bus_alloc_resource_any(dev, SYS_RES_IOPORT, &adapter->io_rid, RF_ACTIVE); if (adapter->ioport == NULL) { device_printf(dev, "Unable to allocate bus resource: " "ioport\n"); return (ENXIO); } adapter->hw.io_base = 0; adapter->osdep.io_bus_space_tag = rman_get_bustag(adapter->ioport); adapter->osdep.io_bus_space_handle = rman_get_bushandle(adapter->ioport); } adapter->hw.back = &adapter->osdep; return (0); } /********************************************************************* * * Setup the MSIX Interrupt handlers * **********************************************************************/ static int em_if_msix_intr_assign(if_ctx_t ctx, int msix) { struct adapter *adapter = iflib_get_softc(ctx); struct em_rx_queue *rx_que = adapter->rx_queues; struct em_tx_queue *tx_que = adapter->tx_queues; int error, rid, i, vector = 0, rx_vectors; char buf[16]; /* First set up ring resources */ for (i = 0; i < adapter->rx_num_queues; i++, rx_que++, vector++) { rid = vector + 1; snprintf(buf, sizeof(buf), "rxq%d", i); error = iflib_irq_alloc_generic(ctx, &rx_que->que_irq, rid, IFLIB_INTR_RXTX, em_msix_que, rx_que, rx_que->me, buf); if (error) { device_printf(iflib_get_dev(ctx), "Failed to allocate que int %d err: %d", i, error); adapter->rx_num_queues = i + 1; goto fail; } rx_que->msix = vector; /* * Set the bit to enable interrupt * in E1000_IMS -- bits 20 and 21 * are for RX0 and RX1, note this has * NOTHING to do with the MSIX vector */ if (adapter->hw.mac.type == e1000_82574) { rx_que->eims = 1 << (20 + i); adapter->ims |= rx_que->eims; adapter->ivars |= (8 | rx_que->msix) << (i * 4); } else if (adapter->hw.mac.type == 
e1000_82575) rx_que->eims = E1000_EICR_TX_QUEUE0 << vector; else rx_que->eims = 1 << vector; } rx_vectors = vector; vector = 0; for (i = 0; i < adapter->tx_num_queues; i++, tx_que++, vector++) { rid = vector + 1; snprintf(buf, sizeof(buf), "txq%d", i); tx_que = &adapter->tx_queues[i]; iflib_softirq_alloc_generic(ctx, rid, IFLIB_INTR_TX, tx_que, tx_que->me, buf); tx_que->msix = (vector % adapter->tx_num_queues); /* * Set the bit to enable interrupt * in E1000_IMS -- bits 22 and 23 * are for TX0 and TX1, note this has * NOTHING to do with the MSIX vector */ if (adapter->hw.mac.type == e1000_82574) { tx_que->eims = 1 << (22 + i); adapter->ims |= tx_que->eims; adapter->ivars |= (8 | tx_que->msix) << (8 + (i * 4)); } else if (adapter->hw.mac.type == e1000_82575) { tx_que->eims = E1000_EICR_TX_QUEUE0 << (i % adapter->tx_num_queues); } else { tx_que->eims = 1 << (i % adapter->tx_num_queues); } } /* Link interrupt */ rid = rx_vectors + 1; error = iflib_irq_alloc_generic(ctx, &adapter->irq, rid, IFLIB_INTR_ADMIN, em_msix_link, adapter, 0, "aq"); if (error) { device_printf(iflib_get_dev(ctx), "Failed to register admin handler"); goto fail; } adapter->linkvec = rx_vectors; if (adapter->hw.mac.type < igb_mac_min) { adapter->ivars |= (8 | rx_vectors) << 16; adapter->ivars |= 0x80000000; } return (0); fail: iflib_irq_free(ctx, &adapter->irq); rx_que = adapter->rx_queues; for (int i = 0; i < adapter->rx_num_queues; i++, rx_que++) iflib_irq_free(ctx, &rx_que->que_irq); return (error); } static void igb_configure_queues(struct adapter *adapter) { struct e1000_hw *hw = &adapter->hw; struct em_rx_queue *rx_que; struct em_tx_queue *tx_que; u32 tmp, ivar = 0, newitr = 0; /* First turn on RSS capability */ if (adapter->hw.mac.type != e1000_82575) E1000_WRITE_REG(hw, E1000_GPIE, E1000_GPIE_MSIX_MODE | E1000_GPIE_EIAME | E1000_GPIE_PBA | E1000_GPIE_NSICR); /* Turn on MSIX */ switch (adapter->hw.mac.type) { case e1000_82580: case e1000_i350: case e1000_i354: case e1000_i210: case e1000_i211: case e1000_vfadapt: case e1000_vfadapt_i350: /* RX entries */ for (int i = 0; i < adapter->rx_num_queues; i++) { u32 index = i >> 1; ivar = E1000_READ_REG_ARRAY(hw, E1000_IVAR0, index); rx_que = &adapter->rx_queues[i]; if (i & 1) { ivar &= 0xFF00FFFF; ivar |= (rx_que->msix | E1000_IVAR_VALID) << 16; } else { ivar &= 0xFFFFFF00; ivar |= rx_que->msix | E1000_IVAR_VALID; } E1000_WRITE_REG_ARRAY(hw, E1000_IVAR0, index, ivar); } /* TX entries */ for (int i = 0; i < adapter->tx_num_queues; i++) { u32 index = i >> 1; ivar = E1000_READ_REG_ARRAY(hw, E1000_IVAR0, index); tx_que = &adapter->tx_queues[i]; if (i & 1) { ivar &= 0x00FFFFFF; ivar |= (tx_que->msix | E1000_IVAR_VALID) << 24; } else { ivar &= 0xFFFF00FF; ivar |= (tx_que->msix | E1000_IVAR_VALID) << 8; } E1000_WRITE_REG_ARRAY(hw, E1000_IVAR0, index, ivar); adapter->que_mask |= tx_que->eims; } /* And for the link interrupt */ ivar = (adapter->linkvec | E1000_IVAR_VALID) << 8; adapter->link_mask = 1 << adapter->linkvec; E1000_WRITE_REG(hw, E1000_IVAR_MISC, ivar); break; case e1000_82576: /* RX entries */ for (int i = 0; i < adapter->rx_num_queues; i++) { u32 index = i & 0x7; /* Each IVAR has two entries */ ivar = E1000_READ_REG_ARRAY(hw, E1000_IVAR0, index); rx_que = &adapter->rx_queues[i]; if (i < 8) { ivar &= 0xFFFFFF00; ivar |= rx_que->msix | E1000_IVAR_VALID; } else { ivar &= 0xFF00FFFF; ivar |= (rx_que->msix | E1000_IVAR_VALID) << 16; } E1000_WRITE_REG_ARRAY(hw, E1000_IVAR0, index, ivar); adapter->que_mask |= rx_que->eims; } /* TX entries */ for (int i = 0; i < 
adapter->tx_num_queues; i++) { u32 index = i & 0x7; /* Each IVAR has two entries */ ivar = E1000_READ_REG_ARRAY(hw, E1000_IVAR0, index); tx_que = &adapter->tx_queues[i]; if (i < 8) { ivar &= 0xFFFF00FF; ivar |= (tx_que->msix | E1000_IVAR_VALID) << 8; } else { ivar &= 0x00FFFFFF; ivar |= (tx_que->msix | E1000_IVAR_VALID) << 24; } E1000_WRITE_REG_ARRAY(hw, E1000_IVAR0, index, ivar); adapter->que_mask |= tx_que->eims; } /* And for the link interrupt */ ivar = (adapter->linkvec | E1000_IVAR_VALID) << 8; adapter->link_mask = 1 << adapter->linkvec; E1000_WRITE_REG(hw, E1000_IVAR_MISC, ivar); break; case e1000_82575: /* enable MSI-X support*/ tmp = E1000_READ_REG(hw, E1000_CTRL_EXT); tmp |= E1000_CTRL_EXT_PBA_CLR; /* Auto-Mask interrupts upon ICR read. */ tmp |= E1000_CTRL_EXT_EIAME; tmp |= E1000_CTRL_EXT_IRCA; E1000_WRITE_REG(hw, E1000_CTRL_EXT, tmp); /* Queues */ for (int i = 0; i < adapter->rx_num_queues; i++) { rx_que = &adapter->rx_queues[i]; tmp = E1000_EICR_RX_QUEUE0 << i; tmp |= E1000_EICR_TX_QUEUE0 << i; rx_que->eims = tmp; E1000_WRITE_REG_ARRAY(hw, E1000_MSIXBM(0), i, rx_que->eims); adapter->que_mask |= rx_que->eims; } /* Link */ E1000_WRITE_REG(hw, E1000_MSIXBM(adapter->linkvec), E1000_EIMS_OTHER); adapter->link_mask |= E1000_EIMS_OTHER; default: break; } /* Set the starting interrupt rate */ if (em_max_interrupt_rate > 0) newitr = (4000000 / em_max_interrupt_rate) & 0x7FFC; if (hw->mac.type == e1000_82575) newitr |= newitr << 16; else newitr |= E1000_EITR_CNT_IGNR; for (int i = 0; i < adapter->rx_num_queues; i++) { rx_que = &adapter->rx_queues[i]; E1000_WRITE_REG(hw, E1000_EITR(rx_que->msix), newitr); } return; } static void em_free_pci_resources(if_ctx_t ctx) { struct adapter *adapter = iflib_get_softc(ctx); struct em_rx_queue *que = adapter->rx_queues; device_t dev = iflib_get_dev(ctx); /* Release all msix queue resources */ if (adapter->intr_type == IFLIB_INTR_MSIX) iflib_irq_free(ctx, &adapter->irq); for (int i = 0; i < adapter->rx_num_queues; i++, que++) { iflib_irq_free(ctx, &que->que_irq); } /* First release all the interrupt resources */ if (adapter->memory != NULL) { bus_release_resource(dev, SYS_RES_MEMORY, PCIR_BAR(0), adapter->memory); adapter->memory = NULL; } if (adapter->flash != NULL) { bus_release_resource(dev, SYS_RES_MEMORY, EM_FLASH, adapter->flash); adapter->flash = NULL; } if (adapter->ioport != NULL) bus_release_resource(dev, SYS_RES_IOPORT, adapter->io_rid, adapter->ioport); } /* Setup MSI or MSI/X */ static int em_setup_msix(if_ctx_t ctx) { struct adapter *adapter = iflib_get_softc(ctx); if (adapter->hw.mac.type == e1000_82574) { em_enable_vectors_82574(ctx); } return (0); } /********************************************************************* * * Initialize the hardware to a configuration * as specified by the adapter structure. 
* **********************************************************************/ static void lem_smartspeed(struct adapter *adapter) { u16 phy_tmp; if (adapter->link_active || (adapter->hw.phy.type != e1000_phy_igp) || adapter->hw.mac.autoneg == 0 || (adapter->hw.phy.autoneg_advertised & ADVERTISE_1000_FULL) == 0) return; if (adapter->smartspeed == 0) { /* If Master/Slave config fault is asserted twice, * we assume back-to-back */ e1000_read_phy_reg(&adapter->hw, PHY_1000T_STATUS, &phy_tmp); if (!(phy_tmp & SR_1000T_MS_CONFIG_FAULT)) return; e1000_read_phy_reg(&adapter->hw, PHY_1000T_STATUS, &phy_tmp); if (phy_tmp & SR_1000T_MS_CONFIG_FAULT) { e1000_read_phy_reg(&adapter->hw, PHY_1000T_CTRL, &phy_tmp); if(phy_tmp & CR_1000T_MS_ENABLE) { phy_tmp &= ~CR_1000T_MS_ENABLE; e1000_write_phy_reg(&adapter->hw, PHY_1000T_CTRL, phy_tmp); adapter->smartspeed++; if(adapter->hw.mac.autoneg && !e1000_copper_link_autoneg(&adapter->hw) && !e1000_read_phy_reg(&adapter->hw, PHY_CONTROL, &phy_tmp)) { phy_tmp |= (MII_CR_AUTO_NEG_EN | MII_CR_RESTART_AUTO_NEG); e1000_write_phy_reg(&adapter->hw, PHY_CONTROL, phy_tmp); } } } return; } else if(adapter->smartspeed == EM_SMARTSPEED_DOWNSHIFT) { /* If still no link, perhaps using 2/3 pair cable */ e1000_read_phy_reg(&adapter->hw, PHY_1000T_CTRL, &phy_tmp); phy_tmp |= CR_1000T_MS_ENABLE; e1000_write_phy_reg(&adapter->hw, PHY_1000T_CTRL, phy_tmp); if(adapter->hw.mac.autoneg && !e1000_copper_link_autoneg(&adapter->hw) && !e1000_read_phy_reg(&adapter->hw, PHY_CONTROL, &phy_tmp)) { phy_tmp |= (MII_CR_AUTO_NEG_EN | MII_CR_RESTART_AUTO_NEG); e1000_write_phy_reg(&adapter->hw, PHY_CONTROL, phy_tmp); } } /* Restart process after EM_SMARTSPEED_MAX iterations */ if(adapter->smartspeed++ == EM_SMARTSPEED_MAX) adapter->smartspeed = 0; } /********************************************************************* * * Initialize the DMA Coalescing feature * **********************************************************************/ static void igb_init_dmac(struct adapter *adapter, u32 pba) { device_t dev = adapter->dev; struct e1000_hw *hw = &adapter->hw; u32 dmac, reg = ~E1000_DMACR_DMAC_EN; u16 hwm; u16 max_frame_size; if (hw->mac.type == e1000_i211) return; max_frame_size = adapter->shared->isc_max_frame_size; if (hw->mac.type > e1000_82580) { if (adapter->dmac == 0) { /* Disabling it */ E1000_WRITE_REG(hw, E1000_DMACR, reg); return; } else device_printf(dev, "DMA Coalescing enabled\n"); /* Set starting threshold */ E1000_WRITE_REG(hw, E1000_DMCTXTH, 0); hwm = 64 * pba - max_frame_size / 16; if (hwm < 64 * (pba - 6)) hwm = 64 * (pba - 6); reg = E1000_READ_REG(hw, E1000_FCRTC); reg &= ~E1000_FCRTC_RTH_COAL_MASK; reg |= ((hwm << E1000_FCRTC_RTH_COAL_SHIFT) & E1000_FCRTC_RTH_COAL_MASK); E1000_WRITE_REG(hw, E1000_FCRTC, reg); dmac = pba - max_frame_size / 512; if (dmac < pba - 10) dmac = pba - 10; reg = E1000_READ_REG(hw, E1000_DMACR); reg &= ~E1000_DMACR_DMACTHR_MASK; reg = ((dmac << E1000_DMACR_DMACTHR_SHIFT) & E1000_DMACR_DMACTHR_MASK); /* transition to L0x or L1 if available..*/ reg |= (E1000_DMACR_DMAC_EN | E1000_DMACR_DMAC_LX_MASK); /* Check if status is 2.5Gb backplane connection * before configuration of watchdog timer, which is * in msec values in 12.8usec intervals * watchdog timer= msec values in 32usec intervals * for non 2.5Gb connection */ if (hw->mac.type == e1000_i354) { int status = E1000_READ_REG(hw, E1000_STATUS); if ((status & E1000_STATUS_2P5_SKU) && (!(status & E1000_STATUS_2P5_SKU_OVER))) reg |= ((adapter->dmac * 5) >> 6); else reg |= (adapter->dmac >> 5); } else { reg |= 
(adapter->dmac >> 5); } E1000_WRITE_REG(hw, E1000_DMACR, reg); E1000_WRITE_REG(hw, E1000_DMCRTRH, 0); /* Set the interval before transition */ reg = E1000_READ_REG(hw, E1000_DMCTLX); if (hw->mac.type == e1000_i350) reg |= IGB_DMCTLX_DCFLUSH_DIS; /* ** in 2.5Gb connection, TTLX unit is 0.4 usec ** which is 0x4*2 = 0xA. But delay is still 4 usec */ if (hw->mac.type == e1000_i354) { int status = E1000_READ_REG(hw, E1000_STATUS); if ((status & E1000_STATUS_2P5_SKU) && (!(status & E1000_STATUS_2P5_SKU_OVER))) reg |= 0xA; else reg |= 0x4; } else { reg |= 0x4; } E1000_WRITE_REG(hw, E1000_DMCTLX, reg); /* free space in tx packet buffer to wake from DMA coal */ E1000_WRITE_REG(hw, E1000_DMCTXTH, (IGB_TXPBSIZE - (2 * max_frame_size)) >> 6); /* make low power state decision controlled by DMA coal */ reg = E1000_READ_REG(hw, E1000_PCIEMISC); reg &= ~E1000_PCIEMISC_LX_DECISION; E1000_WRITE_REG(hw, E1000_PCIEMISC, reg); } else if (hw->mac.type == e1000_82580) { u32 reg = E1000_READ_REG(hw, E1000_PCIEMISC); E1000_WRITE_REG(hw, E1000_PCIEMISC, reg & ~E1000_PCIEMISC_LX_DECISION); E1000_WRITE_REG(hw, E1000_DMACR, 0); } } static void em_reset(if_ctx_t ctx) { device_t dev = iflib_get_dev(ctx); struct adapter *adapter = iflib_get_softc(ctx); struct ifnet *ifp = iflib_get_ifp(ctx); struct e1000_hw *hw = &adapter->hw; u16 rx_buffer_size; u32 pba; INIT_DEBUGOUT("em_reset: begin"); /* Let the firmware know the OS is in control */ em_get_hw_control(adapter); /* Set up smart power down as default off on newer adapters. */ if (!em_smart_pwr_down && (hw->mac.type == e1000_82571 || hw->mac.type == e1000_82572)) { u16 phy_tmp = 0; /* Speed up time to link by disabling smart power down. */ e1000_read_phy_reg(hw, IGP02E1000_PHY_POWER_MGMT, &phy_tmp); phy_tmp &= ~IGP02E1000_PM_SPD; e1000_write_phy_reg(hw, IGP02E1000_PHY_POWER_MGMT, phy_tmp); } /* * Packet Buffer Allocation (PBA) * Writing PBA sets the receive portion of the buffer * the remainder is used for the transmit buffer. 
*/ switch (hw->mac.type) { /* Total Packet Buffer on these is 48K */ case e1000_82571: case e1000_82572: case e1000_80003es2lan: pba = E1000_PBA_32K; /* 32K for Rx, 16K for Tx */ break; case e1000_82573: /* 82573: Total Packet Buffer is 32K */ pba = E1000_PBA_12K; /* 12K for Rx, 20K for Tx */ break; case e1000_82574: case e1000_82583: pba = E1000_PBA_20K; /* 20K for Rx, 20K for Tx */ break; case e1000_ich8lan: pba = E1000_PBA_8K; break; case e1000_ich9lan: case e1000_ich10lan: /* Boost Receive side for jumbo frames */ if (adapter->hw.mac.max_frame_size > 4096) pba = E1000_PBA_14K; else pba = E1000_PBA_10K; break; case e1000_pchlan: case e1000_pch2lan: case e1000_pch_lpt: case e1000_pch_spt: pba = E1000_PBA_26K; break; case e1000_82575: pba = E1000_PBA_32K; break; case e1000_82576: case e1000_vfadapt: pba = E1000_READ_REG(hw, E1000_RXPBS); pba &= E1000_RXPBS_SIZE_MASK_82576; break; case e1000_82580: case e1000_i350: case e1000_i354: case e1000_vfadapt_i350: pba = E1000_READ_REG(hw, E1000_RXPBS); pba = e1000_rxpbs_adjust_82580(pba); break; case e1000_i210: case e1000_i211: pba = E1000_PBA_34K; break; default: if (adapter->hw.mac.max_frame_size > 8192) pba = E1000_PBA_40K; /* 40K for Rx, 24K for Tx */ else pba = E1000_PBA_48K; /* 48K for Rx, 16K for Tx */ } /* Special needs in case of Jumbo frames */ if ((hw->mac.type == e1000_82575) && (ifp->if_mtu > ETHERMTU)) { u32 tx_space, min_tx, min_rx; pba = E1000_READ_REG(hw, E1000_PBA); tx_space = pba >> 16; pba &= 0xffff; min_tx = (adapter->hw.mac.max_frame_size + sizeof(struct e1000_tx_desc) - ETHERNET_FCS_SIZE) * 2; min_tx = roundup2(min_tx, 1024); min_tx >>= 10; min_rx = adapter->hw.mac.max_frame_size; min_rx = roundup2(min_rx, 1024); min_rx >>= 10; if (tx_space < min_tx && ((min_tx - tx_space) < pba)) { pba = pba - (min_tx - tx_space); /* * if short on rx space, rx wins * and must trump tx adjustment */ if (pba < min_rx) pba = min_rx; } E1000_WRITE_REG(hw, E1000_PBA, pba); } if (hw->mac.type < igb_mac_min) E1000_WRITE_REG(&adapter->hw, E1000_PBA, pba); INIT_DEBUGOUT1("em_reset: pba=%dK",pba); /* * These parameters control the automatic generation (Tx) and * response (Rx) to Ethernet PAUSE frames. * - High water mark should allow for at least two frames to be * received after sending an XOFF. * - Low water mark works best when it is very near the high water mark. * This allows the receiver to restart by sending XON when it has * drained a bit. Here we use an arbitrary value of 1500 which will * restart after one full frame is pulled from the buffer. There * could be several smaller frames in the buffer and if so they will * not trigger the XON until their total number reduces the buffer * by 1500. * - The pause time is fairly large at 1000 x 512ns = 512 usec. */ rx_buffer_size = (pba & 0xffff) << 10; hw->fc.high_water = rx_buffer_size - roundup2(adapter->hw.mac.max_frame_size, 1024); hw->fc.low_water = hw->fc.high_water - 1500; if (adapter->fc) /* locally set flow control value? 
*/ hw->fc.requested_mode = adapter->fc; else hw->fc.requested_mode = e1000_fc_full; if (hw->mac.type == e1000_80003es2lan) hw->fc.pause_time = 0xFFFF; else hw->fc.pause_time = EM_FC_PAUSE_TIME; hw->fc.send_xon = TRUE; /* Device specific overrides/settings */ switch (hw->mac.type) { case e1000_pchlan: /* Workaround: no TX flow ctrl for PCH */ hw->fc.requested_mode = e1000_fc_rx_pause; hw->fc.pause_time = 0xFFFF; /* override */ if (if_getmtu(ifp) > ETHERMTU) { hw->fc.high_water = 0x3500; hw->fc.low_water = 0x1500; } else { hw->fc.high_water = 0x5000; hw->fc.low_water = 0x3000; } hw->fc.refresh_time = 0x1000; break; case e1000_pch2lan: case e1000_pch_lpt: case e1000_pch_spt: hw->fc.high_water = 0x5C20; hw->fc.low_water = 0x5048; hw->fc.pause_time = 0x0650; hw->fc.refresh_time = 0x0400; /* Jumbos need adjusted PBA */ if (if_getmtu(ifp) > ETHERMTU) E1000_WRITE_REG(hw, E1000_PBA, 12); else E1000_WRITE_REG(hw, E1000_PBA, 26); break; case e1000_82575: case e1000_82576: /* 8-byte granularity */ hw->fc.low_water = hw->fc.high_water - 8; break; case e1000_82580: case e1000_i350: case e1000_i354: case e1000_i210: case e1000_i211: case e1000_vfadapt: case e1000_vfadapt_i350: /* 16-byte granularity */ hw->fc.low_water = hw->fc.high_water - 16; break; case e1000_ich9lan: case e1000_ich10lan: if (if_getmtu(ifp) > ETHERMTU) { hw->fc.high_water = 0x2800; hw->fc.low_water = hw->fc.high_water - 8; break; } /* FALLTHROUGH */ default: if (hw->mac.type == e1000_80003es2lan) hw->fc.pause_time = 0xFFFF; break; } /* Issue a global reset */ e1000_reset_hw(hw); if (adapter->hw.mac.type >= igb_mac_min) { E1000_WRITE_REG(hw, E1000_WUC, 0); } else { E1000_WRITE_REG(hw, E1000_WUFC, 0); em_disable_aspm(adapter); } if (adapter->flags & IGB_MEDIA_RESET) { e1000_setup_init_funcs(hw, TRUE); e1000_get_bus_info(hw); adapter->flags &= ~IGB_MEDIA_RESET; } /* and a re-init */ if (e1000_init_hw(hw) < 0) { device_printf(dev, "Hardware Initialization Failed\n"); return; } if (adapter->hw.mac.type >= igb_mac_min) igb_init_dmac(adapter, pba); E1000_WRITE_REG(hw, E1000_VET, ETHERTYPE_VLAN); e1000_get_phy_info(hw); e1000_check_for_link(hw); } #define RSSKEYLEN 10 static void em_initialize_rss_mapping(struct adapter *adapter) { uint8_t rss_key[4 * RSSKEYLEN]; uint32_t reta = 0; struct e1000_hw *hw = &adapter->hw; int i; /* * Configure RSS key */ arc4rand(rss_key, sizeof(rss_key), 0); for (i = 0; i < RSSKEYLEN; ++i) { uint32_t rssrk = 0; rssrk = EM_RSSRK_VAL(rss_key, i); E1000_WRITE_REG(hw,E1000_RSSRK(i), rssrk); } /* * Configure RSS redirect table in following fashion: * (hash & ring_cnt_mask) == rdr_table[(hash & rdr_table_mask)] */ for (i = 0; i < sizeof(reta); ++i) { uint32_t q; q = (i % adapter->rx_num_queues) << 7; reta |= q << (8 * i); } for (i = 0; i < 32; ++i) E1000_WRITE_REG(hw, E1000_RETA(i), reta); E1000_WRITE_REG(hw, E1000_MRQC, E1000_MRQC_RSS_ENABLE_2Q | E1000_MRQC_RSS_FIELD_IPV4_TCP | E1000_MRQC_RSS_FIELD_IPV4 | E1000_MRQC_RSS_FIELD_IPV6_TCP_EX | E1000_MRQC_RSS_FIELD_IPV6_EX | E1000_MRQC_RSS_FIELD_IPV6); } static void igb_initialize_rss_mapping(struct adapter *adapter) { struct e1000_hw *hw = &adapter->hw; int i; int queue_id; u32 reta; u32 rss_key[10], mrqc, shift = 0; /* XXX? */ if (adapter->hw.mac.type == e1000_82575) shift = 6; /* * The redirection table controls which destination * queue each bucket redirects traffic to. * Each DWORD represents four queues, with the LSB * being the first queue in the DWORD. * * This just allocates buckets to queues using round-robin * allocation. 
* * NOTE: It Just Happens to line up with the default * RSS allocation method. */ /* Warning FM follows */ reta = 0; for (i = 0; i < 128; i++) { #ifdef RSS queue_id = rss_get_indirection_to_bucket(i); /* * If we have more queues than buckets, we'll * end up mapping buckets to a subset of the * queues. * * If we have more buckets than queues, we'll * end up instead assigning multiple buckets * to queues. * * Both are suboptimal, but we need to handle * the case so we don't go out of bounds * indexing arrays and such. */ queue_id = queue_id % adapter->rx_num_queues; #else queue_id = (i % adapter->rx_num_queues); #endif /* Adjust if required */ queue_id = queue_id << shift; /* * The low 8 bits are for hash value (n+0); * The next 8 bits are for hash value (n+1), etc. */ reta = reta >> 8; reta = reta | ( ((uint32_t) queue_id) << 24); if ((i & 3) == 3) { E1000_WRITE_REG(hw, E1000_RETA(i >> 2), reta); reta = 0; } } /* Now fill in hash table */ /* * MRQC: Multiple Receive Queues Command * Set queuing to RSS control, number depends on the device. */ mrqc = E1000_MRQC_ENABLE_RSS_8Q; #ifdef RSS /* XXX ew typecasting */ rss_getkey((uint8_t *) &rss_key); #else arc4rand(&rss_key, sizeof(rss_key), 0); #endif for (i = 0; i < 10; i++) E1000_WRITE_REG_ARRAY(hw, E1000_RSSRK(0), i, rss_key[i]); /* * Configure the RSS fields to hash upon. */ mrqc |= (E1000_MRQC_RSS_FIELD_IPV4 | E1000_MRQC_RSS_FIELD_IPV4_TCP); mrqc |= (E1000_MRQC_RSS_FIELD_IPV6 | E1000_MRQC_RSS_FIELD_IPV6_TCP); mrqc |=( E1000_MRQC_RSS_FIELD_IPV4_UDP | E1000_MRQC_RSS_FIELD_IPV6_UDP); mrqc |=( E1000_MRQC_RSS_FIELD_IPV6_UDP_EX | E1000_MRQC_RSS_FIELD_IPV6_TCP_EX); E1000_WRITE_REG(hw, E1000_MRQC, mrqc); } /********************************************************************* * * Setup networking device structure and register an interface. * **********************************************************************/ static int em_setup_interface(if_ctx_t ctx) { struct ifnet *ifp = iflib_get_ifp(ctx); struct adapter *adapter = iflib_get_softc(ctx); if_softc_ctx_t scctx = adapter->shared; uint64_t cap = 0; INIT_DEBUGOUT("em_setup_interface: begin"); /* TSO parameters */ if_sethwtsomax(ifp, IP_MAXPACKET); /* Take m_pullup(9)'s in em_xmit() w/ TSO into acount. */ if_sethwtsomaxsegcount(ifp, EM_MAX_SCATTER - 5); if_sethwtsomaxsegsize(ifp, EM_TSO_SEG_SIZE); /* Single Queue */ if (adapter->tx_num_queues == 1) { if_setsendqlen(ifp, scctx->isc_ntxd[0] - 1); if_setsendqready(ifp); } cap = IFCAP_HWCSUM | IFCAP_VLAN_HWCSUM | IFCAP_TSO4; cap |= IFCAP_VLAN_HWTAGGING | IFCAP_VLAN_HWTSO | IFCAP_VLAN_MTU; /* * Tell the upper layer(s) we * support full VLAN capability */ if_setifheaderlen(ifp, sizeof(struct ether_vlan_header)); if_setcapabilitiesbit(ifp, cap, 0); /* * Don't turn this on by default, if vlans are * created on another pseudo device (eg. lagg) * then vlan events are not passed thru, breaking * operation, but with HW FILTER off it works. If * using vlans directly on the em driver you can * enable this and get full hardware tag filtering. 
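 * (For example - assuming ifconfig(8)'s vlanhwfilter capability flag -
 * something like "ifconfig em0 vlanhwfilter" would switch it on at
 * run time.)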
*/ if_setcapabilitiesbit(ifp, IFCAP_VLAN_HWFILTER,0); /* Enable only WOL MAGIC by default */ if (adapter->wol) { if_setcapenablebit(ifp, IFCAP_WOL_MAGIC, IFCAP_WOL_MCAST| IFCAP_WOL_UCAST); } else { if_setcapenablebit(ifp, 0, IFCAP_WOL_MAGIC | IFCAP_WOL_MCAST| IFCAP_WOL_UCAST); } /* * Specify the media types supported by this adapter and register * callbacks to update media and link information */ if ((adapter->hw.phy.media_type == e1000_media_type_fiber) || (adapter->hw.phy.media_type == e1000_media_type_internal_serdes)) { u_char fiber_type = IFM_1000_SX; /* default type */ if (adapter->hw.mac.type == e1000_82545) fiber_type = IFM_1000_LX; ifmedia_add(adapter->media, IFM_ETHER | fiber_type | IFM_FDX, 0, NULL); ifmedia_add(adapter->media, IFM_ETHER | fiber_type, 0, NULL); } else { ifmedia_add(adapter->media, IFM_ETHER | IFM_10_T, 0, NULL); ifmedia_add(adapter->media, IFM_ETHER | IFM_10_T | IFM_FDX, 0, NULL); ifmedia_add(adapter->media, IFM_ETHER | IFM_100_TX, 0, NULL); ifmedia_add(adapter->media, IFM_ETHER | IFM_100_TX | IFM_FDX, 0, NULL); if (adapter->hw.phy.type != e1000_phy_ife) { ifmedia_add(adapter->media, IFM_ETHER | IFM_1000_T | IFM_FDX, 0, NULL); ifmedia_add(adapter->media, IFM_ETHER | IFM_1000_T, 0, NULL); } } ifmedia_add(adapter->media, IFM_ETHER | IFM_AUTO, 0, NULL); ifmedia_set(adapter->media, IFM_ETHER | IFM_AUTO); return (0); } static int em_if_tx_queues_alloc(if_ctx_t ctx, caddr_t *vaddrs, uint64_t *paddrs, int ntxqs, int ntxqsets) { struct adapter *adapter = iflib_get_softc(ctx); if_softc_ctx_t scctx = adapter->shared; int error = E1000_SUCCESS; struct em_tx_queue *que; int i, j; MPASS(adapter->tx_num_queues > 0); MPASS(adapter->tx_num_queues == ntxqsets); /* First allocate the top level queue structs */ if (!(adapter->tx_queues = (struct em_tx_queue *) malloc(sizeof(struct em_tx_queue) * adapter->tx_num_queues, M_DEVBUF, M_NOWAIT | M_ZERO))) { device_printf(iflib_get_dev(ctx), "Unable to allocate queue memory\n"); return(ENOMEM); } for (i = 0, que = adapter->tx_queues; i < adapter->tx_num_queues; i++, que++) { /* Set up some basics */ struct tx_ring *txr = &que->txr; txr->adapter = que->adapter = adapter; que->me = txr->me = i; /* Allocate report status array */ if (!(txr->tx_rsq = (qidx_t *) malloc(sizeof(qidx_t) * scctx->isc_ntxd[0], M_DEVBUF, M_NOWAIT | M_ZERO))) { device_printf(iflib_get_dev(ctx), "failed to allocate rs_idxs memory\n"); error = ENOMEM; goto fail; } for (j = 0; j < scctx->isc_ntxd[0]; j++) txr->tx_rsq[j] = QIDX_INVALID; /* get the virtual and physical address of the hardware queues */ txr->tx_base = (struct e1000_tx_desc *)vaddrs[i*ntxqs]; txr->tx_paddr = paddrs[i*ntxqs]; } device_printf(iflib_get_dev(ctx), "allocated for %d tx_queues\n", adapter->tx_num_queues); return (0); fail: em_if_queues_free(ctx); return (error); } static int em_if_rx_queues_alloc(if_ctx_t ctx, caddr_t *vaddrs, uint64_t *paddrs, int nrxqs, int nrxqsets) { struct adapter *adapter = iflib_get_softc(ctx); int error = E1000_SUCCESS; struct em_rx_queue *que; int i; MPASS(adapter->rx_num_queues > 0); MPASS(adapter->rx_num_queues == nrxqsets); /* First allocate the top level queue structs */ if (!(adapter->rx_queues = (struct em_rx_queue *) malloc(sizeof(struct em_rx_queue) * adapter->rx_num_queues, M_DEVBUF, M_NOWAIT | M_ZERO))) { device_printf(iflib_get_dev(ctx), "Unable to allocate queue memory\n"); error = ENOMEM; goto fail; } for (i = 0, que = adapter->rx_queues; i < nrxqsets; i++, que++) { /* Set up some basics */ struct rx_ring *rxr = &que->rxr; rxr->adapter = que->adapter = 
adapter; rxr->que = que; que->me = rxr->me = i; /* get the virtual and physical address of the hardware queues */ rxr->rx_base = (union e1000_rx_desc_extended *)vaddrs[i*nrxqs]; rxr->rx_paddr = paddrs[i*nrxqs]; } device_printf(iflib_get_dev(ctx), "allocated for %d rx_queues\n", adapter->rx_num_queues); return (0); fail: em_if_queues_free(ctx); return (error); } static void em_if_queues_free(if_ctx_t ctx) { struct adapter *adapter = iflib_get_softc(ctx); struct em_tx_queue *tx_que = adapter->tx_queues; struct em_rx_queue *rx_que = adapter->rx_queues; if (tx_que != NULL) { for (int i = 0; i < adapter->tx_num_queues; i++, tx_que++) { struct tx_ring *txr = &tx_que->txr; if (txr->tx_rsq == NULL) break; free(txr->tx_rsq, M_DEVBUF); txr->tx_rsq = NULL; } free(adapter->tx_queues, M_DEVBUF); adapter->tx_queues = NULL; } if (rx_que != NULL) { free(adapter->rx_queues, M_DEVBUF); adapter->rx_queues = NULL; } em_release_hw_control(adapter); if (adapter->mta != NULL) { free(adapter->mta, M_DEVBUF); } } /********************************************************************* * * Enable transmit unit. * **********************************************************************/ static void em_initialize_transmit_unit(if_ctx_t ctx) { struct adapter *adapter = iflib_get_softc(ctx); if_softc_ctx_t scctx = adapter->shared; struct em_tx_queue *que; struct tx_ring *txr; struct e1000_hw *hw = &adapter->hw; u32 tctl, txdctl = 0, tarc, tipg = 0; INIT_DEBUGOUT("em_initialize_transmit_unit: begin"); for (int i = 0; i < adapter->tx_num_queues; i++, txr++) { u64 bus_addr; caddr_t offp, endp; que = &adapter->tx_queues[i]; txr = &que->txr; bus_addr = txr->tx_paddr; /* Clear checksum offload context. */ offp = (caddr_t)&txr->csum_flags; endp = (caddr_t)(txr + 1); bzero(offp, endp - offp); /* Base and Len of TX Ring */ E1000_WRITE_REG(hw, E1000_TDLEN(i), scctx->isc_ntxd[0] * sizeof(struct e1000_tx_desc)); E1000_WRITE_REG(hw, E1000_TDBAH(i), (u32)(bus_addr >> 32)); E1000_WRITE_REG(hw, E1000_TDBAL(i), (u32)bus_addr); /* Init the HEAD/TAIL indices */ E1000_WRITE_REG(hw, E1000_TDT(i), 0); E1000_WRITE_REG(hw, E1000_TDH(i), 0); HW_DEBUGOUT2("Base = %x, Length = %x\n", E1000_READ_REG(&adapter->hw, E1000_TDBAL(i)), E1000_READ_REG(&adapter->hw, E1000_TDLEN(i))); txdctl = 0; /* clear txdctl */ txdctl |= 0x1f; /* PTHRESH */ txdctl |= 1 << 8; /* HTHRESH */ txdctl |= 1 << 16;/* WTHRESH */ txdctl |= 1 << 22; /* Reserved bit 22 must always be 1 */ txdctl |= E1000_TXDCTL_GRAN; txdctl |= 1 << 25; /* LWTHRESH */ E1000_WRITE_REG(hw, E1000_TXDCTL(i), txdctl); } /* Set the default values for the Tx Inter Packet Gap timer */ switch (adapter->hw.mac.type) { case e1000_80003es2lan: tipg = DEFAULT_82543_TIPG_IPGR1; tipg |= DEFAULT_80003ES2LAN_TIPG_IPGR2 << E1000_TIPG_IPGR2_SHIFT; break; case e1000_82542: tipg = DEFAULT_82542_TIPG_IPGT; tipg |= DEFAULT_82542_TIPG_IPGR1 << E1000_TIPG_IPGR1_SHIFT; tipg |= DEFAULT_82542_TIPG_IPGR2 << E1000_TIPG_IPGR2_SHIFT; break; default: if ((adapter->hw.phy.media_type == e1000_media_type_fiber) || (adapter->hw.phy.media_type == e1000_media_type_internal_serdes)) tipg = DEFAULT_82543_TIPG_IPGT_FIBER; else tipg = DEFAULT_82543_TIPG_IPGT_COPPER; tipg |= DEFAULT_82543_TIPG_IPGR1 << E1000_TIPG_IPGR1_SHIFT; tipg |= DEFAULT_82543_TIPG_IPGR2 << E1000_TIPG_IPGR2_SHIFT; } E1000_WRITE_REG(&adapter->hw, E1000_TIPG, tipg); E1000_WRITE_REG(&adapter->hw, E1000_TIDV, adapter->tx_int_delay.value); if(adapter->hw.mac.type >= e1000_82540) E1000_WRITE_REG(&adapter->hw, E1000_TADV, adapter->tx_abs_int_delay.value); if ((adapter->hw.mac.type 
== e1000_82571) || (adapter->hw.mac.type == e1000_82572)) { tarc = E1000_READ_REG(&adapter->hw, E1000_TARC(0)); tarc |= TARC_SPEED_MODE_BIT; E1000_WRITE_REG(&adapter->hw, E1000_TARC(0), tarc); } else if (adapter->hw.mac.type == e1000_80003es2lan) { /* errata: program both queues to unweighted RR */ tarc = E1000_READ_REG(&adapter->hw, E1000_TARC(0)); tarc |= 1; E1000_WRITE_REG(&adapter->hw, E1000_TARC(0), tarc); tarc = E1000_READ_REG(&adapter->hw, E1000_TARC(1)); tarc |= 1; E1000_WRITE_REG(&adapter->hw, E1000_TARC(1), tarc); } else if (adapter->hw.mac.type == e1000_82574) { tarc = E1000_READ_REG(&adapter->hw, E1000_TARC(0)); tarc |= TARC_ERRATA_BIT; if ( adapter->tx_num_queues > 1) { tarc |= (TARC_COMPENSATION_MODE | TARC_MQ_FIX); E1000_WRITE_REG(&adapter->hw, E1000_TARC(0), tarc); E1000_WRITE_REG(&adapter->hw, E1000_TARC(1), tarc); } else E1000_WRITE_REG(&adapter->hw, E1000_TARC(0), tarc); } if (adapter->tx_int_delay.value > 0) adapter->txd_cmd |= E1000_TXD_CMD_IDE; /* Program the Transmit Control Register */ tctl = E1000_READ_REG(&adapter->hw, E1000_TCTL); tctl &= ~E1000_TCTL_CT; tctl |= (E1000_TCTL_PSP | E1000_TCTL_RTLC | E1000_TCTL_EN | (E1000_COLLISION_THRESHOLD << E1000_CT_SHIFT)); if (adapter->hw.mac.type >= e1000_82571) tctl |= E1000_TCTL_MULR; /* This write will effectively turn on the transmit unit. */ E1000_WRITE_REG(&adapter->hw, E1000_TCTL, tctl); if (hw->mac.type == e1000_pch_spt) { u32 reg; reg = E1000_READ_REG(hw, E1000_IOSFPC); reg |= E1000_RCTL_RDMTS_HEX; E1000_WRITE_REG(hw, E1000_IOSFPC, reg); reg = E1000_READ_REG(hw, E1000_TARC(0)); reg |= E1000_TARC0_CB_MULTIQ_3_REQ; E1000_WRITE_REG(hw, E1000_TARC(0), reg); } } /********************************************************************* * * Enable receive unit. * **********************************************************************/ static void em_initialize_receive_unit(if_ctx_t ctx) { struct adapter *adapter = iflib_get_softc(ctx); if_softc_ctx_t scctx = adapter->shared; struct ifnet *ifp = iflib_get_ifp(ctx); struct e1000_hw *hw = &adapter->hw; struct em_rx_queue *que; int i; u32 rctl, rxcsum, rfctl; INIT_DEBUGOUT("em_initialize_receive_units: begin"); /* * Make sure receives are disabled while setting * up the descriptor ring */ rctl = E1000_READ_REG(hw, E1000_RCTL); /* Do not disable if ever enabled on this hardware */ if ((hw->mac.type != e1000_82574) && (hw->mac.type != e1000_82583)) E1000_WRITE_REG(hw, E1000_RCTL, rctl & ~E1000_RCTL_EN); /* Setup the Receive Control Register */ rctl &= ~(3 << E1000_RCTL_MO_SHIFT); rctl |= E1000_RCTL_EN | E1000_RCTL_BAM | E1000_RCTL_LBM_NO | E1000_RCTL_RDMTS_HALF | (hw->mac.mc_filter_type << E1000_RCTL_MO_SHIFT); /* Do not store bad packets */ rctl &= ~E1000_RCTL_SBP; /* Enable Long Packet receive */ if (if_getmtu(ifp) > ETHERMTU) rctl |= E1000_RCTL_LPE; else rctl &= ~E1000_RCTL_LPE; /* Strip the CRC */ if (!em_disable_crc_stripping) rctl |= E1000_RCTL_SECRC; if (adapter->hw.mac.type >= e1000_82540) { E1000_WRITE_REG(&adapter->hw, E1000_RADV, adapter->rx_abs_int_delay.value); /* * Set the interrupt throttling rate. 
Value is calculated * as DEFAULT_ITR = 1/(MAX_INTS_PER_SEC * 256ns) */ E1000_WRITE_REG(hw, E1000_ITR, DEFAULT_ITR); } E1000_WRITE_REG(&adapter->hw, E1000_RDTR, adapter->rx_int_delay.value); /* Use extended rx descriptor formats */ rfctl = E1000_READ_REG(hw, E1000_RFCTL); rfctl |= E1000_RFCTL_EXTEN; /* * When using MSIX interrupts we need to throttle * using the EITR register (82574 only) */ if (hw->mac.type == e1000_82574) { for (int i = 0; i < 4; i++) E1000_WRITE_REG(hw, E1000_EITR_82574(i), DEFAULT_ITR); /* Disable accelerated acknowledge */ rfctl |= E1000_RFCTL_ACK_DIS; } E1000_WRITE_REG(hw, E1000_RFCTL, rfctl); rxcsum = E1000_READ_REG(hw, E1000_RXCSUM); if (if_getcapenable(ifp) & IFCAP_RXCSUM && adapter->hw.mac.type >= e1000_82543) { if (adapter->tx_num_queues > 1) { if (adapter->hw.mac.type >= igb_mac_min) { rxcsum |= E1000_RXCSUM_PCSD; if (hw->mac.type != e1000_82575) rxcsum |= E1000_RXCSUM_CRCOFL; } else rxcsum |= E1000_RXCSUM_TUOFL | E1000_RXCSUM_IPOFL | E1000_RXCSUM_PCSD; } else { if (adapter->hw.mac.type >= igb_mac_min) rxcsum |= E1000_RXCSUM_IPPCSE; else rxcsum |= E1000_RXCSUM_TUOFL | E1000_RXCSUM_IPOFL; if (adapter->hw.mac.type > e1000_82575) rxcsum |= E1000_RXCSUM_CRCOFL; } } else rxcsum &= ~E1000_RXCSUM_TUOFL; E1000_WRITE_REG(hw, E1000_RXCSUM, rxcsum); if (adapter->rx_num_queues > 1) { if (adapter->hw.mac.type >= igb_mac_min) igb_initialize_rss_mapping(adapter); else em_initialize_rss_mapping(adapter); } /* * XXX TEMPORARY WORKAROUND: on some systems with 82573 * long latencies are observed, like Lenovo X60. This * change eliminates the problem, but since having positive * values in RDTR is a known source of problems on other * platforms another solution is being sought. */ if (hw->mac.type == e1000_82573) E1000_WRITE_REG(hw, E1000_RDTR, 0x20); for (i = 0, que = adapter->rx_queues; i < adapter->rx_num_queues; i++, que++) { struct rx_ring *rxr = &que->rxr; /* Setup the Base and Length of the Rx Descriptor Ring */ u64 bus_addr = rxr->rx_paddr; #if 0 u32 rdt = adapter->rx_num_queues -1; /* default */ #endif E1000_WRITE_REG(hw, E1000_RDLEN(i), scctx->isc_nrxd[0] * sizeof(union e1000_rx_desc_extended)); E1000_WRITE_REG(hw, E1000_RDBAH(i), (u32)(bus_addr >> 32)); E1000_WRITE_REG(hw, E1000_RDBAL(i), (u32)bus_addr); /* Setup the Head and Tail Descriptor Pointers */ E1000_WRITE_REG(hw, E1000_RDH(i), 0); E1000_WRITE_REG(hw, E1000_RDT(i), 0); } /* * Set PTHRESH for improved jumbo performance * According to 10.2.5.11 of Intel 82574 Datasheet, * RXDCTL(1) is written whenever RXDCTL(0) is written. * Only write to RXDCTL(1) if there is a need for different * settings. 
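 *
 * Illustration of the value built in the 82574 branch below: the code
 * ORs in PTHRESH (0x20), HTHRESH at bit 8, WTHRESH at bit 16 and the
 * granularity bit at 24, and those four fields alone come to
 * 0x20 | (4 << 8) | (4 << 16) | (1 << 24) == 0x01040420.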
*/ if (((adapter->hw.mac.type == e1000_ich9lan) || (adapter->hw.mac.type == e1000_pch2lan) || (adapter->hw.mac.type == e1000_ich10lan)) && (if_getmtu(ifp) > ETHERMTU)) { u32 rxdctl = E1000_READ_REG(hw, E1000_RXDCTL(0)); E1000_WRITE_REG(hw, E1000_RXDCTL(0), rxdctl | 3); } else if (adapter->hw.mac.type == e1000_82574) { for (int i = 0; i < adapter->rx_num_queues; i++) { u32 rxdctl = E1000_READ_REG(hw, E1000_RXDCTL(i)); rxdctl |= 0x20; /* PTHRESH */ rxdctl |= 4 << 8; /* HTHRESH */ rxdctl |= 4 << 16;/* WTHRESH */ rxdctl |= 1 << 24; /* Switch to granularity */ E1000_WRITE_REG(hw, E1000_RXDCTL(i), rxdctl); } } else if (adapter->hw.mac.type >= igb_mac_min) { u32 psize, srrctl = 0; if (if_getmtu(ifp) > ETHERMTU) { /* Set maximum packet len */ if (adapter->rx_mbuf_sz <= 4096) { srrctl |= 4096 >> E1000_SRRCTL_BSIZEPKT_SHIFT; rctl |= E1000_RCTL_SZ_4096 | E1000_RCTL_BSEX; } else if (adapter->rx_mbuf_sz > 4096) { srrctl |= 8192 >> E1000_SRRCTL_BSIZEPKT_SHIFT; rctl |= E1000_RCTL_SZ_8192 | E1000_RCTL_BSEX; } psize = scctx->isc_max_frame_size; /* are we on a vlan? */ if (ifp->if_vlantrunk != NULL) psize += VLAN_TAG_SIZE; E1000_WRITE_REG(&adapter->hw, E1000_RLPML, psize); } else { srrctl |= 2048 >> E1000_SRRCTL_BSIZEPKT_SHIFT; rctl |= E1000_RCTL_SZ_2048; } /* * If TX flow control is disabled and there's >1 queue defined, * enable DROP. * * This drops frames rather than hanging the RX MAC for all queues. */ if ((adapter->rx_num_queues > 1) && (adapter->fc == e1000_fc_none || adapter->fc == e1000_fc_rx_pause)) { srrctl |= E1000_SRRCTL_DROP_EN; } /* Setup the Base and Length of the Rx Descriptor Rings */ for (i = 0, que = adapter->rx_queues; i < adapter->rx_num_queues; i++, que++) { struct rx_ring *rxr = &que->rxr; u64 bus_addr = rxr->rx_paddr; u32 rxdctl; #ifdef notyet /* Configure for header split? 
-- ignore for now */ rxr->hdr_split = igb_header_split; #else srrctl |= E1000_SRRCTL_DESCTYPE_ADV_ONEBUF; #endif E1000_WRITE_REG(hw, E1000_RDLEN(i), scctx->isc_nrxd[0] * sizeof(struct e1000_rx_desc)); E1000_WRITE_REG(hw, E1000_RDBAH(i), (uint32_t)(bus_addr >> 32)); E1000_WRITE_REG(hw, E1000_RDBAL(i), (uint32_t)bus_addr); E1000_WRITE_REG(hw, E1000_SRRCTL(i), srrctl); /* Enable this Queue */ rxdctl = E1000_READ_REG(hw, E1000_RXDCTL(i)); rxdctl |= E1000_RXDCTL_QUEUE_ENABLE; rxdctl &= 0xFFF00000; rxdctl |= IGB_RX_PTHRESH; rxdctl |= IGB_RX_HTHRESH << 8; rxdctl |= IGB_RX_WTHRESH << 16; E1000_WRITE_REG(hw, E1000_RXDCTL(i), rxdctl); } } else if (adapter->hw.mac.type >= e1000_pch2lan) { if (if_getmtu(ifp) > ETHERMTU) e1000_lv_jumbo_workaround_ich8lan(hw, TRUE); else e1000_lv_jumbo_workaround_ich8lan(hw, FALSE); } /* Make sure VLAN Filters are off */ rctl &= ~E1000_RCTL_VFE; if (adapter->hw.mac.type < igb_mac_min) { if (adapter->rx_mbuf_sz == MCLBYTES) rctl |= E1000_RCTL_SZ_2048; else if (adapter->rx_mbuf_sz == MJUMPAGESIZE) rctl |= E1000_RCTL_SZ_4096 | E1000_RCTL_BSEX; else if (adapter->rx_mbuf_sz > MJUMPAGESIZE) rctl |= E1000_RCTL_SZ_8192 | E1000_RCTL_BSEX; /* ensure we clear use DTYPE of 00 here */ rctl &= ~0x00000C00; } /* Write out the settings */ E1000_WRITE_REG(hw, E1000_RCTL, rctl); return; } static void em_if_vlan_register(if_ctx_t ctx, u16 vtag) { struct adapter *adapter = iflib_get_softc(ctx); u32 index, bit; index = (vtag >> 5) & 0x7F; bit = vtag & 0x1F; adapter->shadow_vfta[index] |= (1 << bit); ++adapter->num_vlans; } static void em_if_vlan_unregister(if_ctx_t ctx, u16 vtag) { struct adapter *adapter = iflib_get_softc(ctx); u32 index, bit; index = (vtag >> 5) & 0x7F; bit = vtag & 0x1F; adapter->shadow_vfta[index] &= ~(1 << bit); --adapter->num_vlans; } static void em_setup_vlan_hw_support(struct adapter *adapter) { struct e1000_hw *hw = &adapter->hw; u32 reg; /* * We get here thru init_locked, meaning * a soft reset, this has already cleared * the VFTA and other state, so if there * have been no vlan's registered do nothing. */ if (adapter->num_vlans == 0) return; /* * A soft reset zero's out the VFTA, so * we need to repopulate it now. 
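 *
 * Each of the EM_VFTA_SIZE (128) words covers 32 VLAN IDs: a tag maps
 * to word (vtag >> 5) & 0x7F, bit (vtag & 0x1F), exactly as in
 * em_if_vlan_register() above; e.g. VLAN 100 sets bit 4 of word 3.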
*/ for (int i = 0; i < EM_VFTA_SIZE; i++) if (adapter->shadow_vfta[i] != 0) E1000_WRITE_REG_ARRAY(hw, E1000_VFTA, i, adapter->shadow_vfta[i]); reg = E1000_READ_REG(hw, E1000_CTRL); reg |= E1000_CTRL_VME; E1000_WRITE_REG(hw, E1000_CTRL, reg); /* Enable the Filter Table */ reg = E1000_READ_REG(hw, E1000_RCTL); reg &= ~E1000_RCTL_CFIEN; reg |= E1000_RCTL_VFE; E1000_WRITE_REG(hw, E1000_RCTL, reg); } static void em_if_enable_intr(if_ctx_t ctx) { struct adapter *adapter = iflib_get_softc(ctx); struct e1000_hw *hw = &adapter->hw; u32 ims_mask = IMS_ENABLE_MASK; if (hw->mac.type == e1000_82574) { E1000_WRITE_REG(hw, EM_EIAC, EM_MSIX_MASK); ims_mask |= adapter->ims; } else if (adapter->intr_type == IFLIB_INTR_MSIX && hw->mac.type >= igb_mac_min) { u32 mask = (adapter->que_mask | adapter->link_mask); E1000_WRITE_REG(&adapter->hw, E1000_EIAC, mask); E1000_WRITE_REG(&adapter->hw, E1000_EIAM, mask); E1000_WRITE_REG(&adapter->hw, E1000_EIMS, mask); ims_mask = E1000_IMS_LSC; } E1000_WRITE_REG(hw, E1000_IMS, ims_mask); } static void em_if_disable_intr(if_ctx_t ctx) { struct adapter *adapter = iflib_get_softc(ctx); struct e1000_hw *hw = &adapter->hw; if (adapter->intr_type == IFLIB_INTR_MSIX) { if (hw->mac.type >= igb_mac_min) E1000_WRITE_REG(&adapter->hw, E1000_EIMC, ~0); E1000_WRITE_REG(&adapter->hw, E1000_EIAC, 0); } E1000_WRITE_REG(&adapter->hw, E1000_IMC, 0xffffffff); } /* * Bit of a misnomer, what this really means is * to enable OS management of the system... aka * to disable special hardware management features */ static void em_init_manageability(struct adapter *adapter) { /* A shared code workaround */ #define E1000_82542_MANC2H E1000_MANC2H if (adapter->has_manage) { int manc2h = E1000_READ_REG(&adapter->hw, E1000_MANC2H); int manc = E1000_READ_REG(&adapter->hw, E1000_MANC); /* disable hardware interception of ARP */ manc &= ~(E1000_MANC_ARP_EN); /* enable receiving management packets to the host */ manc |= E1000_MANC_EN_MNG2HOST; #define E1000_MNG2HOST_PORT_623 (1 << 5) #define E1000_MNG2HOST_PORT_664 (1 << 6) manc2h |= E1000_MNG2HOST_PORT_623; manc2h |= E1000_MNG2HOST_PORT_664; E1000_WRITE_REG(&adapter->hw, E1000_MANC2H, manc2h); E1000_WRITE_REG(&adapter->hw, E1000_MANC, manc); } } /* * Give control back to hardware management * controller if there is one. */ static void em_release_manageability(struct adapter *adapter) { if (adapter->has_manage) { int manc = E1000_READ_REG(&adapter->hw, E1000_MANC); /* re-enable hardware interception of ARP */ manc |= E1000_MANC_ARP_EN; manc &= ~E1000_MANC_EN_MNG2HOST; E1000_WRITE_REG(&adapter->hw, E1000_MANC, manc); } } /* * em_get_hw_control sets the {CTRL_EXT|FWSM}:DRV_LOAD bit. * For ASF and Pass Through versions of f/w this means * that the driver is loaded. For AMT version type f/w * this means that the network i/f is open. */ static void em_get_hw_control(struct adapter *adapter) { u32 ctrl_ext, swsm; if (adapter->vf_ifp) return; if (adapter->hw.mac.type == e1000_82573) { swsm = E1000_READ_REG(&adapter->hw, E1000_SWSM); E1000_WRITE_REG(&adapter->hw, E1000_SWSM, swsm | E1000_SWSM_DRV_LOAD); return; } /* else */ ctrl_ext = E1000_READ_REG(&adapter->hw, E1000_CTRL_EXT); E1000_WRITE_REG(&adapter->hw, E1000_CTRL_EXT, ctrl_ext | E1000_CTRL_EXT_DRV_LOAD); } /* * em_release_hw_control resets {CTRL_EXT|FWSM}:DRV_LOAD bit. * For ASF and Pass Through versions of f/w this means that * the driver is no longer loaded. For AMT versions of the * f/w this means that the network i/f is closed. 
*/ static void em_release_hw_control(struct adapter *adapter) { u32 ctrl_ext, swsm; if (!adapter->has_manage) return; if (adapter->hw.mac.type == e1000_82573) { swsm = E1000_READ_REG(&adapter->hw, E1000_SWSM); E1000_WRITE_REG(&adapter->hw, E1000_SWSM, swsm & ~E1000_SWSM_DRV_LOAD); return; } /* else */ ctrl_ext = E1000_READ_REG(&adapter->hw, E1000_CTRL_EXT); E1000_WRITE_REG(&adapter->hw, E1000_CTRL_EXT, ctrl_ext & ~E1000_CTRL_EXT_DRV_LOAD); return; } static int em_is_valid_ether_addr(u8 *addr) { char zero_addr[6] = { 0, 0, 0, 0, 0, 0 }; if ((addr[0] & 1) || (!bcmp(addr, zero_addr, ETHER_ADDR_LEN))) { return (FALSE); } return (TRUE); } /* ** Parse the interface capabilities with regard ** to both system management and wake-on-lan for ** later use. */ static void em_get_wakeup(if_ctx_t ctx) { struct adapter *adapter = iflib_get_softc(ctx); device_t dev = iflib_get_dev(ctx); u16 eeprom_data = 0, device_id, apme_mask; adapter->has_manage = e1000_enable_mng_pass_thru(&adapter->hw); apme_mask = EM_EEPROM_APME; switch (adapter->hw.mac.type) { case e1000_82542: case e1000_82543: break; case e1000_82544: e1000_read_nvm(&adapter->hw, NVM_INIT_CONTROL2_REG, 1, &eeprom_data); apme_mask = EM_82544_APME; break; case e1000_82546: case e1000_82546_rev_3: if (adapter->hw.bus.func == 1) { e1000_read_nvm(&adapter->hw, NVM_INIT_CONTROL3_PORT_B, 1, &eeprom_data); break; } else e1000_read_nvm(&adapter->hw, NVM_INIT_CONTROL3_PORT_A, 1, &eeprom_data); break; case e1000_82573: case e1000_82583: adapter->has_amt = TRUE; /* FALLTHROUGH */ case e1000_82571: case e1000_82572: case e1000_80003es2lan: if (adapter->hw.bus.func == 1) { e1000_read_nvm(&adapter->hw, NVM_INIT_CONTROL3_PORT_B, 1, &eeprom_data); break; } else e1000_read_nvm(&adapter->hw, NVM_INIT_CONTROL3_PORT_A, 1, &eeprom_data); break; case e1000_ich8lan: case e1000_ich9lan: case e1000_ich10lan: case e1000_pchlan: case e1000_pch2lan: case e1000_pch_lpt: case e1000_pch_spt: case e1000_82575: /* listing all igb devices */ case e1000_82576: case e1000_82580: case e1000_i350: case e1000_i354: case e1000_i210: case e1000_i211: case e1000_vfadapt: case e1000_vfadapt_i350: apme_mask = E1000_WUC_APME; adapter->has_amt = TRUE; eeprom_data = E1000_READ_REG(&adapter->hw, E1000_WUC); break; default: e1000_read_nvm(&adapter->hw, NVM_INIT_CONTROL3_PORT_A, 1, &eeprom_data); break; } if (eeprom_data & apme_mask) adapter->wol = (E1000_WUFC_MAG | E1000_WUFC_MC); /* * We have the eeprom settings, now apply the special cases * where the eeprom may be wrong or the board won't support * wake on lan on a particular port */ device_id = pci_get_device(dev); switch (device_id) { case E1000_DEV_ID_82546GB_PCIE: adapter->wol = 0; break; case E1000_DEV_ID_82546EB_FIBER: case E1000_DEV_ID_82546GB_FIBER: /* Wake events only supported on port A for dual fiber * regardless of eeprom setting */ if (E1000_READ_REG(&adapter->hw, E1000_STATUS) & E1000_STATUS_FUNC_1) adapter->wol = 0; break; case E1000_DEV_ID_82546GB_QUAD_COPPER_KSP3: /* if quad port adapter, disable WoL on all but port A */ if (global_quad_port_a != 0) adapter->wol = 0; /* Reset for multiple quad port adapters */ if (++global_quad_port_a == 4) global_quad_port_a = 0; break; case E1000_DEV_ID_82571EB_FIBER: /* Wake events only supported on port A for dual fiber * regardless of eeprom setting */ if (E1000_READ_REG(&adapter->hw, E1000_STATUS) & E1000_STATUS_FUNC_1) adapter->wol = 0; break; case E1000_DEV_ID_82571EB_QUAD_COPPER: case E1000_DEV_ID_82571EB_QUAD_FIBER: case E1000_DEV_ID_82571EB_QUAD_COPPER_LP: /* if quad port adapter, 
disable WoL on all but port A */ if (global_quad_port_a != 0) adapter->wol = 0; /* Reset for multiple quad port adapters */ if (++global_quad_port_a == 4) global_quad_port_a = 0; break; } return; } /* * Enable PCI Wake On Lan capability */ static void em_enable_wakeup(if_ctx_t ctx) { struct adapter *adapter = iflib_get_softc(ctx); device_t dev = iflib_get_dev(ctx); if_t ifp = iflib_get_ifp(ctx); u32 pmc, ctrl, ctrl_ext, rctl, wuc; u16 status; if ((pci_find_cap(dev, PCIY_PMG, &pmc) != 0)) return; /* Advertise the wakeup capability */ ctrl = E1000_READ_REG(&adapter->hw, E1000_CTRL); ctrl |= (E1000_CTRL_SWDPIN2 | E1000_CTRL_SWDPIN3); E1000_WRITE_REG(&adapter->hw, E1000_CTRL, ctrl); wuc = E1000_READ_REG(&adapter->hw, E1000_WUC); wuc |= (E1000_WUC_PME_EN | E1000_WUC_APME); E1000_WRITE_REG(&adapter->hw, E1000_WUC, wuc); if ((adapter->hw.mac.type == e1000_ich8lan) || (adapter->hw.mac.type == e1000_pchlan) || (adapter->hw.mac.type == e1000_ich9lan) || (adapter->hw.mac.type == e1000_ich10lan)) e1000_suspend_workarounds_ich8lan(&adapter->hw); /* Keep the laser running on Fiber adapters */ if (adapter->hw.phy.media_type == e1000_media_type_fiber || adapter->hw.phy.media_type == e1000_media_type_internal_serdes) { ctrl_ext = E1000_READ_REG(&adapter->hw, E1000_CTRL_EXT); ctrl_ext |= E1000_CTRL_EXT_SDP3_DATA; E1000_WRITE_REG(&adapter->hw, E1000_CTRL_EXT, ctrl_ext); } /* * Determine type of Wakeup: note that wol * is set with all bits on by default. */ if ((if_getcapenable(ifp) & IFCAP_WOL_MAGIC) == 0) adapter->wol &= ~E1000_WUFC_MAG; if ((if_getcapenable(ifp) & IFCAP_WOL_UCAST) == 0) adapter->wol &= ~E1000_WUFC_EX; if ((if_getcapenable(ifp) & IFCAP_WOL_MCAST) == 0) adapter->wol &= ~E1000_WUFC_MC; else { rctl = E1000_READ_REG(&adapter->hw, E1000_RCTL); rctl |= E1000_RCTL_MPE; E1000_WRITE_REG(&adapter->hw, E1000_RCTL, rctl); } if ( adapter->hw.mac.type >= e1000_pchlan) { if (em_enable_phy_wakeup(adapter)) return; } else { E1000_WRITE_REG(&adapter->hw, E1000_WUC, E1000_WUC_PME_EN); E1000_WRITE_REG(&adapter->hw, E1000_WUFC, adapter->wol); } if (adapter->hw.phy.type == e1000_phy_igp_3) e1000_igp3_phy_powerdown_workaround_ich8lan(&adapter->hw); /* Request PME */ status = pci_read_config(dev, pmc + PCIR_POWER_STATUS, 2); status &= ~(PCIM_PSTAT_PME | PCIM_PSTAT_PMEENABLE); if (if_getcapenable(ifp) & IFCAP_WOL) status |= PCIM_PSTAT_PME | PCIM_PSTAT_PMEENABLE; pci_write_config(dev, pmc + PCIR_POWER_STATUS, status, 2); return; } /* * WOL in the newer chipset interfaces (pchlan) * require thing to be copied into the phy */ static int em_enable_phy_wakeup(struct adapter *adapter) { struct e1000_hw *hw = &adapter->hw; u32 mreg, ret = 0; u16 preg; /* copy MAC RARs to PHY RARs */ e1000_copy_rx_addrs_to_phy_ich8lan(hw); /* copy MAC MTA to PHY MTA */ for (int i = 0; i < adapter->hw.mac.mta_reg_count; i++) { mreg = E1000_READ_REG_ARRAY(hw, E1000_MTA, i); e1000_write_phy_reg(hw, BM_MTA(i), (u16)(mreg & 0xFFFF)); e1000_write_phy_reg(hw, BM_MTA(i) + 1, (u16)((mreg >> 16) & 0xFFFF)); } /* configure PHY Rx Control register */ e1000_read_phy_reg(&adapter->hw, BM_RCTL, &preg); mreg = E1000_READ_REG(hw, E1000_RCTL); if (mreg & E1000_RCTL_UPE) preg |= BM_RCTL_UPE; if (mreg & E1000_RCTL_MPE) preg |= BM_RCTL_MPE; preg &= ~(BM_RCTL_MO_MASK); if (mreg & E1000_RCTL_MO_3) preg |= (((mreg & E1000_RCTL_MO_3) >> E1000_RCTL_MO_SHIFT) << BM_RCTL_MO_SHIFT); if (mreg & E1000_RCTL_BAM) preg |= BM_RCTL_BAM; if (mreg & E1000_RCTL_PMCF) preg |= BM_RCTL_PMCF; mreg = E1000_READ_REG(hw, E1000_CTRL); if (mreg & E1000_CTRL_RFCE) preg |= BM_RCTL_RFCE; 
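	/*
	 * At this point preg mirrors the wake-relevant MAC RCTL/CTRL bits
	 * and is written into the PHY copy of RCTL below. A purely
	 * illustrative debug aid (not part of the driver) for chasing
	 * PHY wake-on-lan problems might look like:
	 */
#if 0
	printf("em: BM_RCTL image 0x%04x (MAC CTRL 0x%08x)\n",
	    (unsigned)preg, (unsigned)mreg);
#endif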
e1000_write_phy_reg(&adapter->hw, BM_RCTL, preg); /* enable PHY wakeup in MAC register */ E1000_WRITE_REG(hw, E1000_WUC, E1000_WUC_PHY_WAKE | E1000_WUC_PME_EN | E1000_WUC_APME); E1000_WRITE_REG(hw, E1000_WUFC, adapter->wol); /* configure and enable PHY wakeup in PHY registers */ e1000_write_phy_reg(&adapter->hw, BM_WUFC, adapter->wol); e1000_write_phy_reg(&adapter->hw, BM_WUC, E1000_WUC_PME_EN); /* activate PHY wakeup */ ret = hw->phy.ops.acquire(hw); if (ret) { printf("Could not acquire PHY\n"); return ret; } e1000_write_phy_reg_mdic(hw, IGP01E1000_PHY_PAGE_SELECT, (BM_WUC_ENABLE_PAGE << IGP_PAGE_SHIFT)); ret = e1000_read_phy_reg_mdic(hw, BM_WUC_ENABLE_REG, &preg); if (ret) { printf("Could not read PHY page 769\n"); goto out; } preg |= BM_WUC_ENABLE_BIT | BM_WUC_HOST_WU_BIT; ret = e1000_write_phy_reg_mdic(hw, BM_WUC_ENABLE_REG, preg); if (ret) printf("Could not set PHY Host Wakeup bit\n"); out: hw->phy.ops.release(hw); return ret; } static void em_if_led_func(if_ctx_t ctx, int onoff) { struct adapter *adapter = iflib_get_softc(ctx); if (onoff) { e1000_setup_led(&adapter->hw); e1000_led_on(&adapter->hw); } else { e1000_led_off(&adapter->hw); e1000_cleanup_led(&adapter->hw); } } /* * Disable the L0S and L1 LINK states */ static void em_disable_aspm(struct adapter *adapter) { int base, reg; u16 link_cap,link_ctrl; device_t dev = adapter->dev; switch (adapter->hw.mac.type) { case e1000_82573: case e1000_82574: case e1000_82583: break; default: return; } if (pci_find_cap(dev, PCIY_EXPRESS, &base) != 0) return; reg = base + PCIER_LINK_CAP; link_cap = pci_read_config(dev, reg, 2); if ((link_cap & PCIEM_LINK_CAP_ASPM) == 0) return; reg = base + PCIER_LINK_CTL; link_ctrl = pci_read_config(dev, reg, 2); link_ctrl &= ~PCIEM_LINK_CTL_ASPMC; pci_write_config(dev, reg, link_ctrl, 2); return; } /********************************************************************** * * Update the board statistics counters. * **********************************************************************/ static void em_update_stats_counters(struct adapter *adapter) { if(adapter->hw.phy.media_type == e1000_media_type_copper || (E1000_READ_REG(&adapter->hw, E1000_STATUS) & E1000_STATUS_LU)) { adapter->stats.symerrs += E1000_READ_REG(&adapter->hw, E1000_SYMERRS); adapter->stats.sec += E1000_READ_REG(&adapter->hw, E1000_SEC); } adapter->stats.crcerrs += E1000_READ_REG(&adapter->hw, E1000_CRCERRS); adapter->stats.mpc += E1000_READ_REG(&adapter->hw, E1000_MPC); adapter->stats.scc += E1000_READ_REG(&adapter->hw, E1000_SCC); adapter->stats.ecol += E1000_READ_REG(&adapter->hw, E1000_ECOL); adapter->stats.mcc += E1000_READ_REG(&adapter->hw, E1000_MCC); adapter->stats.latecol += E1000_READ_REG(&adapter->hw, E1000_LATECOL); adapter->stats.colc += E1000_READ_REG(&adapter->hw, E1000_COLC); adapter->stats.dc += E1000_READ_REG(&adapter->hw, E1000_DC); adapter->stats.rlec += E1000_READ_REG(&adapter->hw, E1000_RLEC); adapter->stats.xonrxc += E1000_READ_REG(&adapter->hw, E1000_XONRXC); adapter->stats.xontxc += E1000_READ_REG(&adapter->hw, E1000_XONTXC); adapter->stats.xoffrxc += E1000_READ_REG(&adapter->hw, E1000_XOFFRXC); /* ** For watchdog management we need to know if we have been ** paused during the last interval, so capture that here. 
*/ adapter->shared->isc_pause_frames = adapter->stats.xoffrxc; adapter->stats.xofftxc += E1000_READ_REG(&adapter->hw, E1000_XOFFTXC); adapter->stats.fcruc += E1000_READ_REG(&adapter->hw, E1000_FCRUC); adapter->stats.prc64 += E1000_READ_REG(&adapter->hw, E1000_PRC64); adapter->stats.prc127 += E1000_READ_REG(&adapter->hw, E1000_PRC127); adapter->stats.prc255 += E1000_READ_REG(&adapter->hw, E1000_PRC255); adapter->stats.prc511 += E1000_READ_REG(&adapter->hw, E1000_PRC511); adapter->stats.prc1023 += E1000_READ_REG(&adapter->hw, E1000_PRC1023); adapter->stats.prc1522 += E1000_READ_REG(&adapter->hw, E1000_PRC1522); adapter->stats.gprc += E1000_READ_REG(&adapter->hw, E1000_GPRC); adapter->stats.bprc += E1000_READ_REG(&adapter->hw, E1000_BPRC); adapter->stats.mprc += E1000_READ_REG(&adapter->hw, E1000_MPRC); adapter->stats.gptc += E1000_READ_REG(&adapter->hw, E1000_GPTC); /* For the 64-bit byte counters the low dword must be read first. */ /* Both registers clear on the read of the high dword */ adapter->stats.gorc += E1000_READ_REG(&adapter->hw, E1000_GORCL) + ((u64)E1000_READ_REG(&adapter->hw, E1000_GORCH) << 32); adapter->stats.gotc += E1000_READ_REG(&adapter->hw, E1000_GOTCL) + ((u64)E1000_READ_REG(&adapter->hw, E1000_GOTCH) << 32); adapter->stats.rnbc += E1000_READ_REG(&adapter->hw, E1000_RNBC); adapter->stats.ruc += E1000_READ_REG(&adapter->hw, E1000_RUC); adapter->stats.rfc += E1000_READ_REG(&adapter->hw, E1000_RFC); adapter->stats.roc += E1000_READ_REG(&adapter->hw, E1000_ROC); adapter->stats.rjc += E1000_READ_REG(&adapter->hw, E1000_RJC); adapter->stats.tor += E1000_READ_REG(&adapter->hw, E1000_TORH); adapter->stats.tot += E1000_READ_REG(&adapter->hw, E1000_TOTH); adapter->stats.tpr += E1000_READ_REG(&adapter->hw, E1000_TPR); adapter->stats.tpt += E1000_READ_REG(&adapter->hw, E1000_TPT); adapter->stats.ptc64 += E1000_READ_REG(&adapter->hw, E1000_PTC64); adapter->stats.ptc127 += E1000_READ_REG(&adapter->hw, E1000_PTC127); adapter->stats.ptc255 += E1000_READ_REG(&adapter->hw, E1000_PTC255); adapter->stats.ptc511 += E1000_READ_REG(&adapter->hw, E1000_PTC511); adapter->stats.ptc1023 += E1000_READ_REG(&adapter->hw, E1000_PTC1023); adapter->stats.ptc1522 += E1000_READ_REG(&adapter->hw, E1000_PTC1522); adapter->stats.mptc += E1000_READ_REG(&adapter->hw, E1000_MPTC); adapter->stats.bptc += E1000_READ_REG(&adapter->hw, E1000_BPTC); /* Interrupt Counts */ adapter->stats.iac += E1000_READ_REG(&adapter->hw, E1000_IAC); adapter->stats.icrxptc += E1000_READ_REG(&adapter->hw, E1000_ICRXPTC); adapter->stats.icrxatc += E1000_READ_REG(&adapter->hw, E1000_ICRXATC); adapter->stats.ictxptc += E1000_READ_REG(&adapter->hw, E1000_ICTXPTC); adapter->stats.ictxatc += E1000_READ_REG(&adapter->hw, E1000_ICTXATC); adapter->stats.ictxqec += E1000_READ_REG(&adapter->hw, E1000_ICTXQEC); adapter->stats.ictxqmtc += E1000_READ_REG(&adapter->hw, E1000_ICTXQMTC); adapter->stats.icrxdmtc += E1000_READ_REG(&adapter->hw, E1000_ICRXDMTC); adapter->stats.icrxoc += E1000_READ_REG(&adapter->hw, E1000_ICRXOC); if (adapter->hw.mac.type >= e1000_82543) { adapter->stats.algnerrc += E1000_READ_REG(&adapter->hw, E1000_ALGNERRC); adapter->stats.rxerrc += E1000_READ_REG(&adapter->hw, E1000_RXERRC); adapter->stats.tncrs += E1000_READ_REG(&adapter->hw, E1000_TNCRS); adapter->stats.cexterr += E1000_READ_REG(&adapter->hw, E1000_CEXTERR); adapter->stats.tsctc += E1000_READ_REG(&adapter->hw, E1000_TSCTC); adapter->stats.tsctfc += E1000_READ_REG(&adapter->hw, E1000_TSCTFC); } } static uint64_t em_if_get_counter(if_ctx_t ctx, ift_counter cnt) { 
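	/*
	 * Map the accumulated MAC statistics onto the generic ifnet
	 * counters; anything not handled explicitly falls through to
	 * iflib's if_get_counter_default().
	 */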
struct adapter *adapter = iflib_get_softc(ctx); struct ifnet *ifp = iflib_get_ifp(ctx); switch (cnt) { case IFCOUNTER_COLLISIONS: return (adapter->stats.colc); case IFCOUNTER_IERRORS: return (adapter->dropped_pkts + adapter->stats.rxerrc + adapter->stats.crcerrs + adapter->stats.algnerrc + adapter->stats.ruc + adapter->stats.roc + adapter->stats.mpc + adapter->stats.cexterr); case IFCOUNTER_OERRORS: return (adapter->stats.ecol + adapter->stats.latecol + adapter->watchdog_events); default: return (if_get_counter_default(ifp, cnt)); } } /* Export a single 32-bit register via a read-only sysctl. */ static int em_sysctl_reg_handler(SYSCTL_HANDLER_ARGS) { struct adapter *adapter; u_int val; adapter = oidp->oid_arg1; val = E1000_READ_REG(&adapter->hw, oidp->oid_arg2); return (sysctl_handle_int(oidp, &val, 0, req)); } /* * Add sysctl variables, one per statistic, to the system. */ static void em_add_hw_stats(struct adapter *adapter) { device_t dev = iflib_get_dev(adapter->ctx); struct em_tx_queue *tx_que = adapter->tx_queues; struct em_rx_queue *rx_que = adapter->rx_queues; struct sysctl_ctx_list *ctx = device_get_sysctl_ctx(dev); struct sysctl_oid *tree = device_get_sysctl_tree(dev); struct sysctl_oid_list *child = SYSCTL_CHILDREN(tree); struct e1000_hw_stats *stats = &adapter->stats; struct sysctl_oid *stat_node, *queue_node, *int_node; struct sysctl_oid_list *stat_list, *queue_list, *int_list; #define QUEUE_NAME_LEN 32 char namebuf[QUEUE_NAME_LEN]; /* Driver Statistics */ SYSCTL_ADD_ULONG(ctx, child, OID_AUTO, "dropped", CTLFLAG_RD, &adapter->dropped_pkts, "Driver dropped packets"); SYSCTL_ADD_ULONG(ctx, child, OID_AUTO, "link_irq", CTLFLAG_RD, &adapter->link_irq, "Link MSIX IRQ Handled"); SYSCTL_ADD_ULONG(ctx, child, OID_AUTO, "mbuf_defrag_fail", CTLFLAG_RD, &adapter->mbuf_defrag_failed, "Defragmenting mbuf chain failed"); SYSCTL_ADD_ULONG(ctx, child, OID_AUTO, "tx_dma_fail", CTLFLAG_RD, &adapter->no_tx_dma_setup, "Driver tx dma failure in xmit"); SYSCTL_ADD_ULONG(ctx, child, OID_AUTO, "rx_overruns", CTLFLAG_RD, &adapter->rx_overruns, "RX overruns"); SYSCTL_ADD_ULONG(ctx, child, OID_AUTO, "watchdog_timeouts", CTLFLAG_RD, &adapter->watchdog_events, "Watchdog timeouts"); SYSCTL_ADD_PROC(ctx, child, OID_AUTO, "device_control", CTLTYPE_UINT | CTLFLAG_RD, adapter, E1000_CTRL, em_sysctl_reg_handler, "IU", "Device Control Register"); SYSCTL_ADD_PROC(ctx, child, OID_AUTO, "rx_control", CTLTYPE_UINT | CTLFLAG_RD, adapter, E1000_RCTL, em_sysctl_reg_handler, "IU", "Receiver Control Register"); SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "fc_high_water", CTLFLAG_RD, &adapter->hw.fc.high_water, 0, "Flow Control High Watermark"); SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "fc_low_water", CTLFLAG_RD, &adapter->hw.fc.low_water, 0, "Flow Control Low Watermark"); for (int i = 0; i < adapter->tx_num_queues; i++, tx_que++) { struct tx_ring *txr = &tx_que->txr; snprintf(namebuf, QUEUE_NAME_LEN, "queue_tx_%d", i); queue_node = SYSCTL_ADD_NODE(ctx, child, OID_AUTO, namebuf, CTLFLAG_RD, NULL, "TX Queue Name"); queue_list = SYSCTL_CHILDREN(queue_node); SYSCTL_ADD_PROC(ctx, queue_list, OID_AUTO, "txd_head", CTLTYPE_UINT | CTLFLAG_RD, adapter, E1000_TDH(txr->me), em_sysctl_reg_handler, "IU", "Transmit Descriptor Head"); SYSCTL_ADD_PROC(ctx, queue_list, OID_AUTO, "txd_tail", CTLTYPE_UINT | CTLFLAG_RD, adapter, E1000_TDT(txr->me), em_sysctl_reg_handler, "IU", "Transmit Descriptor Tail"); SYSCTL_ADD_ULONG(ctx, queue_list, OID_AUTO, "tx_irq", CTLFLAG_RD, &txr->tx_irq, "Queue MSI-X Transmit Interrupts"); } for (int j = 0; j < 
adapter->rx_num_queues; j++, rx_que++) { struct rx_ring *rxr = &rx_que->rxr; snprintf(namebuf, QUEUE_NAME_LEN, "queue_rx_%d", j); queue_node = SYSCTL_ADD_NODE(ctx, child, OID_AUTO, namebuf, CTLFLAG_RD, NULL, "RX Queue Name"); queue_list = SYSCTL_CHILDREN(queue_node); SYSCTL_ADD_PROC(ctx, queue_list, OID_AUTO, "rxd_head", CTLTYPE_UINT | CTLFLAG_RD, adapter, E1000_RDH(rxr->me), em_sysctl_reg_handler, "IU", "Receive Descriptor Head"); SYSCTL_ADD_PROC(ctx, queue_list, OID_AUTO, "rxd_tail", CTLTYPE_UINT | CTLFLAG_RD, adapter, E1000_RDT(rxr->me), em_sysctl_reg_handler, "IU", "Receive Descriptor Tail"); SYSCTL_ADD_ULONG(ctx, queue_list, OID_AUTO, "rx_irq", CTLFLAG_RD, &rxr->rx_irq, "Queue MSI-X Receive Interrupts"); } /* MAC stats get their own sub node */ stat_node = SYSCTL_ADD_NODE(ctx, child, OID_AUTO, "mac_stats", CTLFLAG_RD, NULL, "Statistics"); stat_list = SYSCTL_CHILDREN(stat_node); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "excess_coll", CTLFLAG_RD, &stats->ecol, "Excessive collisions"); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "single_coll", CTLFLAG_RD, &stats->scc, "Single collisions"); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "multiple_coll", CTLFLAG_RD, &stats->mcc, "Multiple collisions"); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "late_coll", CTLFLAG_RD, &stats->latecol, "Late collisions"); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "collision_count", CTLFLAG_RD, &stats->colc, "Collision Count"); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "symbol_errors", CTLFLAG_RD, &adapter->stats.symerrs, "Symbol Errors"); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "sequence_errors", CTLFLAG_RD, &adapter->stats.sec, "Sequence Errors"); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "defer_count", CTLFLAG_RD, &adapter->stats.dc, "Defer Count"); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "missed_packets", CTLFLAG_RD, &adapter->stats.mpc, "Missed Packets"); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "recv_no_buff", CTLFLAG_RD, &adapter->stats.rnbc, "Receive No Buffers"); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "recv_undersize", CTLFLAG_RD, &adapter->stats.ruc, "Receive Undersize"); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "recv_fragmented", CTLFLAG_RD, &adapter->stats.rfc, "Fragmented Packets Received "); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "recv_oversize", CTLFLAG_RD, &adapter->stats.roc, "Oversized Packets Received"); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "recv_jabber", CTLFLAG_RD, &adapter->stats.rjc, "Recevied Jabber"); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "recv_errs", CTLFLAG_RD, &adapter->stats.rxerrc, "Receive Errors"); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "crc_errs", CTLFLAG_RD, &adapter->stats.crcerrs, "CRC errors"); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "alignment_errs", CTLFLAG_RD, &adapter->stats.algnerrc, "Alignment Errors"); /* On 82575 these are collision counts */ SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "coll_ext_errs", CTLFLAG_RD, &adapter->stats.cexterr, "Collision/Carrier extension errors"); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "xon_recvd", CTLFLAG_RD, &adapter->stats.xonrxc, "XON Received"); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "xon_txd", CTLFLAG_RD, &adapter->stats.xontxc, "XON Transmitted"); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "xoff_recvd", CTLFLAG_RD, &adapter->stats.xoffrxc, "XOFF Received"); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "xoff_txd", CTLFLAG_RD, &adapter->stats.xofftxc, "XOFF Transmitted"); /* Packet Reception Stats */ SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "total_pkts_recvd", CTLFLAG_RD, 
&adapter->stats.tpr, "Total Packets Received "); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "good_pkts_recvd", CTLFLAG_RD, &adapter->stats.gprc, "Good Packets Received"); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "bcast_pkts_recvd", CTLFLAG_RD, &adapter->stats.bprc, "Broadcast Packets Received"); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "mcast_pkts_recvd", CTLFLAG_RD, &adapter->stats.mprc, "Multicast Packets Received"); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "rx_frames_64", CTLFLAG_RD, &adapter->stats.prc64, "64 byte frames received "); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "rx_frames_65_127", CTLFLAG_RD, &adapter->stats.prc127, "65-127 byte frames received"); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "rx_frames_128_255", CTLFLAG_RD, &adapter->stats.prc255, "128-255 byte frames received"); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "rx_frames_256_511", CTLFLAG_RD, &adapter->stats.prc511, "256-511 byte frames received"); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "rx_frames_512_1023", CTLFLAG_RD, &adapter->stats.prc1023, "512-1023 byte frames received"); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "rx_frames_1024_1522", CTLFLAG_RD, &adapter->stats.prc1522, "1023-1522 byte frames received"); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "good_octets_recvd", CTLFLAG_RD, &adapter->stats.gorc, "Good Octets Received"); /* Packet Transmission Stats */ SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "good_octets_txd", CTLFLAG_RD, &adapter->stats.gotc, "Good Octets Transmitted"); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "total_pkts_txd", CTLFLAG_RD, &adapter->stats.tpt, "Total Packets Transmitted"); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "good_pkts_txd", CTLFLAG_RD, &adapter->stats.gptc, "Good Packets Transmitted"); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "bcast_pkts_txd", CTLFLAG_RD, &adapter->stats.bptc, "Broadcast Packets Transmitted"); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "mcast_pkts_txd", CTLFLAG_RD, &adapter->stats.mptc, "Multicast Packets Transmitted"); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "tx_frames_64", CTLFLAG_RD, &adapter->stats.ptc64, "64 byte frames transmitted "); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "tx_frames_65_127", CTLFLAG_RD, &adapter->stats.ptc127, "65-127 byte frames transmitted"); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "tx_frames_128_255", CTLFLAG_RD, &adapter->stats.ptc255, "128-255 byte frames transmitted"); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "tx_frames_256_511", CTLFLAG_RD, &adapter->stats.ptc511, "256-511 byte frames transmitted"); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "tx_frames_512_1023", CTLFLAG_RD, &adapter->stats.ptc1023, "512-1023 byte frames transmitted"); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "tx_frames_1024_1522", CTLFLAG_RD, &adapter->stats.ptc1522, "1024-1522 byte frames transmitted"); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "tso_txd", CTLFLAG_RD, &adapter->stats.tsctc, "TSO Contexts Transmitted"); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "tso_ctx_fail", CTLFLAG_RD, &adapter->stats.tsctfc, "TSO Contexts Failed"); /* Interrupt Stats */ int_node = SYSCTL_ADD_NODE(ctx, child, OID_AUTO, "interrupts", CTLFLAG_RD, NULL, "Interrupt Statistics"); int_list = SYSCTL_CHILDREN(int_node); SYSCTL_ADD_UQUAD(ctx, int_list, OID_AUTO, "asserts", CTLFLAG_RD, &adapter->stats.iac, "Interrupt Assertion Count"); SYSCTL_ADD_UQUAD(ctx, int_list, OID_AUTO, "rx_pkt_timer", CTLFLAG_RD, &adapter->stats.icrxptc, "Interrupt Cause Rx Pkt Timer Expire Count"); SYSCTL_ADD_UQUAD(ctx, int_list, OID_AUTO, "rx_abs_timer", CTLFLAG_RD, 
&adapter->stats.icrxatc, "Interrupt Cause Rx Abs Timer Expire Count"); SYSCTL_ADD_UQUAD(ctx, int_list, OID_AUTO, "tx_pkt_timer", CTLFLAG_RD, &adapter->stats.ictxptc, "Interrupt Cause Tx Pkt Timer Expire Count"); SYSCTL_ADD_UQUAD(ctx, int_list, OID_AUTO, "tx_abs_timer", CTLFLAG_RD, &adapter->stats.ictxatc, "Interrupt Cause Tx Abs Timer Expire Count"); SYSCTL_ADD_UQUAD(ctx, int_list, OID_AUTO, "tx_queue_empty", CTLFLAG_RD, &adapter->stats.ictxqec, "Interrupt Cause Tx Queue Empty Count"); SYSCTL_ADD_UQUAD(ctx, int_list, OID_AUTO, "tx_queue_min_thresh", CTLFLAG_RD, &adapter->stats.ictxqmtc, "Interrupt Cause Tx Queue Min Thresh Count"); SYSCTL_ADD_UQUAD(ctx, int_list, OID_AUTO, "rx_desc_min_thresh", CTLFLAG_RD, &adapter->stats.icrxdmtc, "Interrupt Cause Rx Desc Min Thresh Count"); SYSCTL_ADD_UQUAD(ctx, int_list, OID_AUTO, "rx_overrun", CTLFLAG_RD, &adapter->stats.icrxoc, "Interrupt Cause Receiver Overrun Count"); } /********************************************************************** * * This routine provides a way to dump out the adapter eeprom, * often a useful debug/service tool. This only dumps the first * 32 words, stuff that matters is in that extent. * **********************************************************************/ static int em_sysctl_nvm_info(SYSCTL_HANDLER_ARGS) { struct adapter *adapter = (struct adapter *)arg1; int error; int result; result = -1; error = sysctl_handle_int(oidp, &result, 0, req); if (error || !req->newptr) return (error); /* * This value will cause a hex dump of the * first 32 16-bit words of the EEPROM to * the screen. */ if (result == 1) em_print_nvm_info(adapter); return (error); } static void em_print_nvm_info(struct adapter *adapter) { u16 eeprom_data; int i, j, row = 0; /* Its a bit crude, but it gets the job done */ printf("\nInterface EEPROM Dump:\n"); printf("Offset\n0x0000 "); for (i = 0, j = 0; i < 32; i++, j++) { if (j == 8) { /* Make the offset block */ j = 0; ++row; printf("\n0x00%x0 ",row); } e1000_read_nvm(&adapter->hw, i, 1, &eeprom_data); printf("%04x ", eeprom_data); } printf("\n"); } static int em_sysctl_int_delay(SYSCTL_HANDLER_ARGS) { struct em_int_delay_info *info; struct adapter *adapter; u32 regval; int error, usecs, ticks; info = (struct em_int_delay_info *) arg1; usecs = info->value; error = sysctl_handle_int(oidp, &usecs, 0, req); if (error != 0 || req->newptr == NULL) return (error); if (usecs < 0 || usecs > EM_TICKS_TO_USECS(65535)) return (EINVAL); info->value = usecs; ticks = EM_USECS_TO_TICKS(usecs); if (info->offset == E1000_ITR) /* units are 256ns here */ ticks *= 4; adapter = info->adapter; regval = E1000_READ_OFFSET(&adapter->hw, info->offset); regval = (regval & ~0xffff) | (ticks & 0xffff); /* Handle a few special cases. */ switch (info->offset) { case E1000_RDTR: break; case E1000_TIDV: if (ticks == 0) { adapter->txd_cmd &= ~E1000_TXD_CMD_IDE; /* Don't write 0 into the TIDV register. 
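 * (Presumably because the delay is really disabled by clearing
 * E1000_TXD_CMD_IDE above; writing 1 rather than 0 just keeps the
 * register contents sane.)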
*/ regval++; } else adapter->txd_cmd |= E1000_TXD_CMD_IDE; break; } E1000_WRITE_OFFSET(&adapter->hw, info->offset, regval); return (0); } static void em_add_int_delay_sysctl(struct adapter *adapter, const char *name, const char *description, struct em_int_delay_info *info, int offset, int value) { info->adapter = adapter; info->offset = offset; info->value = value; SYSCTL_ADD_PROC(device_get_sysctl_ctx(adapter->dev), SYSCTL_CHILDREN(device_get_sysctl_tree(adapter->dev)), OID_AUTO, name, CTLTYPE_INT|CTLFLAG_RW, info, 0, em_sysctl_int_delay, "I", description); } /* * Set flow control using sysctl: * Flow control values: * 0 - off * 1 - rx pause * 2 - tx pause * 3 - full */ static int em_set_flowcntl(SYSCTL_HANDLER_ARGS) { int error; static int input = 3; /* default is full */ struct adapter *adapter = (struct adapter *) arg1; error = sysctl_handle_int(oidp, &input, 0, req); if ((error) || (req->newptr == NULL)) return (error); if (input == adapter->fc) /* no change? */ return (error); switch (input) { case e1000_fc_rx_pause: case e1000_fc_tx_pause: case e1000_fc_full: case e1000_fc_none: adapter->hw.fc.requested_mode = input; adapter->fc = input; break; default: /* Do nothing */ return (error); } adapter->hw.fc.current_mode = adapter->hw.fc.requested_mode; e1000_force_mac_fc(&adapter->hw); return (error); } /* * Manage Energy Efficient Ethernet: * Control values: * 0/1 - enabled/disabled */ static int em_sysctl_eee(SYSCTL_HANDLER_ARGS) { struct adapter *adapter = (struct adapter *) arg1; int error, value; value = adapter->hw.dev_spec.ich8lan.eee_disable; error = sysctl_handle_int(oidp, &value, 0, req); if (error || req->newptr == NULL) return (error); adapter->hw.dev_spec.ich8lan.eee_disable = (value != 0); em_if_init(adapter->ctx); return (0); } static int em_sysctl_debug_info(SYSCTL_HANDLER_ARGS) { struct adapter *adapter; int error; int result; result = -1; error = sysctl_handle_int(oidp, &result, 0, req); if (error || !req->newptr) return (error); if (result == 1) { adapter = (struct adapter *) arg1; em_print_debug_info(adapter); } return (error); } static int em_get_rs(SYSCTL_HANDLER_ARGS) { struct adapter *adapter = (struct adapter *) arg1; int error; int result; result = 0; error = sysctl_handle_int(oidp, &result, 0, req); if (error || !req->newptr || result != 1) return (error); em_dump_rs(adapter); return (error); } static void em_if_debug(if_ctx_t ctx) { em_dump_rs(iflib_get_softc(ctx)); } /* * This routine is meant to be fluid, add whatever is * needed for debugging a problem. 
-jfv */ static void em_print_debug_info(struct adapter *adapter) { device_t dev = iflib_get_dev(adapter->ctx); struct ifnet *ifp = iflib_get_ifp(adapter->ctx); struct tx_ring *txr = &adapter->tx_queues->txr; struct rx_ring *rxr = &adapter->rx_queues->rxr; if (if_getdrvflags(ifp) & IFF_DRV_RUNNING) printf("Interface is RUNNING "); else printf("Interface is NOT RUNNING\n"); if (if_getdrvflags(ifp) & IFF_DRV_OACTIVE) printf("and INACTIVE\n"); else printf("and ACTIVE\n"); for (int i = 0; i < adapter->tx_num_queues; i++, txr++) { device_printf(dev, "TX Queue %d ------\n", i); device_printf(dev, "hw tdh = %d, hw tdt = %d\n", E1000_READ_REG(&adapter->hw, E1000_TDH(i)), E1000_READ_REG(&adapter->hw, E1000_TDT(i))); } for (int j=0; j < adapter->rx_num_queues; j++, rxr++) { device_printf(dev, "RX Queue %d ------\n", j); device_printf(dev, "hw rdh = %d, hw rdt = %d\n", E1000_READ_REG(&adapter->hw, E1000_RDH(j)), E1000_READ_REG(&adapter->hw, E1000_RDT(j))); } } /* * 82574 only: * Write a new value to the EEPROM increasing the number of MSIX * vectors from 3 to 5, for proper multiqueue support. */ static void em_enable_vectors_82574(if_ctx_t ctx) { struct adapter *adapter = iflib_get_softc(ctx); struct e1000_hw *hw = &adapter->hw; device_t dev = iflib_get_dev(ctx); u16 edata; e1000_read_nvm(hw, EM_NVM_PCIE_CTRL, 1, &edata); printf("Current cap: %#06x\n", edata); if (((edata & EM_NVM_MSIX_N_MASK) >> EM_NVM_MSIX_N_SHIFT) != 4) { device_printf(dev, "Writing to eeprom: increasing " "reported MSIX vectors from 3 to 5...\n"); edata &= ~(EM_NVM_MSIX_N_MASK); edata |= 4 << EM_NVM_MSIX_N_SHIFT; e1000_write_nvm(hw, EM_NVM_PCIE_CTRL, 1, &edata); e1000_update_nvm_checksum(hw); device_printf(dev, "Writing to eeprom: done\n"); } } Index: projects/runtime-coverage/sys/dev/e1000/if_em.h =================================================================== --- projects/runtime-coverage/sys/dev/e1000/if_em.h (revision 322921) +++ projects/runtime-coverage/sys/dev/e1000/if_em.h (revision 322922) @@ -1,563 +1,565 @@ /*- * Copyright (c) 2016 Matt Macy * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. 
*/ /*$FreeBSD$*/ #include "opt_ddb.h" #include "opt_inet.h" #include "opt_inet6.h" #ifdef HAVE_KERNEL_OPTION_HEADERS #include "opt_device_polling.h" #endif #include #include #ifdef DDB #include #include #endif #if __FreeBSD_version >= 800000 #include #endif #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include "e1000_api.h" #include "e1000_82571.h" #include "ifdi_if.h" #ifndef _EM_H_DEFINED_ #define _EM_H_DEFINED_ /* Tunables */ /* - * EM_TXD: Maximum number of Transmit Descriptors + * EM_MAX_TXD: Maximum number of Transmit Descriptors * Valid Range: 80-256 for 82542 and 82543-based adapters * 80-4096 for others - * Default Value: 256 + * Default Value: 1024 * This value is the number of transmit descriptors allocated by the driver. * Increasing this value allows the driver to queue more transmits. Each * descriptor is 16 bytes. * Since TDLEN should be multiple of 128bytes, the number of transmit * desscriptors should meet the following condition. * (num_tx_desc * sizeof(struct e1000_tx_desc)) % 128 == 0 */ #define EM_MIN_TXD 128 #define EM_MAX_TXD 4096 #define EM_DEFAULT_TXD 1024 #define EM_DEFAULT_MULTI_TXD 4096 +#define IGB_MAX_TXD 4096 /* - * EM_RXD - Maximum number of receive Descriptors + * EM_MAX_RXD - Maximum number of receive Descriptors * Valid Range: 80-256 for 82542 and 82543-based adapters * 80-4096 for others - * Default Value: 256 + * Default Value: 1024 * This value is the number of receive descriptors allocated by the driver. * Increasing this value allows the driver to buffer more incoming packets. * Each descriptor is 16 bytes. A receive buffer is also allocated for each * descriptor. The maximum MTU size is 16110. * Since TDLEN should be multiple of 128bytes, the number of transmit * desscriptors should meet the following condition. * (num_tx_desc * sizeof(struct e1000_tx_desc)) % 128 == 0 */ #define EM_MIN_RXD 128 #define EM_MAX_RXD 4096 #define EM_DEFAULT_RXD 1024 #define EM_DEFAULT_MULTI_RXD 4096 +#define IGB_MAX_RXD 4096 /* * EM_TIDV - Transmit Interrupt Delay Value * Valid Range: 0-65535 (0=off) * Default Value: 64 * This value delays the generation of transmit interrupts in units of * 1.024 microseconds. Transmit interrupt reduction can improve CPU * efficiency if properly tuned for specific network traffic. If the * system is reporting dropped transmits, this value may be set too high * causing the driver to run out of available transmit descriptors. */ #define EM_TIDV 64 /* * EM_TADV - Transmit Absolute Interrupt Delay Value * (Not valid for 82542/82543/82544) * Valid Range: 0-65535 (0=off) * Default Value: 64 * This value, in units of 1.024 microseconds, limits the delay in which a * transmit interrupt is generated. Useful only if EM_TIDV is non-zero, * this value ensures that an interrupt is generated after the initial * packet is sent on the wire within the set amount of time. Proper tuning, * along with EM_TIDV, may improve traffic throughput in specific * network conditions. */ #define EM_TADV 64 /* * EM_RDTR - Receive Interrupt Delay Timer (Packet Timer) * Valid Range: 0-65535 (0=off) * Default Value: 0 * This value delays the generation of receive interrupts in units of 1.024 * microseconds. 
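 * (For example, a setting of 64 would correspond to 64 * 1.024 = ~65.5
 * microseconds of added delay.)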
Receive interrupt reduction can improve CPU efficiency if * properly tuned for specific network traffic. Increasing this value adds * extra latency to frame reception and can end up decreasing the throughput * of TCP traffic. If the system is reporting dropped receives, this value * may be set too high, causing the driver to run out of available receive * descriptors. * * CAUTION: When setting EM_RDTR to a value other than 0, adapters * may hang (stop transmitting) under certain network conditions. * If this occurs a WATCHDOG message is logged in the system * event log. In addition, the controller is automatically reset, * restoring the network connection. To eliminate the potential * for the hang ensure that EM_RDTR is set to 0. */ #define EM_RDTR 0 /* * Receive Interrupt Absolute Delay Timer (Not valid for 82542/82543/82544) * Valid Range: 0-65535 (0=off) * Default Value: 64 * This value, in units of 1.024 microseconds, limits the delay in which a * receive interrupt is generated. Useful only if EM_RDTR is non-zero, * this value ensures that an interrupt is generated after the initial * packet is received within the set amount of time. Proper tuning, * along with EM_RDTR, may improve traffic throughput in specific network * conditions. */ #define EM_RADV 64 /* * This parameter controls whether or not autonegotation is enabled. * 0 - Disable autonegotiation * 1 - Enable autonegotiation */ #define DO_AUTO_NEG 1 /* * This parameter control whether or not the driver will wait for * autonegotiation to complete. * 1 - Wait for autonegotiation to complete * 0 - Don't wait for autonegotiation to complete */ #define WAIT_FOR_AUTO_NEG_DEFAULT 0 /* Tunables -- End */ #define AUTONEG_ADV_DEFAULT (ADVERTISE_10_HALF | ADVERTISE_10_FULL | \ ADVERTISE_100_HALF | ADVERTISE_100_FULL | \ ADVERTISE_1000_FULL) #define AUTO_ALL_MODES 0 /* PHY master/slave setting */ #define EM_MASTER_SLAVE e1000_ms_hw_default /* * Micellaneous constants */ #define EM_VENDOR_ID 0x8086 #define EM_FLASH 0x0014 #define EM_JUMBO_PBA 0x00000028 #define EM_DEFAULT_PBA 0x00000030 #define EM_SMARTSPEED_DOWNSHIFT 3 #define EM_SMARTSPEED_MAX 15 #define EM_MAX_LOOP 10 #define MAX_NUM_MULTICAST_ADDRESSES 128 #define PCI_ANY_ID (~0U) #define ETHER_ALIGN 2 #define EM_FC_PAUSE_TIME 0x0680 #define EM_EEPROM_APME 0x400; #define EM_82544_APME 0x0004; /* Support AutoMediaDetect for Marvell M88 PHY in i354 */ #define IGB_MEDIA_RESET (1 << 0) /* Define the starting Interrupt rate per Queue */ #define IGB_INTS_PER_SEC 8000 #define IGB_DEFAULT_ITR ((1000000/IGB_INTS_PER_SEC) << 2) #define IGB_LINK_ITR 2000 #define I210_LINK_DELAY 1000 #define IGB_MAX_SCATTER 40 #define IGB_VFTA_SIZE 128 #define IGB_BR_SIZE 4096 /* ring buf size */ #define IGB_TSO_SIZE (65535 + sizeof(struct ether_vlan_header)) #define IGB_TSO_SEG_SIZE 4096 /* Max dma segment size */ #define IGB_TXPBSIZE 20408 #define IGB_HDR_BUF 128 #define IGB_PKTTYPE_MASK 0x0000FFF0 #define IGB_DMCTLX_DCFLUSH_DIS 0x80000000 /* Disable DMA Coalesce Flush */ /* * Driver state logic for the detection of a hung state * in hardware. Set TX_HUNG whenever a TX packet is used * (data is sent) and clear it when txeof() is invoked if * any descriptors from the ring are cleaned/reclaimed. * Increment internal counter if no descriptors are cleaned * and compare to TX_MAXTRIES. When counter > TX_MAXTRIES, * reset adapter. 
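 *
 * A rough sketch of that sequence (the "busy" and "tries" fields are
 * illustrative names only, not the driver's actual members):
 *
 *	txr->busy = EM_TX_BUSY;			// transmit path: data handed to hw
 *	...
 *	if (cleaned != 0) {			// txeof(): descriptors reclaimed
 *		txr->busy = EM_TX_IDLE;
 *		tries = 0;
 *	} else if (++tries > EM_TX_MAXTRIES) {
 *		txr->busy = EM_TX_HUNG;		// caller resets the adapter
 *	}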
*/ #define EM_TX_IDLE 0x00000000 #define EM_TX_BUSY 0x00000001 #define EM_TX_HUNG 0x80000000 #define EM_TX_MAXTRIES 10 #define PCICFG_DESC_RING_STATUS 0xe4 #define FLUSH_DESC_REQUIRED 0x100 #define IGB_RX_PTHRESH ((hw->mac.type == e1000_i354) ? 12 : \ ((hw->mac.type <= e1000_82576) ? 16 : 8)) #define IGB_RX_HTHRESH 8 #define IGB_RX_WTHRESH ((hw->mac.type == e1000_82576 && \ (adapter->intr_type == IFLIB_INTR_MSIX)) ? 1 : 4) #define IGB_TX_PTHRESH ((hw->mac.type == e1000_i354) ? 20 : 8) #define IGB_TX_HTHRESH 1 #define IGB_TX_WTHRESH ((hw->mac.type != e1000_82575 && \ (adapter->intr_type == IFLIB_INTR_MSIX) ? 1 : 16) /* * TDBA/RDBA should be aligned on 16 byte boundary. But TDLEN/RDLEN should be * multiple of 128 bytes. So we align TDBA/RDBA on 128 byte boundary. This will * also optimize cache line size effect. H/W supports up to cache line size 128. */ #define EM_DBA_ALIGN 128 /* * See Intel 82574 Driver Programming Interface Manual, Section 10.2.6.9 */ #define TARC_COMPENSATION_MODE (1 << 7) /* Compensation Mode */ #define TARC_SPEED_MODE_BIT (1 << 21) /* On PCI-E MACs only */ #define TARC_MQ_FIX (1 << 23) | \ (1 << 24) | \ (1 << 25) /* Handle errata in MQ mode */ #define TARC_ERRATA_BIT (1 << 26) /* Note from errata on 82574 */ /* PCI Config defines */ #define EM_BAR_TYPE(v) ((v) & EM_BAR_TYPE_MASK) #define EM_BAR_TYPE_MASK 0x00000001 #define EM_BAR_TYPE_MMEM 0x00000000 #define EM_BAR_TYPE_IO 0x00000001 #define EM_BAR_TYPE_FLASH 0x0014 #define EM_BAR_MEM_TYPE(v) ((v) & EM_BAR_MEM_TYPE_MASK) #define EM_BAR_MEM_TYPE_MASK 0x00000006 #define EM_BAR_MEM_TYPE_32BIT 0x00000000 #define EM_BAR_MEM_TYPE_64BIT 0x00000004 #define EM_MSIX_BAR 3 /* On 82575 */ /* More backward compatibility */ #if __FreeBSD_version < 900000 #define SYSCTL_ADD_UQUAD SYSCTL_ADD_QUAD #endif /* Defines for printing debug information */ #define DEBUG_INIT 0 #define DEBUG_IOCTL 0 #define DEBUG_HW 0 #define INIT_DEBUGOUT(S) if (DEBUG_INIT) printf(S "\n") #define INIT_DEBUGOUT1(S, A) if (DEBUG_INIT) printf(S "\n", A) #define INIT_DEBUGOUT2(S, A, B) if (DEBUG_INIT) printf(S "\n", A, B) #define IOCTL_DEBUGOUT(S) if (DEBUG_IOCTL) printf(S "\n") #define IOCTL_DEBUGOUT1(S, A) if (DEBUG_IOCTL) printf(S "\n", A) #define IOCTL_DEBUGOUT2(S, A, B) if (DEBUG_IOCTL) printf(S "\n", A, B) #define HW_DEBUGOUT(S) if (DEBUG_HW) printf(S "\n") #define HW_DEBUGOUT1(S, A) if (DEBUG_HW) printf(S "\n", A) #define HW_DEBUGOUT2(S, A, B) if (DEBUG_HW) printf(S "\n", A, B) #define EM_MAX_SCATTER 40 #define EM_VFTA_SIZE 128 #define EM_TSO_SIZE (65535 + sizeof(struct ether_vlan_header)) #define EM_TSO_SEG_SIZE 4096 /* Max dma segment size */ #define EM_MSIX_MASK 0x01F00000 /* For 82574 use */ #define EM_MSIX_LINK 0x01000000 /* For 82574 use */ #define ETH_ZLEN 60 #define ETH_ADDR_LEN 6 #define EM_CSUM_OFFLOAD 7 /* Offload bits in mbuf flag */ #define IGB_CSUM_OFFLOAD 0x0E0F /* Offload bits in mbuf flag */ #define IGB_PKTTYPE_MASK 0x0000FFF0 #define IGB_DMCTLX_DCFLUSH_DIS 0x80000000 /* Disable DMA Coalesce Flush */ /* * 82574 has a nonstandard address for EIAC * and since its only used in MSIX, and in * the em driver only 82574 uses MSIX we can * solve it just using this define. */ #define EM_EIAC 0x000DC /* * 82574 only reports 3 MSI-X vectors by default; * defines assisting with making it report 5 are * located here. 
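 *
 * The mask/shift select a 3-bit field at bits 9:7 of NVM word 0x1B.
 * em_enable_vectors_82574() in the if_em.c hunk above rewrites that field
 * roughly as follows (per the driver's messages, a field value of 4
 * corresponds to 5 reported vectors):
 *
 *	e1000_read_nvm(hw, EM_NVM_PCIE_CTRL, 1, &edata);
 *	if (((edata & EM_NVM_MSIX_N_MASK) >> EM_NVM_MSIX_N_SHIFT) != 4) {
 *		edata &= ~EM_NVM_MSIX_N_MASK;
 *		edata |= 4 << EM_NVM_MSIX_N_SHIFT;
 *		e1000_write_nvm(hw, EM_NVM_PCIE_CTRL, 1, &edata);
 *		e1000_update_nvm_checksum(hw);
 *	}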
*/ #define EM_NVM_PCIE_CTRL 0x1B #define EM_NVM_MSIX_N_MASK (0x7 << EM_NVM_MSIX_N_SHIFT) #define EM_NVM_MSIX_N_SHIFT 7 struct adapter; struct em_int_delay_info { struct adapter *adapter; /* Back-pointer to the adapter struct */ int offset; /* Register offset to read/write */ int value; /* Current value in usecs */ }; /* * The transmit ring, one per tx queue */ struct tx_ring { struct adapter *adapter; struct e1000_tx_desc *tx_base; uint64_t tx_paddr; qidx_t *tx_rsq; bool tx_tso; /* last tx was tso */ uint8_t me; qidx_t tx_rs_cidx; qidx_t tx_rs_pidx; qidx_t tx_cidx_processed; /* Interrupt resources */ void *tag; struct resource *res; unsigned long tx_irq; /* Saved csum offloading context information */ int csum_flags; int csum_lhlen; int csum_iphlen; int csum_thlen; int csum_mss; int csum_pktlen; uint32_t csum_txd_upper; uint32_t csum_txd_lower; /* last field */ }; /* * The Receive ring, one per rx queue */ struct rx_ring { struct adapter *adapter; struct em_rx_queue *que; u32 me; u32 payload; union e1000_rx_desc_extended *rx_base; uint64_t rx_paddr; /* Interrupt resources */ void *tag; struct resource *res; bool discard; /* Soft stats */ unsigned long rx_irq; unsigned long rx_discarded; unsigned long rx_packets; unsigned long rx_bytes; }; struct em_tx_queue { struct adapter *adapter; u32 msix; u32 eims; /* This queue's EIMS bit */ u32 me; struct tx_ring txr; }; struct em_rx_queue { struct adapter *adapter; u32 me; u32 msix; u32 eims; struct rx_ring rxr; u64 irqs; struct if_irq que_irq; }; /* Our adapter structure */ struct adapter { struct ifnet *ifp; struct e1000_hw hw; if_softc_ctx_t shared; if_ctx_t ctx; #define tx_num_queues shared->isc_ntxqsets #define rx_num_queues shared->isc_nrxqsets #define intr_type shared->isc_intr /* FreeBSD operating-system-specific structures. */ struct e1000_osdep osdep; device_t dev; struct cdev *led_dev; struct em_tx_queue *tx_queues; struct em_rx_queue *rx_queues; struct if_irq irq; struct resource *memory; struct resource *flash; struct resource *ioport; int io_rid; struct resource *res; void *tag; u32 linkvec; u32 ivars; struct ifmedia *media; int msix; int if_flags; int em_insert_vlan_header; u32 ims; bool in_detach; u32 flags; /* Task for FAST handling */ struct grouptask link_task; u16 num_vlans; u32 txd_cmd; u32 tx_process_limit; u32 rx_process_limit; u32 rx_mbuf_sz; /* Management and WOL features */ u32 wol; bool has_manage; bool has_amt; /* Multicast array memory */ u8 *mta; /* ** Shadow VFTA table, this is needed because ** the real vlan filter table gets cleared during ** a soft reset and the driver needs to be able ** to repopulate it. */ u32 shadow_vfta[EM_VFTA_SIZE]; /* Info about the interface */ u16 link_active; u16 fc; u16 link_speed; u16 link_duplex; u32 smartspeed; u32 dmac; int link_mask; u64 que_mask; struct em_int_delay_info tx_int_delay; struct em_int_delay_info tx_abs_int_delay; struct em_int_delay_info rx_int_delay; struct em_int_delay_info rx_abs_int_delay; struct em_int_delay_info tx_itr; /* Misc stats maintained by the driver */ unsigned long dropped_pkts; unsigned long link_irq; unsigned long mbuf_defrag_failed; unsigned long no_tx_dma_setup; unsigned long no_tx_map_avail; unsigned long rx_overruns; unsigned long watchdog_events; struct e1000_hw_stats stats; u16 vf_ifp; }; /******************************************************************************** * vendor_info_array * * This array contains the list of Subvendor/Subdevice IDs on which the driver * should load. 
* ********************************************************************************/ typedef struct _em_vendor_info_t { unsigned int vendor_id; unsigned int device_id; unsigned int subvendor_id; unsigned int subdevice_id; unsigned int index; } em_vendor_info_t; void em_dump_rs(struct adapter *); #define EM_RSSRK_SIZE 4 #define EM_RSSRK_VAL(key, i) (key[(i) * EM_RSSRK_SIZE] | \ key[(i) * EM_RSSRK_SIZE + 1] << 8 | \ key[(i) * EM_RSSRK_SIZE + 2] << 16 | \ key[(i) * EM_RSSRK_SIZE + 3] << 24) #endif /* _EM_H_DEFINED_ */ Index: projects/runtime-coverage/sys/dev/nvme/nvme.c =================================================================== --- projects/runtime-coverage/sys/dev/nvme/nvme.c (revision 322921) +++ projects/runtime-coverage/sys/dev/nvme/nvme.c (revision 322922) @@ -1,448 +1,452 @@ /*- * Copyright (C) 2012-2014 Intel Corporation * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. 
*/ #include __FBSDID("$FreeBSD$"); #include #include #include #include #include #include #include #include "nvme_private.h" struct nvme_consumer { uint32_t id; nvme_cons_ns_fn_t ns_fn; nvme_cons_ctrlr_fn_t ctrlr_fn; nvme_cons_async_fn_t async_fn; nvme_cons_fail_fn_t fail_fn; }; struct nvme_consumer nvme_consumer[NVME_MAX_CONSUMERS]; #define INVALID_CONSUMER_ID 0xFFFF uma_zone_t nvme_request_zone; int32_t nvme_retry_count; MALLOC_DEFINE(M_NVME, "nvme", "nvme(4) memory allocations"); static int nvme_probe(device_t); static int nvme_attach(device_t); static int nvme_detach(device_t); static int nvme_shutdown(device_t); static int nvme_modevent(module_t mod, int type, void *arg); static devclass_t nvme_devclass; static device_method_t nvme_pci_methods[] = { /* Device interface */ DEVMETHOD(device_probe, nvme_probe), DEVMETHOD(device_attach, nvme_attach), DEVMETHOD(device_detach, nvme_detach), DEVMETHOD(device_shutdown, nvme_shutdown), { 0, 0 } }; static driver_t nvme_pci_driver = { "nvme", nvme_pci_methods, sizeof(struct nvme_controller), }; DRIVER_MODULE(nvme, pci, nvme_pci_driver, nvme_devclass, nvme_modevent, 0); MODULE_VERSION(nvme, 1); static struct _pcsid { uint32_t devid; int match_subdevice; uint16_t subdevice; const char *desc; } pci_ids[] = { { 0x01118086, 0, 0, "NVMe Controller" }, { IDT32_PCI_ID, 0, 0, "IDT NVMe Controller (32 channel)" }, { IDT8_PCI_ID, 0, 0, "IDT NVMe Controller (8 channel)" }, { 0x09538086, 1, 0x3702, "DC P3700 SSD" }, { 0x09538086, 1, 0x3703, "DC P3700 SSD [2.5\" SFF]" }, { 0x09538086, 1, 0x3704, "DC P3500 SSD [Add-in Card]" }, { 0x09538086, 1, 0x3705, "DC P3500 SSD [2.5\" SFF]" }, { 0x09538086, 1, 0x3709, "DC P3600 SSD [Add-in Card]" }, { 0x09538086, 1, 0x370a, "DC P3600 SSD [2.5\" SFF]" }, { 0x00000000, 0, 0, NULL } }; static int nvme_match(uint32_t devid, uint16_t subdevice, struct _pcsid *ep) { if (devid != ep->devid) return 0; if (!ep->match_subdevice) return 1; if (subdevice == ep->subdevice) return 1; else return 0; } static int nvme_probe (device_t device) { struct _pcsid *ep; uint32_t devid; uint16_t subdevice; devid = pci_get_devid(device); subdevice = pci_get_subdevice(device); ep = pci_ids; while (ep->devid) { if (nvme_match(devid, subdevice, ep)) break; ++ep; } if (ep->desc) { device_set_desc(device, ep->desc); return (BUS_PROBE_DEFAULT); } #if defined(PCIS_STORAGE_NVM) if (pci_get_class(device) == PCIC_STORAGE && pci_get_subclass(device) == PCIS_STORAGE_NVM && pci_get_progif(device) == PCIP_STORAGE_NVM_ENTERPRISE_NVMHCI_1_0) { device_set_desc(device, "Generic NVMe Device"); return (BUS_PROBE_GENERIC); } #endif return (ENXIO); } static void nvme_init(void) { uint32_t i; nvme_request_zone = uma_zcreate("nvme_request", sizeof(struct nvme_request), NULL, NULL, NULL, NULL, 0, 0); for (i = 0; i < NVME_MAX_CONSUMERS; i++) nvme_consumer[i].id = INVALID_CONSUMER_ID; } SYSINIT(nvme_register, SI_SUB_DRIVERS, SI_ORDER_SECOND, nvme_init, NULL); static void nvme_uninit(void) { uma_zdestroy(nvme_request_zone); } SYSUNINIT(nvme_unregister, SI_SUB_DRIVERS, SI_ORDER_SECOND, nvme_uninit, NULL); static void nvme_load(void) { } static void nvme_unload(void) { } static int nvme_shutdown(device_t dev) { struct nvme_controller *ctrlr; ctrlr = DEVICE2SOFTC(dev); nvme_ctrlr_shutdown(ctrlr); return (0); } static int nvme_modevent(module_t mod, int type, void *arg) { switch (type) { case MOD_LOAD: nvme_load(); break; case MOD_UNLOAD: nvme_unload(); break; default: break; } return (0); } void nvme_dump_command(struct nvme_command *cmd) { printf( "opc:%x f:%x r1:%x cid:%x 
nsid:%x r2:%x r3:%x mptr:%jx prp1:%jx prp2:%jx cdw:%x %x %x %x %x %x\n", cmd->opc, cmd->fuse, cmd->rsvd1, cmd->cid, cmd->nsid, cmd->rsvd2, cmd->rsvd3, (uintmax_t)cmd->mptr, (uintmax_t)cmd->prp1, (uintmax_t)cmd->prp2, cmd->cdw10, cmd->cdw11, cmd->cdw12, cmd->cdw13, cmd->cdw14, cmd->cdw15); } void nvme_dump_completion(struct nvme_completion *cpl) { printf("cdw0:%08x sqhd:%04x sqid:%04x " "cid:%04x p:%x sc:%02x sct:%x m:%x dnr:%x\n", cpl->cdw0, cpl->sqhd, cpl->sqid, cpl->cid, cpl->status.p, cpl->status.sc, cpl->status.sct, cpl->status.m, cpl->status.dnr); } static int nvme_attach(device_t dev) { struct nvme_controller *ctrlr = DEVICE2SOFTC(dev); int status; status = nvme_ctrlr_construct(ctrlr, dev); if (status != 0) { nvme_ctrlr_destruct(ctrlr, dev); return (status); } /* + * Enable busmastering so the completion status messages can + * be busmastered back to the host. + */ + pci_enable_busmaster(dev); + + /* * Reset controller twice to ensure we do a transition from cc.en==1 * to cc.en==0. This is because we don't really know what status * the controller was left in when boot handed off to OS. */ status = nvme_ctrlr_hw_reset(ctrlr); if (status != 0) { nvme_ctrlr_destruct(ctrlr, dev); return (status); } status = nvme_ctrlr_hw_reset(ctrlr); if (status != 0) { nvme_ctrlr_destruct(ctrlr, dev); return (status); } - - pci_enable_busmaster(dev); ctrlr->config_hook.ich_func = nvme_ctrlr_start_config_hook; ctrlr->config_hook.ich_arg = ctrlr; config_intrhook_establish(&ctrlr->config_hook); return (0); } static int nvme_detach (device_t dev) { struct nvme_controller *ctrlr = DEVICE2SOFTC(dev); nvme_ctrlr_destruct(ctrlr, dev); pci_disable_busmaster(dev); return (0); } static void nvme_notify(struct nvme_consumer *cons, struct nvme_controller *ctrlr) { struct nvme_namespace *ns; void *ctrlr_cookie; int cmpset, ns_idx; /* * The consumer may register itself after the nvme devices * have registered with the kernel, but before the * driver has completed initialization. In that case, * return here, and when initialization completes, the * controller will make sure the consumer gets notified. */ if (!ctrlr->is_initialized) return; cmpset = atomic_cmpset_32(&ctrlr->notification_sent, 0, 1); if (cmpset == 0) return; if (cons->ctrlr_fn != NULL) ctrlr_cookie = (*cons->ctrlr_fn)(ctrlr); else ctrlr_cookie = NULL; ctrlr->cons_cookie[cons->id] = ctrlr_cookie; if (ctrlr->is_failed) { if (cons->fail_fn != NULL) (*cons->fail_fn)(ctrlr_cookie); /* * Do not notify consumers about the namespaces of a * failed controller. 
*/ return; } for (ns_idx = 0; ns_idx < min(ctrlr->cdata.nn, NVME_MAX_NAMESPACES); ns_idx++) { ns = &ctrlr->ns[ns_idx]; if (ns->data.nsze == 0) continue; if (cons->ns_fn != NULL) ns->cons_cookie[cons->id] = (*cons->ns_fn)(ns, ctrlr_cookie); } } void nvme_notify_new_controller(struct nvme_controller *ctrlr) { int i; for (i = 0; i < NVME_MAX_CONSUMERS; i++) { if (nvme_consumer[i].id != INVALID_CONSUMER_ID) { nvme_notify(&nvme_consumer[i], ctrlr); } } } static void nvme_notify_new_consumer(struct nvme_consumer *cons) { device_t *devlist; struct nvme_controller *ctrlr; int dev_idx, devcount; if (devclass_get_devices(nvme_devclass, &devlist, &devcount)) return; for (dev_idx = 0; dev_idx < devcount; dev_idx++) { ctrlr = DEVICE2SOFTC(devlist[dev_idx]); nvme_notify(cons, ctrlr); } free(devlist, M_TEMP); } void nvme_notify_async_consumers(struct nvme_controller *ctrlr, const struct nvme_completion *async_cpl, uint32_t log_page_id, void *log_page_buffer, uint32_t log_page_size) { struct nvme_consumer *cons; uint32_t i; for (i = 0; i < NVME_MAX_CONSUMERS; i++) { cons = &nvme_consumer[i]; if (cons->id != INVALID_CONSUMER_ID && cons->async_fn != NULL) (*cons->async_fn)(ctrlr->cons_cookie[i], async_cpl, log_page_id, log_page_buffer, log_page_size); } } void nvme_notify_fail_consumers(struct nvme_controller *ctrlr) { struct nvme_consumer *cons; uint32_t i; /* * This controller failed during initialization (i.e. IDENTIFY * command failed or timed out). Do not notify any nvme * consumers of the failure here, since the consumer does not * even know about the controller yet. */ if (!ctrlr->is_initialized) return; for (i = 0; i < NVME_MAX_CONSUMERS; i++) { cons = &nvme_consumer[i]; if (cons->id != INVALID_CONSUMER_ID && cons->fail_fn != NULL) cons->fail_fn(ctrlr->cons_cookie[i]); } } struct nvme_consumer * nvme_register_consumer(nvme_cons_ns_fn_t ns_fn, nvme_cons_ctrlr_fn_t ctrlr_fn, nvme_cons_async_fn_t async_fn, nvme_cons_fail_fn_t fail_fn) { int i; /* * TODO: add locking around consumer registration. Not an issue * right now since we only have one nvme consumer - nvd(4). */ for (i = 0; i < NVME_MAX_CONSUMERS; i++) if (nvme_consumer[i].id == INVALID_CONSUMER_ID) { nvme_consumer[i].id = i; nvme_consumer[i].ns_fn = ns_fn; nvme_consumer[i].ctrlr_fn = ctrlr_fn; nvme_consumer[i].async_fn = async_fn; nvme_consumer[i].fail_fn = fail_fn; nvme_notify_new_consumer(&nvme_consumer[i]); return (&nvme_consumer[i]); } printf("nvme(4): consumer not registered - no slots available\n"); return (NULL); } void nvme_unregister_consumer(struct nvme_consumer *consumer) { consumer->id = INVALID_CONSUMER_ID; } void nvme_completion_poll_cb(void *arg, const struct nvme_completion *cpl) { struct nvme_completion_poll_status *status = arg; /* * Copy status into the argument passed by the caller, so that * the caller can check the status to determine if the * the request passed or failed. */ memcpy(&status->cpl, cpl, sizeof(*cpl)); wmb(); status->done = TRUE; } Index: projects/runtime-coverage/sys/dev/nvme/nvme.h =================================================================== --- projects/runtime-coverage/sys/dev/nvme/nvme.h (revision 322921) +++ projects/runtime-coverage/sys/dev/nvme/nvme.h (revision 322922) @@ -1,1010 +1,1122 @@ /*- * Copyright (C) 2012-2013 Intel Corporation * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. 
Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * $FreeBSD$ */ #ifndef __NVME_H__ #define __NVME_H__ #ifdef _KERNEL #include #endif #include #define NVME_PASSTHROUGH_CMD _IOWR('n', 0, struct nvme_pt_command) #define NVME_RESET_CONTROLLER _IO('n', 1) #define NVME_IO_TEST _IOWR('n', 100, struct nvme_io_test) #define NVME_BIO_TEST _IOWR('n', 101, struct nvme_io_test) /* * Use to mark a command to apply to all namespaces, or to retrieve global * log pages. */ #define NVME_GLOBAL_NAMESPACE_TAG ((uint32_t)0xFFFFFFFF) /* Cap nvme to 1MB transfers driver explodes with larger sizes */ #define NVME_MAX_XFER_SIZE (MAXPHYS < (1<<20) ? MAXPHYS : (1<<20)) union cap_lo_register { uint32_t raw; struct { /** maximum queue entries supported */ uint32_t mqes : 16; /** contiguous queues required */ uint32_t cqr : 1; /** arbitration mechanism supported */ uint32_t ams : 2; uint32_t reserved1 : 5; /** timeout */ uint32_t to : 8; } bits __packed; } __packed; +_Static_assert(sizeof(union cap_lo_register) == 4, "bad size for cap_lo_register"); + union cap_hi_register { uint32_t raw; struct { /** doorbell stride */ uint32_t dstrd : 4; uint32_t reserved3 : 1; /** command sets supported */ uint32_t css_nvm : 1; uint32_t css_reserved : 3; uint32_t reserved2 : 7; /** memory page size minimum */ uint32_t mpsmin : 4; /** memory page size maximum */ uint32_t mpsmax : 4; uint32_t reserved1 : 8; } bits __packed; } __packed; +_Static_assert(sizeof(union cap_hi_register) == 4, "bad size of cap_hi_register"); + union cc_register { uint32_t raw; struct { /** enable */ uint32_t en : 1; uint32_t reserved1 : 3; /** i/o command set selected */ uint32_t css : 3; /** memory page size */ uint32_t mps : 4; /** arbitration mechanism selected */ uint32_t ams : 3; /** shutdown notification */ uint32_t shn : 2; /** i/o submission queue entry size */ uint32_t iosqes : 4; /** i/o completion queue entry size */ uint32_t iocqes : 4; uint32_t reserved2 : 8; } bits __packed; } __packed; +_Static_assert(sizeof(union cc_register) == 4, "bad size for cc_register"); + enum shn_value { NVME_SHN_NORMAL = 0x1, NVME_SHN_ABRUPT = 0x2, }; union csts_register { uint32_t raw; struct { /** ready */ uint32_t rdy : 1; /** controller fatal status */ uint32_t cfs : 1; /** shutdown status */ uint32_t shst : 2; uint32_t reserved1 : 28; } bits __packed; } __packed; +_Static_assert(sizeof(union csts_register) == 4, "bad size for csts_register"); + enum shst_value { NVME_SHST_NORMAL = 0x0, NVME_SHST_OCCURRING = 0x1, 
NVME_SHST_COMPLETE = 0x2, }; union aqa_register { uint32_t raw; struct { /** admin submission queue size */ uint32_t asqs : 12; uint32_t reserved1 : 4; /** admin completion queue size */ uint32_t acqs : 12; uint32_t reserved2 : 4; } bits __packed; } __packed; +_Static_assert(sizeof(union aqa_register) == 4, "bad size for aqa_resgister"); + struct nvme_registers { /** controller capabilities */ union cap_lo_register cap_lo; union cap_hi_register cap_hi; uint32_t vs; /* version */ uint32_t intms; /* interrupt mask set */ uint32_t intmc; /* interrupt mask clear */ /** controller configuration */ union cc_register cc; uint32_t reserved1; /** controller status */ union csts_register csts; uint32_t reserved2; /** admin queue attributes */ union aqa_register aqa; uint64_t asq; /* admin submission queue base addr */ uint64_t acq; /* admin completion queue base addr */ uint32_t reserved3[0x3f2]; struct { uint32_t sq_tdbl; /* submission queue tail doorbell */ uint32_t cq_hdbl; /* completion queue head doorbell */ } doorbell[1] __packed; } __packed; +_Static_assert(sizeof(struct nvme_registers) == 0x1008, "bad size for nvme_registers"); + struct nvme_command { /* dword 0 */ uint16_t opc : 8; /* opcode */ uint16_t fuse : 2; /* fused operation */ uint16_t rsvd1 : 6; uint16_t cid; /* command identifier */ /* dword 1 */ uint32_t nsid; /* namespace identifier */ /* dword 2-3 */ uint32_t rsvd2; uint32_t rsvd3; /* dword 4-5 */ uint64_t mptr; /* metadata pointer */ /* dword 6-7 */ uint64_t prp1; /* prp entry 1 */ /* dword 8-9 */ uint64_t prp2; /* prp entry 2 */ /* dword 10-15 */ uint32_t cdw10; /* command-specific */ uint32_t cdw11; /* command-specific */ uint32_t cdw12; /* command-specific */ uint32_t cdw13; /* command-specific */ uint32_t cdw14; /* command-specific */ uint32_t cdw15; /* command-specific */ } __packed; +_Static_assert(sizeof(struct nvme_command) == 16 * 4, "bad size for nvme_command"); + struct nvme_status { uint16_t p : 1; /* phase tag */ uint16_t sc : 8; /* status code */ uint16_t sct : 3; /* status code type */ uint16_t rsvd2 : 2; uint16_t m : 1; /* more */ uint16_t dnr : 1; /* do not retry */ } __packed; +_Static_assert(sizeof(struct nvme_status) == 2, "bad size for nvme_status"); + struct nvme_completion { /* dword 0 */ uint32_t cdw0; /* command-specific */ /* dword 1 */ uint32_t rsvd1; /* dword 2 */ uint16_t sqhd; /* submission queue head pointer */ uint16_t sqid; /* submission queue identifier */ /* dword 3 */ uint16_t cid; /* command identifier */ struct nvme_status status; } __packed; +_Static_assert(sizeof(struct nvme_completion) == 4 * 4, "bad size for nvme_completion"); + struct nvme_dsm_range { uint32_t attributes; uint32_t length; uint64_t starting_lba; } __packed; +_Static_assert(sizeof(struct nvme_dsm_range) == 16, "bad size for nvme_dsm_ranage"); + /* status code types */ enum nvme_status_code_type { NVME_SCT_GENERIC = 0x0, NVME_SCT_COMMAND_SPECIFIC = 0x1, NVME_SCT_MEDIA_ERROR = 0x2, /* 0x3-0x6 - reserved */ NVME_SCT_VENDOR_SPECIFIC = 0x7, }; /* generic command status codes */ enum nvme_generic_command_status_code { NVME_SC_SUCCESS = 0x00, NVME_SC_INVALID_OPCODE = 0x01, NVME_SC_INVALID_FIELD = 0x02, NVME_SC_COMMAND_ID_CONFLICT = 0x03, NVME_SC_DATA_TRANSFER_ERROR = 0x04, NVME_SC_ABORTED_POWER_LOSS = 0x05, NVME_SC_INTERNAL_DEVICE_ERROR = 0x06, NVME_SC_ABORTED_BY_REQUEST = 0x07, NVME_SC_ABORTED_SQ_DELETION = 0x08, NVME_SC_ABORTED_FAILED_FUSED = 0x09, NVME_SC_ABORTED_MISSING_FUSED = 0x0a, NVME_SC_INVALID_NAMESPACE_OR_FORMAT = 0x0b, NVME_SC_COMMAND_SEQUENCE_ERROR = 0x0c, 
NVME_SC_LBA_OUT_OF_RANGE = 0x80, NVME_SC_CAPACITY_EXCEEDED = 0x81, NVME_SC_NAMESPACE_NOT_READY = 0x82, }; /* command specific status codes */ enum nvme_command_specific_status_code { NVME_SC_COMPLETION_QUEUE_INVALID = 0x00, NVME_SC_INVALID_QUEUE_IDENTIFIER = 0x01, NVME_SC_MAXIMUM_QUEUE_SIZE_EXCEEDED = 0x02, NVME_SC_ABORT_COMMAND_LIMIT_EXCEEDED = 0x03, /* 0x04 - reserved */ NVME_SC_ASYNC_EVENT_REQUEST_LIMIT_EXCEEDED = 0x05, NVME_SC_INVALID_FIRMWARE_SLOT = 0x06, NVME_SC_INVALID_FIRMWARE_IMAGE = 0x07, NVME_SC_INVALID_INTERRUPT_VECTOR = 0x08, NVME_SC_INVALID_LOG_PAGE = 0x09, NVME_SC_INVALID_FORMAT = 0x0a, NVME_SC_FIRMWARE_REQUIRES_RESET = 0x0b, NVME_SC_CONFLICTING_ATTRIBUTES = 0x80, NVME_SC_INVALID_PROTECTION_INFO = 0x81, NVME_SC_ATTEMPTED_WRITE_TO_RO_PAGE = 0x82, }; /* media error status codes */ enum nvme_media_error_status_code { NVME_SC_WRITE_FAULTS = 0x80, NVME_SC_UNRECOVERED_READ_ERROR = 0x81, NVME_SC_GUARD_CHECK_ERROR = 0x82, NVME_SC_APPLICATION_TAG_CHECK_ERROR = 0x83, NVME_SC_REFERENCE_TAG_CHECK_ERROR = 0x84, NVME_SC_COMPARE_FAILURE = 0x85, NVME_SC_ACCESS_DENIED = 0x86, }; /* admin opcodes */ enum nvme_admin_opcode { NVME_OPC_DELETE_IO_SQ = 0x00, NVME_OPC_CREATE_IO_SQ = 0x01, NVME_OPC_GET_LOG_PAGE = 0x02, /* 0x03 - reserved */ NVME_OPC_DELETE_IO_CQ = 0x04, NVME_OPC_CREATE_IO_CQ = 0x05, NVME_OPC_IDENTIFY = 0x06, /* 0x07 - reserved */ NVME_OPC_ABORT = 0x08, NVME_OPC_SET_FEATURES = 0x09, NVME_OPC_GET_FEATURES = 0x0a, /* 0x0b - reserved */ NVME_OPC_ASYNC_EVENT_REQUEST = 0x0c, NVME_OPC_NAMESPACE_MANAGEMENT = 0x0d, /* 0x0e-0x0f - reserved */ NVME_OPC_FIRMWARE_ACTIVATE = 0x10, NVME_OPC_FIRMWARE_IMAGE_DOWNLOAD = 0x11, NVME_OPC_NAMESPACE_ATTACHMENT = 0x15, NVME_OPC_FORMAT_NVM = 0x80, NVME_OPC_SECURITY_SEND = 0x81, NVME_OPC_SECURITY_RECEIVE = 0x82, }; /* nvme nvm opcodes */ enum nvme_nvm_opcode { NVME_OPC_FLUSH = 0x00, NVME_OPC_WRITE = 0x01, NVME_OPC_READ = 0x02, /* 0x03 - reserved */ NVME_OPC_WRITE_UNCORRECTABLE = 0x04, NVME_OPC_COMPARE = 0x05, /* 0x06-0x07 - reserved */ NVME_OPC_DATASET_MANAGEMENT = 0x09, }; enum nvme_feature { /* 0x00 - reserved */ NVME_FEAT_ARBITRATION = 0x01, NVME_FEAT_POWER_MANAGEMENT = 0x02, NVME_FEAT_LBA_RANGE_TYPE = 0x03, NVME_FEAT_TEMPERATURE_THRESHOLD = 0x04, NVME_FEAT_ERROR_RECOVERY = 0x05, NVME_FEAT_VOLATILE_WRITE_CACHE = 0x06, NVME_FEAT_NUMBER_OF_QUEUES = 0x07, NVME_FEAT_INTERRUPT_COALESCING = 0x08, NVME_FEAT_INTERRUPT_VECTOR_CONFIGURATION = 0x09, NVME_FEAT_WRITE_ATOMICITY = 0x0A, NVME_FEAT_ASYNC_EVENT_CONFIGURATION = 0x0B, - /* 0x0C-0x7F - reserved */ + NVME_FEAT_AUTONOMOUS_POWER_STATE_TRANSITION = 0x0C, + NVME_FEAT_HOST_MEMORY_BUFFER = 0x0D, + NVME_FEAT_TIMESTAMP = 0x0E, + NVME_FEAT_KEEP_ALIVE_TIMER = 0x0F, + NVME_FEAT_HOST_CONTROLLED_THERMAL_MGMT = 0x10, + NVME_FEAT_NON_OP_POWER_STATE_CONFIG = 0x11, + /* 0x12-0x77 - reserved */ + /* 0x78-0x7f - NVMe Management Interface */ NVME_FEAT_SOFTWARE_PROGRESS_MARKER = 0x80, /* 0x81-0xBF - command set specific (reserved) */ /* 0xC0-0xFF - vendor specific */ }; enum nvme_dsm_attribute { NVME_DSM_ATTR_INTEGRAL_READ = 0x1, NVME_DSM_ATTR_INTEGRAL_WRITE = 0x2, NVME_DSM_ATTR_DEALLOCATE = 0x4, }; enum nvme_activate_action { NVME_AA_REPLACE_NO_ACTIVATE = 0x0, NVME_AA_REPLACE_ACTIVATE = 0x1, NVME_AA_ACTIVATE = 0x2, }; struct nvme_power_state { /** Maximum Power */ uint16_t mp; /* Maximum Power */ uint8_t ps_rsvd1; uint8_t mps : 1; /* Max Power Scale */ uint8_t nops : 1; /* Non-Operational State */ uint8_t ps_rsvd2 : 6; uint32_t enlat; /* Entry Latency */ uint32_t exlat; /* Exit Latency */ uint8_t rrt : 5; /* Relative Read 
Throughput */ uint8_t ps_rsvd3 : 3; uint8_t rrl : 5; /* Relative Read Latency */ uint8_t ps_rsvd4 : 3; uint8_t rwt : 5; /* Relative Write Throughput */ uint8_t ps_rsvd5 : 3; uint8_t rwl : 5; /* Relative Write Latency */ uint8_t ps_rsvd6 : 3; uint16_t idlp; /* Idle Power */ uint8_t ps_rsvd7 : 6; uint8_t ips : 2; /* Idle Power Scale */ uint8_t ps_rsvd8; uint16_t actp; /* Active Power */ uint8_t apw : 3; /* Active Power Workload */ uint8_t ps_rsvd9 : 3; uint8_t aps : 2; /* Active Power Scale */ uint8_t ps_rsvd10[9]; } __packed; +_Static_assert(sizeof(struct nvme_power_state) == 32, "bad size for nvme_power_state"); + #define NVME_SERIAL_NUMBER_LENGTH 20 #define NVME_MODEL_NUMBER_LENGTH 40 #define NVME_FIRMWARE_REVISION_LENGTH 8 struct nvme_controller_data { /* bytes 0-255: controller capabilities and features */ /** pci vendor id */ uint16_t vid; /** pci subsystem vendor id */ uint16_t ssvid; /** serial number */ uint8_t sn[NVME_SERIAL_NUMBER_LENGTH]; /** model number */ uint8_t mn[NVME_MODEL_NUMBER_LENGTH]; /** firmware revision */ uint8_t fr[NVME_FIRMWARE_REVISION_LENGTH]; /** recommended arbitration burst */ uint8_t rab; /** ieee oui identifier */ uint8_t ieee[3]; /** multi-interface capabilities */ uint8_t mic; /** maximum data transfer size */ uint8_t mdts; /** Controller ID */ uint16_t ctrlr_id; - uint8_t reserved1[176]; + /** Version */ + uint32_t ver; + /** RTD3 Resume Latency */ + uint32_t rtd3r; + + /** RTD3 Enter Latency */ + uint32_t rtd3e; + + /** Optional Asynchronous Events Supported */ + uint32_t oaes; /* bitfield really */ + + /** Controller Attributes */ + uint32_t ctratt; /* bitfield really */ + + uint8_t reserved1[12]; + + /** FRU Globally Unique Identifier */ + uint8_t fguid[16]; + + uint8_t reserved2[128]; + /* bytes 256-511: admin command set attributes */ /** optional admin command support */ struct { /* supports security send/receive commands */ uint16_t security : 1; /* supports format nvm command */ uint16_t format : 1; /* supports firmware activate/download commands */ uint16_t firmware : 1; /* supports namespace management commands */ uint16_t nsmgmt : 1; uint16_t oacs_rsvd : 12; } __packed oacs; /** abort command limit */ uint8_t acl; /** asynchronous event request limit */ uint8_t aerl; /** firmware updates */ struct { /* first slot is read-only */ uint8_t slot1_ro : 1; /* number of firmware slots */ uint8_t num_slots : 3; uint8_t frmw_rsvd : 4; } __packed frmw; /** log page attributes */ struct { /* per namespace smart/health log page */ uint8_t ns_smart : 1; uint8_t lpa_rsvd : 7; } __packed lpa; /** error log page entries */ uint8_t elpe; /** number of power states supported */ uint8_t npss; /** admin vendor specific command configuration */ struct { /* admin vendor specific commands use spec format */ uint8_t spec_format : 1; uint8_t avscc_rsvd : 7; } __packed avscc; - uint8_t reserved2[15]; + /** Autonomous Power State Transition Attributes */ + struct { + /* Autonmous Power State Transitions supported */ + uint8_t apst_supp : 1; + uint8_t apsta_rsvd : 7; + } __packed apsta; + + /** Warning Composite Temperature Threshold */ + uint16_t wctemp; + + /** Critical Composite Temperature Threshold */ + uint16_t cctemp; + + /** Maximum Time for Firmware Activation */ + uint16_t mtfa; + + /** Host Memory Buffer Preferred Size */ + uint32_t hmpre; + + /** Host Memory Buffer Minimum Size */ + uint32_t hmmin; + /** Name space capabilities */ struct { /* if nsmgmt, report tnvmcap and unvmcap */ uint8_t tnvmcap[16]; uint8_t unvmcap[16]; } __packed untncap; - uint8_t 
reserved3[200]; + /** Replay Protected Memory Block Support */ + uint32_t rpmbs; /* Really a bitfield */ + + /** Extended Device Self-test Time */ + uint16_t edstt; + + /** Device Self-test Options */ + uint8_t dsto; /* Really a bitfield */ + + /** Firmware Update Granularity */ + uint8_t fwug; + + /** Keep Alive Support */ + uint16_t kas; + + /** Host Controlled Thermal Management Attributes */ + uint16_t hctma; /* Really a bitfield */ + + /** Minimum Thermal Management Temperature */ + uint16_t mntmt; + + /** Maximum Thermal Management Temperature */ + uint16_t mxtmt; + + /** Sanitize Capabilities */ + uint32_t sanicap; /* Really a bitfield */ + + uint8_t reserved3[180]; /* bytes 512-703: nvm command set attributes */ /** submission queue entry size */ struct { uint8_t min : 4; uint8_t max : 4; } __packed sqes; /** completion queue entry size */ struct { uint8_t min : 4; uint8_t max : 4; } __packed cqes; - uint8_t reserved4[2]; + /** Maximum Outstanding Commands */ + uint16_t maxcmd; /** number of namespaces */ uint32_t nn; /** optional nvm command support */ struct { uint16_t compare : 1; uint16_t write_unc : 1; uint16_t dsm: 1; uint16_t reserved: 13; } __packed oncs; /** fused operation support */ uint16_t fuses; /** format nvm attributes */ uint8_t fna; /** volatile write cache */ struct { uint8_t present : 1; uint8_t reserved : 7; } __packed vwc; /* TODO: flesh out remaining nvm command set attributes */ uint8_t reserved5[178]; /* bytes 704-2047: i/o command set attributes */ uint8_t reserved6[1344]; /* bytes 2048-3071: power state descriptors */ struct nvme_power_state power_state[32]; /* bytes 3072-4095: vendor specific */ uint8_t vs[1024]; } __packed __aligned(4); +_Static_assert(sizeof(struct nvme_controller_data) == 4096, "bad size for nvme_controller_data"); + struct nvme_namespace_data { /** namespace size */ uint64_t nsze; /** namespace capacity */ uint64_t ncap; /** namespace utilization */ uint64_t nuse; /** namespace features */ struct { /** thin provisioning */ uint8_t thin_prov : 1; uint8_t reserved1 : 7; } __packed nsfeat; /** number of lba formats */ uint8_t nlbaf; /** formatted lba size */ struct { uint8_t format : 4; uint8_t extended : 1; uint8_t reserved2 : 3; } __packed flbas; /** metadata capabilities */ struct { /* metadata can be transferred as part of data prp list */ uint8_t extended : 1; /* metadata can be transferred with separate metadata pointer */ uint8_t pointer : 1; uint8_t reserved3 : 6; } __packed mc; /** end-to-end data protection capabilities */ struct { /* protection information type 1 */ uint8_t pit1 : 1; /* protection information type 2 */ uint8_t pit2 : 1; /* protection information type 3 */ uint8_t pit3 : 1; /* first eight bytes of metadata */ uint8_t md_start : 1; /* last eight bytes of metadata */ uint8_t md_end : 1; } __packed dpc; /** end-to-end data protection type settings */ struct { /* protection information type */ uint8_t pit : 3; /* 1 == protection info transferred at start of metadata */ /* 0 == protection info transferred at end of metadata */ uint8_t md_start : 1; uint8_t reserved4 : 4; } __packed dps; uint8_t reserved5[98]; /** lba format support */ struct { /** metadata size */ uint32_t ms : 16; /** lba data size */ uint32_t lbads : 8; /** relative performance */ uint32_t rp : 2; uint32_t reserved6 : 6; } __packed lbaf[16]; uint8_t reserved6[192]; uint8_t vendor_specific[3712]; } __packed __aligned(4); +_Static_assert(sizeof(struct nvme_namespace_data) == 4096, "bad size for nvme_namepsace_data"); + enum nvme_log_page { /* 0x00 
- reserved */ NVME_LOG_ERROR = 0x01, NVME_LOG_HEALTH_INFORMATION = 0x02, NVME_LOG_FIRMWARE_SLOT = 0x03, NVME_LOG_CHANGED_NAMESPACE = 0x04, NVME_LOG_COMMAND_EFFECT = 0x05, /* 0x06-0x7F - reserved */ /* 0x80-0xBF - I/O command set specific */ NVME_LOG_RES_NOTIFICATION = 0x80, /* 0xC0-0xFF - vendor specific */ /* * The following are Intel Specific log pages, but they seem * to be widely implemented. */ INTEL_LOG_READ_LAT_LOG = 0xc1, INTEL_LOG_WRITE_LAT_LOG = 0xc2, INTEL_LOG_TEMP_STATS = 0xc5, INTEL_LOG_ADD_SMART = 0xca, INTEL_LOG_DRIVE_MKT_NAME = 0xdd, /* * HGST log page, with lots ofs sub pages. */ HGST_INFO_LOG = 0xc1, }; struct nvme_error_information_entry { uint64_t error_count; uint16_t sqid; uint16_t cid; struct nvme_status status; uint16_t error_location; uint64_t lba; uint32_t nsid; uint8_t vendor_specific; uint8_t reserved[35]; } __packed __aligned(4); +_Static_assert(sizeof(struct nvme_error_information_entry) == 64, "bad size for nvme_error_information_entry"); + union nvme_critical_warning_state { uint8_t raw; struct { uint8_t available_spare : 1; uint8_t temperature : 1; uint8_t device_reliability : 1; uint8_t read_only : 1; uint8_t volatile_memory_backup : 1; uint8_t reserved : 3; } __packed bits; } __packed; +_Static_assert(sizeof(union nvme_critical_warning_state) == 1, "bad size for nvme_critical_warning_state"); + struct nvme_health_information_page { union nvme_critical_warning_state critical_warning; uint16_t temperature; uint8_t available_spare; uint8_t available_spare_threshold; uint8_t percentage_used; uint8_t reserved[26]; /* * Note that the following are 128-bit values, but are * defined as an array of 2 64-bit values. */ /* Data Units Read is always in 512-byte units. */ uint64_t data_units_read[2]; /* Data Units Written is always in 512-byte units. */ uint64_t data_units_written[2]; /* For NVM command set, this includes Compare commands. */ uint64_t host_read_commands[2]; uint64_t host_write_commands[2]; /* Controller Busy Time is reported in minutes. */ uint64_t controller_busy_time[2]; uint64_t power_cycles[2]; uint64_t power_on_hours[2]; uint64_t unsafe_shutdowns[2]; uint64_t media_errors[2]; uint64_t num_error_info_log_entries[2]; uint32_t warning_temp_time; uint32_t error_temp_time; uint16_t temp_sensor[8]; uint8_t reserved2[296]; } __packed __aligned(4); +_Static_assert(sizeof(struct nvme_health_information_page) == 512, "bad size for nvme_health_information_page"); + struct nvme_firmware_page { struct { uint8_t slot : 3; /* slot for current FW */ uint8_t reserved : 5; } __packed afi; uint8_t reserved[7]; uint64_t revision[7]; /* revisions for 7 slots */ uint8_t reserved2[448]; } __packed __aligned(4); +_Static_assert(sizeof(struct nvme_firmware_page) == 512, "bad size for nvme_firmware_page"); + struct intel_log_temp_stats { uint64_t current; uint64_t overtemp_flag_last; uint64_t overtemp_flag_life; uint64_t max_temp; uint64_t min_temp; uint64_t _rsvd[5]; uint64_t max_oper_temp; uint64_t min_oper_temp; uint64_t est_offset; } __packed __aligned(4); + +_Static_assert(sizeof(struct intel_log_temp_stats) == 13 * 8, "bad size for intel_log_temp_stats"); #define NVME_TEST_MAX_THREADS 128 struct nvme_io_test { enum nvme_nvm_opcode opc; uint32_t size; uint32_t time; /* in seconds */ uint32_t num_threads; uint32_t flags; uint64_t io_completed[NVME_TEST_MAX_THREADS]; }; enum nvme_io_test_flags { /* * Specifies whether dev_refthread/dev_relthread should be * called during NVME_BIO_TEST. Ignored for other test * types. 
*/ NVME_TEST_FLAG_REFTHREAD = 0x1, }; struct nvme_pt_command { /* * cmd is used to specify a passthrough command to a controller or * namespace. * * The following fields from cmd may be specified by the caller: * * opc (opcode) * * nsid (namespace id) - for admin commands only * * cdw10-cdw15 * * Remaining fields must be set to 0 by the caller. */ struct nvme_command cmd; /* * cpl returns completion status for the passthrough command * specified by cmd. * * The following fields will be filled out by the driver, for * consumption by the caller: * * cdw0 * * status (except for phase) * * Remaining fields will be set to 0 by the driver. */ struct nvme_completion cpl; /* buf is the data buffer associated with this passthrough command. */ void * buf; /* * len is the length of the data buffer associated with this * passthrough command. */ uint32_t len; /* * is_read = 1 if the passthrough command will read data into the * supplied buffer from the controller. * * is_read = 0 if the passthrough command will write data from the * supplied buffer to the controller. */ uint32_t is_read; /* * driver_lock is used by the driver only. It must be set to 0 * by the caller. */ struct mtx * driver_lock; }; #define nvme_completion_is_error(cpl) \ ((cpl)->status.sc != 0 || (cpl)->status.sct != 0) void nvme_strvis(uint8_t *dst, const uint8_t *src, int dstlen, int srclen); #ifdef _KERNEL struct bio; struct nvme_namespace; struct nvme_controller; struct nvme_consumer; typedef void (*nvme_cb_fn_t)(void *, const struct nvme_completion *); typedef void *(*nvme_cons_ns_fn_t)(struct nvme_namespace *, void *); typedef void *(*nvme_cons_ctrlr_fn_t)(struct nvme_controller *); typedef void (*nvme_cons_async_fn_t)(void *, const struct nvme_completion *, uint32_t, void *, uint32_t); typedef void (*nvme_cons_fail_fn_t)(void *); enum nvme_namespace_flags { NVME_NS_DEALLOCATE_SUPPORTED = 0x1, NVME_NS_FLUSH_SUPPORTED = 0x2, }; int nvme_ctrlr_passthrough_cmd(struct nvme_controller *ctrlr, struct nvme_pt_command *pt, uint32_t nsid, int is_user_buffer, int is_admin_cmd); /* Admin functions */ void nvme_ctrlr_cmd_set_feature(struct nvme_controller *ctrlr, uint8_t feature, uint32_t cdw11, void *payload, uint32_t payload_size, nvme_cb_fn_t cb_fn, void *cb_arg); void nvme_ctrlr_cmd_get_feature(struct nvme_controller *ctrlr, uint8_t feature, uint32_t cdw11, void *payload, uint32_t payload_size, nvme_cb_fn_t cb_fn, void *cb_arg); void nvme_ctrlr_cmd_get_log_page(struct nvme_controller *ctrlr, uint8_t log_page, uint32_t nsid, void *payload, uint32_t payload_size, nvme_cb_fn_t cb_fn, void *cb_arg); /* NVM I/O functions */ int nvme_ns_cmd_write(struct nvme_namespace *ns, void *payload, uint64_t lba, uint32_t lba_count, nvme_cb_fn_t cb_fn, void *cb_arg); int nvme_ns_cmd_write_bio(struct nvme_namespace *ns, struct bio *bp, nvme_cb_fn_t cb_fn, void *cb_arg); int nvme_ns_cmd_read(struct nvme_namespace *ns, void *payload, uint64_t lba, uint32_t lba_count, nvme_cb_fn_t cb_fn, void *cb_arg); int nvme_ns_cmd_read_bio(struct nvme_namespace *ns, struct bio *bp, nvme_cb_fn_t cb_fn, void *cb_arg); int nvme_ns_cmd_deallocate(struct nvme_namespace *ns, void *payload, uint8_t num_ranges, nvme_cb_fn_t cb_fn, void *cb_arg); int nvme_ns_cmd_flush(struct nvme_namespace *ns, nvme_cb_fn_t cb_fn, void *cb_arg); int nvme_ns_dump(struct nvme_namespace *ns, void *virt, off_t offset, size_t len); /* Registration functions */ struct nvme_consumer * nvme_register_consumer(nvme_cons_ns_fn_t ns_fn, nvme_cons_ctrlr_fn_t ctrlr_fn, nvme_cons_async_fn_t async_fn, 
nvme_cons_fail_fn_t fail_fn); void nvme_unregister_consumer(struct nvme_consumer *consumer); /* Controller helper functions */ device_t nvme_ctrlr_get_device(struct nvme_controller *ctrlr); const struct nvme_controller_data * nvme_ctrlr_get_data(struct nvme_controller *ctrlr); /* Namespace helper functions */ uint32_t nvme_ns_get_max_io_xfer_size(struct nvme_namespace *ns); uint32_t nvme_ns_get_sector_size(struct nvme_namespace *ns); uint64_t nvme_ns_get_num_sectors(struct nvme_namespace *ns); uint64_t nvme_ns_get_size(struct nvme_namespace *ns); uint32_t nvme_ns_get_flags(struct nvme_namespace *ns); const char * nvme_ns_get_serial_number(struct nvme_namespace *ns); const char * nvme_ns_get_model_number(struct nvme_namespace *ns); const struct nvme_namespace_data * nvme_ns_get_data(struct nvme_namespace *ns); uint32_t nvme_ns_get_stripesize(struct nvme_namespace *ns); int nvme_ns_bio_process(struct nvme_namespace *ns, struct bio *bp, nvme_cb_fn_t cb_fn); /* Command building helper functions -- shared with CAM */ static inline void nvme_ns_flush_cmd(struct nvme_command *cmd, uint16_t nsid) { cmd->opc = NVME_OPC_FLUSH; cmd->nsid = nsid; } static inline void nvme_ns_rw_cmd(struct nvme_command *cmd, uint32_t rwcmd, uint16_t nsid, uint64_t lba, uint32_t count) { cmd->opc = rwcmd; cmd->nsid = nsid; cmd->cdw10 = lba & 0xffffffffu; cmd->cdw11 = lba >> 32; cmd->cdw12 = count-1; cmd->cdw13 = 0; cmd->cdw14 = 0; cmd->cdw15 = 0; } static inline void nvme_ns_write_cmd(struct nvme_command *cmd, uint16_t nsid, uint64_t lba, uint32_t count) { nvme_ns_rw_cmd(cmd, NVME_OPC_WRITE, nsid, lba, count); } static inline void nvme_ns_read_cmd(struct nvme_command *cmd, uint16_t nsid, uint64_t lba, uint32_t count) { nvme_ns_rw_cmd(cmd, NVME_OPC_READ, nsid, lba, count); } static inline void nvme_ns_trim_cmd(struct nvme_command *cmd, uint16_t nsid, uint32_t num_ranges) { cmd->opc = NVME_OPC_DATASET_MANAGEMENT; cmd->nsid = nsid; cmd->cdw10 = num_ranges - 1; cmd->cdw11 = NVME_DSM_ATTR_DEALLOCATE; } extern int nvme_use_nvd; #endif /* _KERNEL */ #endif /* __NVME_H__ */ Index: projects/runtime-coverage/sys/dev/nvme/nvme_ctrlr.c =================================================================== --- projects/runtime-coverage/sys/dev/nvme/nvme_ctrlr.c (revision 322921) +++ projects/runtime-coverage/sys/dev/nvme/nvme_ctrlr.c (revision 322922) @@ -1,1246 +1,1246 @@ /*- * Copyright (C) 2012-2016 Intel Corporation * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. 
IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ #include __FBSDID("$FreeBSD$"); #include "opt_cam.h" #include #include #include #include #include #include #include #include #include #include #include #include "nvme_private.h" static void nvme_ctrlr_construct_and_submit_aer(struct nvme_controller *ctrlr, struct nvme_async_event_request *aer); static void nvme_ctrlr_setup_interrupts(struct nvme_controller *ctrlr); static int nvme_ctrlr_allocate_bar(struct nvme_controller *ctrlr) { ctrlr->resource_id = PCIR_BAR(0); ctrlr->resource = bus_alloc_resource_any(ctrlr->dev, SYS_RES_MEMORY, &ctrlr->resource_id, RF_ACTIVE); if(ctrlr->resource == NULL) { nvme_printf(ctrlr, "unable to allocate pci resource\n"); return (ENOMEM); } ctrlr->bus_tag = rman_get_bustag(ctrlr->resource); ctrlr->bus_handle = rman_get_bushandle(ctrlr->resource); ctrlr->regs = (struct nvme_registers *)ctrlr->bus_handle; /* * The NVMe spec allows for the MSI-X table to be placed behind * BAR 4/5, separate from the control/doorbell registers. Always * try to map this bar, because it must be mapped prior to calling * pci_alloc_msix(). If the table isn't behind BAR 4/5, * bus_alloc_resource() will just return NULL which is OK. */ ctrlr->bar4_resource_id = PCIR_BAR(4); ctrlr->bar4_resource = bus_alloc_resource_any(ctrlr->dev, SYS_RES_MEMORY, &ctrlr->bar4_resource_id, RF_ACTIVE); return (0); } static int nvme_ctrlr_construct_admin_qpair(struct nvme_controller *ctrlr) { struct nvme_qpair *qpair; uint32_t num_entries; int error; qpair = &ctrlr->adminq; num_entries = NVME_ADMIN_ENTRIES; TUNABLE_INT_FETCH("hw.nvme.admin_entries", &num_entries); /* * If admin_entries was overridden to an invalid value, revert it * back to our default value. */ if (num_entries < NVME_MIN_ADMIN_ENTRIES || num_entries > NVME_MAX_ADMIN_ENTRIES) { nvme_printf(ctrlr, "invalid hw.nvme.admin_entries=%d " "specified\n", num_entries); num_entries = NVME_ADMIN_ENTRIES; } /* * The admin queue's max xfer size is treated differently than the * max I/O xfer size. 16KB is sufficient here - maybe even less? */ error = nvme_qpair_construct(qpair, 0, /* qpair ID */ 0, /* vector */ num_entries, NVME_ADMIN_TRACKERS, ctrlr); return (error); } static int nvme_ctrlr_construct_io_qpairs(struct nvme_controller *ctrlr) { struct nvme_qpair *qpair; union cap_lo_register cap_lo; int i, error, num_entries, num_trackers; num_entries = NVME_IO_ENTRIES; TUNABLE_INT_FETCH("hw.nvme.io_entries", &num_entries); /* * NVMe spec sets a hard limit of 64K max entries, but * devices may specify a smaller limit, so we need to check * the MQES field in the capabilities register. */ cap_lo.raw = nvme_mmio_read_4(ctrlr, cap_lo); num_entries = min(num_entries, cap_lo.bits.mqes+1); num_trackers = NVME_IO_TRACKERS; TUNABLE_INT_FETCH("hw.nvme.io_trackers", &num_trackers); num_trackers = max(num_trackers, NVME_MIN_IO_TRACKERS); num_trackers = min(num_trackers, NVME_MAX_IO_TRACKERS); /* * No need to have more trackers than entries in the submit queue. 
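 * (A tracker is the driver's per-command bookkeeping slot, so one is tied
 * up for every command outstanding on the queue.)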
* Note also that for a queue size of N, we can only have (N-1) * commands outstanding, hence the "-1" here. */ num_trackers = min(num_trackers, (num_entries-1)); /* * This was calculated previously when setting up interrupts, but * a controller could theoretically support fewer I/O queues than * MSI-X vectors. So calculate again here just to be safe. */ ctrlr->num_cpus_per_ioq = howmany(mp_ncpus, ctrlr->num_io_queues); ctrlr->ioq = malloc(ctrlr->num_io_queues * sizeof(struct nvme_qpair), M_NVME, M_ZERO | M_WAITOK); for (i = 0; i < ctrlr->num_io_queues; i++) { qpair = &ctrlr->ioq[i]; /* * Admin queue has ID=0. IO queues start at ID=1 - * hence the 'i+1' here. * * For I/O queues, use the controller-wide max_xfer_size * calculated in nvme_attach(). */ error = nvme_qpair_construct(qpair, i+1, /* qpair ID */ ctrlr->msix_enabled ? i+1 : 0, /* vector */ num_entries, num_trackers, ctrlr); if (error) return (error); /* * Do not bother binding interrupts if we only have one I/O * interrupt thread for this controller. */ if (ctrlr->num_io_queues > 1) bus_bind_intr(ctrlr->dev, qpair->res, i * ctrlr->num_cpus_per_ioq); } return (0); } static void nvme_ctrlr_fail(struct nvme_controller *ctrlr) { int i; ctrlr->is_failed = TRUE; nvme_qpair_fail(&ctrlr->adminq); if (ctrlr->ioq != NULL) { for (i = 0; i < ctrlr->num_io_queues; i++) nvme_qpair_fail(&ctrlr->ioq[i]); } nvme_notify_fail_consumers(ctrlr); } void nvme_ctrlr_post_failed_request(struct nvme_controller *ctrlr, struct nvme_request *req) { mtx_lock(&ctrlr->lock); STAILQ_INSERT_TAIL(&ctrlr->fail_req, req, stailq); mtx_unlock(&ctrlr->lock); taskqueue_enqueue(ctrlr->taskqueue, &ctrlr->fail_req_task); } static void nvme_ctrlr_fail_req_task(void *arg, int pending) { struct nvme_controller *ctrlr = arg; struct nvme_request *req; mtx_lock(&ctrlr->lock); while (!STAILQ_EMPTY(&ctrlr->fail_req)) { req = STAILQ_FIRST(&ctrlr->fail_req); STAILQ_REMOVE_HEAD(&ctrlr->fail_req, stailq); nvme_qpair_manual_complete_request(req->qpair, req, NVME_SCT_GENERIC, NVME_SC_ABORTED_BY_REQUEST, TRUE); } mtx_unlock(&ctrlr->lock); } static int nvme_ctrlr_wait_for_ready(struct nvme_controller *ctrlr, int desired_val) { int ms_waited; union cc_register cc; union csts_register csts; cc.raw = nvme_mmio_read_4(ctrlr, cc); csts.raw = nvme_mmio_read_4(ctrlr, csts); if (cc.bits.en != desired_val) { nvme_printf(ctrlr, "%s called with desired_val = %d " "but cc.en = %d\n", __func__, desired_val, cc.bits.en); return (ENXIO); } ms_waited = 0; while (csts.bits.rdy != desired_val) { DELAY(1000); if (ms_waited++ > ctrlr->ready_timeout_in_ms) { nvme_printf(ctrlr, "controller ready did not become %d " "within %d ms\n", desired_val, ctrlr->ready_timeout_in_ms); return (ENXIO); } csts.raw = nvme_mmio_read_4(ctrlr, csts); } return (0); } static void nvme_ctrlr_disable(struct nvme_controller *ctrlr) { union cc_register cc; union csts_register csts; cc.raw = nvme_mmio_read_4(ctrlr, cc); csts.raw = nvme_mmio_read_4(ctrlr, csts); if (cc.bits.en == 1 && csts.bits.rdy == 0) nvme_ctrlr_wait_for_ready(ctrlr, 1); cc.bits.en = 0; nvme_mmio_write_4(ctrlr, cc, cc.raw); DELAY(5000); nvme_ctrlr_wait_for_ready(ctrlr, 0); } static int nvme_ctrlr_enable(struct nvme_controller *ctrlr) { union cc_register cc; union csts_register csts; union aqa_register aqa; cc.raw = nvme_mmio_read_4(ctrlr, cc); csts.raw = nvme_mmio_read_4(ctrlr, csts); if (cc.bits.en == 1) { if (csts.bits.rdy == 1) return (0); else return (nvme_ctrlr_wait_for_ready(ctrlr, 1)); } nvme_mmio_write_8(ctrlr, asq, ctrlr->adminq.cmd_bus_addr); DELAY(5000); 
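	/*
	 * Sketch of the remainder of the bring-up sequence below: the admin
	 * completion queue base (ACQ) is written next, AQA is loaded with the
	 * 0-based admin queue sizes, and CC is then programmed in one write
	 * with EN=1, IOSQES=6 and IOCQES=4 (the 64-byte SQ and 16-byte CQ
	 * entry sizes expressed as powers of two) and MPS.  CC.MPS encodes
	 * the host page size as 2^(12+MPS), so with 4 KB kernel pages the
	 * (PAGE_SIZE >> 13) expression below yields 0.  The function finally
	 * polls CSTS.RDY via nvme_ctrlr_wait_for_ready().
	 */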
nvme_mmio_write_8(ctrlr, acq, ctrlr->adminq.cpl_bus_addr); DELAY(5000); aqa.raw = 0; /* acqs and asqs are 0-based. */ aqa.bits.acqs = ctrlr->adminq.num_entries-1; aqa.bits.asqs = ctrlr->adminq.num_entries-1; nvme_mmio_write_4(ctrlr, aqa, aqa.raw); DELAY(5000); cc.bits.en = 1; cc.bits.css = 0; cc.bits.ams = 0; cc.bits.shn = 0; cc.bits.iosqes = 6; /* SQ entry size == 64 == 2^6 */ cc.bits.iocqes = 4; /* CQ entry size == 16 == 2^4 */ /* This evaluates to 0, which is according to spec. */ cc.bits.mps = (PAGE_SIZE >> 13); nvme_mmio_write_4(ctrlr, cc, cc.raw); DELAY(5000); return (nvme_ctrlr_wait_for_ready(ctrlr, 1)); } int nvme_ctrlr_hw_reset(struct nvme_controller *ctrlr) { int i; nvme_admin_qpair_disable(&ctrlr->adminq); /* * I/O queues are not allocated before the initial HW * reset, so do not try to disable them. Use is_initialized * to determine if this is the initial HW reset. */ if (ctrlr->is_initialized) { for (i = 0; i < ctrlr->num_io_queues; i++) nvme_io_qpair_disable(&ctrlr->ioq[i]); } DELAY(100*1000); nvme_ctrlr_disable(ctrlr); return (nvme_ctrlr_enable(ctrlr)); } void nvme_ctrlr_reset(struct nvme_controller *ctrlr) { int cmpset; cmpset = atomic_cmpset_32(&ctrlr->is_resetting, 0, 1); if (cmpset == 0 || ctrlr->is_failed) /* * Controller is already resetting or has failed. Return * immediately since there is no need to kick off another * reset in these cases. */ return; taskqueue_enqueue(ctrlr->taskqueue, &ctrlr->reset_task); } static int nvme_ctrlr_identify(struct nvme_controller *ctrlr) { struct nvme_completion_poll_status status; status.done = FALSE; nvme_ctrlr_cmd_identify_controller(ctrlr, &ctrlr->cdata, nvme_completion_poll_cb, &status); while (status.done == FALSE) pause("nvme", 1); if (nvme_completion_is_error(&status.cpl)) { nvme_printf(ctrlr, "nvme_identify_controller failed!\n"); return (ENXIO); } /* * Use MDTS to ensure our default max_xfer_size doesn't exceed what the * controller supports. */ if (ctrlr->cdata.mdts > 0) ctrlr->max_xfer_size = min(ctrlr->max_xfer_size, ctrlr->min_page_size * (1 << (ctrlr->cdata.mdts))); return (0); } static int nvme_ctrlr_set_num_qpairs(struct nvme_controller *ctrlr) { struct nvme_completion_poll_status status; int cq_allocated, sq_allocated; status.done = FALSE; nvme_ctrlr_cmd_set_num_queues(ctrlr, ctrlr->num_io_queues, nvme_completion_poll_cb, &status); while (status.done == FALSE) pause("nvme", 1); if (nvme_completion_is_error(&status.cpl)) { nvme_printf(ctrlr, "nvme_ctrlr_set_num_qpairs failed!\n"); return (ENXIO); } /* * Data in cdw0 is 0-based. * Lower 16-bits indicate number of submission queues allocated. * Upper 16-bits indicate number of completion queues allocated. */ sq_allocated = (status.cpl.cdw0 & 0xFFFF) + 1; cq_allocated = (status.cpl.cdw0 >> 16) + 1; /* * Controller may allocate more queues than we requested, * so use the minimum of the number requested and what was * actually allocated. 
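 *
 * Worked example: if 8 queue pairs were requested and the completion
 * returns cdw0 == 0x00030003, then sq_allocated and cq_allocated both
 * come out as 3 + 1 = 4, and num_io_queues is clamped from 8 down to 4
 * by the min() calls below.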
*/ ctrlr->num_io_queues = min(ctrlr->num_io_queues, sq_allocated); ctrlr->num_io_queues = min(ctrlr->num_io_queues, cq_allocated); return (0); } static int nvme_ctrlr_create_qpairs(struct nvme_controller *ctrlr) { struct nvme_completion_poll_status status; struct nvme_qpair *qpair; int i; for (i = 0; i < ctrlr->num_io_queues; i++) { qpair = &ctrlr->ioq[i]; status.done = FALSE; nvme_ctrlr_cmd_create_io_cq(ctrlr, qpair, qpair->vector, nvme_completion_poll_cb, &status); while (status.done == FALSE) pause("nvme", 1); if (nvme_completion_is_error(&status.cpl)) { nvme_printf(ctrlr, "nvme_create_io_cq failed!\n"); return (ENXIO); } status.done = FALSE; nvme_ctrlr_cmd_create_io_sq(qpair->ctrlr, qpair, nvme_completion_poll_cb, &status); while (status.done == FALSE) pause("nvme", 1); if (nvme_completion_is_error(&status.cpl)) { nvme_printf(ctrlr, "nvme_create_io_sq failed!\n"); return (ENXIO); } } return (0); } static int nvme_ctrlr_construct_namespaces(struct nvme_controller *ctrlr) { struct nvme_namespace *ns; - int i; + uint32_t i; for (i = 0; i < min(ctrlr->cdata.nn, NVME_MAX_NAMESPACES); i++) { ns = &ctrlr->ns[i]; nvme_ns_construct(ns, i+1, ctrlr); } return (0); } static boolean_t is_log_page_id_valid(uint8_t page_id) { switch (page_id) { case NVME_LOG_ERROR: case NVME_LOG_HEALTH_INFORMATION: case NVME_LOG_FIRMWARE_SLOT: return (TRUE); } return (FALSE); } static uint32_t nvme_ctrlr_get_log_page_size(struct nvme_controller *ctrlr, uint8_t page_id) { uint32_t log_page_size; switch (page_id) { case NVME_LOG_ERROR: log_page_size = min( sizeof(struct nvme_error_information_entry) * ctrlr->cdata.elpe, NVME_MAX_AER_LOG_SIZE); break; case NVME_LOG_HEALTH_INFORMATION: log_page_size = sizeof(struct nvme_health_information_page); break; case NVME_LOG_FIRMWARE_SLOT: log_page_size = sizeof(struct nvme_firmware_page); break; default: log_page_size = 0; break; } return (log_page_size); } static void nvme_ctrlr_log_critical_warnings(struct nvme_controller *ctrlr, union nvme_critical_warning_state state) { if (state.bits.available_spare == 1) nvme_printf(ctrlr, "available spare space below threshold\n"); if (state.bits.temperature == 1) nvme_printf(ctrlr, "temperature above threshold\n"); if (state.bits.device_reliability == 1) nvme_printf(ctrlr, "device reliability degraded\n"); if (state.bits.read_only == 1) nvme_printf(ctrlr, "media placed in read only mode\n"); if (state.bits.volatile_memory_backup == 1) nvme_printf(ctrlr, "volatile memory backup device failed\n"); if (state.bits.reserved != 0) nvme_printf(ctrlr, "unknown critical warning(s): state = 0x%02x\n", state.raw); } static void nvme_ctrlr_async_event_log_page_cb(void *arg, const struct nvme_completion *cpl) { struct nvme_async_event_request *aer = arg; struct nvme_health_information_page *health_info; /* * If the log page fetch for some reason completed with an error, * don't pass log page data to the consumers. In practice, this case * should never happen. */ if (nvme_completion_is_error(cpl)) nvme_notify_async_consumers(aer->ctrlr, &aer->cpl, aer->log_page_id, NULL, 0); else { if (aer->log_page_id == NVME_LOG_HEALTH_INFORMATION) { health_info = (struct nvme_health_information_page *) aer->log_page_buffer; nvme_ctrlr_log_critical_warnings(aer->ctrlr, health_info->critical_warning); /* * Critical warnings reported through the * SMART/health log page are persistent, so * clear the associated bits in the async event * config so that we do not receive repeated * notifications for the same event. 
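	 * For instance, once the controller has reported the temperature
	 * warning, its bit is cleared from async_event_config below and the
	 * updated mask is pushed back to the controller, so that condition
	 * no longer generates asynchronous events.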
*/ aer->ctrlr->async_event_config.raw &= ~health_info->critical_warning.raw; nvme_ctrlr_cmd_set_async_event_config(aer->ctrlr, aer->ctrlr->async_event_config, NULL, NULL); } /* * Pass the cpl data from the original async event completion, * not the log page fetch. */ nvme_notify_async_consumers(aer->ctrlr, &aer->cpl, aer->log_page_id, aer->log_page_buffer, aer->log_page_size); } /* * Repost another asynchronous event request to replace the one * that just completed. */ nvme_ctrlr_construct_and_submit_aer(aer->ctrlr, aer); } static void nvme_ctrlr_async_event_cb(void *arg, const struct nvme_completion *cpl) { struct nvme_async_event_request *aer = arg; if (nvme_completion_is_error(cpl)) { /* * Do not retry failed async event requests. This avoids * infinite loops where a new async event request is submitted * to replace the one just failed, only to fail again and * perpetuate the loop. */ return; } /* Associated log page is in bits 23:16 of completion entry dw0. */ aer->log_page_id = (cpl->cdw0 & 0xFF0000) >> 16; nvme_printf(aer->ctrlr, "async event occurred (log page id=0x%x)\n", aer->log_page_id); if (is_log_page_id_valid(aer->log_page_id)) { aer->log_page_size = nvme_ctrlr_get_log_page_size(aer->ctrlr, aer->log_page_id); memcpy(&aer->cpl, cpl, sizeof(*cpl)); nvme_ctrlr_cmd_get_log_page(aer->ctrlr, aer->log_page_id, NVME_GLOBAL_NAMESPACE_TAG, aer->log_page_buffer, aer->log_page_size, nvme_ctrlr_async_event_log_page_cb, aer); /* Wait to notify consumers until after log page is fetched. */ } else { nvme_notify_async_consumers(aer->ctrlr, cpl, aer->log_page_id, NULL, 0); /* * Repost another asynchronous event request to replace the one * that just completed. */ nvme_ctrlr_construct_and_submit_aer(aer->ctrlr, aer); } } static void nvme_ctrlr_construct_and_submit_aer(struct nvme_controller *ctrlr, struct nvme_async_event_request *aer) { struct nvme_request *req; aer->ctrlr = ctrlr; req = nvme_allocate_request_null(nvme_ctrlr_async_event_cb, aer); aer->req = req; /* * Disable timeout here, since asynchronous event requests should by * nature never be timed out. */ req->timeout = FALSE; req->cmd.opc = NVME_OPC_ASYNC_EVENT_REQUEST; nvme_ctrlr_submit_admin_request(ctrlr, req); } static void nvme_ctrlr_configure_aer(struct nvme_controller *ctrlr) { struct nvme_completion_poll_status status; struct nvme_async_event_request *aer; uint32_t i; ctrlr->async_event_config.raw = 0xFF; ctrlr->async_event_config.bits.reserved = 0; status.done = FALSE; nvme_ctrlr_cmd_get_feature(ctrlr, NVME_FEAT_TEMPERATURE_THRESHOLD, 0, NULL, 0, nvme_completion_poll_cb, &status); while (status.done == FALSE) pause("nvme", 1); if (nvme_completion_is_error(&status.cpl) || (status.cpl.cdw0 & 0xFFFF) == 0xFFFF || (status.cpl.cdw0 & 0xFFFF) == 0x0000) { nvme_printf(ctrlr, "temperature threshold not supported\n"); ctrlr->async_event_config.bits.temperature = 0; } nvme_ctrlr_cmd_set_async_event_config(ctrlr, ctrlr->async_event_config, NULL, NULL); /* aerl is a zero-based value, so we need to add 1 here. 
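	 * For example, cdata.aerl == 3 advertises support for four
	 * outstanding requests; the result is additionally capped at
	 * NVME_MAX_ASYNC_EVENTS (8).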
*/ ctrlr->num_aers = min(NVME_MAX_ASYNC_EVENTS, (ctrlr->cdata.aerl+1)); for (i = 0; i < ctrlr->num_aers; i++) { aer = &ctrlr->aer[i]; nvme_ctrlr_construct_and_submit_aer(ctrlr, aer); } } static void nvme_ctrlr_configure_int_coalescing(struct nvme_controller *ctrlr) { ctrlr->int_coal_time = 0; TUNABLE_INT_FETCH("hw.nvme.int_coal_time", &ctrlr->int_coal_time); ctrlr->int_coal_threshold = 0; TUNABLE_INT_FETCH("hw.nvme.int_coal_threshold", &ctrlr->int_coal_threshold); nvme_ctrlr_cmd_set_interrupt_coalescing(ctrlr, ctrlr->int_coal_time, ctrlr->int_coal_threshold, NULL, NULL); } static void nvme_ctrlr_start(void *ctrlr_arg) { struct nvme_controller *ctrlr = ctrlr_arg; uint32_t old_num_io_queues; int i; /* * Only reset adminq here when we are restarting the * controller after a reset. During initialization, * we have already submitted admin commands to get * the number of I/O queues supported, so cannot reset * the adminq again here. */ if (ctrlr->is_resetting) { nvme_qpair_reset(&ctrlr->adminq); } for (i = 0; i < ctrlr->num_io_queues; i++) nvme_qpair_reset(&ctrlr->ioq[i]); nvme_admin_qpair_enable(&ctrlr->adminq); if (nvme_ctrlr_identify(ctrlr) != 0) { nvme_ctrlr_fail(ctrlr); return; } /* * The number of qpairs are determined during controller initialization, * including using NVMe SET_FEATURES/NUMBER_OF_QUEUES to determine the * HW limit. We call SET_FEATURES again here so that it gets called * after any reset for controllers that depend on the driver to * explicit specify how many queues it will use. This value should * never change between resets, so panic if somehow that does happen. */ if (ctrlr->is_resetting) { old_num_io_queues = ctrlr->num_io_queues; if (nvme_ctrlr_set_num_qpairs(ctrlr) != 0) { nvme_ctrlr_fail(ctrlr); return; } if (old_num_io_queues != ctrlr->num_io_queues) { panic("num_io_queues changed from %u to %u", old_num_io_queues, ctrlr->num_io_queues); } } if (nvme_ctrlr_create_qpairs(ctrlr) != 0) { nvme_ctrlr_fail(ctrlr); return; } if (nvme_ctrlr_construct_namespaces(ctrlr) != 0) { nvme_ctrlr_fail(ctrlr); return; } nvme_ctrlr_configure_aer(ctrlr); nvme_ctrlr_configure_int_coalescing(ctrlr); for (i = 0; i < ctrlr->num_io_queues; i++) nvme_io_qpair_enable(&ctrlr->ioq[i]); } void nvme_ctrlr_start_config_hook(void *arg) { struct nvme_controller *ctrlr = arg; nvme_qpair_reset(&ctrlr->adminq); nvme_admin_qpair_enable(&ctrlr->adminq); if (nvme_ctrlr_set_num_qpairs(ctrlr) == 0 && nvme_ctrlr_construct_io_qpairs(ctrlr) == 0) nvme_ctrlr_start(ctrlr); else nvme_ctrlr_fail(ctrlr); nvme_sysctl_initialize_ctrlr(ctrlr); config_intrhook_disestablish(&ctrlr->config_hook); ctrlr->is_initialized = 1; nvme_notify_new_controller(ctrlr); } static void nvme_ctrlr_reset_task(void *arg, int pending) { struct nvme_controller *ctrlr = arg; int status; nvme_printf(ctrlr, "resetting controller\n"); status = nvme_ctrlr_hw_reset(ctrlr); /* * Use pause instead of DELAY, so that we yield to any nvme interrupt * handlers on this CPU that were blocked on a qpair lock. We want * all nvme interrupts completed before proceeding with restarting the * controller. * * XXX - any way to guarantee the interrupt handlers have quiesced? 
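	 * (The hz / 10 pause below works out to roughly a 100 ms grace
	 * period.)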
*/ pause("nvmereset", hz / 10); if (status == 0) nvme_ctrlr_start(ctrlr); else nvme_ctrlr_fail(ctrlr); atomic_cmpset_32(&ctrlr->is_resetting, 1, 0); } void nvme_ctrlr_intx_handler(void *arg) { struct nvme_controller *ctrlr = arg; nvme_mmio_write_4(ctrlr, intms, 1); nvme_qpair_process_completions(&ctrlr->adminq); if (ctrlr->ioq && ctrlr->ioq[0].cpl) nvme_qpair_process_completions(&ctrlr->ioq[0]); nvme_mmio_write_4(ctrlr, intmc, 1); } static int nvme_ctrlr_configure_intx(struct nvme_controller *ctrlr) { ctrlr->msix_enabled = 0; ctrlr->num_io_queues = 1; ctrlr->num_cpus_per_ioq = mp_ncpus; ctrlr->rid = 0; ctrlr->res = bus_alloc_resource_any(ctrlr->dev, SYS_RES_IRQ, &ctrlr->rid, RF_SHAREABLE | RF_ACTIVE); if (ctrlr->res == NULL) { nvme_printf(ctrlr, "unable to allocate shared IRQ\n"); return (ENOMEM); } bus_setup_intr(ctrlr->dev, ctrlr->res, INTR_TYPE_MISC | INTR_MPSAFE, NULL, nvme_ctrlr_intx_handler, ctrlr, &ctrlr->tag); if (ctrlr->tag == NULL) { nvme_printf(ctrlr, "unable to setup intx handler\n"); return (ENOMEM); } return (0); } static void nvme_pt_done(void *arg, const struct nvme_completion *cpl) { struct nvme_pt_command *pt = arg; bzero(&pt->cpl, sizeof(pt->cpl)); pt->cpl.cdw0 = cpl->cdw0; pt->cpl.status = cpl->status; pt->cpl.status.p = 0; mtx_lock(pt->driver_lock); wakeup(pt); mtx_unlock(pt->driver_lock); } int nvme_ctrlr_passthrough_cmd(struct nvme_controller *ctrlr, struct nvme_pt_command *pt, uint32_t nsid, int is_user_buffer, int is_admin_cmd) { struct nvme_request *req; struct mtx *mtx; struct buf *buf = NULL; int ret = 0; vm_offset_t addr, end; if (pt->len > 0) { /* * vmapbuf calls vm_fault_quick_hold_pages which only maps full * pages. Ensure this request has fewer than MAXPHYS bytes when * extended to full pages. */ addr = (vm_offset_t)pt->buf; end = round_page(addr + pt->len); addr = trunc_page(addr); if (end - addr > MAXPHYS) return EIO; if (pt->len > ctrlr->max_xfer_size) { nvme_printf(ctrlr, "pt->len (%d) " "exceeds max_xfer_size (%d)\n", pt->len, ctrlr->max_xfer_size); return EIO; } if (is_user_buffer) { /* * Ensure the user buffer is wired for the duration of * this passthrough command. */ PHOLD(curproc); buf = getpbuf(NULL); buf->b_data = pt->buf; buf->b_bufsize = pt->len; buf->b_iocmd = pt->is_read ? 
BIO_READ : BIO_WRITE; #ifdef NVME_UNMAPPED_BIO_SUPPORT if (vmapbuf(buf, 1) < 0) { #else if (vmapbuf(buf) < 0) { #endif ret = EFAULT; goto err; } req = nvme_allocate_request_vaddr(buf->b_data, pt->len, nvme_pt_done, pt); } else req = nvme_allocate_request_vaddr(pt->buf, pt->len, nvme_pt_done, pt); } else req = nvme_allocate_request_null(nvme_pt_done, pt); req->cmd.opc = pt->cmd.opc; req->cmd.cdw10 = pt->cmd.cdw10; req->cmd.cdw11 = pt->cmd.cdw11; req->cmd.cdw12 = pt->cmd.cdw12; req->cmd.cdw13 = pt->cmd.cdw13; req->cmd.cdw14 = pt->cmd.cdw14; req->cmd.cdw15 = pt->cmd.cdw15; req->cmd.nsid = nsid; if (is_admin_cmd) mtx = &ctrlr->lock; else mtx = &ctrlr->ns[nsid-1].lock; mtx_lock(mtx); pt->driver_lock = mtx; if (is_admin_cmd) nvme_ctrlr_submit_admin_request(ctrlr, req); else nvme_ctrlr_submit_io_request(ctrlr, req); mtx_sleep(pt, mtx, PRIBIO, "nvme_pt", 0); mtx_unlock(mtx); pt->driver_lock = NULL; err: if (buf != NULL) { relpbuf(buf, NULL); PRELE(curproc); } return (ret); } static int nvme_ctrlr_ioctl(struct cdev *cdev, u_long cmd, caddr_t arg, int flag, struct thread *td) { struct nvme_controller *ctrlr; struct nvme_pt_command *pt; ctrlr = cdev->si_drv1; switch (cmd) { case NVME_RESET_CONTROLLER: nvme_ctrlr_reset(ctrlr); break; case NVME_PASSTHROUGH_CMD: pt = (struct nvme_pt_command *)arg; return (nvme_ctrlr_passthrough_cmd(ctrlr, pt, pt->cmd.nsid, 1 /* is_user_buffer */, 1 /* is_admin_cmd */)); default: return (ENOTTY); } return (0); } static struct cdevsw nvme_ctrlr_cdevsw = { .d_version = D_VERSION, .d_flags = 0, .d_ioctl = nvme_ctrlr_ioctl }; static void nvme_ctrlr_setup_interrupts(struct nvme_controller *ctrlr) { device_t dev; int per_cpu_io_queues; int min_cpus_per_ioq; int num_vectors_requested, num_vectors_allocated; int num_vectors_available; dev = ctrlr->dev; min_cpus_per_ioq = 1; TUNABLE_INT_FETCH("hw.nvme.min_cpus_per_ioq", &min_cpus_per_ioq); if (min_cpus_per_ioq < 1) { min_cpus_per_ioq = 1; } else if (min_cpus_per_ioq > mp_ncpus) { min_cpus_per_ioq = mp_ncpus; } per_cpu_io_queues = 1; TUNABLE_INT_FETCH("hw.nvme.per_cpu_io_queues", &per_cpu_io_queues); if (per_cpu_io_queues == 0) { min_cpus_per_ioq = mp_ncpus; } ctrlr->force_intx = 0; TUNABLE_INT_FETCH("hw.nvme.force_intx", &ctrlr->force_intx); /* * FreeBSD currently cannot allocate more than about 190 vectors at * boot, meaning that systems with high core count and many devices * requesting per-CPU interrupt vectors will not get their full * allotment. So first, try to allocate as many as we may need to * understand what is available, then immediately release them. * Then figure out how many of those we will actually use, based on * assigning an equal number of cores to each I/O queue. */ /* One vector for per core I/O queue, plus one vector for admin queue. */ num_vectors_available = min(pci_msix_count(dev), mp_ncpus + 1); if (pci_alloc_msix(dev, &num_vectors_available) != 0) { num_vectors_available = 0; } pci_release_msi(dev); if (ctrlr->force_intx || num_vectors_available < 2) { nvme_ctrlr_configure_intx(ctrlr); return; } /* * Do not use all vectors for I/O queues - one must be saved for the * admin queue. */ ctrlr->num_cpus_per_ioq = max(min_cpus_per_ioq, howmany(mp_ncpus, num_vectors_available - 1)); ctrlr->num_io_queues = howmany(mp_ncpus, ctrlr->num_cpus_per_ioq); num_vectors_requested = ctrlr->num_io_queues + 1; num_vectors_allocated = num_vectors_requested; /* * Now just allocate the number of vectors we need. 
This should * succeed, since we previously called pci_alloc_msix() * successfully returning at least this many vectors, but just to * be safe, if something goes wrong just revert to INTx. */ if (pci_alloc_msix(dev, &num_vectors_allocated) != 0) { nvme_ctrlr_configure_intx(ctrlr); return; } if (num_vectors_allocated < num_vectors_requested) { pci_release_msi(dev); nvme_ctrlr_configure_intx(ctrlr); return; } ctrlr->msix_enabled = 1; } int nvme_ctrlr_construct(struct nvme_controller *ctrlr, device_t dev) { union cap_lo_register cap_lo; union cap_hi_register cap_hi; int status, timeout_period; ctrlr->dev = dev; mtx_init(&ctrlr->lock, "nvme ctrlr lock", NULL, MTX_DEF); status = nvme_ctrlr_allocate_bar(ctrlr); if (status != 0) return (status); /* * Software emulators may set the doorbell stride to something * other than zero, but this driver is not set up to handle that. */ cap_hi.raw = nvme_mmio_read_4(ctrlr, cap_hi); if (cap_hi.bits.dstrd != 0) return (ENXIO); ctrlr->min_page_size = 1 << (12 + cap_hi.bits.mpsmin); /* Get ready timeout value from controller, in units of 500ms. */ cap_lo.raw = nvme_mmio_read_4(ctrlr, cap_lo); ctrlr->ready_timeout_in_ms = cap_lo.bits.to * 500; timeout_period = NVME_DEFAULT_TIMEOUT_PERIOD; TUNABLE_INT_FETCH("hw.nvme.timeout_period", &timeout_period); timeout_period = min(timeout_period, NVME_MAX_TIMEOUT_PERIOD); timeout_period = max(timeout_period, NVME_MIN_TIMEOUT_PERIOD); ctrlr->timeout_period = timeout_period; nvme_retry_count = NVME_DEFAULT_RETRY_COUNT; TUNABLE_INT_FETCH("hw.nvme.retry_count", &nvme_retry_count); ctrlr->enable_aborts = 0; TUNABLE_INT_FETCH("hw.nvme.enable_aborts", &ctrlr->enable_aborts); nvme_ctrlr_setup_interrupts(ctrlr); ctrlr->max_xfer_size = NVME_MAX_XFER_SIZE; if (nvme_ctrlr_construct_admin_qpair(ctrlr) != 0) return (ENXIO); ctrlr->cdev = make_dev(&nvme_ctrlr_cdevsw, device_get_unit(dev), UID_ROOT, GID_WHEEL, 0600, "nvme%d", device_get_unit(dev)); if (ctrlr->cdev == NULL) return (ENXIO); ctrlr->cdev->si_drv1 = (void *)ctrlr; ctrlr->taskqueue = taskqueue_create("nvme_taskq", M_WAITOK, taskqueue_thread_enqueue, &ctrlr->taskqueue); taskqueue_start_threads(&ctrlr->taskqueue, 1, PI_DISK, "nvme taskq"); ctrlr->is_resetting = 0; ctrlr->is_initialized = 0; ctrlr->notification_sent = 0; TASK_INIT(&ctrlr->reset_task, 0, nvme_ctrlr_reset_task, ctrlr); TASK_INIT(&ctrlr->fail_req_task, 0, nvme_ctrlr_fail_req_task, ctrlr); STAILQ_INIT(&ctrlr->fail_req); ctrlr->is_failed = FALSE; return (0); } void nvme_ctrlr_destruct(struct nvme_controller *ctrlr, device_t dev) { int i; /* * Notify the controller of a shutdown, even though this is due to * a driver unload, not a system shutdown (this path is not invoked * during shutdown). This ensures the controller receives a * shutdown notification in case the system is shutdown before * reloading the driver. 
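	 * nvme_ctrlr_shutdown(), defined later in this file, sets CC.SHN and
	 * then polls CSTS.SHST for up to five seconds waiting for the
	 * controller to report that the shutdown has completed.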
*/ nvme_ctrlr_shutdown(ctrlr); nvme_ctrlr_disable(ctrlr); taskqueue_free(ctrlr->taskqueue); for (i = 0; i < NVME_MAX_NAMESPACES; i++) nvme_ns_destruct(&ctrlr->ns[i]); if (ctrlr->cdev) destroy_dev(ctrlr->cdev); for (i = 0; i < ctrlr->num_io_queues; i++) { nvme_io_qpair_destroy(&ctrlr->ioq[i]); } free(ctrlr->ioq, M_NVME); nvme_admin_qpair_destroy(&ctrlr->adminq); if (ctrlr->resource != NULL) { bus_release_resource(dev, SYS_RES_MEMORY, ctrlr->resource_id, ctrlr->resource); } if (ctrlr->bar4_resource != NULL) { bus_release_resource(dev, SYS_RES_MEMORY, ctrlr->bar4_resource_id, ctrlr->bar4_resource); } if (ctrlr->tag) bus_teardown_intr(ctrlr->dev, ctrlr->res, ctrlr->tag); if (ctrlr->res) bus_release_resource(ctrlr->dev, SYS_RES_IRQ, rman_get_rid(ctrlr->res), ctrlr->res); if (ctrlr->msix_enabled) pci_release_msi(dev); } void nvme_ctrlr_shutdown(struct nvme_controller *ctrlr) { union cc_register cc; union csts_register csts; int ticks = 0; cc.raw = nvme_mmio_read_4(ctrlr, cc); cc.bits.shn = NVME_SHN_NORMAL; nvme_mmio_write_4(ctrlr, cc, cc.raw); csts.raw = nvme_mmio_read_4(ctrlr, csts); while ((csts.bits.shst != NVME_SHST_COMPLETE) && (ticks++ < 5*hz)) { pause("nvme shn", 1); csts.raw = nvme_mmio_read_4(ctrlr, csts); } if (csts.bits.shst != NVME_SHST_COMPLETE) nvme_printf(ctrlr, "did not complete shutdown within 5 seconds " "of notification\n"); } void nvme_ctrlr_submit_admin_request(struct nvme_controller *ctrlr, struct nvme_request *req) { nvme_qpair_submit_request(&ctrlr->adminq, req); } void nvme_ctrlr_submit_io_request(struct nvme_controller *ctrlr, struct nvme_request *req) { struct nvme_qpair *qpair; qpair = &ctrlr->ioq[curcpu / ctrlr->num_cpus_per_ioq]; nvme_qpair_submit_request(qpair, req); } device_t nvme_ctrlr_get_device(struct nvme_controller *ctrlr) { return (ctrlr->dev); } const struct nvme_controller_data * nvme_ctrlr_get_data(struct nvme_controller *ctrlr) { return (&ctrlr->cdata); } Index: projects/runtime-coverage/sys/dev/nvme/nvme_ctrlr_cmd.c =================================================================== --- projects/runtime-coverage/sys/dev/nvme/nvme_ctrlr_cmd.c (revision 322921) +++ projects/runtime-coverage/sys/dev/nvme/nvme_ctrlr_cmd.c (revision 322922) @@ -1,325 +1,325 @@ /*- * Copyright (C) 2012-2013 Intel Corporation * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. 
IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ #include __FBSDID("$FreeBSD$"); #include "nvme_private.h" void nvme_ctrlr_cmd_identify_controller(struct nvme_controller *ctrlr, void *payload, nvme_cb_fn_t cb_fn, void *cb_arg) { struct nvme_request *req; struct nvme_command *cmd; req = nvme_allocate_request_vaddr(payload, sizeof(struct nvme_controller_data), cb_fn, cb_arg); cmd = &req->cmd; cmd->opc = NVME_OPC_IDENTIFY; /* * TODO: create an identify command data structure, which * includes this CNS bit in cdw10. */ cmd->cdw10 = 1; nvme_ctrlr_submit_admin_request(ctrlr, req); } void -nvme_ctrlr_cmd_identify_namespace(struct nvme_controller *ctrlr, uint16_t nsid, +nvme_ctrlr_cmd_identify_namespace(struct nvme_controller *ctrlr, uint32_t nsid, void *payload, nvme_cb_fn_t cb_fn, void *cb_arg) { struct nvme_request *req; struct nvme_command *cmd; req = nvme_allocate_request_vaddr(payload, sizeof(struct nvme_namespace_data), cb_fn, cb_arg); cmd = &req->cmd; cmd->opc = NVME_OPC_IDENTIFY; /* * TODO: create an identify command data structure */ cmd->nsid = nsid; nvme_ctrlr_submit_admin_request(ctrlr, req); } void nvme_ctrlr_cmd_create_io_cq(struct nvme_controller *ctrlr, struct nvme_qpair *io_que, uint16_t vector, nvme_cb_fn_t cb_fn, void *cb_arg) { struct nvme_request *req; struct nvme_command *cmd; req = nvme_allocate_request_null(cb_fn, cb_arg); cmd = &req->cmd; cmd->opc = NVME_OPC_CREATE_IO_CQ; /* * TODO: create a create io completion queue command data * structure. */ cmd->cdw10 = ((io_que->num_entries-1) << 16) | io_que->id; /* 0x3 = interrupts enabled | physically contiguous */ cmd->cdw11 = (vector << 16) | 0x3; cmd->prp1 = io_que->cpl_bus_addr; nvme_ctrlr_submit_admin_request(ctrlr, req); } void nvme_ctrlr_cmd_create_io_sq(struct nvme_controller *ctrlr, struct nvme_qpair *io_que, nvme_cb_fn_t cb_fn, void *cb_arg) { struct nvme_request *req; struct nvme_command *cmd; req = nvme_allocate_request_null(cb_fn, cb_arg); cmd = &req->cmd; cmd->opc = NVME_OPC_CREATE_IO_SQ; /* * TODO: create a create io submission queue command data * structure. */ cmd->cdw10 = ((io_que->num_entries-1) << 16) | io_que->id; /* 0x1 = physically contiguous */ cmd->cdw11 = (io_que->id << 16) | 0x1; cmd->prp1 = io_que->cmd_bus_addr; nvme_ctrlr_submit_admin_request(ctrlr, req); } void nvme_ctrlr_cmd_delete_io_cq(struct nvme_controller *ctrlr, struct nvme_qpair *io_que, nvme_cb_fn_t cb_fn, void *cb_arg) { struct nvme_request *req; struct nvme_command *cmd; req = nvme_allocate_request_null(cb_fn, cb_arg); cmd = &req->cmd; cmd->opc = NVME_OPC_DELETE_IO_CQ; /* * TODO: create a delete io completion queue command data * structure. */ cmd->cdw10 = io_que->id; nvme_ctrlr_submit_admin_request(ctrlr, req); } void nvme_ctrlr_cmd_delete_io_sq(struct nvme_controller *ctrlr, struct nvme_qpair *io_que, nvme_cb_fn_t cb_fn, void *cb_arg) { struct nvme_request *req; struct nvme_command *cmd; req = nvme_allocate_request_null(cb_fn, cb_arg); cmd = &req->cmd; cmd->opc = NVME_OPC_DELETE_IO_SQ; /* * TODO: create a delete io submission queue command data * structure. 
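	 * As with the delete-CQ command above, cdw10 currently carries just
	 * the queue identifier in its low 16 bits.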
*/ cmd->cdw10 = io_que->id; nvme_ctrlr_submit_admin_request(ctrlr, req); } void nvme_ctrlr_cmd_set_feature(struct nvme_controller *ctrlr, uint8_t feature, uint32_t cdw11, void *payload, uint32_t payload_size, nvme_cb_fn_t cb_fn, void *cb_arg) { struct nvme_request *req; struct nvme_command *cmd; req = nvme_allocate_request_null(cb_fn, cb_arg); cmd = &req->cmd; cmd->opc = NVME_OPC_SET_FEATURES; cmd->cdw10 = feature; cmd->cdw11 = cdw11; nvme_ctrlr_submit_admin_request(ctrlr, req); } void nvme_ctrlr_cmd_get_feature(struct nvme_controller *ctrlr, uint8_t feature, uint32_t cdw11, void *payload, uint32_t payload_size, nvme_cb_fn_t cb_fn, void *cb_arg) { struct nvme_request *req; struct nvme_command *cmd; req = nvme_allocate_request_null(cb_fn, cb_arg); cmd = &req->cmd; cmd->opc = NVME_OPC_GET_FEATURES; cmd->cdw10 = feature; cmd->cdw11 = cdw11; nvme_ctrlr_submit_admin_request(ctrlr, req); } void nvme_ctrlr_cmd_set_num_queues(struct nvme_controller *ctrlr, uint32_t num_queues, nvme_cb_fn_t cb_fn, void *cb_arg) { uint32_t cdw11; cdw11 = ((num_queues - 1) << 16) | (num_queues - 1); nvme_ctrlr_cmd_set_feature(ctrlr, NVME_FEAT_NUMBER_OF_QUEUES, cdw11, NULL, 0, cb_fn, cb_arg); } void nvme_ctrlr_cmd_set_async_event_config(struct nvme_controller *ctrlr, union nvme_critical_warning_state state, nvme_cb_fn_t cb_fn, void *cb_arg) { uint32_t cdw11; cdw11 = state.raw; nvme_ctrlr_cmd_set_feature(ctrlr, NVME_FEAT_ASYNC_EVENT_CONFIGURATION, cdw11, NULL, 0, cb_fn, cb_arg); } void nvme_ctrlr_cmd_set_interrupt_coalescing(struct nvme_controller *ctrlr, uint32_t microseconds, uint32_t threshold, nvme_cb_fn_t cb_fn, void *cb_arg) { uint32_t cdw11; if ((microseconds/100) >= 0x100) { nvme_printf(ctrlr, "invalid coal time %d, disabling\n", microseconds); microseconds = 0; threshold = 0; } if (threshold >= 0x100) { nvme_printf(ctrlr, "invalid threshold %d, disabling\n", threshold); threshold = 0; microseconds = 0; } cdw11 = ((microseconds/100) << 8) | threshold; nvme_ctrlr_cmd_set_feature(ctrlr, NVME_FEAT_INTERRUPT_COALESCING, cdw11, NULL, 0, cb_fn, cb_arg); } void nvme_ctrlr_cmd_get_log_page(struct nvme_controller *ctrlr, uint8_t log_page, uint32_t nsid, void *payload, uint32_t payload_size, nvme_cb_fn_t cb_fn, void *cb_arg) { struct nvme_request *req; struct nvme_command *cmd; req = nvme_allocate_request_vaddr(payload, payload_size, cb_fn, cb_arg); cmd = &req->cmd; cmd->opc = NVME_OPC_GET_LOG_PAGE; cmd->nsid = nsid; cmd->cdw10 = ((payload_size/sizeof(uint32_t)) - 1) << 16; cmd->cdw10 |= log_page; nvme_ctrlr_submit_admin_request(ctrlr, req); } void nvme_ctrlr_cmd_get_error_page(struct nvme_controller *ctrlr, struct nvme_error_information_entry *payload, uint32_t num_entries, nvme_cb_fn_t cb_fn, void *cb_arg) { KASSERT(num_entries > 0, ("%s called with num_entries==0\n", __func__)); /* Controller's error log page entries is 0-based. 
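	 * e.g. cdata.elpe == 63 advertises room for 64 entries, hence the
	 * elpe + 1 below.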
*/ KASSERT(num_entries <= (ctrlr->cdata.elpe + 1), ("%s called with num_entries=%d but (elpe+1)=%d\n", __func__, num_entries, ctrlr->cdata.elpe + 1)); if (num_entries > (ctrlr->cdata.elpe + 1)) num_entries = ctrlr->cdata.elpe + 1; nvme_ctrlr_cmd_get_log_page(ctrlr, NVME_LOG_ERROR, NVME_GLOBAL_NAMESPACE_TAG, payload, sizeof(*payload) * num_entries, cb_fn, cb_arg); } void nvme_ctrlr_cmd_get_health_information_page(struct nvme_controller *ctrlr, uint32_t nsid, struct nvme_health_information_page *payload, nvme_cb_fn_t cb_fn, void *cb_arg) { nvme_ctrlr_cmd_get_log_page(ctrlr, NVME_LOG_HEALTH_INFORMATION, nsid, payload, sizeof(*payload), cb_fn, cb_arg); } void nvme_ctrlr_cmd_get_firmware_page(struct nvme_controller *ctrlr, struct nvme_firmware_page *payload, nvme_cb_fn_t cb_fn, void *cb_arg) { nvme_ctrlr_cmd_get_log_page(ctrlr, NVME_LOG_FIRMWARE_SLOT, NVME_GLOBAL_NAMESPACE_TAG, payload, sizeof(*payload), cb_fn, cb_arg); } void nvme_ctrlr_cmd_abort(struct nvme_controller *ctrlr, uint16_t cid, uint16_t sqid, nvme_cb_fn_t cb_fn, void *cb_arg) { struct nvme_request *req; struct nvme_command *cmd; req = nvme_allocate_request_null(cb_fn, cb_arg); cmd = &req->cmd; cmd->opc = NVME_OPC_ABORT; cmd->cdw10 = (cid << 16) | sqid; nvme_ctrlr_submit_admin_request(ctrlr, req); } Index: projects/runtime-coverage/sys/dev/nvme/nvme_ns.c =================================================================== --- projects/runtime-coverage/sys/dev/nvme/nvme_ns.c (revision 322921) +++ projects/runtime-coverage/sys/dev/nvme/nvme_ns.c (revision 322922) @@ -1,582 +1,582 @@ /*- * Copyright (C) 2012-2013 Intel Corporation * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. 
*/ #include __FBSDID("$FreeBSD$"); #include #include #include #include #include #include #include #include #include #include #include #include #include #include "nvme_private.h" static void nvme_bio_child_inbed(struct bio *parent, int bio_error); static void nvme_bio_child_done(void *arg, const struct nvme_completion *cpl); static uint32_t nvme_get_num_segments(uint64_t addr, uint64_t size, uint32_t alignment); static void nvme_free_child_bios(int num_bios, struct bio **child_bios); static struct bio ** nvme_allocate_child_bios(int num_bios); static struct bio ** nvme_construct_child_bios(struct bio *bp, uint32_t alignment, int *num_bios); static int nvme_ns_split_bio(struct nvme_namespace *ns, struct bio *bp, uint32_t alignment); static int nvme_ns_ioctl(struct cdev *cdev, u_long cmd, caddr_t arg, int flag, struct thread *td) { struct nvme_namespace *ns; struct nvme_controller *ctrlr; struct nvme_pt_command *pt; ns = cdev->si_drv1; ctrlr = ns->ctrlr; switch (cmd) { case NVME_IO_TEST: case NVME_BIO_TEST: nvme_ns_test(ns, cmd, arg); break; case NVME_PASSTHROUGH_CMD: pt = (struct nvme_pt_command *)arg; return (nvme_ctrlr_passthrough_cmd(ctrlr, pt, ns->id, 1 /* is_user_buffer */, 0 /* is_admin_cmd */)); case DIOCGMEDIASIZE: *(off_t *)arg = (off_t)nvme_ns_get_size(ns); break; case DIOCGSECTORSIZE: *(u_int *)arg = nvme_ns_get_sector_size(ns); break; default: return (ENOTTY); } return (0); } static int nvme_ns_open(struct cdev *dev __unused, int flags, int fmt __unused, struct thread *td) { int error = 0; if (flags & FWRITE) error = securelevel_gt(td->td_ucred, 0); return (error); } static int nvme_ns_close(struct cdev *dev __unused, int flags, int fmt __unused, struct thread *td) { return (0); } static void nvme_ns_strategy_done(void *arg, const struct nvme_completion *cpl) { struct bio *bp = arg; /* * TODO: add more extensive translation of NVMe status codes * to different bio error codes (i.e. EIO, EINVAL, etc.) 
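	 * A fuller mapping might, for example, translate status codes such
	 * as NVME_SC_LBA_OUT_OF_RANGE or NVME_SC_INVALID_FIELD to EINVAL and
	 * reserve EIO for genuine media/transfer errors; for now every error
	 * collapses to EIO.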
*/ if (nvme_completion_is_error(cpl)) { bp->bio_error = EIO; bp->bio_flags |= BIO_ERROR; bp->bio_resid = bp->bio_bcount; } else bp->bio_resid = 0; biodone(bp); } static void nvme_ns_strategy(struct bio *bp) { struct nvme_namespace *ns; int err; ns = bp->bio_dev->si_drv1; err = nvme_ns_bio_process(ns, bp, nvme_ns_strategy_done); if (err) { bp->bio_error = err; bp->bio_flags |= BIO_ERROR; bp->bio_resid = bp->bio_bcount; biodone(bp); } } static struct cdevsw nvme_ns_cdevsw = { .d_version = D_VERSION, .d_flags = D_DISK, .d_read = physread, .d_write = physwrite, .d_open = nvme_ns_open, .d_close = nvme_ns_close, .d_strategy = nvme_ns_strategy, .d_ioctl = nvme_ns_ioctl }; uint32_t nvme_ns_get_max_io_xfer_size(struct nvme_namespace *ns) { return ns->ctrlr->max_xfer_size; } uint32_t nvme_ns_get_sector_size(struct nvme_namespace *ns) { return (1 << ns->data.lbaf[ns->data.flbas.format].lbads); } uint64_t nvme_ns_get_num_sectors(struct nvme_namespace *ns) { return (ns->data.nsze); } uint64_t nvme_ns_get_size(struct nvme_namespace *ns) { return (nvme_ns_get_num_sectors(ns) * nvme_ns_get_sector_size(ns)); } uint32_t nvme_ns_get_flags(struct nvme_namespace *ns) { return (ns->flags); } const char * nvme_ns_get_serial_number(struct nvme_namespace *ns) { return ((const char *)ns->ctrlr->cdata.sn); } const char * nvme_ns_get_model_number(struct nvme_namespace *ns) { return ((const char *)ns->ctrlr->cdata.mn); } const struct nvme_namespace_data * nvme_ns_get_data(struct nvme_namespace *ns) { return (&ns->data); } uint32_t nvme_ns_get_stripesize(struct nvme_namespace *ns) { return (ns->stripesize); } static void nvme_ns_bio_done(void *arg, const struct nvme_completion *status) { struct bio *bp = arg; nvme_cb_fn_t bp_cb_fn; bp_cb_fn = bp->bio_driver1; if (bp->bio_driver2) free(bp->bio_driver2, M_NVME); if (nvme_completion_is_error(status)) { bp->bio_flags |= BIO_ERROR; if (bp->bio_error == 0) bp->bio_error = EIO; } if ((bp->bio_flags & BIO_ERROR) == 0) bp->bio_resid = 0; else bp->bio_resid = bp->bio_bcount; bp_cb_fn(bp, status); } static void nvme_bio_child_inbed(struct bio *parent, int bio_error) { struct nvme_completion parent_cpl; int children, inbed; if (bio_error != 0) { parent->bio_flags |= BIO_ERROR; parent->bio_error = bio_error; } /* * atomic_fetchadd will return value before adding 1, so we still * must add 1 to get the updated inbed number. Save bio_children * before incrementing to guard against race conditions when * two children bios complete on different queues. */ children = atomic_load_acq_int(&parent->bio_children); inbed = atomic_fetchadd_int(&parent->bio_inbed, 1) + 1; if (inbed == children) { bzero(&parent_cpl, sizeof(parent_cpl)); if (parent->bio_flags & BIO_ERROR) parent_cpl.status.sc = NVME_SC_DATA_TRANSFER_ERROR; nvme_ns_bio_done(parent, &parent_cpl); } } static void nvme_bio_child_done(void *arg, const struct nvme_completion *cpl) { struct bio *child = arg; struct bio *parent; int bio_error; parent = child->bio_parent; g_destroy_bio(child); bio_error = nvme_completion_is_error(cpl) ? 
EIO : 0; nvme_bio_child_inbed(parent, bio_error); } static uint32_t nvme_get_num_segments(uint64_t addr, uint64_t size, uint32_t align) { uint32_t num_segs, offset, remainder; if (align == 0) return (1); KASSERT((align & (align - 1)) == 0, ("alignment not power of 2\n")); num_segs = size / align; remainder = size & (align - 1); offset = addr & (align - 1); if (remainder > 0 || offset > 0) num_segs += 1 + (remainder + offset - 1) / align; return (num_segs); } static void nvme_free_child_bios(int num_bios, struct bio **child_bios) { int i; for (i = 0; i < num_bios; i++) { if (child_bios[i] != NULL) g_destroy_bio(child_bios[i]); } free(child_bios, M_NVME); } static struct bio ** nvme_allocate_child_bios(int num_bios) { struct bio **child_bios; int err = 0, i; child_bios = malloc(num_bios * sizeof(struct bio *), M_NVME, M_NOWAIT); if (child_bios == NULL) return (NULL); for (i = 0; i < num_bios; i++) { child_bios[i] = g_new_bio(); if (child_bios[i] == NULL) err = ENOMEM; } if (err == ENOMEM) { nvme_free_child_bios(num_bios, child_bios); return (NULL); } return (child_bios); } static struct bio ** nvme_construct_child_bios(struct bio *bp, uint32_t alignment, int *num_bios) { struct bio **child_bios; struct bio *child; uint64_t cur_offset; caddr_t data; uint32_t rem_bcount; int i; #ifdef NVME_UNMAPPED_BIO_SUPPORT struct vm_page **ma; uint32_t ma_offset; #endif *num_bios = nvme_get_num_segments(bp->bio_offset, bp->bio_bcount, alignment); child_bios = nvme_allocate_child_bios(*num_bios); if (child_bios == NULL) return (NULL); bp->bio_children = *num_bios; bp->bio_inbed = 0; cur_offset = bp->bio_offset; rem_bcount = bp->bio_bcount; data = bp->bio_data; #ifdef NVME_UNMAPPED_BIO_SUPPORT ma_offset = bp->bio_ma_offset; ma = bp->bio_ma; #endif for (i = 0; i < *num_bios; i++) { child = child_bios[i]; child->bio_parent = bp; child->bio_cmd = bp->bio_cmd; child->bio_offset = cur_offset; child->bio_bcount = min(rem_bcount, alignment - (cur_offset & (alignment - 1))); child->bio_flags = bp->bio_flags; #ifdef NVME_UNMAPPED_BIO_SUPPORT if (bp->bio_flags & BIO_UNMAPPED) { child->bio_ma_offset = ma_offset; child->bio_ma = ma; child->bio_ma_n = nvme_get_num_segments(child->bio_ma_offset, child->bio_bcount, PAGE_SIZE); ma_offset = (ma_offset + child->bio_bcount) & PAGE_MASK; ma += child->bio_ma_n; if (ma_offset != 0) ma -= 1; } else #endif { child->bio_data = data; data += child->bio_bcount; } cur_offset += child->bio_bcount; rem_bcount -= child->bio_bcount; } return (child_bios); } static int nvme_ns_split_bio(struct nvme_namespace *ns, struct bio *bp, uint32_t alignment) { struct bio *child; struct bio **child_bios; int err, i, num_bios; child_bios = nvme_construct_child_bios(bp, alignment, &num_bios); if (child_bios == NULL) return (ENOMEM); for (i = 0; i < num_bios; i++) { child = child_bios[i]; err = nvme_ns_bio_process(ns, child, nvme_bio_child_done); if (err != 0) { nvme_bio_child_inbed(bp, err); g_destroy_bio(child); } } free(child_bios, M_NVME); return (0); } int nvme_ns_bio_process(struct nvme_namespace *ns, struct bio *bp, nvme_cb_fn_t cb_fn) { struct nvme_dsm_range *dsm_range; uint32_t num_bios; int err; bp->bio_driver1 = cb_fn; if (ns->stripesize > 0 && (bp->bio_cmd == BIO_READ || bp->bio_cmd == BIO_WRITE)) { num_bios = nvme_get_num_segments(bp->bio_offset, bp->bio_bcount, ns->stripesize); if (num_bios > 1) return (nvme_ns_split_bio(ns, bp, ns->stripesize)); } switch (bp->bio_cmd) { case BIO_READ: err = nvme_ns_cmd_read_bio(ns, bp, nvme_ns_bio_done, bp); break; case BIO_WRITE: err = 
nvme_ns_cmd_write_bio(ns, bp, nvme_ns_bio_done, bp); break; case BIO_FLUSH: err = nvme_ns_cmd_flush(ns, nvme_ns_bio_done, bp); break; case BIO_DELETE: dsm_range = malloc(sizeof(struct nvme_dsm_range), M_NVME, M_ZERO | M_WAITOK); dsm_range->length = bp->bio_bcount/nvme_ns_get_sector_size(ns); dsm_range->starting_lba = bp->bio_offset/nvme_ns_get_sector_size(ns); bp->bio_driver2 = dsm_range; err = nvme_ns_cmd_deallocate(ns, dsm_range, 1, nvme_ns_bio_done, bp); if (err != 0) free(dsm_range, M_NVME); break; default: err = EIO; break; } return (err); } int -nvme_ns_construct(struct nvme_namespace *ns, uint16_t id, +nvme_ns_construct(struct nvme_namespace *ns, uint32_t id, struct nvme_controller *ctrlr) { struct nvme_completion_poll_status status; int unit; ns->ctrlr = ctrlr; ns->id = id; ns->stripesize = 0; if (pci_get_devid(ctrlr->dev) == 0x09538086 && ctrlr->cdata.vs[3] != 0) ns->stripesize = (1 << ctrlr->cdata.vs[3]) * ctrlr->min_page_size; /* * Namespaces are reconstructed after a controller reset, so check * to make sure we only call mtx_init once on each mtx. * * TODO: Move this somewhere where it gets called at controller * construction time, which is not invoked as part of each * controller reset. */ if (!mtx_initialized(&ns->lock)) mtx_init(&ns->lock, "nvme ns lock", NULL, MTX_DEF); status.done = FALSE; nvme_ctrlr_cmd_identify_namespace(ctrlr, id, &ns->data, nvme_completion_poll_cb, &status); while (status.done == FALSE) DELAY(5); if (nvme_completion_is_error(&status.cpl)) { nvme_printf(ctrlr, "nvme_identify_namespace failed\n"); return (ENXIO); } /* * If the size of is zero, chances are this isn't a valid * namespace (eg one that's not been configured yet). The * standard says the entire id will be zeros, so this is a * cheap way to test for that. */ if (ns->data.nsze == 0) return (ENXIO); /* * Note: format is a 0-based value, so > is appropriate here, * not >=. */ if (ns->data.flbas.format > ns->data.nlbaf) { printf("lba format %d exceeds number supported (%d)\n", ns->data.flbas.format, ns->data.nlbaf+1); return (ENXIO); } if (ctrlr->cdata.oncs.dsm) ns->flags |= NVME_NS_DEALLOCATE_SUPPORTED; if (ctrlr->cdata.vwc.present) ns->flags |= NVME_NS_FLUSH_SUPPORTED; /* * cdev may have already been created, if we are reconstructing the * namespace after a controller-level reset. */ if (ns->cdev != NULL) return (0); /* * Namespace IDs start at 1, so we need to subtract 1 to create a * correct unit number. */ unit = device_get_unit(ctrlr->dev) * NVME_MAX_NAMESPACES + ns->id - 1; /* * MAKEDEV_ETERNAL was added in r210923, for cdevs that will never * be destroyed. This avoids refcounting on the cdev object. * That should be OK case here, as long as we're not supporting PCIe * surprise removal nor namespace deletion. 
*/ #ifdef MAKEDEV_ETERNAL_KLD ns->cdev = make_dev_credf(MAKEDEV_ETERNAL_KLD, &nvme_ns_cdevsw, unit, NULL, UID_ROOT, GID_WHEEL, 0600, "nvme%dns%d", device_get_unit(ctrlr->dev), ns->id); #else ns->cdev = make_dev_credf(0, &nvme_ns_cdevsw, unit, NULL, UID_ROOT, GID_WHEEL, 0600, "nvme%dns%d", device_get_unit(ctrlr->dev), ns->id); #endif #ifdef NVME_UNMAPPED_BIO_SUPPORT ns->cdev->si_flags |= SI_UNMAPPED; #endif if (ns->cdev != NULL) ns->cdev->si_drv1 = ns; return (0); } void nvme_ns_destruct(struct nvme_namespace *ns) { if (ns->cdev != NULL) destroy_dev(ns->cdev); } Index: projects/runtime-coverage/sys/dev/nvme/nvme_private.h =================================================================== --- projects/runtime-coverage/sys/dev/nvme/nvme_private.h (revision 322921) +++ projects/runtime-coverage/sys/dev/nvme/nvme_private.h (revision 322922) @@ -1,530 +1,530 @@ /*- * Copyright (C) 2012-2014 Intel Corporation * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * $FreeBSD$ */ #ifndef __NVME_PRIVATE_H__ #define __NVME_PRIVATE_H__ #include #include #include #include #include #include #include #include #include #include #include #include #include "nvme.h" #define DEVICE2SOFTC(dev) ((struct nvme_controller *) device_get_softc(dev)) MALLOC_DECLARE(M_NVME); #define IDT32_PCI_ID 0x80d0111d /* 32 channel board */ #define IDT8_PCI_ID 0x80d2111d /* 8 channel board */ /* * For commands requiring more than 2 PRP entries, one PRP will be * embedded in the command (prp1), and the rest of the PRP entries * will be in a list pointed to by the command (prp2). This means * that real max number of PRP entries we support is 32+1, which * results in a max xfer size of 32*PAGE_SIZE. */ #define NVME_MAX_PRP_LIST_ENTRIES (NVME_MAX_XFER_SIZE / PAGE_SIZE) #define NVME_ADMIN_TRACKERS (16) #define NVME_ADMIN_ENTRIES (128) /* min and max are defined in admin queue attributes section of spec */ #define NVME_MIN_ADMIN_ENTRIES (2) #define NVME_MAX_ADMIN_ENTRIES (4096) /* * NVME_IO_ENTRIES defines the size of an I/O qpair's submission and completion * queues, while NVME_IO_TRACKERS defines the maximum number of I/O that we * will allow outstanding on an I/O qpair at any time. 
The only advantage in * having IO_ENTRIES > IO_TRACKERS is for debugging purposes - when dumping * the contents of the submission and completion queues, it will show a longer * history of data. */ #define NVME_IO_ENTRIES (256) #define NVME_IO_TRACKERS (128) #define NVME_MIN_IO_TRACKERS (4) #define NVME_MAX_IO_TRACKERS (1024) /* * NVME_MAX_IO_ENTRIES is not defined, since it is specified in CC.MQES * for each controller. */ #define NVME_INT_COAL_TIME (0) /* disabled */ #define NVME_INT_COAL_THRESHOLD (0) /* 0-based */ #define NVME_MAX_NAMESPACES (16) #define NVME_MAX_CONSUMERS (2) #define NVME_MAX_ASYNC_EVENTS (8) #define NVME_DEFAULT_TIMEOUT_PERIOD (30) /* in seconds */ #define NVME_MIN_TIMEOUT_PERIOD (5) #define NVME_MAX_TIMEOUT_PERIOD (120) #define NVME_DEFAULT_RETRY_COUNT (4) /* Maximum log page size to fetch for AERs. */ #define NVME_MAX_AER_LOG_SIZE (4096) /* * Define CACHE_LINE_SIZE here for older FreeBSD versions that do not define * it. */ #ifndef CACHE_LINE_SIZE #define CACHE_LINE_SIZE (64) #endif /* * Use presence of the BIO_UNMAPPED flag to determine whether unmapped I/O * support and the bus_dmamap_load_bio API are available on the target * kernel. This will ease porting back to earlier stable branches at a * later point. */ #ifdef BIO_UNMAPPED #define NVME_UNMAPPED_BIO_SUPPORT #endif extern uma_zone_t nvme_request_zone; extern int32_t nvme_retry_count; struct nvme_completion_poll_status { struct nvme_completion cpl; boolean_t done; }; #define NVME_REQUEST_VADDR 1 #define NVME_REQUEST_NULL 2 /* For requests with no payload. */ #define NVME_REQUEST_UIO 3 #ifdef NVME_UNMAPPED_BIO_SUPPORT #define NVME_REQUEST_BIO 4 #endif struct nvme_request { struct nvme_command cmd; struct nvme_qpair *qpair; union { void *payload; struct bio *bio; } u; uint32_t type; uint32_t payload_size; boolean_t timeout; nvme_cb_fn_t cb_fn; void *cb_arg; int32_t retries; STAILQ_ENTRY(nvme_request) stailq; }; struct nvme_async_event_request { struct nvme_controller *ctrlr; struct nvme_request *req; struct nvme_completion cpl; uint32_t log_page_id; uint32_t log_page_size; uint8_t log_page_buffer[NVME_MAX_AER_LOG_SIZE]; }; struct nvme_tracker { TAILQ_ENTRY(nvme_tracker) tailq; struct nvme_request *req; struct nvme_qpair *qpair; struct callout timer; bus_dmamap_t payload_dma_map; uint16_t cid; uint64_t *prp; bus_addr_t prp_bus_addr; }; struct nvme_qpair { struct nvme_controller *ctrlr; uint32_t id; uint32_t phase; uint16_t vector; int rid; struct resource *res; void *tag; uint32_t num_entries; uint32_t num_trackers; uint32_t sq_tdbl_off; uint32_t cq_hdbl_off; uint32_t sq_head; uint32_t sq_tail; uint32_t cq_head; int64_t num_cmds; int64_t num_intr_handler_calls; struct nvme_command *cmd; struct nvme_completion *cpl; bus_dma_tag_t dma_tag; bus_dma_tag_t dma_tag_payload; bus_dmamap_t queuemem_map; uint64_t cmd_bus_addr; uint64_t cpl_bus_addr; TAILQ_HEAD(, nvme_tracker) free_tr; TAILQ_HEAD(, nvme_tracker) outstanding_tr; STAILQ_HEAD(, nvme_request) queued_req; struct nvme_tracker **act_tr; boolean_t is_enabled; struct mtx lock __aligned(CACHE_LINE_SIZE); } __aligned(CACHE_LINE_SIZE); struct nvme_namespace { struct nvme_controller *ctrlr; struct nvme_namespace_data data; - uint16_t id; - uint16_t flags; + uint32_t id; + uint32_t flags; struct cdev *cdev; void *cons_cookie[NVME_MAX_CONSUMERS]; uint32_t stripesize; struct mtx lock; }; /* * One of these per allocated PCI device. 
*/ struct nvme_controller { device_t dev; struct mtx lock; uint32_t ready_timeout_in_ms; bus_space_tag_t bus_tag; bus_space_handle_t bus_handle; int resource_id; struct resource *resource; /* * The NVMe spec allows for the MSI-X table to be placed in BAR 4/5, * separate from the control registers which are in BAR 0/1. These * members track the mapping of BAR 4/5 for that reason. */ int bar4_resource_id; struct resource *bar4_resource; uint32_t msix_enabled; uint32_t force_intx; uint32_t enable_aborts; uint32_t num_io_queues; uint32_t num_cpus_per_ioq; /* Fields for tracking progress during controller initialization. */ struct intr_config_hook config_hook; uint32_t ns_identified; uint32_t queues_created; struct task reset_task; struct task fail_req_task; struct taskqueue *taskqueue; /* For shared legacy interrupt. */ int rid; struct resource *res; void *tag; bus_dma_tag_t hw_desc_tag; bus_dmamap_t hw_desc_map; /** maximum i/o size in bytes */ uint32_t max_xfer_size; /** minimum page size supported by this controller in bytes */ uint32_t min_page_size; /** interrupt coalescing time period (in microseconds) */ uint32_t int_coal_time; /** interrupt coalescing threshold */ uint32_t int_coal_threshold; /** timeout period in seconds */ uint32_t timeout_period; struct nvme_qpair adminq; struct nvme_qpair *ioq; struct nvme_registers *regs; struct nvme_controller_data cdata; struct nvme_namespace ns[NVME_MAX_NAMESPACES]; struct cdev *cdev; /** bit mask of warning types currently enabled for async events */ union nvme_critical_warning_state async_event_config; uint32_t num_aers; struct nvme_async_event_request aer[NVME_MAX_ASYNC_EVENTS]; void *cons_cookie[NVME_MAX_CONSUMERS]; uint32_t is_resetting; uint32_t is_initialized; uint32_t notification_sent; boolean_t is_failed; STAILQ_HEAD(, nvme_request) fail_req; }; #define nvme_mmio_offsetof(reg) \ offsetof(struct nvme_registers, reg) #define nvme_mmio_read_4(sc, reg) \ bus_space_read_4((sc)->bus_tag, (sc)->bus_handle, \ nvme_mmio_offsetof(reg)) #define nvme_mmio_write_4(sc, reg, val) \ bus_space_write_4((sc)->bus_tag, (sc)->bus_handle, \ nvme_mmio_offsetof(reg), val) #define nvme_mmio_write_8(sc, reg, val) \ do { \ bus_space_write_4((sc)->bus_tag, (sc)->bus_handle, \ nvme_mmio_offsetof(reg), val & 0xFFFFFFFF); \ bus_space_write_4((sc)->bus_tag, (sc)->bus_handle, \ nvme_mmio_offsetof(reg)+4, \ (val & 0xFFFFFFFF00000000UL) >> 32); \ } while (0); #if __FreeBSD_version < 800054 #define wmb() __asm volatile("sfence" ::: "memory") #define mb() __asm volatile("mfence" ::: "memory") #endif #define nvme_printf(ctrlr, fmt, args...) 
\ device_printf(ctrlr->dev, fmt, ##args) void nvme_ns_test(struct nvme_namespace *ns, u_long cmd, caddr_t arg); void nvme_ctrlr_cmd_identify_controller(struct nvme_controller *ctrlr, void *payload, nvme_cb_fn_t cb_fn, void *cb_arg); void nvme_ctrlr_cmd_identify_namespace(struct nvme_controller *ctrlr, - uint16_t nsid, void *payload, + uint32_t nsid, void *payload, nvme_cb_fn_t cb_fn, void *cb_arg); void nvme_ctrlr_cmd_set_interrupt_coalescing(struct nvme_controller *ctrlr, uint32_t microseconds, uint32_t threshold, nvme_cb_fn_t cb_fn, void *cb_arg); void nvme_ctrlr_cmd_get_error_page(struct nvme_controller *ctrlr, struct nvme_error_information_entry *payload, uint32_t num_entries, /* 0 = max */ nvme_cb_fn_t cb_fn, void *cb_arg); void nvme_ctrlr_cmd_get_health_information_page(struct nvme_controller *ctrlr, uint32_t nsid, struct nvme_health_information_page *payload, nvme_cb_fn_t cb_fn, void *cb_arg); void nvme_ctrlr_cmd_get_firmware_page(struct nvme_controller *ctrlr, struct nvme_firmware_page *payload, nvme_cb_fn_t cb_fn, void *cb_arg); void nvme_ctrlr_cmd_create_io_cq(struct nvme_controller *ctrlr, struct nvme_qpair *io_que, uint16_t vector, nvme_cb_fn_t cb_fn, void *cb_arg); void nvme_ctrlr_cmd_create_io_sq(struct nvme_controller *ctrlr, struct nvme_qpair *io_que, nvme_cb_fn_t cb_fn, void *cb_arg); void nvme_ctrlr_cmd_delete_io_cq(struct nvme_controller *ctrlr, struct nvme_qpair *io_que, nvme_cb_fn_t cb_fn, void *cb_arg); void nvme_ctrlr_cmd_delete_io_sq(struct nvme_controller *ctrlr, struct nvme_qpair *io_que, nvme_cb_fn_t cb_fn, void *cb_arg); void nvme_ctrlr_cmd_set_num_queues(struct nvme_controller *ctrlr, uint32_t num_queues, nvme_cb_fn_t cb_fn, void *cb_arg); void nvme_ctrlr_cmd_set_async_event_config(struct nvme_controller *ctrlr, union nvme_critical_warning_state state, nvme_cb_fn_t cb_fn, void *cb_arg); void nvme_ctrlr_cmd_abort(struct nvme_controller *ctrlr, uint16_t cid, uint16_t sqid, nvme_cb_fn_t cb_fn, void *cb_arg); void nvme_completion_poll_cb(void *arg, const struct nvme_completion *cpl); int nvme_ctrlr_construct(struct nvme_controller *ctrlr, device_t dev); void nvme_ctrlr_destruct(struct nvme_controller *ctrlr, device_t dev); void nvme_ctrlr_shutdown(struct nvme_controller *ctrlr); int nvme_ctrlr_hw_reset(struct nvme_controller *ctrlr); void nvme_ctrlr_reset(struct nvme_controller *ctrlr); /* ctrlr defined as void * to allow use with config_intrhook. 
*/ void nvme_ctrlr_start_config_hook(void *ctrlr_arg); void nvme_ctrlr_submit_admin_request(struct nvme_controller *ctrlr, struct nvme_request *req); void nvme_ctrlr_submit_io_request(struct nvme_controller *ctrlr, struct nvme_request *req); void nvme_ctrlr_post_failed_request(struct nvme_controller *ctrlr, struct nvme_request *req); int nvme_qpair_construct(struct nvme_qpair *qpair, uint32_t id, uint16_t vector, uint32_t num_entries, uint32_t num_trackers, struct nvme_controller *ctrlr); void nvme_qpair_submit_tracker(struct nvme_qpair *qpair, struct nvme_tracker *tr); void nvme_qpair_process_completions(struct nvme_qpair *qpair); void nvme_qpair_submit_request(struct nvme_qpair *qpair, struct nvme_request *req); void nvme_qpair_reset(struct nvme_qpair *qpair); void nvme_qpair_fail(struct nvme_qpair *qpair); void nvme_qpair_manual_complete_request(struct nvme_qpair *qpair, struct nvme_request *req, uint32_t sct, uint32_t sc, boolean_t print_on_error); void nvme_admin_qpair_enable(struct nvme_qpair *qpair); void nvme_admin_qpair_disable(struct nvme_qpair *qpair); void nvme_admin_qpair_destroy(struct nvme_qpair *qpair); void nvme_io_qpair_enable(struct nvme_qpair *qpair); void nvme_io_qpair_disable(struct nvme_qpair *qpair); void nvme_io_qpair_destroy(struct nvme_qpair *qpair); -int nvme_ns_construct(struct nvme_namespace *ns, uint16_t id, +int nvme_ns_construct(struct nvme_namespace *ns, uint32_t id, struct nvme_controller *ctrlr); void nvme_ns_destruct(struct nvme_namespace *ns); void nvme_sysctl_initialize_ctrlr(struct nvme_controller *ctrlr); void nvme_dump_command(struct nvme_command *cmd); void nvme_dump_completion(struct nvme_completion *cpl); static __inline void nvme_single_map(void *arg, bus_dma_segment_t *seg, int nseg, int error) { uint64_t *bus_addr = (uint64_t *)arg; if (error != 0) printf("nvme_single_map err %d\n", error); *bus_addr = seg[0].ds_addr; } static __inline struct nvme_request * _nvme_allocate_request(nvme_cb_fn_t cb_fn, void *cb_arg) { struct nvme_request *req; req = uma_zalloc(nvme_request_zone, M_NOWAIT | M_ZERO); if (req != NULL) { req->cb_fn = cb_fn; req->cb_arg = cb_arg; req->timeout = TRUE; } return (req); } static __inline struct nvme_request * nvme_allocate_request_vaddr(void *payload, uint32_t payload_size, nvme_cb_fn_t cb_fn, void *cb_arg) { struct nvme_request *req; req = _nvme_allocate_request(cb_fn, cb_arg); if (req != NULL) { req->type = NVME_REQUEST_VADDR; req->u.payload = payload; req->payload_size = payload_size; } return (req); } static __inline struct nvme_request * nvme_allocate_request_null(nvme_cb_fn_t cb_fn, void *cb_arg) { struct nvme_request *req; req = _nvme_allocate_request(cb_fn, cb_arg); if (req != NULL) req->type = NVME_REQUEST_NULL; return (req); } static __inline struct nvme_request * nvme_allocate_request_bio(struct bio *bio, nvme_cb_fn_t cb_fn, void *cb_arg) { struct nvme_request *req; req = _nvme_allocate_request(cb_fn, cb_arg); if (req != NULL) { #ifdef NVME_UNMAPPED_BIO_SUPPORT req->type = NVME_REQUEST_BIO; req->u.bio = bio; #else req->type = NVME_REQUEST_VADDR; req->u.payload = bio->bio_data; req->payload_size = bio->bio_bcount; #endif } return (req); } #define nvme_free_request(req) uma_zfree(nvme_request_zone, req) void nvme_notify_async_consumers(struct nvme_controller *ctrlr, const struct nvme_completion *async_cpl, uint32_t log_page_id, void *log_page_buffer, uint32_t log_page_size); void nvme_notify_fail_consumers(struct nvme_controller *ctrlr); void nvme_notify_new_controller(struct nvme_controller *ctrlr); void 
nvme_ctrlr_intx_handler(void *arg); #endif /* __NVME_PRIVATE_H__ */ Index: projects/runtime-coverage/sys/dev/syscons/scvgarndr.c =================================================================== --- projects/runtime-coverage/sys/dev/syscons/scvgarndr.c (revision 322921) +++ projects/runtime-coverage/sys/dev/syscons/scvgarndr.c (revision 322922) @@ -1,1355 +1,1359 @@ /*- * Copyright (c) 1999 Kazutaka YOKOTA * All rights reserved. * * This code is derived from software contributed to The DragonFly Project * by Sascha Wildner * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer as * the first lines of this file unmodified. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHORS ``AS IS'' AND ANY EXPRESS OR * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. * IN NO EVENT SHALL THE AUTHORS BE LIABLE FOR ANY DIRECT, INDIRECT, * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
* */ #include __FBSDID("$FreeBSD$"); #include "opt_syscons.h" #include "opt_vga.h" #include #include #include #include #include #include #include #include #include #include #include #include #ifndef SC_RENDER_DEBUG #define SC_RENDER_DEBUG 0 #endif static vr_clear_t vga_txtclear; static vr_draw_border_t vga_txtborder; static vr_draw_t vga_txtdraw; static vr_set_cursor_t vga_txtcursor_shape; static vr_draw_cursor_t vga_txtcursor; static vr_blink_cursor_t vga_txtblink; #ifndef SC_NO_CUTPASTE static vr_draw_mouse_t vga_txtmouse; #else #define vga_txtmouse (vr_draw_mouse_t *)vga_nop #endif #ifdef SC_PIXEL_MODE static vr_init_t vga_rndrinit; static vr_clear_t vga_pxlclear_direct; static vr_clear_t vga_pxlclear_planar; static vr_draw_border_t vga_pxlborder_direct; static vr_draw_border_t vga_pxlborder_planar; static vr_draw_t vga_vgadraw_direct; static vr_draw_t vga_vgadraw_planar; static vr_set_cursor_t vga_pxlcursor_shape; static vr_draw_cursor_t vga_pxlcursor_direct; static vr_draw_cursor_t vga_pxlcursor_planar; static vr_blink_cursor_t vga_pxlblink_direct; static vr_blink_cursor_t vga_pxlblink_planar; #ifndef SC_NO_CUTPASTE static vr_draw_mouse_t vga_pxlmouse_direct; static vr_draw_mouse_t vga_pxlmouse_planar; #else #define vga_pxlmouse_direct (vr_draw_mouse_t *)vga_nop #define vga_pxlmouse_planar (vr_draw_mouse_t *)vga_nop #endif #endif /* SC_PIXEL_MODE */ #ifndef SC_NO_MODE_CHANGE static vr_draw_border_t vga_grborder; #endif static void vga_nop(scr_stat *scp); static sc_rndr_sw_t txtrndrsw = { (vr_init_t *)vga_nop, vga_txtclear, vga_txtborder, vga_txtdraw, vga_txtcursor_shape, vga_txtcursor, vga_txtblink, (vr_set_mouse_t *)vga_nop, vga_txtmouse, }; RENDERER(mda, 0, txtrndrsw, vga_set); RENDERER(cga, 0, txtrndrsw, vga_set); RENDERER(ega, 0, txtrndrsw, vga_set); RENDERER(vga, 0, txtrndrsw, vga_set); #ifdef SC_PIXEL_MODE static sc_rndr_sw_t vgarndrsw = { vga_rndrinit, (vr_clear_t *)vga_nop, (vr_draw_border_t *)vga_nop, (vr_draw_t *)vga_nop, vga_pxlcursor_shape, (vr_draw_cursor_t *)vga_nop, (vr_blink_cursor_t *)vga_nop, (vr_set_mouse_t *)vga_nop, (vr_draw_mouse_t *)vga_nop, }; RENDERER(ega, PIXEL_MODE, vgarndrsw, vga_set); RENDERER(vga, PIXEL_MODE, vgarndrsw, vga_set); #endif /* SC_PIXEL_MODE */ #ifndef SC_NO_MODE_CHANGE static sc_rndr_sw_t grrndrsw = { (vr_init_t *)vga_nop, (vr_clear_t *)vga_nop, vga_grborder, (vr_draw_t *)vga_nop, (vr_set_cursor_t *)vga_nop, (vr_draw_cursor_t *)vga_nop, (vr_blink_cursor_t *)vga_nop, (vr_set_mouse_t *)vga_nop, (vr_draw_mouse_t *)vga_nop, }; RENDERER(cga, GRAPHICS_MODE, grrndrsw, vga_set); RENDERER(ega, GRAPHICS_MODE, grrndrsw, vga_set); RENDERER(vga, GRAPHICS_MODE, grrndrsw, vga_set); #endif /* SC_NO_MODE_CHANGE */ RENDERER_MODULE(vga, vga_set); #ifndef SC_NO_CUTPASTE #if !defined(SC_ALT_MOUSE_IMAGE) || defined(SC_PIXEL_MODE) struct mousedata { u_short md_border[16]; u_short md_interior[16]; u_char md_width; u_char md_height; u_char md_baspect; u_char md_iaspect; const char *md_name; }; static const struct mousedata mouse10x16_50 = { { 0xC000, 0xA000, 0x9000, 0x8800, 0x8400, 0x8200, 0x8100, 0x8200, 0x8400, 0x8400, 0x8400, 0x9200, 0xB200, 0xA900, 0xC900, 0x8600, }, { 0x0000, 0x4000, 0x6000, 0x7000, 0x7800, 0x7C00, 0x7E00, 0x7C00, 0x7800, 0x7800, 0x7800, 0x6C00, 0x4C00, 0x4600, 0x0600, 0x0000, }, 10, 16, 49, 52, "mouse10x16_50", }; static const struct mousedata mouse8x14_67 = { { 0xC000, 0xA000, 0x9000, 0x8800, 0x8400, 0x8200, 0x8100, 0x8700, 0x8400, 0x9200, 0xB200, 0xA900, 0xC900, 0x0600, 0x0000, 0x0000, }, { 0x0000, 0x4000, 0x6000, 0x7000, 0x7800, 0x7C00, 
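/*
 * (Each mousedata entry holds 16 rows of border and interior masks,
 * bit 15 being the leftmost pixel; md_width/md_height give the used
 * size, and md_baspect/md_iaspect are, apparently, the border and
 * interior aspect figures that vga_setmdp() matches against the screen
 * aspect, scaled so that roughly 100 means square pixels.)
 */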
0x7E00, 0x7800, 0x7800, 0x6C00, 0x4C00, 0x4600, 0x0600, 0x0000, 0x0000, 0x0000, }, 8, 14, 64, 65, "mouse8x14_67", }; static const struct mousedata mouse8x13_75 = { { 0xC000, 0xA000, 0x9000, 0x8800, 0x8400, 0x8200, 0x8600, 0x8400, 0xB200, 0xD200, 0x0900, 0x0900, 0x0600, 0x0000, 0x0000, 0x0000, }, { 0x0000, 0x4000, 0x6000, 0x7000, 0x7800, 0x7C00, 0x7800, 0x7800, 0x4C00, 0x0C00, 0x0600, 0x0600, 0x0000, 0x0000, 0x0000, 0x0000, }, 8, 13, 75, 80, "mouse8x13_75", }; static const struct mousedata mouse10x16_75 = { { 0xC000, 0xA000, 0x9000, 0x8800, 0x8400, 0x8200, 0x8100, 0x8700, 0x8400, 0x9200, 0xB200, 0xC900, 0x0900, 0x0480, 0x0480, 0x0300, }, { 0x0000, 0x4000, 0x6000, 0x7000, 0x7800, 0x7C00, 0x7E00, 0x7800, 0x7800, 0x6C00, 0x4C00, 0x0600, 0x0600, 0x0300, 0x0300, 0x0000, }, 10, 16, 72, 75, "mouse10x16_75", }; static const struct mousedata mouse9x13_90 = { { 0xC000, 0xA000, 0x9000, 0x8800, 0x8400, 0x8200, 0x8100, 0x8780, 0x9200, 0xB200, 0xD900, 0x8900, 0x0600, 0x0000, 0x0000, 0x0000, }, { 0x0000, 0x4000, 0x6000, 0x7000, 0x7800, 0x7C00, 0x7E00, 0x7800, 0x6C00, 0x4C00, 0x0600, 0x0600, 0x0000, 0x0000, 0x0000, 0x0000, }, 9, 13, 89, 89, "mouse9x13_90", }; static const struct mousedata mouse10x16_90 = { { 0xC000, 0xA000, 0x9000, 0x8800, 0x8400, 0x8200, 0x8100, 0x8080, 0x8040, 0x83E0, 0x8200, 0x9900, 0xA900, 0xC480, 0x8480, 0x0300, }, { 0x0000, 0x4000, 0x6000, 0x7000, 0x7800, 0x7C00, 0x7E00, 0x7F00, 0x7F80, 0x7C00, 0x7C00, 0x6600, 0x4600, 0x0300, 0x0300, 0x0000, }, 10, 16, 89, 89, "mouse10x16_90", }; static const struct mousedata mouse9x13_100 = { { 0xC000, 0xA000, 0x9000, 0x8800, 0x8400, 0x8200, 0x8100, 0x8780, 0xB200, 0xD200, 0x8900, 0x0900, 0x0600, 0x0000, 0x0000, 0x0000, }, { 0x0000, 0x4000, 0x6000, 0x7000, 0x7800, 0x7C00, 0x7E00, 0x7800, 0x4C00, 0x0C00, 0x0600, 0x0600, 0x0000, 0x0000, 0x0000, 0x0000, }, 9, 13, 106, 113, "mouse9x13_100", }; static const struct mousedata mouse10x16_100 = { { 0xC000, 0xA000, 0x9000, 0x8800, 0x8400, 0x8200, 0x8100, 0x8080, 0x8040, 0x83C0, 0x9200, 0xA900, 0xC900, 0x0480, 0x0480, 0x0300, }, { 0x0000, 0x4000, 0x6000, 0x7000, 0x7800, 0x7C00, 0x7E00, 0x7F00, 0x7F80, 0x7C00, 0x6C00, 0x4600, 0x0600, 0x0300, 0x0300, 0x0000, }, 10, 16, 96, 106, "mouse10x16_100", }; static const struct mousedata mouse10x14_120 = { { 0xC000, 0xA000, 0x9000, 0x8800, 0x8400, 0x8200, 0x8100, 0x8080, 0x97C0, 0xB200, 0xF200, 0xC900, 0x8900, 0x0600, 0x0000, 0x0000, }, { 0x0000, 0x4000, 0x6000, 0x7000, 0x7800, 0x7C00, 0x7E00, 0x7F00, 0x6800, 0x4C00, 0x0C00, 0x0600, 0x0600, 0x0000, 0x0000, 0x0000, }, 10, 14, 120, 124, "mouse10x14_120", }; static const struct mousedata mouse10x16_120 = { { 0xC000, 0xA000, 0x9000, 0x8800, 0x8400, 0x8200, 0x8100, 0x8080, 0x97C0, 0xB200, 0xF200, 0xC900, 0x8900, 0x0480, 0x0480, 0x0300, }, { 0x0000, 0x4000, 0x6000, 0x7000, 0x7800, 0x7C00, 0x7E00, 0x7F00, 0x6800, 0x4C00, 0x0C00, 0x0600, 0x0600, 0x0300, 0x0300, 0x0000, }, 10, 16, 120, 124, "mouse10x16_120", }; static const struct mousedata mouse9x13_133 = { { 0xC000, 0xA000, 0x9000, 0x8800, 0x8400, 0x8200, 0x8100, 0x8080, 0x9780, 0xB200, 0xC900, 0x0900, 0x0600, 0x0000, 0x0000, 0x0000, }, { 0x0000, 0x4000, 0x6000, 0x7000, 0x7800, 0x7C00, 0x7E00, 0x7F00, 0x6800, 0x4C00, 0x0600, 0x0600, 0x0000, 0x0000, 0x0000, 0x0000, }, 9, 13, 142, 124, "mouse9x13_133", }; static const struct mousedata mouse10x16_133 = { { 0xC000, 0xA000, 0x9000, 0x8800, 0x8400, 0x8200, 0x8100, 0x8080, 0x8040, 0x93E0, 0xB200, 0xC900, 0x8900, 0x0480, 0x0480, 0x0300, }, { 0x0000, 0x4000, 0x6000, 0x7000, 0x7800, 0x7C00, 0x7E00, 0x7F00, 0x7F80, 0x6C00, 0x4C00, 0x0600, 
0x0600, 0x0300, 0x0300, 0x0000, }, 10, 16, 120, 133, "mouse10x16_133", }; static const struct mousedata mouse14x10_240 = { { 0xF800, 0xCE00, 0xC380, 0xC0E0, 0xC038, 0xC1FC, 0xDCC0, 0xF660, 0xC330, 0x01E0, 0x0000, 0x0000, 0x0000, 0x0000, 0x0000, 0x0000, }, { 0x0000, 0x3000, 0x3C00, 0x3F00, 0x3FC0, 0x3E00, 0x2300, 0x0180, 0x00C0, 0x0000, 0x0000, 0x0000, 0x0000, 0x0000, 0x0000, 0x0000, }, 14, 10, 189, 189, "mouse14x10_240", }; static const struct mousedata * const mouselarge[] = { &mouse10x16_50, &mouse8x14_67, &mouse10x16_75, &mouse10x16_90, &mouse10x16_100, &mouse10x16_120, &mouse10x16_133, &mouse14x10_240, }; static const struct mousedata * const mousesmall[] = { &mouse8x14_67, &mouse8x13_75, &mouse9x13_90, &mouse9x13_100, &mouse10x14_120, &mouse9x13_133, &mouse14x10_240, }; #endif #endif #ifdef SC_PIXEL_MODE #define GET_PIXEL(scp, pos, x, w) \ ({ \ (scp)->sc->adp->va_window + \ (x) * (scp)->xoff + \ (scp)->yoff * (scp)->font_size * (w) + \ (x) * ((pos) % (scp)->xsize) + \ (scp)->font_size * (w) * ((pos) / (scp)->xsize); \ }) #define DRAW_PIXEL(scp, pos, color) do { \ switch ((scp)->sc->adp->va_info.vi_depth) { \ case 32: \ writel((pos), vga_palette32[color]); \ break; \ case 24: \ if (((pos) & 1) == 0) { \ writew((pos), vga_palette32[color]); \ writeb((pos) + 2, vga_palette32[color] >> 16); \ } else { \ writeb((pos), vga_palette32[color]); \ writew((pos) + 1, vga_palette32[color] >> 8); \ } \ break; \ case 16: \ if ((scp)->sc->adp->va_info.vi_pixel_fsizes[1] == 5) \ writew((pos), vga_palette15[color]); \ else \ writew((pos), vga_palette16[color]); \ break; \ case 15: \ writew((pos), vga_palette15[color]); \ break; \ case 8: \ writeb((pos), (uint8_t)(color)); \ } \ } while (0) static uint32_t vga_palette32[16] = { 0x000000, 0x0000ad, 0x00ad00, 0x00adad, 0xad0000, 0xad00ad, 0xad5200, 0xadadad, 0x525252, 0x5252ff, 0x52ff52, 0x52ffff, 0xff5252, 0xff52ff, 0xffff52, 0xffffff }; static uint16_t vga_palette16[16] = { 0x0000, 0x0016, 0x0560, 0x0576, 0xb000, 0xb016, 0xb2a0, 0xb576, 0x52aa, 0x52bf, 0x57ea, 0x57ff, 0xfaaa, 0xfabf, 0xffea, 0xffff }; static uint16_t vga_palette15[16] = { 0x0000, 0x0016, 0x02c0, 0x02d6, 0x5800, 0x5816, 0x5940, 0x5ad6, 0x294a, 0x295f, 0x2bea, 0x2bff, 0x7d4a, 0x7d5f, 0x7fea, 0x7fff }; #endif static int vga_aspect_scale= 100; SYSCTL_INT(_machdep, OID_AUTO, vga_aspect_scale, CTLFLAG_RW, &vga_aspect_scale, 0, "Aspect scale ratio (3:4):actual times 100"); static u_short vga_flipattr(u_short a, int blink) { if (blink) a = (a & 0x8800) | ((a & 0x7000) >> 4) | ((a & 0x0700) << 4); else a = ((a & 0xf000) >> 4) | ((a & 0x0f00) << 4); return (a); } static u_short -vga_cursorattr_adj(u_short a, int blink) +vga_cursorattr_adj(scr_stat *scp, u_short a, int blink) { - /* - * !blink means pixel mode, and the cursor attribute in that case - * is simplistic reverse video. - */ - if (!blink) - return (vga_flipattr(a, blink)); + int i; + u_short bg, bgmask, fg, newbg; /* * The cursor attribute is usually that of the underlying char - * with the bg changed to white. If the bg is already white, - * then the bg is changed to black. The fg is usually not - * changed, but if it is the same as the new bg then it is - * changed to the inverse of the new bg. + * with only the bg changed, to the first preferred color that + * differs from both the fg and bg. If there is no such color, + * use reverse video. */ - if ((a & 0x7000) == 0x7000) { - a &= 0x8f00; - if ((a & 0x0700) == 0) - a |= 0x0700; - } else { - a |= 0x7000; - if ((a & 0x0700) == 0x0700) - a &= 0xf000; + bgmask = blink ? 
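/*
 * In a text attribute word the foreground lives in bits 8-11 and the
 * background in bits 12-15; with hardware blink enabled bit 15 is the
 * blink flag, so only bits 12-14 are usable background (hence the
 * 0x7000 mask below versus 0xf000 in pixel mode).  The loop then walks
 * scp->curs_attr.bg[] in order of preference and takes the first entry
 * that differs from both the current background and the foreground;
 * e.g. for a plain grey-on-black cell (0x0700) any preferred colour
 * other than grey or black is accepted.  Only if the whole list is
 * exhausted does the cursor fall back to reverse video.
 */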
0x7000 : 0xf000; + bg = a & bgmask; + fg = a & 0x0f00; + for (i = 0; i < nitems(scp->curs_attr.bg); i++) { + newbg = (scp->curs_attr.bg[i] << 12) & bgmask; + if (newbg != bg && newbg != (fg << 4)) + break; } - return (a); + if (i == nitems(scp->curs_attr.bg)) + return (vga_flipattr(a, blink)); + return (fg | newbg | (blink ? a & 0x8000 : 0)); } static void vga_setmdp(scr_stat *scp) { #if !defined(SC_NO_CUTPASTE) && \ (!defined(SC_ALT_MOUSE_IMAGE) || defined(SC_PIXEL_MODE)) const struct mousedata *mdp; const struct mousedata * const *mdpp; int aspect, best_i, best_v, i, n, v, wb, wi, xpixel, ypixel; xpixel = scp->xpixel; ypixel = scp->ypixel; if (scp->sc->adp->va_flags & V_ADP_CWIDTH9) xpixel = xpixel * 9 / 8; /* If 16:9 +-1%, assume square pixels, else scale to 4:3 or full. */ aspect = xpixel * 900 / ypixel / 16; if (aspect < 99 || aspect > 100) aspect = xpixel * 300 / ypixel / 4 * vga_aspect_scale / 100; /* * Use 10x16 cursors except even with 8x8 fonts except in ~200- * line modes where pixels are very large and in text mode where * even 13 pixels high is really 4 too many. Clipping a 16-high * cursor at 9-high gives a variable tail which looks better than * a smaller cursor with a constant tail. * * XXX: the IS*SC() macros don't work when this is called at the * end of a mode switch since UNKNOWN_SC is still set. */ if (scp->font_size <= 8 && (ypixel < 300 || !(scp->status & PIXEL_MODE))) { mdpp = &mousesmall[0]; n = nitems(mousesmall); } else { mdpp = &mouselarge[0]; n = nitems(mouselarge); } if (scp->status & PIXEL_MODE) { wb = 1024; wi = 256; } else { wb = 256; wi = 1024; } best_i = 0; best_v = 0x7fffffff; for (i = 0; i < n; i++) { v = (wb * abs(mdpp[i]->md_baspect - aspect) + wi * abs(mdpp[i]->md_iaspect - aspect)) / aspect; if (best_v > v) { best_v = v; best_i = i; } } mdp = mdpp[best_i]; scp->mouse_data = mdp; #endif /* !SC_NO_CUTPASTE && (!SC_ALT_MOUSE_IMAGE || SC_PIXEL_MODE) */ } static void vga_nop(scr_stat *scp) { } /* text mode renderer */ static void vga_txtclear(scr_stat *scp, int c, int attr) { sc_vtb_clear(&scp->scr, c, attr); } static void vga_txtborder(scr_stat *scp, int color) { vidd_set_border(scp->sc->adp, color); } static void vga_txtdraw(scr_stat *scp, int from, int count, int flip) { vm_offset_t p; int c; int a; if (from + count > scp->xsize*scp->ysize) count = scp->xsize*scp->ysize - from; if (flip) { for (p = sc_vtb_pointer(&scp->scr, from); count-- > 0; ++from) { c = sc_vtb_getc(&scp->vtb, from); a = sc_vtb_geta(&scp->vtb, from); a = vga_flipattr(a, TRUE); p = sc_vtb_putchar(&scp->scr, p, c, a); } } else { sc_vtb_copy(&scp->vtb, from, &scp->scr, from, count); } } static void vga_txtcursor_shape(scr_stat *scp, int base, int height, int blink) { vga_setmdp(scp); if (base < 0 || base >= scp->font_size) return; /* the caller may set height <= 0 in order to disable the cursor */ vidd_set_hw_cursor_shape(scp->sc->adp, base, height, scp->font_size, blink); } static void draw_txtcharcursor(scr_stat *scp, int at, u_short c, u_short a, int flip) { sc_softc_t *sc; sc = scp->sc; #ifndef SC_NO_FONT_LOADING if (scp->curs_attr.flags & CONS_CHAR_CURSOR) { unsigned char *font; int h; int i; if (scp->font_size < 14) { font = sc->font_8; h = 8; } else if (scp->font_size >= 16) { font = sc->font_16; h = 16; } else { font = sc->font_14; h = 14; } if (scp->curs_attr.base >= h) return; if (flip) a = vga_flipattr(a, TRUE); + /* + * This clause handles partial-block cursors in text mode. 
+ * We want to change the attribute only under the partial + * block, but in text mode we can only change full blocks. + * Use reverse video instead. + */ bcopy(font + c*h, font + sc->cursor_char*h, h); font = font + sc->cursor_char*h; for (i = imax(h - scp->curs_attr.base - scp->curs_attr.height, 0); i < h - scp->curs_attr.base; ++i) { font[i] ^= 0xff; } /* XXX */ vidd_load_font(sc->adp, 0, h, 8, font, sc->cursor_char, 1); sc_vtb_putc(&scp->scr, at, sc->cursor_char, a); } else #endif /* SC_NO_FONT_LOADING */ { if (flip) a = vga_flipattr(a, TRUE); - a = vga_cursorattr_adj(a, TRUE); + a = vga_cursorattr_adj(scp, a, TRUE); sc_vtb_putc(&scp->scr, at, c, a); } } static void vga_txtcursor(scr_stat *scp, int at, int blink, int on, int flip) { video_adapter_t *adp; int cursor_attr; if (scp->curs_attr.height <= 0) /* the text cursor is disabled */ return; adp = scp->sc->adp; if (blink) { scp->status |= VR_CURSOR_BLINK; if (on) { scp->status |= VR_CURSOR_ON; vidd_set_hw_cursor(adp, at%scp->xsize, at/scp->xsize); } else { if (scp->status & VR_CURSOR_ON) vidd_set_hw_cursor(adp, -1, -1); scp->status &= ~VR_CURSOR_ON; } } else { scp->status &= ~VR_CURSOR_BLINK; if (on) { scp->status |= VR_CURSOR_ON; draw_txtcharcursor(scp, at, sc_vtb_getc(&scp->vtb, at), sc_vtb_geta(&scp->vtb, at), flip); } else { cursor_attr = sc_vtb_geta(&scp->vtb, at); if (flip) cursor_attr = vga_flipattr(cursor_attr, TRUE); if (scp->status & VR_CURSOR_ON) sc_vtb_putc(&scp->scr, at, sc_vtb_getc(&scp->vtb, at), cursor_attr); scp->status &= ~VR_CURSOR_ON; } } } static void vga_txtblink(scr_stat *scp, int at, int flip) { } int sc_txtmouse_no_retrace_wait; #ifndef SC_NO_CUTPASTE static void draw_txtmouse(scr_stat *scp, int x, int y) { #ifndef SC_ALT_MOUSE_IMAGE if (ISMOUSEAVAIL(scp->sc->adp->va_flags)) { const struct mousedata *mdp; uint32_t border, interior; u_char font_buf[128]; u_short cursor[32]; u_char c; int pos; int xoffset, yoffset; int crtc_addr; int i; mdp = scp->mouse_data; /* prepare mousepointer char's bitmaps */ pos = (y/scp->font_size - scp->yoff)*scp->xsize + x/8 - scp->xoff; bcopy(scp->font + sc_vtb_getc(&scp->scr, pos)*scp->font_size, &font_buf[0], scp->font_size); bcopy(scp->font + sc_vtb_getc(&scp->scr, pos + 1)*scp->font_size, &font_buf[32], scp->font_size); bcopy(scp->font + sc_vtb_getc(&scp->scr, pos + scp->xsize)*scp->font_size, &font_buf[64], scp->font_size); bcopy(scp->font + sc_vtb_getc(&scp->scr, pos + scp->xsize + 1)*scp->font_size, &font_buf[96], scp->font_size); for (i = 0; i < scp->font_size; ++i) { cursor[i] = font_buf[i]<<8 | font_buf[i+32]; cursor[i + scp->font_size] = font_buf[i+64]<<8 | font_buf[i+96]; } /* now and-or in the mousepointer image */ xoffset = x%8; yoffset = y%scp->font_size; for (i = 0; i < 16; ++i) { border = mdp->md_border[i] << 8; /* avoid right shifting out */ interior = mdp->md_interior[i] << 8; border >>= xoffset; /* normalize */ interior >>= xoffset; if (scp->sc->adp->va_flags & V_ADP_CWIDTH9) { /* skip gaps between characters */ border = (border & 0xff0000) | (border & 0x007f80) << 1 | (border & 0x00003f) << 2; interior = (interior & 0xff0000) | (interior & 0x007f80) << 1 | (interior & 0x00003f) << 2; } border >>= 8; /* back to normal position */ interior >>= 8; cursor[i + yoffset] = (cursor[i + yoffset] & ~border) | interior; } for (i = 0; i < scp->font_size; ++i) { font_buf[i] = (cursor[i] & 0xff00) >> 8; font_buf[i + 32] = cursor[i] & 0xff; font_buf[i + 64] = (cursor[i + scp->font_size] & 0xff00) >> 8; font_buf[i + 96] = cursor[i + scp->font_size] & 0xff; } #if 1 /* wait for 
vertical retrace to avoid jitter on some videocards */ crtc_addr = scp->sc->adp->va_crtc_addr; while (!sc_txtmouse_no_retrace_wait && !(inb(crtc_addr + 6) & 0x08)) /* idle */ ; #endif c = scp->sc->mouse_char; vidd_load_font(scp->sc->adp, 0, 32, 8, font_buf, c, 4); sc_vtb_putc(&scp->scr, pos, c, sc_vtb_geta(&scp->scr, pos)); /* FIXME: may be out of range! */ sc_vtb_putc(&scp->scr, pos + scp->xsize, c + 2, sc_vtb_geta(&scp->scr, pos + scp->xsize)); if (x < (scp->xsize - 1)*8) { sc_vtb_putc(&scp->scr, pos + 1, c + 1, sc_vtb_geta(&scp->scr, pos + 1)); sc_vtb_putc(&scp->scr, pos + scp->xsize + 1, c + 3, sc_vtb_geta(&scp->scr, pos + scp->xsize + 1)); } } else #endif /* SC_ALT_MOUSE_IMAGE */ { /* Red, magenta and brown are mapped to green to to keep it readable */ static const int col_conv[16] = { 6, 6, 6, 6, 2, 2, 2, 6, 14, 14, 14, 14, 10, 10, 10, 14 }; int pos; int color; int a; pos = (y/scp->font_size - scp->yoff)*scp->xsize + x/8 - scp->xoff; a = sc_vtb_geta(&scp->scr, pos); if (scp->sc->adp->va_flags & V_ADP_COLOR) color = (col_conv[(a & 0xf000) >> 12] << 12) | ((a & 0x0f00) | 0x0800); else color = ((a & 0xf000) >> 4) | ((a & 0x0f00) << 4); sc_vtb_putc(&scp->scr, pos, sc_vtb_getc(&scp->scr, pos), color); } } static void remove_txtmouse(scr_stat *scp, int x, int y) { } static void vga_txtmouse(scr_stat *scp, int x, int y, int on) { if (on) draw_txtmouse(scp, x, y); else remove_txtmouse(scp, x, y); } #endif /* SC_NO_CUTPASTE */ #ifdef SC_PIXEL_MODE /* pixel (raster text) mode renderer */ static void vga_rndrinit(scr_stat *scp) { if (scp->sc->adp->va_info.vi_mem_model == V_INFO_MM_PLANAR) { scp->rndr->clear = vga_pxlclear_planar; scp->rndr->draw_border = vga_pxlborder_planar; scp->rndr->draw = vga_vgadraw_planar; scp->rndr->draw_cursor = vga_pxlcursor_planar; scp->rndr->blink_cursor = vga_pxlblink_planar; scp->rndr->draw_mouse = vga_pxlmouse_planar; } else if (scp->sc->adp->va_info.vi_mem_model == V_INFO_MM_DIRECT || scp->sc->adp->va_info.vi_mem_model == V_INFO_MM_PACKED) { scp->rndr->clear = vga_pxlclear_direct; scp->rndr->draw_border = vga_pxlborder_direct; scp->rndr->draw = vga_vgadraw_direct; scp->rndr->draw_cursor = vga_pxlcursor_direct; scp->rndr->blink_cursor = vga_pxlblink_direct; scp->rndr->draw_mouse = vga_pxlmouse_direct; } } static void vga_pxlclear_direct(scr_stat *scp, int c, int attr) { vm_offset_t p; int line_width; int pixel_size; int lines; int i; line_width = scp->sc->adp->va_line_width; pixel_size = scp->sc->adp->va_info.vi_pixel_size; lines = scp->ysize * scp->font_size; p = scp->sc->adp->va_window + line_width * scp->yoff * scp->font_size + scp->xoff * 8 * pixel_size; for (i = 0; i < lines; ++i) { bzero_io((void *)p, scp->xsize * 8 * pixel_size); p += line_width; } } static void vga_pxlclear_planar(scr_stat *scp, int c, int attr) { vm_offset_t p; int line_width; int lines; int i; /* XXX: we are just filling the screen with the background color... 
*/ outw(GDCIDX, 0x0005); /* read mode 0, write mode 0 */ outw(GDCIDX, 0x0003); /* data rotate/function select */ outw(GDCIDX, 0x0f01); /* set/reset enable */ outw(GDCIDX, 0xff08); /* bit mask */ outw(GDCIDX, ((attr & 0xf000) >> 4) | 0x00); /* set/reset */ line_width = scp->sc->adp->va_line_width; lines = scp->ysize*scp->font_size; p = scp->sc->adp->va_window + line_width*scp->yoff*scp->font_size + scp->xoff; for (i = 0; i < lines; ++i) { bzero_io((void *)p, scp->xsize); p += line_width; } outw(GDCIDX, 0x0000); /* set/reset */ outw(GDCIDX, 0x0001); /* set/reset enable */ } static void vga_pxlborder_direct(scr_stat *scp, int color) { vm_offset_t s; vm_offset_t e; vm_offset_t f; int line_width; int pixel_size; int x; int y; int i; line_width = scp->sc->adp->va_line_width; pixel_size = scp->sc->adp->va_info.vi_pixel_size; if (scp->yoff > 0) { s = scp->sc->adp->va_window; e = s + line_width * scp->yoff * scp->font_size; for (f = s; f < e; f += pixel_size) DRAW_PIXEL(scp, f, color); } y = (scp->yoff + scp->ysize) * scp->font_size; if (scp->ypixel > y) { s = scp->sc->adp->va_window + line_width * y; e = s + line_width * (scp->ypixel - y); for (f = s; f < e; f += pixel_size) DRAW_PIXEL(scp, f, color); } y = scp->yoff * scp->font_size; x = scp->xpixel / 8 - scp->xoff - scp->xsize; for (i = 0; i < scp->ysize * scp->font_size; ++i) { if (scp->xoff > 0) { s = scp->sc->adp->va_window + line_width * (y + i); e = s + scp->xoff * 8 * pixel_size; for (f = s; f < e; f += pixel_size) DRAW_PIXEL(scp, f, color); } if (x > 0) { s = scp->sc->adp->va_window + line_width * (y + i) + scp->xoff * 8 * pixel_size + scp->xsize * 8 * pixel_size; e = s + x * 8 * pixel_size; for (f = s; f < e; f += pixel_size) DRAW_PIXEL(scp, f, color); } } } static void vga_pxlborder_planar(scr_stat *scp, int color) { vm_offset_t p; int line_width; int x; int y; int i; vidd_set_border(scp->sc->adp, color); outw(GDCIDX, 0x0005); /* read mode 0, write mode 0 */ outw(GDCIDX, 0x0003); /* data rotate/function select */ outw(GDCIDX, 0x0f01); /* set/reset enable */ outw(GDCIDX, 0xff08); /* bit mask */ outw(GDCIDX, (color << 8) | 0x00); /* set/reset */ line_width = scp->sc->adp->va_line_width; p = scp->sc->adp->va_window; if (scp->yoff > 0) bzero_io((void *)p, line_width*scp->yoff*scp->font_size); y = (scp->yoff + scp->ysize)*scp->font_size; if (scp->ypixel > y) bzero_io((void *)(p + line_width*y), line_width*(scp->ypixel - y)); y = scp->yoff*scp->font_size; x = scp->xpixel/8 - scp->xoff - scp->xsize; for (i = 0; i < scp->ysize*scp->font_size; ++i) { if (scp->xoff > 0) bzero_io((void *)(p + line_width*(y + i)), scp->xoff); if (x > 0) bzero_io((void *)(p + line_width*(y + i) + scp->xoff + scp->xsize), x); } outw(GDCIDX, 0x0000); /* set/reset */ outw(GDCIDX, 0x0001); /* set/reset enable */ } static void vga_vgadraw_direct(scr_stat *scp, int from, int count, int flip) { vm_offset_t d; vm_offset_t e; u_char *f; u_short col1, col2, color; int line_width, pixel_size; int i, j, k; int a; line_width = scp->sc->adp->va_line_width; pixel_size = scp->sc->adp->va_info.vi_pixel_size; d = GET_PIXEL(scp, from, 8 * pixel_size, line_width); if (from + count > scp->xsize * scp->ysize) count = scp->xsize * scp->ysize - from; for (i = from; count-- > 0; ++i) { a = sc_vtb_geta(&scp->vtb, i); if (flip) a = vga_flipattr(a, FALSE); col1 = (a & 0x0f00) >> 8; col2 = (a & 0xf000) >> 12; e = d; f = &(scp->font[sc_vtb_getc(&scp->vtb, i) * scp->font_size]); for (j = 0; j < scp->font_size; ++j, ++f) { for (k = 0; k < 8; ++k) { color = *f & (1 << (7 - k)) ? 
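/*
 * In direct/packed pixel modes every character cell is expanded in
 * software: each font byte is scanned MSB-first, so bit 7 is the
 * leftmost pixel, and DRAW_PIXEL() writes col1 (the low attribute
 * nibble, i.e. the foreground) for set bits and col2 (the high nibble,
 * the background) for clear bits, converting through the vga_palette*
 * tables as dictated by vi_depth.  For example, a font byte of 0x81
 * lights only the first and last pixel of that row in col1.
 */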
col1 : col2; DRAW_PIXEL(scp, e + pixel_size * k, color); } e += line_width; } d += 8 * pixel_size; if ((i % scp->xsize) == scp->xsize - 1) d += scp->font_size * line_width - scp->xsize * 8 * pixel_size; } } static void vga_vgadraw_planar(scr_stat *scp, int from, int count, int flip) { vm_offset_t d; vm_offset_t e; u_char *f; u_short bg, fg; u_short col1, col2; int line_width; int i, j; int a; u_char c; line_width = scp->sc->adp->va_line_width; d = GET_PIXEL(scp, from, 1, line_width); if (scp->sc->adp->va_type == KD_VGA) { outw(GDCIDX, 0x0305); /* read mode 0, write mode 3 */ outw(GDCIDX, 0xff08); /* bit mask */ } else outw(GDCIDX, 0x0005); /* read mode 0, write mode 0 */ outw(GDCIDX, 0x0003); /* data rotate/function select */ outw(GDCIDX, 0x0f01); /* set/reset enable */ fg = bg = -1; if (from + count > scp->xsize*scp->ysize) count = scp->xsize*scp->ysize - from; for (i = from; count-- > 0; ++i) { a = sc_vtb_geta(&scp->vtb, i); if (flip) a = vga_flipattr(a, FALSE); col1 = a & 0x0f00; col2 = (a & 0xf000) >> 4; /* set background color in EGA/VGA latch */ if (bg != col2) { bg = col2; fg = -1; outw(GDCIDX, bg | 0x00); /* set/reset */ if (scp->sc->adp->va_type != KD_VGA) outw(GDCIDX, 0xff08); /* bit mask */ writeb(d, 0xff); c = readb(d); /* set bg color in the latch */ } /* foreground color */ if (fg != col1) { fg = col1; outw(GDCIDX, col1 | 0x00); /* set/reset */ } e = d; f = &(scp->font[sc_vtb_getc(&scp->vtb, i)*scp->font_size]); for (j = 0; j < scp->font_size; ++j, ++f) { if (scp->sc->adp->va_type == KD_VGA) writeb(e, *f); else { outw(GDCIDX, (*f << 8) | 0x08); /* bit mask */ writeb(e, 0); } e += line_width; } ++d; if ((i % scp->xsize) == scp->xsize - 1) d += scp->font_size * line_width - scp->xsize; } if (scp->sc->adp->va_type == KD_VGA) outw(GDCIDX, 0x0005); /* read mode 0, write mode 0 */ else outw(GDCIDX, 0xff08); /* bit mask */ outw(GDCIDX, 0x0000); /* set/reset */ outw(GDCIDX, 0x0001); /* set/reset enable */ } static void vga_pxlcursor_shape(scr_stat *scp, int base, int height, int blink) { vga_setmdp(scp); } static void draw_pxlcursor_direct(scr_stat *scp, int at, int on, int flip) { vm_offset_t d; u_char *f; int line_width, pixel_size; int height; int col1, col2, color; int a; int i, j; line_width = scp->sc->adp->va_line_width; pixel_size = scp->sc->adp->va_info.vi_pixel_size; d = GET_PIXEL(scp, at, 8 * pixel_size, line_width) + (scp->font_size - scp->curs_attr.base - 1) * line_width; a = sc_vtb_geta(&scp->vtb, at); if (flip) a = vga_flipattr(a, FALSE); if (on) - a = vga_cursorattr_adj(a, FALSE); + a = vga_cursorattr_adj(scp, a, FALSE); col1 = (a & 0x0f00) >> 8; col2 = a >> 12; f = &(scp->font[sc_vtb_getc(&scp->vtb, at) * scp->font_size + scp->font_size - scp->curs_attr.base - 1]); height = imin(scp->curs_attr.height, scp->font_size); for (i = 0; i < height; ++i, --f) { for (j = 0; j < 8; ++j) { color = *f & (1 << (7 - j)) ? 
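/*
 * The cursor overlay repaints only the scan lines it covers: drawing
 * starts at row (font_size - curs_attr.base - 1), i.e. the bottom of
 * the cursor block, and walks upwards (--f, d -= line_width) for
 * imin(curs_attr.height, font_size) rows, using the attribute already
 * passed through vga_cursorattr_adj() when the cursor is on.  With a
 * 16-line font, base 0 and height 2 this touches just the bottom two
 * rows of the cell.
 */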
col1 : col2; DRAW_PIXEL(scp, d + pixel_size * j, color); } d -= line_width; } } static void draw_pxlcursor_planar(scr_stat *scp, int at, int on, int flip) { vm_offset_t d; u_char *f; int line_width; int height; int col; int a; int i; u_char c; line_width = scp->sc->adp->va_line_width; d = GET_PIXEL(scp, at, 1, line_width) + (scp->font_size - scp->curs_attr.base - 1) * line_width; outw(GDCIDX, 0x0005); /* read mode 0, write mode 0 */ outw(GDCIDX, 0x0003); /* data rotate/function select */ outw(GDCIDX, 0x0f01); /* set/reset enable */ /* set background color in EGA/VGA latch */ a = sc_vtb_geta(&scp->vtb, at); if (flip) a = vga_flipattr(a, FALSE); if (on) - a = vga_cursorattr_adj(a, FALSE); + a = vga_cursorattr_adj(scp, a, FALSE); col = (a & 0xf000) >> 4; outw(GDCIDX, col | 0x00); /* set/reset */ outw(GDCIDX, 0xff08); /* bit mask */ writeb(d, 0); c = readb(d); /* set bg color in the latch */ /* foreground color */ col = a & 0x0f00; outw(GDCIDX, col | 0x00); /* set/reset */ f = &(scp->font[sc_vtb_getc(&scp->vtb, at)*scp->font_size + scp->font_size - scp->curs_attr.base - 1]); height = imin(scp->curs_attr.height, scp->font_size); for (i = 0; i < height; ++i, --f) { outw(GDCIDX, (*f << 8) | 0x08); /* bit mask */ writeb(d, 0); d -= line_width; } outw(GDCIDX, 0x0000); /* set/reset */ outw(GDCIDX, 0x0001); /* set/reset enable */ outw(GDCIDX, 0xff08); /* bit mask */ } static int pxlblinkrate = 0; static void vga_pxlcursor_direct(scr_stat *scp, int at, int blink, int on, int flip) { if (scp->curs_attr.height <= 0) /* the text cursor is disabled */ return; if (on) { if (!blink) { scp->status |= VR_CURSOR_ON; draw_pxlcursor_direct(scp, at, on, flip); } else if (++pxlblinkrate & 4) { pxlblinkrate = 0; scp->status ^= VR_CURSOR_ON; draw_pxlcursor_direct(scp, at, scp->status & VR_CURSOR_ON, flip); } } else { if (scp->status & VR_CURSOR_ON) draw_pxlcursor_direct(scp, at, on, flip); scp->status &= ~VR_CURSOR_ON; } if (blink) scp->status |= VR_CURSOR_BLINK; else scp->status &= ~VR_CURSOR_BLINK; } static void vga_pxlcursor_planar(scr_stat *scp, int at, int blink, int on, int flip) { if (scp->curs_attr.height <= 0) /* the text cursor is disabled */ return; if (on) { if (!blink) { scp->status |= VR_CURSOR_ON; draw_pxlcursor_planar(scp, at, on, flip); } else if (++pxlblinkrate & 4) { pxlblinkrate = 0; scp->status ^= VR_CURSOR_ON; draw_pxlcursor_planar(scp, at, scp->status & VR_CURSOR_ON, flip); } } else { if (scp->status & VR_CURSOR_ON) draw_pxlcursor_planar(scp, at, on, flip); scp->status &= ~VR_CURSOR_ON; } if (blink) scp->status |= VR_CURSOR_BLINK; else scp->status &= ~VR_CURSOR_BLINK; } static void vga_pxlblink_direct(scr_stat *scp, int at, int flip) { if (!(scp->status & VR_CURSOR_BLINK)) return; if (!(++pxlblinkrate & 4)) return; pxlblinkrate = 0; scp->status ^= VR_CURSOR_ON; draw_pxlcursor_direct(scp, at, scp->status & VR_CURSOR_ON, flip); } static void vga_pxlblink_planar(scr_stat *scp, int at, int flip) { if (!(scp->status & VR_CURSOR_BLINK)) return; if (!(++pxlblinkrate & 4)) return; pxlblinkrate = 0; scp->status ^= VR_CURSOR_ON; draw_pxlcursor_planar(scp, at, scp->status & VR_CURSOR_ON, flip); } #ifndef SC_NO_CUTPASTE static void draw_pxlmouse_planar(scr_stat *scp, int x, int y) { const struct mousedata *mdp; vm_offset_t p; int line_width; int xoff, yoff; int ymax; uint32_t m; int i, j, k; uint8_t m1; mdp = scp->mouse_data; line_width = scp->sc->adp->va_line_width; xoff = (x - scp->xoff*8)%8; yoff = y - rounddown(y, line_width); ymax = imin(y + mdp->md_height, scp->ypixel); if (scp->sc->adp->va_type == 
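/*
 * Planar drawing differs by adapter: on real VGA the code below selects
 * write mode 3 (0x0305), so the byte written by the CPU acts as the bit
 * mask and the set/reset colour fills exactly those bits, while on EGA
 * it stays in write mode 0 and reprograms the bit-mask register (index
 * 8) for every byte.  The set/reset colour itself now comes from
 * scp->curs_attr.mouse_ba for the pointer border and mouse_ia for the
 * interior instead of the former hard-coded 0 and 15.  For example, a
 * mask byte of 0xe0 repaints only the three leftmost pixels of that
 * byte.
 */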
KD_VGA) { outw(GDCIDX, 0x0305); /* read mode 0, write mode 3 */ outw(GDCIDX, 0xff08); /* bit mask */ } else outw(GDCIDX, 0x0005); /* read mode 0, write mode 0 */ outw(GDCIDX, 0x0003); /* data rotate/function select */ outw(GDCIDX, 0x0f01); /* set/reset enable */ - outw(GDCIDX, (0 << 8) | 0x00); /* set/reset */ + outw(GDCIDX, (scp->curs_attr.mouse_ba << 8) | 0x00); /* set/reset */ p = scp->sc->adp->va_window + line_width*y + x/8; for (i = y, j = 0; i < ymax; ++i, ++j) { m = mdp->md_border[j] << 8 >> xoff; for (k = 0; k < 3; ++k) { m1 = m >> (8 * (2 - k)); if (m1 != 0 && x + 8 * k < scp->xpixel) { readb(p + k); if (scp->sc->adp->va_type == KD_VGA) writeb(p + k, m1); else { /* bit mask: */ outw(GDCIDX, (m1 << 8) | 0x08); writeb(p + k, 0); } } } p += line_width; } - outw(GDCIDX, (15 << 8) | 0x00); /* set/reset */ + outw(GDCIDX, (scp->curs_attr.mouse_ia << 8) | 0x00); /* set/reset */ p = scp->sc->adp->va_window + line_width*y + x/8; for (i = y, j = 0; i < ymax; ++i, ++j) { m = mdp->md_interior[j] << 8 >> xoff; for (k = 0; k < 3; ++k) { m1 = m >> (8 * (2 - k)); if (m1 != 0 && x + 8 * k < scp->xpixel) { readb(p + k); if (scp->sc->adp->va_type == KD_VGA) writeb(p + k, m1); else { /* bit mask: */ outw(GDCIDX, (m1 << 8) | 0x08); writeb(p + k, 0); } } } p += line_width; } if (scp->sc->adp->va_type == KD_VGA) outw(GDCIDX, 0x0005); /* read mode 0, write mode 0 */ else outw(GDCIDX, 0xff08); /* bit mask */ outw(GDCIDX, 0x0000); /* set/reset */ outw(GDCIDX, 0x0001); /* set/reset enable */ } static void remove_pxlmouse_planar(scr_stat *scp, int x, int y) { const struct mousedata *mdp; vm_offset_t p; int bx, by, i, line_width, xend, xoff, yend, yoff; mdp = scp->mouse_data; /* * It is only necessary to remove the mouse image where it overlaps * the border. Determine the overlap, and do nothing if it is empty. */ bx = (scp->xoff + scp->xsize) * 8; by = (scp->yoff + scp->ysize) * scp->font_size; xend = imin(x + mdp->md_width, scp->xpixel); yend = imin(y + mdp->md_height, scp->ypixel); if (xend <= bx && yend <= by) return; /* Repaint the non-empty overlap. */ line_width = scp->sc->adp->va_line_width; outw(GDCIDX, 0x0005); /* read mode 0, write mode 0 */ outw(GDCIDX, 0x0003); /* data rotate/function select */ outw(GDCIDX, 0x0f01); /* set/reset enable */ outw(GDCIDX, 0xff08); /* bit mask */ outw(GDCIDX, (scp->border << 8) | 0x00); /* set/reset */ for (i = x / 8, xoff = i * 8; xoff < xend; ++i, xoff += 8) { yoff = (xoff >= bx) ? y : by; p = scp->sc->adp->va_window + yoff * line_width + i; for (; yoff < yend; ++yoff, p += line_width) writeb(p, 0); } outw(GDCIDX, 0x0000); /* set/reset */ outw(GDCIDX, 0x0001); /* set/reset enable */ } static void vga_pxlmouse_direct(scr_stat *scp, int x, int y, int on) { const struct mousedata *mdp; vm_offset_t p; int line_width, pixel_size; int xend, yend; int i, j; mdp = scp->mouse_data; /* * Determine overlap with the border and then if removing, do nothing * if the overlap is empty. */ xend = imin(x + mdp->md_width, scp->xpixel); yend = imin(y + mdp->md_height, scp->ypixel); if (!on && xend <= (scp->xoff + scp->xsize) * 8 && yend <= (scp->yoff + scp->ysize) * scp->font_size) return; line_width = scp->sc->adp->va_line_width; pixel_size = scp->sc->adp->va_info.vi_pixel_size; if (on) goto do_on; /* Repaint overlap with the border (mess up the corner a little). 
*/ p = scp->sc->adp->va_window + y * line_width + x * pixel_size; for (i = 0; i < yend - y; i++, p += line_width) for (j = xend - x - 1; j >= 0; j--) DRAW_PIXEL(scp, p + j * pixel_size, scp->border); return; do_on: p = scp->sc->adp->va_window + y * line_width + x * pixel_size; for (i = 0; i < yend - y; i++, p += line_width) for (j = xend - x - 1; j >= 0; j--) if (mdp->md_interior[i] & (1 << (15 - j))) - DRAW_PIXEL(scp, p + j * pixel_size, 15); + DRAW_PIXEL(scp, p + j * pixel_size, + scp->curs_attr.mouse_ia); else if (mdp->md_border[i] & (1 << (15 - j))) - DRAW_PIXEL(scp, p + j * pixel_size, 0); + DRAW_PIXEL(scp, p + j * pixel_size, + scp->curs_attr.mouse_ba); } static void vga_pxlmouse_planar(scr_stat *scp, int x, int y, int on) { if (on) draw_pxlmouse_planar(scp, x, y); else remove_pxlmouse_planar(scp, x, y); } #endif /* SC_NO_CUTPASTE */ #endif /* SC_PIXEL_MODE */ #ifndef SC_NO_MODE_CHANGE /* graphics mode renderer */ static void vga_grborder(scr_stat *scp, int color) { vidd_set_border(scp->sc->adp, color); } #endif Index: projects/runtime-coverage/sys/dev/syscons/syscons.c =================================================================== --- projects/runtime-coverage/sys/dev/syscons/syscons.c (revision 322921) +++ projects/runtime-coverage/sys/dev/syscons/syscons.c (revision 322922) @@ -1,4224 +1,4248 @@ /*- * Copyright (c) 1992-1998 Søren Schmidt * All rights reserved. * * This code is derived from software contributed to The DragonFly Project * by Sascha Wildner * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer, * without modification, immediately at the beginning of the file. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. The name of the author may not be used to endorse or promote products * derived from this software without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
*/ #include __FBSDID("$FreeBSD$"); #include "opt_compat.h" #include "opt_syscons.h" #include "opt_splash.h" #include "opt_ddb.h" #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #if defined(__arm__) || defined(__mips__) || \ defined(__powerpc__) || defined(__sparc64__) #include #else #include #endif #if defined( __i386__) || defined(__amd64__) #include #include #endif #include +#if defined(__amd64__) || defined(__i386__) +#include + +#include +#include +#endif + #include #include #include #include #define COLD 0 #define WARM 1 #define DEFAULT_BLANKTIME (5*60) /* 5 minutes */ #define MAX_BLANKTIME (7*24*60*60) /* 7 days!? */ #define KEYCODE_BS 0x0e /* "<-- Backspace" key, XXX */ /* NULL-safe version of "tty_opened()" */ #define tty_opened_ns(tp) ((tp) != NULL && tty_opened(tp)) static u_char sc_kattrtab[MAXCPU]; static int sc_console_unit = -1; static int sc_saver_keyb_only = 1; static scr_stat *sc_console; static struct consdev *sc_consptr; static void *sc_kts[MAXCPU]; static struct sc_term_sw *sc_ktsw; static scr_stat main_console; static struct tty *main_devs[MAXCONS]; static char init_done = COLD; static int shutdown_in_progress = FALSE; static int suspend_in_progress = FALSE; static char sc_malloc = FALSE; static int saver_mode = CONS_NO_SAVER; /* LKM/user saver */ static int run_scrn_saver = FALSE; /* should run the saver? */ static int enable_bell = TRUE; /* enable beeper */ #ifndef SC_DISABLE_REBOOT static int enable_reboot = TRUE; /* enable keyboard reboot */ #endif #ifndef SC_DISABLE_KDBKEY static int enable_kdbkey = TRUE; /* enable keyboard debug */ #endif static long scrn_blank_time = 0; /* screen saver timeout value */ #ifdef DEV_SPLASH static int scrn_blanked; /* # of blanked screen */ static int sticky_splash = FALSE; static void none_saver(sc_softc_t *sc, int blank) { } static void (*current_saver)(sc_softc_t *, int) = none_saver; #endif #ifdef SC_NO_SUSPEND_VTYSWITCH static int sc_no_suspend_vtswitch = 1; #else static int sc_no_suspend_vtswitch = 0; #endif static int sc_susp_scr; static SYSCTL_NODE(_hw, OID_AUTO, syscons, CTLFLAG_RD, 0, "syscons"); -SYSCTL_OPAQUE(_hw_syscons, OID_AUTO, kattr, CTLFLAG_RW, - &sc_kattrtab, sizeof(sc_kattrtab), "CU", "kernel console attributes"); static SYSCTL_NODE(_hw_syscons, OID_AUTO, saver, CTLFLAG_RD, 0, "saver"); SYSCTL_INT(_hw_syscons_saver, OID_AUTO, keybonly, CTLFLAG_RW, &sc_saver_keyb_only, 0, "screen saver interrupted by input only"); SYSCTL_INT(_hw_syscons, OID_AUTO, bell, CTLFLAG_RW, &enable_bell, 0, "enable bell"); #ifndef SC_DISABLE_REBOOT SYSCTL_INT(_hw_syscons, OID_AUTO, kbd_reboot, CTLFLAG_RW|CTLFLAG_SECURE, &enable_reboot, 0, "enable keyboard reboot"); #endif #ifndef SC_DISABLE_KDBKEY SYSCTL_INT(_hw_syscons, OID_AUTO, kbd_debug, CTLFLAG_RW|CTLFLAG_SECURE, &enable_kdbkey, 0, "enable keyboard debug"); #endif SYSCTL_INT(_hw_syscons, OID_AUTO, sc_no_suspend_vtswitch, CTLFLAG_RWTUN, &sc_no_suspend_vtswitch, 0, "Disable VT switch before suspend."); #if !defined(SC_NO_FONT_LOADING) && defined(SC_DFLT_FONT) #include "font.h" #endif tsw_ioctl_t *sc_user_ioctl; static bios_values_t bios_value; static int enable_panic_key; SYSCTL_INT(_machdep, OID_AUTO, enable_panic_key, CTLFLAG_RW, &enable_panic_key, 0, "Enable panic via keypress specified in kbdmap(5)"); #define SC_CONSOLECTL 255 #define VTY_WCHAN(sc, vty) (&SC_DEV(sc, vty)) /* prototypes */ static int 
sc_allocate_keyboard(sc_softc_t *sc, int unit); static int scvidprobe(int unit, int flags, int cons); static int sckbdprobe(int unit, int flags, int cons); static void scmeminit(void *arg); static int scdevtounit(struct tty *tp); static kbd_callback_func_t sckbdevent; static void scinit(int unit, int flags); static scr_stat *sc_get_stat(struct tty *tp); static void scterm(int unit, int flags); static void scshutdown(void *, int); static void scsuspend(void *); static void scresume(void *); static u_int scgetc(sc_softc_t *sc, u_int flags, struct sc_cnstate *sp); static void sc_puts(scr_stat *scp, u_char *buf, int len); #define SCGETC_CN 1 #define SCGETC_NONBLOCK 2 static void sccnupdate(scr_stat *scp); static scr_stat *alloc_scp(sc_softc_t *sc, int vty); static void init_scp(sc_softc_t *sc, int vty, scr_stat *scp); static timeout_t scrn_timer; static int and_region(int *s1, int *e1, int s2, int e2); static void scrn_update(scr_stat *scp, int show_cursor); #ifdef DEV_SPLASH static int scsplash_callback(int event, void *arg); static void scsplash_saver(sc_softc_t *sc, int show); static int add_scrn_saver(void (*this_saver)(sc_softc_t *, int)); static int remove_scrn_saver(void (*this_saver)(sc_softc_t *, int)); static int set_scrn_saver_mode(scr_stat *scp, int mode, u_char *pal, int border); static int restore_scrn_saver_mode(scr_stat *scp, int changemode); static void stop_scrn_saver(sc_softc_t *sc, void (*saver)(sc_softc_t *, int)); static int wait_scrn_saver_stop(sc_softc_t *sc); #define scsplash_stick(stick) (sticky_splash = (stick)) #else /* !DEV_SPLASH */ #define scsplash_stick(stick) #endif /* DEV_SPLASH */ static int do_switch_scr(sc_softc_t *sc, int s); static int vt_proc_alive(scr_stat *scp); static int signal_vt_rel(scr_stat *scp); static int signal_vt_acq(scr_stat *scp); static int finish_vt_rel(scr_stat *scp, int release, int *s); static int finish_vt_acq(scr_stat *scp); static void exchange_scr(sc_softc_t *sc); static void update_cursor_image(scr_stat *scp); static void change_cursor_shape(scr_stat *scp, int flags, int base, int height); static void update_font(scr_stat *); static int save_kbd_state(scr_stat *scp); static int update_kbd_state(scr_stat *scp, int state, int mask); static int update_kbd_leds(scr_stat *scp, int which); static int sc_kattr(void); static timeout_t blink_screen; static struct tty *sc_alloc_tty(int, int); static cn_probe_t sc_cnprobe; static cn_init_t sc_cninit; static cn_term_t sc_cnterm; static cn_getc_t sc_cngetc; static cn_putc_t sc_cnputc; static cn_grab_t sc_cngrab; static cn_ungrab_t sc_cnungrab; CONSOLE_DRIVER(sc); static tsw_open_t sctty_open; static tsw_close_t sctty_close; static tsw_outwakeup_t sctty_outwakeup; static tsw_ioctl_t sctty_ioctl; static tsw_mmap_t sctty_mmap; static struct ttydevsw sc_ttydevsw = { .tsw_open = sctty_open, .tsw_close = sctty_close, .tsw_outwakeup = sctty_outwakeup, .tsw_ioctl = sctty_ioctl, .tsw_mmap = sctty_mmap, }; static d_ioctl_t consolectl_ioctl; static d_close_t consolectl_close; static struct cdevsw consolectl_devsw = { .d_version = D_VERSION, .d_flags = D_NEEDGIANT | D_TRACKCLOSE, .d_ioctl = consolectl_ioctl, .d_close = consolectl_close, .d_name = "consolectl", }; /* ec -- emergency console. 
*/ static u_int ec_scroffset; static void ec_putc(int c) { uintptr_t fb; u_short *scrptr; u_int ind; int attr, column, mysize, width, xsize, yborder, ysize; if (c < 0 || c > 0xff || c == '\a') return; if (sc_console == NULL) { #if !defined(__amd64__) && !defined(__i386__) return; -#endif +#else /* * This is enough for ec_putc() to work very early on x86 * if the kernel starts in normal color text mode. */ - fb = 0xb8000; + fb = KERNBASE + 0xb8000; xsize = 80; ysize = 25; +#endif } else { - if (main_console.status & GRAPHICS_MODE) + if (!ISTEXTSC(&main_console)) return; fb = main_console.sc->adp->va_window; xsize = main_console.xsize; ysize = main_console.ysize; } yborder = ysize / 5; scrptr = (u_short *)(void *)fb + xsize * yborder; mysize = xsize * (ysize - 2 * yborder); do { ind = ec_scroffset; column = ind % xsize; width = (c == '\b' ? -1 : c == '\t' ? (column + 8) & ~7 : c == '\r' ? -column : c == '\n' ? xsize - column : 1); if (width == 0 || (width < 0 && ind < -width)) return; } while (atomic_cmpset_rel_int(&ec_scroffset, ind, ind + width) == 0); if (c == '\b' || c == '\r') return; if (c == '\n') ind += xsize; /* XXX clearing from new pos is not atomic */ attr = sc_kattr(); if (c == '\t' || c == '\n') c = ' '; do scrptr[ind++ % mysize] = (attr << 8) | c; while (--width != 0); } int sc_probe_unit(int unit, int flags) { if (!vty_enabled(VTY_SC)) return ENXIO; if (!scvidprobe(unit, flags, FALSE)) { if (bootverbose) printf("%s%d: no video adapter found.\n", SC_DRIVER_NAME, unit); return ENXIO; } /* syscons will be attached even when there is no keyboard */ sckbdprobe(unit, flags, FALSE); return 0; } /* probe video adapters, return TRUE if found */ static int scvidprobe(int unit, int flags, int cons) { /* * Access the video adapter driver through the back door! * Video adapter drivers need to be configured before syscons. * However, when syscons is being probed as the low-level console, * they have not been initialized yet. We force them to initialize * themselves here. XXX */ vid_configure(cons ? VIO_PROBE_ONLY : 0); return (vid_find_adapter("*", unit) >= 0); } /* probe the keyboard, return TRUE if found */ static int sckbdprobe(int unit, int flags, int cons) { /* access the keyboard driver through the backdoor! */ kbd_configure(cons ? KB_CONF_PROBE_ONLY : 0); return (kbd_find_keyboard("*", unit) >= 0); } static char *adapter_name(video_adapter_t *adp) { static struct { int type; char *name[2]; } names[] = { { KD_MONO, { "MDA", "MDA" } }, { KD_HERCULES, { "Hercules", "Hercules" } }, { KD_CGA, { "CGA", "CGA" } }, { KD_EGA, { "EGA", "EGA (mono)" } }, { KD_VGA, { "VGA", "VGA (mono)" } }, { KD_TGA, { "TGA", "TGA" } }, { -1, { "Unknown", "Unknown" } }, }; int i; for (i = 0; names[i].type != -1; ++i) if (names[i].type == adp->va_type) break; return names[i].name[(adp->va_flags & V_ADP_COLOR) ? 0 : 1]; } static void sctty_outwakeup(struct tty *tp) { size_t len; u_char buf[PCBURST]; scr_stat *scp = sc_get_stat(tp); if (scp->status & SLKED || (scp == scp->sc->cur_scp && scp->sc->blink_in_progress)) return; for (;;) { len = ttydisc_getc(tp, buf, sizeof buf); if (len == 0) break; SC_VIDEO_LOCK(scp->sc); sc_puts(scp, buf, len); SC_VIDEO_UNLOCK(scp->sc); } } static struct tty * sc_alloc_tty(int index, int devnum) { struct sc_ttysoftc *stc; struct tty *tp; /* Allocate TTY object and softc to store unit number. */ stc = malloc(sizeof(struct sc_ttysoftc), M_DEVBUF, M_WAITOK); stc->st_index = index; stc->st_stat = NULL; tp = tty_alloc_mutex(&sc_ttydevsw, stc, &Giant); /* Create device node. 
*/ tty_makedev(tp, NULL, "v%r", devnum); return (tp); } #ifdef SC_PIXEL_MODE static void sc_set_vesa_mode(scr_stat *scp, sc_softc_t *sc, int unit) { video_info_t info; u_char *font; int depth; int fontsize; int i; int vmode; vmode = 0; (void)resource_int_value("sc", unit, "vesa_mode", &vmode); if (vmode < M_VESA_BASE || vmode > M_VESA_MODE_MAX || vidd_get_info(sc->adp, vmode, &info) != 0 || !sc_support_pixel_mode(&info)) vmode = 0; /* * If the mode is unset or unsupported, search for an available * 800x600 graphics mode with the highest color depth. */ if (vmode == 0) { for (depth = 0, i = M_VESA_BASE; i <= M_VESA_MODE_MAX; i++) if (vidd_get_info(sc->adp, i, &info) == 0 && info.vi_width == 800 && info.vi_height == 600 && sc_support_pixel_mode(&info) && info.vi_depth > depth) { vmode = i; depth = info.vi_depth; } if (vmode == 0) return; vidd_get_info(sc->adp, vmode, &info); } #if !defined(SC_NO_FONT_LOADING) && defined(SC_DFLT_FONT) fontsize = info.vi_cheight; #else fontsize = scp->font_size; #endif if (fontsize < 14) fontsize = 8; else if (fontsize >= 16) fontsize = 16; else fontsize = 14; #ifndef SC_NO_FONT_LOADING switch (fontsize) { case 8: if ((sc->fonts_loaded & FONT_8) == 0) return; font = sc->font_8; break; case 14: if ((sc->fonts_loaded & FONT_14) == 0) return; font = sc->font_14; break; case 16: if ((sc->fonts_loaded & FONT_16) == 0) return; font = sc->font_16; break; } #else font = NULL; #endif #ifdef DEV_SPLASH if ((sc->flags & SC_SPLASH_SCRN) != 0) splash_term(sc->adp); #endif #ifndef SC_NO_HISTORY if (scp->history != NULL) { sc_vtb_append(&scp->vtb, 0, scp->history, scp->ypos * scp->xsize + scp->xpos); scp->history_pos = sc_vtb_tail(scp->history); } #endif vidd_set_mode(sc->adp, vmode); scp->status |= (UNKNOWN_MODE | PIXEL_MODE | MOUSE_HIDDEN); scp->status &= ~(GRAPHICS_MODE | MOUSE_VISIBLE); scp->xpixel = info.vi_width; scp->ypixel = info.vi_height; scp->xsize = scp->xpixel / 8; scp->ysize = scp->ypixel / fontsize; scp->xpos = 0; scp->ypos = scp->ysize - 1; scp->xoff = scp->yoff = 0; scp->font = font; scp->font_size = fontsize; scp->font_width = 8; scp->start = scp->xsize * scp->ysize - 1; scp->end = 0; scp->cursor_pos = scp->cursor_oldpos = scp->xsize * scp->xsize; scp->mode = sc->initial_mode = vmode; #ifndef __sparc64__ sc_vtb_init(&scp->scr, VTB_FRAMEBUFFER, scp->xsize, scp->ysize, (void *)sc->adp->va_window, FALSE); #endif sc_alloc_scr_buffer(scp, FALSE, FALSE); sc_init_emulator(scp, NULL); #ifndef SC_NO_CUTPASTE sc_alloc_cut_buffer(scp, FALSE); #endif #ifndef SC_NO_HISTORY sc_alloc_history_buffer(scp, 0, 0, FALSE); #endif sc_set_border(scp, scp->border); sc_set_cursor_image(scp); scp->status &= ~UNKNOWN_MODE; #ifdef DEV_SPLASH if ((sc->flags & SC_SPLASH_SCRN) != 0) splash_init(sc->adp, scsplash_callback, sc); #endif } #endif int sc_attach_unit(int unit, int flags) { sc_softc_t *sc; scr_stat *scp; struct cdev *dev; void *oldts, *ts; int i, vc; if (!vty_enabled(VTY_SC)) return ENXIO; flags &= ~SC_KERNEL_CONSOLE; if (sc_console_unit == unit) { /* * If this unit is being used as the system console, we need to * adjust some variables and buffers before and after scinit(). 
*/ /* assert(sc_console != NULL) */ flags |= SC_KERNEL_CONSOLE; scmeminit(NULL); scinit(unit, flags); if (sc_console->tsw->te_size > 0) { sc_ktsw = sc_console->tsw; /* assert(sc_console->ts != NULL); */ oldts = sc_console->ts; for (i = 0; i <= mp_maxid; i++) { ts = malloc(sc_console->tsw->te_size, M_DEVBUF, M_WAITOK | M_ZERO); (*sc_console->tsw->te_init)(sc_console, &ts, SC_TE_COLD_INIT); sc_console->ts = ts; (*sc_console->tsw->te_default_attr)(sc_console, sc_kattrtab[i], SC_KERNEL_CONS_REV_ATTR); sc_kts[i] = ts; } sc_console->ts = oldts; (*sc_console->tsw->te_default_attr)(sc_console, SC_NORM_ATTR, SC_NORM_REV_ATTR); } } else { scinit(unit, flags); } sc = sc_get_softc(unit, flags & SC_KERNEL_CONSOLE); sc->config = flags; callout_init(&sc->ctimeout, 0); callout_init(&sc->cblink, 0); scp = sc_get_stat(sc->dev[0]); if (sc_console == NULL) /* sc_console_unit < 0 */ sc_console = scp; #ifdef SC_PIXEL_MODE if ((sc->config & SC_VESAMODE) != 0) sc_set_vesa_mode(scp, sc, unit); #endif /* SC_PIXEL_MODE */ /* initialize cursor */ if (!ISGRAPHSC(scp)) update_cursor_image(scp); /* get screen update going */ scrn_timer(sc); /* set up the keyboard */ (void)kbdd_ioctl(sc->kbd, KDSKBMODE, (caddr_t)&scp->kbd_mode); update_kbd_state(scp, scp->status, LOCK_MASK); printf("%s%d: %s <%d virtual consoles, flags=0x%x>\n", SC_DRIVER_NAME, unit, adapter_name(sc->adp), sc->vtys, sc->config); if (bootverbose) { printf("%s%d:", SC_DRIVER_NAME, unit); if (sc->adapter >= 0) printf(" fb%d", sc->adapter); if (sc->keyboard >= 0) printf(", kbd%d", sc->keyboard); if (scp->tsw) printf(", terminal emulator: %s (%s)", scp->tsw->te_name, scp->tsw->te_desc); printf("\n"); } /* Register suspend/resume/shutdown callbacks for the kernel console. */ if (sc_console_unit == unit) { EVENTHANDLER_REGISTER(power_suspend_early, scsuspend, NULL, EVENTHANDLER_PRI_ANY); EVENTHANDLER_REGISTER(power_resume, scresume, NULL, EVENTHANDLER_PRI_ANY); EVENTHANDLER_REGISTER(shutdown_pre_sync, scshutdown, NULL, SHUTDOWN_PRI_DEFAULT); } for (vc = 0; vc < sc->vtys; vc++) { if (sc->dev[vc] == NULL) { sc->dev[vc] = sc_alloc_tty(vc, vc + unit * MAXCONS); if (vc == 0 && sc->dev == main_devs) SC_STAT(sc->dev[0]) = &main_console; } /* * The first vty already has struct tty and scr_stat initialized * in scinit(). The other vtys will have these structs when * first opened. */ } dev = make_dev(&consolectl_devsw, 0, UID_ROOT, GID_WHEEL, 0600, "consolectl"); dev->si_drv1 = sc->dev[0]; return 0; } static void scmeminit(void *arg) { if (!vty_enabled(VTY_SC)) return; if (sc_malloc) return; sc_malloc = TRUE; /* * As soon as malloc() becomes functional, we had better allocate * various buffers for the kernel console. */ if (sc_console_unit < 0) /* sc_console == NULL */ return; /* copy the temporary buffer to the final buffer */ sc_alloc_scr_buffer(sc_console, FALSE, FALSE); #ifndef SC_NO_CUTPASTE sc_alloc_cut_buffer(sc_console, FALSE); #endif #ifndef SC_NO_HISTORY /* initialize history buffer & pointers */ sc_alloc_history_buffer(sc_console, 0, 0, FALSE); #endif } /* XXX */ SYSINIT(sc_mem, SI_SUB_KMEM, SI_ORDER_ANY, scmeminit, NULL); static int scdevtounit(struct tty *tp) { int vty = SC_VTY(tp); if (vty == SC_CONSOLECTL) return ((sc_console != NULL) ? 
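/*
 * The vty-to-unit mapping is simple: the consolectl vty (SC_CONSOLECTL)
 * belongs to whatever unit hosts the kernel console, anything out of
 * range yields -1, and otherwise each unit owns MAXCONS consecutive
 * vtys, so unit = vty / MAXCONS (vtys 0..MAXCONS-1 are sc0, the next
 * MAXCONS are sc1, and so on).
 */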
sc_console->sc->unit : -1); else if ((vty < 0) || (vty >= MAXCONS*sc_max_unit())) return -1; else return vty/MAXCONS; } static int sctty_open(struct tty *tp) { int unit = scdevtounit(tp); sc_softc_t *sc; scr_stat *scp; #ifndef __sparc64__ keyarg_t key; #endif DPRINTF(5, ("scopen: dev:%s, unit:%d, vty:%d\n", devtoname(tp->t_dev), unit, SC_VTY(tp))); sc = sc_get_softc(unit, (sc_console_unit == unit) ? SC_KERNEL_CONSOLE : 0); if (sc == NULL) return ENXIO; if (!tty_opened(tp)) { /* Use the current setting of the <-- key as default VERASE. */ /* If the Delete key is preferable, an stty is necessary */ #ifndef __sparc64__ if (sc->kbd != NULL) { key.keynum = KEYCODE_BS; (void)kbdd_ioctl(sc->kbd, GIO_KEYMAPENT, (caddr_t)&key); tp->t_termios.c_cc[VERASE] = key.key.map[0]; } #endif } scp = sc_get_stat(tp); if (scp == NULL) { scp = SC_STAT(tp) = alloc_scp(sc, SC_VTY(tp)); if (ISGRAPHSC(scp)) sc_set_pixel_mode(scp, NULL, 0, 0, 16, 8); } if (!tp->t_winsize.ws_col && !tp->t_winsize.ws_row) { tp->t_winsize.ws_col = scp->xsize; tp->t_winsize.ws_row = scp->ysize; } return (0); } static void sctty_close(struct tty *tp) { scr_stat *scp; int s; if (SC_VTY(tp) != SC_CONSOLECTL) { scp = sc_get_stat(tp); /* were we in the middle of the VT switching process? */ DPRINTF(5, ("sc%d: scclose(), ", scp->sc->unit)); s = spltty(); if ((scp == scp->sc->cur_scp) && (scp->sc->unit == sc_console_unit)) cnavailable(sc_consptr, TRUE); if (finish_vt_rel(scp, TRUE, &s) == 0) /* force release */ DPRINTF(5, ("reset WAIT_REL, ")); if (finish_vt_acq(scp) == 0) /* force acknowledge */ DPRINTF(5, ("reset WAIT_ACQ, ")); #ifdef not_yet_done if (scp == &main_console) { scp->pid = 0; scp->proc = NULL; scp->smode.mode = VT_AUTO; } else { sc_vtb_destroy(&scp->vtb); #ifndef __sparc64__ sc_vtb_destroy(&scp->scr); #endif sc_free_history_buffer(scp, scp->ysize); SC_STAT(tp) = NULL; free(scp, M_DEVBUF); } #else scp->pid = 0; scp->proc = NULL; scp->smode.mode = VT_AUTO; #endif scp->kbd_mode = K_XLATE; if (scp == scp->sc->cur_scp) (void)kbdd_ioctl(scp->sc->kbd, KDSKBMODE, (caddr_t)&scp->kbd_mode); DPRINTF(5, ("done.\n")); } } #if 0 /* XXX mpsafetty: fix screensaver. What about outwakeup? */ static int scread(struct cdev *dev, struct uio *uio, int flag) { if (!sc_saver_keyb_only) sc_touch_scrn_saver(); return ttyread(dev, uio, flag); } #endif static int sckbdevent(keyboard_t *thiskbd, int event, void *arg) { sc_softc_t *sc; struct tty *cur_tty; int c, error = 0; size_t len; const u_char *cp; sc = (sc_softc_t *)arg; /* assert(thiskbd == sc->kbd) */ mtx_lock(&Giant); switch (event) { case KBDIO_KEYINPUT: break; case KBDIO_UNLOADING: sc->kbd = NULL; sc->keyboard = -1; kbd_release(thiskbd, (void *)&sc->keyboard); goto done; default: error = EINVAL; goto done; } /* * Loop while there is still input to get from the keyboard. * I don't think this is nessesary, and it doesn't fix * the Xaccel-2.1 keyboard hang, but it can't hurt. 
XXX */ while ((c = scgetc(sc, SCGETC_NONBLOCK, NULL)) != NOKEY) { cur_tty = SC_DEV(sc, sc->cur_scp->index); if (!tty_opened_ns(cur_tty)) continue; if ((*sc->cur_scp->tsw->te_input)(sc->cur_scp, c, cur_tty)) continue; switch (KEYFLAGS(c)) { case 0x0000: /* normal key */ ttydisc_rint(cur_tty, KEYCHAR(c), 0); break; case FKEY: /* function key, return string */ cp = (*sc->cur_scp->tsw->te_fkeystr)(sc->cur_scp, c); if (cp != NULL) { ttydisc_rint_simple(cur_tty, cp, strlen(cp)); break; } cp = kbdd_get_fkeystr(thiskbd, KEYCHAR(c), &len); if (cp != NULL) ttydisc_rint_simple(cur_tty, cp, len); break; case MKEY: /* meta is active, prepend ESC */ ttydisc_rint(cur_tty, 0x1b, 0); ttydisc_rint(cur_tty, KEYCHAR(c), 0); break; case BKEY: /* backtab fixed sequence (esc [ Z) */ ttydisc_rint_simple(cur_tty, "\x1B[Z", 3); break; } ttydisc_rint_done(cur_tty); } sc->cur_scp->status |= MOUSE_HIDDEN; done: mtx_unlock(&Giant); return (error); } static int sctty_ioctl(struct tty *tp, u_long cmd, caddr_t data, struct thread *td) { int error; int i; struct cursor_attr *cap; sc_softc_t *sc; scr_stat *scp; int s; #if defined(COMPAT_FREEBSD6) || defined(COMPAT_FREEBSD5) || \ defined(COMPAT_FREEBSD4) || defined(COMPAT_43) int ival; #endif /* If there is a user_ioctl function call that first */ if (sc_user_ioctl) { error = (*sc_user_ioctl)(tp, cmd, data, td); if (error != ENOIOCTL) return error; } error = sc_vid_ioctl(tp, cmd, data, td); if (error != ENOIOCTL) return error; #ifndef SC_NO_HISTORY error = sc_hist_ioctl(tp, cmd, data, td); if (error != ENOIOCTL) return error; #endif #ifndef SC_NO_SYSMOUSE error = sc_mouse_ioctl(tp, cmd, data, td); if (error != ENOIOCTL) return error; #endif scp = sc_get_stat(tp); /* assert(scp != NULL) */ /* scp is sc_console, if SC_VTY(dev) == SC_CONSOLECTL. */ sc = scp->sc; if (scp->tsw) { error = (*scp->tsw->te_ioctl)(scp, tp, cmd, data, td); if (error != ENOIOCTL) return error; } switch (cmd) { /* process console hardware related ioctl's */ case GIO_ATTR: /* get current attributes */ /* this ioctl is not processed here, but in the terminal emulator */ return ENOTTY; case GIO_COLOR: /* is this a color console ? */ *(int *)data = (sc->adp->va_flags & V_ADP_COLOR) ? 
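	/*
	 * Usage sketch, not driver code: GIO_COLOR only reports whether the
	 * adapter is color-capable.  A hypothetical userland caller, with
	 * ttyfd an open /dev/ttyvN descriptor and <sys/consio.h> included,
	 * might check it like this:
	 *
	 *	int color;
	 *
	 *	if (ioctl(ttyfd, GIO_COLOR, &color) == 0)
	 *		printf("console is %s\n", color ? "color" : "mono");
	 */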
1 : 0; return 0; case CONS_BLANKTIME: /* set screen saver timeout (0 = no saver) */ if (*(int *)data < 0 || *(int *)data > MAX_BLANKTIME) return EINVAL; s = spltty(); scrn_blank_time = *(int *)data; run_scrn_saver = (scrn_blank_time != 0); splx(s); return 0; case CONS_CURSORTYPE: /* set cursor type (old interface + HIDDEN) */ s = spltty(); *(int *)data &= CONS_CURSOR_ATTRS; sc_change_cursor_shape(scp, *(int *)data, -1, -1); splx(s); return 0; case CONS_GETCURSORSHAPE: /* get cursor shape (new interface) */ switch (((int *)data)[0] & (CONS_DEFAULT_CURSOR | CONS_LOCAL_CURSOR)) { case 0: cap = &sc->curs_attr; break; case CONS_LOCAL_CURSOR: cap = &scp->base_curs_attr; break; case CONS_DEFAULT_CURSOR: cap = &sc->dflt_curs_attr; break; case CONS_DEFAULT_CURSOR | CONS_LOCAL_CURSOR: cap = &scp->dflt_curs_attr; break; } - ((int *)data)[1] = cap->base; - ((int *)data)[2] = cap->height; + if (((int *)data)[0] & CONS_CHARCURSOR_COLORS) { + ((int *)data)[1] = cap->bg[0]; + ((int *)data)[2] = cap->bg[1]; + } else if (((int *)data)[0] & CONS_MOUSECURSOR_COLORS) { + ((int *)data)[1] = cap->mouse_ba; + ((int *)data)[2] = cap->mouse_ia; + } else { + ((int *)data)[1] = cap->base; + ((int *)data)[2] = cap->height; + } ((int *)data)[0] = cap->flags; return 0; case CONS_SETCURSORSHAPE: /* set cursor shape (new interface) */ s = spltty(); sc_change_cursor_shape(scp, ((int *)data)[0], ((int *)data)[1], ((int *)data)[2]); splx(s); return 0; case CONS_BELLTYPE: /* set bell type sound/visual */ if ((*(int *)data) & CONS_VISUAL_BELL) sc->flags |= SC_VISUAL_BELL; else sc->flags &= ~SC_VISUAL_BELL; if ((*(int *)data) & CONS_QUIET_BELL) sc->flags |= SC_QUIET_BELL; else sc->flags &= ~SC_QUIET_BELL; return 0; case CONS_GETINFO: /* get current (virtual) console info */ { vid_info_t *ptr = (vid_info_t*)data; if (ptr->size == sizeof(struct vid_info)) { ptr->m_num = sc->cur_scp->index; ptr->font_size = scp->font_size; ptr->mv_col = scp->xpos; ptr->mv_row = scp->ypos; ptr->mv_csz = scp->xsize; ptr->mv_rsz = scp->ysize; ptr->mv_hsz = (scp->history != NULL) ? scp->history->vtb_rows : 0; /* * The following fields are filled by the terminal emulator. XXX * * ptr->mv_norm.fore * ptr->mv_norm.back * ptr->mv_rev.fore * ptr->mv_rev.back */ ptr->mv_grfc.fore = 0; /* not supported */ ptr->mv_grfc.back = 0; /* not supported */ ptr->mv_ovscan = scp->border; if (scp == sc->cur_scp) save_kbd_state(scp); ptr->mk_keylock = scp->status & LOCK_MASK; return 0; } return EINVAL; } case CONS_GETVERS: /* get version number */ *(int*)data = 0x200; /* version 2.0 */ return 0; case CONS_IDLE: /* see if the screen has been idle */ /* * When the screen is in the GRAPHICS_MODE or UNKNOWN_MODE, * the user process may have been writing something on the * screen and syscons is not aware of it. Declare the screen * is NOT idle if it is in one of these modes. But there is * an exception to it; if a screen saver is running in the * graphics mode in the current screen, we should say that the * screen has been idle. */ *(int *)data = (sc->flags & SC_SCRN_IDLE) && (!ISGRAPHSC(sc->cur_scp) || (sc->cur_scp->status & SAVER_RUNNING)); return 0; case CONS_SAVERMODE: /* set saver mode */ switch(*(int *)data) { case CONS_NO_SAVER: case CONS_USR_SAVER: /* if a LKM screen saver is running, stop it first. 
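	 *
	 * (Aside on the CONS_GETCURSORSHAPE change above -- a usage sketch
	 * only, with ttyfd a hypothetical open /dev/ttyvN descriptor: word 0
	 * of the three-int argument now selects which pair comes back in
	 * words 1 and 2.
	 *
	 *	int shape[3];
	 *
	 *	shape[0] = CONS_CHARCURSOR_COLORS;
	 *	if (ioctl(ttyfd, CONS_GETCURSORSHAPE, shape) == 0)
	 *		printf("char cursor colors: %d %d\n",
	 *		    shape[1], shape[2]);
	 *
	 * CONS_MOUSECURSOR_COLORS returns mouse_ba/mouse_ia instead, and
	 * with neither flag set the traditional base/height pair is
	 * returned, as before this change.)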
*/ scsplash_stick(FALSE); saver_mode = *(int *)data; s = spltty(); #ifdef DEV_SPLASH if ((error = wait_scrn_saver_stop(NULL))) { splx(s); return error; } #endif run_scrn_saver = TRUE; if (saver_mode == CONS_USR_SAVER) scp->status |= SAVER_RUNNING; else scp->status &= ~SAVER_RUNNING; scsplash_stick(TRUE); splx(s); break; case CONS_LKM_SAVER: s = spltty(); if ((saver_mode == CONS_USR_SAVER) && (scp->status & SAVER_RUNNING)) scp->status &= ~SAVER_RUNNING; saver_mode = *(int *)data; splx(s); break; default: return EINVAL; } return 0; case CONS_SAVERSTART: /* immediately start/stop the screen saver */ /* * Note that this ioctl does not guarantee the screen saver * actually starts or stops. It merely attempts to do so... */ s = spltty(); run_scrn_saver = (*(int *)data != 0); if (run_scrn_saver) sc->scrn_time_stamp -= scrn_blank_time; splx(s); return 0; case CONS_SCRSHOT: /* get a screen shot */ { int retval, hist_rsz; size_t lsize, csize; vm_offset_t frbp, hstp; unsigned lnum; scrshot_t *ptr = (scrshot_t *)data; void *outp = ptr->buf; if (ptr->x < 0 || ptr->y < 0 || ptr->xsize < 0 || ptr->ysize < 0) return EINVAL; s = spltty(); if (ISGRAPHSC(scp)) { splx(s); return EOPNOTSUPP; } hist_rsz = (scp->history != NULL) ? scp->history->vtb_rows : 0; if (((u_int)ptr->x + ptr->xsize) > scp->xsize || ((u_int)ptr->y + ptr->ysize) > (scp->ysize + hist_rsz)) { splx(s); return EINVAL; } lsize = scp->xsize * sizeof(u_int16_t); csize = ptr->xsize * sizeof(u_int16_t); /* Pointer to the last line of framebuffer */ frbp = scp->vtb.vtb_buffer + scp->ysize * lsize + ptr->x * sizeof(u_int16_t); /* Pointer to the last line of target buffer */ outp = (char *)outp + ptr->ysize * csize; /* Pointer to the last line of history buffer */ if (scp->history != NULL) hstp = scp->history->vtb_buffer + sc_vtb_tail(scp->history) * sizeof(u_int16_t) + ptr->x * sizeof(u_int16_t); else hstp = 0; retval = 0; for (lnum = 0; lnum < (ptr->y + ptr->ysize); lnum++) { if (lnum < scp->ysize) { frbp -= lsize; } else { hstp -= lsize; if (hstp < scp->history->vtb_buffer) hstp += scp->history->vtb_rows * lsize; frbp = hstp; } if (lnum < ptr->y) continue; outp = (char *)outp - csize; retval = copyout((void *)frbp, outp, csize); if (retval != 0) break; } splx(s); return retval; } case VT_SETMODE: /* set screen switcher mode */ { struct vt_mode *mode; struct proc *p1; mode = (struct vt_mode *)data; DPRINTF(5, ("%s%d: VT_SETMODE ", SC_DRIVER_NAME, sc->unit)); if (scp->smode.mode == VT_PROCESS) { p1 = pfind(scp->pid); if (scp->proc == p1 && scp->proc != td->td_proc) { if (p1) PROC_UNLOCK(p1); DPRINTF(5, ("error EPERM\n")); return EPERM; } if (p1) PROC_UNLOCK(p1); } s = spltty(); if (mode->mode == VT_AUTO) { scp->smode.mode = VT_AUTO; scp->proc = NULL; scp->pid = 0; DPRINTF(5, ("VT_AUTO, ")); if ((scp == sc->cur_scp) && (sc->unit == sc_console_unit)) cnavailable(sc_consptr, TRUE); /* were we in the middle of the vty switching process? 
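	 *
	 * (Aside on CONS_SCRSHOT above -- an illustrative sketch only, with
	 * ttyfd a hypothetical open /dev/ttyvN descriptor: the caller
	 * supplies a window origin/size and a buffer of xsize * ysize
	 * 16-bit cells, character in the low byte and attribute in the
	 * high byte.
	 *
	 *	scrshot_t ss;
	 *	u_int16_t buf[80 * 25];
	 *
	 *	ss.x = ss.y = 0;
	 *	ss.xsize = 80;
	 *	ss.ysize = 25;
	 *	ss.buf = buf;
	 *	if (ioctl(ttyfd, CONS_SCRSHOT, &ss) == 0)
	 *		... buf now holds a text-mode snapshot ...
	 *
	 * Negative coordinates, graphics-mode vtys and windows larger than
	 * the screen plus scrollback are rejected, as checked in the
	 * handler above.)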
*/ if (finish_vt_rel(scp, TRUE, &s) == 0) DPRINTF(5, ("reset WAIT_REL, ")); if (finish_vt_acq(scp) == 0) DPRINTF(5, ("reset WAIT_ACQ, ")); } else { if (!ISSIGVALID(mode->relsig) || !ISSIGVALID(mode->acqsig) || !ISSIGVALID(mode->frsig)) { splx(s); DPRINTF(5, ("error EINVAL\n")); return EINVAL; } DPRINTF(5, ("VT_PROCESS %d, ", td->td_proc->p_pid)); bcopy(data, &scp->smode, sizeof(struct vt_mode)); scp->proc = td->td_proc; scp->pid = scp->proc->p_pid; if ((scp == sc->cur_scp) && (sc->unit == sc_console_unit)) cnavailable(sc_consptr, FALSE); } splx(s); DPRINTF(5, ("\n")); return 0; } case VT_GETMODE: /* get screen switcher mode */ bcopy(&scp->smode, data, sizeof(struct vt_mode)); return 0; #if defined(COMPAT_FREEBSD6) || defined(COMPAT_FREEBSD5) || \ defined(COMPAT_FREEBSD4) || defined(COMPAT_43) case _IO('v', 4): ival = IOCPARM_IVAL(data); data = (caddr_t)&ival; /* FALLTHROUGH */ #endif case VT_RELDISP: /* screen switcher ioctl */ s = spltty(); /* * This must be the current vty which is in the VT_PROCESS * switching mode... */ if ((scp != sc->cur_scp) || (scp->smode.mode != VT_PROCESS)) { splx(s); return EINVAL; } /* ...and this process is controlling it. */ if (scp->proc != td->td_proc) { splx(s); return EPERM; } error = EINVAL; switch(*(int *)data) { case VT_FALSE: /* user refuses to release screen, abort */ if ((error = finish_vt_rel(scp, FALSE, &s)) == 0) DPRINTF(5, ("%s%d: VT_FALSE\n", SC_DRIVER_NAME, sc->unit)); break; case VT_TRUE: /* user has released screen, go on */ if ((error = finish_vt_rel(scp, TRUE, &s)) == 0) DPRINTF(5, ("%s%d: VT_TRUE\n", SC_DRIVER_NAME, sc->unit)); break; case VT_ACKACQ: /* acquire acknowledged, switch completed */ if ((error = finish_vt_acq(scp)) == 0) DPRINTF(5, ("%s%d: VT_ACKACQ\n", SC_DRIVER_NAME, sc->unit)); break; default: break; } splx(s); return error; case VT_OPENQRY: /* return free virtual console */ for (i = sc->first_vty; i < sc->first_vty + sc->vtys; i++) { tp = SC_DEV(sc, i); if (!tty_opened_ns(tp)) { *(int *)data = i + 1; return 0; } } return EINVAL; #if defined(COMPAT_FREEBSD6) || defined(COMPAT_FREEBSD5) || \ defined(COMPAT_FREEBSD4) || defined(COMPAT_43) case _IO('v', 5): ival = IOCPARM_IVAL(data); data = (caddr_t)&ival; /* FALLTHROUGH */ #endif case VT_ACTIVATE: /* switch to screen *data */ i = (*(int *)data == 0) ? scp->index : (*(int *)data - 1); s = spltty(); error = sc_clean_up(sc->cur_scp); splx(s); if (error) return error; error = sc_switch_scr(sc, i); return (error); #if defined(COMPAT_FREEBSD6) || defined(COMPAT_FREEBSD5) || \ defined(COMPAT_FREEBSD4) || defined(COMPAT_43) case _IO('v', 6): ival = IOCPARM_IVAL(data); data = (caddr_t)&ival; /* FALLTHROUGH */ #endif case VT_WAITACTIVE: /* wait for switch to occur */ i = (*(int *)data == 0) ? 
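	/*
	 * Usage sketch, not driver code: a program that wants to control
	 * switching away from its vty (an X server, say) puts the vty into
	 * VT_PROCESS mode and then answers the signals delivered by
	 * signal_vt_rel()/signal_vt_acq() with VT_RELDISP.  The signal
	 * numbers below are arbitrary examples:
	 *
	 *	struct vt_mode vm = { 0 };
	 *
	 *	vm.mode = VT_PROCESS;
	 *	vm.relsig = SIGUSR1;	release requested
	 *	vm.acqsig = SIGUSR2;	vty acquired again
	 *	vm.frsig = SIGUSR1;	must be a valid signal, see above
	 *	ioctl(ttyfd, VT_SETMODE, &vm);
	 *
	 *	on relsig:  ioctl(ttyfd, VT_RELDISP, VT_TRUE);	(or VT_FALSE to refuse)
	 *	on acqsig:  ioctl(ttyfd, VT_RELDISP, VT_ACKACQ);
	 */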
scp->index : (*(int *)data - 1); if ((i < sc->first_vty) || (i >= sc->first_vty + sc->vtys)) return EINVAL; if (i == sc->cur_scp->index) return 0; error = tsleep(VTY_WCHAN(sc, i), (PZERO + 1) | PCATCH, "waitvt", 0); return error; case VT_GETACTIVE: /* get active vty # */ *(int *)data = sc->cur_scp->index + 1; return 0; case VT_GETINDEX: /* get this vty # */ *(int *)data = scp->index + 1; return 0; case VT_LOCKSWITCH: /* prevent vty switching */ if ((*(int *)data) & 0x01) sc->flags |= SC_SCRN_VTYLOCK; else sc->flags &= ~SC_SCRN_VTYLOCK; return 0; case KDENABIO: /* allow io operations */ error = priv_check(td, PRIV_IO); if (error != 0) return error; error = securelevel_gt(td->td_ucred, 0); if (error != 0) return error; #ifdef __i386__ td->td_frame->tf_eflags |= PSL_IOPL; #elif defined(__amd64__) td->td_frame->tf_rflags |= PSL_IOPL; #endif return 0; case KDDISABIO: /* disallow io operations (default) */ #ifdef __i386__ td->td_frame->tf_eflags &= ~PSL_IOPL; #elif defined(__amd64__) td->td_frame->tf_rflags &= ~PSL_IOPL; #endif return 0; #if defined(COMPAT_FREEBSD6) || defined(COMPAT_FREEBSD5) || \ defined(COMPAT_FREEBSD4) || defined(COMPAT_43) case _IO('K', 20): ival = IOCPARM_IVAL(data); data = (caddr_t)&ival; /* FALLTHROUGH */ #endif case KDSKBSTATE: /* set keyboard state (locks) */ if (*(int *)data & ~LOCK_MASK) return EINVAL; scp->status &= ~LOCK_MASK; scp->status |= *(int *)data; if (scp == sc->cur_scp) update_kbd_state(scp, scp->status, LOCK_MASK); return 0; case KDGKBSTATE: /* get keyboard state (locks) */ if (scp == sc->cur_scp) save_kbd_state(scp); *(int *)data = scp->status & LOCK_MASK; return 0; case KDGETREPEAT: /* get keyboard repeat & delay rates */ case KDSETREPEAT: /* set keyboard repeat & delay rates (new) */ error = kbdd_ioctl(sc->kbd, cmd, data); if (error == ENOIOCTL) error = ENODEV; return error; #if defined(COMPAT_FREEBSD6) || defined(COMPAT_FREEBSD5) || \ defined(COMPAT_FREEBSD4) || defined(COMPAT_43) case _IO('K', 67): ival = IOCPARM_IVAL(data); data = (caddr_t)&ival; /* FALLTHROUGH */ #endif case KDSETRAD: /* set keyboard repeat & delay rates (old) */ if (*(int *)data & ~0x7f) return EINVAL; error = kbdd_ioctl(sc->kbd, KDSETRAD, data); if (error == ENOIOCTL) error = ENODEV; return error; #if defined(COMPAT_FREEBSD6) || defined(COMPAT_FREEBSD5) || \ defined(COMPAT_FREEBSD4) || defined(COMPAT_43) case _IO('K', 7): ival = IOCPARM_IVAL(data); data = (caddr_t)&ival; /* FALLTHROUGH */ #endif case KDSKBMODE: /* set keyboard mode */ switch (*(int *)data) { case K_XLATE: /* switch to XLT ascii mode */ case K_RAW: /* switch to RAW scancode mode */ case K_CODE: /* switch to CODE mode */ scp->kbd_mode = *(int *)data; if (scp == sc->cur_scp) (void)kbdd_ioctl(sc->kbd, KDSKBMODE, data); return 0; default: return EINVAL; } /* NOT REACHED */ case KDGKBMODE: /* get keyboard mode */ *(int *)data = scp->kbd_mode; return 0; case KDGKBINFO: error = kbdd_ioctl(sc->kbd, cmd, data); if (error == ENOIOCTL) error = ENODEV; return error; #if defined(COMPAT_FREEBSD6) || defined(COMPAT_FREEBSD5) || \ defined(COMPAT_FREEBSD4) || defined(COMPAT_43) case _IO('K', 8): ival = IOCPARM_IVAL(data); data = (caddr_t)&ival; /* FALLTHROUGH */ #endif case KDMKTONE: /* sound the bell */ if (*(int*)data) sc_bell(scp, (*(int*)data)&0xffff, (((*(int*)data)>>16)&0xffff)*hz/1000); else sc_bell(scp, scp->bell_pitch, scp->bell_duration); return 0; #if defined(COMPAT_FREEBSD6) || defined(COMPAT_FREEBSD5) || \ defined(COMPAT_FREEBSD4) || defined(COMPAT_43) case _IO('K', 63): ival = IOCPARM_IVAL(data); data = 
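	/*
	 * Descriptive note: KDMKTONE's int argument packs the tone as
	 * (duration_ms << 16) | pitch; the handler above converts the
	 * duration to clock ticks with hz / 1000, and a zero argument
	 * replays the vty's default bell_pitch/bell_duration.
	 */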
(caddr_t)&ival; /* FALLTHROUGH */ #endif case KIOCSOUND: /* make tone (*data) hz */ if (scp == sc->cur_scp) { if (*(int *)data) return sc_tone(*(int *)data); else return sc_tone(0); } return 0; case KDGKBTYPE: /* get keyboard type */ error = kbdd_ioctl(sc->kbd, cmd, data); if (error == ENOIOCTL) { /* always return something? XXX */ *(int *)data = 0; } return 0; #if defined(COMPAT_FREEBSD6) || defined(COMPAT_FREEBSD5) || \ defined(COMPAT_FREEBSD4) || defined(COMPAT_43) case _IO('K', 66): ival = IOCPARM_IVAL(data); data = (caddr_t)&ival; /* FALLTHROUGH */ #endif case KDSETLED: /* set keyboard LED status */ if (*(int *)data & ~LED_MASK) /* FIXME: LOCK_MASK? */ return EINVAL; scp->status &= ~LED_MASK; scp->status |= *(int *)data; if (scp == sc->cur_scp) update_kbd_leds(scp, scp->status); return 0; case KDGETLED: /* get keyboard LED status */ if (scp == sc->cur_scp) save_kbd_state(scp); *(int *)data = scp->status & LED_MASK; return 0; case KBADDKBD: /* add/remove keyboard to/from mux */ case KBRELKBD: error = kbdd_ioctl(sc->kbd, cmd, data); if (error == ENOIOCTL) error = ENODEV; return error; #if defined(COMPAT_FREEBSD6) || defined(COMPAT_FREEBSD5) || \ defined(COMPAT_FREEBSD4) || defined(COMPAT_43) case _IO('c', 110): ival = IOCPARM_IVAL(data); data = (caddr_t)&ival; /* FALLTHROUGH */ #endif case CONS_SETKBD: /* set the new keyboard */ { keyboard_t *newkbd; s = spltty(); newkbd = kbd_get_keyboard(*(int *)data); if (newkbd == NULL) { splx(s); return EINVAL; } error = 0; if (sc->kbd != newkbd) { i = kbd_allocate(newkbd->kb_name, newkbd->kb_unit, (void *)&sc->keyboard, sckbdevent, sc); /* i == newkbd->kb_index */ if (i >= 0) { if (sc->kbd != NULL) { save_kbd_state(sc->cur_scp); kbd_release(sc->kbd, (void *)&sc->keyboard); } sc->kbd = kbd_get_keyboard(i); /* sc->kbd == newkbd */ sc->keyboard = i; (void)kbdd_ioctl(sc->kbd, KDSKBMODE, (caddr_t)&sc->cur_scp->kbd_mode); update_kbd_state(sc->cur_scp, sc->cur_scp->status, LOCK_MASK); } else { error = EPERM; /* XXX */ } } splx(s); return error; } case CONS_RELKBD: /* release the current keyboard */ s = spltty(); error = 0; if (sc->kbd != NULL) { save_kbd_state(sc->cur_scp); error = kbd_release(sc->kbd, (void *)&sc->keyboard); if (error == 0) { sc->kbd = NULL; sc->keyboard = -1; } } splx(s); return error; case CONS_GETTERM: /* get the current terminal emulator info */ { sc_term_sw_t *sw; if (((term_info_t *)data)->ti_index == 0) { sw = scp->tsw; } else { sw = sc_term_match_by_number(((term_info_t *)data)->ti_index); } if (sw != NULL) { strncpy(((term_info_t *)data)->ti_name, sw->te_name, sizeof(((term_info_t *)data)->ti_name)); strncpy(((term_info_t *)data)->ti_desc, sw->te_desc, sizeof(((term_info_t *)data)->ti_desc)); ((term_info_t *)data)->ti_flags = 0; return 0; } else { ((term_info_t *)data)->ti_name[0] = '\0'; ((term_info_t *)data)->ti_desc[0] = '\0'; ((term_info_t *)data)->ti_flags = 0; return EINVAL; } } case CONS_SETTERM: /* set the current terminal emulator */ s = spltty(); error = sc_init_emulator(scp, ((term_info_t *)data)->ti_name); /* FIXME: what if scp == sc_console! 
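	 *
	 * (Aside on CONS_GETTERM above -- illustrative only, with ttyfd a
	 * hypothetical open /dev/ttyvN descriptor: ti_index 0 names the
	 * emulator currently attached to the vty, while non-zero values
	 * walk the registered emulators via sc_term_match_by_number().
	 *
	 *	term_info_t ti;
	 *
	 *	ti.ti_index = 0;
	 *	if (ioctl(ttyfd, CONS_GETTERM, &ti) == 0)
	 *		printf("%s (%s)\n", ti.ti_name, ti.ti_desc);
	 *
	 * CONS_SETTERM here takes the same structure and switches the vty
	 * to the emulator named in ti_name.)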
XXX */ splx(s); return error; case GIO_SCRNMAP: /* get output translation table */ bcopy(&sc->scr_map, data, sizeof(sc->scr_map)); return 0; case PIO_SCRNMAP: /* set output translation table */ bcopy(data, &sc->scr_map, sizeof(sc->scr_map)); for (i=0; iscr_map); i++) { sc->scr_rmap[sc->scr_map[i]] = i; } return 0; case GIO_KEYMAP: /* get keyboard translation table */ case PIO_KEYMAP: /* set keyboard translation table */ case OGIO_KEYMAP: /* get keyboard translation table (compat) */ case OPIO_KEYMAP: /* set keyboard translation table (compat) */ case GIO_DEADKEYMAP: /* get accent key translation table */ case PIO_DEADKEYMAP: /* set accent key translation table */ case GETFKEY: /* get function key string */ case SETFKEY: /* set function key string */ error = kbdd_ioctl(sc->kbd, cmd, data); if (error == ENOIOCTL) error = ENODEV; return error; #ifndef SC_NO_FONT_LOADING case PIO_FONT8x8: /* set 8x8 dot font */ if (!ISFONTAVAIL(sc->adp->va_flags)) return ENXIO; bcopy(data, sc->font_8, 8*256); sc->fonts_loaded |= FONT_8; /* * FONT KLUDGE * Always use the font page #0. XXX * Don't load if the current font size is not 8x8. */ if (ISTEXTSC(sc->cur_scp) && (sc->cur_scp->font_size < 14)) sc_load_font(sc->cur_scp, 0, 8, 8, sc->font_8, 0, 256); return 0; case GIO_FONT8x8: /* get 8x8 dot font */ if (!ISFONTAVAIL(sc->adp->va_flags)) return ENXIO; if (sc->fonts_loaded & FONT_8) { bcopy(sc->font_8, data, 8*256); return 0; } else return ENXIO; case PIO_FONT8x14: /* set 8x14 dot font */ if (!ISFONTAVAIL(sc->adp->va_flags)) return ENXIO; bcopy(data, sc->font_14, 14*256); sc->fonts_loaded |= FONT_14; /* * FONT KLUDGE * Always use the font page #0. XXX * Don't load if the current font size is not 8x14. */ if (ISTEXTSC(sc->cur_scp) && (sc->cur_scp->font_size >= 14) && (sc->cur_scp->font_size < 16)) sc_load_font(sc->cur_scp, 0, 14, 8, sc->font_14, 0, 256); return 0; case GIO_FONT8x14: /* get 8x14 dot font */ if (!ISFONTAVAIL(sc->adp->va_flags)) return ENXIO; if (sc->fonts_loaded & FONT_14) { bcopy(sc->font_14, data, 14*256); return 0; } else return ENXIO; case PIO_FONT8x16: /* set 8x16 dot font */ if (!ISFONTAVAIL(sc->adp->va_flags)) return ENXIO; bcopy(data, sc->font_16, 16*256); sc->fonts_loaded |= FONT_16; /* * FONT KLUDGE * Always use the font page #0. XXX * Don't load if the current font size is not 8x16. */ if (ISTEXTSC(sc->cur_scp) && (sc->cur_scp->font_size >= 16)) sc_load_font(sc->cur_scp, 0, 16, 8, sc->font_16, 0, 256); return 0; case GIO_FONT8x16: /* get 8x16 dot font */ if (!ISFONTAVAIL(sc->adp->va_flags)) return ENXIO; if (sc->fonts_loaded & FONT_16) { bcopy(sc->font_16, data, 16*256); return 0; } else return ENXIO; #endif /* SC_NO_FONT_LOADING */ default: break; } return (ENOIOCTL); } static int consolectl_ioctl(struct cdev *dev, u_long cmd, caddr_t data, int fflag, struct thread *td) { return sctty_ioctl(dev->si_drv1, cmd, data, td); } static int consolectl_close(struct cdev *dev, int flags, int mode, struct thread *td) { #ifndef SC_NO_SYSMOUSE mouse_info_t info; memset(&info, 0, sizeof(info)); info.operation = MOUSE_ACTION; /* * Make sure all buttons are released when moused and other * console daemons exit, so that no buttons are left pressed. 
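	 * The zeroed mouse_info_t with operation MOUSE_ACTION below reports
	 * "no buttons, no movement", so forwarding it through CONS_MOUSECTL
	 * clears any button state a dying daemon may have left behind.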
*/ (void) sctty_ioctl(dev->si_drv1, CONS_MOUSECTL, (caddr_t)&info, td); #endif return (0); } static void sc_cnprobe(struct consdev *cp) { int unit; int flags; if (!vty_enabled(VTY_SC)) { cp->cn_pri = CN_DEAD; return; } cp->cn_pri = sc_get_cons_priority(&unit, &flags); /* a video card is always required */ if (!scvidprobe(unit, flags, TRUE)) cp->cn_pri = CN_DEAD; /* syscons will become console even when there is no keyboard */ sckbdprobe(unit, flags, TRUE); if (cp->cn_pri == CN_DEAD) return; /* initialize required fields */ strcpy(cp->cn_name, "ttyv0"); } static void sc_cninit(struct consdev *cp) { int unit; int flags; sc_get_cons_priority(&unit, &flags); scinit(unit, flags | SC_KERNEL_CONSOLE); sc_console_unit = unit; sc_console = sc_get_stat(sc_get_softc(unit, SC_KERNEL_CONSOLE)->dev[0]); sc_consptr = cp; } static void sc_cnterm(struct consdev *cp) { void *ts; int i; /* we are not the kernel console any more, release everything */ if (sc_console_unit < 0) return; /* shouldn't happen */ #if 0 /* XXX */ sc_clear_screen(sc_console); sccnupdate(sc_console); #endif if (sc_ktsw != NULL) { for (i = 0; i <= mp_maxid; i++) { ts = sc_kts[i]; sc_kts[i] = NULL; (*sc_ktsw->te_term)(sc_console, &ts); free(ts, M_DEVBUF); } sc_ktsw = NULL; } scterm(sc_console_unit, SC_KERNEL_CONSOLE); sc_console_unit = -1; sc_console = NULL; } static void sccnclose(sc_softc_t *sc, struct sc_cnstate *sp); static int sc_cngetc_locked(struct sc_cnstate *sp); static void sccnkbdlock(sc_softc_t *sc, struct sc_cnstate *sp); static void sccnkbdunlock(sc_softc_t *sc, struct sc_cnstate *sp); static void sccnopen(sc_softc_t *sc, struct sc_cnstate *sp, int flags); static void sccnscrlock(sc_softc_t *sc, struct sc_cnstate *sp); static void sccnscrunlock(sc_softc_t *sc, struct sc_cnstate *sp); static void sccnkbdlock(sc_softc_t *sc, struct sc_cnstate *sp) { /* * Locking method: hope for the best. * The keyboard is supposed to be Giant locked. We can't handle that * in general. The kdb_active case here is not safe, and we will * proceed without the lock in all cases. */ sp->kbd_locked = !kdb_active && mtx_trylock(&Giant); } static void sccnkbdunlock(sc_softc_t *sc, struct sc_cnstate *sp) { if (sp->kbd_locked) mtx_unlock(&Giant); sp->kbd_locked = FALSE; } static void sccnscrlock(sc_softc_t *sc, struct sc_cnstate *sp) { int retries; /** * Locking method: * - if kdb_active and video_mtx is not owned by anyone, then lock * by kdb remaining active * - if !kdb_active, try to acquire video_mtx without blocking or * recursing; if we get it then it works normally. * Note that video_mtx is especially unusable if we already own it, * since then it is protecting something and syscons is not reentrant * enough to ignore the protection even in the kdb_active case. */ if (kdb_active) { sp->kdb_locked = sc->video_mtx.mtx_lock == MTX_UNOWNED || panicstr; sp->mtx_locked = FALSE; } else { sp->kdb_locked = FALSE; for (retries = 0; retries < 1000; retries++) { sp->mtx_locked = mtx_trylock_spin_flags(&sc->video_mtx, MTX_QUIET) != 0 || panicstr; if (sp->mtx_locked) break; DELAY(1); } } } static void sccnscrunlock(sc_softc_t *sc, struct sc_cnstate *sp) { if (sp->mtx_locked) mtx_unlock_spin(&sc->video_mtx); sp->mtx_locked = sp->kdb_locked = FALSE; } static void sccnopen(sc_softc_t *sc, struct sc_cnstate *sp, int flags) { int kbd_mode; /* assert(sc_console_unit >= 0) */ sp->kbd_opened = FALSE; sp->scr_opened = FALSE; sp->kbd_locked = FALSE; /* Opening the keyboard is optional. 
*/ if (!(flags & 1) || sc->kbd == NULL) goto over_keyboard; sccnkbdlock(sc, sp); /* * Make sure the keyboard is accessible even when the kbd device * driver is disabled. */ kbdd_enable(sc->kbd); /* Switch the keyboard to console mode (K_XLATE, polled) on all scp's. */ kbd_mode = K_XLATE; (void)kbdd_ioctl(sc->kbd, KDSKBMODE, (caddr_t)&kbd_mode); sc->kbd_open_level++; kbdd_poll(sc->kbd, TRUE); sp->kbd_opened = TRUE; over_keyboard: ; /* The screen is opened iff locking it succeeds. */ sccnscrlock(sc, sp); if (!sp->kdb_locked && !sp->mtx_locked) return; sp->scr_opened = TRUE; /* The screen switch is optional. */ if (!(flags & 2)) return; /* try to switch to the kernel console screen */ if (!cold && sc->cur_scp->index != sc_console->index && sc->cur_scp->smode.mode == VT_AUTO && sc_console->smode.mode == VT_AUTO) sc_switch_scr(sc, sc_console->index); } static void sccnclose(sc_softc_t *sc, struct sc_cnstate *sp) { sp->scr_opened = FALSE; sccnscrunlock(sc, sp); if (!sp->kbd_opened) return; /* Restore keyboard mode (for the current, possibly-changed scp). */ kbdd_poll(sc->kbd, FALSE); if (--sc->kbd_open_level == 0) (void)kbdd_ioctl(sc->kbd, KDSKBMODE, (caddr_t)&sc->cur_scp->kbd_mode); kbdd_disable(sc->kbd); sp->kbd_opened = FALSE; sccnkbdunlock(sc, sp); } /* * Grabbing switches the screen and keyboard focus to sc_console and the * keyboard mode to (K_XLATE, polled). Only switching to polled mode is * essential (for preventing the interrupt handler from eating input * between polls). Focus is part of the UI, and the other switches are * work just was well when they are done on every entry and exit. * * Screen switches while grabbed are supported, and to maintain focus for * this ungrabbing and closing only restore the polling state and then * the keyboard mode if on the original screen. */ static void sc_cngrab(struct consdev *cp) { sc_softc_t *sc; int lev; sc = sc_console->sc; lev = atomic_fetchadd_int(&sc->grab_level, 1); if (lev >= 0 && lev < 2) { sccnopen(sc, &sc->grab_state[lev], 1 | 2); sccnscrunlock(sc, &sc->grab_state[lev]); sccnkbdunlock(sc, &sc->grab_state[lev]); } } static void sc_cnungrab(struct consdev *cp) { sc_softc_t *sc; int lev; sc = sc_console->sc; lev = atomic_load_acq_int(&sc->grab_level) - 1; if (lev >= 0 && lev < 2) { sccnkbdlock(sc, &sc->grab_state[lev]); sccnscrlock(sc, &sc->grab_state[lev]); sccnclose(sc, &sc->grab_state[lev]); } atomic_add_int(&sc->grab_level, -1); } static char sc_cnputc_log[0x1000]; static u_int sc_cnputc_loghead; static u_int sc_cnputc_logtail; static void sc_cnputc(struct consdev *cd, int c) { struct sc_cnstate st; u_char buf[1]; scr_stat *scp = sc_console; void *oldts, *ts; struct sc_term_sw *oldtsw; #ifndef SC_NO_HISTORY #if 0 struct tty *tp; #endif #endif /* !SC_NO_HISTORY */ u_int head; int s; /* assert(sc_console != NULL) */ sccnopen(scp->sc, &st, 0); /* * Log the output. * * In the unlocked case, the logging is intentionally only * perfectly atomic for the indexes. */ head = atomic_fetchadd_int(&sc_cnputc_loghead, 1); sc_cnputc_log[head % sizeof(sc_cnputc_log)] = c; /* * If we couldn't open, do special reentrant output and return to defer * normal output. 
*/ if (!st.scr_opened) { ec_putc(c); return; } #ifndef SC_NO_HISTORY if (scp == scp->sc->cur_scp && scp->status & SLKED) { scp->status &= ~SLKED; update_kbd_state(scp, scp->status, SLKED); if (scp->status & BUFFER_SAVED) { if (!sc_hist_restore(scp)) sc_remove_cutmarking(scp); scp->status &= ~BUFFER_SAVED; scp->status |= CURSOR_ENABLED; sc_draw_cursor_image(scp); } #if 0 /* * XXX: Now that TTY's have their own locks, we cannot process * any data after disabling scroll lock. cnputs already holds a * spinlock. */ tp = SC_DEV(scp->sc, scp->index); /* XXX "tp" can be NULL */ tty_lock(tp); if (tty_opened(tp)) sctty_outwakeup(tp); tty_unlock(tp); #endif } #endif /* !SC_NO_HISTORY */ /* Play any output still in the log (our char may already be done). */ while (sc_cnputc_logtail != atomic_load_acq_int(&sc_cnputc_loghead)) { buf[0] = sc_cnputc_log[sc_cnputc_logtail++ % sizeof(sc_cnputc_log)]; if (atomic_load_acq_int(&sc_cnputc_loghead) - sc_cnputc_logtail >= sizeof(sc_cnputc_log)) continue; /* Console output has a per-CPU "input" state. Switch for it. */ oldtsw = scp->tsw; oldts = scp->ts; ts = sc_kts[PCPU_GET(cpuid)]; if (ts != NULL) { scp->tsw = sc_ktsw; scp->ts = ts; (*scp->tsw->te_sync)(scp); } sc_puts(scp, buf, 1); scp->tsw = oldtsw; scp->ts = oldts; (*scp->tsw->te_sync)(scp); } s = spltty(); /* block sckbdevent and scrn_timer */ sccnupdate(scp); splx(s); sccnclose(scp->sc, &st); } static int sc_cngetc(struct consdev *cd) { struct sc_cnstate st; int c, s; /* assert(sc_console != NULL) */ sccnopen(sc_console->sc, &st, 1); s = spltty(); /* block sckbdevent and scrn_timer while we poll */ if (!st.kbd_opened) { splx(s); sccnclose(sc_console->sc, &st); return -1; /* means no keyboard since we fudged the locking */ } c = sc_cngetc_locked(&st); splx(s); sccnclose(sc_console->sc, &st); return c; } static int sc_cngetc_locked(struct sc_cnstate *sp) { static struct fkeytab fkey; static int fkeycp; scr_stat *scp; const u_char *p; int c; /* * Stop the screen saver and update the screen if necessary. * What if we have been running in the screen saver code... XXX */ if (sp->scr_opened) sc_touch_scrn_saver(); scp = sc_console->sc->cur_scp; /* XXX */ if (sp->scr_opened) sccnupdate(scp); if (fkeycp < fkey.len) return fkey.str[fkeycp++]; c = scgetc(scp->sc, SCGETC_CN | SCGETC_NONBLOCK, sp); switch (KEYFLAGS(c)) { case 0: /* normal char */ return KEYCHAR(c); case FKEY: /* function key */ p = (*scp->tsw->te_fkeystr)(scp, c); if (p != NULL) { fkey.len = strlen(p); bcopy(p, fkey.str, fkey.len); fkeycp = 1; return fkey.str[0]; } p = kbdd_get_fkeystr(scp->sc->kbd, KEYCHAR(c), (size_t *)&fkeycp); fkey.len = fkeycp; if ((p != NULL) && (fkey.len > 0)) { bcopy(p, fkey.str, fkey.len); fkeycp = 1; return fkey.str[0]; } return c; /* XXX */ case NOKEY: case ERRKEY: default: return -1; } /* NOT REACHED */ } static void sccnupdate(scr_stat *scp) { /* this is a cut-down version of scrn_timer()... */ if (suspend_in_progress || scp->sc->font_loading_in_progress) return; if (kdb_active || panicstr || shutdown_in_progress) { sc_touch_scrn_saver(); } else if (scp != scp->sc->cur_scp) { return; } if (!run_scrn_saver) scp->sc->flags &= ~SC_SCRN_IDLE; #ifdef DEV_SPLASH if ((saver_mode != CONS_LKM_SAVER) || !(scp->sc->flags & SC_SCRN_IDLE)) if (scp->sc->flags & SC_SCRN_BLANKED) stop_scrn_saver(scp->sc, current_saver); #endif if (scp != scp->sc->cur_scp || scp->sc->blink_in_progress || scp->sc->switch_in_progress) return; /* * FIXME: unlike scrn_timer(), we call scrn_update() from here even * when write_in_progress is non-zero. 
XXX */ if (!ISGRAPHSC(scp) && !(scp->sc->flags & SC_SCRN_BLANKED)) scrn_update(scp, TRUE); } static void scrn_timer(void *arg) { static time_t kbd_time_stamp = 0; sc_softc_t *sc; scr_stat *scp; int again, rate; again = (arg != NULL); if (arg != NULL) sc = (sc_softc_t *)arg; else if (sc_console != NULL) sc = sc_console->sc; else return; /* find the vty to update */ scp = sc->cur_scp; /* don't do anything when we are performing some I/O operations */ if (suspend_in_progress || sc->font_loading_in_progress) goto done; if ((sc->kbd == NULL) && (sc->config & SC_AUTODETECT_KBD)) { /* try to allocate a keyboard automatically */ if (kbd_time_stamp != time_uptime) { kbd_time_stamp = time_uptime; sc->keyboard = sc_allocate_keyboard(sc, -1); if (sc->keyboard >= 0) { sc->kbd = kbd_get_keyboard(sc->keyboard); (void)kbdd_ioctl(sc->kbd, KDSKBMODE, (caddr_t)&sc->cur_scp->kbd_mode); update_kbd_state(sc->cur_scp, sc->cur_scp->status, LOCK_MASK); } } } /* should we stop the screen saver? */ if (kdb_active || panicstr || shutdown_in_progress) sc_touch_scrn_saver(); if (run_scrn_saver) { if (time_uptime > sc->scrn_time_stamp + scrn_blank_time) sc->flags |= SC_SCRN_IDLE; else sc->flags &= ~SC_SCRN_IDLE; } else { sc->scrn_time_stamp = time_uptime; sc->flags &= ~SC_SCRN_IDLE; if (scrn_blank_time > 0) run_scrn_saver = TRUE; } #ifdef DEV_SPLASH if ((saver_mode != CONS_LKM_SAVER) || !(sc->flags & SC_SCRN_IDLE)) if (sc->flags & SC_SCRN_BLANKED) stop_scrn_saver(sc, current_saver); #endif /* should we just return ? */ if (sc->blink_in_progress || sc->switch_in_progress || sc->write_in_progress) goto done; /* Update the screen */ scp = sc->cur_scp; /* cur_scp may have changed... */ if (!ISGRAPHSC(scp) && !(sc->flags & SC_SCRN_BLANKED)) scrn_update(scp, TRUE); #ifdef DEV_SPLASH /* should we activate the screen saver? */ if ((saver_mode == CONS_LKM_SAVER) && (sc->flags & SC_SCRN_IDLE)) if (!ISGRAPHSC(scp) || (sc->flags & SC_SCRN_BLANKED)) (*current_saver)(sc, TRUE); #endif done: if (again) { /* * Use reduced "refresh" rate if we are in graphics and that is not a * graphical screen saver. In such case we just have nothing to do. 
*/ if (ISGRAPHSC(scp) && !(sc->flags & SC_SCRN_BLANKED)) rate = 2; else rate = 30; callout_reset_sbt(&sc->ctimeout, SBT_1S / rate, 0, scrn_timer, sc, C_PREL(1)); } } static int and_region(int *s1, int *e1, int s2, int e2) { if (*e1 < s2 || e2 < *s1) return FALSE; *s1 = imax(*s1, s2); *e1 = imin(*e1, e2); return TRUE; } static void scrn_update(scr_stat *scp, int show_cursor) { int start; int end; int s; int e; /* assert(scp == scp->sc->cur_scp) */ SC_VIDEO_LOCK(scp->sc); #ifndef SC_NO_CUTPASTE /* remove the previous mouse pointer image if necessary */ if (scp->status & MOUSE_VISIBLE) { s = scp->mouse_pos; e = scp->mouse_pos + scp->xsize + 1; if ((scp->status & (MOUSE_MOVED | MOUSE_HIDDEN)) || and_region(&s, &e, scp->start, scp->end) || ((scp->status & CURSOR_ENABLED) && (scp->cursor_pos != scp->cursor_oldpos) && (and_region(&s, &e, scp->cursor_pos, scp->cursor_pos) || and_region(&s, &e, scp->cursor_oldpos, scp->cursor_oldpos)))) { sc_remove_mouse_image(scp); if (scp->end >= scp->xsize*scp->ysize) scp->end = scp->xsize*scp->ysize - 1; } } #endif /* !SC_NO_CUTPASTE */ #if 1 /* debug: XXX */ if (scp->end >= scp->xsize*scp->ysize) { printf("scrn_update(): scp->end %d > size_of_screen!!\n", scp->end); scp->end = scp->xsize*scp->ysize - 1; } if (scp->start < 0) { printf("scrn_update(): scp->start %d < 0\n", scp->start); scp->start = 0; } #endif /* update screen image */ if (scp->start <= scp->end) { if (scp->mouse_cut_end >= 0) { /* there is a marked region for cut & paste */ if (scp->mouse_cut_start <= scp->mouse_cut_end) { start = scp->mouse_cut_start; end = scp->mouse_cut_end; } else { start = scp->mouse_cut_end; end = scp->mouse_cut_start - 1; } s = start; e = end; /* does the cut-mark region overlap with the update region? */ if (and_region(&s, &e, scp->start, scp->end)) { (*scp->rndr->draw)(scp, s, e - s + 1, TRUE); s = 0; e = start - 1; if (and_region(&s, &e, scp->start, scp->end)) (*scp->rndr->draw)(scp, s, e - s + 1, FALSE); s = end + 1; e = scp->xsize*scp->ysize - 1; if (and_region(&s, &e, scp->start, scp->end)) (*scp->rndr->draw)(scp, s, e - s + 1, FALSE); } else { (*scp->rndr->draw)(scp, scp->start, scp->end - scp->start + 1, FALSE); } } else { (*scp->rndr->draw)(scp, scp->start, scp->end - scp->start + 1, FALSE); } } /* we are not to show the cursor and the mouse pointer... */ if (!show_cursor) { scp->end = 0; scp->start = scp->xsize*scp->ysize - 1; SC_VIDEO_UNLOCK(scp->sc); return; } /* update cursor image */ if (scp->status & CURSOR_ENABLED) { s = scp->start; e = scp->end; /* did cursor move since last time ? */ if (scp->cursor_pos != scp->cursor_oldpos) { /* do we need to remove old cursor image ? 
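	 *
	 * (and_region() narrows [*s1, *e1] to its intersection with
	 * [s2, e2] and returns FALSE when the ranges do not overlap;
	 * e.g. s = 5, e = 20 intersected with [10, 30] leaves [10, 20],
	 * while [25, 30] returns FALSE and leaves s and e alone.)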
*/ if (!and_region(&s, &e, scp->cursor_oldpos, scp->cursor_oldpos)) sc_remove_cursor_image(scp); sc_draw_cursor_image(scp); } else { if (and_region(&s, &e, scp->cursor_pos, scp->cursor_pos)) /* cursor didn't move, but has been overwritten */ sc_draw_cursor_image(scp); else if (scp->curs_attr.flags & CONS_BLINK_CURSOR) /* if it's a blinking cursor, update it */ (*scp->rndr->blink_cursor)(scp, scp->cursor_pos, sc_inside_cutmark(scp, scp->cursor_pos)); } } #ifndef SC_NO_CUTPASTE /* update "pseudo" mouse pointer image */ if (scp->sc->flags & SC_MOUSE_ENABLED) { if (!(scp->status & (MOUSE_VISIBLE | MOUSE_HIDDEN))) { scp->status &= ~MOUSE_MOVED; sc_draw_mouse_image(scp); } } #endif /* SC_NO_CUTPASTE */ scp->end = 0; scp->start = scp->xsize*scp->ysize - 1; SC_VIDEO_UNLOCK(scp->sc); } #ifdef DEV_SPLASH static int scsplash_callback(int event, void *arg) { sc_softc_t *sc; int error; sc = (sc_softc_t *)arg; switch (event) { case SPLASH_INIT: if (add_scrn_saver(scsplash_saver) == 0) { sc->flags &= ~SC_SAVER_FAILED; run_scrn_saver = TRUE; if (cold && !(boothowto & RB_VERBOSE)) { scsplash_stick(TRUE); (*current_saver)(sc, TRUE); } } return 0; case SPLASH_TERM: if (current_saver == scsplash_saver) { scsplash_stick(FALSE); error = remove_scrn_saver(scsplash_saver); if (error) return error; } return 0; default: return EINVAL; } } static void scsplash_saver(sc_softc_t *sc, int show) { static int busy = FALSE; scr_stat *scp; if (busy) return; busy = TRUE; scp = sc->cur_scp; if (show) { if (!(sc->flags & SC_SAVER_FAILED)) { if (!(sc->flags & SC_SCRN_BLANKED)) set_scrn_saver_mode(scp, -1, NULL, 0); switch (splash(sc->adp, TRUE)) { case 0: /* succeeded */ break; case EAGAIN: /* try later */ restore_scrn_saver_mode(scp, FALSE); sc_touch_scrn_saver(); /* XXX */ break; default: sc->flags |= SC_SAVER_FAILED; scsplash_stick(FALSE); restore_scrn_saver_mode(scp, TRUE); printf("scsplash_saver(): failed to put up the image\n"); break; } } } else if (!sticky_splash) { if ((sc->flags & SC_SCRN_BLANKED) && (splash(sc->adp, FALSE) == 0)) restore_scrn_saver_mode(scp, TRUE); } busy = FALSE; } static int add_scrn_saver(void (*this_saver)(sc_softc_t *, int)) { #if 0 int error; if (current_saver != none_saver) { error = remove_scrn_saver(current_saver); if (error) return error; } #endif if (current_saver != none_saver) return EBUSY; run_scrn_saver = FALSE; saver_mode = CONS_LKM_SAVER; current_saver = this_saver; return 0; } static int remove_scrn_saver(void (*this_saver)(sc_softc_t *, int)) { if (current_saver != this_saver) return EINVAL; #if 0 /* * In order to prevent `current_saver' from being called by * the timeout routine `scrn_timer()' while we manipulate * the saver list, we shall set `current_saver' to `none_saver' * before stopping the current saver, rather than blocking by `splXX()'. 
*/ current_saver = none_saver; if (scrn_blanked) stop_scrn_saver(this_saver); #endif /* unblank all blanked screens */ wait_scrn_saver_stop(NULL); if (scrn_blanked) return EBUSY; current_saver = none_saver; return 0; } static int set_scrn_saver_mode(scr_stat *scp, int mode, u_char *pal, int border) { int s; /* assert(scp == scp->sc->cur_scp) */ s = spltty(); if (!ISGRAPHSC(scp)) sc_remove_cursor_image(scp); scp->splash_save_mode = scp->mode; scp->splash_save_status = scp->status & (GRAPHICS_MODE | PIXEL_MODE); scp->status &= ~(GRAPHICS_MODE | PIXEL_MODE); scp->status |= (UNKNOWN_MODE | SAVER_RUNNING); scp->sc->flags |= SC_SCRN_BLANKED; ++scrn_blanked; splx(s); if (mode < 0) return 0; scp->mode = mode; if (set_mode(scp) == 0) { if (scp->sc->adp->va_info.vi_flags & V_INFO_GRAPHICS) scp->status |= GRAPHICS_MODE; #ifndef SC_NO_PALETTE_LOADING if (pal != NULL) vidd_load_palette(scp->sc->adp, pal); #endif sc_set_border(scp, border); return 0; } else { s = spltty(); scp->mode = scp->splash_save_mode; scp->status &= ~(UNKNOWN_MODE | SAVER_RUNNING); scp->status |= scp->splash_save_status; splx(s); return 1; } } static int restore_scrn_saver_mode(scr_stat *scp, int changemode) { int mode; int status; int s; /* assert(scp == scp->sc->cur_scp) */ s = spltty(); mode = scp->mode; status = scp->status; scp->mode = scp->splash_save_mode; scp->status &= ~(UNKNOWN_MODE | SAVER_RUNNING); scp->status |= scp->splash_save_status; scp->sc->flags &= ~SC_SCRN_BLANKED; if (!changemode) { if (!ISGRAPHSC(scp)) sc_draw_cursor_image(scp); --scrn_blanked; splx(s); return 0; } if (set_mode(scp) == 0) { #ifndef SC_NO_PALETTE_LOADING #ifdef SC_PIXEL_MODE if (scp->sc->adp->va_info.vi_mem_model == V_INFO_MM_DIRECT) vidd_load_palette(scp->sc->adp, scp->sc->palette2); else #endif vidd_load_palette(scp->sc->adp, scp->sc->palette); #endif --scrn_blanked; splx(s); return 0; } else { scp->mode = mode; scp->status = status; splx(s); return 1; } } static void stop_scrn_saver(sc_softc_t *sc, void (*saver)(sc_softc_t *, int)) { (*saver)(sc, FALSE); run_scrn_saver = FALSE; /* the screen saver may have chosen not to stop after all... */ if (sc->flags & SC_SCRN_BLANKED) return; mark_all(sc->cur_scp); if (sc->delayed_next_scr) sc_switch_scr(sc, sc->delayed_next_scr - 1); if (!kdb_active) wakeup(&scrn_blanked); } static int wait_scrn_saver_stop(sc_softc_t *sc) { int error = 0; while (scrn_blanked > 0) { run_scrn_saver = FALSE; if (sc && !(sc->flags & SC_SCRN_BLANKED)) { error = 0; break; } error = tsleep(&scrn_blanked, PZERO | PCATCH, "scrsav", 0); if ((error != 0) && (error != ERESTART)) break; } run_scrn_saver = FALSE; return error; } #endif /* DEV_SPLASH */ void sc_touch_scrn_saver(void) { scsplash_stick(FALSE); run_scrn_saver = FALSE; } int sc_switch_scr(sc_softc_t *sc, u_int next_scr) { scr_stat *cur_scp; struct tty *tp; struct proc *p; int s; DPRINTF(5, ("sc0: sc_switch_scr() %d ", next_scr + 1)); if (sc->cur_scp == NULL) return (0); /* prevent switch if previously requested */ if (sc->flags & SC_SCRN_VTYLOCK) { sc_bell(sc->cur_scp, sc->cur_scp->bell_pitch, sc->cur_scp->bell_duration); return EPERM; } /* delay switch if the screen is blanked or being updated */ if ((sc->flags & SC_SCRN_BLANKED) || sc->write_in_progress || sc->blink_in_progress) { sc->delayed_next_scr = next_scr + 1; sc_touch_scrn_saver(); DPRINTF(5, ("switch delayed\n")); return 0; } sc->delayed_next_scr = 0; s = spltty(); cur_scp = sc->cur_scp; /* we are in the middle of the vty switching process... 
*/ if (sc->switch_in_progress && (cur_scp->smode.mode == VT_PROCESS) && cur_scp->proc) { p = pfind(cur_scp->pid); if (cur_scp->proc != p) { if (p) PROC_UNLOCK(p); /* * The controlling process has died!!. Do some clean up. * NOTE:`cur_scp->proc' and `cur_scp->smode.mode' * are not reset here yet; they will be cleared later. */ DPRINTF(5, ("cur_scp controlling process %d died, ", cur_scp->pid)); if (cur_scp->status & SWITCH_WAIT_REL) { /* * Force the previous switch to finish, but return now * with error. */ DPRINTF(5, ("reset WAIT_REL, ")); finish_vt_rel(cur_scp, TRUE, &s); splx(s); DPRINTF(5, ("finishing previous switch\n")); return EINVAL; } else if (cur_scp->status & SWITCH_WAIT_ACQ) { /* let's assume screen switch has been completed. */ DPRINTF(5, ("reset WAIT_ACQ, ")); finish_vt_acq(cur_scp); } else { /* * We are in between screen release and acquisition, and * reached here via scgetc() or scrn_timer() which has * interrupted exchange_scr(). Don't do anything stupid. */ DPRINTF(5, ("waiting nothing, ")); } } else { if (p) PROC_UNLOCK(p); /* * The controlling process is alive, but not responding... * It is either buggy or it may be just taking time. * The following code is a gross kludge to cope with this * problem for which there is no clean solution. XXX */ if (cur_scp->status & SWITCH_WAIT_REL) { switch (sc->switch_in_progress++) { case 1: break; case 2: DPRINTF(5, ("sending relsig again, ")); signal_vt_rel(cur_scp); break; case 3: break; case 4: default: /* * Act as if the controlling program returned * VT_FALSE. */ DPRINTF(5, ("force reset WAIT_REL, ")); finish_vt_rel(cur_scp, FALSE, &s); splx(s); DPRINTF(5, ("act as if VT_FALSE was seen\n")); return EINVAL; } } else if (cur_scp->status & SWITCH_WAIT_ACQ) { switch (sc->switch_in_progress++) { case 1: break; case 2: DPRINTF(5, ("sending acqsig again, ")); signal_vt_acq(cur_scp); break; case 3: break; case 4: default: /* clear the flag and finish the previous switch */ DPRINTF(5, ("force reset WAIT_ACQ, ")); finish_vt_acq(cur_scp); break; } } } } /* * Return error if an invalid argument is given, or vty switch * is still in progress. */ if ((next_scr < sc->first_vty) || (next_scr >= sc->first_vty + sc->vtys) || sc->switch_in_progress) { splx(s); sc_bell(cur_scp, bios_value.bell_pitch, BELL_DURATION); DPRINTF(5, ("error 1\n")); return EINVAL; } /* * Don't allow switching away from the graphics mode vty * if the switch mode is VT_AUTO, unless the next vty is the same * as the current or the current vty has been closed (but showing). */ tp = SC_DEV(sc, cur_scp->index); if ((cur_scp->index != next_scr) && tty_opened_ns(tp) && (cur_scp->smode.mode == VT_AUTO) && ISGRAPHSC(cur_scp)) { splx(s); sc_bell(cur_scp, bios_value.bell_pitch, BELL_DURATION); DPRINTF(5, ("error, graphics mode\n")); return EINVAL; } /* * Is the wanted vty open? Don't allow switching to a closed vty. * If we are in DDB, don't switch to a vty in the VT_PROCESS mode. * Note that we always allow the user to switch to the kernel * console even if it is closed. */ if ((sc_console == NULL) || (next_scr != sc_console->index)) { tp = SC_DEV(sc, next_scr); if (!tty_opened_ns(tp)) { splx(s); sc_bell(cur_scp, bios_value.bell_pitch, BELL_DURATION); DPRINTF(5, ("error 2, requested vty isn't open!\n")); return EINVAL; } if (kdb_active && SC_STAT(tp)->smode.mode == VT_PROCESS) { splx(s); DPRINTF(5, ("error 3, requested vty is in the VT_PROCESS mode\n")); return EINVAL; } } /* this is the start of vty switching process... 
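	 * The sequence from here is: signal_vt_rel() asks the old vty's
	 * controlling process (if any) to give up the screen and waits for
	 * its VT_RELDISP(VT_TRUE); do_switch_scr()/exchange_scr() then swap
	 * the video and keyboard state over to the new vty; finally
	 * signal_vt_acq() notifies the new vty's controlling process, which
	 * completes the switch with VT_RELDISP(VT_ACKACQ).  finish_vt_rel()
	 * and finish_vt_acq() below are where those ioctls land.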
*/ ++sc->switch_in_progress; sc->old_scp = cur_scp; sc->new_scp = sc_get_stat(SC_DEV(sc, next_scr)); if (sc->new_scp == sc->old_scp) { sc->switch_in_progress = 0; /* * XXX wakeup() locks the scheduler lock which will hang if * the lock is in an in-between state, e.g., when we stop at * a breakpoint at fork_exit. It has always been wrong to call * wakeup() when the debugger is active. In RELENG_4, wakeup() * is supposed to be locked by splhigh(), but the debugger may * be invoked at splhigh(). */ if (!kdb_active) wakeup(VTY_WCHAN(sc,next_scr)); splx(s); DPRINTF(5, ("switch done (new == old)\n")); return 0; } /* has controlling process died? */ vt_proc_alive(sc->old_scp); vt_proc_alive(sc->new_scp); /* wait for the controlling process to release the screen, if necessary */ if (signal_vt_rel(sc->old_scp)) { splx(s); return 0; } /* go set up the new vty screen */ splx(s); exchange_scr(sc); s = spltty(); /* wake up processes waiting for this vty */ if (!kdb_active) wakeup(VTY_WCHAN(sc,next_scr)); /* wait for the controlling process to acknowledge, if necessary */ if (signal_vt_acq(sc->cur_scp)) { splx(s); return 0; } sc->switch_in_progress = 0; if (sc->unit == sc_console_unit) cnavailable(sc_consptr, TRUE); splx(s); DPRINTF(5, ("switch done\n")); return 0; } static int do_switch_scr(sc_softc_t *sc, int s) { vt_proc_alive(sc->new_scp); splx(s); exchange_scr(sc); s = spltty(); /* sc->cur_scp == sc->new_scp */ wakeup(VTY_WCHAN(sc,sc->cur_scp->index)); /* wait for the controlling process to acknowledge, if necessary */ if (!signal_vt_acq(sc->cur_scp)) { sc->switch_in_progress = 0; if (sc->unit == sc_console_unit) cnavailable(sc_consptr, TRUE); } return s; } static int vt_proc_alive(scr_stat *scp) { struct proc *p; if (scp->proc) { if ((p = pfind(scp->pid)) != NULL) PROC_UNLOCK(p); if (scp->proc == p) return TRUE; scp->proc = NULL; scp->smode.mode = VT_AUTO; DPRINTF(5, ("vt controlling process %d died\n", scp->pid)); } return FALSE; } static int signal_vt_rel(scr_stat *scp) { if (scp->smode.mode != VT_PROCESS) return FALSE; scp->status |= SWITCH_WAIT_REL; PROC_LOCK(scp->proc); kern_psignal(scp->proc, scp->smode.relsig); PROC_UNLOCK(scp->proc); DPRINTF(5, ("sending relsig to %d\n", scp->pid)); return TRUE; } static int signal_vt_acq(scr_stat *scp) { if (scp->smode.mode != VT_PROCESS) return FALSE; if (scp->sc->unit == sc_console_unit) cnavailable(sc_consptr, FALSE); scp->status |= SWITCH_WAIT_ACQ; PROC_LOCK(scp->proc); kern_psignal(scp->proc, scp->smode.acqsig); PROC_UNLOCK(scp->proc); DPRINTF(5, ("sending acqsig to %d\n", scp->pid)); return TRUE; } static int finish_vt_rel(scr_stat *scp, int release, int *s) { if (scp == scp->sc->old_scp && scp->status & SWITCH_WAIT_REL) { scp->status &= ~SWITCH_WAIT_REL; if (release) *s = do_switch_scr(scp->sc, *s); else scp->sc->switch_in_progress = 0; return 0; } return EINVAL; } static int finish_vt_acq(scr_stat *scp) { if (scp == scp->sc->new_scp && scp->status & SWITCH_WAIT_ACQ) { scp->status &= ~SWITCH_WAIT_ACQ; scp->sc->switch_in_progress = 0; return 0; } return EINVAL; } static void exchange_scr(sc_softc_t *sc) { scr_stat *scp; /* save the current state of video and keyboard */ sc_move_cursor(sc->old_scp, sc->old_scp->xpos, sc->old_scp->ypos); if (!ISGRAPHSC(sc->old_scp)) sc_remove_cursor_image(sc->old_scp); if (sc->old_scp->kbd_mode == K_XLATE) save_kbd_state(sc->old_scp); /* set up the video for the new screen */ scp = sc->cur_scp = sc->new_scp; if (sc->old_scp->mode != scp->mode || ISUNKNOWNSC(sc->old_scp)) set_mode(scp); #ifndef __sparc64__ else 
sc_vtb_init(&scp->scr, VTB_FRAMEBUFFER, scp->xsize, scp->ysize, (void *)sc->adp->va_window, FALSE); #endif scp->status |= MOUSE_HIDDEN; sc_move_cursor(scp, scp->xpos, scp->ypos); if (!ISGRAPHSC(scp)) sc_set_cursor_image(scp); #ifndef SC_NO_PALETTE_LOADING if (ISGRAPHSC(sc->old_scp)) { #ifdef SC_PIXEL_MODE if (sc->adp->va_info.vi_mem_model == V_INFO_MM_DIRECT) vidd_load_palette(sc->adp, sc->palette2); else #endif vidd_load_palette(sc->adp, sc->palette); } #endif sc_set_border(scp, scp->border); /* set up the keyboard for the new screen */ if (sc->kbd_open_level == 0 && sc->old_scp->kbd_mode != scp->kbd_mode) (void)kbdd_ioctl(sc->kbd, KDSKBMODE, (caddr_t)&scp->kbd_mode); update_kbd_state(scp, scp->status, LOCK_MASK); mark_all(scp); } static void sc_puts(scr_stat *scp, u_char *buf, int len) { #ifdef DEV_SPLASH /* make screensaver happy */ if (!sticky_splash && scp == scp->sc->cur_scp && !sc_saver_keyb_only) run_scrn_saver = FALSE; #endif if (scp->tsw) (*scp->tsw->te_puts)(scp, buf, len); if (scp->sc->delayed_next_scr) sc_switch_scr(scp->sc, scp->sc->delayed_next_scr - 1); } void sc_draw_cursor_image(scr_stat *scp) { /* assert(scp == scp->sc->cur_scp); */ SC_VIDEO_LOCK(scp->sc); (*scp->rndr->draw_cursor)(scp, scp->cursor_pos, scp->curs_attr.flags & CONS_BLINK_CURSOR, TRUE, sc_inside_cutmark(scp, scp->cursor_pos)); scp->cursor_oldpos = scp->cursor_pos; SC_VIDEO_UNLOCK(scp->sc); } void sc_remove_cursor_image(scr_stat *scp) { /* assert(scp == scp->sc->cur_scp); */ SC_VIDEO_LOCK(scp->sc); (*scp->rndr->draw_cursor)(scp, scp->cursor_oldpos, scp->curs_attr.flags & CONS_BLINK_CURSOR, FALSE, sc_inside_cutmark(scp, scp->cursor_oldpos)); SC_VIDEO_UNLOCK(scp->sc); } static void update_cursor_image(scr_stat *scp) { /* assert(scp == scp->sc->cur_scp); */ sc_remove_cursor_image(scp); sc_set_cursor_image(scp); sc_draw_cursor_image(scp); } void sc_set_cursor_image(scr_stat *scp) { scp->curs_attr = scp->base_curs_attr; if (scp->curs_attr.flags & CONS_HIDDEN_CURSOR) { /* hidden cursor is internally represented as zero-height underline */ scp->curs_attr.flags = CONS_CHAR_CURSOR; scp->curs_attr.base = scp->curs_attr.height = 0; } else if (scp->curs_attr.flags & CONS_CHAR_CURSOR) { scp->curs_attr.base = imin(scp->base_curs_attr.base, scp->font_size - 1); scp->curs_attr.height = imin(scp->base_curs_attr.height, scp->font_size - scp->curs_attr.base); } else { /* block cursor */ scp->curs_attr.base = 0; scp->curs_attr.height = scp->font_size; } /* assert(scp == scp->sc->cur_scp); */ SC_VIDEO_LOCK(scp->sc); (*scp->rndr->set_cursor)(scp, scp->curs_attr.base, scp->curs_attr.height, scp->curs_attr.flags & CONS_BLINK_CURSOR); SC_VIDEO_UNLOCK(scp->sc); } static void sc_adjust_ca(struct cursor_attr *cap, int flags, int base, int height) { - if (0) { - /* Dummy clause to avoid changing indentation later. 
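	 *
	 * (Aside on the change below -- illustrative only: the new
	 * CONS_CHARCURSOR_COLORS / CONS_MOUSECURSOR_COLORS flags reuse the
	 * base/height words of CONS_SETCURSORSHAPE as a pair of colors.
	 * A hypothetical caller, with ttyfd an open /dev/ttyvN descriptor,
	 * could set the character-cursor colors for just that vty with:
	 *
	 *	int shape[3];
	 *
	 *	shape[0] = CONS_CHARCURSOR_COLORS | CONS_LOCAL_CURSOR;
	 *	shape[1] = FG_RED;
	 *	shape[2] = FG_BLUE;
	 *	ioctl(ttyfd, CONS_SETCURSORSHAPE, shape);
	 *
	 * sc_change_cursor_shape() routes the request here, where the pair
	 * lands in cap->bg[0] and cap->bg[1].)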
*/ + if (flags & CONS_CHARCURSOR_COLORS) { + cap->bg[0] = base & 0xff; + cap->bg[1] = height & 0xff; + } else if (flags & CONS_MOUSECURSOR_COLORS) { + cap->mouse_ba = base & 0xff; + cap->mouse_ia = height & 0xff; } else { if (base >= 0) cap->base = base; if (height >= 0) cap->height = height; if (!(flags & CONS_SHAPEONLY_CURSOR)) cap->flags = flags & CONS_CURSOR_ATTRS; } } static void change_cursor_shape(scr_stat *scp, int flags, int base, int height) { if ((scp == scp->sc->cur_scp) && !ISGRAPHSC(scp)) sc_remove_cursor_image(scp); if (flags & CONS_RESET_CURSOR) scp->base_curs_attr = scp->dflt_curs_attr; else if (flags & CONS_DEFAULT_CURSOR) { sc_adjust_ca(&scp->dflt_curs_attr, flags, base, height); scp->base_curs_attr = scp->dflt_curs_attr; } else sc_adjust_ca(&scp->base_curs_attr, flags, base, height); if ((scp == scp->sc->cur_scp) && !ISGRAPHSC(scp)) { sc_set_cursor_image(scp); sc_draw_cursor_image(scp); } } void sc_change_cursor_shape(scr_stat *scp, int flags, int base, int height) { sc_softc_t *sc; struct tty *tp; int s; int i; if (flags == -1) flags = CONS_SHAPEONLY_CURSOR; s = spltty(); if (flags & CONS_LOCAL_CURSOR) { /* local (per vty) change */ change_cursor_shape(scp, flags, base, height); splx(s); return; } /* global change */ sc = scp->sc; if (flags & CONS_RESET_CURSOR) sc->curs_attr = sc->dflt_curs_attr; else if (flags & CONS_DEFAULT_CURSOR) { sc_adjust_ca(&sc->dflt_curs_attr, flags, base, height); sc->curs_attr = sc->dflt_curs_attr; } else sc_adjust_ca(&sc->curs_attr, flags, base, height); for (i = sc->first_vty; i < sc->first_vty + sc->vtys; ++i) { if ((tp = SC_DEV(sc, i)) == NULL) continue; if ((scp = sc_get_stat(tp)) == NULL) continue; scp->dflt_curs_attr = sc->curs_attr; change_cursor_shape(scp, CONS_RESET_CURSOR, -1, -1); } splx(s); } static void scinit(int unit, int flags) { /* * When syscons is being initialized as the kernel console, malloc() * is not yet functional, because various kernel structures has not been * fully initialized yet. Therefore, we need to declare the following * static buffers for the console. This is less than ideal, * but is necessry evil for the time being. XXX */ static u_short sc_buffer[ROW*COL]; /* XXX */ #ifndef SC_NO_FONT_LOADING static u_char font_8[256*8]; static u_char font_14[256*14]; static u_char font_16[256*16]; #endif sc_softc_t *sc; scr_stat *scp; video_adapter_t *adp; int col; int row; int i; /* one time initialization */ if (init_done == COLD) { sc_get_bios_values(&bios_value); for (i = 0; i < nitems(sc_kattrtab); i++) { #if SC_KERNEL_CONS_ATTR == FG_WHITE sc_kattrtab[i] = 8 + (i + FG_WHITE) % 8U; #else sc_kattrtab[i] = SC_KERNEL_CONS_ATTR; #endif } } init_done = WARM; /* * Allocate resources. Even if we are being called for the second * time, we must allocate them again, because they might have * disappeared... 
*/ sc = sc_get_softc(unit, flags & SC_KERNEL_CONSOLE); if ((sc->flags & SC_INIT_DONE) == 0) SC_VIDEO_LOCKINIT(sc); adp = NULL; if (sc->adapter >= 0) { vid_release(sc->adp, (void *)&sc->adapter); adp = sc->adp; sc->adp = NULL; } if (sc->keyboard >= 0) { DPRINTF(5, ("sc%d: releasing kbd%d\n", unit, sc->keyboard)); i = kbd_release(sc->kbd, (void *)&sc->keyboard); DPRINTF(5, ("sc%d: kbd_release returned %d\n", unit, i)); if (sc->kbd != NULL) { DPRINTF(5, ("sc%d: kbd != NULL!, index:%d, unit:%d, flags:0x%x\n", unit, sc->kbd->kb_index, sc->kbd->kb_unit, sc->kbd->kb_flags)); } sc->kbd = NULL; } sc->adapter = vid_allocate("*", unit, (void *)&sc->adapter); sc->adp = vid_get_adapter(sc->adapter); /* assert((sc->adapter >= 0) && (sc->adp != NULL)) */ sc->keyboard = sc_allocate_keyboard(sc, unit); DPRINTF(1, ("sc%d: keyboard %d\n", unit, sc->keyboard)); sc->kbd = kbd_get_keyboard(sc->keyboard); if (sc->kbd != NULL) { DPRINTF(1, ("sc%d: kbd index:%d, unit:%d, flags:0x%x\n", unit, sc->kbd->kb_index, sc->kbd->kb_unit, sc->kbd->kb_flags)); } if (!(sc->flags & SC_INIT_DONE) || (adp != sc->adp)) { sc->initial_mode = sc->adp->va_initial_mode; #ifndef SC_NO_FONT_LOADING if (flags & SC_KERNEL_CONSOLE) { sc->font_8 = font_8; sc->font_14 = font_14; sc->font_16 = font_16; } else if (sc->font_8 == NULL) { /* assert(sc_malloc) */ sc->font_8 = malloc(sizeof(font_8), M_DEVBUF, M_WAITOK); sc->font_14 = malloc(sizeof(font_14), M_DEVBUF, M_WAITOK); sc->font_16 = malloc(sizeof(font_16), M_DEVBUF, M_WAITOK); } #endif /* extract the hardware cursor location and hide the cursor for now */ vidd_read_hw_cursor(sc->adp, &col, &row); vidd_set_hw_cursor(sc->adp, -1, -1); /* set up the first console */ sc->first_vty = unit*MAXCONS; sc->vtys = MAXCONS; /* XXX: should be configurable */ if (flags & SC_KERNEL_CONSOLE) { /* * Set up devs structure but don't use it yet, calling make_dev() * might panic kernel. Wait for sc_attach_unit() to actually * create the devices. */ sc->dev = main_devs; scp = &main_console; init_scp(sc, sc->first_vty, scp); sc_vtb_init(&scp->vtb, VTB_MEMORY, scp->xsize, scp->ysize, (void *)sc_buffer, FALSE); if (sc_init_emulator(scp, SC_DFLT_TERM)) sc_init_emulator(scp, "*"); (*scp->tsw->te_default_attr)(scp, SC_KERNEL_CONS_ATTR, SC_KERNEL_CONS_REV_ATTR); } else { /* assert(sc_malloc) */ sc->dev = malloc(sizeof(struct tty *)*sc->vtys, M_DEVBUF, M_WAITOK|M_ZERO); sc->dev[0] = sc_alloc_tty(0, unit * MAXCONS); scp = alloc_scp(sc, sc->first_vty); SC_STAT(sc->dev[0]) = scp; } sc->cur_scp = scp; #ifndef __sparc64__ /* copy screen to temporary buffer */ sc_vtb_init(&scp->scr, VTB_FRAMEBUFFER, scp->xsize, scp->ysize, (void *)scp->sc->adp->va_window, FALSE); if (ISTEXTSC(scp)) sc_vtb_copy(&scp->scr, 0, &scp->vtb, 0, scp->xsize*scp->ysize); #endif /* Sync h/w cursor position to s/w (sc and teken). 
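	 * col and row were read back from the adapter by
	 * vidd_read_hw_cursor() above, so after the clamping below, kernel
	 * output resumes at whatever position the firmware or loader left
	 * the hardware cursor.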
*/ if (col >= scp->xsize) col = 0; if (row >= scp->ysize) row = scp->ysize - 1; scp->xpos = col; scp->ypos = row; scp->cursor_pos = scp->cursor_oldpos = row*scp->xsize + col; (*scp->tsw->te_sync)(scp); sc->dflt_curs_attr.base = 0; sc->dflt_curs_attr.height = howmany(scp->font_size, 8); sc->dflt_curs_attr.flags = 0; + sc->dflt_curs_attr.bg[0] = FG_RED; + sc->dflt_curs_attr.bg[1] = FG_LIGHTGREY; + sc->dflt_curs_attr.bg[2] = FG_BLUE; + sc->dflt_curs_attr.mouse_ba = FG_WHITE; + sc->dflt_curs_attr.mouse_ia = FG_RED; sc->curs_attr = sc->dflt_curs_attr; scp->base_curs_attr = scp->dflt_curs_attr = sc->curs_attr; + scp->curs_attr = scp->base_curs_attr; #ifndef SC_NO_SYSMOUSE sc_mouse_move(scp, scp->xpixel/2, scp->ypixel/2); #endif if (!ISGRAPHSC(scp)) { sc_set_cursor_image(scp); sc_draw_cursor_image(scp); } /* save font and palette */ #ifndef SC_NO_FONT_LOADING sc->fonts_loaded = 0; if (ISFONTAVAIL(sc->adp->va_flags)) { #ifdef SC_DFLT_FONT bcopy(dflt_font_8, sc->font_8, sizeof(dflt_font_8)); bcopy(dflt_font_14, sc->font_14, sizeof(dflt_font_14)); bcopy(dflt_font_16, sc->font_16, sizeof(dflt_font_16)); sc->fonts_loaded = FONT_16 | FONT_14 | FONT_8; if (scp->font_size < 14) { sc_load_font(scp, 0, 8, 8, sc->font_8, 0, 256); } else if (scp->font_size >= 16) { sc_load_font(scp, 0, 16, 8, sc->font_16, 0, 256); } else { sc_load_font(scp, 0, 14, 8, sc->font_14, 0, 256); } #else /* !SC_DFLT_FONT */ if (scp->font_size < 14) { sc_save_font(scp, 0, 8, 8, sc->font_8, 0, 256); sc->fonts_loaded = FONT_8; } else if (scp->font_size >= 16) { sc_save_font(scp, 0, 16, 8, sc->font_16, 0, 256); sc->fonts_loaded = FONT_16; } else { sc_save_font(scp, 0, 14, 8, sc->font_14, 0, 256); sc->fonts_loaded = FONT_14; } #endif /* SC_DFLT_FONT */ /* FONT KLUDGE: always use the font page #0. XXX */ sc_show_font(scp, 0); } #endif /* !SC_NO_FONT_LOADING */ #ifndef SC_NO_PALETTE_LOADING vidd_save_palette(sc->adp, sc->palette); #ifdef SC_PIXEL_MODE for (i = 0; i < sizeof(sc->palette2); i++) sc->palette2[i] = i / 3; #endif #endif #ifdef DEV_SPLASH if (!(sc->flags & SC_SPLASH_SCRN)) { /* we are ready to put up the splash image! 
*/ splash_init(sc->adp, scsplash_callback, sc); sc->flags |= SC_SPLASH_SCRN; } #endif } /* the rest is not necessary, if we have done it once */ if (sc->flags & SC_INIT_DONE) return; /* initialize mapscrn arrays to a one to one map */ for (i = 0; i < sizeof(sc->scr_map); i++) sc->scr_map[i] = sc->scr_rmap[i] = i; sc->flags |= SC_INIT_DONE; } static void scterm(int unit, int flags) { sc_softc_t *sc; scr_stat *scp; sc = sc_get_softc(unit, flags & SC_KERNEL_CONSOLE); if (sc == NULL) return; /* shouldn't happen */ #ifdef DEV_SPLASH /* this console is no longer available for the splash screen */ if (sc->flags & SC_SPLASH_SCRN) { splash_term(sc->adp); sc->flags &= ~SC_SPLASH_SCRN; } #endif #if 0 /* XXX */ /* move the hardware cursor to the upper-left corner */ vidd_set_hw_cursor(sc->adp, 0, 0); #endif /* release the keyboard and the video card */ if (sc->keyboard >= 0) kbd_release(sc->kbd, &sc->keyboard); if (sc->adapter >= 0) vid_release(sc->adp, &sc->adapter); /* stop the terminal emulator, if any */ scp = sc_get_stat(sc->dev[0]); if (scp->tsw) (*scp->tsw->te_term)(scp, &scp->ts); mtx_destroy(&sc->video_mtx); /* clear the structure */ if (!(flags & SC_KERNEL_CONSOLE)) { free(scp->ts, M_DEVBUF); /* XXX: We need delete_dev() for this */ free(sc->dev, M_DEVBUF); #if 0 /* XXX: We need a ttyunregister for this */ free(sc->tty, M_DEVBUF); #endif #ifndef SC_NO_FONT_LOADING free(sc->font_8, M_DEVBUF); free(sc->font_14, M_DEVBUF); free(sc->font_16, M_DEVBUF); #endif /* XXX vtb, history */ } bzero(sc, sizeof(*sc)); sc->keyboard = -1; sc->adapter = -1; } static void scshutdown(__unused void *arg, __unused int howto) { KASSERT(sc_console != NULL, ("sc_console != NULL")); KASSERT(sc_console->sc != NULL, ("sc_console->sc != NULL")); KASSERT(sc_console->sc->cur_scp != NULL, ("sc_console->sc->cur_scp != NULL")); sc_touch_scrn_saver(); if (!cold && sc_console->sc->cur_scp->index != sc_console->index && sc_console->sc->cur_scp->smode.mode == VT_AUTO && sc_console->smode.mode == VT_AUTO) sc_switch_scr(sc_console->sc, sc_console->index); shutdown_in_progress = TRUE; } static void scsuspend(__unused void *arg) { int retry; KASSERT(sc_console != NULL, ("sc_console != NULL")); KASSERT(sc_console->sc != NULL, ("sc_console->sc != NULL")); KASSERT(sc_console->sc->cur_scp != NULL, ("sc_console->sc->cur_scp != NULL")); sc_susp_scr = sc_console->sc->cur_scp->index; if (sc_no_suspend_vtswitch || sc_susp_scr == sc_console->index) { sc_touch_scrn_saver(); sc_susp_scr = -1; return; } for (retry = 0; retry < 10; retry++) { sc_switch_scr(sc_console->sc, sc_console->index); if (!sc_console->sc->switch_in_progress) break; pause("scsuspend", hz); } suspend_in_progress = TRUE; } static void scresume(__unused void *arg) { KASSERT(sc_console != NULL, ("sc_console != NULL")); KASSERT(sc_console->sc != NULL, ("sc_console->sc != NULL")); KASSERT(sc_console->sc->cur_scp != NULL, ("sc_console->sc->cur_scp != NULL")); suspend_in_progress = FALSE; if (sc_susp_scr < 0) { update_font(sc_console->sc->cur_scp); return; } sc_switch_scr(sc_console->sc, sc_susp_scr); } int sc_clean_up(scr_stat *scp) { #ifdef DEV_SPLASH int error; #endif if (scp->sc->flags & SC_SCRN_BLANKED) { sc_touch_scrn_saver(); #ifdef DEV_SPLASH if ((error = wait_scrn_saver_stop(scp->sc))) return error; #endif } scp->status |= MOUSE_HIDDEN; sc_remove_mouse_image(scp); sc_remove_cutmarking(scp); return 0; } void sc_alloc_scr_buffer(scr_stat *scp, int wait, int discard) { sc_vtb_t new; sc_vtb_t old; old = scp->vtb; sc_vtb_init(&new, VTB_MEMORY, scp->xsize, scp->ysize, NULL, 
wait); if (!discard && (old.vtb_flags & VTB_VALID)) { /* retain the current cursor position and buffer contants */ scp->cursor_oldpos = scp->cursor_pos; /* * This works only if the old buffer has the same size as or larger * than the new one. XXX */ sc_vtb_copy(&old, 0, &new, 0, scp->xsize*scp->ysize); scp->vtb = new; } else { scp->vtb = new; sc_vtb_destroy(&old); } #ifndef SC_NO_SYSMOUSE /* move the mouse cursor at the center of the screen */ sc_mouse_move(scp, scp->xpixel / 2, scp->ypixel / 2); #endif } static scr_stat *alloc_scp(sc_softc_t *sc, int vty) { scr_stat *scp; /* assert(sc_malloc) */ scp = (scr_stat *)malloc(sizeof(scr_stat), M_DEVBUF, M_WAITOK); init_scp(sc, vty, scp); sc_alloc_scr_buffer(scp, TRUE, TRUE); if (sc_init_emulator(scp, SC_DFLT_TERM)) sc_init_emulator(scp, "*"); #ifndef SC_NO_CUTPASTE sc_alloc_cut_buffer(scp, TRUE); #endif #ifndef SC_NO_HISTORY sc_alloc_history_buffer(scp, 0, 0, TRUE); #endif return scp; } static void init_scp(sc_softc_t *sc, int vty, scr_stat *scp) { video_info_t info; bzero(scp, sizeof(*scp)); scp->index = vty; scp->sc = sc; scp->status = 0; scp->mode = sc->initial_mode; vidd_get_info(sc->adp, scp->mode, &info); if (info.vi_flags & V_INFO_GRAPHICS) { scp->status |= GRAPHICS_MODE; scp->xpixel = info.vi_width; scp->ypixel = info.vi_height; scp->xsize = info.vi_width/info.vi_cwidth; scp->ysize = info.vi_height/info.vi_cheight; scp->font_size = 0; scp->font = NULL; } else { scp->xsize = info.vi_width; scp->ysize = info.vi_height; scp->xpixel = scp->xsize*info.vi_cwidth; scp->ypixel = scp->ysize*info.vi_cheight; } scp->font_size = info.vi_cheight; scp->font_width = info.vi_cwidth; #ifndef SC_NO_FONT_LOADING if (info.vi_cheight < 14) scp->font = sc->font_8; else if (info.vi_cheight >= 16) scp->font = sc->font_16; else scp->font = sc->font_14; #else scp->font = NULL; #endif sc_vtb_init(&scp->vtb, VTB_MEMORY, 0, 0, NULL, FALSE); #ifndef __sparc64__ sc_vtb_init(&scp->scr, VTB_FRAMEBUFFER, 0, 0, NULL, FALSE); #endif scp->xoff = scp->yoff = 0; scp->xpos = scp->ypos = 0; scp->start = scp->xsize * scp->ysize - 1; scp->end = 0; scp->tsw = NULL; scp->ts = NULL; scp->rndr = NULL; scp->border = (SC_NORM_ATTR >> 4) & 0x0f; scp->base_curs_attr = scp->dflt_curs_attr = sc->curs_attr; scp->mouse_cut_start = scp->xsize*scp->ysize; scp->mouse_cut_end = -1; scp->mouse_signal = 0; scp->mouse_pid = 0; scp->mouse_proc = NULL; scp->kbd_mode = K_XLATE; scp->bell_pitch = bios_value.bell_pitch; scp->bell_duration = BELL_DURATION; scp->status |= (bios_value.shift_state & NLKED); scp->status |= CURSOR_ENABLED | MOUSE_HIDDEN; scp->pid = 0; scp->proc = NULL; scp->smode.mode = VT_AUTO; scp->history = NULL; scp->history_pos = 0; scp->history_size = 0; } int sc_init_emulator(scr_stat *scp, char *name) { sc_term_sw_t *sw; sc_rndr_sw_t *rndr; void *p; int error; if (name == NULL) /* if no name is given, use the current emulator */ sw = scp->tsw; else /* ...otherwise find the named emulator */ sw = sc_term_match(name); if (sw == NULL) return EINVAL; rndr = NULL; if (strcmp(sw->te_renderer, "*") != 0) { rndr = sc_render_match(scp, sw->te_renderer, scp->status & (GRAPHICS_MODE | PIXEL_MODE)); } if (rndr == NULL) { rndr = sc_render_match(scp, scp->sc->adp->va_name, scp->status & (GRAPHICS_MODE | PIXEL_MODE)); if (rndr == NULL) return ENODEV; } if (sw == scp->tsw) { error = (*sw->te_init)(scp, &scp->ts, SC_TE_WARM_INIT); scp->rndr = rndr; scp->rndr->init(scp); sc_clear_screen(scp); /* assert(error == 0); */ return error; } if (sc_malloc && (sw->te_size > 0)) p = malloc(sw->te_size, 
M_DEVBUF, M_NOWAIT); else p = NULL; error = (*sw->te_init)(scp, &p, SC_TE_COLD_INIT); if (error) return error; if (scp->tsw) (*scp->tsw->te_term)(scp, &scp->ts); if (scp->ts != NULL) free(scp->ts, M_DEVBUF); scp->tsw = sw; scp->ts = p; scp->rndr = rndr; scp->rndr->init(scp); (*sw->te_default_attr)(scp, SC_NORM_ATTR, SC_NORM_REV_ATTR); sc_clear_screen(scp); return 0; } /* * scgetc(flags) - get character from keyboard. * If flags & SCGETC_CN, then avoid harmful side effects. * If flags & SCGETC_NONBLOCK, then wait until a key is pressed, else * return NOKEY if there is nothing there. */ static u_int scgetc(sc_softc_t *sc, u_int flags, struct sc_cnstate *sp) { scr_stat *scp; #ifndef SC_NO_HISTORY struct tty *tp; #endif u_int c; int this_scr; int f; int i; if (sc->kbd == NULL) return NOKEY; next_code: #if 1 /* I don't like this, but... XXX */ if (flags & SCGETC_CN) sccnupdate(sc->cur_scp); #endif scp = sc->cur_scp; /* first see if there is something in the keyboard port */ for (;;) { if (flags & SCGETC_CN) sccnscrunlock(sc, sp); c = kbdd_read_char(sc->kbd, !(flags & SCGETC_NONBLOCK)); if (flags & SCGETC_CN) sccnscrlock(sc, sp); if (c == ERRKEY) { if (!(flags & SCGETC_CN)) sc_bell(scp, bios_value.bell_pitch, BELL_DURATION); } else if (c == NOKEY) return c; else break; } /* make screensaver happy */ if (!(c & RELKEY)) sc_touch_scrn_saver(); if (!(flags & SCGETC_CN)) random_harvest_queue(&c, sizeof(c), 1, RANDOM_KEYBOARD); if (sc->kbd_open_level == 0 && scp->kbd_mode != K_XLATE) return KEYCHAR(c); /* if scroll-lock pressed allow history browsing */ if (!ISGRAPHSC(scp) && scp->history && scp->status & SLKED) { scp->status &= ~CURSOR_ENABLED; sc_remove_cursor_image(scp); #ifndef SC_NO_HISTORY if (!(scp->status & BUFFER_SAVED)) { scp->status |= BUFFER_SAVED; sc_hist_save(scp); } switch (c) { /* FIXME: key codes */ case SPCLKEY | FKEY | F(49): /* home key */ sc_remove_cutmarking(scp); sc_hist_home(scp); goto next_code; case SPCLKEY | FKEY | F(57): /* end key */ sc_remove_cutmarking(scp); sc_hist_end(scp); goto next_code; case SPCLKEY | FKEY | F(50): /* up arrow key */ sc_remove_cutmarking(scp); if (sc_hist_up_line(scp)) if (!(flags & SCGETC_CN)) sc_bell(scp, bios_value.bell_pitch, BELL_DURATION); goto next_code; case SPCLKEY | FKEY | F(58): /* down arrow key */ sc_remove_cutmarking(scp); if (sc_hist_down_line(scp)) if (!(flags & SCGETC_CN)) sc_bell(scp, bios_value.bell_pitch, BELL_DURATION); goto next_code; case SPCLKEY | FKEY | F(51): /* page up key */ sc_remove_cutmarking(scp); for (i=0; i<scp->ysize; i++) if (sc_hist_up_line(scp)) { if (!(flags & SCGETC_CN)) sc_bell(scp, bios_value.bell_pitch, BELL_DURATION); break; } goto next_code; case SPCLKEY | FKEY | F(59): /* page down key */ sc_remove_cutmarking(scp); for (i=0; i<scp->ysize; i++) if (sc_hist_down_line(scp)) { if (!(flags & SCGETC_CN)) sc_bell(scp, bios_value.bell_pitch, BELL_DURATION); break; } goto next_code; } #endif /* SC_NO_HISTORY */ } /* * Process and consume special keys here. Return a plain char code * or a char code with the META flag or a function key code.
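 *
 * For example (illustrative, not exhaustive): a released key comes back with
 * RELKEY set, the home key arrives as SPCLKEY | FKEY | F(49), and an
 * ordinary printable key is returned to the caller as KEYCHAR(c), possibly
 * with the Meta flag set.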
*/ if (c & RELKEY) { /* key released */ /* goto next_code */ } else { /* key pressed */ if (c & SPCLKEY) { c &= ~SPCLKEY; switch (KEYCHAR(c)) { /* LOCKING KEYS */ case NLK: case CLK: case ALK: break; case SLK: (void)kbdd_ioctl(sc->kbd, KDGKBSTATE, (caddr_t)&f); if (f & SLKED) { scp->status |= SLKED; } else { if (scp->status & SLKED) { scp->status &= ~SLKED; #ifndef SC_NO_HISTORY if (scp->status & BUFFER_SAVED) { if (!sc_hist_restore(scp)) sc_remove_cutmarking(scp); scp->status &= ~BUFFER_SAVED; scp->status |= CURSOR_ENABLED; sc_draw_cursor_image(scp); } /* Only safe in Giant-locked context. */ tp = SC_DEV(sc, scp->index); if (!(flags & SCGETC_CN) && tty_opened_ns(tp)) sctty_outwakeup(tp); #endif } } break; case PASTE: #ifndef SC_NO_CUTPASTE sc_mouse_paste(scp); #endif break; /* NON-LOCKING KEYS */ case NOP: case LSH: case RSH: case LCTR: case RCTR: case LALT: case RALT: case ASH: case META: break; case BTAB: if (!(sc->flags & SC_SCRN_BLANKED)) return c; break; case SPSC: #ifdef DEV_SPLASH /* force activatation/deactivation of the screen saver */ if (!(sc->flags & SC_SCRN_BLANKED)) { run_scrn_saver = TRUE; sc->scrn_time_stamp -= scrn_blank_time; } if (cold) { /* * While devices are being probed, the screen saver need * to be invoked explicitly. XXX */ if (sc->flags & SC_SCRN_BLANKED) { scsplash_stick(FALSE); stop_scrn_saver(sc, current_saver); } else { if (!ISGRAPHSC(scp)) { scsplash_stick(TRUE); (*current_saver)(sc, TRUE); } } } #endif /* DEV_SPLASH */ break; case RBT: #ifndef SC_DISABLE_REBOOT if (enable_reboot && !(flags & SCGETC_CN)) shutdown_nice(0); #endif break; case HALT: #ifndef SC_DISABLE_REBOOT if (enable_reboot && !(flags & SCGETC_CN)) shutdown_nice(RB_HALT); #endif break; case PDWN: #ifndef SC_DISABLE_REBOOT if (enable_reboot && !(flags & SCGETC_CN)) shutdown_nice(RB_HALT|RB_POWEROFF); #endif break; case SUSP: power_pm_suspend(POWER_SLEEP_STATE_SUSPEND); break; case STBY: power_pm_suspend(POWER_SLEEP_STATE_STANDBY); break; case DBG: #ifndef SC_DISABLE_KDBKEY if (enable_kdbkey) kdb_break(); #endif break; case PNC: if (enable_panic_key) panic("Forced by the panic key"); break; case NEXT: this_scr = scp->index; for (i = (this_scr - sc->first_vty + 1)%sc->vtys; sc->first_vty + i != this_scr; i = (i + 1)%sc->vtys) { struct tty *tp = SC_DEV(sc, sc->first_vty + i); if (tty_opened_ns(tp)) { sc_switch_scr(scp->sc, sc->first_vty + i); break; } } break; case PREV: this_scr = scp->index; for (i = (this_scr - sc->first_vty + sc->vtys - 1)%sc->vtys; sc->first_vty + i != this_scr; i = (i + sc->vtys - 1)%sc->vtys) { struct tty *tp = SC_DEV(sc, sc->first_vty + i); if (tty_opened_ns(tp)) { sc_switch_scr(scp->sc, sc->first_vty + i); break; } } break; default: if (KEYCHAR(c) >= F_SCR && KEYCHAR(c) <= L_SCR) { sc_switch_scr(scp->sc, sc->first_vty + KEYCHAR(c) - F_SCR); break; } /* assert(c & FKEY) */ if (!(sc->flags & SC_SCRN_BLANKED)) return c; break; } /* goto next_code */ } else { /* regular keys (maybe MKEY is set) */ #if !defined(SC_DISABLE_KDBKEY) && defined(KDB) if (enable_kdbkey) kdb_alt_break(c, &sc->sc_altbrk); #endif if (!(sc->flags & SC_SCRN_BLANKED)) return c; } } goto next_code; } static int sctty_mmap(struct tty *tp, vm_ooffset_t offset, vm_paddr_t *paddr, int nprot, vm_memattr_t *memattr) { scr_stat *scp; scp = sc_get_stat(tp); if (scp != scp->sc->cur_scp) return -1; return vidd_mmap(scp->sc->adp, offset, paddr, nprot, memattr); } static void update_font(scr_stat *scp) { #ifndef SC_NO_FONT_LOADING /* load appropriate font */ if (!(scp->status & GRAPHICS_MODE)) { if (!(scp->status & 
PIXEL_MODE) && ISFONTAVAIL(scp->sc->adp->va_flags)) { if (scp->font_size < 14) { if (scp->sc->fonts_loaded & FONT_8) sc_load_font(scp, 0, 8, 8, scp->sc->font_8, 0, 256); } else if (scp->font_size >= 16) { if (scp->sc->fonts_loaded & FONT_16) sc_load_font(scp, 0, 16, 8, scp->sc->font_16, 0, 256); } else { if (scp->sc->fonts_loaded & FONT_14) sc_load_font(scp, 0, 14, 8, scp->sc->font_14, 0, 256); } /* * FONT KLUDGE: * This is an interim kludge to display correct font. * Always use the font page #0 on the video plane 2. * Somehow we cannot show the font in other font pages on * some video cards... XXX */ sc_show_font(scp, 0); } mark_all(scp); } #endif /* !SC_NO_FONT_LOADING */ } static int save_kbd_state(scr_stat *scp) { int state; int error; error = kbdd_ioctl(scp->sc->kbd, KDGKBSTATE, (caddr_t)&state); if (error == ENOIOCTL) error = ENODEV; if (error == 0) { scp->status &= ~LOCK_MASK; scp->status |= state; } return error; } static int update_kbd_state(scr_stat *scp, int new_bits, int mask) { int state; int error; if (mask != LOCK_MASK) { error = kbdd_ioctl(scp->sc->kbd, KDGKBSTATE, (caddr_t)&state); if (error == ENOIOCTL) error = ENODEV; if (error) return error; state &= ~mask; state |= new_bits & mask; } else { state = new_bits & LOCK_MASK; } error = kbdd_ioctl(scp->sc->kbd, KDSKBSTATE, (caddr_t)&state); if (error == ENOIOCTL) error = ENODEV; return error; } static int update_kbd_leds(scr_stat *scp, int which) { int error; which &= LOCK_MASK; error = kbdd_ioctl(scp->sc->kbd, KDSETLED, (caddr_t)&which); if (error == ENOIOCTL) error = ENODEV; return error; } int set_mode(scr_stat *scp) { video_info_t info; /* reject unsupported mode */ if (vidd_get_info(scp->sc->adp, scp->mode, &info)) return 1; /* if this vty is not currently showing, do nothing */ if (scp != scp->sc->cur_scp) return 0; /* setup video hardware for the given mode */ vidd_set_mode(scp->sc->adp, scp->mode); scp->rndr->init(scp); #ifndef __sparc64__ sc_vtb_init(&scp->scr, VTB_FRAMEBUFFER, scp->xsize, scp->ysize, (void *)scp->sc->adp->va_window, FALSE); #endif update_font(scp); sc_set_border(scp, scp->border); sc_set_cursor_image(scp); return 0; } void sc_set_border(scr_stat *scp, int color) { SC_VIDEO_LOCK(scp->sc); (*scp->rndr->draw_border)(scp, color); SC_VIDEO_UNLOCK(scp->sc); } #ifndef SC_NO_FONT_LOADING void sc_load_font(scr_stat *scp, int page, int size, int width, u_char *buf, int base, int count) { sc_softc_t *sc; sc = scp->sc; sc->font_loading_in_progress = TRUE; vidd_load_font(sc->adp, page, size, width, buf, base, count); sc->font_loading_in_progress = FALSE; } void sc_save_font(scr_stat *scp, int page, int size, int width, u_char *buf, int base, int count) { sc_softc_t *sc; sc = scp->sc; sc->font_loading_in_progress = TRUE; vidd_save_font(sc->adp, page, size, width, buf, base, count); sc->font_loading_in_progress = FALSE; } void sc_show_font(scr_stat *scp, int page) { vidd_show_font(scp->sc->adp, page); } #endif /* !SC_NO_FONT_LOADING */ void sc_paste(scr_stat *scp, const u_char *p, int count) { struct tty *tp; u_char *rmap; tp = SC_DEV(scp->sc, scp->sc->cur_scp->index); if (!tty_opened_ns(tp)) return; rmap = scp->sc->scr_rmap; for (; count > 0; --count) ttydisc_rint(tp, rmap[*p++], 0); ttydisc_rint_done(tp); } void sc_respond(scr_stat *scp, const u_char *p, int count, int wakeup) { struct tty *tp; tp = SC_DEV(scp->sc, scp->sc->cur_scp->index); if (!tty_opened_ns(tp)) return; ttydisc_rint_simple(tp, p, count); if (wakeup) { /* XXX: we can't always call ttydisc_rint_done() here! 
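 * (Presumably this is why sc_respond() takes the wakeup argument: a caller
 * such as a terminal emulator answering a status query might use
 *	sc_respond(scp, buf, len, 1);
 * and pass 0 when running in a context where completing tty input is unsafe
 * -- an assumption and an illustrative sketch, not a call site from this
 * file.)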
*/ ttydisc_rint_done(tp); } } void sc_bell(scr_stat *scp, int pitch, int duration) { if (cold || kdb_active || shutdown_in_progress || !enable_bell) return; if (scp != scp->sc->cur_scp && (scp->sc->flags & SC_QUIET_BELL)) return; if (scp->sc->flags & SC_VISUAL_BELL) { if (scp->sc->blink_in_progress) return; scp->sc->blink_in_progress = 3; if (scp != scp->sc->cur_scp) scp->sc->blink_in_progress += 2; blink_screen(scp->sc->cur_scp); } else if (duration != 0 && pitch != 0) { if (scp != scp->sc->cur_scp) pitch *= 2; sysbeep(1193182 / pitch, duration); } } static int sc_kattr(void) { if (sc_console == NULL) - return (SC_KERNEL_CONS_ATTR); + return (SC_KERNEL_CONS_ATTR); /* for very early, before pcpu */ return (sc_kattrtab[PCPU_GET(cpuid) % nitems(sc_kattrtab)]); } static void blink_screen(void *arg) { scr_stat *scp = arg; struct tty *tp; if (ISGRAPHSC(scp) || (scp->sc->blink_in_progress <= 1)) { scp->sc->blink_in_progress = 0; mark_all(scp); tp = SC_DEV(scp->sc, scp->index); if (tty_opened_ns(tp)) sctty_outwakeup(tp); if (scp->sc->delayed_next_scr) sc_switch_scr(scp->sc, scp->sc->delayed_next_scr - 1); } else { (*scp->rndr->draw)(scp, 0, scp->xsize*scp->ysize, scp->sc->blink_in_progress & 1); scp->sc->blink_in_progress--; callout_reset_sbt(&scp->sc->cblink, SBT_1S / 15, 0, blink_screen, scp, C_PREL(0)); } } /* * Until sc_attach_unit() gets called no dev structures will be available * to store the per-screen current status. This is the case when the * kernel is initially booting and needs access to its console. During * this early phase of booting the console's current status is kept in * one statically defined scr_stat structure, and any pointers to the * dev structures will be NULL. */ static scr_stat * sc_get_stat(struct tty *tp) { if (tp == NULL) return (&main_console); return (SC_STAT(tp)); } /* * Allocate active keyboard. Try to allocate "kbdmux" keyboard first, and, * if found, add all non-busy keyboards to "kbdmux". Otherwise look for * any keyboard. */ static int sc_allocate_keyboard(sc_softc_t *sc, int unit) { int idx0, idx; keyboard_t *k0, *k; keyboard_info_t ki; idx0 = kbd_allocate("kbdmux", -1, (void *)&sc->keyboard, sckbdevent, sc); if (idx0 != -1) { k0 = kbd_get_keyboard(idx0); for (idx = kbd_find_keyboard2("*", -1, 0); idx != -1; idx = kbd_find_keyboard2("*", -1, idx + 1)) { k = kbd_get_keyboard(idx); if (idx == idx0 || KBD_IS_BUSY(k)) continue; bzero(&ki, sizeof(ki)); strcpy(ki.kb_name, k->kb_name); ki.kb_unit = k->kb_unit; (void)kbdd_ioctl(k0, KBADDKBD, (caddr_t) &ki); } } else idx0 = kbd_allocate("*", unit, (void *)&sc->keyboard, sckbdevent, sc); return (idx0); } Index: projects/runtime-coverage/sys/dev/syscons/syscons.h =================================================================== --- projects/runtime-coverage/sys/dev/syscons/syscons.h (revision 322921) +++ projects/runtime-coverage/sys/dev/syscons/syscons.h (revision 322922) @@ -1,688 +1,691 @@ /*- * Copyright (c) 1995-1998 Søren Schmidt * All rights reserved. * * This code is derived from software contributed to The DragonFly Project * by Sascha Wildner * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer, * without modification, immediately at the beginning of the file. * 2. 
Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. The name of the author may not be used to endorse or promote products * derived from this software without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. * * $FreeBSD$ */ #ifndef _DEV_SYSCONS_SYSCONS_H_ #define _DEV_SYSCONS_SYSCONS_H_ #include /* XXX */ #include #include /* default values for configuration options */ #ifndef MAXCONS #define MAXCONS 16 #endif #ifdef SC_NO_SYSMOUSE #undef SC_NO_CUTPASTE #define SC_NO_CUTPASTE 1 #endif #ifdef SC_NO_MODE_CHANGE #undef SC_PIXEL_MODE #endif /* Always load font data if the pixel (raster text) mode is to be used. */ #ifdef SC_PIXEL_MODE #undef SC_NO_FONT_LOADING #endif /* * If font data is not available, the `arrow'-shaped mouse cursor cannot * be drawn. Use the alternative drawing method. */ #ifdef SC_NO_FONT_LOADING #undef SC_ALT_MOUSE_IMAGE #define SC_ALT_MOUSE_IMAGE 1 #endif #ifndef SC_CURSOR_CHAR #define SC_CURSOR_CHAR 7 #endif #ifndef SC_MOUSE_CHAR #define SC_MOUSE_CHAR 8 #endif #if SC_MOUSE_CHAR <= SC_CURSOR_CHAR && SC_CURSOR_CHAR < (SC_MOUSE_CHAR + 4) #undef SC_CURSOR_CHAR #define SC_CURSOR_CHAR (SC_MOUSE_CHAR + 4) #endif #ifndef SC_DEBUG_LEVEL #define SC_DEBUG_LEVEL 0 #endif #define DPRINTF(l, p) if (SC_DEBUG_LEVEL >= (l)) printf p #ifndef __sparc64__ #define SC_DRIVER_NAME "sc" #else /* * Use a different driver name on sparc64 so it does not get confused * with the system controller devices which are also termed 'sc' in OFW. 
*/ #define SC_DRIVER_NAME "syscons" #endif #define SC_VTY(dev) (((sc_ttysoftc *)tty_softc(tp))->st_index) #define SC_DEV(sc, vty) ((sc)->dev[(vty) - (sc)->first_vty]) #define SC_STAT(tp) (*((scr_stat **)&((sc_ttysoftc *)tty_softc(tp))->st_stat)) /* printable chars */ #ifndef PRINTABLE #define PRINTABLE(ch) ((ch) > 0x1b || ((ch) > 0x0d && (ch) < 0x1b) \ || (ch) < 0x07) #endif /* macros for "intelligent" screen update */ #define mark_for_update(scp, x) {\ if ((x) < scp->start) scp->start = (x);\ else if ((x) > scp->end) scp->end = (x);\ } #define mark_all(scp) {\ scp->start = 0;\ scp->end = scp->xsize * scp->ysize - 1;\ } /* vty status flags (scp->status) */ #define UNKNOWN_MODE 0x00010 /* unknown video mode */ #define SWITCH_WAIT_REL 0x00080 /* waiting for vty release */ #define SWITCH_WAIT_ACQ 0x00100 /* waiting for vty ack */ #define BUFFER_SAVED 0x00200 /* vty buffer is saved */ #define CURSOR_ENABLED 0x00400 /* text cursor is enabled */ #define MOUSE_MOVED 0x01000 /* mouse cursor has moved */ #define MOUSE_CUTTING 0x02000 /* mouse cursor is cutting text */ #define MOUSE_VISIBLE 0x04000 /* mouse cursor is showing */ #define GRAPHICS_MODE 0x08000 /* vty is in a graphics mode */ #define PIXEL_MODE 0x10000 /* vty is in a raster text mode */ #define SAVER_RUNNING 0x20000 /* screen saver is running */ #define VR_CURSOR_BLINK 0x40000 /* blinking text cursor */ #define VR_CURSOR_ON 0x80000 /* text cursor is on */ #define MOUSE_HIDDEN 0x100000 /* mouse cursor is temporarily hidden */ /* misc defines */ #define FALSE 0 #define TRUE 1 /* The following #defines are hard-coded for a maximum text resolution corresponding to a maximum framebuffer resolution of 1920x1200 with an 8x8 font... */ #define COL 240 #define ROW 150 #define PCBURST 128 #ifndef BELL_DURATION #define BELL_DURATION ((5 * hz + 99) / 100) #define BELL_PITCH 800 #endif /* virtual terminal buffer */ typedef struct sc_vtb { int vtb_flags; #define VTB_VALID (1 << 0) #define VTB_ALLOCED (1 << 1) int vtb_type; #define VTB_INVALID 0 #define VTB_MEMORY 1 #define VTB_FRAMEBUFFER 2 #define VTB_RINGBUFFER 3 int vtb_cols; int vtb_rows; int vtb_size; vm_offset_t vtb_buffer; int vtb_tail; /* valid for VTB_RINGBUFFER only */ } sc_vtb_t; -/* text cursor attributes */ +/* text and some mouse cursor attributes */ struct cursor_attr { - int flags; - int base; - int height; + u_char flags; + u_char base; + u_char height; + u_char bg[3]; + u_char mouse_ba; + u_char mouse_ia; }; /* softc */ struct keyboard; struct video_adapter; struct scr_stat; struct tty; struct sc_cnstate { u_char kbd_locked; u_char kdb_locked; u_char mtx_locked; u_char kbd_opened; u_char scr_opened; }; typedef struct sc_softc { int unit; /* unit # */ int config; /* configuration flags */ #define SC_VESAMODE (1 << 7) #define SC_AUTODETECT_KBD (1 << 8) #define SC_KERNEL_CONSOLE (1 << 9) int flags; /* status flags */ #define SC_VISUAL_BELL (1 << 0) #define SC_QUIET_BELL (1 << 1) #if 0 /* not used anymore */ #define SC_BLINK_CURSOR (1 << 2) #define SC_CHAR_CURSOR (1 << 3) #endif #define SC_MOUSE_ENABLED (1 << 4) #define SC_SCRN_IDLE (1 << 5) #define SC_SCRN_BLANKED (1 << 6) #define SC_SAVER_FAILED (1 << 7) #define SC_SCRN_VTYLOCK (1 << 8) #define SC_INIT_DONE (1 << 16) #define SC_SPLASH_SCRN (1 << 17) int keyboard; /* -1 if unavailable */ struct keyboard *kbd; int adapter; struct video_adapter *adp; int initial_mode; /* initial video mode */ int first_vty; int vtys; struct tty **dev; struct scr_stat *cur_scp; struct scr_stat *new_scp; struct scr_stat *old_scp; int delayed_next_scr; char 
font_loading_in_progress; char switch_in_progress; char write_in_progress; char blink_in_progress; int grab_level; /* 2 is just enough for kdb to grab for stepping normal grabbing: */ struct sc_cnstate grab_state[2]; int kbd_open_level; struct mtx video_mtx; long scrn_time_stamp; struct cursor_attr dflt_curs_attr; struct cursor_attr curs_attr; u_char scr_map[256]; u_char scr_rmap[256]; #ifdef _SC_MD_SOFTC_DECLARED_ sc_md_softc_t md; /* machine dependent vars */ #endif #ifndef SC_NO_PALETTE_LOADING u_char palette[256 * 3]; #ifdef SC_PIXEL_MODE u_char palette2[256 * 3]; #endif #endif #ifndef SC_NO_FONT_LOADING int fonts_loaded; #define FONT_8 2 #define FONT_14 4 #define FONT_16 8 #define FONT_22 8 u_char *font_8; u_char *font_14; u_char *font_16; u_char *font_22; #endif u_char cursor_char; u_char mouse_char; #ifdef KDB int sc_altbrk; #endif struct callout ctimeout; struct callout cblink; } sc_softc_t; /* virtual screen */ typedef struct scr_stat { int index; /* index of this vty */ struct sc_softc *sc; /* pointer to softc */ struct sc_rndr_sw *rndr; /* renderer */ #ifndef __sparc64__ sc_vtb_t scr; #endif sc_vtb_t vtb; int xpos; /* current X position */ int ypos; /* current Y position */ int xsize; /* X text size */ int ysize; /* Y text size */ int xpixel; /* X graphics size */ int ypixel; /* Y graphics size */ int xoff; /* X offset in pixel mode */ int yoff; /* Y offset in pixel mode */ u_char *font; /* current font */ int font_size; /* fontsize in Y direction */ int font_width; /* fontsize in X direction */ int start; /* modified area start */ int end; /* modified area end */ struct sc_term_sw *tsw; void *ts; int status; /* status (bitfield) */ int kbd_mode; /* keyboard I/O mode */ int cursor_pos; /* cursor buffer position */ int cursor_oldpos; /* cursor old buffer position */ struct cursor_attr dflt_curs_attr; struct cursor_attr base_curs_attr; struct cursor_attr curs_attr; int mouse_pos; /* mouse buffer position */ int mouse_oldpos; /* mouse old buffer position */ short mouse_xpos; /* mouse x coordinate */ short mouse_ypos; /* mouse y coordinate */ short mouse_oldxpos; /* mouse previous x coordinate */ short mouse_oldypos; /* mouse previous y coordinate */ short mouse_buttons; /* mouse buttons */ int mouse_cut_start; /* mouse cut start pos */ int mouse_cut_end; /* mouse cut end pos */ int mouse_level; /* xterm mouse protocol */ struct proc *mouse_proc; /* proc* of controlling proc */ pid_t mouse_pid; /* pid of controlling proc */ int mouse_signal; /* signal # to report with */ const void *mouse_data; /* renderer (pixmap) data */ u_short bell_duration; u_short bell_pitch; u_char border; /* border color */ int mode; /* mode */ pid_t pid; /* pid of controlling proc */ struct proc *proc; /* proc* of controlling proc */ struct vt_mode smode; /* switch mode */ sc_vtb_t *history; /* circular history buffer */ int history_pos; /* position shown on screen */ int history_size; /* size of history buffer */ int splash_save_mode; /* saved mode for splash screen */ int splash_save_status; /* saved status for splash screen */ #ifdef _SCR_MD_STAT_DECLARED_ scr_md_stat_t md; /* machine dependent vars */ #endif } scr_stat; /* TTY softc. 
*/ typedef struct sc_ttysoftc { int st_index; scr_stat *st_stat; } sc_ttysoftc; #ifndef SC_NORM_ATTR #define SC_NORM_ATTR (FG_LIGHTGREY | BG_BLACK) #endif #ifndef SC_NORM_REV_ATTR #define SC_NORM_REV_ATTR (FG_BLACK | BG_LIGHTGREY) #endif #ifndef SC_KERNEL_CONS_ATTR #define SC_KERNEL_CONS_ATTR (FG_WHITE | BG_BLACK) #endif #ifndef SC_KERNEL_CONS_REV_ATTR #define SC_KERNEL_CONS_REV_ATTR (FG_BLACK | BG_LIGHTGREY) #endif /* terminal emulator */ #ifndef SC_DFLT_TERM #define SC_DFLT_TERM "*" /* any */ #endif typedef int sc_term_init_t(scr_stat *scp, void **tcp, int code); #define SC_TE_COLD_INIT 0 #define SC_TE_WARM_INIT 1 typedef int sc_term_term_t(scr_stat *scp, void **tcp); typedef void sc_term_puts_t(scr_stat *scp, u_char *buf, int len); typedef int sc_term_ioctl_t(scr_stat *scp, struct tty *tp, u_long cmd, caddr_t data, struct thread *td); typedef int sc_term_reset_t(scr_stat *scp, int code); #define SC_TE_HARD_RESET 0 #define SC_TE_SOFT_RESET 1 typedef void sc_term_default_attr_t(scr_stat *scp, int norm, int rev); typedef void sc_term_clear_t(scr_stat *scp); typedef void sc_term_notify_t(scr_stat *scp, int event); #define SC_TE_NOTIFY_VTSWITCH_IN 0 #define SC_TE_NOTIFY_VTSWITCH_OUT 1 typedef int sc_term_input_t(scr_stat *scp, int c, struct tty *tp); typedef const char *sc_term_fkeystr_t(scr_stat *scp, int c); typedef void sc_term_sync_t(scr_stat *scp); typedef struct sc_term_sw { LIST_ENTRY(sc_term_sw) link; char *te_name; /* name of the emulator */ char *te_desc; /* description */ char *te_renderer; /* matching renderer */ size_t te_size; /* size of internal buffer */ int te_refcount; /* reference counter */ sc_term_init_t *te_init; sc_term_term_t *te_term; sc_term_puts_t *te_puts; sc_term_ioctl_t *te_ioctl; sc_term_reset_t *te_reset; sc_term_default_attr_t *te_default_attr; sc_term_clear_t *te_clear; sc_term_notify_t *te_notify; sc_term_input_t *te_input; sc_term_fkeystr_t *te_fkeystr; sc_term_sync_t *te_sync; } sc_term_sw_t; #define SCTERM_MODULE(name, sw) \ DATA_SET(scterm_set, sw); \ static int \ scterm_##name##_event(module_t mod, int type, void *data) \ { \ switch (type) { \ case MOD_LOAD: \ return sc_term_add(&sw); \ case MOD_UNLOAD: \ if (sw.te_refcount > 0) \ return EBUSY; \ return sc_term_remove(&sw); \ default: \ return EOPNOTSUPP; \ break; \ } \ return 0; \ } \ static moduledata_t scterm_##name##_mod = { \ "scterm-" #name, \ scterm_##name##_event, \ NULL, \ }; \ DECLARE_MODULE(scterm_##name, scterm_##name##_mod, \ SI_SUB_DRIVERS, SI_ORDER_MIDDLE) /* renderer function table */ typedef void vr_init_t(scr_stat *scp); typedef void vr_clear_t(scr_stat *scp, int c, int attr); typedef void vr_draw_border_t(scr_stat *scp, int color); typedef void vr_draw_t(scr_stat *scp, int from, int count, int flip); typedef void vr_set_cursor_t(scr_stat *scp, int base, int height, int blink); typedef void vr_draw_cursor_t(scr_stat *scp, int at, int blink, int on, int flip); typedef void vr_blink_cursor_t(scr_stat *scp, int at, int flip); typedef void vr_set_mouse_t(scr_stat *scp); typedef void vr_draw_mouse_t(scr_stat *scp, int x, int y, int on); typedef struct sc_rndr_sw { vr_init_t *init; vr_clear_t *clear; vr_draw_border_t *draw_border; vr_draw_t *draw; vr_set_cursor_t *set_cursor; vr_draw_cursor_t *draw_cursor; vr_blink_cursor_t *blink_cursor; vr_set_mouse_t *set_mouse; vr_draw_mouse_t *draw_mouse; } sc_rndr_sw_t; typedef struct sc_renderer { char *name; int mode; sc_rndr_sw_t *rndrsw; LIST_ENTRY(sc_renderer) link; } sc_renderer_t; #define RENDERER(name, mode, sw, set) \ static struct 
sc_renderer scrndr_##name##_##mode = { \ #name, mode, &sw \ }; \ DATA_SET(scrndr_set, scrndr_##name##_##mode); \ DATA_SET(set, scrndr_##name##_##mode) #define RENDERER_MODULE(name, set) \ SET_DECLARE(set, sc_renderer_t); \ static int \ scrndr_##name##_event(module_t mod, int type, void *data) \ { \ sc_renderer_t **list; \ int error = 0; \ switch (type) { \ case MOD_LOAD: \ SET_FOREACH(list, set) { \ error = sc_render_add(*list); \ if (error) \ break; \ } \ break; \ case MOD_UNLOAD: \ SET_FOREACH(list, set) { \ error = sc_render_remove(*list);\ if (error) \ break; \ } \ break; \ default: \ return EOPNOTSUPP; \ break; \ } \ return error; \ } \ static moduledata_t scrndr_##name##_mod = { \ "scrndr-" #name, \ scrndr_##name##_event, \ NULL, \ }; \ DECLARE_MODULE(scrndr_##name, scrndr_##name##_mod, \ SI_SUB_DRIVERS, SI_ORDER_MIDDLE) typedef struct { int shift_state; int bell_pitch; } bios_values_t; /* other macros */ #define ISTEXTSC(scp) (!((scp)->status \ & (UNKNOWN_MODE | GRAPHICS_MODE | PIXEL_MODE))) #define ISGRAPHSC(scp) (((scp)->status \ & (UNKNOWN_MODE | GRAPHICS_MODE))) #define ISPIXELSC(scp) (((scp)->status \ & (UNKNOWN_MODE | GRAPHICS_MODE | PIXEL_MODE))\ == PIXEL_MODE) #define ISUNKNOWNSC(scp) ((scp)->status & UNKNOWN_MODE) #define ISMOUSEAVAIL(af) ((af) & V_ADP_FONT) #define ISFONTAVAIL(af) ((af) & V_ADP_FONT) #define ISPALAVAIL(af) ((af) & V_ADP_PALETTE) #define ISSIGVALID(sig) ((sig) > 0 && (sig) < NSIG) #define SC_VIDEO_LOCKINIT(sc) \ mtx_init(&(sc)->video_mtx, "syscons video lock", NULL, \ MTX_SPIN | MTX_RECURSE); #define SC_VIDEO_LOCK(sc) \ do { \ if (!kdb_active) \ mtx_lock_spin(&(sc)->video_mtx); \ } while(0) #define SC_VIDEO_UNLOCK(sc) \ do { \ if (!kdb_active) \ mtx_unlock_spin(&(sc)->video_mtx); \ } while(0) /* syscons.c */ extern int (*sc_user_ioctl)(struct tty *tp, u_long cmd, caddr_t data, struct thread *td); int sc_probe_unit(int unit, int flags); int sc_attach_unit(int unit, int flags); int set_mode(scr_stat *scp); void sc_set_border(scr_stat *scp, int color); void sc_load_font(scr_stat *scp, int page, int size, int width, u_char *font, int base, int count); void sc_save_font(scr_stat *scp, int page, int size, int width, u_char *font, int base, int count); void sc_show_font(scr_stat *scp, int page); void sc_touch_scrn_saver(void); void sc_draw_cursor_image(scr_stat *scp); void sc_remove_cursor_image(scr_stat *scp); void sc_set_cursor_image(scr_stat *scp); void sc_change_cursor_shape(scr_stat *scp, int flags, int base, int height); int sc_clean_up(scr_stat *scp); int sc_switch_scr(sc_softc_t *sc, u_int next_scr); void sc_alloc_scr_buffer(scr_stat *scp, int wait, int discard); int sc_init_emulator(scr_stat *scp, char *name); void sc_paste(scr_stat *scp, const u_char *p, int count); void sc_respond(scr_stat *scp, const u_char *p, int count, int wakeup); void sc_bell(scr_stat *scp, int pitch, int duration); /* schistory.c */ #ifndef SC_NO_HISTORY int sc_alloc_history_buffer(scr_stat *scp, int lines, int prev_ysize, int wait); void sc_free_history_buffer(scr_stat *scp, int prev_ysize); void sc_hist_save(scr_stat *scp); #define sc_hist_save_one_line(scp, from) \ sc_vtb_append(&(scp)->vtb, (from), (scp)->history, (scp)->xsize) int sc_hist_restore(scr_stat *scp); void sc_hist_home(scr_stat *scp); void sc_hist_end(scr_stat *scp); int sc_hist_up_line(scr_stat *scp); int sc_hist_down_line(scr_stat *scp); int sc_hist_ioctl(struct tty *tp, u_long cmd, caddr_t data, struct thread *td); #endif /* SC_NO_HISTORY */ /* scmouse.c */ #ifndef SC_NO_CUTPASTE void 
sc_alloc_cut_buffer(scr_stat *scp, int wait); void sc_draw_mouse_image(scr_stat *scp); void sc_remove_mouse_image(scr_stat *scp); int sc_inside_cutmark(scr_stat *scp, int pos); void sc_remove_cutmarking(scr_stat *scp); void sc_remove_all_cutmarkings(sc_softc_t *scp); void sc_remove_all_mouse(sc_softc_t *scp); void sc_mouse_paste(scr_stat *scp); #else #define sc_draw_mouse_image(scp) #define sc_remove_mouse_image(scp) #define sc_inside_cutmark(scp, pos) FALSE #define sc_remove_cutmarking(scp) #define sc_remove_all_cutmarkings(scp) #define sc_remove_all_mouse(scp) #define sc_mouse_paste(scp) #endif /* SC_NO_CUTPASTE */ #ifndef SC_NO_SYSMOUSE void sc_mouse_move(scr_stat *scp, int x, int y); int sc_mouse_ioctl(struct tty *tp, u_long cmd, caddr_t data, struct thread *td); #endif /* SC_NO_SYSMOUSE */ /* scvidctl.c */ int sc_set_text_mode(scr_stat *scp, struct tty *tp, int mode, int xsize, int ysize, int fontsize, int font_width); int sc_set_graphics_mode(scr_stat *scp, struct tty *tp, int mode); int sc_set_pixel_mode(scr_stat *scp, struct tty *tp, int xsize, int ysize, int fontsize, int font_width); int sc_support_pixel_mode(void *arg); int sc_vid_ioctl(struct tty *tp, u_long cmd, caddr_t data, struct thread *td); int sc_render_add(sc_renderer_t *rndr); int sc_render_remove(sc_renderer_t *rndr); sc_rndr_sw_t *sc_render_match(scr_stat *scp, char *name, int mode); /* scvtb.c */ void sc_vtb_init(sc_vtb_t *vtb, int type, int cols, int rows, void *buffer, int wait); void sc_vtb_destroy(sc_vtb_t *vtb); size_t sc_vtb_size(int cols, int rows); void sc_vtb_clear(sc_vtb_t *vtb, int c, int attr); int sc_vtb_getc(sc_vtb_t *vtb, int at); int sc_vtb_geta(sc_vtb_t *vtb, int at); void sc_vtb_putc(sc_vtb_t *vtb, int at, int c, int a); vm_offset_t sc_vtb_putchar(sc_vtb_t *vtb, vm_offset_t p, int c, int a); vm_offset_t sc_vtb_pointer(sc_vtb_t *vtb, int at); int sc_vtb_pos(sc_vtb_t *vtb, int pos, int offset); #define sc_vtb_tail(vtb) ((vtb)->vtb_tail) #define sc_vtb_rows(vtb) ((vtb)->vtb_rows) #define sc_vtb_cols(vtb) ((vtb)->vtb_cols) void sc_vtb_copy(sc_vtb_t *vtb1, int from, sc_vtb_t *vtb2, int to, int count); void sc_vtb_append(sc_vtb_t *vtb1, int from, sc_vtb_t *vtb2, int count); void sc_vtb_seek(sc_vtb_t *vtb, int pos); void sc_vtb_erase(sc_vtb_t *vtb, int at, int count, int c, int attr); void sc_vtb_move(sc_vtb_t *vtb, int from, int to, int count); void sc_vtb_delete(sc_vtb_t *vtb, int at, int count, int c, int attr); void sc_vtb_ins(sc_vtb_t *vtb, int at, int count, int c, int attr); /* sysmouse.c */ int sysmouse_event(mouse_info_t *info); /* scterm.c */ void sc_move_cursor(scr_stat *scp, int x, int y); void sc_clear_screen(scr_stat *scp); int sc_term_add(sc_term_sw_t *sw); int sc_term_remove(sc_term_sw_t *sw); sc_term_sw_t *sc_term_match(char *name); sc_term_sw_t *sc_term_match_by_number(int index); /* machine dependent functions */ int sc_max_unit(void); sc_softc_t *sc_get_softc(int unit, int flags); sc_softc_t *sc_find_softc(struct video_adapter *adp, struct keyboard *kbd); int sc_get_cons_priority(int *unit, int *flags); void sc_get_bios_values(bios_values_t *values); int sc_tone(int herz); #endif /* !_DEV_SYSCONS_SYSCONS_H_ */ Index: projects/runtime-coverage/sys/kern/subr_blist.c =================================================================== --- projects/runtime-coverage/sys/kern/subr_blist.c (revision 322921) +++ projects/runtime-coverage/sys/kern/subr_blist.c (revision 322922) @@ -1,1069 +1,1071 @@ /*- * Copyright (c) 1998 Matthew Dillon. All Rights Reserved. 
* Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS * OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED * WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY * DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE * GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, * WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING * NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS * SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. */ /* * BLIST.C - Bitmap allocator/deallocator, using a radix tree with hinting * * This module implements a general bitmap allocator/deallocator. The * allocator eats around 2 bits per 'block'. The module does not * try to interpret the meaning of a 'block' other than to return * SWAPBLK_NONE on an allocation failure. * * A radix tree is used to maintain the bitmap. Two radix constants are * involved: One for the bitmaps contained in the leaf nodes (typically * 64), and one for the meta nodes (typically 16). Both meta and leaf * nodes have a hint field. This field gives us a hint as to the largest * free contiguous range of blocks under the node. It may contain a * value that is too high, but will never contain a value that is too * low. When the radix tree is searched, allocation failures in subtrees * update the hint. * * The radix tree also implements two collapsed states for meta nodes: * the ALL-ALLOCATED state and the ALL-FREE state. If a meta node is * in either of these two states, all information contained underneath * the node is considered stale. These states are used to optimize * allocation and freeing operations. * * The hinting greatly increases code efficiency for allocations while * the general radix structure optimizes both allocations and frees. The * radix tree should be able to operate well no matter how much * fragmentation there is and no matter how large a bitmap is used. * * The blist code wires all necessary memory at creation time. Neither * allocations nor frees require interaction with the memory subsystem. * The non-blocking features of the blist code are used in the swap code * (vm/swap_pager.c). * * LAYOUT: The radix tree is laid out recursively using a * linear array. Each meta node is immediately followed (laid out * sequentially in memory) by BLIST_META_RADIX lower level nodes. This * is a recursive structure but one that can be easily scanned through * a very simple 'skip' calculation. In order to support large radixes, * portions of the tree may reside outside our memory allocation. 
We * handle this with an early-termination optimization (when bighint is * set to -1) on the scan. The memory allocation is only large enough * to cover the number of blocks requested at creation time even if it * must be encompassed in larger root-node radix. * * NOTE: the allocator cannot currently allocate more than * BLIST_BMAP_RADIX blocks per call. It will panic with 'allocation too * large' if you try. This is an area that could use improvement. The * radix is large enough that this restriction does not effect the swap * system, though. Currently only the allocation code is affected by * this algorithmic unfeature. The freeing code can handle arbitrary * ranges. * * This code can be compiled stand-alone for debugging. */ #include __FBSDID("$FreeBSD$"); #ifdef _KERNEL #include #include #include #include #include #include #include #include #else #ifndef BLIST_NO_DEBUG #define BLIST_DEBUG #endif #include #include #include #include #include #include #include #define bitcount64(x) __bitcount64((uint64_t)(x)) #define malloc(a,b,c) calloc(a, 1) #define free(a,b) free(a) #define CTASSERT(expr) #include void panic(const char *ctl, ...); #endif /* * static support functions */ static daddr_t blst_leaf_alloc(blmeta_t *scan, daddr_t blk, int count, daddr_t cursor); static daddr_t blst_meta_alloc(blmeta_t *scan, daddr_t cursor, daddr_t count, u_daddr_t radix); static void blst_leaf_free(blmeta_t *scan, daddr_t relblk, int count); static void blst_meta_free(blmeta_t *scan, daddr_t freeBlk, daddr_t count, u_daddr_t radix); static void blst_copy(blmeta_t *scan, daddr_t blk, daddr_t radix, blist_t dest, daddr_t count); static daddr_t blst_leaf_fill(blmeta_t *scan, daddr_t blk, int count); static daddr_t blst_meta_fill(blmeta_t *scan, daddr_t allocBlk, daddr_t count, u_daddr_t radix); static daddr_t blst_radix_init(blmeta_t *scan, daddr_t radix, daddr_t count); #ifndef _KERNEL static void blst_radix_print(blmeta_t *scan, daddr_t blk, daddr_t radix, int tab); #endif #ifdef _KERNEL static MALLOC_DEFINE(M_SWAP, "SWAP", "Swap space"); #endif CTASSERT(BLIST_BMAP_RADIX % BLIST_META_RADIX == 0); /* * For a subtree that can represent the state of up to 'radix' blocks, the * number of leaf nodes of the subtree is L=radix/BLIST_BMAP_RADIX. If 'm' * is short for BLIST_META_RADIX, then for a tree of height h with L=m**h * leaf nodes, the total number of tree nodes is 1 + m + m**2 + ... + m**h, * or, equivalently, (m**(h+1)-1)/(m-1). This quantity is called 'skip' * in the 'meta' functions that process subtrees. Since integer division * discards remainders, we can express this computation as * skip = (m * m**h) / (m - 1) * skip = (m * (radix / BLIST_BMAP_RADIX)) / (m - 1) * and since m divides BLIST_BMAP_RADIX, we can simplify further to * skip = (radix / (BLIST_BMAP_RADIX / m)) / (m - 1) * skip = radix / ((BLIST_BMAP_RADIX / m) * (m - 1)) * so that simple integer division by a constant can safely be used for the * calculation. */ static inline daddr_t radix_to_skip(daddr_t radix) { return (radix / ((BLIST_BMAP_RADIX / BLIST_META_RADIX) * (BLIST_META_RADIX - 1))); } /* * blist_create() - create a blist capable of handling up to the specified * number of blocks * * blocks - must be greater than 0 * flags - malloc flags * * The smallest blist consists of a single leaf node capable of * managing BLIST_BMAP_RADIX blocks. */ blist_t blist_create(daddr_t blocks, int flags) { blist_t bl; daddr_t nodes, radix; /* * Calculate the radix field used for scanning. 
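 *
 * (Worked example, assuming the typical constants described at the top of
 * this file, BLIST_BMAP_RADIX = 64 and BLIST_META_RADIX = 16: a request for
 * 1000000 blocks grows radix 64 -> 1024 -> 16384 -> 262144 -> 4194304, i.e.
 * four meta levels above the leaves.)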
*/ radix = BLIST_BMAP_RADIX; while (radix < blocks) { radix *= BLIST_META_RADIX; } nodes = 1 + blst_radix_init(NULL, radix, blocks); bl = malloc(sizeof(struct blist), M_SWAP, flags); if (bl == NULL) return (NULL); bl->bl_blocks = blocks; bl->bl_radix = radix; bl->bl_cursor = 0; bl->bl_root = malloc(nodes * sizeof(blmeta_t), M_SWAP, flags); if (bl->bl_root == NULL) { free(bl, M_SWAP); return (NULL); } blst_radix_init(bl->bl_root, radix, blocks); #if defined(BLIST_DEBUG) printf( "BLIST representing %lld blocks (%lld MB of swap)" ", requiring %lldK of ram\n", (long long)bl->bl_blocks, (long long)bl->bl_blocks * 4 / 1024, (long long)(nodes * sizeof(blmeta_t) + 1023) / 1024 ); printf("BLIST raw radix tree contains %lld records\n", (long long)nodes); #endif return (bl); } void blist_destroy(blist_t bl) { free(bl->bl_root, M_SWAP); free(bl, M_SWAP); } /* * blist_alloc() - reserve space in the block bitmap. Return the base * of a contiguous region or SWAPBLK_NONE if space could * not be allocated. */ daddr_t blist_alloc(blist_t bl, daddr_t count) { daddr_t blk; /* * This loop iterates at most twice. An allocation failure in the * first iteration leads to a second iteration only if the cursor was * non-zero. When the cursor is zero, an allocation failure will * reduce the hint, stopping further iterations. */ while (count <= bl->bl_root->bm_bighint) { blk = blst_meta_alloc(bl->bl_root, bl->bl_cursor, count, bl->bl_radix); if (blk != SWAPBLK_NONE) { bl->bl_cursor = blk + count; + if (bl->bl_cursor == bl->bl_blocks) + bl->bl_cursor = 0; return (blk); } else if (bl->bl_cursor != 0) bl->bl_cursor = 0; } return (SWAPBLK_NONE); } /* * blist_avail() - return the number of free blocks. */ daddr_t blist_avail(blist_t bl) { if (bl->bl_radix == BLIST_BMAP_RADIX) return (bitcount64(bl->bl_root->u.bmu_bitmap)); else return (bl->bl_root->u.bmu_avail); } /* * blist_free() - free up space in the block bitmap. Return the base * of a contiguous region. Panic if an inconsistancy is * found. */ void blist_free(blist_t bl, daddr_t blkno, daddr_t count) { blst_meta_free(bl->bl_root, blkno, count, bl->bl_radix); } /* * blist_fill() - mark a region in the block bitmap as off-limits * to the allocator (i.e. allocate it), ignoring any * existing allocations. Return the number of blocks * actually filled that were free before the call. */ daddr_t blist_fill(blist_t bl, daddr_t blkno, daddr_t count) { return (blst_meta_fill(bl->bl_root, blkno, count, bl->bl_radix)); } /* * blist_resize() - resize an existing radix tree to handle the * specified number of blocks. This will reallocate * the tree and transfer the previous bitmap to the new * one. When extending the tree you can specify whether * the new blocks are to left allocated or freed. */ void blist_resize(blist_t *pbl, daddr_t count, int freenew, int flags) { blist_t newbl = blist_create(count, flags); blist_t save = *pbl; *pbl = newbl; if (count > save->bl_blocks) count = save->bl_blocks; blst_copy(save->bl_root, 0, save->bl_radix, newbl, count); /* * If resizing upwards, should we free the new space or not? 
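 *
 * (E.g. blist_resize(&bl, newsize, 1, flags) makes the added range
 * immediately available, while passing 0 for freenew leaves it marked
 * allocated until blist_free() is called on it explicitly -- a note on the
 * parameter, not a call site taken from this file.)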
*/ if (freenew && count < newbl->bl_blocks) { blist_free(newbl, count, newbl->bl_blocks - count); } blist_destroy(save); } #ifdef BLIST_DEBUG /* * blist_print() - dump radix tree */ void blist_print(blist_t bl) { printf("BLIST cursor = %08jx {\n", (uintmax_t)bl->bl_cursor); blst_radix_print(bl->bl_root, 0, bl->bl_radix, 4); printf("}\n"); } #endif /************************************************************************ * ALLOCATION SUPPORT FUNCTIONS * ************************************************************************ * * These support functions do all the actual work. They may seem * rather longish, but that's because I've commented them up. The * actual code is straight forward. * */ /* * blist_leaf_alloc() - allocate at a leaf in the radix tree (a bitmap). * * This is the core of the allocator and is optimized for the * BLIST_BMAP_RADIX block allocation case. Otherwise, execution * time is proportional to log2(count) + log2(BLIST_BMAP_RADIX). */ static daddr_t blst_leaf_alloc(blmeta_t *scan, daddr_t blk, int count, daddr_t cursor) { u_daddr_t mask; int count1, hi, lo, mid, num_shifts, range1, range_ext; if (count == BLIST_BMAP_RADIX) { /* * Optimize allocation of BLIST_BMAP_RADIX bits. If this wasn't * a special case, then forming the final value of 'mask' below * would require special handling to avoid an invalid left shift * when count equals the number of bits in mask. */ if (~scan->u.bmu_bitmap != 0) { scan->bm_bighint = BLIST_BMAP_RADIX - 1; return (SWAPBLK_NONE); } if (cursor != blk) return (SWAPBLK_NONE); scan->u.bmu_bitmap = 0; scan->bm_bighint = 0; return (blk); } range1 = 0; count1 = count - 1; num_shifts = fls(count1); mask = scan->u.bmu_bitmap; while (mask != 0 && num_shifts > 0) { /* * If bit i is set in mask, then bits in [i, i+range1] are set * in scan->u.bmu_bitmap. The value of range1 is equal to * count1 >> num_shifts. Grow range and reduce num_shifts to 0, * while preserving these invariants. The updates to mask leave * fewer bits set, but each bit that remains set represents a * longer string of consecutive bits set in scan->u.bmu_bitmap. */ num_shifts--; range_ext = range1 + ((count1 >> num_shifts) & 1); mask &= mask >> range_ext; range1 += range_ext; } if (mask == 0) { /* * Update bighint. There is no allocation bigger than range1 * available in this leaf. */ scan->bm_bighint = range1; return (SWAPBLK_NONE); } /* * Discard any candidates that appear before the cursor. */ lo = cursor - blk; mask &= ~(u_daddr_t)0 << lo; if (mask == 0) return (SWAPBLK_NONE); /* * The least significant set bit in mask marks the start of the first * available range of sufficient size. Clear all the bits but that one, * and then perform a binary search to find its position. */ mask &= -mask; hi = BLIST_BMAP_RADIX - count1; while (lo + 1 < hi) { mid = (lo + hi) >> 1; if ((mask >> mid) != 0) lo = mid; else hi = mid; } /* * Set in mask exactly the bits being allocated, and clear them from * the set of available bits. */ mask = (mask << count) - mask; scan->u.bmu_bitmap &= ~mask; return (blk + lo); } /* * blist_meta_alloc() - allocate at a meta in the radix tree. * * Attempt to allocate at a meta node. If we can't, we update * bighint and return a failure. Updating bighint optimize future * calls that hit this node. We have to check for our collapse cases * and we have a few optimizations strewn in as well. 
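 *
 * (Layout example, assuming the typical radix constants 64 and 16: for a
 * subtree of radix 1024 the skip computed below is 1024 / ((64 / 16) * 15)
 * = 17, so the meta node is followed by its 16 leaf children at offsets
 * 1..16 and next_skip is 17 / 16 = 1; for radix 16384, skip is 273 and
 * next_skip is 17.)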
*/ static daddr_t blst_meta_alloc(blmeta_t *scan, daddr_t cursor, daddr_t count, u_daddr_t radix) { daddr_t blk, i, next_skip, r, skip; int child; bool scan_from_start; blk = cursor & -radix; if (radix == BLIST_BMAP_RADIX) return (blst_leaf_alloc(scan, blk, count, cursor)); if (scan->u.bmu_avail < count) { /* * The meta node's hint must be too large if the allocation * exceeds the number of free blocks. Reduce the hint, and * return failure. */ scan->bm_bighint = scan->u.bmu_avail; return (SWAPBLK_NONE); } skip = radix_to_skip(radix); next_skip = skip / BLIST_META_RADIX; /* * An ALL-FREE meta node requires special handling before allocating * any of its blocks. */ if (scan->u.bmu_avail == radix) { radix /= BLIST_META_RADIX; /* * Reinitialize each of the meta node's children. An ALL-FREE * meta node cannot have a terminator in any subtree. */ for (i = 1; i < skip; i += next_skip) { if (next_skip == 1) scan[i].u.bmu_bitmap = (u_daddr_t)-1; else scan[i].u.bmu_avail = radix; scan[i].bm_bighint = radix; } } else { radix /= BLIST_META_RADIX; } if (count > radix) { /* * The allocation exceeds the number of blocks that are * managed by a subtree of this meta node. */ panic("allocation too large"); } scan_from_start = cursor == blk; child = (cursor - blk) / radix; blk += child * radix; for (i = 1 + child * next_skip; i < skip; i += next_skip) { if (count <= scan[i].bm_bighint) { /* * The allocation might fit in the i'th subtree. */ r = blst_meta_alloc(&scan[i], cursor > blk ? cursor : blk, count, radix); if (r != SWAPBLK_NONE) { scan->u.bmu_avail -= count; return (r); } } else if (scan[i].bm_bighint == (daddr_t)-1) { /* * Terminator */ break; } blk += radix; } /* * We couldn't allocate count in this subtree, update bighint. */ if (scan_from_start && scan->bm_bighint >= count) scan->bm_bighint = count - 1; return (SWAPBLK_NONE); } /* * BLST_LEAF_FREE() - free allocated block from leaf bitmap * */ static void blst_leaf_free(blmeta_t *scan, daddr_t blk, int count) { /* * free some data in this bitmap * * e.g. * 0000111111111110000 * \_________/\__/ * v n */ int n = blk & (BLIST_BMAP_RADIX - 1); u_daddr_t mask; mask = ((u_daddr_t)-1 << n) & ((u_daddr_t)-1 >> (BLIST_BMAP_RADIX - count - n)); if (scan->u.bmu_bitmap & mask) panic("blst_radix_free: freeing free block"); scan->u.bmu_bitmap |= mask; /* * We could probably do a better job here. We are required to make * bighint at least as large as the biggest contiguous block of * data. If we just shoehorn it, a little extra overhead will * be incured on the next allocation (but only that one typically). */ scan->bm_bighint = BLIST_BMAP_RADIX; } /* * BLST_META_FREE() - free allocated blocks from radix tree meta info * * This support routine frees a range of blocks from the bitmap. * The range must be entirely enclosed by this radix node. If a * meta node, we break the range down recursively to free blocks * in subnodes (which means that this code can free an arbitrary * range whereas the allocation code cannot allocate an arbitrary * range). */ static void blst_meta_free(blmeta_t *scan, daddr_t freeBlk, daddr_t count, u_daddr_t radix) { daddr_t blk, i, next_skip, skip, v; int child; if (scan->bm_bighint == (daddr_t)-1) panic("freeing invalid range"); if (radix == BLIST_BMAP_RADIX) return (blst_leaf_free(scan, freeBlk, count)); skip = radix_to_skip(radix); next_skip = skip / BLIST_META_RADIX; if (scan->u.bmu_avail == 0) { /* * ALL-ALLOCATED special case, with possible * shortcut to ALL-FREE special case. 
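 *
 * (For example, freeing exactly radix blocks into a fully allocated subtree
 * takes the shortcut: u.bmu_avail becomes radix, the collapsed ALL-FREE
 * state, and the per-child loops below are skipped entirely.)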
*/ scan->u.bmu_avail = count; scan->bm_bighint = count; if (count != radix) { for (i = 1; i < skip; i += next_skip) { if (scan[i].bm_bighint == (daddr_t)-1) break; scan[i].bm_bighint = 0; if (next_skip == 1) { scan[i].u.bmu_bitmap = 0; } else { scan[i].u.bmu_avail = 0; } } /* fall through */ } } else { scan->u.bmu_avail += count; /* scan->bm_bighint = radix; */ } /* * ALL-FREE special case. */ if (scan->u.bmu_avail == radix) return; if (scan->u.bmu_avail > radix) panic("blst_meta_free: freeing already free blocks (%lld) %lld/%lld", (long long)count, (long long)scan->u.bmu_avail, (long long)radix); /* * Break the free down into its components */ blk = freeBlk & -radix; radix /= BLIST_META_RADIX; child = (freeBlk - blk) / radix; blk += child * radix; i = 1 + child * next_skip; while (i < skip && blk < freeBlk + count) { v = blk + radix - freeBlk; if (v > count) v = count; blst_meta_free(&scan[i], freeBlk, v, radix); if (scan->bm_bighint < scan[i].bm_bighint) scan->bm_bighint = scan[i].bm_bighint; count -= v; freeBlk += v; blk += radix; i += next_skip; } } /* * BLIST_RADIX_COPY() - copy one radix tree to another * * Locates free space in the source tree and frees it in the destination * tree. The space may not already be free in the destination. */ static void blst_copy(blmeta_t *scan, daddr_t blk, daddr_t radix, blist_t dest, daddr_t count) { daddr_t i, next_skip, skip; /* * Leaf node */ if (radix == BLIST_BMAP_RADIX) { u_daddr_t v = scan->u.bmu_bitmap; if (v == (u_daddr_t)-1) { blist_free(dest, blk, count); } else if (v != 0) { int i; for (i = 0; i < BLIST_BMAP_RADIX && i < count; ++i) { if (v & ((u_daddr_t)1 << i)) blist_free(dest, blk + i, 1); } } return; } /* * Meta node */ if (scan->u.bmu_avail == 0) { /* * Source all allocated, leave dest allocated */ return; } if (scan->u.bmu_avail == radix) { /* * Source all free, free entire dest */ if (count < radix) blist_free(dest, blk, count); else blist_free(dest, blk, radix); return; } skip = radix_to_skip(radix); next_skip = skip / BLIST_META_RADIX; radix /= BLIST_META_RADIX; for (i = 1; count && i < skip; i += next_skip) { if (scan[i].bm_bighint == (daddr_t)-1) break; if (count >= radix) { blst_copy(&scan[i], blk, radix, dest, radix); count -= radix; } else { if (count) { blst_copy(&scan[i], blk, radix, dest, count); } count = 0; } blk += radix; } } /* * BLST_LEAF_FILL() - allocate specific blocks in leaf bitmap * * This routine allocates all blocks in the specified range * regardless of any existing allocations in that range. Returns * the number of blocks allocated by the call. */ static daddr_t blst_leaf_fill(blmeta_t *scan, daddr_t blk, int count) { int n = blk & (BLIST_BMAP_RADIX - 1); daddr_t nblks; u_daddr_t mask; mask = ((u_daddr_t)-1 << n) & ((u_daddr_t)-1 >> (BLIST_BMAP_RADIX - count - n)); /* Count the number of blocks that we are allocating. */ nblks = bitcount64(scan->u.bmu_bitmap & mask); scan->u.bmu_bitmap &= ~mask; return (nblks); } /* * BLIST_META_FILL() - allocate specific blocks at a meta node * * This routine allocates the specified range of blocks, * regardless of any existing allocations in the range. The * range must be within the extent of this node. Returns the * number of blocks allocated by the call. 
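blst_leaf_free() and blst_leaf_fill() above share one mask idiom: build a 64-bit mask covering bits [n, n + count) and apply it to the leaf bitmap. A stand-alone sketch of the fill case (hypothetical helper; assumes n >= 0, count >= 1 and n + count <= 64 so the shifts stay in range, and uses a compiler builtin where the kernel uses bitcount64()):

	#include <stdint.h>

	/*
	 * Mark bits [n, n + count) allocated in *bitmap (1 = free) and return
	 * how many of them were actually free beforehand.
	 */
	static int
	leaf_fill(uint64_t *bitmap, int n, int count)
	{
		uint64_t mask;
		int nblks;

		mask = ((uint64_t)-1 << n) &
		    ((uint64_t)-1 >> (64 - count - n));
		nblks = __builtin_popcountll(*bitmap & mask);
		*bitmap &= ~mask;
		return (nblks);
	}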
*/ static daddr_t blst_meta_fill(blmeta_t *scan, daddr_t allocBlk, daddr_t count, u_daddr_t radix) { daddr_t blk, i, nblks, next_skip, skip, v; int child; if (scan->bm_bighint == (daddr_t)-1) panic("filling invalid range"); if (count > radix) { /* * The allocation exceeds the number of blocks that are * managed by this node. */ panic("fill too large"); } if (radix == BLIST_BMAP_RADIX) return (blst_leaf_fill(scan, allocBlk, count)); if (count == radix || scan->u.bmu_avail == 0) { /* * ALL-ALLOCATED special case */ nblks = scan->u.bmu_avail; scan->u.bmu_avail = 0; scan->bm_bighint = 0; return (nblks); } skip = radix_to_skip(radix); next_skip = skip / BLIST_META_RADIX; blk = allocBlk & -radix; /* * An ALL-FREE meta node requires special handling before allocating * any of its blocks. */ if (scan->u.bmu_avail == radix) { radix /= BLIST_META_RADIX; /* * Reinitialize each of the meta node's children. An ALL-FREE * meta node cannot have a terminator in any subtree. */ for (i = 1; i < skip; i += next_skip) { if (next_skip == 1) scan[i].u.bmu_bitmap = (u_daddr_t)-1; else scan[i].u.bmu_avail = radix; scan[i].bm_bighint = radix; } } else { radix /= BLIST_META_RADIX; } nblks = 0; child = (allocBlk - blk) / radix; blk += child * radix; i = 1 + child * next_skip; while (i < skip && blk < allocBlk + count) { v = blk + radix - allocBlk; if (v > count) v = count; nblks += blst_meta_fill(&scan[i], allocBlk, v, radix); count -= v; allocBlk += v; blk += radix; i += next_skip; } scan->u.bmu_avail -= nblks; return (nblks); } /* * BLST_RADIX_INIT() - initialize radix tree * * Initialize our meta structures and bitmaps and calculate the exact * amount of space required to manage 'count' blocks - this space may * be considerably less than the calculated radix due to the large * RADIX values we use. */ static daddr_t blst_radix_init(blmeta_t *scan, daddr_t radix, daddr_t count) { daddr_t i, memindex, next_skip, skip; memindex = 0; /* * Leaf node */ if (radix == BLIST_BMAP_RADIX) { if (scan) { scan->bm_bighint = 0; scan->u.bmu_bitmap = 0; } return (memindex); } /* * Meta node. If allocating the entire object we can special * case it. However, we need to figure out how much memory * is required to manage 'count' blocks, so we continue on anyway. */ if (scan) { scan->bm_bighint = 0; scan->u.bmu_avail = 0; } skip = radix_to_skip(radix); next_skip = skip / BLIST_META_RADIX; radix /= BLIST_META_RADIX; for (i = 1; i < skip; i += next_skip) { if (count >= radix) { /* * Allocate the entire object */ memindex = i + blst_radix_init(((scan) ? &scan[i] : NULL), radix, radix); count -= radix; } else if (count > 0) { /* * Allocate a partial object */ memindex = i + blst_radix_init(((scan) ? 
&scan[i] : NULL), radix, count); count = 0; } else { /* * Add terminator and break out */ if (scan) scan[i].bm_bighint = (daddr_t)-1; break; } } if (memindex < i) memindex = i; return (memindex); } #ifdef BLIST_DEBUG static void blst_radix_print(blmeta_t *scan, daddr_t blk, daddr_t radix, int tab) { daddr_t i, next_skip, skip; if (radix == BLIST_BMAP_RADIX) { printf( "%*.*s(%08llx,%lld): bitmap %016llx big=%lld\n", tab, tab, "", (long long)blk, (long long)radix, (long long)scan->u.bmu_bitmap, (long long)scan->bm_bighint ); return; } if (scan->u.bmu_avail == 0) { printf( "%*.*s(%08llx,%lld) ALL ALLOCATED\n", tab, tab, "", (long long)blk, (long long)radix ); return; } if (scan->u.bmu_avail == radix) { printf( "%*.*s(%08llx,%lld) ALL FREE\n", tab, tab, "", (long long)blk, (long long)radix ); return; } printf( "%*.*s(%08llx,%lld): subtree (%lld/%lld) big=%lld {\n", tab, tab, "", (long long)blk, (long long)radix, (long long)scan->u.bmu_avail, (long long)radix, (long long)scan->bm_bighint ); skip = radix_to_skip(radix); next_skip = skip / BLIST_META_RADIX; radix /= BLIST_META_RADIX; tab += 4; for (i = 1; i < skip; i += next_skip) { if (scan[i].bm_bighint == (daddr_t)-1) { printf( "%*.*s(%08llx,%lld): Terminator\n", tab, tab, "", (long long)blk, (long long)radix ); break; } blst_radix_print(&scan[i], blk, radix, tab); blk += radix; } tab -= 4; printf( "%*.*s}\n", tab, tab, "" ); } #endif #ifdef BLIST_DEBUG int main(int ac, char **av) { int size = 1024; int i; blist_t bl; for (i = 1; i < ac; ++i) { const char *ptr = av[i]; if (*ptr != '-') { size = strtol(ptr, NULL, 0); continue; } ptr += 2; fprintf(stderr, "Bad option: %s\n", ptr - 2); exit(1); } bl = blist_create(size, M_WAITOK); blist_free(bl, 0, size); for (;;) { char buf[1024]; long long da = 0; long long count = 0; printf("%lld/%lld/%lld> ", (long long)blist_avail(bl), (long long)size, (long long)bl->bl_radix); fflush(stdout); if (fgets(buf, sizeof(buf), stdin) == NULL) break; switch(buf[0]) { case 'r': if (sscanf(buf + 1, "%lld", &count) == 1) { blist_resize(&bl, count, 1, M_WAITOK); } else { printf("?\n"); } case 'p': blist_print(bl); break; case 'a': if (sscanf(buf + 1, "%lld", &count) == 1) { daddr_t blk = blist_alloc(bl, count); printf(" R=%08llx\n", (long long)blk); } else { printf("?\n"); } break; case 'f': if (sscanf(buf + 1, "%llx %lld", &da, &count) == 2) { blist_free(bl, da, count); } else { printf("?\n"); } break; case 'l': if (sscanf(buf + 1, "%llx %lld", &da, &count) == 2) { printf(" n=%jd\n", (intmax_t)blist_fill(bl, da, count)); } else { printf("?\n"); } break; case '?': case 'h': puts( "p -print\n" "a %d -allocate\n" "f %x %d -free\n" "l %x %d -fill\n" "r %d -resize\n" "h/? -help" ); break; default: printf("?\n"); break; } } return(0); } void panic(const char *ctl, ...) { va_list va; va_start(va, ctl); vfprintf(stderr, ctl, va); fprintf(stderr, "\n"); va_end(va); exit(1); } #endif Index: projects/runtime-coverage/sys/kern/sys_socket.c =================================================================== --- projects/runtime-coverage/sys/kern/sys_socket.c (revision 322921) +++ projects/runtime-coverage/sys/kern/sys_socket.c (revision 322922) @@ -1,823 +1,819 @@ /*- * Copyright (c) 1982, 1986, 1990, 1993 * The Regents of the University of California. All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. 
Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * @(#)sys_socket.c 8.1 (Berkeley) 6/10/93 */ #include __FBSDID("$FreeBSD$"); #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include /* XXX */ #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include static SYSCTL_NODE(_kern_ipc, OID_AUTO, aio, CTLFLAG_RD, NULL, "socket AIO stats"); static int empty_results; SYSCTL_INT(_kern_ipc_aio, OID_AUTO, empty_results, CTLFLAG_RD, &empty_results, 0, "socket operation returned EAGAIN"); static int empty_retries; SYSCTL_INT(_kern_ipc_aio, OID_AUTO, empty_retries, CTLFLAG_RD, &empty_retries, 0, "socket operation retries"); static fo_rdwr_t soo_read; static fo_rdwr_t soo_write; static fo_ioctl_t soo_ioctl; static fo_poll_t soo_poll; extern fo_kqfilter_t soo_kqfilter; static fo_stat_t soo_stat; static fo_close_t soo_close; static fo_fill_kinfo_t soo_fill_kinfo; static fo_aio_queue_t soo_aio_queue; static void soo_aio_cancel(struct kaiocb *job); struct fileops socketops = { .fo_read = soo_read, .fo_write = soo_write, .fo_truncate = invfo_truncate, .fo_ioctl = soo_ioctl, .fo_poll = soo_poll, .fo_kqfilter = soo_kqfilter, .fo_stat = soo_stat, .fo_close = soo_close, .fo_chmod = invfo_chmod, .fo_chown = invfo_chown, .fo_sendfile = invfo_sendfile, .fo_fill_kinfo = soo_fill_kinfo, .fo_aio_queue = soo_aio_queue, .fo_flags = DFLAG_PASSABLE }; static int soo_read(struct file *fp, struct uio *uio, struct ucred *active_cred, int flags, struct thread *td) { struct socket *so = fp->f_data; int error; #ifdef MAC error = mac_socket_check_receive(active_cred, so); if (error) return (error); #endif error = soreceive(so, 0, uio, 0, 0, 0); return (error); } static int soo_write(struct file *fp, struct uio *uio, struct ucred *active_cred, int flags, struct thread *td) { struct socket *so = fp->f_data; int error; #ifdef MAC error = mac_socket_check_send(active_cred, so); if (error) return (error); #endif error = sosend(so, 0, uio, 0, 0, 0, uio->uio_td); if (error == EPIPE && (so->so_options & SO_NOSIGPIPE) == 0) { PROC_LOCK(uio->uio_td->td_proc); tdsignal(uio->uio_td, SIGPIPE); 
PROC_UNLOCK(uio->uio_td->td_proc); } return (error); } static int soo_ioctl(struct file *fp, u_long cmd, void *data, struct ucred *active_cred, struct thread *td) { struct socket *so = fp->f_data; int error = 0; switch (cmd) { case FIONBIO: SOCK_LOCK(so); if (*(int *)data) so->so_state |= SS_NBIO; else so->so_state &= ~SS_NBIO; SOCK_UNLOCK(so); break; case FIOASYNC: if (*(int *)data) { SOCK_LOCK(so); so->so_state |= SS_ASYNC; if (SOLISTENING(so)) { so->sol_sbrcv_flags |= SB_ASYNC; so->sol_sbsnd_flags |= SB_ASYNC; } else { SOCKBUF_LOCK(&so->so_rcv); so->so_rcv.sb_flags |= SB_ASYNC; SOCKBUF_UNLOCK(&so->so_rcv); SOCKBUF_LOCK(&so->so_snd); so->so_snd.sb_flags |= SB_ASYNC; SOCKBUF_UNLOCK(&so->so_snd); } SOCK_UNLOCK(so); } else { SOCK_LOCK(so); so->so_state &= ~SS_ASYNC; if (SOLISTENING(so)) { so->sol_sbrcv_flags &= ~SB_ASYNC; so->sol_sbsnd_flags &= ~SB_ASYNC; } else { SOCKBUF_LOCK(&so->so_rcv); so->so_rcv.sb_flags &= ~SB_ASYNC; SOCKBUF_UNLOCK(&so->so_rcv); SOCKBUF_LOCK(&so->so_snd); so->so_snd.sb_flags &= ~SB_ASYNC; SOCKBUF_UNLOCK(&so->so_snd); } SOCK_UNLOCK(so); } break; case FIONREAD: /* Unlocked read. */ *(int *)data = sbavail(&so->so_rcv); break; case FIONWRITE: /* Unlocked read. */ *(int *)data = sbavail(&so->so_snd); break; case FIONSPACE: /* Unlocked read. */ if ((so->so_snd.sb_hiwat < sbused(&so->so_snd)) || (so->so_snd.sb_mbmax < so->so_snd.sb_mbcnt)) *(int *)data = 0; else *(int *)data = sbspace(&so->so_snd); break; case FIOSETOWN: error = fsetown(*(int *)data, &so->so_sigio); break; case FIOGETOWN: *(int *)data = fgetown(&so->so_sigio); break; case SIOCSPGRP: error = fsetown(-(*(int *)data), &so->so_sigio); break; case SIOCGPGRP: *(int *)data = -fgetown(&so->so_sigio); break; case SIOCATMARK: /* Unlocked read. */ *(int *)data = (so->so_rcv.sb_state & SBS_RCVATMARK) != 0; break; default: /* * Interface/routing/protocol specific ioctls: interface and * routing ioctls should have a different entry since a * socket is unnecessary. */ if (IOCGROUP(cmd) == 'i') error = ifioctl(so, cmd, data, td); else if (IOCGROUP(cmd) == 'r') { CURVNET_SET(so->so_vnet); error = rtioctl_fib(cmd, data, so->so_fibnum); CURVNET_RESTORE(); } else { CURVNET_SET(so->so_vnet); error = ((*so->so_proto->pr_usrreqs->pru_control) (so, cmd, data, 0, td)); CURVNET_RESTORE(); } break; } return (error); } static int soo_poll(struct file *fp, int events, struct ucred *active_cred, struct thread *td) { struct socket *so = fp->f_data; #ifdef MAC int error; error = mac_socket_check_poll(active_cred, so); if (error) return (error); #endif return (sopoll(so, events, fp->f_cred, td)); } static int soo_stat(struct file *fp, struct stat *ub, struct ucred *active_cred, struct thread *td) { struct socket *so = fp->f_data; #ifdef MAC int error; #endif bzero((caddr_t)ub, sizeof (*ub)); ub->st_mode = S_IFSOCK; #ifdef MAC error = mac_socket_check_stat(active_cred, so); if (error) return (error); #endif if (!SOLISTENING(so)) { struct sockbuf *sb; /* * If SBS_CANTRCVMORE is set, but there's still data left * in the receive buffer, the socket is still readable. 
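From user space, the FIONBIO, FIONREAD and FIONSPACE cases handled above are reached through ioctl(2) on a socket descriptor. A minimal sketch (hypothetical helper, error handling omitted):

	#include <sys/types.h>
	#include <sys/ioctl.h>
	#include <sys/filio.h>

	static void
	socket_ioctl_example(int s)
	{
		int on = 1, nread = 0, nspace = 0;

		(void)ioctl(s, FIONBIO, &on);		/* set SS_NBIO (non-blocking) */
		(void)ioctl(s, FIONREAD, &nread);	/* bytes pending in so_rcv */
		(void)ioctl(s, FIONSPACE, &nspace);	/* space left in so_snd */
	}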
*/ sb = &so->so_rcv; SOCKBUF_LOCK(sb); if ((sb->sb_state & SBS_CANTRCVMORE) == 0 || sbavail(sb)) ub->st_mode |= S_IRUSR | S_IRGRP | S_IROTH; ub->st_size = sbavail(sb) - sb->sb_ctl; SOCKBUF_UNLOCK(sb); sb = &so->so_snd; SOCKBUF_LOCK(sb); if ((sb->sb_state & SBS_CANTSENDMORE) == 0) ub->st_mode |= S_IWUSR | S_IWGRP | S_IWOTH; SOCKBUF_UNLOCK(sb); } ub->st_uid = so->so_cred->cr_uid; ub->st_gid = so->so_cred->cr_gid; return (*so->so_proto->pr_usrreqs->pru_sense)(so, ub); } /* * API socket close on file pointer. We call soclose() to close the socket * (including initiating closing protocols). soclose() will sorele() the * file reference but the actual socket will not go away until the socket's * ref count hits 0. */ static int soo_close(struct file *fp, struct thread *td) { int error = 0; struct socket *so; so = fp->f_data; fp->f_ops = &badfileops; fp->f_data = NULL; if (so) error = soclose(so); return (error); } static int soo_fill_kinfo(struct file *fp, struct kinfo_file *kif, struct filedesc *fdp) { struct sockaddr *sa; struct inpcb *inpcb; struct unpcb *unpcb; struct socket *so; int error; kif->kf_type = KF_TYPE_SOCKET; so = fp->f_data; kif->kf_un.kf_sock.kf_sock_domain0 = so->so_proto->pr_domain->dom_family; kif->kf_un.kf_sock.kf_sock_type0 = so->so_type; kif->kf_un.kf_sock.kf_sock_protocol0 = so->so_proto->pr_protocol; kif->kf_un.kf_sock.kf_sock_pcb = (uintptr_t)so->so_pcb; switch (kif->kf_un.kf_sock.kf_sock_domain0) { case AF_INET: case AF_INET6: if (kif->kf_un.kf_sock.kf_sock_protocol0 == IPPROTO_TCP) { if (so->so_pcb != NULL) { inpcb = (struct inpcb *)(so->so_pcb); kif->kf_un.kf_sock.kf_sock_inpcb = (uintptr_t)inpcb->inp_ppcb; kif->kf_un.kf_sock.kf_sock_sendq = sbused(&so->so_snd); kif->kf_un.kf_sock.kf_sock_recvq = sbused(&so->so_rcv); } } break; case AF_UNIX: if (so->so_pcb != NULL) { unpcb = (struct unpcb *)(so->so_pcb); if (unpcb->unp_conn) { kif->kf_un.kf_sock.kf_sock_unpconn = (uintptr_t)unpcb->unp_conn; kif->kf_un.kf_sock.kf_sock_rcv_sb_state = so->so_rcv.sb_state; kif->kf_un.kf_sock.kf_sock_snd_sb_state = so->so_snd.sb_state; kif->kf_un.kf_sock.kf_sock_sendq = sbused(&so->so_snd); kif->kf_un.kf_sock.kf_sock_recvq = sbused(&so->so_rcv); } } break; } error = so->so_proto->pr_usrreqs->pru_sockaddr(so, &sa); if (error == 0 && sa->sa_len <= sizeof(kif->kf_un.kf_sock.kf_sa_local)) { bcopy(sa, &kif->kf_un.kf_sock.kf_sa_local, sa->sa_len); free(sa, M_SONAME); } error = so->so_proto->pr_usrreqs->pru_peeraddr(so, &sa); if (error == 0 && sa->sa_len <= sizeof(kif->kf_un.kf_sock.kf_sa_peer)) { bcopy(sa, &kif->kf_un.kf_sock.kf_sa_peer, sa->sa_len); free(sa, M_SONAME); } strncpy(kif->kf_path, so->so_proto->pr_domain->dom_name, sizeof(kif->kf_path)); return (0); } /* * Use the 'backend3' field in AIO jobs to store the amount of data * completed by the AIO job so far. 
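The soo_stat() logic above is what fstat(2) reports for a socket descriptor; for a connected stream socket the size field reflects the readable bytes pending in the receive buffer. A sketch (hypothetical helper):

	#include <sys/stat.h>
	#include <stdint.h>
	#include <stdio.h>

	static void
	socket_stat_example(int s)
	{
		struct stat st;

		if (fstat(s, &st) == 0 && S_ISSOCK(st.st_mode))
			printf("pending bytes: %jd\n", (intmax_t)st.st_size);
	}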
*/ #define aio_done backend3 static STAILQ_HEAD(, task) soaio_jobs; static struct mtx soaio_jobs_lock; static struct task soaio_kproc_task; static int soaio_starting, soaio_idle, soaio_queued; static struct unrhdr *soaio_kproc_unr; static int soaio_max_procs = MAX_AIO_PROCS; SYSCTL_INT(_kern_ipc_aio, OID_AUTO, max_procs, CTLFLAG_RW, &soaio_max_procs, 0, "Maximum number of kernel processes to use for async socket IO"); static int soaio_num_procs; SYSCTL_INT(_kern_ipc_aio, OID_AUTO, num_procs, CTLFLAG_RD, &soaio_num_procs, 0, "Number of active kernel processes for async socket IO"); static int soaio_target_procs = TARGET_AIO_PROCS; SYSCTL_INT(_kern_ipc_aio, OID_AUTO, target_procs, CTLFLAG_RD, &soaio_target_procs, 0, "Preferred number of ready kernel processes for async socket IO"); static int soaio_lifetime; SYSCTL_INT(_kern_ipc_aio, OID_AUTO, lifetime, CTLFLAG_RW, &soaio_lifetime, 0, "Maximum lifetime for idle aiod"); static void soaio_kproc_loop(void *arg) { struct proc *p; struct vmspace *myvm; struct task *task; int error, id, pending; id = (intptr_t)arg; /* * Grab an extra reference on the daemon's vmspace so that it * doesn't get freed by jobs that switch to a different * vmspace. */ p = curproc; myvm = vmspace_acquire_ref(p); mtx_lock(&soaio_jobs_lock); MPASS(soaio_starting > 0); soaio_starting--; for (;;) { while (!STAILQ_EMPTY(&soaio_jobs)) { task = STAILQ_FIRST(&soaio_jobs); STAILQ_REMOVE_HEAD(&soaio_jobs, ta_link); soaio_queued--; pending = task->ta_pending; task->ta_pending = 0; mtx_unlock(&soaio_jobs_lock); task->ta_func(task->ta_context, pending); mtx_lock(&soaio_jobs_lock); } MPASS(soaio_queued == 0); if (p->p_vmspace != myvm) { mtx_unlock(&soaio_jobs_lock); vmspace_switch_aio(myvm); mtx_lock(&soaio_jobs_lock); continue; } soaio_idle++; error = mtx_sleep(&soaio_idle, &soaio_jobs_lock, 0, "-", soaio_lifetime); soaio_idle--; if (error == EWOULDBLOCK && STAILQ_EMPTY(&soaio_jobs) && soaio_num_procs > soaio_target_procs) break; } soaio_num_procs--; mtx_unlock(&soaio_jobs_lock); free_unr(soaio_kproc_unr, id); kproc_exit(0); } static void soaio_kproc_create(void *context, int pending) { struct proc *p; int error, id; mtx_lock(&soaio_jobs_lock); for (;;) { if (soaio_num_procs < soaio_target_procs) { /* Must create */ } else if (soaio_num_procs >= soaio_max_procs) { /* * Hit the limit on kernel processes, don't * create another one. */ break; } else if (soaio_queued <= soaio_idle + soaio_starting) { /* * No more AIO jobs waiting for a process to be * created, so stop. 
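The limits declared above surface as sysctls under kern.ipc.aio; the names follow directly from the SYSCTL_NODE/SYSCTL_INT declarations. A user-space sketch for inspecting two of them (hypothetical helper):

	#include <sys/types.h>
	#include <sys/sysctl.h>
	#include <stdio.h>

	static void
	show_soaio_limits(void)
	{
		int maxprocs, numprocs;
		size_t len = sizeof(int);

		if (sysctlbyname("kern.ipc.aio.max_procs", &maxprocs, &len,
		    NULL, 0) == 0 &&
		    sysctlbyname("kern.ipc.aio.num_procs", &numprocs, &len,
		    NULL, 0) == 0)
			printf("socket AIO kprocs: %d active, %d max\n",
			    numprocs, maxprocs);
	}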
*/ break; } soaio_starting++; mtx_unlock(&soaio_jobs_lock); id = alloc_unr(soaio_kproc_unr); error = kproc_create(soaio_kproc_loop, (void *)(intptr_t)id, &p, 0, 0, "soaiod%d", id); if (error != 0) { free_unr(soaio_kproc_unr, id); mtx_lock(&soaio_jobs_lock); soaio_starting--; break; } mtx_lock(&soaio_jobs_lock); soaio_num_procs++; } mtx_unlock(&soaio_jobs_lock); } void soaio_enqueue(struct task *task) { mtx_lock(&soaio_jobs_lock); MPASS(task->ta_pending == 0); task->ta_pending++; STAILQ_INSERT_TAIL(&soaio_jobs, task, ta_link); soaio_queued++; if (soaio_queued <= soaio_idle) wakeup_one(&soaio_idle); else if (soaio_num_procs < soaio_max_procs) taskqueue_enqueue(taskqueue_thread, &soaio_kproc_task); mtx_unlock(&soaio_jobs_lock); } static void soaio_init(void) { soaio_lifetime = AIOD_LIFETIME_DEFAULT; STAILQ_INIT(&soaio_jobs); mtx_init(&soaio_jobs_lock, "soaio jobs", NULL, MTX_DEF); soaio_kproc_unr = new_unrhdr(1, INT_MAX, NULL); TASK_INIT(&soaio_kproc_task, 0, soaio_kproc_create, NULL); if (soaio_target_procs > 0) taskqueue_enqueue(taskqueue_thread, &soaio_kproc_task); } SYSINIT(soaio, SI_SUB_VFS, SI_ORDER_ANY, soaio_init, NULL); static __inline int soaio_ready(struct socket *so, struct sockbuf *sb) { return (sb == &so->so_rcv ? soreadable(so) : sowriteable(so)); } static void soaio_process_job(struct socket *so, struct sockbuf *sb, struct kaiocb *job) { struct ucred *td_savedcred; struct thread *td; struct file *fp; struct uio uio; struct iovec iov; size_t cnt, done; long ru_before; int error, flags; SOCKBUF_UNLOCK(sb); aio_switch_vmspace(job); td = curthread; fp = job->fd_file; retry: td_savedcred = td->td_ucred; td->td_ucred = job->cred; done = job->aio_done; cnt = job->uaiocb.aio_nbytes - done; iov.iov_base = (void *)((uintptr_t)job->uaiocb.aio_buf + done); iov.iov_len = cnt; uio.uio_iov = &iov; uio.uio_iovcnt = 1; uio.uio_offset = 0; uio.uio_resid = cnt; uio.uio_segflg = UIO_USERSPACE; uio.uio_td = td; flags = MSG_NBIO; /* * For resource usage accounting, only count a completed request * as a single message to avoid counting multiple calls to * sosend/soreceive on a blocking socket. */ if (sb == &so->so_rcv) { uio.uio_rw = UIO_READ; ru_before = td->td_ru.ru_msgrcv; #ifdef MAC error = mac_socket_check_receive(fp->f_cred, so); if (error == 0) #endif error = soreceive(so, NULL, &uio, NULL, NULL, &flags); if (td->td_ru.ru_msgrcv != ru_before) job->msgrcv = 1; } else { if (!TAILQ_EMPTY(&sb->sb_aiojobq)) flags |= MSG_MORETOCOME; uio.uio_rw = UIO_WRITE; ru_before = td->td_ru.ru_msgsnd; #ifdef MAC error = mac_socket_check_send(fp->f_cred, so); if (error == 0) #endif error = sosend(so, NULL, &uio, NULL, NULL, flags, td); if (td->td_ru.ru_msgsnd != ru_before) job->msgsnd = 1; if (error == EPIPE && (so->so_options & SO_NOSIGPIPE) == 0) { PROC_LOCK(job->userproc); kern_psignal(job->userproc, SIGPIPE); PROC_UNLOCK(job->userproc); } } done += cnt - uio.uio_resid; job->aio_done = done; td->td_ucred = td_savedcred; if (error == EWOULDBLOCK) { /* * The request was either partially completed or not * completed at all due to racing with a read() or * write() on the socket. If the socket is * non-blocking, return with any partial completion. * If the socket is blocking or if no progress has * been made, requeue this request at the head of the * queue to try again when the socket is ready. 
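All of this machinery services ordinary POSIX AIO issued against a socket descriptor. A minimal consumer sketch (hypothetical helper; a real program would block in aio_suspend() or use a completion notification rather than polling):

	#include <aio.h>
	#include <errno.h>
	#include <string.h>
	#include <unistd.h>

	static ssize_t
	aio_socket_read(int s, void *buf, size_t len)
	{
		struct aiocb cb;

		memset(&cb, 0, sizeof(cb));
		cb.aio_fildes = s;
		cb.aio_buf = buf;
		cb.aio_nbytes = len;
		if (aio_read(&cb) != 0)
			return (-1);
		while (aio_error(&cb) == EINPROGRESS)
			usleep(1000);
		return (aio_return(&cb));
	}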
*/ MPASS(done != job->uaiocb.aio_nbytes); SOCKBUF_LOCK(sb); if (done == 0 || !(so->so_state & SS_NBIO)) { empty_results++; if (soaio_ready(so, sb)) { empty_retries++; SOCKBUF_UNLOCK(sb); goto retry; } if (!aio_set_cancel_function(job, soo_aio_cancel)) { SOCKBUF_UNLOCK(sb); if (done != 0) aio_complete(job, done, 0); else aio_cancel(job); SOCKBUF_LOCK(sb); } else { TAILQ_INSERT_HEAD(&sb->sb_aiojobq, job, list); } return; } SOCKBUF_UNLOCK(sb); } if (done != 0 && (error == ERESTART || error == EINTR || error == EWOULDBLOCK)) error = 0; if (error) aio_complete(job, -1, error); else aio_complete(job, done, 0); SOCKBUF_LOCK(sb); } static void soaio_process_sb(struct socket *so, struct sockbuf *sb) { struct kaiocb *job; CURVNET_SET(so->so_vnet); SOCKBUF_LOCK(sb); while (!TAILQ_EMPTY(&sb->sb_aiojobq) && soaio_ready(so, sb)) { job = TAILQ_FIRST(&sb->sb_aiojobq); TAILQ_REMOVE(&sb->sb_aiojobq, job, list); if (!aio_clear_cancel_function(job)) continue; soaio_process_job(so, sb, job); } /* * If there are still pending requests, the socket must not be * ready so set SB_AIO to request a wakeup when the socket * becomes ready. */ if (!TAILQ_EMPTY(&sb->sb_aiojobq)) sb->sb_flags |= SB_AIO; sb->sb_flags &= ~SB_AIO_RUNNING; SOCKBUF_UNLOCK(sb); SOCK_LOCK(so); sorele(so); CURVNET_RESTORE(); } void soaio_rcv(void *context, int pending) { struct socket *so; so = context; soaio_process_sb(so, &so->so_rcv); } void soaio_snd(void *context, int pending) { struct socket *so; so = context; soaio_process_sb(so, &so->so_snd); } void sowakeup_aio(struct socket *so, struct sockbuf *sb) { SOCKBUF_LOCK_ASSERT(sb); sb->sb_flags &= ~SB_AIO; if (sb->sb_flags & SB_AIO_RUNNING) return; sb->sb_flags |= SB_AIO_RUNNING; - if (sb == &so->so_snd) - SOCK_LOCK(so); soref(so); - if (sb == &so->so_snd) - SOCK_UNLOCK(so); soaio_enqueue(&sb->sb_aiotask); } static void soo_aio_cancel(struct kaiocb *job) { struct socket *so; struct sockbuf *sb; long done; int opcode; so = job->fd_file->f_data; opcode = job->uaiocb.aio_lio_opcode; if (opcode == LIO_READ) sb = &so->so_rcv; else { MPASS(opcode == LIO_WRITE); sb = &so->so_snd; } SOCKBUF_LOCK(sb); if (!aio_cancel_cleared(job)) TAILQ_REMOVE(&sb->sb_aiojobq, job, list); if (TAILQ_EMPTY(&sb->sb_aiojobq)) sb->sb_flags &= ~SB_AIO; SOCKBUF_UNLOCK(sb); done = job->aio_done; if (done != 0) aio_complete(job, done, 0); else aio_cancel(job); } static int soo_aio_queue(struct file *fp, struct kaiocb *job) { struct socket *so; struct sockbuf *sb; int error; so = fp->f_data; error = (*so->so_proto->pr_usrreqs->pru_aio_queue)(so, job); if (error == 0) return (0); switch (job->uaiocb.aio_lio_opcode) { case LIO_READ: sb = &so->so_rcv; break; case LIO_WRITE: sb = &so->so_snd; break; default: return (EINVAL); } SOCKBUF_LOCK(sb); if (!aio_set_cancel_function(job, soo_aio_cancel)) panic("new job was cancelled"); TAILQ_INSERT_TAIL(&sb->sb_aiojobq, job, list); if (!(sb->sb_flags & SB_AIO_RUNNING)) { if (soaio_ready(so, sb)) sowakeup_aio(so, sb); else sb->sb_flags |= SB_AIO; } SOCKBUF_UNLOCK(sb); return (0); } Index: projects/runtime-coverage/sys/mips/include/_limits.h =================================================================== --- projects/runtime-coverage/sys/mips/include/_limits.h (revision 322921) +++ projects/runtime-coverage/sys/mips/include/_limits.h (revision 322922) @@ -1,94 +1,100 @@ /*- * Copyright (c) 1988, 1993 * The Regents of the University of California. All rights reserved. 
* * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * @(#)limits.h 8.3 (Berkeley) 1/4/94 * from: src/sys/i386/include/_limits.h,v 1.27 2005/01/06 22:18:15 imp * $FreeBSD$ */ #ifndef _MACHINE__LIMITS_H_ #define _MACHINE__LIMITS_H_ /* * According to ANSI (section 2.2.4.2), the values below must be usable by * #if preprocessing directives. Additionally, the expression must have the * same type as would an expression that is an object of the corresponding * type converted according to the integral promotions. The subtraction for * INT_MIN, etc., is so the value is not unsigned; e.g., 0x80000000 is an * unsigned int for 32-bit two's complement ANSI compilers (section 3.1.3.2). 
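A small demonstration of why the subtraction form matters, assuming the usual 32-bit int (illustrative program, not part of this change):

	#include <stdio.h>

	int
	main(void)
	{
		/* A signed int expression equal to INT_MIN: negative, as expected. */
		printf("%d\n", (-0x7fffffff - 1) < 0);	/* prints 1 */
		/*
		 * The literal 0x80000000 has type unsigned int, so the
		 * comparison is performed unsigned and is false.
		 */
		printf("%d\n", 0x80000000 < 0);		/* prints 0 */
		return (0);
	}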
*/ #define __CHAR_BIT 8 /* number of bits in a char */ #define __SCHAR_MAX 0x7f /* max value for a signed char */ #define __SCHAR_MIN (-0x7f - 1) /* min value for a signed char */ #define __UCHAR_MAX 0xff /* max value for an unsigned char */ #define __USHRT_MAX 0xffff /* max value for an unsigned short */ #define __SHRT_MAX 0x7fff /* max value for a short */ #define __SHRT_MIN (-0x7fff - 1) /* min value for a short */ #define __UINT_MAX 0xffffffff /* max value for an unsigned int */ #define __INT_MAX 0x7fffffff /* max value for an int */ #define __INT_MIN (-0x7fffffff - 1) /* min value for an int */ #ifdef __mips_n64 #define __ULONG_MAX 0xffffffffffffffff #define __LONG_MAX 0x7fffffffffffffff #define __LONG_MIN (-0x7fffffffffffffff - 1) #define __LONG_BIT 64 #else #define __ULONG_MAX 0xffffffffUL /* max value for an unsigned long */ #define __LONG_MAX 0x7fffffffL /* max value for a long */ #define __LONG_MIN (-0x7fffffffL - 1) /* min value for a long */ #define __LONG_BIT 32 #endif /* max value for an unsigned long long */ #define __ULLONG_MAX 0xffffffffffffffffULL #define __LLONG_MAX 0x7fffffffffffffffLL /* max value for a long long */ #define __LLONG_MIN (-0x7fffffffffffffffLL - 1) /* min for a long long */ +#ifdef __mips_n64 #define __SSIZE_MAX __LONG_MAX /* max value for a ssize_t */ - #define __SIZE_T_MAX __ULONG_MAX /* max value for a size_t */ - -#define __OFF_MAX __LLONG_MAX /* max value for an off_t */ -#define __OFF_MIN __LLONG_MIN /* min value for an off_t */ - -/* Quads and long longs are the same size. Ensure they stay in sync. */ -#define __UQUAD_MAX __ULLONG_MAX /* max value for a uquad_t */ -#define __QUAD_MAX __LLONG_MAX /* max value for a quad_t */ -#define __QUAD_MIN __LLONG_MIN /* min value for a quad_t */ +#define __OFF_MAX __LONG_MAX /* max value for an off_t */ +#define __OFF_MIN __LONG_MIN /* min value for an off_t */ +#define __UQUAD_MAX __ULONG_MAX /* max value for a uquad_t */ +#define __QUAD_MAX __LONG_MAX /* max value for a quad_t */ +#define __QUAD_MIN __LONG_MIN /* min value for a quad_t */ +#else +#define __SSIZE_MAX __INT_MAX +#define __SIZE_T_MAX __UINT_MAX +#define __OFF_MAX __LLONG_MAX +#define __OFF_MIN __LLONG_MIN +#define __UQUAD_MAX __ULLONG_MAX +#define __QUAD_MAX __LLONG_MAX +#define __QUAD_MIN __LLONG_MIN +#endif #define __WORD_BIT 32 #define __MINSIGSTKSZ (512 * 4) #endif /* !_MACHINE__LIMITS_H_ */ Index: projects/runtime-coverage/sys/netinet/tcp_timer.c =================================================================== --- projects/runtime-coverage/sys/netinet/tcp_timer.c (revision 322921) +++ projects/runtime-coverage/sys/netinet/tcp_timer.c (revision 322922) @@ -1,1008 +1,985 @@ /*- * Copyright (c) 1982, 1986, 1988, 1990, 1993, 1995 * The Regents of the University of California. All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. 
* * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * @(#)tcp_timer.c 8.2 (Berkeley) 5/24/95 */ #include __FBSDID("$FreeBSD$"); #include "opt_inet.h" #include "opt_inet6.h" #include "opt_tcpdebug.h" #include "opt_rss.h" #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #ifdef INET6 #include #endif #include #include #include #include #include #include #ifdef INET6 #include #endif #include #ifdef TCPDEBUG #include #endif int tcp_persmin; SYSCTL_PROC(_net_inet_tcp, OID_AUTO, persmin, CTLTYPE_INT|CTLFLAG_RW, &tcp_persmin, 0, sysctl_msec_to_ticks, "I", "minimum persistence interval"); int tcp_persmax; SYSCTL_PROC(_net_inet_tcp, OID_AUTO, persmax, CTLTYPE_INT|CTLFLAG_RW, &tcp_persmax, 0, sysctl_msec_to_ticks, "I", "maximum persistence interval"); int tcp_keepinit; SYSCTL_PROC(_net_inet_tcp, TCPCTL_KEEPINIT, keepinit, CTLTYPE_INT|CTLFLAG_RW, &tcp_keepinit, 0, sysctl_msec_to_ticks, "I", "time to establish connection"); int tcp_keepidle; SYSCTL_PROC(_net_inet_tcp, TCPCTL_KEEPIDLE, keepidle, CTLTYPE_INT|CTLFLAG_RW, &tcp_keepidle, 0, sysctl_msec_to_ticks, "I", "time before keepalive probes begin"); int tcp_keepintvl; SYSCTL_PROC(_net_inet_tcp, TCPCTL_KEEPINTVL, keepintvl, CTLTYPE_INT|CTLFLAG_RW, &tcp_keepintvl, 0, sysctl_msec_to_ticks, "I", "time between keepalive probes"); int tcp_delacktime; SYSCTL_PROC(_net_inet_tcp, TCPCTL_DELACKTIME, delacktime, CTLTYPE_INT|CTLFLAG_RW, &tcp_delacktime, 0, sysctl_msec_to_ticks, "I", "Time before a delayed ACK is sent"); int tcp_msl; SYSCTL_PROC(_net_inet_tcp, OID_AUTO, msl, CTLTYPE_INT|CTLFLAG_RW, &tcp_msl, 0, sysctl_msec_to_ticks, "I", "Maximum segment lifetime"); int tcp_rexmit_min; SYSCTL_PROC(_net_inet_tcp, OID_AUTO, rexmit_min, CTLTYPE_INT|CTLFLAG_RW, &tcp_rexmit_min, 0, sysctl_msec_to_ticks, "I", "Minimum Retransmission Timeout"); int tcp_rexmit_slop; SYSCTL_PROC(_net_inet_tcp, OID_AUTO, rexmit_slop, CTLTYPE_INT|CTLFLAG_RW, &tcp_rexmit_slop, 0, sysctl_msec_to_ticks, "I", "Retransmission Timer Slop"); static int always_keepalive = 1; SYSCTL_INT(_net_inet_tcp, OID_AUTO, always_keepalive, CTLFLAG_RW, &always_keepalive , 0, "Assume SO_KEEPALIVE on all TCP connections"); int tcp_fast_finwait2_recycle = 0; SYSCTL_INT(_net_inet_tcp, OID_AUTO, fast_finwait2_recycle, CTLFLAG_RW, &tcp_fast_finwait2_recycle, 0, "Recycle closed FIN_WAIT_2 connections faster"); int tcp_finwait2_timeout; SYSCTL_PROC(_net_inet_tcp, OID_AUTO, finwait2_timeout, CTLTYPE_INT|CTLFLAG_RW, &tcp_finwait2_timeout, 0, sysctl_msec_to_ticks, "I", "FIN-WAIT2 timeout"); int tcp_keepcnt = TCPTV_KEEPCNT; SYSCTL_INT(_net_inet_tcp, OID_AUTO, keepcnt, CTLFLAG_RW, &tcp_keepcnt, 0, "Number of keepalive probes to send"); /* max idle probes */ int tcp_maxpersistidle; static 
int tcp_rexmit_drop_options = 0; SYSCTL_INT(_net_inet_tcp, OID_AUTO, rexmit_drop_options, CTLFLAG_RW, &tcp_rexmit_drop_options, 0, "Drop TCP options from 3rd and later retransmitted SYN"); static VNET_DEFINE(int, tcp_pmtud_blackhole_detect); #define V_tcp_pmtud_blackhole_detect VNET(tcp_pmtud_blackhole_detect) SYSCTL_INT(_net_inet_tcp, OID_AUTO, pmtud_blackhole_detection, CTLFLAG_RW|CTLFLAG_VNET, &VNET_NAME(tcp_pmtud_blackhole_detect), 0, "Path MTU Discovery Black Hole Detection Enabled"); -static VNET_DEFINE(int, tcp_pmtud_blackhole_activated); -#define V_tcp_pmtud_blackhole_activated \ - VNET(tcp_pmtud_blackhole_activated) -SYSCTL_INT(_net_inet_tcp, OID_AUTO, pmtud_blackhole_activated, - CTLFLAG_RD|CTLFLAG_VNET, - &VNET_NAME(tcp_pmtud_blackhole_activated), 0, - "Path MTU Discovery Black Hole Detection, Activation Count"); - -static VNET_DEFINE(int, tcp_pmtud_blackhole_activated_min_mss); -#define V_tcp_pmtud_blackhole_activated_min_mss \ - VNET(tcp_pmtud_blackhole_activated_min_mss) -SYSCTL_INT(_net_inet_tcp, OID_AUTO, pmtud_blackhole_activated_min_mss, - CTLFLAG_RD|CTLFLAG_VNET, - &VNET_NAME(tcp_pmtud_blackhole_activated_min_mss), 0, - "Path MTU Discovery Black Hole Detection, Activation Count at min MSS"); - -static VNET_DEFINE(int, tcp_pmtud_blackhole_failed); -#define V_tcp_pmtud_blackhole_failed VNET(tcp_pmtud_blackhole_failed) -SYSCTL_INT(_net_inet_tcp, OID_AUTO, pmtud_blackhole_failed, - CTLFLAG_RD|CTLFLAG_VNET, - &VNET_NAME(tcp_pmtud_blackhole_failed), 0, - "Path MTU Discovery Black Hole Detection, Failure Count"); - #ifdef INET static VNET_DEFINE(int, tcp_pmtud_blackhole_mss) = 1200; #define V_tcp_pmtud_blackhole_mss VNET(tcp_pmtud_blackhole_mss) SYSCTL_INT(_net_inet_tcp, OID_AUTO, pmtud_blackhole_mss, CTLFLAG_RW|CTLFLAG_VNET, &VNET_NAME(tcp_pmtud_blackhole_mss), 0, "Path MTU Discovery Black Hole Detection lowered MSS"); #endif #ifdef INET6 static VNET_DEFINE(int, tcp_v6pmtud_blackhole_mss) = 1220; #define V_tcp_v6pmtud_blackhole_mss VNET(tcp_v6pmtud_blackhole_mss) SYSCTL_INT(_net_inet_tcp, OID_AUTO, v6pmtud_blackhole_mss, CTLFLAG_RW|CTLFLAG_VNET, &VNET_NAME(tcp_v6pmtud_blackhole_mss), 0, "Path MTU Discovery IPv6 Black Hole Detection lowered MSS"); #endif #ifdef RSS static int per_cpu_timers = 1; #else static int per_cpu_timers = 0; #endif SYSCTL_INT(_net_inet_tcp, OID_AUTO, per_cpu_timers, CTLFLAG_RW, &per_cpu_timers , 0, "run tcp timers on all cpus"); #if 0 #define INP_CPU(inp) (per_cpu_timers ? (!CPU_ABSENT(((inp)->inp_flowid % (mp_maxid+1))) ? \ ((inp)->inp_flowid % (mp_maxid+1)) : curcpu) : 0) #endif /* * Map the given inp to a CPU id. * * This queries RSS if it's compiled in, else it defaults to the current * CPU ID. */ static inline int inp_to_cpuid(struct inpcb *inp) { u_int cpuid; #ifdef RSS if (per_cpu_timers) { cpuid = rss_hash2cpuid(inp->inp_flowid, inp->inp_flowtype); if (cpuid == NETISR_CPUID_NONE) return (curcpu); /* XXX */ else return (cpuid); } #else /* Legacy, pre-RSS behaviour */ if (per_cpu_timers) { /* * We don't have a flowid -> cpuid mapping, so cheat and * just map unknown cpuids to curcpu. Not the best, but * apparently better than defaulting to swi 0. */ cpuid = inp->inp_flowid % (mp_maxid + 1); if (! CPU_ABSENT(cpuid)) return (cpuid); return (curcpu); } #endif /* Default for RSS and non-RSS - cpuid 0 */ else { return (0); } } /* * Tcp protocol timeout routine called every 500 ms. * Updates timestamps used for TCP * causes finite state machine actions if timers expire. 
*/ void tcp_slowtimo(void) { VNET_ITERATOR_DECL(vnet_iter); VNET_LIST_RLOCK_NOSLEEP(); VNET_FOREACH(vnet_iter) { CURVNET_SET(vnet_iter); (void) tcp_tw_2msl_scan(0); CURVNET_RESTORE(); } VNET_LIST_RUNLOCK_NOSLEEP(); } int tcp_syn_backoff[TCP_MAXRXTSHIFT + 1] = { 1, 1, 1, 1, 1, 2, 4, 8, 16, 32, 64, 64, 64 }; int tcp_backoff[TCP_MAXRXTSHIFT + 1] = { 1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 512, 512, 512 }; static int tcp_totbackoff = 2559; /* sum of tcp_backoff[] */ /* * TCP timer processing. */ void tcp_timer_delack(void *xtp) { struct tcpcb *tp = xtp; struct inpcb *inp; CURVNET_SET(tp->t_vnet); inp = tp->t_inpcb; KASSERT(inp != NULL, ("%s: tp %p tp->t_inpcb == NULL", __func__, tp)); INP_WLOCK(inp); if (callout_pending(&tp->t_timers->tt_delack) || !callout_active(&tp->t_timers->tt_delack)) { INP_WUNLOCK(inp); CURVNET_RESTORE(); return; } callout_deactivate(&tp->t_timers->tt_delack); if ((inp->inp_flags & INP_DROPPED) != 0) { INP_WUNLOCK(inp); CURVNET_RESTORE(); return; } tp->t_flags |= TF_ACKNOW; TCPSTAT_INC(tcps_delack); (void) tp->t_fb->tfb_tcp_output(tp); INP_WUNLOCK(inp); CURVNET_RESTORE(); } /* * When a timer wants to remove a TCB it must * hold the INP_INFO_RLOCK(). The timer function * should only have grabbed the INP_WLOCK() when * it entered. To safely switch to holding both the * INP_INFO_RLOCK() and the INP_WLOCK() we must first * grab a reference on the inp, which will hold the inp * so that it can't be removed. We then unlock the INP_WLOCK(), * and grab the INP_INFO_RLOCK() lock. Once we have the INP_INFO_RLOCK() * we proceed again to get the INP_WLOCK() (this preserves proper * lock order). After acquiring the INP_WLOCK we must check if someone * else deleted the pcb i.e. the inp_flags check. * If so we return 1 otherwise we return 0. * * No matter what the tcp_inpinfo_lock_add() function * returns the caller must afterwards call tcp_inpinfo_lock_del() * to drop the locks and reference properly. */ int tcp_inpinfo_lock_add(struct inpcb *inp) { in_pcbref(inp); INP_WUNLOCK(inp); INP_INFO_RLOCK(&V_tcbinfo); INP_WLOCK(inp); if (inp->inp_flags & (INP_TIMEWAIT | INP_DROPPED)) { return(1); } return(0); } void tcp_inpinfo_lock_del(struct inpcb *inp, struct tcpcb *tp) { INP_INFO_RUNLOCK(&V_tcbinfo); if (inp && (tp == NULL)) { /* * If tcp_close/drop() gets called and tp * returns NULL, then the function dropped * the inp lock, we hold a reference keeping * this around, so we must re-aquire the * INP_WLOCK() in order to proceed with * our dropping the inp reference. */ INP_WLOCK(inp); } if (inp && in_pcbrele_wlocked(inp) == 0) INP_WUNLOCK(inp); } void tcp_timer_2msl(void *xtp) { struct tcpcb *tp = xtp; struct inpcb *inp; CURVNET_SET(tp->t_vnet); #ifdef TCPDEBUG int ostate; ostate = tp->t_state; #endif inp = tp->t_inpcb; KASSERT(inp != NULL, ("%s: tp %p tp->t_inpcb == NULL", __func__, tp)); INP_WLOCK(inp); tcp_free_sackholes(tp); if (callout_pending(&tp->t_timers->tt_2msl) || !callout_active(&tp->t_timers->tt_2msl)) { INP_WUNLOCK(tp->t_inpcb); CURVNET_RESTORE(); return; } callout_deactivate(&tp->t_timers->tt_2msl); if ((inp->inp_flags & INP_DROPPED) != 0) { INP_WUNLOCK(inp); CURVNET_RESTORE(); return; } KASSERT((tp->t_timers->tt_flags & TT_STOPPED) == 0, ("%s: tp %p tcpcb can't be stopped here", __func__, tp)); /* * 2 MSL timeout in shutdown went off. If we're closed but * still waiting for peer to close and connection has been idle * too long delete connection control block. Otherwise, check * again in a bit. 
* * If in TIME_WAIT state just ignore as this timeout is handled in * tcp_tw_2msl_scan(). * * If fastrecycle of FIN_WAIT_2, in FIN_WAIT_2 and receiver has closed, * there's no point in hanging onto FIN_WAIT_2 socket. Just close it. * Ignore fact that there were recent incoming segments. */ if ((inp->inp_flags & INP_TIMEWAIT) != 0) { INP_WUNLOCK(inp); CURVNET_RESTORE(); return; } if (tcp_fast_finwait2_recycle && tp->t_state == TCPS_FIN_WAIT_2 && tp->t_inpcb && tp->t_inpcb->inp_socket && (tp->t_inpcb->inp_socket->so_rcv.sb_state & SBS_CANTRCVMORE)) { TCPSTAT_INC(tcps_finwait2_drops); if (tcp_inpinfo_lock_add(inp)) { tcp_inpinfo_lock_del(inp, tp); goto out; } tp = tcp_close(tp); tcp_inpinfo_lock_del(inp, tp); goto out; } else { if (ticks - tp->t_rcvtime <= TP_MAXIDLE(tp)) { callout_reset(&tp->t_timers->tt_2msl, TP_KEEPINTVL(tp), tcp_timer_2msl, tp); } else { if (tcp_inpinfo_lock_add(inp)) { tcp_inpinfo_lock_del(inp, tp); goto out; } tp = tcp_close(tp); tcp_inpinfo_lock_del(inp, tp); goto out; } } #ifdef TCPDEBUG if (tp != NULL && (tp->t_inpcb->inp_socket->so_options & SO_DEBUG)) tcp_trace(TA_USER, ostate, tp, (void *)0, (struct tcphdr *)0, PRU_SLOWTIMO); #endif TCP_PROBE2(debug__user, tp, PRU_SLOWTIMO); if (tp != NULL) INP_WUNLOCK(inp); out: CURVNET_RESTORE(); } void tcp_timer_keep(void *xtp) { struct tcpcb *tp = xtp; struct tcptemp *t_template; struct inpcb *inp; CURVNET_SET(tp->t_vnet); #ifdef TCPDEBUG int ostate; ostate = tp->t_state; #endif inp = tp->t_inpcb; KASSERT(inp != NULL, ("%s: tp %p tp->t_inpcb == NULL", __func__, tp)); INP_WLOCK(inp); if (callout_pending(&tp->t_timers->tt_keep) || !callout_active(&tp->t_timers->tt_keep)) { INP_WUNLOCK(inp); CURVNET_RESTORE(); return; } callout_deactivate(&tp->t_timers->tt_keep); if ((inp->inp_flags & INP_DROPPED) != 0) { INP_WUNLOCK(inp); CURVNET_RESTORE(); return; } KASSERT((tp->t_timers->tt_flags & TT_STOPPED) == 0, ("%s: tp %p tcpcb can't be stopped here", __func__, tp)); /* * Because we don't regularly reset the keepalive callout in * the ESTABLISHED state, it may be that we don't actually need * to send a keepalive yet. If that occurs, schedule another * call for the next time the keepalive timer might expire. */ if (TCPS_HAVEESTABLISHED(tp->t_state)) { u_int idletime; idletime = ticks - tp->t_rcvtime; if (idletime < TP_KEEPIDLE(tp)) { callout_reset(&tp->t_timers->tt_keep, TP_KEEPIDLE(tp) - idletime, tcp_timer_keep, tp); INP_WUNLOCK(inp); CURVNET_RESTORE(); return; } } /* * Keep-alive timer went off; send something * or drop connection if idle for too long. */ TCPSTAT_INC(tcps_keeptimeo); if (tp->t_state < TCPS_ESTABLISHED) goto dropit; if ((always_keepalive || inp->inp_socket->so_options & SO_KEEPALIVE) && tp->t_state <= TCPS_CLOSING) { if (ticks - tp->t_rcvtime >= TP_KEEPIDLE(tp) + TP_MAXIDLE(tp)) goto dropit; /* * Send a packet designed to force a response * if the peer is up and reachable: * either an ACK if the connection is still alive, * or an RST if the peer has closed the connection * due to timeout or reboot. * Using sequence number tp->snd_una-1 * causes the transmitted zero-length segment * to lie outside the receive window; * by the protocol spec, this requires the * correspondent TCP to respond. 
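The global defaults above (tcp_keepidle, tcp_keepintvl, tcp_keepcnt, always_keepalive) can also be overridden for a single connection with the TCP-level socket options; a sketch assuming FreeBSD's TCP_KEEPIDLE/TCP_KEEPINTVL/TCP_KEEPCNT options, values in seconds (hypothetical helper):

	#include <sys/socket.h>
	#include <netinet/in.h>
	#include <netinet/tcp.h>

	static void
	enable_keepalive(int s)
	{
		int on = 1, idle = 60, intvl = 10, cnt = 5;

		(void)setsockopt(s, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on));
		(void)setsockopt(s, IPPROTO_TCP, TCP_KEEPIDLE, &idle, sizeof(idle));
		(void)setsockopt(s, IPPROTO_TCP, TCP_KEEPINTVL, &intvl, sizeof(intvl));
		(void)setsockopt(s, IPPROTO_TCP, TCP_KEEPCNT, &cnt, sizeof(cnt));
	}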
*/ TCPSTAT_INC(tcps_keepprobe); t_template = tcpip_maketemplate(inp); if (t_template) { tcp_respond(tp, t_template->tt_ipgen, &t_template->tt_t, (struct mbuf *)NULL, tp->rcv_nxt, tp->snd_una - 1, 0); free(t_template, M_TEMP); } callout_reset(&tp->t_timers->tt_keep, TP_KEEPINTVL(tp), tcp_timer_keep, tp); } else callout_reset(&tp->t_timers->tt_keep, TP_KEEPIDLE(tp), tcp_timer_keep, tp); #ifdef TCPDEBUG if (inp->inp_socket->so_options & SO_DEBUG) tcp_trace(TA_USER, ostate, tp, (void *)0, (struct tcphdr *)0, PRU_SLOWTIMO); #endif TCP_PROBE2(debug__user, tp, PRU_SLOWTIMO); INP_WUNLOCK(inp); CURVNET_RESTORE(); return; dropit: TCPSTAT_INC(tcps_keepdrops); if (tcp_inpinfo_lock_add(inp)) { tcp_inpinfo_lock_del(inp, tp); goto out; } tp = tcp_drop(tp, ETIMEDOUT); #ifdef TCPDEBUG if (tp != NULL && (tp->t_inpcb->inp_socket->so_options & SO_DEBUG)) tcp_trace(TA_USER, ostate, tp, (void *)0, (struct tcphdr *)0, PRU_SLOWTIMO); #endif TCP_PROBE2(debug__user, tp, PRU_SLOWTIMO); tcp_inpinfo_lock_del(inp, tp); out: CURVNET_RESTORE(); } void tcp_timer_persist(void *xtp) { struct tcpcb *tp = xtp; struct inpcb *inp; CURVNET_SET(tp->t_vnet); #ifdef TCPDEBUG int ostate; ostate = tp->t_state; #endif inp = tp->t_inpcb; KASSERT(inp != NULL, ("%s: tp %p tp->t_inpcb == NULL", __func__, tp)); INP_WLOCK(inp); if (callout_pending(&tp->t_timers->tt_persist) || !callout_active(&tp->t_timers->tt_persist)) { INP_WUNLOCK(inp); CURVNET_RESTORE(); return; } callout_deactivate(&tp->t_timers->tt_persist); if ((inp->inp_flags & INP_DROPPED) != 0) { INP_WUNLOCK(inp); CURVNET_RESTORE(); return; } KASSERT((tp->t_timers->tt_flags & TT_STOPPED) == 0, ("%s: tp %p tcpcb can't be stopped here", __func__, tp)); /* * Persistence timer into zero window. * Force a byte to be output, if possible. */ TCPSTAT_INC(tcps_persisttimeo); /* * Hack: if the peer is dead/unreachable, we do not * time out if the window is closed. After a full * backoff, drop the connection if the idle time * (no responses to probes) reaches the maximum * backoff that we would use if retransmitting. */ if (tp->t_rxtshift == TCP_MAXRXTSHIFT && (ticks - tp->t_rcvtime >= tcp_maxpersistidle || ticks - tp->t_rcvtime >= TCP_REXMTVAL(tp) * tcp_totbackoff)) { TCPSTAT_INC(tcps_persistdrop); if (tcp_inpinfo_lock_add(inp)) { tcp_inpinfo_lock_del(inp, tp); goto out; } tp = tcp_drop(tp, ETIMEDOUT); tcp_inpinfo_lock_del(inp, tp); goto out; } /* * If the user has closed the socket then drop a persisting * connection after a much reduced timeout. 
*/ if (tp->t_state > TCPS_CLOSE_WAIT && (ticks - tp->t_rcvtime) >= TCPTV_PERSMAX) { TCPSTAT_INC(tcps_persistdrop); if (tcp_inpinfo_lock_add(inp)) { tcp_inpinfo_lock_del(inp, tp); goto out; } tp = tcp_drop(tp, ETIMEDOUT); tcp_inpinfo_lock_del(inp, tp); goto out; } tcp_setpersist(tp); tp->t_flags |= TF_FORCEDATA; (void) tp->t_fb->tfb_tcp_output(tp); tp->t_flags &= ~TF_FORCEDATA; #ifdef TCPDEBUG if (tp != NULL && tp->t_inpcb->inp_socket->so_options & SO_DEBUG) tcp_trace(TA_USER, ostate, tp, NULL, NULL, PRU_SLOWTIMO); #endif TCP_PROBE2(debug__user, tp, PRU_SLOWTIMO); INP_WUNLOCK(inp); out: CURVNET_RESTORE(); } void tcp_timer_rexmt(void * xtp) { struct tcpcb *tp = xtp; CURVNET_SET(tp->t_vnet); int rexmt; struct inpcb *inp; #ifdef TCPDEBUG int ostate; ostate = tp->t_state; #endif inp = tp->t_inpcb; KASSERT(inp != NULL, ("%s: tp %p tp->t_inpcb == NULL", __func__, tp)); INP_WLOCK(inp); if (callout_pending(&tp->t_timers->tt_rexmt) || !callout_active(&tp->t_timers->tt_rexmt)) { INP_WUNLOCK(inp); CURVNET_RESTORE(); return; } callout_deactivate(&tp->t_timers->tt_rexmt); if ((inp->inp_flags & INP_DROPPED) != 0) { INP_WUNLOCK(inp); CURVNET_RESTORE(); return; } KASSERT((tp->t_timers->tt_flags & TT_STOPPED) == 0, ("%s: tp %p tcpcb can't be stopped here", __func__, tp)); tcp_free_sackholes(tp); if (tp->t_fb->tfb_tcp_rexmit_tmr) { /* The stack has a timer action too. */ (*tp->t_fb->tfb_tcp_rexmit_tmr)(tp); } /* * Retransmission timer went off. Message has not * been acked within retransmit interval. Back off * to a longer retransmit interval and retransmit one segment. */ if (++tp->t_rxtshift > TCP_MAXRXTSHIFT) { tp->t_rxtshift = TCP_MAXRXTSHIFT; TCPSTAT_INC(tcps_timeoutdrop); if (tcp_inpinfo_lock_add(inp)) { tcp_inpinfo_lock_del(inp, tp); goto out; } tp = tcp_drop(tp, tp->t_softerror ? tp->t_softerror : ETIMEDOUT); tcp_inpinfo_lock_del(inp, tp); goto out; } if (tp->t_state == TCPS_SYN_SENT) { /* * If the SYN was retransmitted, indicate CWND to be * limited to 1 segment in cc_conn_init(). */ tp->snd_cwnd = 1; } else if (tp->t_rxtshift == 1) { /* * first retransmit; record ssthresh and cwnd so they can * be recovered if this turns out to be a "bad" retransmit. * A retransmit is considered "bad" if an ACK for this * segment is received within RTT/2 interval; the assumption * here is that the ACK was already in flight. See * "On Estimating End-to-End Network Path Properties" by * Allman and Paxson for more details. */ tp->snd_cwnd_prev = tp->snd_cwnd; tp->snd_ssthresh_prev = tp->snd_ssthresh; tp->snd_recover_prev = tp->snd_recover; if (IN_FASTRECOVERY(tp->t_flags)) tp->t_flags |= TF_WASFRECOVERY; else tp->t_flags &= ~TF_WASFRECOVERY; if (IN_CONGRECOVERY(tp->t_flags)) tp->t_flags |= TF_WASCRECOVERY; else tp->t_flags &= ~TF_WASCRECOVERY; tp->t_badrxtwin = ticks + (tp->t_srtt >> (TCP_RTT_SHIFT + 1)); tp->t_flags |= TF_PREVVALID; } else tp->t_flags &= ~TF_PREVVALID; TCPSTAT_INC(tcps_rexmttimeo); if ((tp->t_state == TCPS_SYN_SENT) || (tp->t_state == TCPS_SYN_RECEIVED)) rexmt = TCPTV_RTOBASE * tcp_syn_backoff[tp->t_rxtshift]; else rexmt = TCP_REXMTVAL(tp) * tcp_backoff[tp->t_rxtshift]; TCPT_RANGESET(tp->t_rxtcur, rexmt, tp->t_rttmin, TCPTV_REXMTMAX); /* * We enter the path for PLMTUD if connection is established or, if * connection is FIN_WAIT_1 status, reason for the last is that if * amount of data we send is very small, we could send it in couple of * packets and process straight to FIN. In that case we won't catch * ESTABLISHED state. 
*/ if (V_tcp_pmtud_blackhole_detect && (((tp->t_state == TCPS_ESTABLISHED)) || (tp->t_state == TCPS_FIN_WAIT_1))) { #ifdef INET6 int isipv6; #endif /* * Idea here is that at each stage of mtu probe (usually, 1448 * -> 1188 -> 524) should be given 2 chances to recover before * further clamping down. 'tp->t_rxtshift % 2 == 0' should * take care of that. */ if (((tp->t_flags2 & (TF2_PLPMTU_PMTUD|TF2_PLPMTU_MAXSEGSNT)) == (TF2_PLPMTU_PMTUD|TF2_PLPMTU_MAXSEGSNT)) && (tp->t_rxtshift >= 2 && tp->t_rxtshift % 2 == 0)) { /* * Enter Path MTU Black-hole Detection mechanism: * - Disable Path MTU Discovery (IP "DF" bit). * - Reduce MTU to lower value than what we * negotiated with peer. */ /* Record that we may have found a black hole. */ tp->t_flags2 |= TF2_PLPMTU_BLACKHOLE; /* Keep track of previous MSS. */ tp->t_pmtud_saved_maxseg = tp->t_maxseg; /* * Reduce the MSS to blackhole value or to the default * in an attempt to retransmit. */ #ifdef INET6 isipv6 = (tp->t_inpcb->inp_vflag & INP_IPV6) ? 1 : 0; if (isipv6 && tp->t_maxseg > V_tcp_v6pmtud_blackhole_mss) { /* Use the sysctl tuneable blackhole MSS. */ tp->t_maxseg = V_tcp_v6pmtud_blackhole_mss; - V_tcp_pmtud_blackhole_activated++; + TCPSTAT_INC(tcps_pmtud_blackhole_activated); } else if (isipv6) { /* Use the default MSS. */ tp->t_maxseg = V_tcp_v6mssdflt; /* * Disable Path MTU Discovery when we switch to * minmss. */ tp->t_flags2 &= ~TF2_PLPMTU_PMTUD; - V_tcp_pmtud_blackhole_activated_min_mss++; + TCPSTAT_INC(tcps_pmtud_blackhole_activated_min_mss); } #endif #if defined(INET6) && defined(INET) else #endif #ifdef INET if (tp->t_maxseg > V_tcp_pmtud_blackhole_mss) { /* Use the sysctl tuneable blackhole MSS. */ tp->t_maxseg = V_tcp_pmtud_blackhole_mss; - V_tcp_pmtud_blackhole_activated++; + TCPSTAT_INC(tcps_pmtud_blackhole_activated); } else { /* Use the default MSS. */ tp->t_maxseg = V_tcp_mssdflt; /* * Disable Path MTU Discovery when we switch to * minmss. */ tp->t_flags2 &= ~TF2_PLPMTU_PMTUD; - V_tcp_pmtud_blackhole_activated_min_mss++; + TCPSTAT_INC(tcps_pmtud_blackhole_activated_min_mss); } #endif /* * Reset the slow-start flight size * as it may depend on the new MSS. */ if (CC_ALGO(tp)->conn_init != NULL) CC_ALGO(tp)->conn_init(tp->ccv); } else { /* * If further retransmissions are still unsuccessful * with a lowered MTU, maybe this isn't a blackhole and * we restore the previous MSS and blackhole detection * flags. * The limit '6' is determined by giving each probe * stage (1448, 1188, 524) 2 chances to recover. */ if ((tp->t_flags2 & TF2_PLPMTU_BLACKHOLE) && (tp->t_rxtshift > 6)) { tp->t_flags2 |= TF2_PLPMTU_PMTUD; tp->t_flags2 &= ~TF2_PLPMTU_BLACKHOLE; tp->t_maxseg = tp->t_pmtud_saved_maxseg; - V_tcp_pmtud_blackhole_failed++; + TCPSTAT_INC(tcps_pmtud_blackhole_failed); /* * Reset the slow-start flight size as it * may depend on the new MSS. */ if (CC_ALGO(tp)->conn_init != NULL) CC_ALGO(tp)->conn_init(tp->ccv); } } } /* * Disable RFC1323 and SACK if we haven't got any response to * our third SYN to work-around some broken terminal servers * (most of which have hopefully been retired) that have bad VJ * header compression code which trashes TCP segments containing * unknown-to-them TCP options. */ if (tcp_rexmit_drop_options && (tp->t_state == TCPS_SYN_SENT) && (tp->t_rxtshift == 3)) tp->t_flags &= ~(TF_REQ_SCALE|TF_REQ_TSTMP|TF_SACK_PERMIT); /* * If we backed off this far, notify the L3 protocol that we're having * connection problems. 
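With this revision the blackhole counters are folded into the aggregate TCP statistics rather than standing alone as sysctl integers. A hedged user-space sketch for reading them back through net.inet.tcp.stats, assuming struct tcpstat carries fields named after the TCPSTAT_INC() arguments above (hypothetical helper; additional headers may be needed depending on the installed tcp_var.h):

	#include <sys/types.h>
	#include <sys/socket.h>
	#include <sys/sysctl.h>
	#include <netinet/in.h>
	#include <netinet/tcp_var.h>
	#include <stdint.h>
	#include <stdio.h>

	static void
	show_blackhole_stats(void)
	{
		struct tcpstat st;
		size_t len = sizeof(st);

		if (sysctlbyname("net.inet.tcp.stats", &st, &len, NULL, 0) == 0)
			printf("activated %ju, at min MSS %ju, failed %ju\n",
			    (uintmax_t)st.tcps_pmtud_blackhole_activated,
			    (uintmax_t)st.tcps_pmtud_blackhole_activated_min_mss,
			    (uintmax_t)st.tcps_pmtud_blackhole_failed);
	}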
*/ if (tp->t_rxtshift > TCP_RTT_INVALIDATE) { #ifdef INET6 if ((tp->t_inpcb->inp_vflag & INP_IPV6) != 0) in6_losing(tp->t_inpcb); else #endif in_losing(tp->t_inpcb); } tp->snd_nxt = tp->snd_una; tp->snd_recover = tp->snd_max; /* * Force a segment to be sent. */ tp->t_flags |= TF_ACKNOW; /* * If timing a segment in this window, stop the timer. */ tp->t_rtttime = 0; cc_cong_signal(tp, NULL, CC_RTO); (void) tp->t_fb->tfb_tcp_output(tp); #ifdef TCPDEBUG if (tp != NULL && (tp->t_inpcb->inp_socket->so_options & SO_DEBUG)) tcp_trace(TA_USER, ostate, tp, (void *)0, (struct tcphdr *)0, PRU_SLOWTIMO); #endif TCP_PROBE2(debug__user, tp, PRU_SLOWTIMO); INP_WUNLOCK(inp); out: CURVNET_RESTORE(); } void tcp_timer_activate(struct tcpcb *tp, uint32_t timer_type, u_int delta) { struct callout *t_callout; timeout_t *f_callout; struct inpcb *inp = tp->t_inpcb; int cpu = inp_to_cpuid(inp); #ifdef TCP_OFFLOAD if (tp->t_flags & TF_TOE) return; #endif if (tp->t_timers->tt_flags & TT_STOPPED) return; switch (timer_type) { case TT_DELACK: t_callout = &tp->t_timers->tt_delack; f_callout = tcp_timer_delack; break; case TT_REXMT: t_callout = &tp->t_timers->tt_rexmt; f_callout = tcp_timer_rexmt; break; case TT_PERSIST: t_callout = &tp->t_timers->tt_persist; f_callout = tcp_timer_persist; break; case TT_KEEP: t_callout = &tp->t_timers->tt_keep; f_callout = tcp_timer_keep; break; case TT_2MSL: t_callout = &tp->t_timers->tt_2msl; f_callout = tcp_timer_2msl; break; default: if (tp->t_fb->tfb_tcp_timer_activate) { tp->t_fb->tfb_tcp_timer_activate(tp, timer_type, delta); return; } panic("tp %p bad timer_type %#x", tp, timer_type); } if (delta == 0) { callout_stop(t_callout); } else { callout_reset_on(t_callout, delta, f_callout, tp, cpu); } } int tcp_timer_active(struct tcpcb *tp, uint32_t timer_type) { struct callout *t_callout; switch (timer_type) { case TT_DELACK: t_callout = &tp->t_timers->tt_delack; break; case TT_REXMT: t_callout = &tp->t_timers->tt_rexmt; break; case TT_PERSIST: t_callout = &tp->t_timers->tt_persist; break; case TT_KEEP: t_callout = &tp->t_timers->tt_keep; break; case TT_2MSL: t_callout = &tp->t_timers->tt_2msl; break; default: if (tp->t_fb->tfb_tcp_timer_active) { return(tp->t_fb->tfb_tcp_timer_active(tp, timer_type)); } panic("tp %p bad timer_type %#x", tp, timer_type); } return callout_active(t_callout); } void tcp_timer_stop(struct tcpcb *tp, uint32_t timer_type) { struct callout *t_callout; tp->t_timers->tt_flags |= TT_STOPPED; switch (timer_type) { case TT_DELACK: t_callout = &tp->t_timers->tt_delack; break; case TT_REXMT: t_callout = &tp->t_timers->tt_rexmt; break; case TT_PERSIST: t_callout = &tp->t_timers->tt_persist; break; case TT_KEEP: t_callout = &tp->t_timers->tt_keep; break; case TT_2MSL: t_callout = &tp->t_timers->tt_2msl; break; default: if (tp->t_fb->tfb_tcp_timer_stop) { /* * XXXrrs we need to look at this with the * stop case below (flags). */ tp->t_fb->tfb_tcp_timer_stop(tp, timer_type); return; } panic("tp %p bad timer_type %#x", tp, timer_type); } if (callout_async_drain(t_callout, tcp_timer_discard) == 0) { /* * Can't stop the callout, defer tcpcb actual deletion * to the last one. 
We do this using the async drain * function and incrementing the count in */ tp->t_timers->tt_draincnt++; } } Index: projects/runtime-coverage/sys/netinet/tcp_var.h =================================================================== --- projects/runtime-coverage/sys/netinet/tcp_var.h (revision 322921) +++ projects/runtime-coverage/sys/netinet/tcp_var.h (revision 322922) @@ -1,875 +1,880 @@ /*- * Copyright (c) 1982, 1986, 1993, 1994, 1995 * The Regents of the University of California. All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * @(#)tcp_var.h 8.4 (Berkeley) 5/24/95 * $FreeBSD$ */ #ifndef _NETINET_TCP_VAR_H_ #define _NETINET_TCP_VAR_H_ #include #include #ifdef _KERNEL #include #include #endif #if defined(_KERNEL) || defined(_WANT_TCPCB) /* TCP segment queue entry */ struct tseg_qent { LIST_ENTRY(tseg_qent) tqe_q; int tqe_len; /* TCP segment data length */ struct tcphdr *tqe_th; /* a pointer to tcp header */ struct mbuf *tqe_m; /* mbuf contains packet */ }; LIST_HEAD(tsegqe_head, tseg_qent); struct sackblk { tcp_seq start; /* start seq no. of sack block */ tcp_seq end; /* end seq no. */ }; struct sackhole { tcp_seq start; /* start seq no. of hole */ tcp_seq end; /* end seq no. */ tcp_seq rxmit; /* next seq. no in hole to be retransmitted */ TAILQ_ENTRY(sackhole) scblink; /* scoreboard linkage */ }; struct sackhint { struct sackhole *nexthole; int sack_bytes_rexmit; tcp_seq last_sack_ack; /* Most recent/largest sacked ack */ int ispare; /* explicit pad for 64bit alignment */ int sacked_bytes; /* * Total sacked bytes reported by the * receiver via sack option */ uint32_t _pad1[1]; /* TBD */ uint64_t _pad[1]; /* TBD */ }; /* * Tcp control block, one per tcp; fields: * Organized for 16 byte cacheline efficiency. 
*/ struct tcpcb { struct tsegqe_head t_segq; /* segment reassembly queue */ int t_segqlen; /* segment reassembly queue length */ int t_dupacks; /* consecutive dup acks recd */ struct tcp_timer *t_timers; /* All the TCP timers in one struct */ struct inpcb *t_inpcb; /* back pointer to internet pcb */ int t_state; /* state of this connection */ u_int t_flags; struct vnet *t_vnet; /* back pointer to parent vnet */ tcp_seq snd_una; /* sent but unacknowledged */ tcp_seq snd_max; /* highest sequence number sent; * used to recognize retransmits */ tcp_seq snd_nxt; /* send next */ tcp_seq snd_up; /* send urgent pointer */ tcp_seq snd_wl1; /* window update seg seq number */ tcp_seq snd_wl2; /* window update seg ack number */ tcp_seq iss; /* initial send sequence number */ tcp_seq irs; /* initial receive sequence number */ tcp_seq rcv_nxt; /* receive next */ tcp_seq rcv_adv; /* advertised window */ uint32_t rcv_wnd; /* receive window */ tcp_seq rcv_up; /* receive urgent pointer */ uint32_t snd_wnd; /* send window */ uint32_t snd_cwnd; /* congestion-controlled window */ uint32_t snd_ssthresh; /* snd_cwnd size threshold for * for slow start exponential to * linear switch */ tcp_seq snd_recover; /* for use in NewReno Fast Recovery */ u_int t_rcvtime; /* inactivity time */ u_int t_starttime; /* time connection was established */ u_int t_rtttime; /* RTT measurement start time */ tcp_seq t_rtseq; /* sequence number being timed */ int t_rxtcur; /* current retransmit value (ticks) */ u_int t_maxseg; /* maximum segment size */ u_int t_pmtud_saved_maxseg; /* pre-blackhole MSS */ int t_srtt; /* smoothed round-trip time */ int t_rttvar; /* variance in round-trip time */ int t_rxtshift; /* log(2) of rexmt exp. backoff */ u_int t_rttmin; /* minimum rtt allowed */ u_int t_rttbest; /* best rtt we've seen */ u_long t_rttupdated; /* number of times rtt sampled */ uint32_t max_sndwnd; /* largest window peer has offered */ int t_softerror; /* possible error not yet reported */ /* out-of-band data */ char t_oobflags; /* have some */ char t_iobc; /* input character */ /* RFC 1323 variables */ u_char snd_scale; /* window scaling for send window */ u_char rcv_scale; /* window scaling for recv window */ u_char request_r_scale; /* pending window scaling */ u_int32_t ts_recent; /* timestamp echo data */ u_int ts_recent_age; /* when last updated */ u_int32_t ts_offset; /* our timestamp offset */ tcp_seq last_ack_sent; /* experimental */ uint32_t snd_cwnd_prev; /* cwnd prior to retransmit */ uint32_t snd_ssthresh_prev; /* ssthresh prior to retransmit */ tcp_seq snd_recover_prev; /* snd_recover prior to retransmit */ int t_sndzerowin; /* zero-window updates sent */ u_int t_badrxtwin; /* window for retransmit recovery */ u_char snd_limited; /* segments limited transmitted */ /* SACK related state */ int snd_numholes; /* number of holes seen by sender */ TAILQ_HEAD(sackhole_head, sackhole) snd_holes; /* SACK scoreboard (sorted) */ tcp_seq snd_fack; /* last seq number(+1) sack'd by rcv'r*/ int rcv_numsacks; /* # distinct sack blks present */ struct sackblk sackblks[MAX_SACK_BLKS]; /* seq nos. 
of sack blocks */ tcp_seq sack_newdata; /* New data xmitted in this recovery episode starts at this seq number */ struct sackhint sackhint; /* SACK scoreboard hint */ int t_rttlow; /* smallest observerved RTT */ u_int32_t rfbuf_ts; /* recv buffer autoscaling timestamp */ int rfbuf_cnt; /* recv buffer autoscaling byte count */ struct toedev *tod; /* toedev handling this connection */ int t_sndrexmitpack; /* retransmit packets sent */ int t_rcvoopack; /* out-of-order packets received */ void *t_toe; /* TOE pcb pointer */ int t_bytes_acked; /* # bytes acked during current RTT */ struct cc_algo *cc_algo; /* congestion control algorithm */ struct cc_var *ccv; /* congestion control specific vars */ struct osd *osd; /* storage for Khelp module data */ u_int t_keepinit; /* time to establish connection */ u_int t_keepidle; /* time before keepalive probes begin */ u_int t_keepintvl; /* interval between keepalives */ u_int t_keepcnt; /* number of keepalives before close */ u_int t_tsomax; /* TSO total burst length limit in bytes */ u_int t_tsomaxsegcount; /* TSO maximum segment count */ u_int t_tsomaxsegsize; /* TSO maximum segment size in bytes */ u_int t_flags2; /* More tcpcb flags storage */ struct tcp_function_block *t_fb;/* TCP function call block */ void *t_fb_ptr; /* Pointer to t_fb specific data */ #ifdef TCP_RFC7413 uint64_t t_tfo_cookie; /* TCP Fast Open cookie */ unsigned int *t_tfo_pending; /* TCP Fast Open pending counter */ #endif #ifdef TCPPCAP struct mbufq t_inpkts; /* List of saved input packets. */ struct mbufq t_outpkts; /* List of saved output packets. */ #endif }; #endif /* _KERNEL || _WANT_TCPCB */ #ifdef _KERNEL /* * Kernel variables for tcp. */ VNET_DECLARE(int, tcp_do_rfc1323); #define V_tcp_do_rfc1323 VNET(tcp_do_rfc1323) struct tcptemp { u_char tt_ipgen[40]; /* the size must be of max ip header, now IPv6 */ struct tcphdr tt_t; }; /* * TODO: We yet need to brave plowing in * to tcp_input() and the pru_usrreq() block. * Right now these go to the old standards which * are somewhat ok, but in the long term may * need to be changed. If we do tackle tcp_input() * then we need to get rid of the tcp_do_segment() * function below. */ /* Flags for tcp functions */ #define TCP_FUNC_BEING_REMOVED 0x01 /* Can no longer be referenced */ /* * If defining the optional tcp_timers, in the * tfb_tcp_timer_stop call you must use the * callout_async_drain() function with the * tcp_timer_discard callback. You should check * the return of callout_async_drain() and if 0 * increment tt_draincnt. Since the timer sub-system * does not know your callbacks you must provide a * stop_all function that loops through and calls * tcp_timer_stop() with each of your defined timers. * Adding a tfb_tcp_handoff_ok function allows the socket * option to change stacks to query you even if the * connection is in a later stage. You return 0 to * say you can take over and run your stack, you return * non-zero (an error number) to say no you can't. * If the function is undefined you can only change * in the early states (before connect or listen). * tfb_tcp_fb_fini is changed to add a flag to tell * the old stack if the tcb is being destroyed or * not. A one in the flag means the TCB is being * destroyed, a zero indicates its transitioning to * another stack (via socket option). 
*/ struct tcp_function_block { char tfb_tcp_block_name[TCP_FUNCTION_NAME_LEN_MAX]; int (*tfb_tcp_output)(struct tcpcb *); void (*tfb_tcp_do_segment)(struct mbuf *, struct tcphdr *, struct socket *, struct tcpcb *, int, int, uint8_t, int); int (*tfb_tcp_ctloutput)(struct socket *so, struct sockopt *sopt, struct inpcb *inp, struct tcpcb *tp); /* Optional memory allocation/free routine */ void (*tfb_tcp_fb_init)(struct tcpcb *); void (*tfb_tcp_fb_fini)(struct tcpcb *, int); /* Optional timers, must define all if you define one */ int (*tfb_tcp_timer_stop_all)(struct tcpcb *); void (*tfb_tcp_timer_activate)(struct tcpcb *, uint32_t, u_int); int (*tfb_tcp_timer_active)(struct tcpcb *, uint32_t); void (*tfb_tcp_timer_stop)(struct tcpcb *, uint32_t); void (*tfb_tcp_rexmit_tmr)(struct tcpcb *); int (*tfb_tcp_handoff_ok)(struct tcpcb *); volatile uint32_t tfb_refcnt; uint32_t tfb_flags; }; struct tcp_function { TAILQ_ENTRY(tcp_function) tf_next; char tf_name[TCP_FUNCTION_NAME_LEN_MAX]; struct tcp_function_block *tf_fb; }; TAILQ_HEAD(tcp_funchead, tcp_function); #endif /* _KERNEL */ /* * Flags and utility macros for the t_flags field. */ #define TF_ACKNOW 0x000001 /* ack peer immediately */ #define TF_DELACK 0x000002 /* ack, but try to delay it */ #define TF_NODELAY 0x000004 /* don't delay packets to coalesce */ #define TF_NOOPT 0x000008 /* don't use tcp options */ #define TF_SENTFIN 0x000010 /* have sent FIN */ #define TF_REQ_SCALE 0x000020 /* have/will request window scaling */ #define TF_RCVD_SCALE 0x000040 /* other side has requested scaling */ #define TF_REQ_TSTMP 0x000080 /* have/will request timestamps */ #define TF_RCVD_TSTMP 0x000100 /* a timestamp was received in SYN */ #define TF_SACK_PERMIT 0x000200 /* other side said I could SACK */ #define TF_NEEDSYN 0x000400 /* send SYN (implicit state) */ #define TF_NEEDFIN 0x000800 /* send FIN (implicit state) */ #define TF_NOPUSH 0x001000 /* don't push */ #define TF_PREVVALID 0x002000 /* saved values for bad rxmit valid */ #define TF_MORETOCOME 0x010000 /* More data to be appended to sock */ #define TF_LQ_OVERFLOW 0x020000 /* listen queue overflow */ #define TF_LASTIDLE 0x040000 /* connection was previously idle */ #define TF_RXWIN0SENT 0x080000 /* sent a receiver win 0 in response */ #define TF_FASTRECOVERY 0x100000 /* in NewReno Fast Recovery */ #define TF_WASFRECOVERY 0x200000 /* was in NewReno Fast Recovery */ #define TF_SIGNATURE 0x400000 /* require MD5 digests (RFC2385) */ #define TF_FORCEDATA 0x800000 /* force out a byte */ #define TF_TSO 0x1000000 /* TSO enabled on this connection */ #define TF_TOE 0x2000000 /* this connection is offloaded */ #define TF_ECN_PERMIT 0x4000000 /* connection ECN-ready */ #define TF_ECN_SND_CWR 0x8000000 /* ECN CWR in queue */ #define TF_ECN_SND_ECE 0x10000000 /* ECN ECE in queue */ #define TF_CONGRECOVERY 0x20000000 /* congestion recovery mode */ #define TF_WASCRECOVERY 0x40000000 /* was in congestion recovery */ #define TF_FASTOPEN 0x80000000 /* TCP Fast Open indication */ #define IN_FASTRECOVERY(t_flags) (t_flags & TF_FASTRECOVERY) #define ENTER_FASTRECOVERY(t_flags) t_flags |= TF_FASTRECOVERY #define EXIT_FASTRECOVERY(t_flags) t_flags &= ~TF_FASTRECOVERY #define IN_CONGRECOVERY(t_flags) (t_flags & TF_CONGRECOVERY) #define ENTER_CONGRECOVERY(t_flags) t_flags |= TF_CONGRECOVERY #define EXIT_CONGRECOVERY(t_flags) t_flags &= ~TF_CONGRECOVERY #define IN_RECOVERY(t_flags) (t_flags & (TF_CONGRECOVERY | TF_FASTRECOVERY)) #define ENTER_RECOVERY(t_flags) t_flags |= (TF_CONGRECOVERY | TF_FASTRECOVERY) #define 
EXIT_RECOVERY(t_flags) t_flags &= ~(TF_CONGRECOVERY | TF_FASTRECOVERY) #if defined(_KERNEL) && !defined(TCP_RFC7413) #define IS_FASTOPEN(t_flags) (false) #else #define IS_FASTOPEN(t_flags) (t_flags & TF_FASTOPEN) #endif #define BYTES_THIS_ACK(tp, th) (th->th_ack - tp->snd_una) /* * Flags for the t_oobflags field. */ #define TCPOOB_HAVEDATA 0x01 #define TCPOOB_HADDATA 0x02 /* * Flags for PLPMTU handling, t_flags2 */ #define TF2_PLPMTU_BLACKHOLE 0x00000001 /* Possible PLPMTUD Black Hole. */ #define TF2_PLPMTU_PMTUD 0x00000002 /* Allowed to attempt PLPMTUD. */ #define TF2_PLPMTU_MAXSEGSNT 0x00000004 /* Last seg sent was full seg. */ /* * Structure to hold TCP options that are only used during segment * processing (in tcp_input), but not held in the tcpcb. * It's basically used to reduce the number of parameters * to tcp_dooptions and tcp_addoptions. * The binary order of the to_flags is relevant for packing of the * options in tcp_addoptions. */ struct tcpopt { u_int32_t to_flags; /* which options are present */ #define TOF_MSS 0x0001 /* maximum segment size */ #define TOF_SCALE 0x0002 /* window scaling */ #define TOF_SACKPERM 0x0004 /* SACK permitted */ #define TOF_TS 0x0010 /* timestamp */ #define TOF_SIGNATURE 0x0040 /* TCP-MD5 signature option (RFC2385) */ #define TOF_SACK 0x0080 /* Peer sent SACK option */ #define TOF_FASTOPEN 0x0100 /* TCP Fast Open (TFO) cookie */ #define TOF_MAXOPT 0x0200 u_int32_t to_tsval; /* new timestamp */ u_int32_t to_tsecr; /* reflected timestamp */ u_char *to_sacks; /* pointer to the first SACK blocks */ u_char *to_signature; /* pointer to the TCP-MD5 signature */ u_char *to_tfo_cookie; /* pointer to the TFO cookie */ u_int16_t to_mss; /* maximum segment size */ u_int8_t to_wscale; /* window scaling */ u_int8_t to_nsacks; /* number of SACK blocks */ u_int8_t to_tfo_len; /* TFO cookie length */ u_int32_t to_spare; /* UTO */ }; /* * Flags for tcp_dooptions. */ #define TO_SYN 0x01 /* parse SYN-only options */ struct hc_metrics_lite { /* must stay in sync with hc_metrics */ uint32_t rmx_mtu; /* MTU for this path */ uint32_t rmx_ssthresh; /* outbound gateway buffer limit */ uint32_t rmx_rtt; /* estimated round trip time */ uint32_t rmx_rttvar; /* estimated rtt variance */ uint32_t rmx_cwnd; /* congestion window */ uint32_t rmx_sendpipe; /* outbound delay-bandwidth product */ uint32_t rmx_recvpipe; /* inbound delay-bandwidth product */ }; /* * Used by tcp_maxmtu() to communicate interface specific features * and limits at the time of connection setup. */ struct tcp_ifcap { int ifcap; u_int tsomax; u_int tsomaxsegcount; u_int tsomaxsegsize; }; #ifndef _NETINET_IN_PCB_H_ struct in_conninfo; #endif /* _NETINET_IN_PCB_H_ */ struct tcptw { struct inpcb *tw_inpcb; /* XXX back pointer to internet pcb */ tcp_seq snd_nxt; tcp_seq rcv_nxt; tcp_seq iss; tcp_seq irs; u_short last_win; /* cached window value */ short tw_so_options; /* copy of so_options */ struct ucred *tw_cred; /* user credentials */ u_int32_t t_recent; u_int32_t ts_offset; /* our timestamp offset */ u_int t_starttime; int tw_time; TAILQ_ENTRY(tcptw) tw_2msl; void *tw_pspare; /* TCP_SIGNATURE */ u_int *tw_spare; /* TCP_SIGNATURE */ }; #define intotcpcb(ip) ((struct tcpcb *)(ip)->inp_ppcb) #define intotw(ip) ((struct tcptw *)(ip)->inp_ppcb) #define sototcpcb(so) (intotcpcb(sotoinpcb(so))) /* * The smoothed round-trip time and estimated variance * are stored as fixed point numbers scaled by the values below. 
* For convenience, these scales are also used in smoothing the average * (smoothed = (1/scale)sample + ((scale-1)/scale)smoothed). * With these scales, srtt has 3 bits to the right of the binary point, * and thus an "ALPHA" of 0.875. rttvar has 2 bits to the right of the * binary point, and is smoothed with an ALPHA of 0.75. */ #define TCP_RTT_SCALE 32 /* multiplier for srtt; 3 bits frac. */ #define TCP_RTT_SHIFT 5 /* shift for srtt; 3 bits frac. */ #define TCP_RTTVAR_SCALE 16 /* multiplier for rttvar; 2 bits */ #define TCP_RTTVAR_SHIFT 4 /* shift for rttvar; 2 bits */ #define TCP_DELTA_SHIFT 2 /* see tcp_input.c */ /* * The initial retransmission should happen at rtt + 4 * rttvar. * Because of the way we do the smoothing, srtt and rttvar * will each average +1/2 tick of bias. When we compute * the retransmit timer, we want 1/2 tick of rounding and * 1 extra tick because of +-1/2 tick uncertainty in the * firing of the timer. The bias will give us exactly the * 1.5 tick we need. But, because the bias is * statistical, we have to test that we don't drop below * the minimum feasible timer (which is 2 ticks). * This version of the macro adapted from a paper by Lawrence * Brakmo and Larry Peterson which outlines a problem caused * by insufficient precision in the original implementation, * which results in inappropriately large RTO values for very * fast networks. */ #define TCP_REXMTVAL(tp) \ max((tp)->t_rttmin, (((tp)->t_srtt >> (TCP_RTT_SHIFT - TCP_DELTA_SHIFT)) \ + (tp)->t_rttvar) >> TCP_DELTA_SHIFT) /* * TCP statistics. * Many of these should be kept per connection, * but that's inconvenient at the moment. */ struct tcpstat { uint64_t tcps_connattempt; /* connections initiated */ uint64_t tcps_accepts; /* connections accepted */ uint64_t tcps_connects; /* connections established */ uint64_t tcps_drops; /* connections dropped */ uint64_t tcps_conndrops; /* embryonic connections dropped */ uint64_t tcps_minmssdrops; /* average minmss too low drops */ uint64_t tcps_closed; /* conn. closed (includes drops) */ uint64_t tcps_segstimed; /* segs where we tried to get rtt */ uint64_t tcps_rttupdated; /* times we succeeded */ uint64_t tcps_delack; /* delayed acks sent */ uint64_t tcps_timeoutdrop; /* conn. 
dropped in rxmt timeout */ uint64_t tcps_rexmttimeo; /* retransmit timeouts */ uint64_t tcps_persisttimeo; /* persist timeouts */ uint64_t tcps_keeptimeo; /* keepalive timeouts */ uint64_t tcps_keepprobe; /* keepalive probes sent */ uint64_t tcps_keepdrops; /* connections dropped in keepalive */ uint64_t tcps_sndtotal; /* total packets sent */ uint64_t tcps_sndpack; /* data packets sent */ uint64_t tcps_sndbyte; /* data bytes sent */ uint64_t tcps_sndrexmitpack; /* data packets retransmitted */ uint64_t tcps_sndrexmitbyte; /* data bytes retransmitted */ uint64_t tcps_sndrexmitbad; /* unnecessary packet retransmissions */ uint64_t tcps_sndacks; /* ack-only packets sent */ uint64_t tcps_sndprobe; /* window probes sent */ uint64_t tcps_sndurg; /* packets sent with URG only */ uint64_t tcps_sndwinup; /* window update-only packets sent */ uint64_t tcps_sndctrl; /* control (SYN|FIN|RST) packets sent */ uint64_t tcps_rcvtotal; /* total packets received */ uint64_t tcps_rcvpack; /* packets received in sequence */ uint64_t tcps_rcvbyte; /* bytes received in sequence */ uint64_t tcps_rcvbadsum; /* packets received with ccksum errs */ uint64_t tcps_rcvbadoff; /* packets received with bad offset */ uint64_t tcps_rcvreassfull; /* packets dropped for no reass space */ uint64_t tcps_rcvshort; /* packets received too short */ uint64_t tcps_rcvduppack; /* duplicate-only packets received */ uint64_t tcps_rcvdupbyte; /* duplicate-only bytes received */ uint64_t tcps_rcvpartduppack; /* packets with some duplicate data */ uint64_t tcps_rcvpartdupbyte; /* dup. bytes in part-dup. packets */ uint64_t tcps_rcvoopack; /* out-of-order packets received */ uint64_t tcps_rcvoobyte; /* out-of-order bytes received */ uint64_t tcps_rcvpackafterwin; /* packets with data after window */ uint64_t tcps_rcvbyteafterwin; /* bytes rcvd after window */ uint64_t tcps_rcvafterclose; /* packets rcvd after "close" */ uint64_t tcps_rcvwinprobe; /* rcvd window probe packets */ uint64_t tcps_rcvdupack; /* rcvd duplicate acks */ uint64_t tcps_rcvacktoomuch; /* rcvd acks for unsent data */ uint64_t tcps_rcvackpack; /* rcvd ack packets */ uint64_t tcps_rcvackbyte; /* bytes acked by rcvd acks */ uint64_t tcps_rcvwinupd; /* rcvd window update packets */ uint64_t tcps_pawsdrop; /* segments dropped due to PAWS */ uint64_t tcps_predack; /* times hdr predict ok for acks */ uint64_t tcps_preddat; /* times hdr predict ok for data pkts */ uint64_t tcps_pcbcachemiss; uint64_t tcps_cachedrtt; /* times cached RTT in route updated */ uint64_t tcps_cachedrttvar; /* times cached rttvar updated */ uint64_t tcps_cachedssthresh; /* times cached ssthresh updated */ uint64_t tcps_usedrtt; /* times RTT initialized from route */ uint64_t tcps_usedrttvar; /* times RTTVAR initialized from rt */ uint64_t tcps_usedssthresh; /* times ssthresh initialized from rt*/ uint64_t tcps_persistdrop; /* timeout in persist state */ uint64_t tcps_badsyn; /* bogus SYN, e.g. 
premature ACK */ uint64_t tcps_mturesent; /* resends due to MTU discovery */ uint64_t tcps_listendrop; /* listen queue overflows */ uint64_t tcps_badrst; /* ignored RSTs in the window */ uint64_t tcps_sc_added; /* entry added to syncache */ uint64_t tcps_sc_retransmitted; /* syncache entry was retransmitted */ uint64_t tcps_sc_dupsyn; /* duplicate SYN packet */ uint64_t tcps_sc_dropped; /* could not reply to packet */ uint64_t tcps_sc_completed; /* successful extraction of entry */ uint64_t tcps_sc_bucketoverflow;/* syncache per-bucket limit hit */ uint64_t tcps_sc_cacheoverflow; /* syncache cache limit hit */ uint64_t tcps_sc_reset; /* RST removed entry from syncache */ uint64_t tcps_sc_stale; /* timed out or listen socket gone */ uint64_t tcps_sc_aborted; /* syncache entry aborted */ uint64_t tcps_sc_badack; /* removed due to bad ACK */ uint64_t tcps_sc_unreach; /* ICMP unreachable received */ uint64_t tcps_sc_zonefail; /* zalloc() failed */ uint64_t tcps_sc_sendcookie; /* SYN cookie sent */ uint64_t tcps_sc_recvcookie; /* SYN cookie received */ uint64_t tcps_hc_added; /* entry added to hostcache */ uint64_t tcps_hc_bucketoverflow;/* hostcache per bucket limit hit */ uint64_t tcps_finwait2_drops; /* Drop FIN_WAIT_2 connection after time limit */ /* SACK related stats */ uint64_t tcps_sack_recovery_episode; /* SACK recovery episodes */ uint64_t tcps_sack_rexmits; /* SACK rexmit segments */ uint64_t tcps_sack_rexmit_bytes; /* SACK rexmit bytes */ uint64_t tcps_sack_rcv_blocks; /* SACK blocks (options) received */ uint64_t tcps_sack_send_blocks; /* SACK blocks (options) sent */ uint64_t tcps_sack_sboverflow; /* times scoreboard overflowed */ /* ECN related stats */ uint64_t tcps_ecn_ce; /* ECN Congestion Experienced */ uint64_t tcps_ecn_ect0; /* ECN Capable Transport */ uint64_t tcps_ecn_ect1; /* ECN Capable Transport */ uint64_t tcps_ecn_shs; /* ECN successful handshakes */ uint64_t tcps_ecn_rcwnd; /* # times ECN reduced the cwnd */ /* TCP_SIGNATURE related stats */ uint64_t tcps_sig_rcvgoodsig; /* Total matching signature received */ uint64_t tcps_sig_rcvbadsig; /* Total bad signature received */ uint64_t tcps_sig_err_buildsig; /* Failed to make signature */ uint64_t tcps_sig_err_sigopt; /* No signature expected by socket */ uint64_t tcps_sig_err_nosigopt; /* No signature provided by segment */ + /* Path MTU Discovery Black Hole Detection related stats */ + uint64_t tcps_pmtud_blackhole_activated; /* Black Hole Count */ + uint64_t tcps_pmtud_blackhole_activated_min_mss; /* BH at min MSS Count */ + uint64_t tcps_pmtud_blackhole_failed; /* Black Hole Failure Count */ + uint64_t _pad[12]; /* 6 UTO, 6 TBD */ }; #define tcps_rcvmemdrop tcps_rcvreassfull /* compat */ #ifdef _KERNEL #define TI_UNLOCKED 1 #define TI_RLOCKED 2 #include VNET_PCPUSTAT_DECLARE(struct tcpstat, tcpstat); /* tcp statistics */ /* * In-kernel consumers can use these accessor macros directly to update * stats. */ #define TCPSTAT_ADD(name, val) \ VNET_PCPUSTAT_ADD(struct tcpstat, tcpstat, name, (val)) #define TCPSTAT_INC(name) TCPSTAT_ADD(name, 1) /* * Kernel module consumers must use this accessor macro. */ void kmod_tcpstat_inc(int statnum); #define KMOD_TCPSTAT_INC(name) \ kmod_tcpstat_inc(offsetof(struct tcpstat, name) / sizeof(uint64_t)) /* * Running TCP connection count by state. 
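For TCP code built as a loadable module, the KMOD_TCPSTAT_INC() wrapper above is the intended way to bump the relocated counters (it turns the field name into an index and calls kmod_tcpstat_inc()); in-tree code uses TCPSTAT_INC() directly, as in the tcp_timer.c hunk earlier in this diff. A purely illustrative sketch, with a made-up helper name:

    /* Hypothetical module-side helper noting a detected PMTUD blackhole. */
    static void
    example_note_blackhole(void)
    {

            KMOD_TCPSTAT_INC(tcps_pmtud_blackhole_activated);
    }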
*/ VNET_DECLARE(counter_u64_t, tcps_states[TCP_NSTATES]); #define V_tcps_states VNET(tcps_states) #define TCPSTATES_INC(state) counter_u64_add(V_tcps_states[state], 1) #define TCPSTATES_DEC(state) counter_u64_add(V_tcps_states[state], -1) /* * TCP specific helper hook point identifiers. */ #define HHOOK_TCP_EST_IN 0 #define HHOOK_TCP_EST_OUT 1 #define HHOOK_TCP_LAST HHOOK_TCP_EST_OUT struct tcp_hhook_data { struct tcpcb *tp; struct tcphdr *th; struct tcpopt *to; uint32_t len; int tso; tcp_seq curack; }; #endif /* * TCB structure exported to user-land via sysctl(3). * * Fields prefixed with "xt_" are unique to the export structure, and fields * with "t_" or other prefixes match corresponding fields of 'struct tcpcb'. * * Legend: * (s) - used by userland utilities in src * (p) - used by utilities in ports * (3) - is known to be used by third party software not in ports * (n) - no known usage * * Evil hack: declare only if in_pcb.h and sys/socketvar.h have been * included. Not all of our clients do. */ #if defined(_NETINET_IN_PCB_H_) && defined(_SYS_SOCKETVAR_H_) struct xtcpcb { size_t xt_len; /* length of this structure */ struct xinpcb xt_inp; char xt_stack[TCP_FUNCTION_NAME_LEN_MAX]; /* (n) */ int64_t spare64[8]; int32_t t_state; /* (s,p) */ uint32_t t_flags; /* (s,p) */ int32_t t_sndzerowin; /* (s) */ int32_t t_sndrexmitpack; /* (s) */ int32_t t_rcvoopack; /* (s) */ int32_t t_rcvtime; /* (s) */ int32_t tt_rexmt; /* (s) */ int32_t tt_persist; /* (s) */ int32_t tt_keep; /* (s) */ int32_t tt_2msl; /* (s) */ int32_t tt_delack; /* (s) */ int32_t spare32[32]; } __aligned(8); #ifdef _KERNEL void tcp_inptoxtp(const struct inpcb *, struct xtcpcb *); #endif #endif /* * Identifiers for TCP sysctl nodes */ #define TCPCTL_DO_RFC1323 1 /* use RFC-1323 extensions */ #define TCPCTL_MSSDFLT 3 /* MSS default */ #define TCPCTL_STATS 4 /* statistics */ #define TCPCTL_RTTDFLT 5 /* default RTT estimate */ #define TCPCTL_KEEPIDLE 6 /* keepalive idle timer */ #define TCPCTL_KEEPINTVL 7 /* interval to send keepalives */ #define TCPCTL_SENDSPACE 8 /* send buffer space */ #define TCPCTL_RECVSPACE 9 /* receive buffer space */ #define TCPCTL_KEEPINIT 10 /* timeout for establishing syn */ #define TCPCTL_PCBLIST 11 /* list of all outstanding PCBs */ #define TCPCTL_DELACKTIME 12 /* time before sending delayed ACK */ #define TCPCTL_V6MSSDFLT 13 /* MSS default for IPv6 */ #define TCPCTL_SACK 14 /* Selective Acknowledgement,rfc 2018 */ #define TCPCTL_DROP 15 /* drop tcp connection */ #define TCPCTL_STATES 16 /* connection counts by TCP state */ #ifdef _KERNEL #ifdef SYSCTL_DECL SYSCTL_DECL(_net_inet_tcp); SYSCTL_DECL(_net_inet_tcp_sack); MALLOC_DECLARE(M_TCPLOG); #endif VNET_DECLARE(struct inpcbhead, tcb); /* queue of active tcpcb's */ VNET_DECLARE(struct inpcbinfo, tcbinfo); extern int tcp_log_in_vain; VNET_DECLARE(int, tcp_mssdflt); /* XXX */ VNET_DECLARE(int, tcp_minmss); VNET_DECLARE(int, tcp_delack_enabled); VNET_DECLARE(int, tcp_do_rfc3390); VNET_DECLARE(int, tcp_initcwnd_segments); VNET_DECLARE(int, tcp_sendspace); VNET_DECLARE(int, tcp_recvspace); VNET_DECLARE(int, path_mtu_discovery); VNET_DECLARE(int, tcp_do_rfc3465); VNET_DECLARE(int, tcp_abc_l_var); #define V_tcb VNET(tcb) #define V_tcbinfo VNET(tcbinfo) #define V_tcp_mssdflt VNET(tcp_mssdflt) #define V_tcp_minmss VNET(tcp_minmss) #define V_tcp_delack_enabled VNET(tcp_delack_enabled) #define V_tcp_do_rfc3390 VNET(tcp_do_rfc3390) #define V_tcp_initcwnd_segments VNET(tcp_initcwnd_segments) #define V_tcp_sendspace VNET(tcp_sendspace) #define V_tcp_recvspace 
VNET(tcp_recvspace) #define V_path_mtu_discovery VNET(path_mtu_discovery) #define V_tcp_do_rfc3465 VNET(tcp_do_rfc3465) #define V_tcp_abc_l_var VNET(tcp_abc_l_var) VNET_DECLARE(int, tcp_do_sack); /* SACK enabled/disabled */ VNET_DECLARE(int, tcp_sc_rst_sock_fail); /* RST on sock alloc failure */ #define V_tcp_do_sack VNET(tcp_do_sack) #define V_tcp_sc_rst_sock_fail VNET(tcp_sc_rst_sock_fail) VNET_DECLARE(int, tcp_do_ecn); /* TCP ECN enabled/disabled */ VNET_DECLARE(int, tcp_ecn_maxretries); #define V_tcp_do_ecn VNET(tcp_do_ecn) #define V_tcp_ecn_maxretries VNET(tcp_ecn_maxretries) #ifdef TCP_HHOOK VNET_DECLARE(struct hhook_head *, tcp_hhh[HHOOK_TCP_LAST + 1]); #define V_tcp_hhh VNET(tcp_hhh) #endif VNET_DECLARE(int, tcp_do_rfc6675_pipe); #define V_tcp_do_rfc6675_pipe VNET(tcp_do_rfc6675_pipe) int tcp_addoptions(struct tcpopt *, u_char *); int tcp_ccalgounload(struct cc_algo *unload_algo); struct tcpcb * tcp_close(struct tcpcb *); void tcp_discardcb(struct tcpcb *); void tcp_twstart(struct tcpcb *); void tcp_twclose(struct tcptw *, int); void tcp_ctlinput(int, struct sockaddr *, void *); int tcp_ctloutput(struct socket *, struct sockopt *); struct tcpcb * tcp_drop(struct tcpcb *, int); void tcp_drain(void); void tcp_init(void); void tcp_fini(void *); char *tcp_log_addrs(struct in_conninfo *, struct tcphdr *, void *, const void *); char *tcp_log_vain(struct in_conninfo *, struct tcphdr *, void *, const void *); int tcp_reass(struct tcpcb *, struct tcphdr *, int *, struct mbuf *); void tcp_reass_global_init(void); void tcp_reass_flush(struct tcpcb *); void tcp_dooptions(struct tcpopt *, u_char *, int, int); void tcp_dropwithreset(struct mbuf *, struct tcphdr *, struct tcpcb *, int, int); void tcp_pulloutofband(struct socket *, struct tcphdr *, struct mbuf *, int); void tcp_xmit_timer(struct tcpcb *, int); void tcp_newreno_partial_ack(struct tcpcb *, struct tcphdr *); void cc_ack_received(struct tcpcb *tp, struct tcphdr *th, uint16_t nsegs, uint16_t type); void cc_conn_init(struct tcpcb *tp); void cc_post_recovery(struct tcpcb *tp, struct tcphdr *th); void cc_cong_signal(struct tcpcb *tp, struct tcphdr *th, uint32_t type); #ifdef TCP_HHOOK void hhook_run_tcp_est_in(struct tcpcb *tp, struct tcphdr *th, struct tcpopt *to); #endif int tcp_input(struct mbuf **, int *, int); int tcp_autorcvbuf(struct mbuf *, struct tcphdr *, struct socket *, struct tcpcb *, int); void tcp_do_segment(struct mbuf *, struct tcphdr *, struct socket *, struct tcpcb *, int, int, uint8_t, int); int register_tcp_functions(struct tcp_function_block *blk, int wait); int register_tcp_functions_as_names(struct tcp_function_block *blk, int wait, const char *names[], int *num_names); int register_tcp_functions_as_name(struct tcp_function_block *blk, const char *name, int wait); int deregister_tcp_functions(struct tcp_function_block *blk); struct tcp_function_block *find_and_ref_tcp_functions(struct tcp_function_set *fs); struct tcp_function_block *find_and_ref_tcp_fb(struct tcp_function_block *blk); int tcp_default_ctloutput(struct socket *so, struct sockopt *sopt, struct inpcb *inp, struct tcpcb *tp); uint32_t tcp_maxmtu(struct in_conninfo *, struct tcp_ifcap *); uint32_t tcp_maxmtu6(struct in_conninfo *, struct tcp_ifcap *); u_int tcp_maxseg(const struct tcpcb *); void tcp_mss_update(struct tcpcb *, int, int, struct hc_metrics_lite *, struct tcp_ifcap *); void tcp_mss(struct tcpcb *, int); int tcp_mssopt(struct in_conninfo *); struct inpcb * tcp_drop_syn_sent(struct inpcb *, int); struct tcpcb * tcp_newtcpcb(struct inpcb *); 
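Pulling the pluggable-stack pieces together, here is a rough sketch of how an out-of-tree stack might register itself with the hooks declared above. Only the mandatory members are filled in and they simply reuse the stock handlers; the optional timer and handoff members are left NULL so the built-in behaviour applies, and the module event wiring (DECLARE_MODULE and friends) is omitted. Everything prefixed example_ is invented for illustration:

    #include <sys/param.h>
    #include <sys/kernel.h>
    #include <sys/malloc.h>
    #include <sys/module.h>
    #include <sys/socket.h>
    #include <sys/socketvar.h>
    #include <netinet/in.h>
    #include <netinet/in_pcb.h>
    #include <netinet/tcp.h>
    #include <netinet/tcp_var.h>

    static struct tcp_function_block example_tcp_blk = {
            .tfb_tcp_block_name = "example",
            .tfb_tcp_output = tcp_output,
            .tfb_tcp_do_segment = tcp_do_segment,
            .tfb_tcp_ctloutput = tcp_default_ctloutput,
            /* No private timers or handoff hook: leave the rest NULL. */
    };

    static int
    example_tcp_load(void)
    {

            /* Once registered, the block is selectable by name. */
            return (register_tcp_functions(&example_tcp_blk, M_WAITOK));
    }

    static int
    example_tcp_unload(void)
    {

            /* Expected to fail while connections still reference the block. */
            return (deregister_tcp_functions(&example_tcp_blk));
    }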
int tcp_output(struct tcpcb *); void tcp_state_change(struct tcpcb *, int); void tcp_respond(struct tcpcb *, void *, struct tcphdr *, struct mbuf *, tcp_seq, tcp_seq, int); void tcp_tw_init(void); #ifdef VIMAGE void tcp_tw_destroy(void); #endif void tcp_tw_zone_change(void); int tcp_twcheck(struct inpcb *, struct tcpopt *, struct tcphdr *, struct mbuf *, int); void tcp_setpersist(struct tcpcb *); void tcp_slowtimo(void); struct tcptemp * tcpip_maketemplate(struct inpcb *); void tcpip_fillheaders(struct inpcb *, void *, void *); void tcp_timer_activate(struct tcpcb *, uint32_t, u_int); int tcp_timer_active(struct tcpcb *, uint32_t); void tcp_timer_stop(struct tcpcb *, uint32_t); void tcp_trace(short, short, struct tcpcb *, void *, struct tcphdr *, int); /* * All tcp_hc_* functions are IPv4 and IPv6 (via in_conninfo) */ void tcp_hc_init(void); #ifdef VIMAGE void tcp_hc_destroy(void); #endif void tcp_hc_get(struct in_conninfo *, struct hc_metrics_lite *); uint32_t tcp_hc_getmtu(struct in_conninfo *); void tcp_hc_updatemtu(struct in_conninfo *, uint32_t); void tcp_hc_update(struct in_conninfo *, struct hc_metrics_lite *); extern struct pr_usrreqs tcp_usrreqs; tcp_seq tcp_new_isn(struct tcpcb *); int tcp_sack_doack(struct tcpcb *, struct tcpopt *, tcp_seq); void tcp_update_sack_list(struct tcpcb *tp, tcp_seq rcv_laststart, tcp_seq rcv_lastend); void tcp_clean_sackreport(struct tcpcb *tp); void tcp_sack_adjust(struct tcpcb *tp); struct sackhole *tcp_sack_output(struct tcpcb *tp, int *sack_bytes_rexmt); void tcp_sack_partialack(struct tcpcb *, struct tcphdr *); void tcp_free_sackholes(struct tcpcb *tp); int tcp_newreno(struct tcpcb *, struct tcphdr *); int tcp_compute_pipe(struct tcpcb *); static inline void tcp_fields_to_host(struct tcphdr *th) { th->th_seq = ntohl(th->th_seq); th->th_ack = ntohl(th->th_ack); th->th_win = ntohs(th->th_win); th->th_urp = ntohs(th->th_urp); } static inline void tcp_fields_to_net(struct tcphdr *th) { th->th_seq = htonl(th->th_seq); th->th_ack = htonl(th->th_ack); th->th_win = htons(th->th_win); th->th_urp = htons(th->th_urp); } #endif /* _KERNEL */ #endif /* _NETINET_TCP_VAR_H_ */ Index: projects/runtime-coverage/sys/sys/consio.h =================================================================== --- projects/runtime-coverage/sys/sys/consio.h (revision 322921) +++ projects/runtime-coverage/sys/sys/consio.h (revision 322922) @@ -1,464 +1,466 @@ /*- * Copyright (c) 1991-1996 Søren Schmidt * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer * in this position and unchanged. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. The name of the author may not be used to endorse or promote products * derived from this software without specific prior written permission * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. 
* IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. * * $FreeBSD$ */ #ifndef _SYS_CONSIO_H_ #define _SYS_CONSIO_H_ #ifndef _KERNEL #include #endif #include /* * Console ioctl commands. Some commands are named as KDXXXX, GIO_XXX, and * PIO_XXX, rather than CONS_XXX, for historical and compatibility reasons. * Some other CONS_XXX commands are works as wrapper around frame buffer * ioctl commands FBIO_XXX. Do not try to change all these commands, * otherwise we shall have compatibility problems. */ /* get/set video mode */ #define KD_TEXT 0 /* set text mode restore fonts */ #define KD_TEXT0 0 /* ditto */ #define KD_GRAPHICS 1 /* set graphics mode */ #define KD_TEXT1 2 /* set text mode !restore fonts */ #define KD_PIXEL 3 /* set pixel mode */ #define KDGETMODE _IOR('K', 9, int) #define KDSETMODE _IOWINT('K', 10) /* set border color */ #define KDSBORDER _IOWINT('K', 13) /* set up raster(pixel) text mode */ struct _scr_size { int scr_size[3]; }; typedef struct _scr_size scr_size_t; #define KDRASTER _IOW('K', 100, scr_size_t) /* get/set screen char map */ struct _scrmap { char scrmap[256]; }; typedef struct _scrmap scrmap_t; #define GIO_SCRNMAP _IOR('k', 2, scrmap_t) #define PIO_SCRNMAP _IOW('k', 3, scrmap_t) /* get the current text attribute */ #define GIO_ATTR _IOR('a', 0, int) /* get the current text color */ #define GIO_COLOR _IOR('c', 0, int) /* get the adapter type (equivalent to FBIO_ADPTYPE) */ #define CONS_CURRENT _IOR('c', 1, int) /* get the current video mode (equivalent to FBIO_GETMODE) */ #define CONS_GET _IOR('c', 2, int) /* not supported? */ #define CONS_IO _IO('c', 3) /* set blank time interval */ #define CONS_BLANKTIME _IOW('c', 4, int) /* set/get the screen saver (these ioctls are current noop) */ struct ssaver { #define MAXSSAVER 16 char name[MAXSSAVER]; int num; long time; }; typedef struct ssaver ssaver_t; #define CONS_SSAVER _IOW('c', 5, ssaver_t) #define CONS_GSAVER _IOWR('c', 6, ssaver_t) /* * Set the text cursor type. * * This is an old interface extended to support the CONS_HIDDEN_CURSOR bit. * New code should use CONS_CURSORSHAPE. CONS_CURSOR_ATTRS gives the 3 * bits supported by the (extended) old interface. The old interface is * especially unusable for hiding the cursor (even with its extension) * since it changes the cursor on all vtys. 
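A small userland sketch of the newer shape interface referred to above, assuming a syscons/vt console reachable at /dev/ttyv0. It reads the current shape and only toggles blinking; the CONS_CHARCURSOR_COLORS / CONS_MOUSECURSOR_COLORS bits added just below are not exercised here:

    #include <sys/types.h>
    #include <sys/consio.h>
    #include <sys/ioctl.h>
    #include <err.h>
    #include <fcntl.h>
    #include <unistd.h>

    int
    main(void)
    {
            struct cshape cs;
            int fd;

            if ((fd = open("/dev/ttyv0", O_RDWR)) == -1)
                    err(1, "/dev/ttyv0");

            /* shape[0] carries flag bits, shape[1]/shape[2] base and height. */
            if (ioctl(fd, CONS_GETCURSORSHAPE, &cs) == -1)
                    err(1, "CONS_GETCURSORSHAPE");

            cs.shape[0] |= CONS_BLINK_CURSOR;       /* keep geometry, force blink */
            if (ioctl(fd, CONS_SETCURSORSHAPE, &cs) == -1)
                    err(1, "CONS_SETCURSORSHAPE");

            close(fd);
            return (0);
    }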
*/ #define CONS_CURSORTYPE _IOW('c', 7, int) /* set the bell type to audible or visual */ #define CONS_VISUAL_BELL (1 << 0) #define CONS_QUIET_BELL (1 << 1) #define CONS_BELLTYPE _IOW('c', 8, int) /* set the history (scroll back) buffer size (in lines) */ #define CONS_HISTORY _IOW('c', 9, int) /* clear the history (scroll back) buffer */ #define CONS_CLRHIST _IO('c', 10) /* mouse cursor ioctl */ struct mouse_data { int x; int y; int z; int buttons; }; typedef struct mouse_data mouse_data_t; struct mouse_mode { int mode; int signal; }; typedef struct mouse_mode mouse_mode_t; struct mouse_event { int id; /* one based */ int value; }; typedef struct mouse_event mouse_event_t; struct mouse_info { int operation; #define MOUSE_SHOW 0x01 #define MOUSE_HIDE 0x02 #define MOUSE_MOVEABS 0x03 #define MOUSE_MOVEREL 0x04 #define MOUSE_GETINFO 0x05 #define MOUSE_MODE 0x06 #define MOUSE_ACTION 0x07 #define MOUSE_MOTION_EVENT 0x08 #define MOUSE_BUTTON_EVENT 0x09 #define MOUSE_MOUSECHAR 0x0a union { mouse_data_t data; mouse_mode_t mode; mouse_event_t event; int mouse_char; } u; }; typedef struct mouse_info mouse_info_t; #define CONS_MOUSECTL _IOWR('c', 10, mouse_info_t) /* see if the vty has been idle */ #define CONS_IDLE _IOR('c', 11, int) /* set the screen saver mode */ #define CONS_NO_SAVER (-1) #define CONS_LKM_SAVER 0 #define CONS_USR_SAVER 1 #define CONS_SAVERMODE _IOW('c', 12, int) /* start the screen saver */ #define CONS_SAVERSTART _IOW('c', 13, int) /* set the text cursor shape (see also CONS_CURSORTYPE above) */ #define CONS_BLINK_CURSOR (1 << 0) #define CONS_CHAR_CURSOR (1 << 1) #define CONS_HIDDEN_CURSOR (1 << 2) #define CONS_CURSOR_ATTRS (CONS_BLINK_CURSOR | CONS_CHAR_CURSOR | \ CONS_HIDDEN_CURSOR) +#define CONS_CHARCURSOR_COLORS (1 << 26) +#define CONS_MOUSECURSOR_COLORS (1 << 27) #define CONS_DEFAULT_CURSOR (1 << 28) #define CONS_SHAPEONLY_CURSOR (1 << 29) #define CONS_RESET_CURSOR (1 << 30) #define CONS_LOCAL_CURSOR (1U << 31) struct cshape { /* shape[0]: flags, shape[1]: base, shape[2]: height */ int shape[3]; }; #define CONS_GETCURSORSHAPE _IOWR('c', 14, struct cshape) #define CONS_SETCURSORSHAPE _IOW('c', 15, struct cshape) /* set/get font data */ struct fnt8 { char fnt8x8[8*256]; }; typedef struct fnt8 fnt8_t; struct fnt14 { char fnt8x14[14*256]; }; typedef struct fnt14 fnt14_t; struct fnt16 { char fnt8x16[16*256]; }; typedef struct fnt16 fnt16_t; struct vfnt_map { uint32_t src; uint16_t dst; uint16_t len; }; typedef struct vfnt_map vfnt_map_t; #define VFNT_MAP_NORMAL 0 #define VFNT_MAP_NORMAL_RIGHT 1 #define VFNT_MAP_BOLD 2 #define VFNT_MAP_BOLD_RIGHT 3 #define VFNT_MAPS 4 struct vfnt { vfnt_map_t *map[VFNT_MAPS]; uint8_t *glyphs; unsigned int map_count[VFNT_MAPS]; unsigned int glyph_count; unsigned int width; unsigned int height; }; typedef struct vfnt vfnt_t; #define PIO_FONT8x8 _IOW('c', 64, fnt8_t) #define GIO_FONT8x8 _IOR('c', 65, fnt8_t) #define PIO_FONT8x14 _IOW('c', 66, fnt14_t) #define GIO_FONT8x14 _IOR('c', 67, fnt14_t) #define PIO_FONT8x16 _IOW('c', 68, fnt16_t) #define GIO_FONT8x16 _IOR('c', 69, fnt16_t) #define PIO_VFONT _IOW('c', 70, vfnt_t) #define GIO_VFONT _IOR('c', 71, vfnt_t) #define PIO_VFONT_DEFAULT _IO('c', 72) /* get video mode information */ struct colors { char fore; char back; }; struct vid_info { short size; short m_num; u_short font_size; u_short mv_row, mv_col; u_short mv_rsz, mv_csz; u_short mv_hsz; struct colors mv_norm, mv_rev, mv_grfc; u_char mv_ovscan; u_char mk_keylock; }; typedef struct vid_info vid_info_t; #define CONS_GETINFO _IOWR('c', 73, 
vid_info_t) /* get version */ #define CONS_GETVERS _IOR('c', 74, int) /* get the video adapter index (equivalent to FBIO_ADAPTER) */ #define CONS_CURRENTADP _IOR('c', 100, int) /* get the video adapter information (equivalent to FBIO_ADPINFO) */ #define CONS_ADPINFO _IOWR('c', 101, video_adapter_info_t) /* get the video mode information (equivalent to FBIO_MODEINFO) */ #define CONS_MODEINFO _IOWR('c', 102, video_info_t) /* find a video mode (equivalent to FBIO_FINDMODE) */ #define CONS_FINDMODE _IOWR('c', 103, video_info_t) /* set the frame buffer window origin (equivalent to FBIO_SETWINORG) */ #define CONS_SETWINORG _IOWINT('c', 104) /* use the specified keyboard */ #define CONS_SETKBD _IOWINT('c', 110) /* release the current keyboard */ #define CONS_RELKBD _IO('c', 111) struct scrshot { int x; int y; int xsize; int ysize; u_int16_t* buf; }; typedef struct scrshot scrshot_t; /* Snapshot the current video buffer */ #define CONS_SCRSHOT _IOWR('c', 105, scrshot_t) /* get/set the current terminal emulator info. */ #define TI_NAME_LEN 32 #define TI_DESC_LEN 64 struct term_info { int ti_index; int ti_flags; u_char ti_name[TI_NAME_LEN]; u_char ti_desc[TI_DESC_LEN]; }; typedef struct term_info term_info_t; #define CONS_GETTERM _IOWR('c', 112, term_info_t) #define CONS_SETTERM _IOW('c', 113, term_info_t) /* * Vty switching ioctl commands. */ /* get the next available vty */ #define VT_OPENQRY _IOR('v', 1, int) /* set/get vty switching mode */ #ifndef _VT_MODE_DECLARED #define _VT_MODE_DECLARED struct vt_mode { char mode; #define VT_AUTO 0 /* switching is automatic */ #define VT_PROCESS 1 /* switching controlled by prog */ #define VT_KERNEL 255 /* switching controlled in kernel */ char waitv; /* not implemented yet SOS */ short relsig; short acqsig; short frsig; /* not implemented yet SOS */ }; typedef struct vt_mode vtmode_t; #endif /* !_VT_MODE_DECLARED */ #define VT_SETMODE _IOW('v', 2, vtmode_t) #define VT_GETMODE _IOR('v', 3, vtmode_t) /* acknowledge release or acquisition of a vty */ #define VT_FALSE 0 #define VT_TRUE 1 #define VT_ACKACQ 2 #define VT_RELDISP _IOWINT('v', 4) /* activate the specified vty */ #define VT_ACTIVATE _IOWINT('v', 5) /* wait until the specified vty is activate */ #define VT_WAITACTIVE _IOWINT('v', 6) /* get the currently active vty */ #define VT_GETACTIVE _IOR('v', 7, int) /* get the index of the vty */ #define VT_GETINDEX _IOR('v', 8, int) /* prevent switching vtys */ #define VT_LOCKSWITCH _IOW('v', 9, int) /* * Video mode switching ioctl. See sys/fbio.h for mode numbers. 
*/ #define SW_B40x25 _IO('S', M_B40x25) #define SW_C40x25 _IO('S', M_C40x25) #define SW_B80x25 _IO('S', M_B80x25) #define SW_C80x25 _IO('S', M_C80x25) #define SW_BG320 _IO('S', M_BG320) #define SW_CG320 _IO('S', M_CG320) #define SW_BG640 _IO('S', M_BG640) #define SW_EGAMONO80x25 _IO('S', M_EGAMONO80x25) #define SW_CG320_D _IO('S', M_CG320_D) #define SW_CG640_E _IO('S', M_CG640_E) #define SW_EGAMONOAPA _IO('S', M_EGAMONOAPA) #define SW_CG640x350 _IO('S', M_CG640x350) #define SW_ENH_MONOAPA2 _IO('S', M_ENHMONOAPA2) #define SW_ENH_CG640 _IO('S', M_ENH_CG640) #define SW_ENH_B40x25 _IO('S', M_ENH_B40x25) #define SW_ENH_C40x25 _IO('S', M_ENH_C40x25) #define SW_ENH_B80x25 _IO('S', M_ENH_B80x25) #define SW_ENH_C80x25 _IO('S', M_ENH_C80x25) #define SW_ENH_B80x43 _IO('S', M_ENH_B80x43) #define SW_ENH_C80x43 _IO('S', M_ENH_C80x43) #define SW_MCAMODE _IO('S', M_MCA_MODE) #define SW_VGA_C40x25 _IO('S', M_VGA_C40x25) #define SW_VGA_C80x25 _IO('S', M_VGA_C80x25) #define SW_VGA_C80x30 _IO('S', M_VGA_C80x30) #define SW_VGA_C80x50 _IO('S', M_VGA_C80x50) #define SW_VGA_C80x60 _IO('S', M_VGA_C80x60) #define SW_VGA_M80x25 _IO('S', M_VGA_M80x25) #define SW_VGA_M80x30 _IO('S', M_VGA_M80x30) #define SW_VGA_M80x50 _IO('S', M_VGA_M80x50) #define SW_VGA_M80x60 _IO('S', M_VGA_M80x60) #define SW_VGA11 _IO('S', M_VGA11) #define SW_BG640x480 _IO('S', M_VGA11) #define SW_VGA12 _IO('S', M_VGA12) #define SW_CG640x480 _IO('S', M_VGA12) #define SW_VGA13 _IO('S', M_VGA13) #define SW_VGA_CG320 _IO('S', M_VGA13) #define SW_VGA_CG640 _IO('S', M_VGA_CG640) #define SW_VGA_MODEX _IO('S', M_VGA_MODEX) #define SW_VGA_C90x25 _IO('S', M_VGA_C90x25) #define SW_VGA_M90x25 _IO('S', M_VGA_M90x25) #define SW_VGA_C90x30 _IO('S', M_VGA_C90x30) #define SW_VGA_M90x30 _IO('S', M_VGA_M90x30) #define SW_VGA_C90x43 _IO('S', M_VGA_C90x43) #define SW_VGA_M90x43 _IO('S', M_VGA_M90x43) #define SW_VGA_C90x50 _IO('S', M_VGA_C90x50) #define SW_VGA_M90x50 _IO('S', M_VGA_M90x50) #define SW_VGA_C90x60 _IO('S', M_VGA_C90x60) #define SW_VGA_M90x60 _IO('S', M_VGA_M90x60) #define SW_TEXT_80x25 _IO('S', M_TEXT_80x25) #define SW_TEXT_80x30 _IO('S', M_TEXT_80x30) #define SW_TEXT_80x43 _IO('S', M_TEXT_80x43) #define SW_TEXT_80x50 _IO('S', M_TEXT_80x50) #define SW_TEXT_80x60 _IO('S', M_TEXT_80x60) #define SW_TEXT_132x25 _IO('S', M_TEXT_132x25) #define SW_TEXT_132x30 _IO('S', M_TEXT_132x30) #define SW_TEXT_132x43 _IO('S', M_TEXT_132x43) #define SW_TEXT_132x50 _IO('S', M_TEXT_132x50) #define SW_TEXT_132x60 _IO('S', M_TEXT_132x60) #define SW_VESA_CG640x400 _IO('V', M_VESA_CG640x400 - M_VESA_BASE) #define SW_VESA_CG640x480 _IO('V', M_VESA_CG640x480 - M_VESA_BASE) #define SW_VESA_800x600 _IO('V', M_VESA_800x600 - M_VESA_BASE) #define SW_VESA_CG800x600 _IO('V', M_VESA_CG800x600 - M_VESA_BASE) #define SW_VESA_1024x768 _IO('V', M_VESA_1024x768 - M_VESA_BASE) #define SW_VESA_CG1024x768 _IO('V', M_VESA_CG1024x768 - M_VESA_BASE) #define SW_VESA_1280x1024 _IO('V', M_VESA_1280x1024 - M_VESA_BASE) #define SW_VESA_CG1280x1024 _IO('V', M_VESA_CG1280x1024 - M_VESA_BASE) #define SW_VESA_C80x60 _IO('V', M_VESA_C80x60 - M_VESA_BASE) #define SW_VESA_C132x25 _IO('V', M_VESA_C132x25 - M_VESA_BASE) #define SW_VESA_C132x43 _IO('V', M_VESA_C132x43 - M_VESA_BASE) #define SW_VESA_C132x50 _IO('V', M_VESA_C132x50 - M_VESA_BASE) #define SW_VESA_C132x60 _IO('V', M_VESA_C132x60 - M_VESA_BASE) #define SW_VESA_32K_320 _IO('V', M_VESA_32K_320 - M_VESA_BASE) #define SW_VESA_64K_320 _IO('V', M_VESA_64K_320 - M_VESA_BASE) #define SW_VESA_FULL_320 _IO('V', M_VESA_FULL_320 - M_VESA_BASE) #define 
SW_VESA_32K_640 _IO('V', M_VESA_32K_640 - M_VESA_BASE) #define SW_VESA_64K_640 _IO('V', M_VESA_64K_640 - M_VESA_BASE) #define SW_VESA_FULL_640 _IO('V', M_VESA_FULL_640 - M_VESA_BASE) #define SW_VESA_32K_800 _IO('V', M_VESA_32K_800 - M_VESA_BASE) #define SW_VESA_64K_800 _IO('V', M_VESA_64K_800 - M_VESA_BASE) #define SW_VESA_FULL_800 _IO('V', M_VESA_FULL_800 - M_VESA_BASE) #define SW_VESA_32K_1024 _IO('V', M_VESA_32K_1024 - M_VESA_BASE) #define SW_VESA_64K_1024 _IO('V', M_VESA_64K_1024 - M_VESA_BASE) #define SW_VESA_FULL_1024 _IO('V', M_VESA_FULL_1024 - M_VESA_BASE) #define SW_VESA_32K_1280 _IO('V', M_VESA_32K_1280 - M_VESA_BASE) #define SW_VESA_64K_1280 _IO('V', M_VESA_64K_1280 - M_VESA_BASE) #define SW_VESA_FULL_1280 _IO('V', M_VESA_FULL_1280 - M_VESA_BASE) #endif /* !_SYS_CONSIO_H_ */ Index: projects/runtime-coverage/sys/sys/param.h =================================================================== --- projects/runtime-coverage/sys/sys/param.h (revision 322921) +++ projects/runtime-coverage/sys/sys/param.h (revision 322922) @@ -1,363 +1,363 @@ /*- * Copyright (c) 1982, 1986, 1989, 1993 * The Regents of the University of California. All rights reserved. * (c) UNIX System Laboratories, Inc. * All or some portions of this file are derived from material licensed * to the University of California by American Telephone and Telegraph * Co. or Unix System Laboratories, Inc. and are reproduced herein with * the permission of UNIX System Laboratories, Inc. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * @(#)param.h 8.3 (Berkeley) 4/4/95 * $FreeBSD$ */ #ifndef _SYS_PARAM_H_ #define _SYS_PARAM_H_ #include #define BSD 199506 /* System version (year & month). */ #define BSD4_3 1 #define BSD4_4 1 /* * __FreeBSD_version numbers are documented in the Porter's Handbook. * If you bump the version for any reason, you should update the documentation * there. 
* Currently this lives here in the doc/ repository: * * head/en_US.ISO8859-1/books/porters-handbook/versions/chapter.xml * * scheme is: Rxx * 'R' is in the range 0 to 4 if this is a release branch or * X.0-CURRENT before releng/X.0 is created, otherwise 'R' is * in the range 5 to 9. */ #undef __FreeBSD_version -#define __FreeBSD_version 1200041 /* Master, propagated to newvers */ +#define __FreeBSD_version 1200042 /* Master, propagated to newvers */ /* * __FreeBSD_kernel__ indicates that this system uses the kernel of FreeBSD, * which by definition is always true on FreeBSD. This macro is also defined * on other systems that use the kernel of FreeBSD, such as GNU/kFreeBSD. * * It is tempting to use this macro in userland code when we want to enable * kernel-specific routines, and in fact it's fine to do this in code that * is part of FreeBSD itself. However, be aware that as presence of this * macro is still not widespread (e.g. older FreeBSD versions, 3rd party * compilers, etc), it is STRONGLY DISCOURAGED to check for this macro in * external applications without also checking for __FreeBSD__ as an * alternative. */ #undef __FreeBSD_kernel__ #define __FreeBSD_kernel__ #if defined(_KERNEL) || defined(IN_RTLD) #define P_OSREL_SIGWAIT 700000 #define P_OSREL_SIGSEGV 700004 #define P_OSREL_MAP_ANON 800104 #define P_OSREL_MAP_FSTRICT 1100036 #define P_OSREL_SHUTDOWN_ENOTCONN 1100077 #define P_OSREL_MAP_GUARD 1200035 #define P_OSREL_WRFSBASE 1200041 #define P_OSREL_MAJOR(x) ((x) / 100000) #endif #ifndef LOCORE #include #endif /* * Machine-independent constants (some used in following include files). * Redefined constants are from POSIX 1003.1 limits file. * * MAXCOMLEN should be >= sizeof(ac_comm) (see ) */ #include #define MAXCOMLEN 19 /* max command name remembered */ #define MAXINTERP PATH_MAX /* max interpreter file name length */ #define MAXLOGNAME 33 /* max login name length (incl. NUL) */ #define MAXUPRC CHILD_MAX /* max simultaneous processes */ #define NCARGS ARG_MAX /* max bytes for an exec function */ #define NGROUPS (NGROUPS_MAX+1) /* max number groups */ #define NOFILE OPEN_MAX /* max open files per process */ #define NOGROUP 65535 /* marker for empty group set member */ #define MAXHOSTNAMELEN 256 /* max hostname size */ #define SPECNAMELEN 63 /* max length of devicename */ /* More types and definitions used throughout the kernel. */ #ifdef _KERNEL #include #include #ifndef LOCORE #include #include #endif #ifndef FALSE #define FALSE 0 #endif #ifndef TRUE #define TRUE 1 #endif #endif #ifndef _KERNEL /* Signals. */ #include #endif /* Machine type dependent parameters. */ #include #ifndef _KERNEL #include #endif #ifndef DEV_BSHIFT #define DEV_BSHIFT 9 /* log2(DEV_BSIZE) */ #endif #define DEV_BSIZE (1<>PAGE_SHIFT) #endif /* * btodb() is messy and perhaps slow because `bytes' may be an off_t. We * want to shift an unsigned type to avoid sign extension and we don't * want to widen `bytes' unnecessarily. Assume that the result fits in * a daddr_t. */ #ifndef btodb #define btodb(bytes) /* calculates (bytes / DEV_BSIZE) */ \ (sizeof (bytes) > sizeof(long) \ ? 
(daddr_t)((unsigned long long)(bytes) >> DEV_BSHIFT) \ : (daddr_t)((unsigned long)(bytes) >> DEV_BSHIFT)) #endif #ifndef dbtob #define dbtob(db) /* calculates (db * DEV_BSIZE) */ \ ((off_t)(db) << DEV_BSHIFT) #endif #define PRIMASK 0x0ff #define PCATCH 0x100 /* OR'd with pri for tsleep to check signals */ #define PDROP 0x200 /* OR'd with pri to stop re-entry of interlock mutex */ #define NZERO 0 /* default "nice" */ #define NBBY 8 /* number of bits in a byte */ #define NBPW sizeof(int) /* number of bytes per word (integer) */ #define CMASK 022 /* default file mask: S_IWGRP|S_IWOTH */ #define NODEV (dev_t)(-1) /* non-existent device */ /* * File system parameters and macros. * * MAXBSIZE - Filesystems are made out of blocks of at most MAXBSIZE bytes * per block. MAXBSIZE may be made larger without effecting * any existing filesystems as long as it does not exceed MAXPHYS, * and may be made smaller at the risk of not being able to use * filesystems which require a block size exceeding MAXBSIZE. * * MAXBCACHEBUF - Maximum size of a buffer in the buffer cache. This must * be >= MAXBSIZE and can be set differently for different * architectures by defining it in . * Making this larger allows NFS to do larger reads/writes. * * BKVASIZE - Nominal buffer space per buffer, in bytes. BKVASIZE is the * minimum KVM memory reservation the kernel is willing to make. * Filesystems can of course request smaller chunks. Actual * backing memory uses a chunk size of a page (PAGE_SIZE). * The default value here can be overridden on a per-architecture * basis by defining it in . * * If you make BKVASIZE too small you risk seriously fragmenting * the buffer KVM map which may slow things down a bit. If you * make it too big the kernel will not be able to optimally use * the KVM memory reserved for the buffer cache and will wind * up with too-few buffers. * * The default is 16384, roughly 2x the block size used by a * normal UFS filesystem. */ #define MAXBSIZE 65536 /* must be power of 2 */ #ifndef MAXBCACHEBUF #define MAXBCACHEBUF MAXBSIZE /* must be a power of 2 >= MAXBSIZE */ #endif #ifndef BKVASIZE #define BKVASIZE 16384 /* must be power of 2 */ #endif #define BKVAMASK (BKVASIZE-1) /* * MAXPATHLEN defines the longest permissible path length after expanding * symbolic links. It is used to allocate a temporary buffer from the buffer * pool in which to do the name expansion, hence should be a power of two, * and must be less than or equal to MAXBSIZE. MAXSYMLINKS defines the * maximum number of symbolic links that may be expanded in a path name. * It should be set high enough to allow all legitimate uses, but halt * infinite loops reasonably quickly. */ #define MAXPATHLEN PATH_MAX #define MAXSYMLINKS 32 /* Bit map related macros. */ #define setbit(a,i) (((unsigned char *)(a))[(i)/NBBY] |= 1<<((i)%NBBY)) #define clrbit(a,i) (((unsigned char *)(a))[(i)/NBBY] &= ~(1<<((i)%NBBY))) #define isset(a,i) \ (((const unsigned char *)(a))[(i)/NBBY] & (1<<((i)%NBBY))) #define isclr(a,i) \ ((((const unsigned char *)(a))[(i)/NBBY] & (1<<((i)%NBBY))) == 0) /* Macros for counting and rounding. */ #ifndef howmany #define howmany(x, y) (((x)+((y)-1))/(y)) #endif #define nitems(x) (sizeof((x)) / sizeof((x)[0])) #define rounddown(x, y) (((x)/(y))*(y)) #define rounddown2(x, y) ((x)&(~((y)-1))) /* if y is power of two */ #define roundup(x, y) ((((x)+((y)-1))/(y))*(y)) /* to any y */ #define roundup2(x, y) (((x)+((y)-1))&(~((y)-1))) /* if y is powers of two */ #define powerof2(x) ((((x)-1)&(x))==0) /* Macros for min/max. 
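The power-of-two restriction on the *2 variants above is easy to trip over; a few compile-time spot checks (values chosen arbitrarily) that could sit in any file including sys/param.h:

    #include <sys/param.h>

    /* howmany() is a ceiling divide; roundup() accepts any granularity. */
    _Static_assert(howmany(100, 64) == 2, "ceiling divide");
    _Static_assert(roundup(100, 60) == 120, "round up to any multiple");
    /* The *2 forms are only valid when the granularity is a power of two. */
    _Static_assert(roundup2(100, 64) == 128, "round up, power of two only");
    _Static_assert(rounddown2(100, 64) == 64, "round down, power of two only");
    _Static_assert(powerof2(64) == 1 && powerof2(100) == 0, "power-of-two test");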
*/ #define MIN(a,b) (((a)<(b))?(a):(b)) #define MAX(a,b) (((a)>(b))?(a):(b)) #ifdef _KERNEL /* * Basic byte order function prototypes for non-inline functions. */ #ifndef LOCORE #ifndef _BYTEORDER_PROTOTYPED #define _BYTEORDER_PROTOTYPED __BEGIN_DECLS __uint32_t htonl(__uint32_t); __uint16_t htons(__uint16_t); __uint32_t ntohl(__uint32_t); __uint16_t ntohs(__uint16_t); __END_DECLS #endif #endif #ifndef lint #ifndef _BYTEORDER_FUNC_DEFINED #define _BYTEORDER_FUNC_DEFINED #define htonl(x) __htonl(x) #define htons(x) __htons(x) #define ntohl(x) __ntohl(x) #define ntohs(x) __ntohs(x) #endif /* !_BYTEORDER_FUNC_DEFINED */ #endif /* lint */ #endif /* _KERNEL */ /* * Scale factor for scaled integers used to count %cpu time and load avgs. * * The number of CPU `tick's that map to a unique `%age' can be expressed * by the formula (1 / (2 ^ (FSHIFT - 11))). The maximum load average that * can be calculated (assuming 32 bits) can be closely approximated using * the formula (2 ^ (2 * (16 - FSHIFT))) for (FSHIFT < 15). * * For the scheduler to maintain a 1:1 mapping of CPU `tick' to `%age', * FSHIFT must be at least 11; this gives us a maximum load avg of ~1024. */ #define FSHIFT 11 /* bits to right of fixed binary point */ #define FSCALE (1<> (PAGE_SHIFT - DEV_BSHIFT)) #define ctodb(db) /* calculates pages to devblks */ \ ((db) << (PAGE_SHIFT - DEV_BSHIFT)) /* * Old spelling of __containerof(). */ #define member2struct(s, m, x) \ ((struct s *)(void *)((char *)(x) - offsetof(struct s, m))) /* * Access a variable length array that has been declared as a fixed * length array. */ #define __PAST_END(array, offset) (((__typeof__(*(array)) *)(array))[offset]) #endif /* _SYS_PARAM_H_ */ Index: projects/runtime-coverage/sys/ufs/ffs/ffs_softdep.c =================================================================== --- projects/runtime-coverage/sys/ufs/ffs/ffs_softdep.c (revision 322921) +++ projects/runtime-coverage/sys/ufs/ffs/ffs_softdep.c (revision 322922) @@ -1,14483 +1,14483 @@ /*- * Copyright 1998, 2000 Marshall Kirk McKusick. * Copyright 2009, 2010 Jeffrey W. Roberson * All rights reserved. * * The soft updates code is derived from the appendix of a University * of Michigan technical report (Gregory R. Ganger and Yale N. Patt, * "Soft Updates: A Solution to the Metadata Update Problem in File * Systems", CSE-TR-254-95, August 1995). * * Further information about soft updates can be obtained from: * * Marshall Kirk McKusick http://www.mckusick.com/softdep/ * 1614 Oxford Street mckusick@mckusick.com * Berkeley, CA 94709-1608 +1-510-843-9542 * USA * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHORS ``AS IS'' AND ANY EXPRESS OR * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. 
* IN NO EVENT SHALL THE AUTHORS BE LIABLE FOR ANY DIRECT, INDIRECT, * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, * BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS * OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR * TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE * USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. * * from: @(#)ffs_softdep.c 9.59 (McKusick) 6/21/00 */ #include __FBSDID("$FreeBSD$"); #include "opt_ffs.h" #include "opt_quota.h" #include "opt_ddb.h" /* * For now we want the safety net that the DEBUG flag provides. */ #ifndef DEBUG #define DEBUG #endif #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #define KTR_SUJ 0 /* Define to KTR_SPARE. */ #ifndef SOFTUPDATES int softdep_flushfiles(oldmnt, flags, td) struct mount *oldmnt; int flags; struct thread *td; { panic("softdep_flushfiles called"); } int softdep_mount(devvp, mp, fs, cred) struct vnode *devvp; struct mount *mp; struct fs *fs; struct ucred *cred; { return (0); } void softdep_initialize() { return; } void softdep_uninitialize() { return; } void softdep_unmount(mp) struct mount *mp; { panic("softdep_unmount called"); } void softdep_setup_sbupdate(ump, fs, bp) struct ufsmount *ump; struct fs *fs; struct buf *bp; { panic("softdep_setup_sbupdate called"); } void softdep_setup_inomapdep(bp, ip, newinum, mode) struct buf *bp; struct inode *ip; ino_t newinum; int mode; { panic("softdep_setup_inomapdep called"); } void softdep_setup_blkmapdep(bp, mp, newblkno, frags, oldfrags) struct buf *bp; struct mount *mp; ufs2_daddr_t newblkno; int frags; int oldfrags; { panic("softdep_setup_blkmapdep called"); } void softdep_setup_allocdirect(ip, lbn, newblkno, oldblkno, newsize, oldsize, bp) struct inode *ip; ufs_lbn_t lbn; ufs2_daddr_t newblkno; ufs2_daddr_t oldblkno; long newsize; long oldsize; struct buf *bp; { panic("softdep_setup_allocdirect called"); } void softdep_setup_allocext(ip, lbn, newblkno, oldblkno, newsize, oldsize, bp) struct inode *ip; ufs_lbn_t lbn; ufs2_daddr_t newblkno; ufs2_daddr_t oldblkno; long newsize; long oldsize; struct buf *bp; { panic("softdep_setup_allocext called"); } void softdep_setup_allocindir_page(ip, lbn, bp, ptrno, newblkno, oldblkno, nbp) struct inode *ip; ufs_lbn_t lbn; struct buf *bp; int ptrno; ufs2_daddr_t newblkno; ufs2_daddr_t oldblkno; struct buf *nbp; { panic("softdep_setup_allocindir_page called"); } void softdep_setup_allocindir_meta(nbp, ip, bp, ptrno, newblkno) struct buf *nbp; struct inode *ip; struct buf *bp; int ptrno; ufs2_daddr_t newblkno; { panic("softdep_setup_allocindir_meta called"); } void softdep_journal_freeblocks(ip, cred, length, flags) struct inode *ip; struct ucred *cred; off_t length; int flags; { panic("softdep_journal_freeblocks called"); } void softdep_journal_fsync(ip) struct inode *ip; { panic("softdep_journal_fsync called"); } void softdep_setup_freeblocks(ip, length, flags) struct inode *ip; off_t length; int flags; { panic("softdep_setup_freeblocks called"); } void softdep_freefile(pvp, ino, mode) struct vnode *pvp; ino_t ino; int mode; { panic("softdep_freefile called"); } int 
softdep_setup_directory_add(bp, dp, diroffset, newinum, newdirbp, isnewblk) struct buf *bp; struct inode *dp; off_t diroffset; ino_t newinum; struct buf *newdirbp; int isnewblk; { panic("softdep_setup_directory_add called"); } void softdep_change_directoryentry_offset(bp, dp, base, oldloc, newloc, entrysize) struct buf *bp; struct inode *dp; caddr_t base; caddr_t oldloc; caddr_t newloc; int entrysize; { panic("softdep_change_directoryentry_offset called"); } void softdep_setup_remove(bp, dp, ip, isrmdir) struct buf *bp; struct inode *dp; struct inode *ip; int isrmdir; { panic("softdep_setup_remove called"); } void softdep_setup_directory_change(bp, dp, ip, newinum, isrmdir) struct buf *bp; struct inode *dp; struct inode *ip; ino_t newinum; int isrmdir; { panic("softdep_setup_directory_change called"); } void softdep_setup_blkfree(mp, bp, blkno, frags, wkhd) struct mount *mp; struct buf *bp; ufs2_daddr_t blkno; int frags; struct workhead *wkhd; { panic("%s called", __FUNCTION__); } void softdep_setup_inofree(mp, bp, ino, wkhd) struct mount *mp; struct buf *bp; ino_t ino; struct workhead *wkhd; { panic("%s called", __FUNCTION__); } void softdep_setup_unlink(dp, ip) struct inode *dp; struct inode *ip; { panic("%s called", __FUNCTION__); } void softdep_setup_link(dp, ip) struct inode *dp; struct inode *ip; { panic("%s called", __FUNCTION__); } void softdep_revert_link(dp, ip) struct inode *dp; struct inode *ip; { panic("%s called", __FUNCTION__); } void softdep_setup_rmdir(dp, ip) struct inode *dp; struct inode *ip; { panic("%s called", __FUNCTION__); } void softdep_revert_rmdir(dp, ip) struct inode *dp; struct inode *ip; { panic("%s called", __FUNCTION__); } void softdep_setup_create(dp, ip) struct inode *dp; struct inode *ip; { panic("%s called", __FUNCTION__); } void softdep_revert_create(dp, ip) struct inode *dp; struct inode *ip; { panic("%s called", __FUNCTION__); } void softdep_setup_mkdir(dp, ip) struct inode *dp; struct inode *ip; { panic("%s called", __FUNCTION__); } void softdep_revert_mkdir(dp, ip) struct inode *dp; struct inode *ip; { panic("%s called", __FUNCTION__); } void softdep_setup_dotdot_link(dp, ip) struct inode *dp; struct inode *ip; { panic("%s called", __FUNCTION__); } int softdep_prealloc(vp, waitok) struct vnode *vp; int waitok; { panic("%s called", __FUNCTION__); } int softdep_journal_lookup(mp, vpp) struct mount *mp; struct vnode **vpp; { return (ENOENT); } void softdep_change_linkcnt(ip) struct inode *ip; { panic("softdep_change_linkcnt called"); } void softdep_load_inodeblock(ip) struct inode *ip; { panic("softdep_load_inodeblock called"); } void softdep_update_inodeblock(ip, bp, waitfor) struct inode *ip; struct buf *bp; int waitfor; { panic("softdep_update_inodeblock called"); } int softdep_fsync(vp) struct vnode *vp; /* the "in_core" copy of the inode */ { return (0); } void softdep_fsync_mountdev(vp) struct vnode *vp; { return; } int softdep_flushworklist(oldmnt, countp, td) struct mount *oldmnt; int *countp; struct thread *td; { *countp = 0; return (0); } int softdep_sync_metadata(struct vnode *vp) { panic("softdep_sync_metadata called"); } int softdep_sync_buf(struct vnode *vp, struct buf *bp, int waitfor) { panic("softdep_sync_buf called"); } int softdep_slowdown(vp) struct vnode *vp; { panic("softdep_slowdown called"); } int softdep_request_cleanup(fs, vp, cred, resource) struct fs *fs; struct vnode *vp; struct ucred *cred; int resource; { return (0); } int softdep_check_suspend(struct mount *mp, struct vnode *devvp, int softdep_depcnt, int 
softdep_accdepcnt, int secondary_writes, int secondary_accwrites) { struct bufobj *bo; int error; (void) softdep_depcnt, (void) softdep_accdepcnt; bo = &devvp->v_bufobj; ASSERT_BO_WLOCKED(bo); MNT_ILOCK(mp); while (mp->mnt_secondary_writes != 0) { BO_UNLOCK(bo); msleep(&mp->mnt_secondary_writes, MNT_MTX(mp), (PUSER - 1) | PDROP, "secwr", 0); BO_LOCK(bo); MNT_ILOCK(mp); } /* * Reasons for needing more work before suspend: * - Dirty buffers on devvp. * - Secondary writes occurred after start of vnode sync loop */ error = 0; if (bo->bo_numoutput > 0 || bo->bo_dirty.bv_cnt > 0 || secondary_writes != 0 || mp->mnt_secondary_writes != 0 || secondary_accwrites != mp->mnt_secondary_accwrites) error = EAGAIN; BO_UNLOCK(bo); return (error); } void softdep_get_depcounts(struct mount *mp, int *softdepactivep, int *softdepactiveaccp) { (void) mp; *softdepactivep = 0; *softdepactiveaccp = 0; } void softdep_buf_append(bp, wkhd) struct buf *bp; struct workhead *wkhd; { panic("softdep_buf_appendwork called"); } void softdep_inode_append(ip, cred, wkhd) struct inode *ip; struct ucred *cred; struct workhead *wkhd; { panic("softdep_inode_appendwork called"); } void softdep_freework(wkhd) struct workhead *wkhd; { panic("softdep_freework called"); } #else FEATURE(softupdates, "FFS soft-updates support"); static SYSCTL_NODE(_debug, OID_AUTO, softdep, CTLFLAG_RW, 0, "soft updates stats"); static SYSCTL_NODE(_debug_softdep, OID_AUTO, total, CTLFLAG_RW, 0, "total dependencies allocated"); static SYSCTL_NODE(_debug_softdep, OID_AUTO, highuse, CTLFLAG_RW, 0, "high use dependencies allocated"); static SYSCTL_NODE(_debug_softdep, OID_AUTO, current, CTLFLAG_RW, 0, "current dependencies allocated"); static SYSCTL_NODE(_debug_softdep, OID_AUTO, write, CTLFLAG_RW, 0, "current dependencies written"); unsigned long dep_current[D_LAST + 1]; unsigned long dep_highuse[D_LAST + 1]; unsigned long dep_total[D_LAST + 1]; unsigned long dep_write[D_LAST + 1]; #define SOFTDEP_TYPE(type, str, long) \ static MALLOC_DEFINE(M_ ## type, #str, long); \ SYSCTL_ULONG(_debug_softdep_total, OID_AUTO, str, CTLFLAG_RD, \ &dep_total[D_ ## type], 0, ""); \ SYSCTL_ULONG(_debug_softdep_current, OID_AUTO, str, CTLFLAG_RD, \ &dep_current[D_ ## type], 0, ""); \ SYSCTL_ULONG(_debug_softdep_highuse, OID_AUTO, str, CTLFLAG_RD, \ &dep_highuse[D_ ## type], 0, ""); \ SYSCTL_ULONG(_debug_softdep_write, OID_AUTO, str, CTLFLAG_RD, \ &dep_write[D_ ## type], 0, ""); SOFTDEP_TYPE(PAGEDEP, pagedep, "File page dependencies"); SOFTDEP_TYPE(INODEDEP, inodedep, "Inode dependencies"); SOFTDEP_TYPE(BMSAFEMAP, bmsafemap, "Block or frag allocated from cyl group map"); SOFTDEP_TYPE(NEWBLK, newblk, "New block or frag allocation dependency"); SOFTDEP_TYPE(ALLOCDIRECT, allocdirect, "Block or frag dependency for an inode"); SOFTDEP_TYPE(INDIRDEP, indirdep, "Indirect block dependencies"); SOFTDEP_TYPE(ALLOCINDIR, allocindir, "Block dependency for an indirect block"); SOFTDEP_TYPE(FREEFRAG, freefrag, "Previously used frag for an inode"); SOFTDEP_TYPE(FREEBLKS, freeblks, "Blocks freed from an inode"); SOFTDEP_TYPE(FREEFILE, freefile, "Inode deallocated"); SOFTDEP_TYPE(DIRADD, diradd, "New directory entry"); SOFTDEP_TYPE(MKDIR, mkdir, "New directory"); SOFTDEP_TYPE(DIRREM, dirrem, "Directory entry deleted"); SOFTDEP_TYPE(NEWDIRBLK, newdirblk, "Unclaimed new directory block"); SOFTDEP_TYPE(FREEWORK, freework, "free an inode block"); SOFTDEP_TYPE(FREEDEP, freedep, "track a block free"); SOFTDEP_TYPE(JADDREF, jaddref, "Journal inode ref add"); SOFTDEP_TYPE(JREMREF, jremref, "Journal 
inode ref remove"); SOFTDEP_TYPE(JMVREF, jmvref, "Journal inode ref move"); SOFTDEP_TYPE(JNEWBLK, jnewblk, "Journal new block"); SOFTDEP_TYPE(JFREEBLK, jfreeblk, "Journal free block"); SOFTDEP_TYPE(JFREEFRAG, jfreefrag, "Journal free frag"); SOFTDEP_TYPE(JSEG, jseg, "Journal segment"); SOFTDEP_TYPE(JSEGDEP, jsegdep, "Journal segment complete"); SOFTDEP_TYPE(SBDEP, sbdep, "Superblock write dependency"); SOFTDEP_TYPE(JTRUNC, jtrunc, "Journal inode truncation"); SOFTDEP_TYPE(JFSYNC, jfsync, "Journal fsync complete"); static MALLOC_DEFINE(M_SENTINEL, "sentinel", "Worklist sentinel"); static MALLOC_DEFINE(M_SAVEDINO, "savedino", "Saved inodes"); static MALLOC_DEFINE(M_JBLOCKS, "jblocks", "Journal block locations"); static MALLOC_DEFINE(M_MOUNTDATA, "softdep", "Softdep per-mount data"); #define M_SOFTDEP_FLAGS (M_WAITOK) /* * translate from workitem type to memory type * MUST match the defines above, such that memtype[D_XXX] == M_XXX */ static struct malloc_type *memtype[] = { M_PAGEDEP, M_INODEDEP, M_BMSAFEMAP, M_NEWBLK, M_ALLOCDIRECT, M_INDIRDEP, M_ALLOCINDIR, M_FREEFRAG, M_FREEBLKS, M_FREEFILE, M_DIRADD, M_MKDIR, M_DIRREM, M_NEWDIRBLK, M_FREEWORK, M_FREEDEP, M_JADDREF, M_JREMREF, M_JMVREF, M_JNEWBLK, M_JFREEBLK, M_JFREEFRAG, M_JSEG, M_JSEGDEP, M_SBDEP, M_JTRUNC, M_JFSYNC, M_SENTINEL }; #define DtoM(type) (memtype[type]) /* * Names of malloc types. */ #define TYPENAME(type) \ ((unsigned)(type) <= D_LAST ? memtype[type]->ks_shortdesc : "???") /* * End system adaptation definitions. */ #define DOTDOT_OFFSET offsetof(struct dirtemplate, dotdot_ino) #define DOT_OFFSET offsetof(struct dirtemplate, dot_ino) /* * Internal function prototypes. */ static void check_clear_deps(struct mount *); static void softdep_error(char *, int); static int softdep_process_worklist(struct mount *, int); static int softdep_waitidle(struct mount *, int); static void drain_output(struct vnode *); static struct buf *getdirtybuf(struct buf *, struct rwlock *, int); static int check_inodedep_free(struct inodedep *); static void clear_remove(struct mount *); static void clear_inodedeps(struct mount *); static void unlinked_inodedep(struct mount *, struct inodedep *); static void clear_unlinked_inodedep(struct inodedep *); static struct inodedep *first_unlinked_inodedep(struct ufsmount *); static int flush_pagedep_deps(struct vnode *, struct mount *, struct diraddhd *); static int free_pagedep(struct pagedep *); static int flush_newblk_dep(struct vnode *, struct mount *, ufs_lbn_t); static int flush_inodedep_deps(struct vnode *, struct mount *, ino_t); static int flush_deplist(struct allocdirectlst *, int, int *); static int sync_cgs(struct mount *, int); static int handle_written_filepage(struct pagedep *, struct buf *, int); static int handle_written_sbdep(struct sbdep *, struct buf *); static void initiate_write_sbdep(struct sbdep *); static void diradd_inode_written(struct diradd *, struct inodedep *); static int handle_written_indirdep(struct indirdep *, struct buf *, struct buf**, int); static int handle_written_inodeblock(struct inodedep *, struct buf *, int); static int jnewblk_rollforward(struct jnewblk *, struct fs *, struct cg *, uint8_t *); static int handle_written_bmsafemap(struct bmsafemap *, struct buf *, int); static void handle_written_jaddref(struct jaddref *); static void handle_written_jremref(struct jremref *); static void handle_written_jseg(struct jseg *, struct buf *); static void handle_written_jnewblk(struct jnewblk *); static void handle_written_jblkdep(struct jblkdep *); static void 
handle_written_jfreefrag(struct jfreefrag *); static void complete_jseg(struct jseg *); static void complete_jsegs(struct jseg *); static void jseg_write(struct ufsmount *ump, struct jseg *, uint8_t *); static void jaddref_write(struct jaddref *, struct jseg *, uint8_t *); static void jremref_write(struct jremref *, struct jseg *, uint8_t *); static void jmvref_write(struct jmvref *, struct jseg *, uint8_t *); static void jtrunc_write(struct jtrunc *, struct jseg *, uint8_t *); static void jfsync_write(struct jfsync *, struct jseg *, uint8_t *data); static void jnewblk_write(struct jnewblk *, struct jseg *, uint8_t *); static void jfreeblk_write(struct jfreeblk *, struct jseg *, uint8_t *); static void jfreefrag_write(struct jfreefrag *, struct jseg *, uint8_t *); static inline void inoref_write(struct inoref *, struct jseg *, struct jrefrec *); static void handle_allocdirect_partdone(struct allocdirect *, struct workhead *); static struct jnewblk *cancel_newblk(struct newblk *, struct worklist *, struct workhead *); static void indirdep_complete(struct indirdep *); static int indirblk_lookup(struct mount *, ufs2_daddr_t); static void indirblk_insert(struct freework *); static void indirblk_remove(struct freework *); static void handle_allocindir_partdone(struct allocindir *); static void initiate_write_filepage(struct pagedep *, struct buf *); static void initiate_write_indirdep(struct indirdep*, struct buf *); static void handle_written_mkdir(struct mkdir *, int); static int jnewblk_rollback(struct jnewblk *, struct fs *, struct cg *, uint8_t *); static void initiate_write_bmsafemap(struct bmsafemap *, struct buf *); static void initiate_write_inodeblock_ufs1(struct inodedep *, struct buf *); static void initiate_write_inodeblock_ufs2(struct inodedep *, struct buf *); static void handle_workitem_freefile(struct freefile *); static int handle_workitem_remove(struct dirrem *, int); static struct dirrem *newdirrem(struct buf *, struct inode *, struct inode *, int, struct dirrem **); static struct indirdep *indirdep_lookup(struct mount *, struct inode *, struct buf *); static void cancel_indirdep(struct indirdep *, struct buf *, struct freeblks *); static void free_indirdep(struct indirdep *); static void free_diradd(struct diradd *, struct workhead *); static void merge_diradd(struct inodedep *, struct diradd *); static void complete_diradd(struct diradd *); static struct diradd *diradd_lookup(struct pagedep *, int); static struct jremref *cancel_diradd_dotdot(struct inode *, struct dirrem *, struct jremref *); static struct jremref *cancel_mkdir_dotdot(struct inode *, struct dirrem *, struct jremref *); static void cancel_diradd(struct diradd *, struct dirrem *, struct jremref *, struct jremref *, struct jremref *); static void dirrem_journal(struct dirrem *, struct jremref *, struct jremref *, struct jremref *); static void cancel_allocindir(struct allocindir *, struct buf *bp, struct freeblks *, int); static int setup_trunc_indir(struct freeblks *, struct inode *, ufs_lbn_t, ufs_lbn_t, ufs2_daddr_t); static void complete_trunc_indir(struct freework *); static void trunc_indirdep(struct indirdep *, struct freeblks *, struct buf *, int); static void complete_mkdir(struct mkdir *); static void free_newdirblk(struct newdirblk *); static void free_jremref(struct jremref *); static void free_jaddref(struct jaddref *); static void free_jsegdep(struct jsegdep *); static void free_jsegs(struct jblocks *); static void rele_jseg(struct jseg *); static void free_jseg(struct jseg *, struct jblocks 
*); static void free_jnewblk(struct jnewblk *); static void free_jblkdep(struct jblkdep *); static void free_jfreefrag(struct jfreefrag *); static void free_freedep(struct freedep *); static void journal_jremref(struct dirrem *, struct jremref *, struct inodedep *); static void cancel_jnewblk(struct jnewblk *, struct workhead *); static int cancel_jaddref(struct jaddref *, struct inodedep *, struct workhead *); static void cancel_jfreefrag(struct jfreefrag *); static inline void setup_freedirect(struct freeblks *, struct inode *, int, int); static inline void setup_freeext(struct freeblks *, struct inode *, int, int); static inline void setup_freeindir(struct freeblks *, struct inode *, int, ufs_lbn_t, int); static inline struct freeblks *newfreeblks(struct mount *, struct inode *); static void freeblks_free(struct ufsmount *, struct freeblks *, int); static void indir_trunc(struct freework *, ufs2_daddr_t, ufs_lbn_t); static ufs2_daddr_t blkcount(struct fs *, ufs2_daddr_t, off_t); static int trunc_check_buf(struct buf *, int *, ufs_lbn_t, int, int); static void trunc_dependencies(struct inode *, struct freeblks *, ufs_lbn_t, int, int); static void trunc_pages(struct inode *, off_t, ufs2_daddr_t, int); static int cancel_pagedep(struct pagedep *, struct freeblks *, int); static int deallocate_dependencies(struct buf *, struct freeblks *, int); static void newblk_freefrag(struct newblk*); static void free_newblk(struct newblk *); static void cancel_allocdirect(struct allocdirectlst *, struct allocdirect *, struct freeblks *); static int check_inode_unwritten(struct inodedep *); static int free_inodedep(struct inodedep *); static void freework_freeblock(struct freework *); static void freework_enqueue(struct freework *); static int handle_workitem_freeblocks(struct freeblks *, int); static int handle_complete_freeblocks(struct freeblks *, int); static void handle_workitem_indirblk(struct freework *); static void handle_written_freework(struct freework *); static void merge_inode_lists(struct allocdirectlst *,struct allocdirectlst *); static struct worklist *jnewblk_merge(struct worklist *, struct worklist *, struct workhead *); static struct freefrag *setup_allocindir_phase2(struct buf *, struct inode *, struct inodedep *, struct allocindir *, ufs_lbn_t); static struct allocindir *newallocindir(struct inode *, int, ufs2_daddr_t, ufs2_daddr_t, ufs_lbn_t); static void handle_workitem_freefrag(struct freefrag *); static struct freefrag *newfreefrag(struct inode *, ufs2_daddr_t, long, ufs_lbn_t); static void allocdirect_merge(struct allocdirectlst *, struct allocdirect *, struct allocdirect *); static struct freefrag *allocindir_merge(struct allocindir *, struct allocindir *); static int bmsafemap_find(struct bmsafemap_hashhead *, int, struct bmsafemap **); static struct bmsafemap *bmsafemap_lookup(struct mount *, struct buf *, int cg, struct bmsafemap *); static int newblk_find(struct newblk_hashhead *, ufs2_daddr_t, int, struct newblk **); static int newblk_lookup(struct mount *, ufs2_daddr_t, int, struct newblk **); static int inodedep_find(struct inodedep_hashhead *, ino_t, struct inodedep **); static int inodedep_lookup(struct mount *, ino_t, int, struct inodedep **); static int pagedep_lookup(struct mount *, struct buf *bp, ino_t, ufs_lbn_t, int, struct pagedep **); static int pagedep_find(struct pagedep_hashhead *, ino_t, ufs_lbn_t, struct pagedep **); static void pause_timer(void *); static int request_cleanup(struct mount *, int); static int softdep_request_cleanup_flush(struct mount *, 
struct ufsmount *); static void schedule_cleanup(struct mount *); static void softdep_ast_cleanup_proc(struct thread *); static int process_worklist_item(struct mount *, int, int); static void process_removes(struct vnode *); static void process_truncates(struct vnode *); static void jwork_move(struct workhead *, struct workhead *); static void jwork_insert(struct workhead *, struct jsegdep *); static void add_to_worklist(struct worklist *, int); static void wake_worklist(struct worklist *); static void wait_worklist(struct worklist *, char *); static void remove_from_worklist(struct worklist *); static void softdep_flush(void *); static void softdep_flushjournal(struct mount *); static int softdep_speedup(struct ufsmount *); static void worklist_speedup(struct mount *); static int journal_mount(struct mount *, struct fs *, struct ucred *); static void journal_unmount(struct ufsmount *); static int journal_space(struct ufsmount *, int); static void journal_suspend(struct ufsmount *); static int journal_unsuspend(struct ufsmount *ump); static void softdep_prelink(struct vnode *, struct vnode *); static void add_to_journal(struct worklist *); static void remove_from_journal(struct worklist *); static bool softdep_excess_items(struct ufsmount *, int); static void softdep_process_journal(struct mount *, struct worklist *, int); static struct jremref *newjremref(struct dirrem *, struct inode *, struct inode *ip, off_t, nlink_t); static struct jaddref *newjaddref(struct inode *, ino_t, off_t, int16_t, uint16_t); static inline void newinoref(struct inoref *, ino_t, ino_t, off_t, nlink_t, uint16_t); static inline struct jsegdep *inoref_jseg(struct inoref *); static struct jmvref *newjmvref(struct inode *, ino_t, off_t, off_t); static struct jfreeblk *newjfreeblk(struct freeblks *, ufs_lbn_t, ufs2_daddr_t, int); static void adjust_newfreework(struct freeblks *, int); static struct jtrunc *newjtrunc(struct freeblks *, off_t, int); static void move_newblock_dep(struct jaddref *, struct inodedep *); static void cancel_jfreeblk(struct freeblks *, ufs2_daddr_t); static struct jfreefrag *newjfreefrag(struct freefrag *, struct inode *, ufs2_daddr_t, long, ufs_lbn_t); static struct freework *newfreework(struct ufsmount *, struct freeblks *, struct freework *, ufs_lbn_t, ufs2_daddr_t, int, int, int); static int jwait(struct worklist *, int); static struct inodedep *inodedep_lookup_ip(struct inode *); static int bmsafemap_backgroundwrite(struct bmsafemap *, struct buf *); static struct freefile *handle_bufwait(struct inodedep *, struct workhead *); static void handle_jwork(struct workhead *); static struct mkdir *setup_newdir(struct diradd *, ino_t, ino_t, struct buf *, struct mkdir **); static struct jblocks *jblocks_create(void); static ufs2_daddr_t jblocks_alloc(struct jblocks *, int, int *); static void jblocks_free(struct jblocks *, struct mount *, int); static void jblocks_destroy(struct jblocks *); static void jblocks_add(struct jblocks *, ufs2_daddr_t, int); /* * Exported softdep operations. */ static void softdep_disk_io_initiation(struct buf *); static void softdep_disk_write_complete(struct buf *); static void softdep_deallocate_dependencies(struct buf *); static int softdep_count_dependencies(struct buf *bp, int); /* * Global lock over all of soft updates. 
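 * (Editor's note, not part of the original source.)  This mutex serializes
 * updates to global state shared by all soft-updates mounts, such as the
 * per-type dependency counters (dep_current[], dep_highuse[], dep_total[])
 * and the softdepmounts list.  Per-filesystem work lists are protected by
 * the rwlock embedded in each mount's softdep data; see the
 * ACQUIRE_GBLLOCK()/ACQUIRE_LOCK() macro pairs defined below.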
*/ static struct mtx lk; MTX_SYSINIT(softdep_lock, &lk, "Global Softdep Lock", MTX_DEF); #define ACQUIRE_GBLLOCK(lk) mtx_lock(lk) #define FREE_GBLLOCK(lk) mtx_unlock(lk) #define GBLLOCK_OWNED(lk) mtx_assert((lk), MA_OWNED) /* * Per-filesystem soft-updates locking. */ #define LOCK_PTR(ump) (&(ump)->um_softdep->sd_fslock) #define TRY_ACQUIRE_LOCK(ump) rw_try_wlock(&(ump)->um_softdep->sd_fslock) #define ACQUIRE_LOCK(ump) rw_wlock(&(ump)->um_softdep->sd_fslock) #define FREE_LOCK(ump) rw_wunlock(&(ump)->um_softdep->sd_fslock) #define LOCK_OWNED(ump) rw_assert(&(ump)->um_softdep->sd_fslock, \ RA_WLOCKED) #define BUF_AREC(bp) lockallowrecurse(&(bp)->b_lock) #define BUF_NOREC(bp) lockdisablerecurse(&(bp)->b_lock) /* * Worklist queue management. * These routines require that the lock be held. */ #ifndef /* NOT */ DEBUG #define WORKLIST_INSERT(head, item) do { \ (item)->wk_state |= ONWORKLIST; \ LIST_INSERT_HEAD(head, item, wk_list); \ } while (0) #define WORKLIST_REMOVE(item) do { \ (item)->wk_state &= ~ONWORKLIST; \ LIST_REMOVE(item, wk_list); \ } while (0) #define WORKLIST_INSERT_UNLOCKED WORKLIST_INSERT #define WORKLIST_REMOVE_UNLOCKED WORKLIST_REMOVE #else /* DEBUG */ static void worklist_insert(struct workhead *, struct worklist *, int); static void worklist_remove(struct worklist *, int); #define WORKLIST_INSERT(head, item) worklist_insert(head, item, 1) #define WORKLIST_INSERT_UNLOCKED(head, item) worklist_insert(head, item, 0) #define WORKLIST_REMOVE(item) worklist_remove(item, 1) #define WORKLIST_REMOVE_UNLOCKED(item) worklist_remove(item, 0) static void worklist_insert(head, item, locked) struct workhead *head; struct worklist *item; int locked; { if (locked) LOCK_OWNED(VFSTOUFS(item->wk_mp)); if (item->wk_state & ONWORKLIST) panic("worklist_insert: %p %s(0x%X) already on list", item, TYPENAME(item->wk_type), item->wk_state); item->wk_state |= ONWORKLIST; LIST_INSERT_HEAD(head, item, wk_list); } static void worklist_remove(item, locked) struct worklist *item; int locked; { if (locked) LOCK_OWNED(VFSTOUFS(item->wk_mp)); if ((item->wk_state & ONWORKLIST) == 0) panic("worklist_remove: %p %s(0x%X) not on list", item, TYPENAME(item->wk_type), item->wk_state); item->wk_state &= ~ONWORKLIST; LIST_REMOVE(item, wk_list); } #endif /* DEBUG */ /* * Merge two jsegdeps keeping only the oldest one as newer references * can't be discarded until after older references. */ static inline struct jsegdep * jsegdep_merge(struct jsegdep *one, struct jsegdep *two) { struct jsegdep *swp; if (two == NULL) return (one); if (one->jd_seg->js_seq > two->jd_seg->js_seq) { swp = one; one = two; two = swp; } WORKLIST_REMOVE(&two->jd_list); free_jsegdep(two); return (one); } /* * If two freedeps are compatible free one to reduce list size. */ static inline struct freedep * freedep_merge(struct freedep *one, struct freedep *two) { if (two == NULL) return (one); if (one->fd_freework == two->fd_freework) { WORKLIST_REMOVE(&two->fd_list); free_freedep(two); } return (one); } /* * Move journal work from one list to another. Duplicate freedeps and * jsegdeps are coalesced to keep the lists as small as possible. 
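 * (Editor's note, not part of the original source.)  The coalescing is done
 * by the helpers above: jsegdep_merge() keeps whichever jsegdep points at
 * the older journal segment (smaller js_seq) and frees the other, and
 * freedep_merge() frees a freedep whose fd_freework matches one already on
 * the list.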
*/ static void jwork_move(dst, src) struct workhead *dst; struct workhead *src; { struct freedep *freedep; struct jsegdep *jsegdep; struct worklist *wkn; struct worklist *wk; KASSERT(dst != src, ("jwork_move: dst == src")); freedep = NULL; jsegdep = NULL; LIST_FOREACH_SAFE(wk, dst, wk_list, wkn) { if (wk->wk_type == D_JSEGDEP) jsegdep = jsegdep_merge(WK_JSEGDEP(wk), jsegdep); else if (wk->wk_type == D_FREEDEP) freedep = freedep_merge(WK_FREEDEP(wk), freedep); } while ((wk = LIST_FIRST(src)) != NULL) { WORKLIST_REMOVE(wk); WORKLIST_INSERT(dst, wk); if (wk->wk_type == D_JSEGDEP) { jsegdep = jsegdep_merge(WK_JSEGDEP(wk), jsegdep); continue; } if (wk->wk_type == D_FREEDEP) freedep = freedep_merge(WK_FREEDEP(wk), freedep); } } static void jwork_insert(dst, jsegdep) struct workhead *dst; struct jsegdep *jsegdep; { struct jsegdep *jsegdepn; struct worklist *wk; LIST_FOREACH(wk, dst, wk_list) if (wk->wk_type == D_JSEGDEP) break; if (wk == NULL) { WORKLIST_INSERT(dst, &jsegdep->jd_list); return; } jsegdepn = WK_JSEGDEP(wk); if (jsegdep->jd_seg->js_seq < jsegdepn->jd_seg->js_seq) { WORKLIST_REMOVE(wk); free_jsegdep(jsegdepn); WORKLIST_INSERT(dst, &jsegdep->jd_list); } else free_jsegdep(jsegdep); } /* * Routines for tracking and managing workitems. */ static void workitem_free(struct worklist *, int); static void workitem_alloc(struct worklist *, int, struct mount *); static void workitem_reassign(struct worklist *, int); #define WORKITEM_FREE(item, type) \ workitem_free((struct worklist *)(item), (type)) #define WORKITEM_REASSIGN(item, type) \ workitem_reassign((struct worklist *)(item), (type)) static void workitem_free(item, type) struct worklist *item; int type; { struct ufsmount *ump; #ifdef DEBUG if (item->wk_state & ONWORKLIST) panic("workitem_free: %s(0x%X) still on list", TYPENAME(item->wk_type), item->wk_state); if (item->wk_type != type && type != D_NEWBLK) panic("workitem_free: type mismatch %s != %s", TYPENAME(item->wk_type), TYPENAME(type)); #endif if (item->wk_state & IOWAITING) wakeup(item); ump = VFSTOUFS(item->wk_mp); LOCK_OWNED(ump); KASSERT(ump->softdep_deps > 0, ("workitem_free: %s: softdep_deps going negative", ump->um_fs->fs_fsmnt)); if (--ump->softdep_deps == 0 && ump->softdep_req) wakeup(&ump->softdep_deps); KASSERT(dep_current[item->wk_type] > 0, ("workitem_free: %s: dep_current[%s] going negative", ump->um_fs->fs_fsmnt, TYPENAME(item->wk_type))); KASSERT(ump->softdep_curdeps[item->wk_type] > 0, ("workitem_free: %s: softdep_curdeps[%s] going negative", ump->um_fs->fs_fsmnt, TYPENAME(item->wk_type))); atomic_subtract_long(&dep_current[item->wk_type], 1); ump->softdep_curdeps[item->wk_type] -= 1; free(item, DtoM(type)); } static void workitem_alloc(item, type, mp) struct worklist *item; int type; struct mount *mp; { struct ufsmount *ump; item->wk_type = type; item->wk_mp = mp; item->wk_state = 0; ump = VFSTOUFS(mp); ACQUIRE_GBLLOCK(&lk); dep_current[type]++; if (dep_current[type] > dep_highuse[type]) dep_highuse[type] = dep_current[type]; dep_total[type]++; FREE_GBLLOCK(&lk); ACQUIRE_LOCK(ump); ump->softdep_curdeps[type] += 1; ump->softdep_deps++; ump->softdep_accdeps++; FREE_LOCK(ump); } static void workitem_reassign(item, newtype) struct worklist *item; int newtype; { struct ufsmount *ump; ump = VFSTOUFS(item->wk_mp); LOCK_OWNED(ump); KASSERT(ump->softdep_curdeps[item->wk_type] > 0, ("workitem_reassign: %s: softdep_curdeps[%s] going negative", VFSTOUFS(item->wk_mp)->um_fs->fs_fsmnt, TYPENAME(item->wk_type))); ump->softdep_curdeps[item->wk_type] -= 1; 
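	/*
	 * (Editor's note, not part of the original source.)  The per-mount
	 * softdep_curdeps[] count moves from the old type to the new one
	 * here, under the per-ump lock asserted above; the global
	 * dep_current[]/dep_highuse[]/dep_total[] counters are adjusted
	 * below under the global lock.
	 */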
ump->softdep_curdeps[newtype] += 1; KASSERT(dep_current[item->wk_type] > 0, ("workitem_reassign: %s: dep_current[%s] going negative", VFSTOUFS(item->wk_mp)->um_fs->fs_fsmnt, TYPENAME(item->wk_type))); ACQUIRE_GBLLOCK(&lk); dep_current[newtype]++; dep_current[item->wk_type]--; if (dep_current[newtype] > dep_highuse[newtype]) dep_highuse[newtype] = dep_current[newtype]; dep_total[newtype]++; FREE_GBLLOCK(&lk); item->wk_type = newtype; } /* * Workitem queue management */ static int max_softdeps; /* maximum number of structs before slowdown */ static int tickdelay = 2; /* number of ticks to pause during slowdown */ static int proc_waiting; /* tracks whether we have a timeout posted */ static int *stat_countp; /* statistic to count in proc_waiting timeout */ static struct callout softdep_callout; static int req_clear_inodedeps; /* syncer process flush some inodedeps */ static int req_clear_remove; /* syncer process flush some freeblks */ static int softdep_flushcache = 0; /* Should we do BIO_FLUSH? */ /* * runtime statistics */ static int stat_flush_threads; /* number of softdep flushing threads */ static int stat_worklist_push; /* number of worklist cleanups */ static int stat_blk_limit_push; /* number of times block limit neared */ static int stat_ino_limit_push; /* number of times inode limit neared */ static int stat_blk_limit_hit; /* number of times block slowdown imposed */ static int stat_ino_limit_hit; /* number of times inode slowdown imposed */ static int stat_sync_limit_hit; /* number of synchronous slowdowns imposed */ static int stat_indir_blk_ptrs; /* bufs redirtied as indir ptrs not written */ static int stat_inode_bitmap; /* bufs redirtied as inode bitmap not written */ static int stat_direct_blk_ptrs;/* bufs redirtied as direct ptrs not written */ static int stat_dir_entry; /* bufs redirtied as dir entry cannot write */ static int stat_jaddref; /* bufs redirtied as ino bitmap can not write */ static int stat_jnewblk; /* bufs redirtied as blk bitmap can not write */ static int stat_journal_min; /* Times hit journal min threshold */ static int stat_journal_low; /* Times hit journal low threshold */ static int stat_journal_wait; /* Times blocked in jwait(). */ static int stat_jwait_filepage; /* Times blocked in jwait() for filepage. */ static int stat_jwait_freeblks; /* Times blocked in jwait() for freeblks. */ static int stat_jwait_inode; /* Times blocked in jwait() for inodes. */ static int stat_jwait_newblk; /* Times blocked in jwait() for newblks. 
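 * (Editor's note, not part of the original source.)  These counters, like
 * the rest of the statistics in this block, are exported through the
 * SYSCTL_INT() declarations below under the debug.softdep sysctl tree, so
 * they can be inspected from userland with sysctl(8); most are CTLFLAG_RW
 * and can be cleared by writing zero.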
*/ static int stat_cleanup_high_delay; /* Maximum cleanup delay (in ticks) */ static int stat_cleanup_blkrequests; /* Number of block cleanup requests */ static int stat_cleanup_inorequests; /* Number of inode cleanup requests */ static int stat_cleanup_retries; /* Number of cleanups that needed to flush */ static int stat_cleanup_failures; /* Number of cleanup requests that failed */ static int stat_emptyjblocks; /* Number of potentially empty journal blocks */ SYSCTL_INT(_debug_softdep, OID_AUTO, max_softdeps, CTLFLAG_RW, &max_softdeps, 0, ""); SYSCTL_INT(_debug_softdep, OID_AUTO, tickdelay, CTLFLAG_RW, &tickdelay, 0, ""); SYSCTL_INT(_debug_softdep, OID_AUTO, flush_threads, CTLFLAG_RD, &stat_flush_threads, 0, ""); SYSCTL_INT(_debug_softdep, OID_AUTO, worklist_push, CTLFLAG_RW, &stat_worklist_push, 0,""); SYSCTL_INT(_debug_softdep, OID_AUTO, blk_limit_push, CTLFLAG_RW, &stat_blk_limit_push, 0,""); SYSCTL_INT(_debug_softdep, OID_AUTO, ino_limit_push, CTLFLAG_RW, &stat_ino_limit_push, 0,""); SYSCTL_INT(_debug_softdep, OID_AUTO, blk_limit_hit, CTLFLAG_RW, &stat_blk_limit_hit, 0, ""); SYSCTL_INT(_debug_softdep, OID_AUTO, ino_limit_hit, CTLFLAG_RW, &stat_ino_limit_hit, 0, ""); SYSCTL_INT(_debug_softdep, OID_AUTO, sync_limit_hit, CTLFLAG_RW, &stat_sync_limit_hit, 0, ""); SYSCTL_INT(_debug_softdep, OID_AUTO, indir_blk_ptrs, CTLFLAG_RW, &stat_indir_blk_ptrs, 0, ""); SYSCTL_INT(_debug_softdep, OID_AUTO, inode_bitmap, CTLFLAG_RW, &stat_inode_bitmap, 0, ""); SYSCTL_INT(_debug_softdep, OID_AUTO, direct_blk_ptrs, CTLFLAG_RW, &stat_direct_blk_ptrs, 0, ""); SYSCTL_INT(_debug_softdep, OID_AUTO, dir_entry, CTLFLAG_RW, &stat_dir_entry, 0, ""); SYSCTL_INT(_debug_softdep, OID_AUTO, jaddref_rollback, CTLFLAG_RW, &stat_jaddref, 0, ""); SYSCTL_INT(_debug_softdep, OID_AUTO, jnewblk_rollback, CTLFLAG_RW, &stat_jnewblk, 0, ""); SYSCTL_INT(_debug_softdep, OID_AUTO, journal_low, CTLFLAG_RW, &stat_journal_low, 0, ""); SYSCTL_INT(_debug_softdep, OID_AUTO, journal_min, CTLFLAG_RW, &stat_journal_min, 0, ""); SYSCTL_INT(_debug_softdep, OID_AUTO, journal_wait, CTLFLAG_RW, &stat_journal_wait, 0, ""); SYSCTL_INT(_debug_softdep, OID_AUTO, jwait_filepage, CTLFLAG_RW, &stat_jwait_filepage, 0, ""); SYSCTL_INT(_debug_softdep, OID_AUTO, jwait_freeblks, CTLFLAG_RW, &stat_jwait_freeblks, 0, ""); SYSCTL_INT(_debug_softdep, OID_AUTO, jwait_inode, CTLFLAG_RW, &stat_jwait_inode, 0, ""); SYSCTL_INT(_debug_softdep, OID_AUTO, jwait_newblk, CTLFLAG_RW, &stat_jwait_newblk, 0, ""); SYSCTL_INT(_debug_softdep, OID_AUTO, cleanup_blkrequests, CTLFLAG_RW, &stat_cleanup_blkrequests, 0, ""); SYSCTL_INT(_debug_softdep, OID_AUTO, cleanup_inorequests, CTLFLAG_RW, &stat_cleanup_inorequests, 0, ""); SYSCTL_INT(_debug_softdep, OID_AUTO, cleanup_high_delay, CTLFLAG_RW, &stat_cleanup_high_delay, 0, ""); SYSCTL_INT(_debug_softdep, OID_AUTO, cleanup_retries, CTLFLAG_RW, &stat_cleanup_retries, 0, ""); SYSCTL_INT(_debug_softdep, OID_AUTO, cleanup_failures, CTLFLAG_RW, &stat_cleanup_failures, 0, ""); SYSCTL_INT(_debug_softdep, OID_AUTO, flushcache, CTLFLAG_RW, &softdep_flushcache, 0, ""); SYSCTL_INT(_debug_softdep, OID_AUTO, emptyjblocks, CTLFLAG_RD, &stat_emptyjblocks, 0, ""); SYSCTL_DECL(_vfs_ffs); /* Whether to recompute the summary at mount time */ static int compute_summary_at_mount = 0; SYSCTL_INT(_vfs_ffs, OID_AUTO, compute_summary_at_mount, CTLFLAG_RW, &compute_summary_at_mount, 0, "Recompute summary at mount"); static int print_threads = 0; SYSCTL_INT(_debug_softdep, OID_AUTO, print_threads, CTLFLAG_RW, &print_threads, 0, "Notify flusher thread 
start/stop"); /* List of all filesystems mounted with soft updates */ static TAILQ_HEAD(, mount_softdeps) softdepmounts; /* * This function cleans the worklist for a filesystem. * Each filesystem running with soft dependencies gets its own * thread to run in this function. The thread is started up in * softdep_mount and shutdown in softdep_unmount. They show up * as part of the kernel "bufdaemon" process whose process * entry is available in bufdaemonproc. */ static int searchfailed; extern struct proc *bufdaemonproc; static void softdep_flush(addr) void *addr; { struct mount *mp; struct thread *td; struct ufsmount *ump; td = curthread; td->td_pflags |= TDP_NORUNNINGBUF; mp = (struct mount *)addr; ump = VFSTOUFS(mp); atomic_add_int(&stat_flush_threads, 1); ACQUIRE_LOCK(ump); ump->softdep_flags &= ~FLUSH_STARTING; wakeup(&ump->softdep_flushtd); FREE_LOCK(ump); if (print_threads) { if (stat_flush_threads == 1) printf("Running %s at pid %d\n", bufdaemonproc->p_comm, bufdaemonproc->p_pid); printf("Start thread %s\n", td->td_name); } for (;;) { while (softdep_process_worklist(mp, 0) > 0 || (MOUNTEDSUJ(mp) && VFSTOUFS(mp)->softdep_jblocks->jb_suspended)) kthread_suspend_check(); ACQUIRE_LOCK(ump); if ((ump->softdep_flags & (FLUSH_CLEANUP | FLUSH_EXIT)) == 0) msleep(&ump->softdep_flushtd, LOCK_PTR(ump), PVM, "sdflush", hz / 2); ump->softdep_flags &= ~FLUSH_CLEANUP; /* * Check to see if we are done and need to exit. */ if ((ump->softdep_flags & FLUSH_EXIT) == 0) { FREE_LOCK(ump); continue; } ump->softdep_flags &= ~FLUSH_EXIT; FREE_LOCK(ump); wakeup(&ump->softdep_flags); if (print_threads) printf("Stop thread %s: searchfailed %d, did cleanups %d\n", td->td_name, searchfailed, ump->um_softdep->sd_cleanups); atomic_subtract_int(&stat_flush_threads, 1); kthread_exit(); panic("kthread_exit failed\n"); } } static void worklist_speedup(mp) struct mount *mp; { struct ufsmount *ump; ump = VFSTOUFS(mp); LOCK_OWNED(ump); if ((ump->softdep_flags & (FLUSH_CLEANUP | FLUSH_EXIT)) == 0) ump->softdep_flags |= FLUSH_CLEANUP; wakeup(&ump->softdep_flushtd); } static int softdep_speedup(ump) struct ufsmount *ump; { struct ufsmount *altump; struct mount_softdeps *sdp; LOCK_OWNED(ump); worklist_speedup(ump->um_mountp); bd_speedup(); /* * If we have global shortages, then we need other * filesystems to help with the cleanup. Here we wakeup a * flusher thread for a filesystem that is over its fair * share of resources. */ if (req_clear_inodedeps || req_clear_remove) { ACQUIRE_GBLLOCK(&lk); TAILQ_FOREACH(sdp, &softdepmounts, sd_next) { if ((altump = sdp->sd_ump) == ump) continue; if (((req_clear_inodedeps && altump->softdep_curdeps[D_INODEDEP] > max_softdeps / stat_flush_threads) || (req_clear_remove && altump->softdep_curdeps[D_DIRREM] > (max_softdeps / 2) / stat_flush_threads)) && TRY_ACQUIRE_LOCK(altump)) break; } if (sdp == NULL) { searchfailed++; FREE_GBLLOCK(&lk); } else { /* * Move to the end of the list so we pick a * different one on out next try. */ TAILQ_REMOVE(&softdepmounts, sdp, sd_next); TAILQ_INSERT_TAIL(&softdepmounts, sdp, sd_next); FREE_GBLLOCK(&lk); if ((altump->softdep_flags & (FLUSH_CLEANUP | FLUSH_EXIT)) == 0) altump->softdep_flags |= FLUSH_CLEANUP; altump->um_softdep->sd_cleanups++; wakeup(&altump->softdep_flushtd); FREE_LOCK(altump); } } return (speedup_syncer()); } /* * Add an item to the end of the work queue. * This routine requires that the lock be held. * This is the only routine that adds items to the list. 
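 * (Editor's note, not part of the original source.)  Callers normally
 * append at the tail tracked by softdep_worklist_tail; passing WK_HEAD
 * queues the item at the front instead, and WK_NODELAY wakes the
 * per-mount flusher thread immediately.  See the flag definitions below.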
* The following routine is the only one that removes items * and does so in order from first to last. */ #define WK_HEAD 0x0001 /* Add to HEAD. */ #define WK_NODELAY 0x0002 /* Process immediately. */ static void add_to_worklist(wk, flags) struct worklist *wk; int flags; { struct ufsmount *ump; ump = VFSTOUFS(wk->wk_mp); LOCK_OWNED(ump); if (wk->wk_state & ONWORKLIST) panic("add_to_worklist: %s(0x%X) already on list", TYPENAME(wk->wk_type), wk->wk_state); wk->wk_state |= ONWORKLIST; if (ump->softdep_on_worklist == 0) { LIST_INSERT_HEAD(&ump->softdep_workitem_pending, wk, wk_list); ump->softdep_worklist_tail = wk; } else if (flags & WK_HEAD) { LIST_INSERT_HEAD(&ump->softdep_workitem_pending, wk, wk_list); } else { LIST_INSERT_AFTER(ump->softdep_worklist_tail, wk, wk_list); ump->softdep_worklist_tail = wk; } ump->softdep_on_worklist += 1; if (flags & WK_NODELAY) worklist_speedup(wk->wk_mp); } /* * Remove the item to be processed. If we are removing the last * item on the list, we need to recalculate the tail pointer. */ static void remove_from_worklist(wk) struct worklist *wk; { struct ufsmount *ump; ump = VFSTOUFS(wk->wk_mp); WORKLIST_REMOVE(wk); if (ump->softdep_worklist_tail == wk) ump->softdep_worklist_tail = (struct worklist *)wk->wk_list.le_prev; ump->softdep_on_worklist -= 1; } static void wake_worklist(wk) struct worklist *wk; { if (wk->wk_state & IOWAITING) { wk->wk_state &= ~IOWAITING; wakeup(wk); } } static void wait_worklist(wk, wmesg) struct worklist *wk; char *wmesg; { struct ufsmount *ump; ump = VFSTOUFS(wk->wk_mp); wk->wk_state |= IOWAITING; msleep(wk, LOCK_PTR(ump), PVM, wmesg, 0); } /* * Process that runs once per second to handle items in the background queue. * * Note that we ensure that everything is done in the order in which they * appear in the queue. The code below depends on this property to ensure * that blocks of a file are freed before the inode itself is freed. This * ordering ensures that no new triples will be generated * until all the old ones have been purged from the dependency lists. */ static int softdep_process_worklist(mp, full) struct mount *mp; int full; { int cnt, matchcnt; struct ufsmount *ump; long starttime; KASSERT(mp != NULL, ("softdep_process_worklist: NULL mp")); if (MOUNTEDSOFTDEP(mp) == 0) return (0); matchcnt = 0; ump = VFSTOUFS(mp); ACQUIRE_LOCK(ump); starttime = time_second; softdep_process_journal(mp, NULL, full ? MNT_WAIT : 0); check_clear_deps(mp); while (ump->softdep_on_worklist > 0) { if ((cnt = process_worklist_item(mp, 10, LK_NOWAIT)) == 0) break; else matchcnt += cnt; check_clear_deps(mp); /* * We do not generally want to stop for buffer space, but if * we are really being a buffer hog, we will stop and wait. */ if (should_yield()) { FREE_LOCK(ump); kern_yield(PRI_USER); bwillwrite(); ACQUIRE_LOCK(ump); } /* * Never allow processing to run for more than one * second. This gives the syncer thread the opportunity * to pause if appropriate. */ if (!full && starttime != time_second) break; } if (full == 0) journal_unsuspend(ump); FREE_LOCK(ump); return (matchcnt); } /* * Process all removes associated with a vnode if we are running out of * journal space. Any other process which attempts to flush these will * be unable as we have the vnodes locked. 
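 * (Editor's note, not part of the original source.)  A dirrem on the
 * inodedep's id_dirremhd list that is COMPLETE and on the worklist is
 * pulled off and handed straight to handle_workitem_remove(); if another
 * thread has already marked an entry INPROGRESS, the routine sleeps in
 * wait_worklist() and rescans the list from the top.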
*/ static void process_removes(vp) struct vnode *vp; { struct inodedep *inodedep; struct dirrem *dirrem; struct ufsmount *ump; struct mount *mp; ino_t inum; mp = vp->v_mount; ump = VFSTOUFS(mp); LOCK_OWNED(ump); inum = VTOI(vp)->i_number; for (;;) { top: if (inodedep_lookup(mp, inum, 0, &inodedep) == 0) return; LIST_FOREACH(dirrem, &inodedep->id_dirremhd, dm_inonext) { /* * If another thread is trying to lock this vnode * it will fail but we must wait for it to do so * before we can proceed. */ if (dirrem->dm_state & INPROGRESS) { wait_worklist(&dirrem->dm_list, "pwrwait"); goto top; } if ((dirrem->dm_state & (COMPLETE | ONWORKLIST)) == (COMPLETE | ONWORKLIST)) break; } if (dirrem == NULL) return; remove_from_worklist(&dirrem->dm_list); FREE_LOCK(ump); if (vn_start_secondary_write(NULL, &mp, V_NOWAIT)) panic("process_removes: suspended filesystem"); handle_workitem_remove(dirrem, 0); vn_finished_secondary_write(mp); ACQUIRE_LOCK(ump); } } /* * Process all truncations associated with a vnode if we are running out * of journal space. This is called when the vnode lock is already held * and no other process can clear the truncation. This function returns * a value greater than zero if it did any work. */ static void process_truncates(vp) struct vnode *vp; { struct inodedep *inodedep; struct freeblks *freeblks; struct ufsmount *ump; struct mount *mp; ino_t inum; int cgwait; mp = vp->v_mount; ump = VFSTOUFS(mp); LOCK_OWNED(ump); inum = VTOI(vp)->i_number; for (;;) { if (inodedep_lookup(mp, inum, 0, &inodedep) == 0) return; cgwait = 0; TAILQ_FOREACH(freeblks, &inodedep->id_freeblklst, fb_next) { /* Journal entries not yet written. */ if (!LIST_EMPTY(&freeblks->fb_jblkdephd)) { jwait(&LIST_FIRST( &freeblks->fb_jblkdephd)->jb_list, MNT_WAIT); break; } /* Another thread is executing this item. */ if (freeblks->fb_state & INPROGRESS) { wait_worklist(&freeblks->fb_list, "ptrwait"); break; } /* Freeblks is waiting on a inode write. */ if ((freeblks->fb_state & COMPLETE) == 0) { FREE_LOCK(ump); ffs_update(vp, 1); ACQUIRE_LOCK(ump); break; } if ((freeblks->fb_state & (ALLCOMPLETE | ONWORKLIST)) == (ALLCOMPLETE | ONWORKLIST)) { remove_from_worklist(&freeblks->fb_list); freeblks->fb_state |= INPROGRESS; FREE_LOCK(ump); if (vn_start_secondary_write(NULL, &mp, V_NOWAIT)) panic("process_truncates: " "suspended filesystem"); handle_workitem_freeblocks(freeblks, 0); vn_finished_secondary_write(mp); ACQUIRE_LOCK(ump); break; } if (freeblks->fb_cgwait) cgwait++; } if (cgwait) { FREE_LOCK(ump); sync_cgs(mp, MNT_WAIT); ffs_sync_snap(mp, MNT_WAIT); ACQUIRE_LOCK(ump); continue; } if (freeblks == NULL) break; } return; } /* * Process one item on the worklist. */ static int process_worklist_item(mp, target, flags) struct mount *mp; int target; int flags; { struct worklist sentinel; struct worklist *wk; struct ufsmount *ump; int matchcnt; int error; KASSERT(mp != NULL, ("process_worklist_item: NULL mp")); /* * If we are being called because of a process doing a * copy-on-write, then it is not safe to write as we may * recurse into the copy-on-write routine. */ if (curthread->td_pflags & TDP_COWINPROGRESS) return (-1); PHOLD(curproc); /* Don't let the stack go away. 
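 * (Editor's note, not part of the original source.)  PHOLD() holds the
 * process so its kernel stack stays resident while the loop below sleeps
 * processing work items; the matching PRELE() is at the end of this
 * function.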
*/ ump = VFSTOUFS(mp); LOCK_OWNED(ump); matchcnt = 0; sentinel.wk_mp = NULL; sentinel.wk_type = D_SENTINEL; LIST_INSERT_HEAD(&ump->softdep_workitem_pending, &sentinel, wk_list); for (wk = LIST_NEXT(&sentinel, wk_list); wk != NULL; wk = LIST_NEXT(&sentinel, wk_list)) { if (wk->wk_type == D_SENTINEL) { LIST_REMOVE(&sentinel, wk_list); LIST_INSERT_AFTER(wk, &sentinel, wk_list); continue; } if (wk->wk_state & INPROGRESS) panic("process_worklist_item: %p already in progress.", wk); wk->wk_state |= INPROGRESS; remove_from_worklist(wk); FREE_LOCK(ump); if (vn_start_secondary_write(NULL, &mp, V_NOWAIT)) panic("process_worklist_item: suspended filesystem"); switch (wk->wk_type) { case D_DIRREM: /* removal of a directory entry */ error = handle_workitem_remove(WK_DIRREM(wk), flags); break; case D_FREEBLKS: /* releasing blocks and/or fragments from a file */ error = handle_workitem_freeblocks(WK_FREEBLKS(wk), flags); break; case D_FREEFRAG: /* releasing a fragment when replaced as a file grows */ handle_workitem_freefrag(WK_FREEFRAG(wk)); error = 0; break; case D_FREEFILE: /* releasing an inode when its link count drops to 0 */ handle_workitem_freefile(WK_FREEFILE(wk)); error = 0; break; default: panic("%s_process_worklist: Unknown type %s", "softdep", TYPENAME(wk->wk_type)); /* NOTREACHED */ } vn_finished_secondary_write(mp); ACQUIRE_LOCK(ump); if (error == 0) { if (++matchcnt == target) break; continue; } /* * We have to retry the worklist item later. Wake up any * waiters who may be able to complete it immediately and * add the item back to the head so we don't try to execute * it again. */ wk->wk_state &= ~INPROGRESS; wake_worklist(wk); add_to_worklist(wk, WK_HEAD); } LIST_REMOVE(&sentinel, wk_list); /* Sentinal could've become the tail from remove_from_worklist. */ if (ump->softdep_worklist_tail == &sentinel) ump->softdep_worklist_tail = (struct worklist *)sentinel.wk_list.le_prev; PRELE(curproc); return (matchcnt); } /* * Move dependencies from one buffer to another. */ int softdep_move_dependencies(oldbp, newbp) struct buf *oldbp; struct buf *newbp; { struct worklist *wk, *wktail; struct ufsmount *ump; int dirty; if ((wk = LIST_FIRST(&oldbp->b_dep)) == NULL) return (0); KASSERT(MOUNTEDSOFTDEP(wk->wk_mp) != 0, ("softdep_move_dependencies called on non-softdep filesystem")); dirty = 0; wktail = NULL; ump = VFSTOUFS(wk->wk_mp); ACQUIRE_LOCK(ump); while ((wk = LIST_FIRST(&oldbp->b_dep)) != NULL) { LIST_REMOVE(wk, wk_list); if (wk->wk_type == D_BMSAFEMAP && bmsafemap_backgroundwrite(WK_BMSAFEMAP(wk), newbp)) dirty = 1; if (wktail == NULL) LIST_INSERT_HEAD(&newbp->b_dep, wk, wk_list); else LIST_INSERT_AFTER(wktail, wk, wk_list); wktail = wk; } FREE_LOCK(ump); return (dirty); } /* * Purge the work list of all items associated with a particular mount point. */ int softdep_flushworklist(oldmnt, countp, td) struct mount *oldmnt; int *countp; struct thread *td; { struct vnode *devvp; struct ufsmount *ump; int count, error; /* * Alternately flush the block device associated with the mount * point and process any dependencies that the flushing * creates. We continue until no more worklist dependencies * are found. 
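 * (Editor's note, not part of the original source.)  Concretely, each
 * iteration below calls softdep_process_worklist(oldmnt, 1) and, while that
 * keeps finding work, locks the device vnode and issues a synchronous
 * VOP_FSYNC(devvp, MNT_WAIT, td) to push out the writes the processing
 * generated; the number of items handled is accumulated in *countp.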
*/ *countp = 0; error = 0; ump = VFSTOUFS(oldmnt); devvp = ump->um_devvp; while ((count = softdep_process_worklist(oldmnt, 1)) > 0) { *countp += count; vn_lock(devvp, LK_EXCLUSIVE | LK_RETRY); error = VOP_FSYNC(devvp, MNT_WAIT, td); VOP_UNLOCK(devvp, 0); if (error != 0) break; } return (error); } #define SU_WAITIDLE_RETRIES 20 static int softdep_waitidle(struct mount *mp, int flags __unused) { struct ufsmount *ump; struct vnode *devvp; struct thread *td; int error, i; ump = VFSTOUFS(mp); devvp = ump->um_devvp; td = curthread; error = 0; ACQUIRE_LOCK(ump); for (i = 0; i < SU_WAITIDLE_RETRIES && ump->softdep_deps != 0; i++) { ump->softdep_req = 1; KASSERT((flags & FORCECLOSE) == 0 || ump->softdep_on_worklist == 0, ("softdep_waitidle: work added after flush")); msleep(&ump->softdep_deps, LOCK_PTR(ump), PVM | PDROP, "softdeps", 10 * hz); vn_lock(devvp, LK_EXCLUSIVE | LK_RETRY); error = VOP_FSYNC(devvp, MNT_WAIT, td); VOP_UNLOCK(devvp, 0); ACQUIRE_LOCK(ump); if (error != 0) break; } ump->softdep_req = 0; if (i == SU_WAITIDLE_RETRIES && error == 0 && ump->softdep_deps != 0) { error = EBUSY; printf("softdep_waitidle: Failed to flush worklist for %p\n", mp); } FREE_LOCK(ump); return (error); } /* * Flush all vnodes and worklist items associated with a specified mount point. */ int softdep_flushfiles(oldmnt, flags, td) struct mount *oldmnt; int flags; struct thread *td; { #ifdef QUOTA struct ufsmount *ump; int i; #endif int error, early, depcount, loopcnt, retry_flush_count, retry; int morework; KASSERT(MOUNTEDSOFTDEP(oldmnt) != 0, ("softdep_flushfiles called on non-softdep filesystem")); loopcnt = 10; retry_flush_count = 3; retry_flush: error = 0; /* * Alternately flush the vnodes associated with the mount * point and process any dependencies that the flushing * creates. In theory, this loop can happen at most twice, * but we give it a few extra just to be sure. */ for (; loopcnt > 0; loopcnt--) { /* * Do another flush in case any vnodes were brought in * as part of the cleanup operations. */ early = retry_flush_count == 1 || (oldmnt->mnt_kern_flag & MNTK_UNMOUNT) == 0 ? 0 : EARLYFLUSH; if ((error = ffs_flushfiles(oldmnt, flags | early, td)) != 0) break; if ((error = softdep_flushworklist(oldmnt, &depcount, td)) != 0 || depcount == 0) break; } /* * If we are unmounting then it is an error to fail. If we * are simply trying to downgrade to read-only, then filesystem * activity can keep us busy forever, so we just fail with EBUSY. */ if (loopcnt == 0) { if (oldmnt->mnt_kern_flag & MNTK_UNMOUNT) panic("softdep_flushfiles: looping"); error = EBUSY; } if (!error) error = softdep_waitidle(oldmnt, flags); if (!error) { if (oldmnt->mnt_kern_flag & MNTK_UNMOUNT) { retry = 0; MNT_ILOCK(oldmnt); KASSERT((oldmnt->mnt_kern_flag & MNTK_NOINSMNTQ) != 0, ("softdep_flushfiles: !MNTK_NOINSMNTQ")); morework = oldmnt->mnt_nvnodelistsize > 0; #ifdef QUOTA ump = VFSTOUFS(oldmnt); UFS_LOCK(ump); for (i = 0; i < MAXQUOTAS; i++) { if (ump->um_quotas[i] != NULLVP) morework = 1; } UFS_UNLOCK(ump); #endif if (morework) { if (--retry_flush_count > 0) { retry = 1; loopcnt = 3; } else error = EBUSY; } MNT_IUNLOCK(oldmnt); if (retry) goto retry_flush; } } return (error); } /* * Structure hashing. * * There are four types of structures that can be looked up: * 1) pagedep structures identified by mount point, inode number, * and logical block. * 2) inodedep structures identified by mount point and inode number. * 3) newblk structures identified by mount point and * physical block number. 
* 4) bmsafemap structures identified by mount point and * cylinder group number. * * The "pagedep" and "inodedep" dependency structures are hashed * separately from the file blocks and inodes to which they correspond. * This separation helps when the in-memory copy of an inode or * file block must be replaced. It also obviates the need to access * an inode or file page when simply updating (or de-allocating) * dependency structures. Lookup of newblk structures is needed to * find newly allocated blocks when trying to associate them with * their allocdirect or allocindir structure. * * The lookup routines optionally create and hash a new instance when * an existing entry is not found. The bmsafemap lookup routine always * allocates a new structure if an existing one is not found. */ #define DEPALLOC 0x0001 /* allocate structure if lookup fails */ /* * Structures and routines associated with pagedep caching. */ #define PAGEDEP_HASH(ump, inum, lbn) \ (&(ump)->pagedep_hashtbl[((inum) + (lbn)) & (ump)->pagedep_hash_size]) static int pagedep_find(pagedephd, ino, lbn, pagedeppp) struct pagedep_hashhead *pagedephd; ino_t ino; ufs_lbn_t lbn; struct pagedep **pagedeppp; { struct pagedep *pagedep; LIST_FOREACH(pagedep, pagedephd, pd_hash) { if (ino == pagedep->pd_ino && lbn == pagedep->pd_lbn) { *pagedeppp = pagedep; return (1); } } *pagedeppp = NULL; return (0); } /* * Look up a pagedep. Return 1 if found, 0 otherwise. * If not found, allocate if DEPALLOC flag is passed. * Found or allocated entry is returned in pagedeppp. * This routine must be called with splbio interrupts blocked. */ static int pagedep_lookup(mp, bp, ino, lbn, flags, pagedeppp) struct mount *mp; struct buf *bp; ino_t ino; ufs_lbn_t lbn; int flags; struct pagedep **pagedeppp; { struct pagedep *pagedep; struct pagedep_hashhead *pagedephd; struct worklist *wk; struct ufsmount *ump; int ret; int i; ump = VFSTOUFS(mp); LOCK_OWNED(ump); if (bp) { LIST_FOREACH(wk, &bp->b_dep, wk_list) { if (wk->wk_type == D_PAGEDEP) { *pagedeppp = WK_PAGEDEP(wk); return (1); } } } pagedephd = PAGEDEP_HASH(ump, ino, lbn); ret = pagedep_find(pagedephd, ino, lbn, pagedeppp); if (ret) { if (((*pagedeppp)->pd_state & ONWORKLIST) == 0 && bp) WORKLIST_INSERT(&bp->b_dep, &(*pagedeppp)->pd_list); return (1); } if ((flags & DEPALLOC) == 0) return (0); FREE_LOCK(ump); pagedep = malloc(sizeof(struct pagedep), M_PAGEDEP, M_SOFTDEP_FLAGS|M_ZERO); workitem_alloc(&pagedep->pd_list, D_PAGEDEP, mp); ACQUIRE_LOCK(ump); ret = pagedep_find(pagedephd, ino, lbn, pagedeppp); if (*pagedeppp) { /* * This should never happen since we only create pagedeps * with the vnode lock held. Could be an assert. */ WORKITEM_FREE(pagedep, D_PAGEDEP); return (ret); } pagedep->pd_ino = ino; pagedep->pd_lbn = lbn; LIST_INIT(&pagedep->pd_dirremhd); LIST_INIT(&pagedep->pd_pendinghd); for (i = 0; i < DAHASHSZ; i++) LIST_INIT(&pagedep->pd_diraddhd[i]); LIST_INSERT_HEAD(pagedephd, pagedep, pd_hash); WORKLIST_INSERT(&bp->b_dep, &pagedep->pd_list); *pagedeppp = pagedep; return (0); } /* * Structures and routines associated with inodedep caching. */ #define INODEDEP_HASH(ump, inum) \ (&(ump)->inodedep_hashtbl[(inum) & (ump)->inodedep_hash_size]) static int inodedep_find(inodedephd, inum, inodedeppp) struct inodedep_hashhead *inodedephd; ino_t inum; struct inodedep **inodedeppp; { struct inodedep *inodedep; LIST_FOREACH(inodedep, inodedephd, id_hash) if (inum == inodedep->id_ino) break; if (inodedep) { *inodedeppp = inodedep; return (1); } *inodedeppp = NULL; return (0); } /* * Look up an inodedep. 
Return 1 if found, 0 if not found. * If not found, allocate if DEPALLOC flag is passed. * Found or allocated entry is returned in inodedeppp. * This routine must be called with splbio interrupts blocked. */ static int inodedep_lookup(mp, inum, flags, inodedeppp) struct mount *mp; ino_t inum; int flags; struct inodedep **inodedeppp; { struct inodedep *inodedep; struct inodedep_hashhead *inodedephd; struct ufsmount *ump; struct fs *fs; ump = VFSTOUFS(mp); LOCK_OWNED(ump); fs = ump->um_fs; inodedephd = INODEDEP_HASH(ump, inum); if (inodedep_find(inodedephd, inum, inodedeppp)) return (1); if ((flags & DEPALLOC) == 0) return (0); /* * If the system is over its limit and our filesystem is * responsible for more than our share of that usage and * we are not in a rush, request some inodedep cleanup. */ if (softdep_excess_items(ump, D_INODEDEP)) schedule_cleanup(mp); else FREE_LOCK(ump); inodedep = malloc(sizeof(struct inodedep), M_INODEDEP, M_SOFTDEP_FLAGS); workitem_alloc(&inodedep->id_list, D_INODEDEP, mp); ACQUIRE_LOCK(ump); if (inodedep_find(inodedephd, inum, inodedeppp)) { WORKITEM_FREE(inodedep, D_INODEDEP); return (1); } inodedep->id_fs = fs; inodedep->id_ino = inum; inodedep->id_state = ALLCOMPLETE; inodedep->id_nlinkdelta = 0; inodedep->id_savedino1 = NULL; inodedep->id_savedsize = -1; inodedep->id_savedextsize = -1; inodedep->id_savednlink = -1; inodedep->id_bmsafemap = NULL; inodedep->id_mkdiradd = NULL; LIST_INIT(&inodedep->id_dirremhd); LIST_INIT(&inodedep->id_pendinghd); LIST_INIT(&inodedep->id_inowait); LIST_INIT(&inodedep->id_bufwait); TAILQ_INIT(&inodedep->id_inoreflst); TAILQ_INIT(&inodedep->id_inoupdt); TAILQ_INIT(&inodedep->id_newinoupdt); TAILQ_INIT(&inodedep->id_extupdt); TAILQ_INIT(&inodedep->id_newextupdt); TAILQ_INIT(&inodedep->id_freeblklst); LIST_INSERT_HEAD(inodedephd, inodedep, id_hash); *inodedeppp = inodedep; return (0); } /* * Structures and routines associated with newblk caching. */ #define NEWBLK_HASH(ump, inum) \ (&(ump)->newblk_hashtbl[(inum) & (ump)->newblk_hash_size]) static int newblk_find(newblkhd, newblkno, flags, newblkpp) struct newblk_hashhead *newblkhd; ufs2_daddr_t newblkno; int flags; struct newblk **newblkpp; { struct newblk *newblk; LIST_FOREACH(newblk, newblkhd, nb_hash) { if (newblkno != newblk->nb_newblkno) continue; /* * If we're creating a new dependency don't match those that * have already been converted to allocdirects. This is for * a frag extend. */ if ((flags & DEPALLOC) && newblk->nb_list.wk_type != D_NEWBLK) continue; break; } if (newblk) { *newblkpp = newblk; return (1); } *newblkpp = NULL; return (0); } /* * Look up a newblk. Return 1 if found, 0 if not found. * If not found, allocate if DEPALLOC flag is passed. * Found or allocated entry is returned in newblkpp. 
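 * When a new entry must be allocated, the allocation is performed
 * without the per-filesystem lock held and the hash chain is searched
 * again once the lock is reacquired; if another thread raced in, its
 * entry is returned and the fresh allocation is freed.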
*/ static int newblk_lookup(mp, newblkno, flags, newblkpp) struct mount *mp; ufs2_daddr_t newblkno; int flags; struct newblk **newblkpp; { struct newblk *newblk; struct newblk_hashhead *newblkhd; struct ufsmount *ump; ump = VFSTOUFS(mp); LOCK_OWNED(ump); newblkhd = NEWBLK_HASH(ump, newblkno); if (newblk_find(newblkhd, newblkno, flags, newblkpp)) return (1); if ((flags & DEPALLOC) == 0) return (0); if (softdep_excess_items(ump, D_NEWBLK) || softdep_excess_items(ump, D_ALLOCDIRECT) || softdep_excess_items(ump, D_ALLOCINDIR)) schedule_cleanup(mp); else FREE_LOCK(ump); newblk = malloc(sizeof(union allblk), M_NEWBLK, M_SOFTDEP_FLAGS | M_ZERO); workitem_alloc(&newblk->nb_list, D_NEWBLK, mp); ACQUIRE_LOCK(ump); if (newblk_find(newblkhd, newblkno, flags, newblkpp)) { WORKITEM_FREE(newblk, D_NEWBLK); return (1); } newblk->nb_freefrag = NULL; LIST_INIT(&newblk->nb_indirdeps); LIST_INIT(&newblk->nb_newdirblk); LIST_INIT(&newblk->nb_jwork); newblk->nb_state = ATTACHED; newblk->nb_newblkno = newblkno; LIST_INSERT_HEAD(newblkhd, newblk, nb_hash); *newblkpp = newblk; return (0); } /* * Structures and routines associated with freed indirect block caching. */ #define INDIR_HASH(ump, blkno) \ (&(ump)->indir_hashtbl[(blkno) & (ump)->indir_hash_size]) /* * Lookup an indirect block in the indir hash table. The freework is * removed and potentially freed. The caller must do a blocking journal * write before writing to the blkno. */ static int indirblk_lookup(mp, blkno) struct mount *mp; ufs2_daddr_t blkno; { struct freework *freework; struct indir_hashhead *wkhd; struct ufsmount *ump; ump = VFSTOUFS(mp); wkhd = INDIR_HASH(ump, blkno); TAILQ_FOREACH(freework, wkhd, fw_next) { if (freework->fw_blkno != blkno) continue; indirblk_remove(freework); return (1); } return (0); } /* * Insert an indirect block represented by freework into the indirblk * hash table so that it may prevent the block from being re-used prior * to the journal being written. */ static void indirblk_insert(freework) struct freework *freework; { struct jblocks *jblocks; struct jseg *jseg; struct ufsmount *ump; ump = VFSTOUFS(freework->fw_list.wk_mp); jblocks = ump->softdep_jblocks; jseg = TAILQ_LAST(&jblocks->jb_segs, jseglst); if (jseg == NULL) return; LIST_INSERT_HEAD(&jseg->js_indirs, freework, fw_segs); TAILQ_INSERT_HEAD(INDIR_HASH(ump, freework->fw_blkno), freework, fw_next); freework->fw_state &= ~DEPCOMPLETE; } static void indirblk_remove(freework) struct freework *freework; { struct ufsmount *ump; ump = VFSTOUFS(freework->fw_list.wk_mp); LIST_REMOVE(freework, fw_segs); TAILQ_REMOVE(INDIR_HASH(ump, freework->fw_blkno), freework, fw_next); freework->fw_state |= DEPCOMPLETE; if ((freework->fw_state & ALLCOMPLETE) == ALLCOMPLETE) WORKITEM_FREE(freework, D_FREEWORK); } /* * Executed during filesystem system initialization before * mounting any filesystems. */ void softdep_initialize() { TAILQ_INIT(&softdepmounts); #ifdef __LP64__ max_softdeps = desiredvnodes * 4; #else max_softdeps = desiredvnodes * 2; #endif /* initialise bioops hack */ bioops.io_start = softdep_disk_io_initiation; bioops.io_complete = softdep_disk_write_complete; bioops.io_deallocate = softdep_deallocate_dependencies; bioops.io_countdeps = softdep_count_dependencies; softdep_ast_cleanup = softdep_ast_cleanup_proc; /* Initialize the callout with an mtx. */ callout_init_mtx(&softdep_callout, &lk, 0); } /* * Executed after all filesystems have been unmounted during * filesystem module unload. 
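 * The bioops hooks installed by softdep_initialize() are cleared and
 * the softdep callout is drained.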
*/ void softdep_uninitialize() { /* clear bioops hack */ bioops.io_start = NULL; bioops.io_complete = NULL; bioops.io_deallocate = NULL; bioops.io_countdeps = NULL; softdep_ast_cleanup = NULL; callout_drain(&softdep_callout); } /* * Called at mount time to notify the dependency code that a * filesystem wishes to use it. */ int softdep_mount(devvp, mp, fs, cred) struct vnode *devvp; struct mount *mp; struct fs *fs; struct ucred *cred; { struct csum_total cstotal; struct mount_softdeps *sdp; struct ufsmount *ump; struct cg *cgp; struct buf *bp; int i, error, cyl; sdp = malloc(sizeof(struct mount_softdeps), M_MOUNTDATA, M_WAITOK | M_ZERO); MNT_ILOCK(mp); mp->mnt_flag = (mp->mnt_flag & ~MNT_ASYNC) | MNT_SOFTDEP; if ((mp->mnt_kern_flag & MNTK_SOFTDEP) == 0) { mp->mnt_kern_flag = (mp->mnt_kern_flag & ~MNTK_ASYNC) | MNTK_SOFTDEP | MNTK_NOASYNC; } ump = VFSTOUFS(mp); ump->um_softdep = sdp; MNT_IUNLOCK(mp); rw_init(LOCK_PTR(ump), "Per-Filesystem Softdep Lock"); sdp->sd_ump = ump; LIST_INIT(&ump->softdep_workitem_pending); LIST_INIT(&ump->softdep_journal_pending); TAILQ_INIT(&ump->softdep_unlinked); LIST_INIT(&ump->softdep_dirtycg); ump->softdep_worklist_tail = NULL; ump->softdep_on_worklist = 0; ump->softdep_deps = 0; LIST_INIT(&ump->softdep_mkdirlisthd); ump->pagedep_hashtbl = hashinit(desiredvnodes / 5, M_PAGEDEP, &ump->pagedep_hash_size); ump->pagedep_nextclean = 0; ump->inodedep_hashtbl = hashinit(desiredvnodes, M_INODEDEP, &ump->inodedep_hash_size); ump->inodedep_nextclean = 0; ump->newblk_hashtbl = hashinit(max_softdeps / 2, M_NEWBLK, &ump->newblk_hash_size); ump->bmsafemap_hashtbl = hashinit(1024, M_BMSAFEMAP, &ump->bmsafemap_hash_size); i = 1 << (ffs(desiredvnodes / 10) - 1); ump->indir_hashtbl = malloc(i * sizeof(struct indir_hashhead), M_FREEWORK, M_WAITOK); ump->indir_hash_size = i - 1; for (i = 0; i <= ump->indir_hash_size; i++) TAILQ_INIT(&ump->indir_hashtbl[i]); ACQUIRE_GBLLOCK(&lk); TAILQ_INSERT_TAIL(&softdepmounts, sdp, sd_next); FREE_GBLLOCK(&lk); if ((fs->fs_flags & FS_SUJ) && (error = journal_mount(mp, fs, cred)) != 0) { printf("Failed to start journal: %d\n", error); softdep_unmount(mp); return (error); } /* * Start our flushing thread in the bufdaemon process. */ ACQUIRE_LOCK(ump); ump->softdep_flags |= FLUSH_STARTING; FREE_LOCK(ump); kproc_kthread_add(&softdep_flush, mp, &bufdaemonproc, &ump->softdep_flushtd, 0, 0, "softdepflush", "%s worker", mp->mnt_stat.f_mntonname); ACQUIRE_LOCK(ump); while ((ump->softdep_flags & FLUSH_STARTING) != 0) { msleep(&ump->softdep_flushtd, LOCK_PTR(ump), PVM, "sdstart", hz / 2); } FREE_LOCK(ump); /* * When doing soft updates, the counters in the * superblock may have gotten out of sync. Recomputation * can take a long time and can be deferred for background * fsck. However, the old behavior of scanning the cylinder * groups and recalculating them at mount time is available * by setting vfs.ffs.compute_summary_at_mount to one. 
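 * The recomputation below is also skipped when the filesystem was
 * marked clean, since the on-disk summary can then be trusted as is.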
*/ if (compute_summary_at_mount == 0 || fs->fs_clean != 0) return (0); bzero(&cstotal, sizeof cstotal); for (cyl = 0; cyl < fs->fs_ncg; cyl++) { if ((error = bread(devvp, fsbtodb(fs, cgtod(fs, cyl)), fs->fs_cgsize, cred, &bp)) != 0) { brelse(bp); softdep_unmount(mp); return (error); } cgp = (struct cg *)bp->b_data; cstotal.cs_nffree += cgp->cg_cs.cs_nffree; cstotal.cs_nbfree += cgp->cg_cs.cs_nbfree; cstotal.cs_nifree += cgp->cg_cs.cs_nifree; cstotal.cs_ndir += cgp->cg_cs.cs_ndir; fs->fs_cs(fs, cyl) = cgp->cg_cs; brelse(bp); } #ifdef DEBUG if (bcmp(&cstotal, &fs->fs_cstotal, sizeof cstotal)) printf("%s: superblock summary recomputed\n", fs->fs_fsmnt); #endif bcopy(&cstotal, &fs->fs_cstotal, sizeof cstotal); return (0); } void softdep_unmount(mp) struct mount *mp; { struct ufsmount *ump; #ifdef INVARIANTS int i; #endif KASSERT(MOUNTEDSOFTDEP(mp) != 0, ("softdep_unmount called on non-softdep filesystem")); ump = VFSTOUFS(mp); MNT_ILOCK(mp); mp->mnt_flag &= ~MNT_SOFTDEP; if (MOUNTEDSUJ(mp) == 0) { MNT_IUNLOCK(mp); } else { mp->mnt_flag &= ~MNT_SUJ; MNT_IUNLOCK(mp); journal_unmount(ump); } /* * Shut down our flushing thread. Check for NULL is if * softdep_mount errors out before the thread has been created. */ if (ump->softdep_flushtd != NULL) { ACQUIRE_LOCK(ump); ump->softdep_flags |= FLUSH_EXIT; wakeup(&ump->softdep_flushtd); msleep(&ump->softdep_flags, LOCK_PTR(ump), PVM | PDROP, "sdwait", 0); KASSERT((ump->softdep_flags & FLUSH_EXIT) == 0, ("Thread shutdown failed")); } /* * Free up our resources. */ ACQUIRE_GBLLOCK(&lk); TAILQ_REMOVE(&softdepmounts, ump->um_softdep, sd_next); FREE_GBLLOCK(&lk); rw_destroy(LOCK_PTR(ump)); hashdestroy(ump->pagedep_hashtbl, M_PAGEDEP, ump->pagedep_hash_size); hashdestroy(ump->inodedep_hashtbl, M_INODEDEP, ump->inodedep_hash_size); hashdestroy(ump->newblk_hashtbl, M_NEWBLK, ump->newblk_hash_size); hashdestroy(ump->bmsafemap_hashtbl, M_BMSAFEMAP, ump->bmsafemap_hash_size); free(ump->indir_hashtbl, M_FREEWORK); #ifdef INVARIANTS for (i = 0; i <= D_LAST; i++) KASSERT(ump->softdep_curdeps[i] == 0, ("Unmount %s: Dep type %s != 0 (%ld)", ump->um_fs->fs_fsmnt, TYPENAME(i), ump->softdep_curdeps[i])); #endif free(ump->um_softdep, M_MOUNTDATA); } static struct jblocks * jblocks_create(void) { struct jblocks *jblocks; jblocks = malloc(sizeof(*jblocks), M_JBLOCKS, M_WAITOK | M_ZERO); TAILQ_INIT(&jblocks->jb_segs); jblocks->jb_avail = 10; jblocks->jb_extent = malloc(sizeof(struct jextent) * jblocks->jb_avail, M_JBLOCKS, M_WAITOK | M_ZERO); return (jblocks); } static ufs2_daddr_t jblocks_alloc(jblocks, bytes, actual) struct jblocks *jblocks; int bytes; int *actual; { ufs2_daddr_t daddr; struct jextent *jext; int freecnt; int blocks; blocks = bytes / DEV_BSIZE; jext = &jblocks->jb_extent[jblocks->jb_head]; freecnt = jext->je_blocks - jblocks->jb_off; if (freecnt == 0) { jblocks->jb_off = 0; if (++jblocks->jb_head > jblocks->jb_used) jblocks->jb_head = 0; jext = &jblocks->jb_extent[jblocks->jb_head]; freecnt = jext->je_blocks; } if (freecnt > blocks) freecnt = blocks; *actual = freecnt * DEV_BSIZE; daddr = jext->je_daddr + jblocks->jb_off; jblocks->jb_off += freecnt; jblocks->jb_free -= freecnt; return (daddr); } static void jblocks_free(jblocks, mp, bytes) struct jblocks *jblocks; struct mount *mp; int bytes; { LOCK_OWNED(VFSTOUFS(mp)); jblocks->jb_free += bytes / DEV_BSIZE; if (jblocks->jb_suspended) worklist_speedup(mp); wakeup(jblocks); } static void jblocks_destroy(jblocks) struct jblocks *jblocks; { if (jblocks->jb_extent) free(jblocks->jb_extent, M_JBLOCKS); 
free(jblocks, M_JBLOCKS); } static void jblocks_add(jblocks, daddr, blocks) struct jblocks *jblocks; ufs2_daddr_t daddr; int blocks; { struct jextent *jext; jblocks->jb_blocks += blocks; jblocks->jb_free += blocks; jext = &jblocks->jb_extent[jblocks->jb_used]; /* Adding the first block. */ if (jext->je_daddr == 0) { jext->je_daddr = daddr; jext->je_blocks = blocks; return; } /* Extending the last extent. */ if (jext->je_daddr + jext->je_blocks == daddr) { jext->je_blocks += blocks; return; } /* Adding a new extent. */ if (++jblocks->jb_used == jblocks->jb_avail) { jblocks->jb_avail *= 2; jext = malloc(sizeof(struct jextent) * jblocks->jb_avail, M_JBLOCKS, M_WAITOK | M_ZERO); memcpy(jext, jblocks->jb_extent, sizeof(struct jextent) * jblocks->jb_used); free(jblocks->jb_extent, M_JBLOCKS); jblocks->jb_extent = jext; } jext = &jblocks->jb_extent[jblocks->jb_used]; jext->je_daddr = daddr; jext->je_blocks = blocks; return; } int softdep_journal_lookup(mp, vpp) struct mount *mp; struct vnode **vpp; { struct componentname cnp; struct vnode *dvp; ino_t sujournal; int error; error = VFS_VGET(mp, UFS_ROOTINO, LK_EXCLUSIVE, &dvp); if (error) return (error); bzero(&cnp, sizeof(cnp)); cnp.cn_nameiop = LOOKUP; cnp.cn_flags = ISLASTCN; cnp.cn_thread = curthread; cnp.cn_cred = curthread->td_ucred; cnp.cn_pnbuf = SUJ_FILE; cnp.cn_nameptr = SUJ_FILE; cnp.cn_namelen = strlen(SUJ_FILE); error = ufs_lookup_ino(dvp, NULL, &cnp, &sujournal); vput(dvp); if (error != 0) return (error); error = VFS_VGET(mp, sujournal, LK_EXCLUSIVE, vpp); return (error); } /* * Open and verify the journal file. */ static int journal_mount(mp, fs, cred) struct mount *mp; struct fs *fs; struct ucred *cred; { struct jblocks *jblocks; struct ufsmount *ump; struct vnode *vp; struct inode *ip; ufs2_daddr_t blkno; int bcount; int error; int i; ump = VFSTOUFS(mp); ump->softdep_journal_tail = NULL; ump->softdep_on_journal = 0; ump->softdep_accdeps = 0; ump->softdep_req = 0; ump->softdep_jblocks = NULL; error = softdep_journal_lookup(mp, &vp); if (error != 0) { printf("Failed to find journal. Use tunefs to create one\n"); return (error); } ip = VTOI(vp); if (ip->i_size < SUJ_MIN) { error = ENOSPC; goto out; } bcount = lblkno(fs, ip->i_size); /* Only use whole blocks. */ jblocks = jblocks_create(); for (i = 0; i < bcount; i++) { error = ufs_bmaparray(vp, i, &blkno, NULL, NULL, NULL); if (error) break; jblocks_add(jblocks, blkno, fsbtodb(fs, fs->fs_frag)); } if (error) { jblocks_destroy(jblocks); goto out; } jblocks->jb_low = jblocks->jb_free / 3; /* Reserve 33%. */ jblocks->jb_min = jblocks->jb_free / 10; /* Suspend at 10%. */ ump->softdep_jblocks = jblocks; out: if (error == 0) { MNT_ILOCK(mp); mp->mnt_flag |= MNT_SUJ; mp->mnt_flag &= ~MNT_SOFTDEP; MNT_IUNLOCK(mp); /* * Only validate the journal contents if the * filesystem is clean, otherwise we write the logs * but they'll never be used. If the filesystem was * still dirty when we mounted it the journal is * invalid and a new journal can only be valid if it * starts from a clean mount. */ if (fs->fs_clean) { DIP_SET(ip, i_modrev, fs->fs_mtime); ip->i_flags |= IN_MODIFIED; ffs_update(vp, 1); } } vput(vp); return (error); } static void journal_unmount(ump) struct ufsmount *ump; { if (ump->softdep_jblocks) jblocks_destroy(ump->softdep_jblocks); ump->softdep_jblocks = NULL; } /* * Called when a journal record is ready to be written. Space is allocated * and the journal entry is created when the journal is flushed to stable * store. 
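 * Items are appended after softdep_journal_tail so that records reach
 * the journal in creation order; softdep_on_journal counts the pending
 * records for the space accounting done in journal_space().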
*/ static void add_to_journal(wk) struct worklist *wk; { struct ufsmount *ump; ump = VFSTOUFS(wk->wk_mp); LOCK_OWNED(ump); if (wk->wk_state & ONWORKLIST) panic("add_to_journal: %s(0x%X) already on list", TYPENAME(wk->wk_type), wk->wk_state); wk->wk_state |= ONWORKLIST | DEPCOMPLETE; if (LIST_EMPTY(&ump->softdep_journal_pending)) { ump->softdep_jblocks->jb_age = ticks; LIST_INSERT_HEAD(&ump->softdep_journal_pending, wk, wk_list); } else LIST_INSERT_AFTER(ump->softdep_journal_tail, wk, wk_list); ump->softdep_journal_tail = wk; ump->softdep_on_journal += 1; } /* * Remove an arbitrary item for the journal worklist maintain the tail * pointer. This happens when a new operation obviates the need to * journal an old operation. */ static void remove_from_journal(wk) struct worklist *wk; { struct ufsmount *ump; ump = VFSTOUFS(wk->wk_mp); LOCK_OWNED(ump); #ifdef SUJ_DEBUG { struct worklist *wkn; LIST_FOREACH(wkn, &ump->softdep_journal_pending, wk_list) if (wkn == wk) break; if (wkn == NULL) panic("remove_from_journal: %p is not in journal", wk); } #endif /* * We emulate a TAILQ to save space in most structures which do not * require TAILQ semantics. Here we must update the tail position * when removing the tail which is not the final entry. This works * only if the worklist linkage are at the beginning of the structure. */ if (ump->softdep_journal_tail == wk) ump->softdep_journal_tail = (struct worklist *)wk->wk_list.le_prev; WORKLIST_REMOVE(wk); ump->softdep_on_journal -= 1; } /* * Check for journal space as well as dependency limits so the prelink * code can throttle both journaled and non-journaled filesystems. * Threshold is 0 for low and 1 for min. */ static int journal_space(ump, thresh) struct ufsmount *ump; int thresh; { struct jblocks *jblocks; int limit, avail; jblocks = ump->softdep_jblocks; if (jblocks == NULL) return (1); /* * We use a tighter restriction here to prevent request_cleanup() * running in threads from running into locks we currently hold. * We have to be over the limit and our filesystem has to be * responsible for more than our share of that usage. */ limit = (max_softdeps / 10) * 9; if (dep_current[D_INODEDEP] > limit && ump->softdep_curdeps[D_INODEDEP] > limit / stat_flush_threads) return (0); if (thresh) thresh = jblocks->jb_min; else thresh = jblocks->jb_low; avail = (ump->softdep_on_journal * JREC_SIZE) / DEV_BSIZE; avail = jblocks->jb_free - avail; return (avail > thresh); } static void journal_suspend(ump) struct ufsmount *ump; { struct jblocks *jblocks; struct mount *mp; mp = UFSTOVFS(ump); jblocks = ump->softdep_jblocks; MNT_ILOCK(mp); if ((mp->mnt_kern_flag & MNTK_SUSPEND) == 0) { stat_journal_min++; mp->mnt_kern_flag |= MNTK_SUSPEND; mp->mnt_susp_owner = ump->softdep_flushtd; } jblocks->jb_suspended = 1; MNT_IUNLOCK(mp); } static int journal_unsuspend(struct ufsmount *ump) { struct jblocks *jblocks; struct mount *mp; mp = UFSTOVFS(ump); jblocks = ump->softdep_jblocks; if (jblocks != NULL && jblocks->jb_suspended && journal_space(ump, jblocks->jb_min)) { jblocks->jb_suspended = 0; FREE_LOCK(ump); mp->mnt_susp_owner = curthread; vfs_write_resume(mp, 0); ACQUIRE_LOCK(ump); return (1); } return (0); } /* * Called before any allocation function to be certain that there is * sufficient space in the journal prior to creating any new records. * Since in the case of block allocation we may have multiple locked * buffers at the time of the actual allocation we can not block * when the journal records are created. 
Doing so would create a deadlock * if any of these buffers needed to be flushed to reclaim space. Instead * we require a sufficiently large amount of available space such that * each thread in the system could have passed this allocation check and * still have sufficient free space. With 20% of a minimum journal size * of 1MB we have 6553 records available. */ int softdep_prealloc(vp, waitok) struct vnode *vp; int waitok; { struct ufsmount *ump; KASSERT(MOUNTEDSOFTDEP(vp->v_mount) != 0, ("softdep_prealloc called on non-softdep filesystem")); /* * Nothing to do if we are not running journaled soft updates. * If we currently hold the snapshot lock, we must avoid * handling other resources that could cause deadlock. Do not * touch quotas vnode since it is typically recursed with * other vnode locks held. */ if (DOINGSUJ(vp) == 0 || IS_SNAPSHOT(VTOI(vp)) || (vp->v_vflag & VV_SYSTEM) != 0) return (0); ump = VFSTOUFS(vp->v_mount); ACQUIRE_LOCK(ump); if (journal_space(ump, 0)) { FREE_LOCK(ump); return (0); } stat_journal_low++; FREE_LOCK(ump); if (waitok == MNT_NOWAIT) return (ENOSPC); /* * Attempt to sync this vnode once to flush any journal * work attached to it. */ if ((curthread->td_pflags & TDP_COWINPROGRESS) == 0) ffs_syncvnode(vp, waitok, 0); ACQUIRE_LOCK(ump); process_removes(vp); process_truncates(vp); if (journal_space(ump, 0) == 0) { softdep_speedup(ump); if (journal_space(ump, 1) == 0) journal_suspend(ump); } FREE_LOCK(ump); return (0); } /* * Before adjusting a link count on a vnode verify that we have sufficient * journal space. If not, process operations that depend on the currently * locked pair of vnodes to try to flush space as the syncer, buf daemon, * and softdep flush threads can not acquire these locks to reclaim space. */ static void softdep_prelink(dvp, vp) struct vnode *dvp; struct vnode *vp; { struct ufsmount *ump; ump = VFSTOUFS(dvp->v_mount); LOCK_OWNED(ump); /* * Nothing to do if we have sufficient journal space. * If we currently hold the snapshot lock, we must avoid * handling other resources that could cause deadlock. */ if (journal_space(ump, 0) || (vp && IS_SNAPSHOT(VTOI(vp)))) return; stat_journal_low++; FREE_LOCK(ump); if (vp) ffs_syncvnode(vp, MNT_NOWAIT, 0); ffs_syncvnode(dvp, MNT_WAIT, 0); ACQUIRE_LOCK(ump); /* Process vp before dvp as it may create .. removes. 
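 * Pending removes and truncates on both vnodes are then processed
 * before the journal space check is repeated.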
*/ if (vp) { process_removes(vp); process_truncates(vp); } process_removes(dvp); process_truncates(dvp); softdep_speedup(ump); process_worklist_item(UFSTOVFS(ump), 2, LK_NOWAIT); if (journal_space(ump, 0) == 0) { softdep_speedup(ump); if (journal_space(ump, 1) == 0) journal_suspend(ump); } } static void jseg_write(ump, jseg, data) struct ufsmount *ump; struct jseg *jseg; uint8_t *data; { struct jsegrec *rec; rec = (struct jsegrec *)data; rec->jsr_seq = jseg->js_seq; rec->jsr_oldest = jseg->js_oldseq; rec->jsr_cnt = jseg->js_cnt; rec->jsr_blocks = jseg->js_size / ump->um_devvp->v_bufobj.bo_bsize; rec->jsr_crc = 0; rec->jsr_time = ump->um_fs->fs_mtime; } static inline void inoref_write(inoref, jseg, rec) struct inoref *inoref; struct jseg *jseg; struct jrefrec *rec; { inoref->if_jsegdep->jd_seg = jseg; rec->jr_ino = inoref->if_ino; rec->jr_parent = inoref->if_parent; rec->jr_nlink = inoref->if_nlink; rec->jr_mode = inoref->if_mode; rec->jr_diroff = inoref->if_diroff; } static void jaddref_write(jaddref, jseg, data) struct jaddref *jaddref; struct jseg *jseg; uint8_t *data; { struct jrefrec *rec; rec = (struct jrefrec *)data; rec->jr_op = JOP_ADDREF; inoref_write(&jaddref->ja_ref, jseg, rec); } static void jremref_write(jremref, jseg, data) struct jremref *jremref; struct jseg *jseg; uint8_t *data; { struct jrefrec *rec; rec = (struct jrefrec *)data; rec->jr_op = JOP_REMREF; inoref_write(&jremref->jr_ref, jseg, rec); } static void jmvref_write(jmvref, jseg, data) struct jmvref *jmvref; struct jseg *jseg; uint8_t *data; { struct jmvrec *rec; rec = (struct jmvrec *)data; rec->jm_op = JOP_MVREF; rec->jm_ino = jmvref->jm_ino; rec->jm_parent = jmvref->jm_parent; rec->jm_oldoff = jmvref->jm_oldoff; rec->jm_newoff = jmvref->jm_newoff; } static void jnewblk_write(jnewblk, jseg, data) struct jnewblk *jnewblk; struct jseg *jseg; uint8_t *data; { struct jblkrec *rec; jnewblk->jn_jsegdep->jd_seg = jseg; rec = (struct jblkrec *)data; rec->jb_op = JOP_NEWBLK; rec->jb_ino = jnewblk->jn_ino; rec->jb_blkno = jnewblk->jn_blkno; rec->jb_lbn = jnewblk->jn_lbn; rec->jb_frags = jnewblk->jn_frags; rec->jb_oldfrags = jnewblk->jn_oldfrags; } static void jfreeblk_write(jfreeblk, jseg, data) struct jfreeblk *jfreeblk; struct jseg *jseg; uint8_t *data; { struct jblkrec *rec; jfreeblk->jf_dep.jb_jsegdep->jd_seg = jseg; rec = (struct jblkrec *)data; rec->jb_op = JOP_FREEBLK; rec->jb_ino = jfreeblk->jf_ino; rec->jb_blkno = jfreeblk->jf_blkno; rec->jb_lbn = jfreeblk->jf_lbn; rec->jb_frags = jfreeblk->jf_frags; rec->jb_oldfrags = 0; } static void jfreefrag_write(jfreefrag, jseg, data) struct jfreefrag *jfreefrag; struct jseg *jseg; uint8_t *data; { struct jblkrec *rec; jfreefrag->fr_jsegdep->jd_seg = jseg; rec = (struct jblkrec *)data; rec->jb_op = JOP_FREEBLK; rec->jb_ino = jfreefrag->fr_ino; rec->jb_blkno = jfreefrag->fr_blkno; rec->jb_lbn = jfreefrag->fr_lbn; rec->jb_frags = jfreefrag->fr_frags; rec->jb_oldfrags = 0; } static void jtrunc_write(jtrunc, jseg, data) struct jtrunc *jtrunc; struct jseg *jseg; uint8_t *data; { struct jtrncrec *rec; jtrunc->jt_dep.jb_jsegdep->jd_seg = jseg; rec = (struct jtrncrec *)data; rec->jt_op = JOP_TRUNC; rec->jt_ino = jtrunc->jt_ino; rec->jt_size = jtrunc->jt_size; rec->jt_extsize = jtrunc->jt_extsize; } static void jfsync_write(jfsync, jseg, data) struct jfsync *jfsync; struct jseg *jseg; uint8_t *data; { struct jtrncrec *rec; rec = (struct jtrncrec *)data; rec->jt_op = JOP_SYNC; rec->jt_ino = jfsync->jfs_ino; rec->jt_size = jfsync->jfs_size; rec->jt_extsize = jfsync->jfs_extsize; } 
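/*
 * Illustrative sizing note (added for clarity; the concrete figures
 * are assumptions, not taken from this file): each record emitted by
 * the j*_write() helpers above occupies JREC_SIZE bytes and every
 * device block of a journal buffer begins with a jsegrec header.
 * Assuming 512-byte device blocks and 32-byte records (the value
 * implied by the "6553 records in 20% of a 1MB journal" figure quoted
 * above softdep_prealloc()), one device block carries
 * jrecmin = 512 / 32 - 1 = 15 payload records, and a 32K filesystem
 * block carries jrecmax = (32768 / 512) * 15 = 960 records, matching
 * the computation in softdep_process_journal() below.
 */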
static void softdep_flushjournal(mp) struct mount *mp; { struct jblocks *jblocks; struct ufsmount *ump; if (MOUNTEDSUJ(mp) == 0) return; ump = VFSTOUFS(mp); jblocks = ump->softdep_jblocks; ACQUIRE_LOCK(ump); while (ump->softdep_on_journal) { jblocks->jb_needseg = 1; softdep_process_journal(mp, NULL, MNT_WAIT); } FREE_LOCK(ump); } static void softdep_synchronize_completed(struct bio *); static void softdep_synchronize(struct bio *, struct ufsmount *, void *); static void softdep_synchronize_completed(bp) struct bio *bp; { struct jseg *oldest; struct jseg *jseg; struct ufsmount *ump; /* * caller1 marks the last segment written before we issued the * synchronize cache. */ jseg = bp->bio_caller1; if (jseg == NULL) { g_destroy_bio(bp); return; } ump = VFSTOUFS(jseg->js_list.wk_mp); ACQUIRE_LOCK(ump); oldest = NULL; /* * Mark all the journal entries waiting on the synchronize cache * as completed so they may continue on. */ while (jseg != NULL && (jseg->js_state & COMPLETE) == 0) { jseg->js_state |= COMPLETE; oldest = jseg; jseg = TAILQ_PREV(jseg, jseglst, js_next); } /* * Restart deferred journal entry processing from the oldest * completed jseg. */ if (oldest) complete_jsegs(oldest); FREE_LOCK(ump); g_destroy_bio(bp); } /* * Send BIO_FLUSH/SYNCHRONIZE CACHE to the device to enforce write ordering * barriers. The journal must be written prior to any blocks that depend * on it and the journal can not be released until the blocks have be * written. This code handles both barriers simultaneously. */ static void softdep_synchronize(bp, ump, caller1) struct bio *bp; struct ufsmount *ump; void *caller1; { bp->bio_cmd = BIO_FLUSH; bp->bio_flags |= BIO_ORDERED; bp->bio_data = NULL; bp->bio_offset = ump->um_cp->provider->mediasize; bp->bio_length = 0; bp->bio_done = softdep_synchronize_completed; bp->bio_caller1 = caller1; g_io_request(bp, (struct g_consumer *)ump->um_devvp->v_bufobj.bo_private); } /* * Flush some journal records to disk. */ static void softdep_process_journal(mp, needwk, flags) struct mount *mp; struct worklist *needwk; int flags; { struct jblocks *jblocks; struct ufsmount *ump; struct worklist *wk; struct jseg *jseg; struct buf *bp; struct bio *bio; uint8_t *data; struct fs *fs; int shouldflush; int segwritten; int jrecmin; /* Minimum records per block. */ int jrecmax; /* Maximum records per block. */ int size; int cnt; int off; int devbsize; if (MOUNTEDSUJ(mp) == 0) return; shouldflush = softdep_flushcache; bio = NULL; jseg = NULL; ump = VFSTOUFS(mp); LOCK_OWNED(ump); fs = ump->um_fs; jblocks = ump->softdep_jblocks; devbsize = ump->um_devvp->v_bufobj.bo_bsize; /* * We write anywhere between a disk block and fs block. The upper * bound is picked to prevent buffer cache fragmentation and limit * processing time per I/O. */ jrecmin = (devbsize / JREC_SIZE) - 1; /* -1 for seg header */ jrecmax = (fs->fs_bsize / devbsize) * jrecmin; segwritten = 0; for (;;) { cnt = ump->softdep_on_journal; /* * Criteria for writing a segment: * 1) We have a full block. * 2) We're called from jwait() and haven't found the * journal item yet. * 3) Always write if needseg is set. * 4) If we are called from process_worklist and have * not yet written anything we write a partial block * to enforce a 1 second maximum latency on journal * entries. */ if (cnt < (jrecmax - 1) && needwk == NULL && jblocks->jb_needseg == 0 && (segwritten || cnt == 0)) break; cnt++; /* * Verify some free journal space. 
softdep_prealloc() should * guarantee that we don't run out so this is indicative of * a problem with the flow control. Try to recover * gracefully in any event. */ while (jblocks->jb_free == 0) { if (flags != MNT_WAIT) break; printf("softdep: Out of journal space!\n"); softdep_speedup(ump); msleep(jblocks, LOCK_PTR(ump), PRIBIO, "jblocks", hz); } FREE_LOCK(ump); jseg = malloc(sizeof(*jseg), M_JSEG, M_SOFTDEP_FLAGS); workitem_alloc(&jseg->js_list, D_JSEG, mp); LIST_INIT(&jseg->js_entries); LIST_INIT(&jseg->js_indirs); jseg->js_state = ATTACHED; if (shouldflush == 0) jseg->js_state |= COMPLETE; else if (bio == NULL) bio = g_alloc_bio(); jseg->js_jblocks = jblocks; bp = geteblk(fs->fs_bsize, 0); ACQUIRE_LOCK(ump); /* * If there was a race while we were allocating the block * and jseg the entry we care about was likely written. * We bail out in both the WAIT and NOWAIT case and assume * the caller will loop if the entry it cares about is * not written. */ cnt = ump->softdep_on_journal; if (cnt + jblocks->jb_needseg == 0 || jblocks->jb_free == 0) { bp->b_flags |= B_INVAL | B_NOCACHE; WORKITEM_FREE(jseg, D_JSEG); FREE_LOCK(ump); brelse(bp); ACQUIRE_LOCK(ump); break; } /* * Calculate the disk block size required for the available * records rounded to the min size. */ if (cnt == 0) size = devbsize; else if (cnt < jrecmax) size = howmany(cnt, jrecmin) * devbsize; else size = fs->fs_bsize; /* * Allocate a disk block for this journal data and account * for truncation of the requested size if enough contiguous * space was not available. */ bp->b_blkno = jblocks_alloc(jblocks, size, &size); bp->b_lblkno = bp->b_blkno; bp->b_offset = bp->b_blkno * DEV_BSIZE; bp->b_bcount = size; bp->b_flags &= ~B_INVAL; bp->b_flags |= B_VALIDSUSPWRT | B_NOCOPY; /* * Initialize our jseg with cnt records. Assign the next * sequence number to it and link it in-order. */ cnt = MIN(cnt, (size / devbsize) * jrecmin); jseg->js_buf = bp; jseg->js_cnt = cnt; jseg->js_refs = cnt + 1; /* Self ref. */ jseg->js_size = size; jseg->js_seq = jblocks->jb_nextseq++; if (jblocks->jb_oldestseg == NULL) jblocks->jb_oldestseg = jseg; jseg->js_oldseq = jblocks->jb_oldestseg->js_seq; TAILQ_INSERT_TAIL(&jblocks->jb_segs, jseg, js_next); if (jblocks->jb_writeseg == NULL) jblocks->jb_writeseg = jseg; /* * Start filling in records from the pending list. */ data = bp->b_data; off = 0; /* * Always put a header on the first block. * XXX As with below, there might not be a chance to get * into the loop. Ensure that something valid is written. */ jseg_write(ump, jseg, data); off += JREC_SIZE; data = bp->b_data + off; /* * XXX Something is wrong here. There's no work to do, * but we need to perform and I/O and allow it to complete * anyways. */ if (LIST_EMPTY(&ump->softdep_journal_pending)) stat_emptyjblocks++; while ((wk = LIST_FIRST(&ump->softdep_journal_pending)) != NULL) { if (cnt == 0) break; /* Place a segment header on every device block. 
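 * (off is the byte offset into the segment buffer, so the test below
 * fires at the start of each devbsize-sized chunk.)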
*/ if ((off % devbsize) == 0) { jseg_write(ump, jseg, data); off += JREC_SIZE; data = bp->b_data + off; } if (wk == needwk) needwk = NULL; remove_from_journal(wk); wk->wk_state |= INPROGRESS; WORKLIST_INSERT(&jseg->js_entries, wk); switch (wk->wk_type) { case D_JADDREF: jaddref_write(WK_JADDREF(wk), jseg, data); break; case D_JREMREF: jremref_write(WK_JREMREF(wk), jseg, data); break; case D_JMVREF: jmvref_write(WK_JMVREF(wk), jseg, data); break; case D_JNEWBLK: jnewblk_write(WK_JNEWBLK(wk), jseg, data); break; case D_JFREEBLK: jfreeblk_write(WK_JFREEBLK(wk), jseg, data); break; case D_JFREEFRAG: jfreefrag_write(WK_JFREEFRAG(wk), jseg, data); break; case D_JTRUNC: jtrunc_write(WK_JTRUNC(wk), jseg, data); break; case D_JFSYNC: jfsync_write(WK_JFSYNC(wk), jseg, data); break; default: panic("process_journal: Unknown type %s", TYPENAME(wk->wk_type)); /* NOTREACHED */ } off += JREC_SIZE; data = bp->b_data + off; cnt--; } /* Clear any remaining space so we don't leak kernel data */ if (size > off) bzero(data, size - off); /* * Write this one buffer and continue. */ segwritten = 1; jblocks->jb_needseg = 0; WORKLIST_INSERT(&bp->b_dep, &jseg->js_list); FREE_LOCK(ump); pbgetvp(ump->um_devvp, bp); /* * We only do the blocking wait once we find the journal * entry we're looking for. */ if (needwk == NULL && flags == MNT_WAIT) bwrite(bp); else bawrite(bp); ACQUIRE_LOCK(ump); } /* * If we wrote a segment issue a synchronize cache so the journal * is reflected on disk before the data is written. Since reclaiming * journal space also requires writing a journal record this * process also enforces a barrier before reclamation. */ if (segwritten && shouldflush) { softdep_synchronize(bio, ump, TAILQ_LAST(&jblocks->jb_segs, jseglst)); } else if (bio) g_destroy_bio(bio); /* * If we've suspended the filesystem because we ran out of journal * space either try to sync it here to make some progress or * unsuspend it if we already have. */ if (flags == 0 && jblocks->jb_suspended) { if (journal_unsuspend(ump)) return; FREE_LOCK(ump); VFS_SYNC(mp, MNT_NOWAIT); ffs_sbupdate(ump, MNT_WAIT, 0); ACQUIRE_LOCK(ump); } } /* * Complete a jseg, allowing all dependencies awaiting journal writes * to proceed. Each journal dependency also attaches a jsegdep to dependent * structures so that the journal segment can be freed to reclaim space. */ static void complete_jseg(jseg) struct jseg *jseg; { struct worklist *wk; struct jmvref *jmvref; int waiting; #ifdef INVARIANTS int i = 0; #endif while ((wk = LIST_FIRST(&jseg->js_entries)) != NULL) { WORKLIST_REMOVE(wk); waiting = wk->wk_state & IOWAITING; wk->wk_state &= ~(INPROGRESS | IOWAITING); wk->wk_state |= COMPLETE; KASSERT(i++ < jseg->js_cnt, ("handle_written_jseg: overflow %d >= %d", i - 1, jseg->js_cnt)); switch (wk->wk_type) { case D_JADDREF: handle_written_jaddref(WK_JADDREF(wk)); break; case D_JREMREF: handle_written_jremref(WK_JREMREF(wk)); break; case D_JMVREF: rele_jseg(jseg); /* No jsegdep. */ jmvref = WK_JMVREF(wk); LIST_REMOVE(jmvref, jm_deps); if ((jmvref->jm_pagedep->pd_state & ONWORKLIST) == 0) free_pagedep(jmvref->jm_pagedep); WORKITEM_FREE(jmvref, D_JMVREF); break; case D_JNEWBLK: handle_written_jnewblk(WK_JNEWBLK(wk)); break; case D_JFREEBLK: handle_written_jblkdep(&WK_JFREEBLK(wk)->jf_dep); break; case D_JTRUNC: handle_written_jblkdep(&WK_JTRUNC(wk)->jt_dep); break; case D_JFSYNC: rele_jseg(jseg); /* No jsegdep. 
*/ WORKITEM_FREE(wk, D_JFSYNC); break; case D_JFREEFRAG: handle_written_jfreefrag(WK_JFREEFRAG(wk)); break; default: panic("handle_written_jseg: Unknown type %s", TYPENAME(wk->wk_type)); /* NOTREACHED */ } if (waiting) wakeup(wk); } /* Release the self reference so the structure may be freed. */ rele_jseg(jseg); } /* * Determine which jsegs are ready for completion processing. Waits for * synchronize cache to complete as well as forcing in-order completion * of journal entries. */ static void complete_jsegs(jseg) struct jseg *jseg; { struct jblocks *jblocks; struct jseg *jsegn; jblocks = jseg->js_jblocks; /* * Don't allow out of order completions. If this isn't the first * block wait for it to write before we're done. */ if (jseg != jblocks->jb_writeseg) return; /* Iterate through available jsegs processing their entries. */ while (jseg && (jseg->js_state & ALLCOMPLETE) == ALLCOMPLETE) { jblocks->jb_oldestwrseq = jseg->js_oldseq; jsegn = TAILQ_NEXT(jseg, js_next); complete_jseg(jseg); jseg = jsegn; } jblocks->jb_writeseg = jseg; /* * Attempt to free jsegs now that oldestwrseq may have advanced. */ free_jsegs(jblocks); } /* * Mark a jseg as DEPCOMPLETE and throw away the buffer. Attempt to handle * the final completions. */ static void handle_written_jseg(jseg, bp) struct jseg *jseg; struct buf *bp; { if (jseg->js_refs == 0) panic("handle_written_jseg: No self-reference on %p", jseg); jseg->js_state |= DEPCOMPLETE; /* * We'll never need this buffer again, set flags so it will be * discarded. */ bp->b_flags |= B_INVAL | B_NOCACHE; pbrelvp(bp); complete_jsegs(jseg); } static inline struct jsegdep * inoref_jseg(inoref) struct inoref *inoref; { struct jsegdep *jsegdep; jsegdep = inoref->if_jsegdep; inoref->if_jsegdep = NULL; return (jsegdep); } /* * Called once a jremref has made it to stable store. The jremref is marked * complete and we attempt to free it. Any pagedeps writes sleeping waiting * for the jremref to complete will be awoken by free_jremref. */ static void handle_written_jremref(jremref) struct jremref *jremref; { struct inodedep *inodedep; struct jsegdep *jsegdep; struct dirrem *dirrem; /* Grab the jsegdep. */ jsegdep = inoref_jseg(&jremref->jr_ref); /* * Remove us from the inoref list. */ if (inodedep_lookup(jremref->jr_list.wk_mp, jremref->jr_ref.if_ino, 0, &inodedep) == 0) panic("handle_written_jremref: Lost inodedep"); TAILQ_REMOVE(&inodedep->id_inoreflst, &jremref->jr_ref, if_deps); /* * Complete the dirrem. */ dirrem = jremref->jr_dirrem; jremref->jr_dirrem = NULL; LIST_REMOVE(jremref, jr_deps); jsegdep->jd_state |= jremref->jr_state & MKDIR_PARENT; jwork_insert(&dirrem->dm_jwork, jsegdep); if (LIST_EMPTY(&dirrem->dm_jremrefhd) && (dirrem->dm_state & COMPLETE) != 0) add_to_worklist(&dirrem->dm_list, 0); free_jremref(jremref); } /* * Called once a jaddref has made it to stable store. The dependency is * marked complete and any dependent structures are added to the inode * bufwait list to be completed as soon as it is written. If a bitmap write * depends on this entry we move the inode into the inodedephd of the * bmsafemap dependency and attempt to remove the jaddref from the bmsafemap. */ static void handle_written_jaddref(jaddref) struct jaddref *jaddref; { struct jsegdep *jsegdep; struct inodedep *inodedep; struct diradd *diradd; struct mkdir *mkdir; /* Grab the jsegdep. 
*/ jsegdep = inoref_jseg(&jaddref->ja_ref); mkdir = NULL; diradd = NULL; if (inodedep_lookup(jaddref->ja_list.wk_mp, jaddref->ja_ino, 0, &inodedep) == 0) panic("handle_written_jaddref: Lost inodedep."); if (jaddref->ja_diradd == NULL) panic("handle_written_jaddref: No dependency"); if (jaddref->ja_diradd->da_list.wk_type == D_DIRADD) { diradd = jaddref->ja_diradd; WORKLIST_INSERT(&inodedep->id_bufwait, &diradd->da_list); } else if (jaddref->ja_state & MKDIR_PARENT) { mkdir = jaddref->ja_mkdir; WORKLIST_INSERT(&inodedep->id_bufwait, &mkdir->md_list); } else if (jaddref->ja_state & MKDIR_BODY) mkdir = jaddref->ja_mkdir; else panic("handle_written_jaddref: Unknown dependency %p", jaddref->ja_diradd); jaddref->ja_diradd = NULL; /* also clears ja_mkdir */ /* * Remove us from the inode list. */ TAILQ_REMOVE(&inodedep->id_inoreflst, &jaddref->ja_ref, if_deps); /* * The mkdir may be waiting on the jaddref to clear before freeing. */ if (mkdir) { KASSERT(mkdir->md_list.wk_type == D_MKDIR, ("handle_written_jaddref: Incorrect type for mkdir %s", TYPENAME(mkdir->md_list.wk_type))); mkdir->md_jaddref = NULL; diradd = mkdir->md_diradd; mkdir->md_state |= DEPCOMPLETE; complete_mkdir(mkdir); } jwork_insert(&diradd->da_jwork, jsegdep); if (jaddref->ja_state & NEWBLOCK) { inodedep->id_state |= ONDEPLIST; LIST_INSERT_HEAD(&inodedep->id_bmsafemap->sm_inodedephd, inodedep, id_deps); } free_jaddref(jaddref); } /* * Called once a jnewblk journal is written. The allocdirect or allocindir * is placed in the bmsafemap to await notification of a written bitmap. If * the operation was canceled we add the segdep to the appropriate * dependency to free the journal space once the canceling operation * completes. */ static void handle_written_jnewblk(jnewblk) struct jnewblk *jnewblk; { struct bmsafemap *bmsafemap; struct freefrag *freefrag; struct freework *freework; struct jsegdep *jsegdep; struct newblk *newblk; /* Grab the jsegdep. */ jsegdep = jnewblk->jn_jsegdep; jnewblk->jn_jsegdep = NULL; if (jnewblk->jn_dep == NULL) panic("handle_written_jnewblk: No dependency for the segdep."); switch (jnewblk->jn_dep->wk_type) { case D_NEWBLK: case D_ALLOCDIRECT: case D_ALLOCINDIR: /* * Add the written block to the bmsafemap so it can * be notified when the bitmap is on disk. */ newblk = WK_NEWBLK(jnewblk->jn_dep); newblk->nb_jnewblk = NULL; if ((newblk->nb_state & GOINGAWAY) == 0) { bmsafemap = newblk->nb_bmsafemap; newblk->nb_state |= ONDEPLIST; LIST_INSERT_HEAD(&bmsafemap->sm_newblkhd, newblk, nb_deps); } jwork_insert(&newblk->nb_jwork, jsegdep); break; case D_FREEFRAG: /* * A newblock being removed by a freefrag when replaced by * frag extension. */ freefrag = WK_FREEFRAG(jnewblk->jn_dep); freefrag->ff_jdep = NULL; jwork_insert(&freefrag->ff_jwork, jsegdep); break; case D_FREEWORK: /* * A direct block was removed by truncate. */ freework = WK_FREEWORK(jnewblk->jn_dep); freework->fw_jnewblk = NULL; jwork_insert(&freework->fw_freeblks->fb_jwork, jsegdep); break; default: panic("handle_written_jnewblk: Unknown type %d.", jnewblk->jn_dep->wk_type); } jnewblk->jn_dep = NULL; free_jnewblk(jnewblk); } /* * Cancel a jfreefrag that won't be needed, probably due to colliding with * an in-flight allocation that has not yet been committed. Divorce us * from the freefrag and mark it DEPCOMPLETE so that it may be added * to the worklist. 
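 * Any jsegdep still held for the record is released, since no journal
 * write will be performed for it.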
*/ static void cancel_jfreefrag(jfreefrag) struct jfreefrag *jfreefrag; { struct freefrag *freefrag; if (jfreefrag->fr_jsegdep) { free_jsegdep(jfreefrag->fr_jsegdep); jfreefrag->fr_jsegdep = NULL; } freefrag = jfreefrag->fr_freefrag; jfreefrag->fr_freefrag = NULL; free_jfreefrag(jfreefrag); freefrag->ff_state |= DEPCOMPLETE; CTR1(KTR_SUJ, "cancel_jfreefrag: blkno %jd", freefrag->ff_blkno); } /* * Free a jfreefrag when the parent freefrag is rendered obsolete. */ static void free_jfreefrag(jfreefrag) struct jfreefrag *jfreefrag; { if (jfreefrag->fr_state & INPROGRESS) WORKLIST_REMOVE(&jfreefrag->fr_list); else if (jfreefrag->fr_state & ONWORKLIST) remove_from_journal(&jfreefrag->fr_list); if (jfreefrag->fr_freefrag != NULL) panic("free_jfreefrag: Still attached to a freefrag."); WORKITEM_FREE(jfreefrag, D_JFREEFRAG); } /* * Called when the journal write for a jfreefrag completes. The parent * freefrag is added to the worklist if this completes its dependencies. */ static void handle_written_jfreefrag(jfreefrag) struct jfreefrag *jfreefrag; { struct jsegdep *jsegdep; struct freefrag *freefrag; /* Grab the jsegdep. */ jsegdep = jfreefrag->fr_jsegdep; jfreefrag->fr_jsegdep = NULL; freefrag = jfreefrag->fr_freefrag; if (freefrag == NULL) panic("handle_written_jfreefrag: No freefrag."); freefrag->ff_state |= DEPCOMPLETE; freefrag->ff_jdep = NULL; jwork_insert(&freefrag->ff_jwork, jsegdep); if ((freefrag->ff_state & ALLCOMPLETE) == ALLCOMPLETE) add_to_worklist(&freefrag->ff_list, 0); jfreefrag->fr_freefrag = NULL; free_jfreefrag(jfreefrag); } /* * Called when the journal write for a jfreeblk completes. The jfreeblk * is removed from the freeblks list of pending journal writes and the * jsegdep is moved to the freeblks jwork to be completed when all blocks * have been reclaimed. */ static void handle_written_jblkdep(jblkdep) struct jblkdep *jblkdep; { struct freeblks *freeblks; struct jsegdep *jsegdep; /* Grab the jsegdep. */ jsegdep = jblkdep->jb_jsegdep; jblkdep->jb_jsegdep = NULL; freeblks = jblkdep->jb_freeblks; LIST_REMOVE(jblkdep, jb_deps); jwork_insert(&freeblks->fb_jwork, jsegdep); /* * If the freeblks is all journaled, we can add it to the worklist. */ if (LIST_EMPTY(&freeblks->fb_jblkdephd) && (freeblks->fb_state & ALLCOMPLETE) == ALLCOMPLETE) add_to_worklist(&freeblks->fb_list, WK_NODELAY); free_jblkdep(jblkdep); } static struct jsegdep * newjsegdep(struct worklist *wk) { struct jsegdep *jsegdep; jsegdep = malloc(sizeof(*jsegdep), M_JSEGDEP, M_SOFTDEP_FLAGS); workitem_alloc(&jsegdep->jd_list, D_JSEGDEP, wk->wk_mp); jsegdep->jd_seg = NULL; return (jsegdep); } static struct jmvref * newjmvref(dp, ino, oldoff, newoff) struct inode *dp; ino_t ino; off_t oldoff; off_t newoff; { struct jmvref *jmvref; jmvref = malloc(sizeof(*jmvref), M_JMVREF, M_SOFTDEP_FLAGS); workitem_alloc(&jmvref->jm_list, D_JMVREF, ITOVFS(dp)); jmvref->jm_list.wk_state = ATTACHED | DEPCOMPLETE; jmvref->jm_parent = dp->i_number; jmvref->jm_ino = ino; jmvref->jm_oldoff = oldoff; jmvref->jm_newoff = newoff; return (jmvref); } /* * Allocate a new jremref that tracks the removal of ip from dp with the * directory entry offset of diroff. Mark the entry as ATTACHED and * DEPCOMPLETE as we have all the information required for the journal write * and the directory has already been removed from the buffer. The caller * is responsible for linking the jremref into the pagedep and adding it * to the journal to write. 
The MKDIR_PARENT flag is set if we're doing * a DOTDOT addition so handle_workitem_remove() can properly assign * the jsegdep when we're done. */ static struct jremref * newjremref(struct dirrem *dirrem, struct inode *dp, struct inode *ip, off_t diroff, nlink_t nlink) { struct jremref *jremref; jremref = malloc(sizeof(*jremref), M_JREMREF, M_SOFTDEP_FLAGS); workitem_alloc(&jremref->jr_list, D_JREMREF, ITOVFS(dp)); jremref->jr_state = ATTACHED; newinoref(&jremref->jr_ref, ip->i_number, dp->i_number, diroff, nlink, ip->i_mode); jremref->jr_dirrem = dirrem; return (jremref); } static inline void newinoref(struct inoref *inoref, ino_t ino, ino_t parent, off_t diroff, nlink_t nlink, uint16_t mode) { inoref->if_jsegdep = newjsegdep(&inoref->if_list); inoref->if_diroff = diroff; inoref->if_ino = ino; inoref->if_parent = parent; inoref->if_nlink = nlink; inoref->if_mode = mode; } /* * Allocate a new jaddref to track the addition of ino to dp at diroff. The * directory offset may not be known until later. The caller is responsible * adding the entry to the journal when this information is available. nlink * should be the link count prior to the addition and mode is only required * to have the correct FMT. */ static struct jaddref * newjaddref(struct inode *dp, ino_t ino, off_t diroff, int16_t nlink, uint16_t mode) { struct jaddref *jaddref; jaddref = malloc(sizeof(*jaddref), M_JADDREF, M_SOFTDEP_FLAGS); workitem_alloc(&jaddref->ja_list, D_JADDREF, ITOVFS(dp)); jaddref->ja_state = ATTACHED; jaddref->ja_mkdir = NULL; newinoref(&jaddref->ja_ref, ino, dp->i_number, diroff, nlink, mode); return (jaddref); } /* * Create a new free dependency for a freework. The caller is responsible * for adjusting the reference count when it has the lock held. The freedep * will track an outstanding bitmap write that will ultimately clear the * freework to continue. */ static struct freedep * newfreedep(struct freework *freework) { struct freedep *freedep; freedep = malloc(sizeof(*freedep), M_FREEDEP, M_SOFTDEP_FLAGS); workitem_alloc(&freedep->fd_list, D_FREEDEP, freework->fw_list.wk_mp); freedep->fd_freework = freework; return (freedep); } /* * Free a freedep structure once the buffer it is linked to is written. If * this is the last reference to the freework schedule it for completion. */ static void free_freedep(freedep) struct freedep *freedep; { struct freework *freework; freework = freedep->fd_freework; freework->fw_freeblks->fb_cgwait--; if (--freework->fw_ref == 0) freework_enqueue(freework); WORKITEM_FREE(freedep, D_FREEDEP); } /* * Allocate a new freework structure that may be a level in an indirect * when parent is not NULL or a top level block when it is. The top level * freework structures are allocated without the per-filesystem lock held * and before the freeblks is visible outside of softdep_setup_freeblocks(). */ static struct freework * newfreework(ump, freeblks, parent, lbn, nb, frags, off, journal) struct ufsmount *ump; struct freeblks *freeblks; struct freework *parent; ufs_lbn_t lbn; ufs2_daddr_t nb; int frags; int off; int journal; { struct freework *freework; freework = malloc(sizeof(*freework), M_FREEWORK, M_SOFTDEP_FLAGS); workitem_alloc(&freework->fw_list, D_FREEWORK, freeblks->fb_list.wk_mp); freework->fw_state = ATTACHED; freework->fw_jnewblk = NULL; freework->fw_freeblks = freeblks; freework->fw_parent = parent; freework->fw_lbn = lbn; freework->fw_blkno = nb; freework->fw_frags = frags; freework->fw_indir = NULL; freework->fw_ref = (MOUNTEDSUJ(UFSTOVFS(ump)) == 0 || lbn >= -UFS_NXADDR) ? 
0 : NINDIR(ump->um_fs) + 1; freework->fw_start = freework->fw_off = off; if (journal) newjfreeblk(freeblks, lbn, nb, frags); if (parent == NULL) { ACQUIRE_LOCK(ump); WORKLIST_INSERT(&freeblks->fb_freeworkhd, &freework->fw_list); freeblks->fb_ref++; FREE_LOCK(ump); } return (freework); } /* * Eliminate a jfreeblk for a block that does not need journaling. */ static void cancel_jfreeblk(freeblks, blkno) struct freeblks *freeblks; ufs2_daddr_t blkno; { struct jfreeblk *jfreeblk; struct jblkdep *jblkdep; LIST_FOREACH(jblkdep, &freeblks->fb_jblkdephd, jb_deps) { if (jblkdep->jb_list.wk_type != D_JFREEBLK) continue; jfreeblk = WK_JFREEBLK(&jblkdep->jb_list); if (jfreeblk->jf_blkno == blkno) break; } if (jblkdep == NULL) return; CTR1(KTR_SUJ, "cancel_jfreeblk: blkno %jd", blkno); free_jsegdep(jblkdep->jb_jsegdep); LIST_REMOVE(jblkdep, jb_deps); WORKITEM_FREE(jfreeblk, D_JFREEBLK); } /* * Allocate a new jfreeblk to journal top level block pointer when truncating * a file. The caller must add this to the worklist when the per-filesystem * lock is held. */ static struct jfreeblk * newjfreeblk(freeblks, lbn, blkno, frags) struct freeblks *freeblks; ufs_lbn_t lbn; ufs2_daddr_t blkno; int frags; { struct jfreeblk *jfreeblk; jfreeblk = malloc(sizeof(*jfreeblk), M_JFREEBLK, M_SOFTDEP_FLAGS); workitem_alloc(&jfreeblk->jf_dep.jb_list, D_JFREEBLK, freeblks->fb_list.wk_mp); jfreeblk->jf_dep.jb_jsegdep = newjsegdep(&jfreeblk->jf_dep.jb_list); jfreeblk->jf_dep.jb_freeblks = freeblks; jfreeblk->jf_ino = freeblks->fb_inum; jfreeblk->jf_lbn = lbn; jfreeblk->jf_blkno = blkno; jfreeblk->jf_frags = frags; LIST_INSERT_HEAD(&freeblks->fb_jblkdephd, &jfreeblk->jf_dep, jb_deps); return (jfreeblk); } /* * The journal is only prepared to handle full-size block numbers, so we * have to adjust the record to reflect the change to a full-size block. * For example, suppose we have a block made up of fragments 8-15 and * want to free its last two fragments. We are given a request that says: * FREEBLK ino=5, blkno=14, lbn=0, frags=2, oldfrags=0 * where frags are the number of fragments to free and oldfrags are the * number of fragments to keep. To block align it, we have to change it to * have a valid full-size blkno, so it becomes: * FREEBLK ino=5, blkno=8, lbn=0, frags=2, oldfrags=6 */ static void adjust_newfreework(freeblks, frag_offset) struct freeblks *freeblks; int frag_offset; { struct jfreeblk *jfreeblk; KASSERT((LIST_FIRST(&freeblks->fb_jblkdephd) != NULL && LIST_FIRST(&freeblks->fb_jblkdephd)->jb_list.wk_type == D_JFREEBLK), ("adjust_newfreework: Missing freeblks dependency")); jfreeblk = WK_JFREEBLK(LIST_FIRST(&freeblks->fb_jblkdephd)); jfreeblk->jf_blkno -= frag_offset; jfreeblk->jf_frags += frag_offset; } /* * Allocate a new jtrunc to track a partial truncation. */ static struct jtrunc * newjtrunc(freeblks, size, extsize) struct freeblks *freeblks; off_t size; int extsize; { struct jtrunc *jtrunc; jtrunc = malloc(sizeof(*jtrunc), M_JTRUNC, M_SOFTDEP_FLAGS); workitem_alloc(&jtrunc->jt_dep.jb_list, D_JTRUNC, freeblks->fb_list.wk_mp); jtrunc->jt_dep.jb_jsegdep = newjsegdep(&jtrunc->jt_dep.jb_list); jtrunc->jt_dep.jb_freeblks = freeblks; jtrunc->jt_ino = freeblks->fb_inum; jtrunc->jt_size = size; jtrunc->jt_extsize = extsize; LIST_INSERT_HEAD(&freeblks->fb_jblkdephd, &jtrunc->jt_dep, jb_deps); return (jtrunc); } /* * If we're canceling a new bitmap we have to search for another ref * to move into the bmsafemap dep. This might be better expressed * with another structure. 
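 * The replacement jaddref inherits the canceled entry's
 * ATTACHED/UNDONE/NEWBLOCK state and takes its place on the
 * bmsafemap's jaddref list.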
*/ static void move_newblock_dep(jaddref, inodedep) struct jaddref *jaddref; struct inodedep *inodedep; { struct inoref *inoref; struct jaddref *jaddrefn; jaddrefn = NULL; for (inoref = TAILQ_NEXT(&jaddref->ja_ref, if_deps); inoref; inoref = TAILQ_NEXT(inoref, if_deps)) { if ((jaddref->ja_state & NEWBLOCK) && inoref->if_list.wk_type == D_JADDREF) { jaddrefn = (struct jaddref *)inoref; break; } } if (jaddrefn == NULL) return; jaddrefn->ja_state &= ~(ATTACHED | UNDONE); jaddrefn->ja_state |= jaddref->ja_state & (ATTACHED | UNDONE | NEWBLOCK); jaddref->ja_state &= ~(ATTACHED | UNDONE | NEWBLOCK); jaddref->ja_state |= ATTACHED; LIST_REMOVE(jaddref, ja_bmdeps); LIST_INSERT_HEAD(&inodedep->id_bmsafemap->sm_jaddrefhd, jaddrefn, ja_bmdeps); } /* * Cancel a jaddref either before it has been written or while it is being * written. This happens when a link is removed before the add reaches * the disk. The jaddref dependency is kept linked into the bmsafemap * and inode to prevent the link count or bitmap from reaching the disk * until handle_workitem_remove() re-adjusts the counts and bitmaps as * required. * * Returns 1 if the canceled addref requires journaling of the remove and * 0 otherwise. */ static int cancel_jaddref(jaddref, inodedep, wkhd) struct jaddref *jaddref; struct inodedep *inodedep; struct workhead *wkhd; { struct inoref *inoref; struct jsegdep *jsegdep; int needsj; KASSERT((jaddref->ja_state & COMPLETE) == 0, ("cancel_jaddref: Canceling complete jaddref")); if (jaddref->ja_state & (INPROGRESS | COMPLETE)) needsj = 1; else needsj = 0; if (inodedep == NULL) if (inodedep_lookup(jaddref->ja_list.wk_mp, jaddref->ja_ino, 0, &inodedep) == 0) panic("cancel_jaddref: Lost inodedep"); /* * We must adjust the nlink of any reference operation that follows * us so that it is consistent with the in-memory reference. This * ensures that inode nlink rollbacks always have the correct link. */ if (needsj == 0) { for (inoref = TAILQ_NEXT(&jaddref->ja_ref, if_deps); inoref; inoref = TAILQ_NEXT(inoref, if_deps)) { if (inoref->if_state & GOINGAWAY) break; inoref->if_nlink--; } } jsegdep = inoref_jseg(&jaddref->ja_ref); if (jaddref->ja_state & NEWBLOCK) move_newblock_dep(jaddref, inodedep); wake_worklist(&jaddref->ja_list); jaddref->ja_mkdir = NULL; if (jaddref->ja_state & INPROGRESS) { jaddref->ja_state &= ~INPROGRESS; WORKLIST_REMOVE(&jaddref->ja_list); jwork_insert(wkhd, jsegdep); } else { free_jsegdep(jsegdep); if (jaddref->ja_state & DEPCOMPLETE) remove_from_journal(&jaddref->ja_list); } jaddref->ja_state |= (GOINGAWAY | DEPCOMPLETE); /* * Leave NEWBLOCK jaddrefs on the inodedep so handle_workitem_remove * can arrange for them to be freed with the bitmap. Otherwise we * no longer need this addref attached to the inoreflst and it * will incorrectly adjust nlink if we leave it. */ if ((jaddref->ja_state & NEWBLOCK) == 0) { TAILQ_REMOVE(&inodedep->id_inoreflst, &jaddref->ja_ref, if_deps); jaddref->ja_state |= COMPLETE; free_jaddref(jaddref); return (needsj); } /* * Leave the head of the list for jsegdeps for fast merging. */ if (LIST_FIRST(wkhd) != NULL) { jaddref->ja_state |= ONWORKLIST; LIST_INSERT_AFTER(LIST_FIRST(wkhd), &jaddref->ja_list, wk_list); } else WORKLIST_INSERT(wkhd, &jaddref->ja_list); return (needsj); } /* * Attempt to free a jaddref structure when some work completes. This * should only succeed once the entry is written and all dependencies have * been notified. 
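 * (That is, ALLCOMPLETE must be set and no jsegdep or mkdir may still
 * be attached.)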
*/ static void free_jaddref(jaddref) struct jaddref *jaddref; { if ((jaddref->ja_state & ALLCOMPLETE) != ALLCOMPLETE) return; if (jaddref->ja_ref.if_jsegdep) panic("free_jaddref: segdep attached to jaddref %p(0x%X)\n", jaddref, jaddref->ja_state); if (jaddref->ja_state & NEWBLOCK) LIST_REMOVE(jaddref, ja_bmdeps); if (jaddref->ja_state & (INPROGRESS | ONWORKLIST)) panic("free_jaddref: Bad state %p(0x%X)", jaddref, jaddref->ja_state); if (jaddref->ja_mkdir != NULL) panic("free_jaddref: Work pending, 0x%X\n", jaddref->ja_state); WORKITEM_FREE(jaddref, D_JADDREF); } /* * Free a jremref structure once it has been written or discarded. */ static void free_jremref(jremref) struct jremref *jremref; { if (jremref->jr_ref.if_jsegdep) free_jsegdep(jremref->jr_ref.if_jsegdep); if (jremref->jr_state & INPROGRESS) panic("free_jremref: IO still pending"); WORKITEM_FREE(jremref, D_JREMREF); } /* * Free a jnewblk structure. */ static void free_jnewblk(jnewblk) struct jnewblk *jnewblk; { if ((jnewblk->jn_state & ALLCOMPLETE) != ALLCOMPLETE) return; LIST_REMOVE(jnewblk, jn_deps); if (jnewblk->jn_dep != NULL) panic("free_jnewblk: Dependency still attached."); WORKITEM_FREE(jnewblk, D_JNEWBLK); } /* * Cancel a jnewblk which has been been made redundant by frag extension. */ static void cancel_jnewblk(jnewblk, wkhd) struct jnewblk *jnewblk; struct workhead *wkhd; { struct jsegdep *jsegdep; CTR1(KTR_SUJ, "cancel_jnewblk: blkno %jd", jnewblk->jn_blkno); jsegdep = jnewblk->jn_jsegdep; if (jnewblk->jn_jsegdep == NULL || jnewblk->jn_dep == NULL) panic("cancel_jnewblk: Invalid state"); jnewblk->jn_jsegdep = NULL; jnewblk->jn_dep = NULL; jnewblk->jn_state |= GOINGAWAY; if (jnewblk->jn_state & INPROGRESS) { jnewblk->jn_state &= ~INPROGRESS; WORKLIST_REMOVE(&jnewblk->jn_list); jwork_insert(wkhd, jsegdep); } else { free_jsegdep(jsegdep); remove_from_journal(&jnewblk->jn_list); } wake_worklist(&jnewblk->jn_list); WORKLIST_INSERT(wkhd, &jnewblk->jn_list); } static void free_jblkdep(jblkdep) struct jblkdep *jblkdep; { if (jblkdep->jb_list.wk_type == D_JFREEBLK) WORKITEM_FREE(jblkdep, D_JFREEBLK); else if (jblkdep->jb_list.wk_type == D_JTRUNC) WORKITEM_FREE(jblkdep, D_JTRUNC); else panic("free_jblkdep: Unexpected type %s", TYPENAME(jblkdep->jb_list.wk_type)); } /* * Free a single jseg once it is no longer referenced in memory or on * disk. Reclaim journal blocks and dependencies waiting for the segment * to disappear. */ static void free_jseg(jseg, jblocks) struct jseg *jseg; struct jblocks *jblocks; { struct freework *freework; /* * Free freework structures that were lingering to indicate freed * indirect blocks that forced journal write ordering on reallocate. */ while ((freework = LIST_FIRST(&jseg->js_indirs)) != NULL) indirblk_remove(freework); if (jblocks->jb_oldestseg == jseg) jblocks->jb_oldestseg = TAILQ_NEXT(jseg, js_next); TAILQ_REMOVE(&jblocks->jb_segs, jseg, js_next); jblocks_free(jblocks, jseg->js_list.wk_mp, jseg->js_size); KASSERT(LIST_EMPTY(&jseg->js_entries), ("free_jseg: Freed jseg has valid entries.")); WORKITEM_FREE(jseg, D_JSEG); } /* * Free all jsegs that meet the criteria for being reclaimed and update * oldestseg. */ static void free_jsegs(jblocks) struct jblocks *jblocks; { struct jseg *jseg; /* * Free only those jsegs which have none allocated before them to * preserve the journal space ordering. */ while ((jseg = TAILQ_FIRST(&jblocks->jb_segs)) != NULL) { /* * Only reclaim space when nothing depends on this journal * set and another set has written that it is no longer * valid. 
*/ if (jseg->js_refs != 0) { jblocks->jb_oldestseg = jseg; return; } if ((jseg->js_state & ALLCOMPLETE) != ALLCOMPLETE) break; if (jseg->js_seq > jblocks->jb_oldestwrseq) break; /* * We can free jsegs that didn't write entries when * oldestwrseq == js_seq. */ if (jseg->js_seq == jblocks->jb_oldestwrseq && jseg->js_cnt != 0) break; free_jseg(jseg, jblocks); } /* * If we exited the loop above we still must discover the * oldest valid segment. */ if (jseg) for (jseg = jblocks->jb_oldestseg; jseg != NULL; jseg = TAILQ_NEXT(jseg, js_next)) if (jseg->js_refs != 0) break; jblocks->jb_oldestseg = jseg; /* * The journal has no valid records but some jsegs may still be * waiting on oldestwrseq to advance. We force a small record * out to permit these lingering records to be reclaimed. */ if (jblocks->jb_oldestseg == NULL && !TAILQ_EMPTY(&jblocks->jb_segs)) jblocks->jb_needseg = 1; } /* * Release one reference to a jseg and free it if the count reaches 0. This * should eventually reclaim journal space as well. */ static void rele_jseg(jseg) struct jseg *jseg; { KASSERT(jseg->js_refs > 0, ("free_jseg: Invalid refcnt %d", jseg->js_refs)); if (--jseg->js_refs != 0) return; free_jsegs(jseg->js_jblocks); } /* * Release a jsegdep and decrement the jseg count. */ static void free_jsegdep(jsegdep) struct jsegdep *jsegdep; { if (jsegdep->jd_seg) rele_jseg(jsegdep->jd_seg); WORKITEM_FREE(jsegdep, D_JSEGDEP); } /* * Wait for a journal item to make it to disk. Initiate journal processing * if required. */ static int jwait(wk, waitfor) struct worklist *wk; int waitfor; { LOCK_OWNED(VFSTOUFS(wk->wk_mp)); /* * Blocking journal waits cause slow synchronous behavior. Record * stats on the frequency of these blocking operations. */ if (waitfor == MNT_WAIT) { stat_journal_wait++; switch (wk->wk_type) { case D_JREMREF: case D_JMVREF: stat_jwait_filepage++; break; case D_JTRUNC: case D_JFREEBLK: stat_jwait_freeblks++; break; case D_JNEWBLK: stat_jwait_newblk++; break; case D_JADDREF: stat_jwait_inode++; break; default: break; } } /* * If IO has not started we process the journal. We can't mark the * worklist item as IOWAITING because we drop the lock while * processing the journal and the worklist entry may be freed after * this point. The caller may call back in and re-issue the request. */ if ((wk->wk_state & INPROGRESS) == 0) { softdep_process_journal(wk->wk_mp, wk, waitfor); if (waitfor != MNT_WAIT) return (EBUSY); return (0); } if (waitfor != MNT_WAIT) return (EBUSY); wait_worklist(wk, "jwait"); return (0); } /* * Lookup an inodedep based on an inode pointer and set the nlinkdelta as * appropriate. This is a convenience function to reduce duplicate code * for the setup and revert functions below. */ static struct inodedep * inodedep_lookup_ip(ip) struct inode *ip; { struct inodedep *inodedep; KASSERT(ip->i_nlink >= ip->i_effnlink, ("inodedep_lookup_ip: bad delta")); (void) inodedep_lookup(ITOVFS(ip), ip->i_number, DEPALLOC, &inodedep); inodedep->id_nlinkdelta = ip->i_nlink - ip->i_effnlink; KASSERT((inodedep->id_state & UNLINKED) == 0, ("inode unlinked")); return (inodedep); } /* * Called prior to creating a new inode and linking it to a directory. The * jaddref structure must already be allocated by softdep_setup_inomapdep * and it is discovered here so we can initialize the mode and update * nlinkdelta. 
*/ void softdep_setup_create(dp, ip) struct inode *dp; struct inode *ip; { struct inodedep *inodedep; struct jaddref *jaddref; struct vnode *dvp; KASSERT(MOUNTEDSOFTDEP(ITOVFS(dp)) != 0, ("softdep_setup_create called on non-softdep filesystem")); KASSERT(ip->i_nlink == 1, ("softdep_setup_create: Invalid link count.")); dvp = ITOV(dp); ACQUIRE_LOCK(ITOUMP(dp)); inodedep = inodedep_lookup_ip(ip); if (DOINGSUJ(dvp)) { jaddref = (struct jaddref *)TAILQ_LAST(&inodedep->id_inoreflst, inoreflst); KASSERT(jaddref != NULL && jaddref->ja_parent == dp->i_number, ("softdep_setup_create: No addref structure present.")); } softdep_prelink(dvp, NULL); FREE_LOCK(ITOUMP(dp)); } /* * Create a jaddref structure to track the addition of a DOTDOT link when * we are reparenting an inode as part of a rename. This jaddref will be * found by softdep_setup_directory_change. Adjusts nlinkdelta for * non-journaling softdep. */ void softdep_setup_dotdot_link(dp, ip) struct inode *dp; struct inode *ip; { struct inodedep *inodedep; struct jaddref *jaddref; struct vnode *dvp; KASSERT(MOUNTEDSOFTDEP(ITOVFS(dp)) != 0, ("softdep_setup_dotdot_link called on non-softdep filesystem")); dvp = ITOV(dp); jaddref = NULL; /* * We don't set MKDIR_PARENT as this is not tied to a mkdir and * is used as a normal link would be. */ if (DOINGSUJ(dvp)) jaddref = newjaddref(ip, dp->i_number, DOTDOT_OFFSET, dp->i_effnlink - 1, dp->i_mode); ACQUIRE_LOCK(ITOUMP(dp)); inodedep = inodedep_lookup_ip(dp); if (jaddref) TAILQ_INSERT_TAIL(&inodedep->id_inoreflst, &jaddref->ja_ref, if_deps); softdep_prelink(dvp, ITOV(ip)); FREE_LOCK(ITOUMP(dp)); } /* * Create a jaddref structure to track a new link to an inode. The directory * offset is not known until softdep_setup_directory_add or * softdep_setup_directory_change. Adjusts nlinkdelta for non-journaling * softdep. */ void softdep_setup_link(dp, ip) struct inode *dp; struct inode *ip; { struct inodedep *inodedep; struct jaddref *jaddref; struct vnode *dvp; KASSERT(MOUNTEDSOFTDEP(ITOVFS(dp)) != 0, ("softdep_setup_link called on non-softdep filesystem")); dvp = ITOV(dp); jaddref = NULL; if (DOINGSUJ(dvp)) jaddref = newjaddref(dp, ip->i_number, 0, ip->i_effnlink - 1, ip->i_mode); ACQUIRE_LOCK(ITOUMP(dp)); inodedep = inodedep_lookup_ip(ip); if (jaddref) TAILQ_INSERT_TAIL(&inodedep->id_inoreflst, &jaddref->ja_ref, if_deps); softdep_prelink(dvp, ITOV(ip)); FREE_LOCK(ITOUMP(dp)); } /* * Called to create the jaddref structures to track . and .. references as * well as lookup and further initialize the incomplete jaddref created * by softdep_setup_inomapdep when the inode was allocated. Adjusts * nlinkdelta for non-journaling softdep. 
*/ void softdep_setup_mkdir(dp, ip) struct inode *dp; struct inode *ip; { struct inodedep *inodedep; struct jaddref *dotdotaddref; struct jaddref *dotaddref; struct jaddref *jaddref; struct vnode *dvp; KASSERT(MOUNTEDSOFTDEP(ITOVFS(dp)) != 0, ("softdep_setup_mkdir called on non-softdep filesystem")); dvp = ITOV(dp); dotaddref = dotdotaddref = NULL; if (DOINGSUJ(dvp)) { dotaddref = newjaddref(ip, ip->i_number, DOT_OFFSET, 1, ip->i_mode); dotaddref->ja_state |= MKDIR_BODY; dotdotaddref = newjaddref(ip, dp->i_number, DOTDOT_OFFSET, dp->i_effnlink - 1, dp->i_mode); dotdotaddref->ja_state |= MKDIR_PARENT; } ACQUIRE_LOCK(ITOUMP(dp)); inodedep = inodedep_lookup_ip(ip); if (DOINGSUJ(dvp)) { jaddref = (struct jaddref *)TAILQ_LAST(&inodedep->id_inoreflst, inoreflst); KASSERT(jaddref != NULL, ("softdep_setup_mkdir: No addref structure present.")); KASSERT(jaddref->ja_parent == dp->i_number, ("softdep_setup_mkdir: bad parent %ju", (uintmax_t)jaddref->ja_parent)); TAILQ_INSERT_BEFORE(&jaddref->ja_ref, &dotaddref->ja_ref, if_deps); } inodedep = inodedep_lookup_ip(dp); if (DOINGSUJ(dvp)) TAILQ_INSERT_TAIL(&inodedep->id_inoreflst, &dotdotaddref->ja_ref, if_deps); softdep_prelink(ITOV(dp), NULL); FREE_LOCK(ITOUMP(dp)); } /* * Called to track nlinkdelta of the inode and parent directories prior to * unlinking a directory. */ void softdep_setup_rmdir(dp, ip) struct inode *dp; struct inode *ip; { struct vnode *dvp; KASSERT(MOUNTEDSOFTDEP(ITOVFS(dp)) != 0, ("softdep_setup_rmdir called on non-softdep filesystem")); dvp = ITOV(dp); ACQUIRE_LOCK(ITOUMP(dp)); (void) inodedep_lookup_ip(ip); (void) inodedep_lookup_ip(dp); softdep_prelink(dvp, ITOV(ip)); FREE_LOCK(ITOUMP(dp)); } /* * Called to track nlinkdelta of the inode and parent directories prior to * unlink. */ void softdep_setup_unlink(dp, ip) struct inode *dp; struct inode *ip; { struct vnode *dvp; KASSERT(MOUNTEDSOFTDEP(ITOVFS(dp)) != 0, ("softdep_setup_unlink called on non-softdep filesystem")); dvp = ITOV(dp); ACQUIRE_LOCK(ITOUMP(dp)); (void) inodedep_lookup_ip(ip); (void) inodedep_lookup_ip(dp); softdep_prelink(dvp, ITOV(ip)); FREE_LOCK(ITOUMP(dp)); } /* * Called to release the journal structures created by a failed non-directory * creation. Adjusts nlinkdelta for non-journaling softdep. */ void softdep_revert_create(dp, ip) struct inode *dp; struct inode *ip; { struct inodedep *inodedep; struct jaddref *jaddref; struct vnode *dvp; KASSERT(MOUNTEDSOFTDEP(ITOVFS((dp))) != 0, ("softdep_revert_create called on non-softdep filesystem")); dvp = ITOV(dp); ACQUIRE_LOCK(ITOUMP(dp)); inodedep = inodedep_lookup_ip(ip); if (DOINGSUJ(dvp)) { jaddref = (struct jaddref *)TAILQ_LAST(&inodedep->id_inoreflst, inoreflst); KASSERT(jaddref->ja_parent == dp->i_number, ("softdep_revert_create: addref parent mismatch")); cancel_jaddref(jaddref, inodedep, &inodedep->id_inowait); } FREE_LOCK(ITOUMP(dp)); } /* * Called to release the journal structures created by a failed link * addition. Adjusts nlinkdelta for non-journaling softdep. 
*/ void softdep_revert_link(dp, ip) struct inode *dp; struct inode *ip; { struct inodedep *inodedep; struct jaddref *jaddref; struct vnode *dvp; KASSERT(MOUNTEDSOFTDEP(ITOVFS(dp)) != 0, ("softdep_revert_link called on non-softdep filesystem")); dvp = ITOV(dp); ACQUIRE_LOCK(ITOUMP(dp)); inodedep = inodedep_lookup_ip(ip); if (DOINGSUJ(dvp)) { jaddref = (struct jaddref *)TAILQ_LAST(&inodedep->id_inoreflst, inoreflst); KASSERT(jaddref->ja_parent == dp->i_number, ("softdep_revert_link: addref parent mismatch")); cancel_jaddref(jaddref, inodedep, &inodedep->id_inowait); } FREE_LOCK(ITOUMP(dp)); } /* * Called to release the journal structures created by a failed mkdir * attempt. Adjusts nlinkdelta for non-journaling softdep. */ void softdep_revert_mkdir(dp, ip) struct inode *dp; struct inode *ip; { struct inodedep *inodedep; struct jaddref *jaddref; struct jaddref *dotaddref; struct vnode *dvp; KASSERT(MOUNTEDSOFTDEP(ITOVFS(dp)) != 0, ("softdep_revert_mkdir called on non-softdep filesystem")); dvp = ITOV(dp); ACQUIRE_LOCK(ITOUMP(dp)); inodedep = inodedep_lookup_ip(dp); if (DOINGSUJ(dvp)) { jaddref = (struct jaddref *)TAILQ_LAST(&inodedep->id_inoreflst, inoreflst); KASSERT(jaddref->ja_parent == ip->i_number, ("softdep_revert_mkdir: dotdot addref parent mismatch")); cancel_jaddref(jaddref, inodedep, &inodedep->id_inowait); } inodedep = inodedep_lookup_ip(ip); if (DOINGSUJ(dvp)) { jaddref = (struct jaddref *)TAILQ_LAST(&inodedep->id_inoreflst, inoreflst); KASSERT(jaddref->ja_parent == dp->i_number, ("softdep_revert_mkdir: addref parent mismatch")); dotaddref = (struct jaddref *)TAILQ_PREV(&jaddref->ja_ref, inoreflst, if_deps); cancel_jaddref(jaddref, inodedep, &inodedep->id_inowait); KASSERT(dotaddref->ja_parent == ip->i_number, ("softdep_revert_mkdir: dot addref parent mismatch")); cancel_jaddref(dotaddref, inodedep, &inodedep->id_inowait); } FREE_LOCK(ITOUMP(dp)); } /* * Called to correct nlinkdelta after a failed rmdir. */ void softdep_revert_rmdir(dp, ip) struct inode *dp; struct inode *ip; { KASSERT(MOUNTEDSOFTDEP(ITOVFS(dp)) != 0, ("softdep_revert_rmdir called on non-softdep filesystem")); ACQUIRE_LOCK(ITOUMP(dp)); (void) inodedep_lookup_ip(ip); (void) inodedep_lookup_ip(dp); FREE_LOCK(ITOUMP(dp)); } /* * Protecting the freemaps (or bitmaps). * * To eliminate the need to execute fsck before mounting a filesystem * after a power failure, one must (conservatively) guarantee that the * on-disk copy of the bitmaps never indicate that a live inode or block is * free. So, when a block or inode is allocated, the bitmap should be * updated (on disk) before any new pointers. When a block or inode is * freed, the bitmap should not be updated until all pointers have been * reset. The latter dependency is handled by the delayed de-allocation * approach described below for block and inode de-allocation. The former * dependency is handled by calling the following procedure when a block or * inode is allocated. When an inode is allocated an "inodedep" is created * with its DEPCOMPLETE flag cleared until its bitmap is written to disk. * Each "inodedep" is also inserted into the hash indexing structure so * that any additional link additions can be made dependent on the inode * allocation. * * The ufs filesystem maintains a number of free block counts (e.g., per * cylinder group, per cylinder and per pair) * in addition to the bitmaps. These counts are used to improve efficiency * during allocation and therefore must be consistent with the bitmaps. 
 * There is no convenient way to guarantee post-crash consistency of these
 * counts with simple update ordering, for two main reasons: (1) The counts
 * and bitmaps for a single cylinder group block are not in the same disk
 * sector. If a disk write is interrupted (e.g., by power failure), one may
 * be written and the other not. (2) Some of the counts are located in the
 * superblock rather than the cylinder group block. So, we focus our soft
 * updates implementation on protecting the bitmaps. When mounting a
 * filesystem, we recompute the auxiliary counts from the bitmaps.
 */

/*
 * Called just after updating the cylinder group block to allocate an inode.
 */
void
softdep_setup_inomapdep(bp, ip, newinum, mode)
        struct buf *bp;         /* buffer for cylgroup block with inode map */
        struct inode *ip;       /* inode related to allocation */
        ino_t newinum;          /* new inode number being allocated */
        int mode;
{
        struct inodedep *inodedep;
        struct bmsafemap *bmsafemap;
        struct jaddref *jaddref;
        struct mount *mp;
        struct fs *fs;

        mp = ITOVFS(ip);
        KASSERT(MOUNTEDSOFTDEP(mp) != 0,
            ("softdep_setup_inomapdep called on non-softdep filesystem"));
        fs = VFSTOUFS(mp)->um_fs;
        jaddref = NULL;

        /*
         * Allocate the journal reference add structure so that the bitmap
         * can be dependent on it.
         */
        if (MOUNTEDSUJ(mp)) {
                jaddref = newjaddref(ip, newinum, 0, 0, mode);
                jaddref->ja_state |= NEWBLOCK;
        }

        /*
         * Create a dependency for the newly allocated inode.
         * Panic if it already exists as something is seriously wrong.
         * Otherwise add it to the dependency list for the buffer holding
         * the cylinder group map from which it was allocated.
         *
         * We have to preallocate a bmsafemap entry in case it is needed
         * in bmsafemap_lookup since once we allocate the inodedep, we
         * have to finish initializing it before we can FREE_LOCK().
         * By preallocating, we avoid FREE_LOCK() while doing a malloc
         * in bmsafemap_lookup. We cannot call bmsafemap_lookup before
         * creating the inodedep as it can be freed during the time
         * that we FREE_LOCK() while allocating the inodedep. We must
         * call workitem_alloc() before entering the locked section as
         * it also acquires the lock and we must avoid trying to do so
         * recursively.
         */
        bmsafemap = malloc(sizeof(struct bmsafemap),
            M_BMSAFEMAP, M_SOFTDEP_FLAGS);
        workitem_alloc(&bmsafemap->sm_list, D_BMSAFEMAP, mp);
        ACQUIRE_LOCK(ITOUMP(ip));
        if ((inodedep_lookup(mp, newinum, DEPALLOC, &inodedep)))
                panic("softdep_setup_inomapdep: dependency %p for new "
                    "inode already exists", inodedep);
        bmsafemap = bmsafemap_lookup(mp, bp, ino_to_cg(fs, newinum), bmsafemap);
        if (jaddref) {
                LIST_INSERT_HEAD(&bmsafemap->sm_jaddrefhd, jaddref, ja_bmdeps);
                TAILQ_INSERT_TAIL(&inodedep->id_inoreflst, &jaddref->ja_ref,
                    if_deps);
        } else {
                inodedep->id_state |= ONDEPLIST;
                LIST_INSERT_HEAD(&bmsafemap->sm_inodedephd, inodedep, id_deps);
        }
        inodedep->id_bmsafemap = bmsafemap;
        inodedep->id_state &= ~DEPCOMPLETE;
        FREE_LOCK(ITOUMP(ip));
}

/*
 * Called just after updating the cylinder group block to
 * allocate block or fragment.
 */
void
softdep_setup_blkmapdep(bp, mp, newblkno, frags, oldfrags)
        struct buf *bp;         /* buffer for cylgroup block with block map */
        struct mount *mp;       /* filesystem doing allocation */
        ufs2_daddr_t newblkno;  /* number of newly allocated block */
        int frags;              /* Number of fragments. */
        int oldfrags;           /* Previous number of fragments for extend.
*/ { struct newblk *newblk; struct bmsafemap *bmsafemap; struct jnewblk *jnewblk; struct ufsmount *ump; struct fs *fs; KASSERT(MOUNTEDSOFTDEP(mp) != 0, ("softdep_setup_blkmapdep called on non-softdep filesystem")); ump = VFSTOUFS(mp); fs = ump->um_fs; jnewblk = NULL; /* * Create a dependency for the newly allocated block. * Add it to the dependency list for the buffer holding * the cylinder group map from which it was allocated. */ if (MOUNTEDSUJ(mp)) { jnewblk = malloc(sizeof(*jnewblk), M_JNEWBLK, M_SOFTDEP_FLAGS); workitem_alloc(&jnewblk->jn_list, D_JNEWBLK, mp); jnewblk->jn_jsegdep = newjsegdep(&jnewblk->jn_list); jnewblk->jn_state = ATTACHED; jnewblk->jn_blkno = newblkno; jnewblk->jn_frags = frags; jnewblk->jn_oldfrags = oldfrags; #ifdef SUJ_DEBUG { struct cg *cgp; uint8_t *blksfree; long bno; int i; cgp = (struct cg *)bp->b_data; blksfree = cg_blksfree(cgp); bno = dtogd(fs, jnewblk->jn_blkno); for (i = jnewblk->jn_oldfrags; i < jnewblk->jn_frags; i++) { if (isset(blksfree, bno + i)) panic("softdep_setup_blkmapdep: " "free fragment %d from %d-%d " "state 0x%X dep %p", i, jnewblk->jn_oldfrags, jnewblk->jn_frags, jnewblk->jn_state, jnewblk->jn_dep); } } #endif } CTR3(KTR_SUJ, "softdep_setup_blkmapdep: blkno %jd frags %d oldfrags %d", newblkno, frags, oldfrags); ACQUIRE_LOCK(ump); if (newblk_lookup(mp, newblkno, DEPALLOC, &newblk) != 0) panic("softdep_setup_blkmapdep: found block"); newblk->nb_bmsafemap = bmsafemap = bmsafemap_lookup(mp, bp, dtog(fs, newblkno), NULL); if (jnewblk) { jnewblk->jn_dep = (struct worklist *)newblk; LIST_INSERT_HEAD(&bmsafemap->sm_jnewblkhd, jnewblk, jn_deps); } else { newblk->nb_state |= ONDEPLIST; LIST_INSERT_HEAD(&bmsafemap->sm_newblkhd, newblk, nb_deps); } newblk->nb_bmsafemap = bmsafemap; newblk->nb_jnewblk = jnewblk; FREE_LOCK(ump); } #define BMSAFEMAP_HASH(ump, cg) \ (&(ump)->bmsafemap_hashtbl[(cg) & (ump)->bmsafemap_hash_size]) static int bmsafemap_find(bmsafemaphd, cg, bmsafemapp) struct bmsafemap_hashhead *bmsafemaphd; int cg; struct bmsafemap **bmsafemapp; { struct bmsafemap *bmsafemap; LIST_FOREACH(bmsafemap, bmsafemaphd, sm_hash) if (bmsafemap->sm_cg == cg) break; if (bmsafemap) { *bmsafemapp = bmsafemap; return (1); } *bmsafemapp = NULL; return (0); } /* * Find the bmsafemap associated with a cylinder group buffer. * If none exists, create one. The buffer must be locked when * this routine is called and this routine must be called with * the softdep lock held. To avoid giving up the lock while * allocating a new bmsafemap, a preallocated bmsafemap may be * provided. If it is provided but not needed, it is freed. 
*/ static struct bmsafemap * bmsafemap_lookup(mp, bp, cg, newbmsafemap) struct mount *mp; struct buf *bp; int cg; struct bmsafemap *newbmsafemap; { struct bmsafemap_hashhead *bmsafemaphd; struct bmsafemap *bmsafemap, *collision; struct worklist *wk; struct ufsmount *ump; ump = VFSTOUFS(mp); LOCK_OWNED(ump); KASSERT(bp != NULL, ("bmsafemap_lookup: missing buffer")); LIST_FOREACH(wk, &bp->b_dep, wk_list) { if (wk->wk_type == D_BMSAFEMAP) { if (newbmsafemap) WORKITEM_FREE(newbmsafemap, D_BMSAFEMAP); return (WK_BMSAFEMAP(wk)); } } bmsafemaphd = BMSAFEMAP_HASH(ump, cg); if (bmsafemap_find(bmsafemaphd, cg, &bmsafemap) == 1) { if (newbmsafemap) WORKITEM_FREE(newbmsafemap, D_BMSAFEMAP); return (bmsafemap); } if (newbmsafemap) { bmsafemap = newbmsafemap; } else { FREE_LOCK(ump); bmsafemap = malloc(sizeof(struct bmsafemap), M_BMSAFEMAP, M_SOFTDEP_FLAGS); workitem_alloc(&bmsafemap->sm_list, D_BMSAFEMAP, mp); ACQUIRE_LOCK(ump); } bmsafemap->sm_buf = bp; LIST_INIT(&bmsafemap->sm_inodedephd); LIST_INIT(&bmsafemap->sm_inodedepwr); LIST_INIT(&bmsafemap->sm_newblkhd); LIST_INIT(&bmsafemap->sm_newblkwr); LIST_INIT(&bmsafemap->sm_jaddrefhd); LIST_INIT(&bmsafemap->sm_jnewblkhd); LIST_INIT(&bmsafemap->sm_freehd); LIST_INIT(&bmsafemap->sm_freewr); if (bmsafemap_find(bmsafemaphd, cg, &collision) == 1) { WORKITEM_FREE(bmsafemap, D_BMSAFEMAP); return (collision); } bmsafemap->sm_cg = cg; LIST_INSERT_HEAD(bmsafemaphd, bmsafemap, sm_hash); LIST_INSERT_HEAD(&ump->softdep_dirtycg, bmsafemap, sm_next); WORKLIST_INSERT(&bp->b_dep, &bmsafemap->sm_list); return (bmsafemap); } /* * Direct block allocation dependencies. * * When a new block is allocated, the corresponding disk locations must be * initialized (with zeros or new data) before the on-disk inode points to * them. Also, the freemap from which the block was allocated must be * updated (on disk) before the inode's pointer. These two dependencies are * independent of each other and are needed for all file blocks and indirect * blocks that are pointed to directly by the inode. Just before the * "in-core" version of the inode is updated with a newly allocated block * number, a procedure (below) is called to setup allocation dependency * structures. These structures are removed when the corresponding * dependencies are satisfied or when the block allocation becomes obsolete * (i.e., the file is deleted, the block is de-allocated, or the block is a * fragment that gets upgraded). All of these cases are handled in * procedures described later. * * When a file extension causes a fragment to be upgraded, either to a larger * fragment or to a full block, the on-disk location may change (if the * previous fragment could not simply be extended). In this case, the old * fragment must be de-allocated, but not until after the inode's pointer has * been updated. In most cases, this is handled by later procedures, which * will construct a "freefrag" structure to be added to the workitem queue * when the inode update is complete (or obsolete). The main exception to * this is when an allocation occurs while a pending allocation dependency * (for the same block pointer) remains. This case is handled in the main * allocation dependency setup procedure by immediately freeing the * unreferenced fragments. 
 */
void
softdep_setup_allocdirect(ip, off, newblkno, oldblkno, newsize, oldsize, bp)
        struct inode *ip;       /* inode to which block is being added */
        ufs_lbn_t off;          /* block pointer within inode */
        ufs2_daddr_t newblkno;  /* disk block number being added */
        ufs2_daddr_t oldblkno;  /* previous block number, 0 unless frag */
        long newsize;           /* size of new block */
        long oldsize;           /* size of old block */
        struct buf *bp;         /* bp for allocated block */
{
        struct allocdirect *adp, *oldadp;
        struct allocdirectlst *adphead;
        struct freefrag *freefrag;
        struct inodedep *inodedep;
        struct pagedep *pagedep;
        struct jnewblk *jnewblk;
        struct newblk *newblk;
        struct mount *mp;
        ufs_lbn_t lbn;

        lbn = bp->b_lblkno;
        mp = ITOVFS(ip);
        KASSERT(MOUNTEDSOFTDEP(mp) != 0,
            ("softdep_setup_allocdirect called on non-softdep filesystem"));
        if (oldblkno && oldblkno != newblkno)
                freefrag = newfreefrag(ip, oldblkno, oldsize, lbn);
        else
                freefrag = NULL;

        CTR6(KTR_SUJ,
            "softdep_setup_allocdirect: ino %d blkno %jd oldblkno %jd "
            "off %jd newsize %ld oldsize %d",
            ip->i_number, newblkno, oldblkno, off, newsize, oldsize);
        ACQUIRE_LOCK(ITOUMP(ip));
        if (off >= UFS_NDADDR) {
                if (lbn > 0)
                        panic("softdep_setup_allocdirect: bad lbn %jd, off %jd",
                            lbn, off);
                /* allocating an indirect block */
                if (oldblkno != 0)
                        panic("softdep_setup_allocdirect: non-zero indir");
        } else {
                if (off != lbn)
                        panic("softdep_setup_allocdirect: lbn %jd != off %jd",
                            lbn, off);
                /*
                 * Allocating a direct block.
                 *
                 * If we are allocating a directory block, then we must
                 * allocate an associated pagedep to track additions and
                 * deletions.
                 */
                if ((ip->i_mode & IFMT) == IFDIR)
                        pagedep_lookup(mp, bp, ip->i_number, off, DEPALLOC,
                            &pagedep);
        }
        if (newblk_lookup(mp, newblkno, 0, &newblk) == 0)
                panic("softdep_setup_allocdirect: lost block");
        KASSERT(newblk->nb_list.wk_type == D_NEWBLK,
            ("softdep_setup_allocdirect: newblk already initialized"));
        /*
         * Convert the newblk to an allocdirect.
         */
        WORKITEM_REASSIGN(newblk, D_ALLOCDIRECT);
        adp = (struct allocdirect *)newblk;
        newblk->nb_freefrag = freefrag;
        adp->ad_offset = off;
        adp->ad_oldblkno = oldblkno;
        adp->ad_newsize = newsize;
        adp->ad_oldsize = oldsize;
        /*
         * Finish initializing the journal.
         */
        if ((jnewblk = newblk->nb_jnewblk) != NULL) {
                jnewblk->jn_ino = ip->i_number;
                jnewblk->jn_lbn = lbn;
                add_to_journal(&jnewblk->jn_list);
        }
        if (freefrag && freefrag->ff_jdep != NULL &&
            freefrag->ff_jdep->wk_type == D_JFREEFRAG)
                add_to_journal(freefrag->ff_jdep);
        inodedep_lookup(mp, ip->i_number, DEPALLOC, &inodedep);
        adp->ad_inodedep = inodedep;

        WORKLIST_INSERT(&bp->b_dep, &newblk->nb_list);
        /*
         * The list of allocdirects must be kept in sorted and ascending
         * order so that the rollback routines can quickly determine the
         * first uncommitted block (the size of the file stored on disk
         * ends at the end of the lowest committed fragment, or if there
         * are no fragments, at the end of the highest committed block).
         * Since files generally grow, the typical case is that the new
         * block is to be added at the end of the list. We speed this
         * special case by checking against the last allocdirect in the
         * list before laboriously traversing the list looking for the
         * insertion point.
*/ adphead = &inodedep->id_newinoupdt; oldadp = TAILQ_LAST(adphead, allocdirectlst); if (oldadp == NULL || oldadp->ad_offset <= off) { /* insert at end of list */ TAILQ_INSERT_TAIL(adphead, adp, ad_next); if (oldadp != NULL && oldadp->ad_offset == off) allocdirect_merge(adphead, adp, oldadp); FREE_LOCK(ITOUMP(ip)); return; } TAILQ_FOREACH(oldadp, adphead, ad_next) { if (oldadp->ad_offset >= off) break; } if (oldadp == NULL) panic("softdep_setup_allocdirect: lost entry"); /* insert in middle of list */ TAILQ_INSERT_BEFORE(oldadp, adp, ad_next); if (oldadp->ad_offset == off) allocdirect_merge(adphead, adp, oldadp); FREE_LOCK(ITOUMP(ip)); } /* * Merge a newer and older journal record to be stored either in a * newblock or freefrag. This handles aggregating journal records for * fragment allocation into a second record as well as replacing a * journal free with an aborted journal allocation. A segment for the * oldest record will be placed on wkhd if it has been written. If not * the segment for the newer record will suffice. */ static struct worklist * jnewblk_merge(new, old, wkhd) struct worklist *new; struct worklist *old; struct workhead *wkhd; { struct jnewblk *njnewblk; struct jnewblk *jnewblk; /* Handle NULLs to simplify callers. */ if (new == NULL) return (old); if (old == NULL) return (new); /* Replace a jfreefrag with a jnewblk. */ if (new->wk_type == D_JFREEFRAG) { if (WK_JNEWBLK(old)->jn_blkno != WK_JFREEFRAG(new)->fr_blkno) panic("jnewblk_merge: blkno mismatch: %p, %p", old, new); cancel_jfreefrag(WK_JFREEFRAG(new)); return (old); } if (old->wk_type != D_JNEWBLK || new->wk_type != D_JNEWBLK) panic("jnewblk_merge: Bad type: old %d new %d\n", old->wk_type, new->wk_type); /* * Handle merging of two jnewblk records that describe * different sets of fragments in the same block. */ jnewblk = WK_JNEWBLK(old); njnewblk = WK_JNEWBLK(new); if (jnewblk->jn_blkno != njnewblk->jn_blkno) panic("jnewblk_merge: Merging disparate blocks."); /* * The record may be rolled back in the cg. */ if (jnewblk->jn_state & UNDONE) { jnewblk->jn_state &= ~UNDONE; njnewblk->jn_state |= UNDONE; njnewblk->jn_state &= ~ATTACHED; } /* * We modify the newer addref and free the older so that if neither * has been written the most up-to-date copy will be on disk. If * both have been written but rolled back we only temporarily need * one of them to fix the bits when the cg write completes. */ jnewblk->jn_state |= ATTACHED | COMPLETE; njnewblk->jn_oldfrags = jnewblk->jn_oldfrags; cancel_jnewblk(jnewblk, wkhd); WORKLIST_REMOVE(&jnewblk->jn_list); free_jnewblk(jnewblk); return (new); } /* * Replace an old allocdirect dependency with a newer one. * This routine must be called with splbio interrupts blocked. 
*/ static void allocdirect_merge(adphead, newadp, oldadp) struct allocdirectlst *adphead; /* head of list holding allocdirects */ struct allocdirect *newadp; /* allocdirect being added */ struct allocdirect *oldadp; /* existing allocdirect being checked */ { struct worklist *wk; struct freefrag *freefrag; freefrag = NULL; LOCK_OWNED(VFSTOUFS(newadp->ad_list.wk_mp)); if (newadp->ad_oldblkno != oldadp->ad_newblkno || newadp->ad_oldsize != oldadp->ad_newsize || newadp->ad_offset >= UFS_NDADDR) panic("%s %jd != new %jd || old size %ld != new %ld", "allocdirect_merge: old blkno", (intmax_t)newadp->ad_oldblkno, (intmax_t)oldadp->ad_newblkno, newadp->ad_oldsize, oldadp->ad_newsize); newadp->ad_oldblkno = oldadp->ad_oldblkno; newadp->ad_oldsize = oldadp->ad_oldsize; /* * If the old dependency had a fragment to free or had never * previously had a block allocated, then the new dependency * can immediately post its freefrag and adopt the old freefrag. * This action is done by swapping the freefrag dependencies. * The new dependency gains the old one's freefrag, and the * old one gets the new one and then immediately puts it on * the worklist when it is freed by free_newblk. It is * not possible to do this swap when the old dependency had a * non-zero size but no previous fragment to free. This condition * arises when the new block is an extension of the old block. * Here, the first part of the fragment allocated to the new * dependency is part of the block currently claimed on disk by * the old dependency, so cannot legitimately be freed until the * conditions for the new dependency are fulfilled. */ freefrag = newadp->ad_freefrag; if (oldadp->ad_freefrag != NULL || oldadp->ad_oldblkno == 0) { newadp->ad_freefrag = oldadp->ad_freefrag; oldadp->ad_freefrag = freefrag; } /* * If we are tracking a new directory-block allocation, * move it from the old allocdirect to the new allocdirect. */ if ((wk = LIST_FIRST(&oldadp->ad_newdirblk)) != NULL) { WORKLIST_REMOVE(wk); if (!LIST_EMPTY(&oldadp->ad_newdirblk)) panic("allocdirect_merge: extra newdirblk"); WORKLIST_INSERT(&newadp->ad_newdirblk, wk); } TAILQ_REMOVE(adphead, oldadp, ad_next); /* * We need to move any journal dependencies over to the freefrag * that releases this block if it exists. Otherwise we are * extending an existing block and we'll wait until that is * complete to release the journal space and extend the * new journal to cover this old space as well. */ if (freefrag == NULL) { if (oldadp->ad_newblkno != newadp->ad_newblkno) panic("allocdirect_merge: %jd != %jd", oldadp->ad_newblkno, newadp->ad_newblkno); newadp->ad_block.nb_jnewblk = (struct jnewblk *) jnewblk_merge(&newadp->ad_block.nb_jnewblk->jn_list, &oldadp->ad_block.nb_jnewblk->jn_list, &newadp->ad_block.nb_jwork); oldadp->ad_block.nb_jnewblk = NULL; cancel_newblk(&oldadp->ad_block, NULL, &newadp->ad_block.nb_jwork); } else { wk = (struct worklist *) cancel_newblk(&oldadp->ad_block, &freefrag->ff_list, &freefrag->ff_jwork); freefrag->ff_jdep = jnewblk_merge(freefrag->ff_jdep, wk, &freefrag->ff_jwork); } free_newblk(&oldadp->ad_block); } /* * Allocate a jfreefrag structure to journal a single block free. 
 */
static struct jfreefrag *
newjfreefrag(freefrag, ip, blkno, size, lbn)
        struct freefrag *freefrag;
        struct inode *ip;
        ufs2_daddr_t blkno;
        long size;
        ufs_lbn_t lbn;
{
        struct jfreefrag *jfreefrag;
        struct fs *fs;

        fs = ITOFS(ip);
        jfreefrag = malloc(sizeof(struct jfreefrag), M_JFREEFRAG,
            M_SOFTDEP_FLAGS);
        workitem_alloc(&jfreefrag->fr_list, D_JFREEFRAG, ITOVFS(ip));
        jfreefrag->fr_jsegdep = newjsegdep(&jfreefrag->fr_list);
        jfreefrag->fr_state = ATTACHED | DEPCOMPLETE;
        jfreefrag->fr_ino = ip->i_number;
        jfreefrag->fr_lbn = lbn;
        jfreefrag->fr_blkno = blkno;
        jfreefrag->fr_frags = numfrags(fs, size);
        jfreefrag->fr_freefrag = freefrag;

        return (jfreefrag);
}

/*
 * Allocate a new freefrag structure.
 */
static struct freefrag *
newfreefrag(ip, blkno, size, lbn)
        struct inode *ip;
        ufs2_daddr_t blkno;
        long size;
        ufs_lbn_t lbn;
{
        struct freefrag *freefrag;
        struct ufsmount *ump;
        struct fs *fs;

        CTR4(KTR_SUJ, "newfreefrag: ino %d blkno %jd size %ld lbn %jd",
            ip->i_number, blkno, size, lbn);
        ump = ITOUMP(ip);
        fs = ump->um_fs;
        if (fragnum(fs, blkno) + numfrags(fs, size) > fs->fs_frag)
                panic("newfreefrag: frag size");
        freefrag = malloc(sizeof(struct freefrag),
            M_FREEFRAG, M_SOFTDEP_FLAGS);
        workitem_alloc(&freefrag->ff_list, D_FREEFRAG, UFSTOVFS(ump));
        freefrag->ff_state = ATTACHED;
        LIST_INIT(&freefrag->ff_jwork);
        freefrag->ff_inum = ip->i_number;
        freefrag->ff_vtype = ITOV(ip)->v_type;
        freefrag->ff_blkno = blkno;
        freefrag->ff_fragsize = size;

        if (MOUNTEDSUJ(UFSTOVFS(ump))) {
                freefrag->ff_jdep = (struct worklist *)
                    newjfreefrag(freefrag, ip, blkno, size, lbn);
        } else {
                freefrag->ff_state |= DEPCOMPLETE;
                freefrag->ff_jdep = NULL;
        }

        return (freefrag);
}

/*
 * This workitem de-allocates fragments that were replaced during
 * file block allocation.
 */
static void
handle_workitem_freefrag(freefrag)
        struct freefrag *freefrag;
{
        struct ufsmount *ump = VFSTOUFS(freefrag->ff_list.wk_mp);
        struct workhead wkhd;

        CTR3(KTR_SUJ,
            "handle_workitem_freefrag: ino %d blkno %jd size %ld",
            freefrag->ff_inum, freefrag->ff_blkno, freefrag->ff_fragsize);
        /*
         * It would be illegal to add new completion items to the
         * freefrag after it was scheduled to be done so it must be
         * safe to modify the list head here.
         */
        LIST_INIT(&wkhd);
        ACQUIRE_LOCK(ump);
        LIST_SWAP(&freefrag->ff_jwork, &wkhd, worklist, wk_list);
        /*
         * If the journal has not been written we must cancel it here.
         */
        if (freefrag->ff_jdep) {
                if (freefrag->ff_jdep->wk_type != D_JNEWBLK)
                        panic("handle_workitem_freefrag: Unexpected type %d\n",
                            freefrag->ff_jdep->wk_type);
                cancel_jnewblk(WK_JNEWBLK(freefrag->ff_jdep), &wkhd);
        }
        FREE_LOCK(ump);
        ffs_blkfree(ump, ump->um_fs, ump->um_devvp, freefrag->ff_blkno,
            freefrag->ff_fragsize, freefrag->ff_inum, freefrag->ff_vtype,
            &wkhd);
        ACQUIRE_LOCK(ump);
        WORKITEM_FREE(freefrag, D_FREEFRAG);
        FREE_LOCK(ump);
}

/*
 * Set up a dependency structure for an external attributes data block.
 * This routine follows much of the structure of softdep_setup_allocdirect.
 * See the description of softdep_setup_allocdirect above for details.
*/ void softdep_setup_allocext(ip, off, newblkno, oldblkno, newsize, oldsize, bp) struct inode *ip; ufs_lbn_t off; ufs2_daddr_t newblkno; ufs2_daddr_t oldblkno; long newsize; long oldsize; struct buf *bp; { struct allocdirect *adp, *oldadp; struct allocdirectlst *adphead; struct freefrag *freefrag; struct inodedep *inodedep; struct jnewblk *jnewblk; struct newblk *newblk; struct mount *mp; struct ufsmount *ump; ufs_lbn_t lbn; mp = ITOVFS(ip); ump = VFSTOUFS(mp); KASSERT(MOUNTEDSOFTDEP(mp) != 0, ("softdep_setup_allocext called on non-softdep filesystem")); KASSERT(off < UFS_NXADDR, ("softdep_setup_allocext: lbn %lld > UFS_NXADDR", (long long)off)); lbn = bp->b_lblkno; if (oldblkno && oldblkno != newblkno) freefrag = newfreefrag(ip, oldblkno, oldsize, lbn); else freefrag = NULL; ACQUIRE_LOCK(ump); if (newblk_lookup(mp, newblkno, 0, &newblk) == 0) panic("softdep_setup_allocext: lost block"); KASSERT(newblk->nb_list.wk_type == D_NEWBLK, ("softdep_setup_allocext: newblk already initialized")); /* * Convert the newblk to an allocdirect. */ WORKITEM_REASSIGN(newblk, D_ALLOCDIRECT); adp = (struct allocdirect *)newblk; newblk->nb_freefrag = freefrag; adp->ad_offset = off; adp->ad_oldblkno = oldblkno; adp->ad_newsize = newsize; adp->ad_oldsize = oldsize; adp->ad_state |= EXTDATA; /* * Finish initializing the journal. */ if ((jnewblk = newblk->nb_jnewblk) != NULL) { jnewblk->jn_ino = ip->i_number; jnewblk->jn_lbn = lbn; add_to_journal(&jnewblk->jn_list); } if (freefrag && freefrag->ff_jdep != NULL && freefrag->ff_jdep->wk_type == D_JFREEFRAG) add_to_journal(freefrag->ff_jdep); inodedep_lookup(mp, ip->i_number, DEPALLOC, &inodedep); adp->ad_inodedep = inodedep; WORKLIST_INSERT(&bp->b_dep, &newblk->nb_list); /* * The list of allocdirects must be kept in sorted and ascending * order so that the rollback routines can quickly determine the * first uncommitted block (the size of the file stored on disk * ends at the end of the lowest committed fragment, or if there * are no fragments, at the end of the highest committed block). * Since files generally grow, the typical case is that the new * block is to be added at the end of the list. We speed this * special case by checking against the last allocdirect in the * list before laboriously traversing the list looking for the * insertion point. */ adphead = &inodedep->id_newextupdt; oldadp = TAILQ_LAST(adphead, allocdirectlst); if (oldadp == NULL || oldadp->ad_offset <= off) { /* insert at end of list */ TAILQ_INSERT_TAIL(adphead, adp, ad_next); if (oldadp != NULL && oldadp->ad_offset == off) allocdirect_merge(adphead, adp, oldadp); FREE_LOCK(ump); return; } TAILQ_FOREACH(oldadp, adphead, ad_next) { if (oldadp->ad_offset >= off) break; } if (oldadp == NULL) panic("softdep_setup_allocext: lost entry"); /* insert in middle of list */ TAILQ_INSERT_BEFORE(oldadp, adp, ad_next); if (oldadp->ad_offset == off) allocdirect_merge(adphead, adp, oldadp); FREE_LOCK(ump); } /* * Indirect block allocation dependencies. * * The same dependencies that exist for a direct block also exist when * a new block is allocated and pointed to by an entry in a block of * indirect pointers. The undo/redo states described above are also * used here. Because an indirect block contains many pointers that * may have dependencies, a second copy of the entire in-memory indirect * block is kept. The buffer cache copy is always completely up-to-date. 
* The second copy, which is used only as a source for disk writes, * contains only the safe pointers (i.e., those that have no remaining * update dependencies). The second copy is freed when all pointers * are safe. The cache is not allowed to replace indirect blocks with * pending update dependencies. If a buffer containing an indirect * block with dependencies is written, these routines will mark it * dirty again. It can only be successfully written once all the * dependencies are removed. The ffs_fsync routine in conjunction with * softdep_sync_metadata work together to get all the dependencies * removed so that a file can be successfully written to disk. Three * procedures are used when setting up indirect block pointer * dependencies. The division is necessary because of the organization * of the "balloc" routine and because of the distinction between file * pages and file metadata blocks. */ /* * Allocate a new allocindir structure. */ static struct allocindir * newallocindir(ip, ptrno, newblkno, oldblkno, lbn) struct inode *ip; /* inode for file being extended */ int ptrno; /* offset of pointer in indirect block */ ufs2_daddr_t newblkno; /* disk block number being added */ ufs2_daddr_t oldblkno; /* previous block number, 0 if none */ ufs_lbn_t lbn; { struct newblk *newblk; struct allocindir *aip; struct freefrag *freefrag; struct jnewblk *jnewblk; if (oldblkno) freefrag = newfreefrag(ip, oldblkno, ITOFS(ip)->fs_bsize, lbn); else freefrag = NULL; ACQUIRE_LOCK(ITOUMP(ip)); if (newblk_lookup(ITOVFS(ip), newblkno, 0, &newblk) == 0) panic("new_allocindir: lost block"); KASSERT(newblk->nb_list.wk_type == D_NEWBLK, ("newallocindir: newblk already initialized")); WORKITEM_REASSIGN(newblk, D_ALLOCINDIR); newblk->nb_freefrag = freefrag; aip = (struct allocindir *)newblk; aip->ai_offset = ptrno; aip->ai_oldblkno = oldblkno; aip->ai_lbn = lbn; if ((jnewblk = newblk->nb_jnewblk) != NULL) { jnewblk->jn_ino = ip->i_number; jnewblk->jn_lbn = lbn; add_to_journal(&jnewblk->jn_list); } if (freefrag && freefrag->ff_jdep != NULL && freefrag->ff_jdep->wk_type == D_JFREEFRAG) add_to_journal(freefrag->ff_jdep); return (aip); } /* * Called just before setting an indirect block pointer * to a newly allocated file page. */ void softdep_setup_allocindir_page(ip, lbn, bp, ptrno, newblkno, oldblkno, nbp) struct inode *ip; /* inode for file being extended */ ufs_lbn_t lbn; /* allocated block number within file */ struct buf *bp; /* buffer with indirect blk referencing page */ int ptrno; /* offset of pointer in indirect block */ ufs2_daddr_t newblkno; /* disk block number being added */ ufs2_daddr_t oldblkno; /* previous block number, 0 if none */ struct buf *nbp; /* buffer holding allocated page */ { struct inodedep *inodedep; struct freefrag *freefrag; struct allocindir *aip; struct pagedep *pagedep; struct mount *mp; struct ufsmount *ump; mp = ITOVFS(ip); ump = VFSTOUFS(mp); KASSERT(MOUNTEDSOFTDEP(mp) != 0, ("softdep_setup_allocindir_page called on non-softdep filesystem")); KASSERT(lbn == nbp->b_lblkno, ("softdep_setup_allocindir_page: lbn %jd != lblkno %jd", lbn, bp->b_lblkno)); CTR4(KTR_SUJ, "softdep_setup_allocindir_page: ino %d blkno %jd oldblkno %jd " "lbn %jd", ip->i_number, newblkno, oldblkno, lbn); ASSERT_VOP_LOCKED(ITOV(ip), "softdep_setup_allocindir_page"); aip = newallocindir(ip, ptrno, newblkno, oldblkno, lbn); (void) inodedep_lookup(mp, ip->i_number, DEPALLOC, &inodedep); /* * If we are allocating a directory page, then we must * allocate an associated pagedep to track additions and * deletions. 
*/ if ((ip->i_mode & IFMT) == IFDIR) pagedep_lookup(mp, nbp, ip->i_number, lbn, DEPALLOC, &pagedep); WORKLIST_INSERT(&nbp->b_dep, &aip->ai_block.nb_list); freefrag = setup_allocindir_phase2(bp, ip, inodedep, aip, lbn); FREE_LOCK(ump); if (freefrag) handle_workitem_freefrag(freefrag); } /* * Called just before setting an indirect block pointer to a * newly allocated indirect block. */ void softdep_setup_allocindir_meta(nbp, ip, bp, ptrno, newblkno) struct buf *nbp; /* newly allocated indirect block */ struct inode *ip; /* inode for file being extended */ struct buf *bp; /* indirect block referencing allocated block */ int ptrno; /* offset of pointer in indirect block */ ufs2_daddr_t newblkno; /* disk block number being added */ { struct inodedep *inodedep; struct allocindir *aip; struct ufsmount *ump; ufs_lbn_t lbn; ump = ITOUMP(ip); KASSERT(MOUNTEDSOFTDEP(UFSTOVFS(ump)) != 0, ("softdep_setup_allocindir_meta called on non-softdep filesystem")); CTR3(KTR_SUJ, "softdep_setup_allocindir_meta: ino %d blkno %jd ptrno %d", ip->i_number, newblkno, ptrno); lbn = nbp->b_lblkno; ASSERT_VOP_LOCKED(ITOV(ip), "softdep_setup_allocindir_meta"); aip = newallocindir(ip, ptrno, newblkno, 0, lbn); inodedep_lookup(UFSTOVFS(ump), ip->i_number, DEPALLOC, &inodedep); WORKLIST_INSERT(&nbp->b_dep, &aip->ai_block.nb_list); if (setup_allocindir_phase2(bp, ip, inodedep, aip, lbn)) panic("softdep_setup_allocindir_meta: Block already existed"); FREE_LOCK(ump); } static void indirdep_complete(indirdep) struct indirdep *indirdep; { struct allocindir *aip; LIST_REMOVE(indirdep, ir_next); indirdep->ir_state |= DEPCOMPLETE; while ((aip = LIST_FIRST(&indirdep->ir_completehd)) != NULL) { LIST_REMOVE(aip, ai_next); free_newblk(&aip->ai_block); } /* * If this indirdep is not attached to a buf it was simply waiting * on completion to clear completehd. free_indirdep() asserts * that nothing is dangling. */ if ((indirdep->ir_state & ONWORKLIST) == 0) free_indirdep(indirdep); } static struct indirdep * indirdep_lookup(mp, ip, bp) struct mount *mp; struct inode *ip; struct buf *bp; { struct indirdep *indirdep, *newindirdep; struct newblk *newblk; struct ufsmount *ump; struct worklist *wk; struct fs *fs; ufs2_daddr_t blkno; ump = VFSTOUFS(mp); LOCK_OWNED(ump); indirdep = NULL; newindirdep = NULL; fs = ump->um_fs; for (;;) { LIST_FOREACH(wk, &bp->b_dep, wk_list) { if (wk->wk_type != D_INDIRDEP) continue; indirdep = WK_INDIRDEP(wk); break; } /* Found on the buffer worklist, no new structure to free. */ if (indirdep != NULL && newindirdep == NULL) return (indirdep); if (indirdep != NULL && newindirdep != NULL) panic("indirdep_lookup: simultaneous create"); /* None found on the buffer and a new structure is ready. */ if (indirdep == NULL && newindirdep != NULL) break; /* None found and no new structure available. 
*/ FREE_LOCK(ump); newindirdep = malloc(sizeof(struct indirdep), M_INDIRDEP, M_SOFTDEP_FLAGS); workitem_alloc(&newindirdep->ir_list, D_INDIRDEP, mp); newindirdep->ir_state = ATTACHED; if (I_IS_UFS1(ip)) newindirdep->ir_state |= UFS1FMT; TAILQ_INIT(&newindirdep->ir_trunc); newindirdep->ir_saveddata = NULL; LIST_INIT(&newindirdep->ir_deplisthd); LIST_INIT(&newindirdep->ir_donehd); LIST_INIT(&newindirdep->ir_writehd); LIST_INIT(&newindirdep->ir_completehd); if (bp->b_blkno == bp->b_lblkno) { ufs_bmaparray(bp->b_vp, bp->b_lblkno, &blkno, bp, NULL, NULL); bp->b_blkno = blkno; } newindirdep->ir_freeblks = NULL; newindirdep->ir_savebp = getblk(ump->um_devvp, bp->b_blkno, bp->b_bcount, 0, 0, 0); newindirdep->ir_bp = bp; BUF_KERNPROC(newindirdep->ir_savebp); bcopy(bp->b_data, newindirdep->ir_savebp->b_data, bp->b_bcount); ACQUIRE_LOCK(ump); } indirdep = newindirdep; WORKLIST_INSERT(&bp->b_dep, &indirdep->ir_list); /* * If the block is not yet allocated we don't set DEPCOMPLETE so * that we don't free dependencies until the pointers are valid. * This could search b_dep for D_ALLOCDIRECT/D_ALLOCINDIR rather * than using the hash. */ if (newblk_lookup(mp, dbtofsb(fs, bp->b_blkno), 0, &newblk)) LIST_INSERT_HEAD(&newblk->nb_indirdeps, indirdep, ir_next); else indirdep->ir_state |= DEPCOMPLETE; return (indirdep); } /* * Called to finish the allocation of the "aip" allocated * by one of the two routines above. */ static struct freefrag * setup_allocindir_phase2(bp, ip, inodedep, aip, lbn) struct buf *bp; /* in-memory copy of the indirect block */ struct inode *ip; /* inode for file being extended */ struct inodedep *inodedep; /* Inodedep for ip */ struct allocindir *aip; /* allocindir allocated by the above routines */ ufs_lbn_t lbn; /* Logical block number for this block. */ { struct fs *fs; struct indirdep *indirdep; struct allocindir *oldaip; struct freefrag *freefrag; struct mount *mp; struct ufsmount *ump; mp = ITOVFS(ip); ump = VFSTOUFS(mp); LOCK_OWNED(ump); fs = ump->um_fs; if (bp->b_lblkno >= 0) panic("setup_allocindir_phase2: not indir blk"); KASSERT(aip->ai_offset >= 0 && aip->ai_offset < NINDIR(fs), ("setup_allocindir_phase2: Bad offset %d", aip->ai_offset)); indirdep = indirdep_lookup(mp, ip, bp); KASSERT(indirdep->ir_savebp != NULL, ("setup_allocindir_phase2 NULL ir_savebp")); aip->ai_indirdep = indirdep; /* * Check for an unwritten dependency for this indirect offset. If * there is, merge the old dependency into the new one. This happens * as a result of reallocblk only. */ freefrag = NULL; if (aip->ai_oldblkno != 0) { LIST_FOREACH(oldaip, &indirdep->ir_deplisthd, ai_next) { if (oldaip->ai_offset == aip->ai_offset) { freefrag = allocindir_merge(aip, oldaip); goto done; } } LIST_FOREACH(oldaip, &indirdep->ir_donehd, ai_next) { if (oldaip->ai_offset == aip->ai_offset) { freefrag = allocindir_merge(aip, oldaip); goto done; } } } done: LIST_INSERT_HEAD(&indirdep->ir_deplisthd, aip, ai_next); return (freefrag); } /* * Merge two allocindirs which refer to the same block. Move newblock * dependencies and setup the freefrags appropriately. 
*/ static struct freefrag * allocindir_merge(aip, oldaip) struct allocindir *aip; struct allocindir *oldaip; { struct freefrag *freefrag; struct worklist *wk; if (oldaip->ai_newblkno != aip->ai_oldblkno) panic("allocindir_merge: blkno"); aip->ai_oldblkno = oldaip->ai_oldblkno; freefrag = aip->ai_freefrag; aip->ai_freefrag = oldaip->ai_freefrag; oldaip->ai_freefrag = NULL; KASSERT(freefrag != NULL, ("setup_allocindir_phase2: No freefrag")); /* * If we are tracking a new directory-block allocation, * move it from the old allocindir to the new allocindir. */ if ((wk = LIST_FIRST(&oldaip->ai_newdirblk)) != NULL) { WORKLIST_REMOVE(wk); if (!LIST_EMPTY(&oldaip->ai_newdirblk)) panic("allocindir_merge: extra newdirblk"); WORKLIST_INSERT(&aip->ai_newdirblk, wk); } /* * We can skip journaling for this freefrag and just complete * any pending journal work for the allocindir that is being * removed after the freefrag completes. */ if (freefrag->ff_jdep) cancel_jfreefrag(WK_JFREEFRAG(freefrag->ff_jdep)); LIST_REMOVE(oldaip, ai_next); freefrag->ff_jdep = (struct worklist *)cancel_newblk(&oldaip->ai_block, &freefrag->ff_list, &freefrag->ff_jwork); free_newblk(&oldaip->ai_block); return (freefrag); } static inline void setup_freedirect(freeblks, ip, i, needj) struct freeblks *freeblks; struct inode *ip; int i; int needj; { struct ufsmount *ump; ufs2_daddr_t blkno; int frags; blkno = DIP(ip, i_db[i]); if (blkno == 0) return; DIP_SET(ip, i_db[i], 0); ump = ITOUMP(ip); frags = sblksize(ump->um_fs, ip->i_size, i); frags = numfrags(ump->um_fs, frags); newfreework(ump, freeblks, NULL, i, blkno, frags, 0, needj); } static inline void setup_freeext(freeblks, ip, i, needj) struct freeblks *freeblks; struct inode *ip; int i; int needj; { struct ufsmount *ump; ufs2_daddr_t blkno; int frags; blkno = ip->i_din2->di_extb[i]; if (blkno == 0) return; ip->i_din2->di_extb[i] = 0; ump = ITOUMP(ip); frags = sblksize(ump->um_fs, ip->i_din2->di_extsize, i); frags = numfrags(ump->um_fs, frags); newfreework(ump, freeblks, NULL, -1 - i, blkno, frags, 0, needj); } static inline void setup_freeindir(freeblks, ip, i, lbn, needj) struct freeblks *freeblks; struct inode *ip; int i; ufs_lbn_t lbn; int needj; { struct ufsmount *ump; ufs2_daddr_t blkno; blkno = DIP(ip, i_ib[i]); if (blkno == 0) return; DIP_SET(ip, i_ib[i], 0); ump = ITOUMP(ip); newfreework(ump, freeblks, NULL, lbn, blkno, ump->um_fs->fs_frag, 0, needj); } static inline struct freeblks * newfreeblks(mp, ip) struct mount *mp; struct inode *ip; { struct freeblks *freeblks; freeblks = malloc(sizeof(struct freeblks), M_FREEBLKS, M_SOFTDEP_FLAGS|M_ZERO); workitem_alloc(&freeblks->fb_list, D_FREEBLKS, mp); LIST_INIT(&freeblks->fb_jblkdephd); LIST_INIT(&freeblks->fb_jwork); freeblks->fb_ref = 0; freeblks->fb_cgwait = 0; freeblks->fb_state = ATTACHED; freeblks->fb_uid = ip->i_uid; freeblks->fb_inum = ip->i_number; freeblks->fb_vtype = ITOV(ip)->v_type; freeblks->fb_modrev = DIP(ip, i_modrev); freeblks->fb_devvp = ITODEVVP(ip); freeblks->fb_chkcnt = 0; freeblks->fb_len = 0; return (freeblks); } static void trunc_indirdep(indirdep, freeblks, bp, off) struct indirdep *indirdep; struct freeblks *freeblks; struct buf *bp; int off; { struct allocindir *aip, *aipn; /* * The first set of allocindirs won't be in savedbp. 
*/ LIST_FOREACH_SAFE(aip, &indirdep->ir_deplisthd, ai_next, aipn) if (aip->ai_offset > off) cancel_allocindir(aip, bp, freeblks, 1); LIST_FOREACH_SAFE(aip, &indirdep->ir_donehd, ai_next, aipn) if (aip->ai_offset > off) cancel_allocindir(aip, bp, freeblks, 1); /* * These will exist in savedbp. */ LIST_FOREACH_SAFE(aip, &indirdep->ir_writehd, ai_next, aipn) if (aip->ai_offset > off) cancel_allocindir(aip, NULL, freeblks, 0); LIST_FOREACH_SAFE(aip, &indirdep->ir_completehd, ai_next, aipn) if (aip->ai_offset > off) cancel_allocindir(aip, NULL, freeblks, 0); } /* * Follow the chain of indirects down to lastlbn creating a freework * structure for each. This will be used to start indir_trunc() at * the right offset and create the journal records for the parrtial * truncation. A second step will handle the truncated dependencies. */ static int setup_trunc_indir(freeblks, ip, lbn, lastlbn, blkno) struct freeblks *freeblks; struct inode *ip; ufs_lbn_t lbn; ufs_lbn_t lastlbn; ufs2_daddr_t blkno; { struct indirdep *indirdep; struct indirdep *indirn; struct freework *freework; struct newblk *newblk; struct mount *mp; struct ufsmount *ump; struct buf *bp; uint8_t *start; uint8_t *end; ufs_lbn_t lbnadd; int level; int error; int off; freework = NULL; if (blkno == 0) return (0); mp = freeblks->fb_list.wk_mp; ump = VFSTOUFS(mp); bp = getblk(ITOV(ip), lbn, mp->mnt_stat.f_iosize, 0, 0, 0); if ((bp->b_flags & B_CACHE) == 0) { bp->b_blkno = blkptrtodb(VFSTOUFS(mp), blkno); bp->b_iocmd = BIO_READ; bp->b_flags &= ~B_INVAL; bp->b_ioflags &= ~BIO_ERROR; vfs_busy_pages(bp, 0); bp->b_iooffset = dbtob(bp->b_blkno); bstrategy(bp); #ifdef RACCT if (racct_enable) { PROC_LOCK(curproc); racct_add_buf(curproc, bp, 0); PROC_UNLOCK(curproc); } #endif /* RACCT */ curthread->td_ru.ru_inblock++; error = bufwait(bp); if (error) { brelse(bp); return (error); } } level = lbn_level(lbn); lbnadd = lbn_offset(ump->um_fs, level); /* * Compute the offset of the last block we want to keep. Store * in the freework the first block we want to completely free. */ off = (lastlbn - -(lbn + level)) / lbnadd; if (off + 1 == NINDIR(ump->um_fs)) goto nowork; freework = newfreework(ump, freeblks, NULL, lbn, blkno, 0, off + 1, 0); /* * Link the freework into the indirdep. This will prevent any new * allocations from proceeding until we are finished with the * truncate and the block is written. */ ACQUIRE_LOCK(ump); indirdep = indirdep_lookup(mp, ip, bp); if (indirdep->ir_freeblks) panic("setup_trunc_indir: indirdep already truncated."); TAILQ_INSERT_TAIL(&indirdep->ir_trunc, freework, fw_next); freework->fw_indir = indirdep; /* * Cancel any allocindirs that will not make it to disk. * We have to do this for all copies of the indirdep that * live on this newblk. */ if ((indirdep->ir_state & DEPCOMPLETE) == 0) { newblk_lookup(mp, dbtofsb(ump->um_fs, bp->b_blkno), 0, &newblk); LIST_FOREACH(indirn, &newblk->nb_indirdeps, ir_next) trunc_indirdep(indirn, freeblks, bp, off); } else trunc_indirdep(indirdep, freeblks, bp, off); FREE_LOCK(ump); /* * Creation is protected by the buf lock. The saveddata is only * needed if a full truncation follows a partial truncation but it * is difficult to allocate in that case so we fetch it anyway. */ if (indirdep->ir_saveddata == NULL) indirdep->ir_saveddata = malloc(bp->b_bcount, M_INDIRDEP, M_SOFTDEP_FLAGS); nowork: /* Fetch the blkno of the child and the zero start offset. 
*/ if (I_IS_UFS1(ip)) { blkno = ((ufs1_daddr_t *)bp->b_data)[off]; start = (uint8_t *)&((ufs1_daddr_t *)bp->b_data)[off+1]; } else { blkno = ((ufs2_daddr_t *)bp->b_data)[off]; start = (uint8_t *)&((ufs2_daddr_t *)bp->b_data)[off+1]; } if (freework) { /* Zero the truncated pointers. */ end = bp->b_data + bp->b_bcount; bzero(start, end - start); bdwrite(bp); } else bqrelse(bp); if (level == 0) return (0); lbn++; /* adjust level */ lbn -= (off * lbnadd); return setup_trunc_indir(freeblks, ip, lbn, lastlbn, blkno); } /* * Complete the partial truncation of an indirect block setup by * setup_trunc_indir(). This zeros the truncated pointers in the saved * copy and writes them to disk before the freeblks is allowed to complete. */ static void complete_trunc_indir(freework) struct freework *freework; { struct freework *fwn; struct indirdep *indirdep; struct ufsmount *ump; struct buf *bp; uintptr_t start; int count; ump = VFSTOUFS(freework->fw_list.wk_mp); LOCK_OWNED(ump); indirdep = freework->fw_indir; for (;;) { bp = indirdep->ir_bp; /* See if the block was discarded. */ if (bp == NULL) break; /* Inline part of getdirtybuf(). We dont want bremfree. */ if (BUF_LOCK(bp, LK_EXCLUSIVE | LK_NOWAIT, NULL) == 0) break; if (BUF_LOCK(bp, LK_EXCLUSIVE | LK_SLEEPFAIL | LK_INTERLOCK, LOCK_PTR(ump)) == 0) BUF_UNLOCK(bp); ACQUIRE_LOCK(ump); } freework->fw_state |= DEPCOMPLETE; TAILQ_REMOVE(&indirdep->ir_trunc, freework, fw_next); /* * Zero the pointers in the saved copy. */ if (indirdep->ir_state & UFS1FMT) start = sizeof(ufs1_daddr_t); else start = sizeof(ufs2_daddr_t); start *= freework->fw_start; count = indirdep->ir_savebp->b_bcount - start; start += (uintptr_t)indirdep->ir_savebp->b_data; bzero((char *)start, count); /* * We need to start the next truncation in the list if it has not * been started yet. */ fwn = TAILQ_FIRST(&indirdep->ir_trunc); if (fwn != NULL) { if (fwn->fw_freeblks == indirdep->ir_freeblks) TAILQ_REMOVE(&indirdep->ir_trunc, fwn, fw_next); if ((fwn->fw_state & ONWORKLIST) == 0) freework_enqueue(fwn); } /* * If bp is NULL the block was fully truncated, restore * the saved block list otherwise free it if it is no * longer needed. */ if (TAILQ_EMPTY(&indirdep->ir_trunc)) { if (bp == NULL) bcopy(indirdep->ir_saveddata, indirdep->ir_savebp->b_data, indirdep->ir_savebp->b_bcount); free(indirdep->ir_saveddata, M_INDIRDEP); indirdep->ir_saveddata = NULL; } /* * When bp is NULL there is a full truncation pending. We * must wait for this full truncation to be journaled before * we can release this freework because the disk pointers will * never be written as zero. */ if (bp == NULL) { if (LIST_EMPTY(&indirdep->ir_freeblks->fb_jblkdephd)) handle_written_freework(freework); else WORKLIST_INSERT(&indirdep->ir_freeblks->fb_freeworkhd, &freework->fw_list); } else { /* Complete when the real copy is written. */ WORKLIST_INSERT(&bp->b_dep, &freework->fw_list); BUF_UNLOCK(bp); } } /* * Calculate the number of blocks we are going to release where datablocks * is the current total and length is the new file size. */ static ufs2_daddr_t blkcount(fs, datablocks, length) struct fs *fs; ufs2_daddr_t datablocks; off_t length; { off_t totblks, numblks; totblks = 0; numblks = howmany(length, fs->fs_bsize); if (numblks <= UFS_NDADDR) { totblks = howmany(length, fs->fs_fsize); goto out; } totblks = blkstofrags(fs, numblks); numblks -= UFS_NDADDR; /* * Count all single, then double, then triple indirects required. 
* Subtracting one indirects worth of blocks for each pass * acknowledges one of each pointed to by the inode. */ for (;;) { totblks += blkstofrags(fs, howmany(numblks, NINDIR(fs))); numblks -= NINDIR(fs); if (numblks <= 0) break; numblks = howmany(numblks, NINDIR(fs)); } out: totblks = fsbtodb(fs, totblks); /* * Handle sparse files. We can't reclaim more blocks than the inode * references. We will correct it later in handle_complete_freeblks() * when we know the real count. */ if (totblks > datablocks) return (0); return (datablocks - totblks); } /* * Handle freeblocks for journaled softupdate filesystems. * * Contrary to normal softupdates, we must preserve the block pointers in * indirects until their subordinates are free. This is to avoid journaling * every block that is freed which may consume more space than the journal * itself. The recovery program will see the free block journals at the * base of the truncated area and traverse them to reclaim space. The * pointers in the inode may be cleared immediately after the journal * records are written because each direct and indirect pointer in the * inode is recorded in a journal. This permits full truncation to proceed * asynchronously. The write order is journal -> inode -> cgs -> indirects. * * The algorithm is as follows: * 1) Traverse the in-memory state and create journal entries to release * the relevant blocks and full indirect trees. * 2) Traverse the indirect block chain adding partial truncation freework * records to indirects in the path to lastlbn. The freework will * prevent new allocation dependencies from being satisfied in this * indirect until the truncation completes. * 3) Read and lock the inode block, performing an update with the new size * and pointers. This prevents truncated data from becoming valid on * disk through step 4. * 4) Reap unsatisfied dependencies that are beyond the truncated area, * eliminate journal work for those records that do not require it. * 5) Schedule the journal records to be written followed by the inode block. * 6) Allocate any necessary frags for the end of file. * 7) Zero any partially truncated blocks. * * From this truncation proceeds asynchronously using the freework and * indir_trunc machinery. The file will not be extended again into a * partially truncated indirect block until all work is completed but * the normal dependency mechanism ensures that it is rolled back/forward * as appropriate. Further truncation may occur without delay and is * serialized in indir_trunc(). */ void softdep_journal_freeblocks(ip, cred, length, flags) struct inode *ip; /* The inode whose length is to be reduced */ struct ucred *cred; off_t length; /* The new length for the file */ int flags; /* IO_EXT and/or IO_NORMAL */ { struct freeblks *freeblks, *fbn; struct worklist *wk, *wkn; struct inodedep *inodedep; struct jblkdep *jblkdep; struct allocdirect *adp, *adpn; struct ufsmount *ump; struct fs *fs; struct buf *bp; struct vnode *vp; struct mount *mp; ufs2_daddr_t extblocks, datablocks; ufs_lbn_t tmpval, lbn, lastlbn; int frags, lastoff, iboff, allocblock, needj, error, i; ump = ITOUMP(ip); mp = UFSTOVFS(ump); fs = ump->um_fs; KASSERT(MOUNTEDSOFTDEP(mp) != 0, ("softdep_journal_freeblocks called on non-softdep filesystem")); vp = ITOV(ip); needj = 1; iboff = -1; allocblock = 0; extblocks = 0; datablocks = 0; frags = 0; freeblks = newfreeblks(mp, ip); ACQUIRE_LOCK(ump); /* * If we're truncating a removed file that will never be written * we don't need to journal the block frees. 
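 * (This is the needj == 0 case computed below: the inodedep is
 * UNLINKED but not DEPCOMPLETE and the new length is zero.)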
The canceled journals * for the allocations will suffice. */ inodedep_lookup(mp, ip->i_number, DEPALLOC, &inodedep); if ((inodedep->id_state & (UNLINKED | DEPCOMPLETE)) == UNLINKED && length == 0) needj = 0; CTR3(KTR_SUJ, "softdep_journal_freeblks: ip %d length %ld needj %d", ip->i_number, length, needj); FREE_LOCK(ump); /* * Calculate the lbn that we are truncating to. This results in -1 * if we're truncating the 0 bytes. So it is the last lbn we want * to keep, not the first lbn we want to truncate. */ lastlbn = lblkno(fs, length + fs->fs_bsize - 1) - 1; lastoff = blkoff(fs, length); /* * Compute frags we are keeping in lastlbn. 0 means all. */ if (lastlbn >= 0 && lastlbn < UFS_NDADDR) { frags = fragroundup(fs, lastoff); /* adp offset of last valid allocdirect. */ iboff = lastlbn; } else if (lastlbn > 0) iboff = UFS_NDADDR; if (fs->fs_magic == FS_UFS2_MAGIC) extblocks = btodb(fragroundup(fs, ip->i_din2->di_extsize)); /* * Handle normal data blocks and indirects. This section saves * values used after the inode update to complete frag and indirect * truncation. */ if ((flags & IO_NORMAL) != 0) { /* * Handle truncation of whole direct and indirect blocks. */ for (i = iboff + 1; i < UFS_NDADDR; i++) setup_freedirect(freeblks, ip, i, needj); for (i = 0, tmpval = NINDIR(fs), lbn = UFS_NDADDR; i < UFS_NIADDR; i++, lbn += tmpval, tmpval *= NINDIR(fs)) { /* Release a whole indirect tree. */ if (lbn > lastlbn) { setup_freeindir(freeblks, ip, i, -lbn -i, needj); continue; } iboff = i + UFS_NDADDR; /* * Traverse partially truncated indirect tree. */ if (lbn <= lastlbn && lbn + tmpval - 1 > lastlbn) setup_trunc_indir(freeblks, ip, -lbn - i, lastlbn, DIP(ip, i_ib[i])); } /* * Handle partial truncation to a frag boundary. */ if (frags) { ufs2_daddr_t blkno; long oldfrags; oldfrags = blksize(fs, ip, lastlbn); blkno = DIP(ip, i_db[lastlbn]); if (blkno && oldfrags != frags) { oldfrags -= frags; oldfrags = numfrags(fs, oldfrags); blkno += numfrags(fs, frags); newfreework(ump, freeblks, NULL, lastlbn, blkno, oldfrags, 0, needj); if (needj) adjust_newfreework(freeblks, numfrags(fs, frags)); } else if (blkno == 0) allocblock = 1; } /* * Add a journal record for partial truncate if we are * handling indirect blocks. Non-indirects need no extra * journaling. */ if (length != 0 && lastlbn >= UFS_NDADDR) { ip->i_flag |= IN_TRUNCATED; newjtrunc(freeblks, length, 0); } ip->i_size = length; DIP_SET(ip, i_size, ip->i_size); datablocks = DIP(ip, i_blocks) - extblocks; if (length != 0) datablocks = blkcount(fs, datablocks, length); freeblks->fb_len = length; } if ((flags & IO_EXT) != 0) { for (i = 0; i < UFS_NXADDR; i++) setup_freeext(freeblks, ip, i, needj); ip->i_din2->di_extsize = 0; datablocks += extblocks; } #ifdef QUOTA /* Reference the quotas in case the block count is wrong in the end. */ quotaref(vp, freeblks->fb_quota); (void) chkdq(ip, -datablocks, NOCRED, 0); #endif freeblks->fb_chkcnt = -datablocks; UFS_LOCK(ump); fs->fs_pendingblocks += datablocks; UFS_UNLOCK(ump); DIP_SET(ip, i_blocks, DIP(ip, i_blocks) - datablocks); /* * Handle truncation of incomplete alloc direct dependencies. We * hold the inode block locked to prevent incomplete dependencies * from reaching the disk while we are eliminating those that * have been truncated. This is a partially inlined ffs_update(). 
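 * The code below reads the inode's disk buffer, copies the in-core
 * dinode into it, and only schedules the delayed write with bdwrite()
 * after the stale dependencies have been cancelled.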
*/ ufs_itimes(vp); ip->i_flag &= ~(IN_LAZYACCESS | IN_LAZYMOD | IN_MODIFIED); error = bread(ump->um_devvp, fsbtodb(fs, ino_to_fsba(fs, ip->i_number)), (int)fs->fs_bsize, cred, &bp); if (error) { brelse(bp); softdep_error("softdep_journal_freeblocks", error); return; } if (bp->b_bufsize == fs->fs_bsize) bp->b_flags |= B_CLUSTEROK; softdep_update_inodeblock(ip, bp, 0); if (ump->um_fstype == UFS1) *((struct ufs1_dinode *)bp->b_data + ino_to_fsbo(fs, ip->i_number)) = *ip->i_din1; else *((struct ufs2_dinode *)bp->b_data + ino_to_fsbo(fs, ip->i_number)) = *ip->i_din2; ACQUIRE_LOCK(ump); (void) inodedep_lookup(mp, ip->i_number, DEPALLOC, &inodedep); if ((inodedep->id_state & IOSTARTED) != 0) panic("softdep_setup_freeblocks: inode busy"); /* * Add the freeblks structure to the list of operations that * must await the zero'ed inode being written to disk. If we * still have a bitmap dependency (needj), then the inode * has never been written to disk, so we can process the * freeblks below once we have deleted the dependencies. */ if (needj) WORKLIST_INSERT(&bp->b_dep, &freeblks->fb_list); else freeblks->fb_state |= COMPLETE; if ((flags & IO_NORMAL) != 0) { TAILQ_FOREACH_SAFE(adp, &inodedep->id_inoupdt, ad_next, adpn) { if (adp->ad_offset > iboff) cancel_allocdirect(&inodedep->id_inoupdt, adp, freeblks); /* * Truncate the allocdirect. We could eliminate * or modify journal records as well. */ else if (adp->ad_offset == iboff && frags) adp->ad_newsize = frags; } } if ((flags & IO_EXT) != 0) while ((adp = TAILQ_FIRST(&inodedep->id_extupdt)) != NULL) cancel_allocdirect(&inodedep->id_extupdt, adp, freeblks); /* * Scan the bufwait list for newblock dependencies that will never * make it to disk. */ LIST_FOREACH_SAFE(wk, &inodedep->id_bufwait, wk_list, wkn) { if (wk->wk_type != D_ALLOCDIRECT) continue; adp = WK_ALLOCDIRECT(wk); if (((flags & IO_NORMAL) != 0 && (adp->ad_offset > iboff)) || ((flags & IO_EXT) != 0 && (adp->ad_state & EXTDATA))) { cancel_jfreeblk(freeblks, adp->ad_newblkno); cancel_newblk(WK_NEWBLK(wk), NULL, &freeblks->fb_jwork); WORKLIST_INSERT(&freeblks->fb_freeworkhd, wk); } } /* * Add journal work. */ LIST_FOREACH(jblkdep, &freeblks->fb_jblkdephd, jb_deps) add_to_journal(&jblkdep->jb_list); FREE_LOCK(ump); bdwrite(bp); /* * Truncate dependency structures beyond length. */ trunc_dependencies(ip, freeblks, lastlbn, frags, flags); /* * This is only set when we need to allocate a fragment because * none existed at the end of a frag-sized file. It handles only * allocating a new, zero filled block. */ if (allocblock) { ip->i_size = length - lastoff; DIP_SET(ip, i_size, ip->i_size); error = UFS_BALLOC(vp, length - 1, 1, cred, BA_CLRBUF, &bp); if (error != 0) { softdep_error("softdep_journal_freeblks", error); return; } ip->i_size = length; DIP_SET(ip, i_size, length); ip->i_flag |= IN_CHANGE | IN_UPDATE; allocbuf(bp, frags); ffs_update(vp, 0); bawrite(bp); } else if (lastoff != 0 && vp->v_type != VDIR) { int size; /* * Zero the end of a truncated frag or block. */ size = sblksize(fs, length, lastlbn); error = bread(vp, lastlbn, size, cred, &bp); if (error) { softdep_error("softdep_journal_freeblks", error); return; } bzero((char *)bp->b_data + lastoff, size - lastoff); bawrite(bp); } ACQUIRE_LOCK(ump); inodedep_lookup(mp, ip->i_number, DEPALLOC, &inodedep); TAILQ_INSERT_TAIL(&inodedep->id_freeblklst, freeblks, fb_next); freeblks->fb_state |= DEPCOMPLETE | ONDEPLIST; /* * We zero earlier truncations so they don't erroneously * update i_blocks. 
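 * This applies only when the current truncation is to length zero;
 * the loop below clears fb_len on every freeblks already queued on
 * the inodedep's id_freeblklst.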
*/ if (freeblks->fb_len == 0 && (flags & IO_NORMAL) != 0) TAILQ_FOREACH(fbn, &inodedep->id_freeblklst, fb_next) fbn->fb_len = 0; if ((freeblks->fb_state & ALLCOMPLETE) == ALLCOMPLETE && LIST_EMPTY(&freeblks->fb_jblkdephd)) freeblks->fb_state |= INPROGRESS; else freeblks = NULL; FREE_LOCK(ump); if (freeblks) handle_workitem_freeblocks(freeblks, 0); trunc_pages(ip, length, extblocks, flags); } /* * Flush a JOP_SYNC to the journal. */ void softdep_journal_fsync(ip) struct inode *ip; { struct jfsync *jfsync; struct ufsmount *ump; ump = ITOUMP(ip); KASSERT(MOUNTEDSOFTDEP(UFSTOVFS(ump)) != 0, ("softdep_journal_fsync called on non-softdep filesystem")); if ((ip->i_flag & IN_TRUNCATED) == 0) return; ip->i_flag &= ~IN_TRUNCATED; jfsync = malloc(sizeof(*jfsync), M_JFSYNC, M_SOFTDEP_FLAGS | M_ZERO); workitem_alloc(&jfsync->jfs_list, D_JFSYNC, UFSTOVFS(ump)); jfsync->jfs_size = ip->i_size; jfsync->jfs_ino = ip->i_number; ACQUIRE_LOCK(ump); add_to_journal(&jfsync->jfs_list); jwait(&jfsync->jfs_list, MNT_WAIT); FREE_LOCK(ump); } /* * Block de-allocation dependencies. * * When blocks are de-allocated, the on-disk pointers must be nullified before * the blocks are made available for use by other files. (The true * requirement is that old pointers must be nullified before new on-disk * pointers are set. We chose this slightly more stringent requirement to * reduce complexity.) Our implementation handles this dependency by updating * the inode (or indirect block) appropriately but delaying the actual block * de-allocation (i.e., freemap and free space count manipulation) until * after the updated versions reach stable storage. After the disk is * updated, the blocks can be safely de-allocated whenever it is convenient. * This implementation handles only the common case of reducing a file's * length to zero. Other cases are handled by the conventional synchronous * write approach. * * The ffs implementation with which we worked double-checks * the state of the block pointers and file size as it reduces * a file's length. Some of this code is replicated here in our * soft updates implementation. The freeblks->fb_chkcnt field is * used to transfer a part of this information to the procedure * that eventually de-allocates the blocks. * * This routine should be called from the routine that shortens * a file's length, before the inode's size or block pointers * are modified. It will save the block pointer information for * later release and zero the inode so that the calling routine * can release it. 
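 *
 * The KASSERT below enforces that the length passed in is zero; the
 * journaled variant above, softdep_journal_freeblocks(), is the entry
 * point that handles partial truncations.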
*/ void softdep_setup_freeblocks(ip, length, flags) struct inode *ip; /* The inode whose length is to be reduced */ off_t length; /* The new length for the file */ int flags; /* IO_EXT and/or IO_NORMAL */ { struct ufs1_dinode *dp1; struct ufs2_dinode *dp2; struct freeblks *freeblks; struct inodedep *inodedep; struct allocdirect *adp; struct ufsmount *ump; struct buf *bp; struct fs *fs; ufs2_daddr_t extblocks, datablocks; struct mount *mp; int i, delay, error; ufs_lbn_t tmpval; ufs_lbn_t lbn; ump = ITOUMP(ip); mp = UFSTOVFS(ump); KASSERT(MOUNTEDSOFTDEP(mp) != 0, ("softdep_setup_freeblocks called on non-softdep filesystem")); CTR2(KTR_SUJ, "softdep_setup_freeblks: ip %d length %ld", ip->i_number, length); KASSERT(length == 0, ("softdep_setup_freeblocks: non-zero length")); fs = ump->um_fs; if ((error = bread(ump->um_devvp, fsbtodb(fs, ino_to_fsba(fs, ip->i_number)), (int)fs->fs_bsize, NOCRED, &bp)) != 0) { brelse(bp); softdep_error("softdep_setup_freeblocks", error); return; } freeblks = newfreeblks(mp, ip); extblocks = 0; datablocks = 0; if (fs->fs_magic == FS_UFS2_MAGIC) extblocks = btodb(fragroundup(fs, ip->i_din2->di_extsize)); if ((flags & IO_NORMAL) != 0) { for (i = 0; i < UFS_NDADDR; i++) setup_freedirect(freeblks, ip, i, 0); for (i = 0, tmpval = NINDIR(fs), lbn = UFS_NDADDR; i < UFS_NIADDR; i++, lbn += tmpval, tmpval *= NINDIR(fs)) setup_freeindir(freeblks, ip, i, -lbn -i, 0); ip->i_size = 0; DIP_SET(ip, i_size, 0); datablocks = DIP(ip, i_blocks) - extblocks; } if ((flags & IO_EXT) != 0) { for (i = 0; i < UFS_NXADDR; i++) setup_freeext(freeblks, ip, i, 0); ip->i_din2->di_extsize = 0; datablocks += extblocks; } #ifdef QUOTA /* Reference the quotas in case the block count is wrong in the end. */ quotaref(ITOV(ip), freeblks->fb_quota); (void) chkdq(ip, -datablocks, NOCRED, 0); #endif freeblks->fb_chkcnt = -datablocks; UFS_LOCK(ump); fs->fs_pendingblocks += datablocks; UFS_UNLOCK(ump); DIP_SET(ip, i_blocks, DIP(ip, i_blocks) - datablocks); /* * Push the zero'ed inode to to its disk buffer so that we are free * to delete its dependencies below. Once the dependencies are gone * the buffer can be safely released. */ if (ump->um_fstype == UFS1) { dp1 = ((struct ufs1_dinode *)bp->b_data + ino_to_fsbo(fs, ip->i_number)); ip->i_din1->di_freelink = dp1->di_freelink; *dp1 = *ip->i_din1; } else { dp2 = ((struct ufs2_dinode *)bp->b_data + ino_to_fsbo(fs, ip->i_number)); ip->i_din2->di_freelink = dp2->di_freelink; *dp2 = *ip->i_din2; } /* * Find and eliminate any inode dependencies. */ ACQUIRE_LOCK(ump); (void) inodedep_lookup(mp, ip->i_number, DEPALLOC, &inodedep); if ((inodedep->id_state & IOSTARTED) != 0) panic("softdep_setup_freeblocks: inode busy"); /* * Add the freeblks structure to the list of operations that * must await the zero'ed inode being written to disk. If we * still have a bitmap dependency (delay == 0), then the inode * has never been written to disk, so we can process the * freeblks below once we have deleted the dependencies. */ delay = (inodedep->id_state & DEPCOMPLETE); if (delay) WORKLIST_INSERT(&bp->b_dep, &freeblks->fb_list); else freeblks->fb_state |= COMPLETE; /* * Because the file length has been truncated to zero, any * pending block allocation dependency structures associated * with this inode are obsolete and can simply be de-allocated. * We must first merge the two dependency lists to get rid of * any duplicate freefrag structures, then purge the merged list. 
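 * (Below, the IO_NORMAL case merges id_newinoupdt into id_inoupdt and
 * cancels each allocdirect on the merged list; the IO_EXT case does
 * the same for id_newextupdt and id_extupdt.)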
* If we still have a bitmap dependency, then the inode has never * been written to disk, so we can free any fragments without delay. */ if (flags & IO_NORMAL) { merge_inode_lists(&inodedep->id_newinoupdt, &inodedep->id_inoupdt); while ((adp = TAILQ_FIRST(&inodedep->id_inoupdt)) != NULL) cancel_allocdirect(&inodedep->id_inoupdt, adp, freeblks); } if (flags & IO_EXT) { merge_inode_lists(&inodedep->id_newextupdt, &inodedep->id_extupdt); while ((adp = TAILQ_FIRST(&inodedep->id_extupdt)) != NULL) cancel_allocdirect(&inodedep->id_extupdt, adp, freeblks); } FREE_LOCK(ump); bdwrite(bp); trunc_dependencies(ip, freeblks, -1, 0, flags); ACQUIRE_LOCK(ump); if (inodedep_lookup(mp, ip->i_number, 0, &inodedep) != 0) (void) free_inodedep(inodedep); freeblks->fb_state |= DEPCOMPLETE; /* * If the inode with zeroed block pointers is now on disk * we can start freeing blocks. */ if ((freeblks->fb_state & ALLCOMPLETE) == ALLCOMPLETE) freeblks->fb_state |= INPROGRESS; else freeblks = NULL; FREE_LOCK(ump); if (freeblks) handle_workitem_freeblocks(freeblks, 0); trunc_pages(ip, length, extblocks, flags); } /* * Eliminate pages from the page cache that back parts of this inode and * adjust the vnode pager's idea of our size. This prevents stale data * from hanging around in the page cache. */ static void trunc_pages(ip, length, extblocks, flags) struct inode *ip; off_t length; ufs2_daddr_t extblocks; int flags; { struct vnode *vp; struct fs *fs; ufs_lbn_t lbn; off_t end, extend; vp = ITOV(ip); fs = ITOFS(ip); extend = OFF_TO_IDX(lblktosize(fs, -extblocks)); if ((flags & IO_EXT) != 0) vn_pages_remove(vp, extend, 0); if ((flags & IO_NORMAL) == 0) return; BO_LOCK(&vp->v_bufobj); drain_output(vp); BO_UNLOCK(&vp->v_bufobj); /* * The vnode pager eliminates file pages we eliminate indirects * below. */ vnode_pager_setsize(vp, length); /* * Calculate the end based on the last indirect we want to keep. If * the block extends into indirects we can just use the negative of * its lbn. Doubles and triples exist at lower numbers so we must * be careful not to remove those, if they exist. double and triple * indirect lbns do not overlap with others so it is not important * to verify how many levels are required. */ lbn = lblkno(fs, length); if (lbn >= UFS_NDADDR) { /* Calculate the virtual lbn of the triple indirect. */ lbn = -lbn - (UFS_NIADDR - 1); end = OFF_TO_IDX(lblktosize(fs, lbn)); } else end = extend; vn_pages_remove(vp, OFF_TO_IDX(OFF_MAX), end); } /* * See if the buf bp is in the range eliminated by truncation. */ static int trunc_check_buf(bp, blkoffp, lastlbn, lastoff, flags) struct buf *bp; int *blkoffp; ufs_lbn_t lastlbn; int lastoff; int flags; { ufs_lbn_t lbn; *blkoffp = 0; /* Only match ext/normal blocks as appropriate. */ if (((flags & IO_EXT) == 0 && (bp->b_xflags & BX_ALTDATA)) || ((flags & IO_NORMAL) == 0 && (bp->b_xflags & BX_ALTDATA) == 0)) return (0); /* ALTDATA is always a full truncation. */ if ((bp->b_xflags & BX_ALTDATA) != 0) return (1); /* -1 is full truncation. */ if (lastlbn == -1) return (1); /* * If this is a partial truncate we only want those * blocks and indirect blocks that cover the range * we're after. */ lbn = bp->b_lblkno; if (lbn < 0) lbn = -(lbn + lbn_level(lbn)); if (lbn < lastlbn) return (0); /* Here we only truncate lblkno if it's partial. 
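 * A return of 1 marks the buffer as affected by the truncation; for
 * the boundary block (lbn == lastlbn) *blkoffp is set to the offset
 * of the first byte being discarded so the caller can shrink the
 * buffer rather than invalidate it.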
*/ if (lbn == lastlbn) { if (lastoff == 0) return (0); *blkoffp = lastoff; } return (1); } /* * Eliminate any dependencies that exist in memory beyond lblkno:off */ static void trunc_dependencies(ip, freeblks, lastlbn, lastoff, flags) struct inode *ip; struct freeblks *freeblks; ufs_lbn_t lastlbn; int lastoff; int flags; { struct bufobj *bo; struct vnode *vp; struct buf *bp; int blkoff; /* * We must wait for any I/O in progress to finish so that * all potential buffers on the dirty list will be visible. * Once they are all there, walk the list and get rid of * any dependencies. */ vp = ITOV(ip); bo = &vp->v_bufobj; BO_LOCK(bo); drain_output(vp); TAILQ_FOREACH(bp, &bo->bo_dirty.bv_hd, b_bobufs) bp->b_vflags &= ~BV_SCANNED; restart: TAILQ_FOREACH(bp, &bo->bo_dirty.bv_hd, b_bobufs) { if (bp->b_vflags & BV_SCANNED) continue; if (!trunc_check_buf(bp, &blkoff, lastlbn, lastoff, flags)) { bp->b_vflags |= BV_SCANNED; continue; } KASSERT(bp->b_bufobj == bo, ("Wrong object in buffer")); if ((bp = getdirtybuf(bp, BO_LOCKPTR(bo), MNT_WAIT)) == NULL) goto restart; BO_UNLOCK(bo); if (deallocate_dependencies(bp, freeblks, blkoff)) bqrelse(bp); else brelse(bp); BO_LOCK(bo); goto restart; } /* * Now do the work of vtruncbuf while also matching indirect blocks. */ TAILQ_FOREACH(bp, &bo->bo_clean.bv_hd, b_bobufs) bp->b_vflags &= ~BV_SCANNED; cleanrestart: TAILQ_FOREACH(bp, &bo->bo_clean.bv_hd, b_bobufs) { if (bp->b_vflags & BV_SCANNED) continue; if (!trunc_check_buf(bp, &blkoff, lastlbn, lastoff, flags)) { bp->b_vflags |= BV_SCANNED; continue; } if (BUF_LOCK(bp, LK_EXCLUSIVE | LK_SLEEPFAIL | LK_INTERLOCK, BO_LOCKPTR(bo)) == ENOLCK) { BO_LOCK(bo); goto cleanrestart; } bp->b_vflags |= BV_SCANNED; bremfree(bp); if (blkoff != 0) { allocbuf(bp, blkoff); bqrelse(bp); } else { bp->b_flags |= B_INVAL | B_NOCACHE | B_RELBUF; brelse(bp); } BO_LOCK(bo); goto cleanrestart; } drain_output(vp); BO_UNLOCK(bo); } static int cancel_pagedep(pagedep, freeblks, blkoff) struct pagedep *pagedep; struct freeblks *freeblks; int blkoff; { struct jremref *jremref; struct jmvref *jmvref; struct dirrem *dirrem, *tmp; int i; /* * Copy any directory remove dependencies to the list * to be processed after the freeblks proceeds. If * directory entry never made it to disk they * can be dumped directly onto the work list. */ LIST_FOREACH_SAFE(dirrem, &pagedep->pd_dirremhd, dm_next, tmp) { /* Skip this directory removal if it is intended to remain. */ if (dirrem->dm_offset < blkoff) continue; /* * If there are any dirrems we wait for the journal write * to complete and then restart the buf scan as the lock * has been dropped. */ while ((jremref = LIST_FIRST(&dirrem->dm_jremrefhd)) != NULL) { jwait(&jremref->jr_list, MNT_WAIT); return (ERESTART); } LIST_REMOVE(dirrem, dm_next); dirrem->dm_dirinum = pagedep->pd_ino; WORKLIST_INSERT(&freeblks->fb_freeworkhd, &dirrem->dm_list); } while ((jmvref = LIST_FIRST(&pagedep->pd_jmvrefhd)) != NULL) { jwait(&jmvref->jm_list, MNT_WAIT); return (ERESTART); } /* * When we're partially truncating a pagedep we just want to flush * journal entries and return. There can not be any adds in the * truncated portion of the directory and newblk must remain if * part of the block remains. 
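 * The only nonzero return from this function is ERESTART, used above
 * after a jwait() has dropped the lock, so that the caller rescans
 * the buffer list from the beginning.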
*/ if (blkoff != 0) { struct diradd *dap; LIST_FOREACH(dap, &pagedep->pd_pendinghd, da_pdlist) if (dap->da_offset > blkoff) panic("cancel_pagedep: diradd %p off %d > %d", dap, dap->da_offset, blkoff); for (i = 0; i < DAHASHSZ; i++) LIST_FOREACH(dap, &pagedep->pd_diraddhd[i], da_pdlist) if (dap->da_offset > blkoff) panic("cancel_pagedep: diradd %p off %d > %d", dap, dap->da_offset, blkoff); return (0); } /* * There should be no directory add dependencies present * as the directory could not be truncated until all * children were removed. */ KASSERT(LIST_FIRST(&pagedep->pd_pendinghd) == NULL, ("deallocate_dependencies: pendinghd != NULL")); for (i = 0; i < DAHASHSZ; i++) KASSERT(LIST_FIRST(&pagedep->pd_diraddhd[i]) == NULL, ("deallocate_dependencies: diraddhd != NULL")); if ((pagedep->pd_state & NEWBLOCK) != 0) free_newdirblk(pagedep->pd_newdirblk); if (free_pagedep(pagedep) == 0) panic("Failed to free pagedep %p", pagedep); return (0); } /* * Reclaim any dependency structures from a buffer that is about to * be reallocated to a new vnode. The buffer must be locked, thus, * no I/O completion operations can occur while we are manipulating * its associated dependencies. The mutex is held so that other I/O's * associated with related dependencies do not occur. */ static int deallocate_dependencies(bp, freeblks, off) struct buf *bp; struct freeblks *freeblks; int off; { struct indirdep *indirdep; struct pagedep *pagedep; struct worklist *wk, *wkn; struct ufsmount *ump; if ((wk = LIST_FIRST(&bp->b_dep)) == NULL) goto done; ump = VFSTOUFS(wk->wk_mp); ACQUIRE_LOCK(ump); LIST_FOREACH_SAFE(wk, &bp->b_dep, wk_list, wkn) { switch (wk->wk_type) { case D_INDIRDEP: indirdep = WK_INDIRDEP(wk); if (bp->b_lblkno >= 0 || bp->b_blkno != indirdep->ir_savebp->b_lblkno) panic("deallocate_dependencies: not indir"); cancel_indirdep(indirdep, bp, freeblks); continue; case D_PAGEDEP: pagedep = WK_PAGEDEP(wk); if (cancel_pagedep(pagedep, freeblks, off)) { FREE_LOCK(ump); return (ERESTART); } continue; case D_ALLOCINDIR: /* * Simply remove the allocindir, we'll find it via * the indirdep where we can clear pointers if * needed. */ WORKLIST_REMOVE(wk); continue; case D_FREEWORK: /* * A truncation is waiting for the zero'd pointers * to be written. It can be freed when the freeblks * is journaled. */ WORKLIST_REMOVE(wk); wk->wk_state |= ONDEPLIST; WORKLIST_INSERT(&freeblks->fb_freeworkhd, wk); break; case D_ALLOCDIRECT: if (off != 0) continue; /* FALLTHROUGH */ default: panic("deallocate_dependencies: Unexpected type %s", TYPENAME(wk->wk_type)); /* NOTREACHED */ } } FREE_LOCK(ump); done: /* * Don't throw away this buf, we were partially truncating and * some deps may always remain. */ if (off) { allocbuf(bp, off); bp->b_vflags |= BV_SCANNED; return (EBUSY); } bp->b_flags |= B_INVAL | B_NOCACHE; return (0); } /* * An allocdirect is being canceled due to a truncate. We must make sure * the journal entry is released in concert with the blkfree that releases * the storage. Completed journal entries must not be released until the * space is no longer pointed to by the inode or in the bitmap. */ static void cancel_allocdirect(adphead, adp, freeblks) struct allocdirectlst *adphead; struct allocdirect *adp; struct freeblks *freeblks; { struct freework *freework; struct newblk *newblk; struct worklist *wk; TAILQ_REMOVE(adphead, adp, ad_next); newblk = (struct newblk *)adp; freework = NULL; /* * Find the correct freework structure. 
*/ LIST_FOREACH(wk, &freeblks->fb_freeworkhd, wk_list) { if (wk->wk_type != D_FREEWORK) continue; freework = WK_FREEWORK(wk); if (freework->fw_blkno == newblk->nb_newblkno) break; } if (freework == NULL) panic("cancel_allocdirect: Freework not found"); /* * If a newblk exists at all we still have the journal entry that * initiated the allocation so we do not need to journal the free. */ cancel_jfreeblk(freeblks, freework->fw_blkno); /* * If the journal hasn't been written the jnewblk must be passed * to the call to ffs_blkfree that reclaims the space. We accomplish * this by linking the journal dependency into the freework to be * freed when freework_freeblock() is called. If the journal has * been written we can simply reclaim the journal space when the * freeblks work is complete. */ freework->fw_jnewblk = cancel_newblk(newblk, &freework->fw_list, &freeblks->fb_jwork); WORKLIST_INSERT(&freeblks->fb_freeworkhd, &newblk->nb_list); } /* * Cancel a new block allocation. May be an indirect or direct block. We * remove it from various lists and return any journal record that needs to * be resolved by the caller. * * A special consideration is made for indirects which were never pointed * at on disk and will never be found once this block is released. */ static struct jnewblk * cancel_newblk(newblk, wk, wkhd) struct newblk *newblk; struct worklist *wk; struct workhead *wkhd; { struct jnewblk *jnewblk; CTR1(KTR_SUJ, "cancel_newblk: blkno %jd", newblk->nb_newblkno); newblk->nb_state |= GOINGAWAY; /* * Previously we traversed the completedhd on each indirdep * attached to this newblk to cancel them and gather journal * work. Since we need only the oldest journal segment and * the lowest point on the tree will always have the oldest * journal segment we are free to release the segments * of any subordinates and may leave the indirdep list to * indirdep_complete() when this newblk is freed. */ if (newblk->nb_state & ONDEPLIST) { newblk->nb_state &= ~ONDEPLIST; LIST_REMOVE(newblk, nb_deps); } if (newblk->nb_state & ONWORKLIST) WORKLIST_REMOVE(&newblk->nb_list); /* * If the journal entry hasn't been written we save a pointer to * the dependency that frees it until it is written or the * superseding operation completes. */ jnewblk = newblk->nb_jnewblk; if (jnewblk != NULL && wk != NULL) { newblk->nb_jnewblk = NULL; jnewblk->jn_dep = wk; } if (!LIST_EMPTY(&newblk->nb_jwork)) jwork_move(wkhd, &newblk->nb_jwork); /* * When truncating we must free the newdirblk early to remove * the pagedep from the hash before returning. */ if ((wk = LIST_FIRST(&newblk->nb_newdirblk)) != NULL) free_newdirblk(WK_NEWDIRBLK(wk)); if (!LIST_EMPTY(&newblk->nb_newdirblk)) panic("cancel_newblk: extra newdirblk"); return (jnewblk); } /* * Schedule the freefrag associated with a newblk to be released once * the pointers are written and the previous block is no longer needed. */ static void newblk_freefrag(newblk) struct newblk *newblk; { struct freefrag *freefrag; if (newblk->nb_freefrag == NULL) return; freefrag = newblk->nb_freefrag; newblk->nb_freefrag = NULL; freefrag->ff_state |= COMPLETE; if ((freefrag->ff_state & ALLCOMPLETE) == ALLCOMPLETE) add_to_worklist(&freefrag->ff_list, 0); } /* * Free a newblk. Generate a new freefrag work request if appropriate. * This must be called after the inode pointer and any direct block pointers * are valid or fully removed via truncate or frag extension. 
*/ static void free_newblk(newblk) struct newblk *newblk; { struct indirdep *indirdep; struct worklist *wk; KASSERT(newblk->nb_jnewblk == NULL, ("free_newblk: jnewblk %p still attached", newblk->nb_jnewblk)); KASSERT(newblk->nb_list.wk_type != D_NEWBLK, ("free_newblk: unclaimed newblk")); LOCK_OWNED(VFSTOUFS(newblk->nb_list.wk_mp)); newblk_freefrag(newblk); if (newblk->nb_state & ONDEPLIST) LIST_REMOVE(newblk, nb_deps); if (newblk->nb_state & ONWORKLIST) WORKLIST_REMOVE(&newblk->nb_list); LIST_REMOVE(newblk, nb_hash); if ((wk = LIST_FIRST(&newblk->nb_newdirblk)) != NULL) free_newdirblk(WK_NEWDIRBLK(wk)); if (!LIST_EMPTY(&newblk->nb_newdirblk)) panic("free_newblk: extra newdirblk"); while ((indirdep = LIST_FIRST(&newblk->nb_indirdeps)) != NULL) indirdep_complete(indirdep); handle_jwork(&newblk->nb_jwork); WORKITEM_FREE(newblk, D_NEWBLK); } /* * Free a newdirblk. Clear the NEWBLOCK flag on its associated pagedep. * This routine must be called with splbio interrupts blocked. */ static void free_newdirblk(newdirblk) struct newdirblk *newdirblk; { struct pagedep *pagedep; struct diradd *dap; struct worklist *wk; LOCK_OWNED(VFSTOUFS(newdirblk->db_list.wk_mp)); WORKLIST_REMOVE(&newdirblk->db_list); /* * If the pagedep is still linked onto the directory buffer * dependency chain, then some of the entries on the * pd_pendinghd list may not be committed to disk yet. In * this case, we will simply clear the NEWBLOCK flag and * let the pd_pendinghd list be processed when the pagedep * is next written. If the pagedep is no longer on the buffer * dependency chain, then all the entries on the pd_pending * list are committed to disk and we can free them here. */ pagedep = newdirblk->db_pagedep; pagedep->pd_state &= ~NEWBLOCK; if ((pagedep->pd_state & ONWORKLIST) == 0) { while ((dap = LIST_FIRST(&pagedep->pd_pendinghd)) != NULL) free_diradd(dap, NULL); /* * If no dependencies remain, the pagedep will be freed. */ free_pagedep(pagedep); } /* Should only ever be one item in the list. */ while ((wk = LIST_FIRST(&newdirblk->db_mkdir)) != NULL) { WORKLIST_REMOVE(wk); handle_written_mkdir(WK_MKDIR(wk), MKDIR_BODY); } WORKITEM_FREE(newdirblk, D_NEWDIRBLK); } /* * Prepare an inode to be freed. The actual free operation is not * done until the zero'ed inode has been written to disk. */ void softdep_freefile(pvp, ino, mode) struct vnode *pvp; ino_t ino; int mode; { struct inode *ip = VTOI(pvp); struct inodedep *inodedep; struct freefile *freefile; struct freeblks *freeblks; struct ufsmount *ump; ump = ITOUMP(ip); KASSERT(MOUNTEDSOFTDEP(UFSTOVFS(ump)) != 0, ("softdep_freefile called on non-softdep filesystem")); /* * This sets up the inode de-allocation dependency. */ freefile = malloc(sizeof(struct freefile), M_FREEFILE, M_SOFTDEP_FLAGS); workitem_alloc(&freefile->fx_list, D_FREEFILE, pvp->v_mount); freefile->fx_mode = mode; freefile->fx_oldinum = ino; freefile->fx_devvp = ump->um_devvp; LIST_INIT(&freefile->fx_jwork); UFS_LOCK(ump); ump->um_fs->fs_pendinginodes += 1; UFS_UNLOCK(ump); /* * If the inodedep does not exist, then the zero'ed inode has * been written to disk. If the allocated inode has never been * written to disk, then the on-disk inode is zero'ed. In either * case we can free the file immediately. If the journal was * canceled before being written the inode will never make it to * disk and we must send the canceled journal entries to * ffs_freefile() to be cleared in conjunction with the bitmap. * Any blocks waiting on the inode to write can be safely freed * here as it will never be written.
*/ ACQUIRE_LOCK(ump); inodedep_lookup(pvp->v_mount, ino, 0, &inodedep); if (inodedep) { /* * Clear out freeblks that no longer need to reference * this inode. */ while ((freeblks = TAILQ_FIRST(&inodedep->id_freeblklst)) != NULL) { TAILQ_REMOVE(&inodedep->id_freeblklst, freeblks, fb_next); freeblks->fb_state &= ~ONDEPLIST; } /* * Remove this inode from the unlinked list. */ if (inodedep->id_state & UNLINKED) { /* * Save the journal work to be freed with the bitmap * before we clear UNLINKED. Otherwise it can be lost * if the inode block is written. */ handle_bufwait(inodedep, &freefile->fx_jwork); clear_unlinked_inodedep(inodedep); /* * Re-acquire inodedep as we've dropped the * per-filesystem lock in clear_unlinked_inodedep(). */ inodedep_lookup(pvp->v_mount, ino, 0, &inodedep); } } if (inodedep == NULL || check_inode_unwritten(inodedep)) { FREE_LOCK(ump); handle_workitem_freefile(freefile); return; } if ((inodedep->id_state & DEPCOMPLETE) == 0) inodedep->id_state |= GOINGAWAY; WORKLIST_INSERT(&inodedep->id_inowait, &freefile->fx_list); FREE_LOCK(ump); if (ip->i_number == ino) ip->i_flag |= IN_MODIFIED; } /* * Check to see if an inode has never been written to disk. If * so free the inodedep and return success, otherwise return failure. * This routine must be called with splbio interrupts blocked. * * If we still have a bitmap dependency, then the inode has never * been written to disk. Drop the dependency as it is no longer * necessary since the inode is being deallocated. We set the * ALLCOMPLETE flags since the bitmap now properly shows that the * inode is not allocated. Even if the inode is actively being * written, it has been rolled back to its zero'ed state, so we * are ensured that a zero inode is what is on the disk. For short * lived files, this change will usually result in removing all the * dependencies from the inode so that it can be freed immediately. */ static int check_inode_unwritten(inodedep) struct inodedep *inodedep; { LOCK_OWNED(VFSTOUFS(inodedep->id_list.wk_mp)); if ((inodedep->id_state & (DEPCOMPLETE | UNLINKED)) != 0 || !LIST_EMPTY(&inodedep->id_dirremhd) || !LIST_EMPTY(&inodedep->id_pendinghd) || !LIST_EMPTY(&inodedep->id_bufwait) || !LIST_EMPTY(&inodedep->id_inowait) || !TAILQ_EMPTY(&inodedep->id_inoreflst) || !TAILQ_EMPTY(&inodedep->id_inoupdt) || !TAILQ_EMPTY(&inodedep->id_newinoupdt) || !TAILQ_EMPTY(&inodedep->id_extupdt) || !TAILQ_EMPTY(&inodedep->id_newextupdt) || !TAILQ_EMPTY(&inodedep->id_freeblklst) || inodedep->id_mkdiradd != NULL || inodedep->id_nlinkdelta != 0) return (0); /* * Another process might be in initiate_write_inodeblock_ufs[12] * trying to allocate memory without holding "Softdep Lock". 
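 * In that case the test below simply returns 0 (failure) rather than
 * tearing the inodedep down underneath the writer.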
*/ if ((inodedep->id_state & IOSTARTED) != 0 && inodedep->id_savedino1 == NULL) return (0); if (inodedep->id_state & ONDEPLIST) LIST_REMOVE(inodedep, id_deps); inodedep->id_state &= ~ONDEPLIST; inodedep->id_state |= ALLCOMPLETE; inodedep->id_bmsafemap = NULL; if (inodedep->id_state & ONWORKLIST) WORKLIST_REMOVE(&inodedep->id_list); if (inodedep->id_savedino1 != NULL) { free(inodedep->id_savedino1, M_SAVEDINO); inodedep->id_savedino1 = NULL; } if (free_inodedep(inodedep) == 0) panic("check_inode_unwritten: busy inode"); return (1); } static int check_inodedep_free(inodedep) struct inodedep *inodedep; { LOCK_OWNED(VFSTOUFS(inodedep->id_list.wk_mp)); if ((inodedep->id_state & ALLCOMPLETE) != ALLCOMPLETE || !LIST_EMPTY(&inodedep->id_dirremhd) || !LIST_EMPTY(&inodedep->id_pendinghd) || !LIST_EMPTY(&inodedep->id_bufwait) || !LIST_EMPTY(&inodedep->id_inowait) || !TAILQ_EMPTY(&inodedep->id_inoreflst) || !TAILQ_EMPTY(&inodedep->id_inoupdt) || !TAILQ_EMPTY(&inodedep->id_newinoupdt) || !TAILQ_EMPTY(&inodedep->id_extupdt) || !TAILQ_EMPTY(&inodedep->id_newextupdt) || !TAILQ_EMPTY(&inodedep->id_freeblklst) || inodedep->id_mkdiradd != NULL || inodedep->id_nlinkdelta != 0 || inodedep->id_savedino1 != NULL) return (0); return (1); } /* * Try to free an inodedep structure. Return 1 if it could be freed. */ static int free_inodedep(inodedep) struct inodedep *inodedep; { LOCK_OWNED(VFSTOUFS(inodedep->id_list.wk_mp)); if ((inodedep->id_state & (ONWORKLIST | UNLINKED)) != 0 || !check_inodedep_free(inodedep)) return (0); if (inodedep->id_state & ONDEPLIST) LIST_REMOVE(inodedep, id_deps); LIST_REMOVE(inodedep, id_hash); WORKITEM_FREE(inodedep, D_INODEDEP); return (1); } /* * Free the block referenced by a freework structure. The parent freeblks * structure is released and completed when the final cg bitmap reaches * the disk. This routine may be freeing a jnewblk which never made it to * disk in which case we do not have to wait as the operation is undone * in memory immediately. */ static void freework_freeblock(freework) struct freework *freework; { struct freeblks *freeblks; struct jnewblk *jnewblk; struct ufsmount *ump; struct workhead wkhd; struct fs *fs; int bsize; int needj; ump = VFSTOUFS(freework->fw_list.wk_mp); LOCK_OWNED(ump); /* * Handle partial truncate separately. */ if (freework->fw_indir) { complete_trunc_indir(freework); return; } freeblks = freework->fw_freeblks; fs = ump->um_fs; needj = MOUNTEDSUJ(freeblks->fb_list.wk_mp) != 0; bsize = lfragtosize(fs, freework->fw_frags); LIST_INIT(&wkhd); /* * DEPCOMPLETE is cleared in indirblk_insert() if the block lives * on the indirblk hashtable and prevents premature freeing. */ freework->fw_state |= DEPCOMPLETE; /* * SUJ needs to wait for the segment referencing freed indirect * blocks to expire so that we know the checker will not confuse * a re-allocated indirect block with its old contents. */ if (needj && freework->fw_lbn <= -UFS_NDADDR) indirblk_insert(freework); /* * If we are canceling an existing jnewblk pass it to the free * routine, otherwise pass the freeblk which will ultimately * release the freeblks. If we're not journaling, we can just * free the freeblks immediately. 
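 * Concretely: a pending jnewblk is cancelled onto wkhd and needj is
 * cleared; otherwise, when journaling, the freework is marked
 * DELAYEDFREE and rides on wkhd into ffs_blkfree(). Whenever needj
 * ends up zero, handle_written_freework() runs right after the free.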
*/ jnewblk = freework->fw_jnewblk; if (jnewblk != NULL) { cancel_jnewblk(jnewblk, &wkhd); needj = 0; } else if (needj) { freework->fw_state |= DELAYEDFREE; freeblks->fb_cgwait++; WORKLIST_INSERT(&wkhd, &freework->fw_list); } FREE_LOCK(ump); freeblks_free(ump, freeblks, btodb(bsize)); CTR4(KTR_SUJ, "freework_freeblock: ino %d blkno %jd lbn %jd size %ld", freeblks->fb_inum, freework->fw_blkno, freework->fw_lbn, bsize); ffs_blkfree(ump, fs, freeblks->fb_devvp, freework->fw_blkno, bsize, freeblks->fb_inum, freeblks->fb_vtype, &wkhd); ACQUIRE_LOCK(ump); /* * The jnewblk will be discarded and the bits in the map never * made it to disk. We can immediately free the freeblk. */ if (needj == 0) handle_written_freework(freework); } /* * We enqueue freework items that need processing back on the freeblks and * add the freeblks to the worklist. This makes it easier to find all work * required to flush a truncation in process_truncates(). */ static void freework_enqueue(freework) struct freework *freework; { struct freeblks *freeblks; freeblks = freework->fw_freeblks; if ((freework->fw_state & INPROGRESS) == 0) WORKLIST_INSERT(&freeblks->fb_freeworkhd, &freework->fw_list); if ((freeblks->fb_state & (ONWORKLIST | INPROGRESS | ALLCOMPLETE)) == ALLCOMPLETE && LIST_EMPTY(&freeblks->fb_jblkdephd)) add_to_worklist(&freeblks->fb_list, WK_NODELAY); } /* * Start, continue, or finish the process of freeing an indirect block tree. * The free operation may be paused at any point with fw_off containing the * offset to restart from. This enables us to implement some flow control * for large truncates which may fan out and generate a huge number of * dependencies. */ static void handle_workitem_indirblk(freework) struct freework *freework; { struct freeblks *freeblks; struct ufsmount *ump; struct fs *fs; freeblks = freework->fw_freeblks; ump = VFSTOUFS(freeblks->fb_list.wk_mp); fs = ump->um_fs; if (freework->fw_state & DEPCOMPLETE) { handle_written_freework(freework); return; } if (freework->fw_off == NINDIR(fs)) { freework_freeblock(freework); return; } freework->fw_state |= INPROGRESS; FREE_LOCK(ump); indir_trunc(freework, fsbtodb(fs, freework->fw_blkno), freework->fw_lbn); ACQUIRE_LOCK(ump); } /* * Called when a freework structure attached to a cg buf is written. The * ref on either the parent or the freeblks structure is released and * the freeblks is added back to the worklist if there is more work to do. */ static void handle_written_freework(freework) struct freework *freework; { struct freeblks *freeblks; struct freework *parent; freeblks = freework->fw_freeblks; parent = freework->fw_parent; if (freework->fw_state & DELAYEDFREE) freeblks->fb_cgwait--; freework->fw_state |= COMPLETE; if ((freework->fw_state & ALLCOMPLETE) == ALLCOMPLETE) WORKITEM_FREE(freework, D_FREEWORK); if (parent) { if (--parent->fw_ref == 0) freework_enqueue(parent); return; } if (--freeblks->fb_ref != 0) return; if ((freeblks->fb_state & (ALLCOMPLETE | ONWORKLIST | INPROGRESS)) == ALLCOMPLETE && LIST_EMPTY(&freeblks->fb_jblkdephd)) add_to_worklist(&freeblks->fb_list, WK_NODELAY); } /* * This workitem routine performs the block de-allocation. * The workitem is added to the pending list after the updated * inode block has been written to disk. As mentioned above, * checks regarding the number of blocks de-allocated (compared * to the number of blocks allocated for the file) are also * performed in this function. 
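 * If freework items still hold references once the list below has
 * been drained, the freeblks is left for a later pass (INPROGRESS is
 * cleared and any waiter is woken) and 0 is returned; otherwise
 * handle_complete_freeblocks() retires it.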
*/ static int handle_workitem_freeblocks(freeblks, flags) struct freeblks *freeblks; int flags; { struct freework *freework; struct newblk *newblk; struct allocindir *aip; struct ufsmount *ump; struct worklist *wk; KASSERT(LIST_EMPTY(&freeblks->fb_jblkdephd), ("handle_workitem_freeblocks: Journal entries not written.")); ump = VFSTOUFS(freeblks->fb_list.wk_mp); ACQUIRE_LOCK(ump); while ((wk = LIST_FIRST(&freeblks->fb_freeworkhd)) != NULL) { WORKLIST_REMOVE(wk); switch (wk->wk_type) { case D_DIRREM: wk->wk_state |= COMPLETE; add_to_worklist(wk, 0); continue; case D_ALLOCDIRECT: free_newblk(WK_NEWBLK(wk)); continue; case D_ALLOCINDIR: aip = WK_ALLOCINDIR(wk); freework = NULL; if (aip->ai_state & DELAYEDFREE) { FREE_LOCK(ump); freework = newfreework(ump, freeblks, NULL, aip->ai_lbn, aip->ai_newblkno, ump->um_fs->fs_frag, 0, 0); ACQUIRE_LOCK(ump); } newblk = WK_NEWBLK(wk); if (newblk->nb_jnewblk) { freework->fw_jnewblk = newblk->nb_jnewblk; newblk->nb_jnewblk->jn_dep = &freework->fw_list; newblk->nb_jnewblk = NULL; } free_newblk(newblk); continue; case D_FREEWORK: freework = WK_FREEWORK(wk); if (freework->fw_lbn <= -UFS_NDADDR) handle_workitem_indirblk(freework); else freework_freeblock(freework); continue; default: panic("handle_workitem_freeblocks: Unknown type %s", TYPENAME(wk->wk_type)); } } if (freeblks->fb_ref != 0) { freeblks->fb_state &= ~INPROGRESS; wake_worklist(&freeblks->fb_list); freeblks = NULL; } FREE_LOCK(ump); if (freeblks) return handle_complete_freeblocks(freeblks, flags); return (0); } /* * Handle completion of block free via truncate. This allows fs_pending * to track the actual free block count more closely than if we only updated * it at the end. We must be careful to handle cases where the block count * on free was incorrect. */ static void freeblks_free(ump, freeblks, blocks) struct ufsmount *ump; struct freeblks *freeblks; int blocks; { struct fs *fs; ufs2_daddr_t remain; UFS_LOCK(ump); remain = -freeblks->fb_chkcnt; freeblks->fb_chkcnt += blocks; if (remain > 0) { if (remain < blocks) blocks = remain; fs = ump->um_fs; fs->fs_pendingblocks -= blocks; } UFS_UNLOCK(ump); } /* * Once all of the freework workitems are complete we can retire the * freeblocks dependency and any journal work awaiting completion. This * can not be called until all other dependencies are stable on disk. */ static int handle_complete_freeblocks(freeblks, flags) struct freeblks *freeblks; int flags; { struct inodedep *inodedep; struct inode *ip; struct vnode *vp; struct fs *fs; struct ufsmount *ump; ufs2_daddr_t spare; ump = VFSTOUFS(freeblks->fb_list.wk_mp); fs = ump->um_fs; flags = LK_EXCLUSIVE | flags; spare = freeblks->fb_chkcnt; /* * If we did not release the expected number of blocks we may have * to adjust the inode block count here. Only do so if it wasn't * a truncation to zero and the modrev still matches. */ if (spare && freeblks->fb_len != 0) { if (ffs_vgetf(freeblks->fb_list.wk_mp, freeblks->fb_inum, flags, &vp, FFSV_FORCEINSMQ) != 0) return (EBUSY); ip = VTOI(vp); if (DIP(ip, i_modrev) == freeblks->fb_modrev) { DIP_SET(ip, i_blocks, DIP(ip, i_blocks) - spare); ip->i_flag |= IN_CHANGE; /* * We must wait so this happens before the * journal is reclaimed. */ ffs_update(vp, 1); } vput(vp); } if (spare < 0) { UFS_LOCK(ump); fs->fs_pendingblocks += spare; UFS_UNLOCK(ump); } #ifdef QUOTA /* Handle spare. 
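 * If the number of blocks actually freed differed from the original
 * estimate, apply the same signed correction to the quota accounting
 * before the quota references are dropped.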
*/ if (spare) quotaadj(freeblks->fb_quota, ump, -spare); quotarele(freeblks->fb_quota); #endif ACQUIRE_LOCK(ump); if (freeblks->fb_state & ONDEPLIST) { inodedep_lookup(freeblks->fb_list.wk_mp, freeblks->fb_inum, 0, &inodedep); TAILQ_REMOVE(&inodedep->id_freeblklst, freeblks, fb_next); freeblks->fb_state &= ~ONDEPLIST; if (TAILQ_EMPTY(&inodedep->id_freeblklst)) free_inodedep(inodedep); } /* * All of the freeblock deps must be complete prior to this call * so it's now safe to complete earlier outstanding journal entries. */ handle_jwork(&freeblks->fb_jwork); WORKITEM_FREE(freeblks, D_FREEBLKS); FREE_LOCK(ump); return (0); } /* * Release blocks associated with the freeblks and stored in the indirect * block dbn. If level is greater than SINGLE, the block is an indirect block * and recursive calls to indirtrunc must be used to cleanse other indirect * blocks. * * This handles partial and complete truncation of blocks. Partial is noted * with goingaway == 0. In this case the freework is completed after the * zero'd indirects are written to disk. For full truncation the freework * is completed after the block is freed. */ static void indir_trunc(freework, dbn, lbn) struct freework *freework; ufs2_daddr_t dbn; ufs_lbn_t lbn; { struct freework *nfreework; struct workhead wkhd; struct freeblks *freeblks; struct buf *bp; struct fs *fs; struct indirdep *indirdep; struct ufsmount *ump; ufs1_daddr_t *bap1; ufs2_daddr_t nb, nnb, *bap2; ufs_lbn_t lbnadd, nlbn; int i, nblocks, ufs1fmt; int freedblocks; int goingaway; int freedeps; int needj; int level; int cnt; freeblks = freework->fw_freeblks; ump = VFSTOUFS(freeblks->fb_list.wk_mp); fs = ump->um_fs; /* * Get buffer of block pointers to be freed. There are three cases: * * 1) Partial truncate caches the indirdep pointer in the freework * which provides us a back copy to the save bp which holds the * pointers we want to clear. When this completes the zero * pointers are written to the real copy. * 2) The indirect is being completely truncated, cancel_indirdep() * eliminated the real copy and placed the indirdep on the saved * copy. The indirdep and buf are discarded when this completes. * 3) The indirect was not in memory, we read a copy off of the disk * using the devvp and drop and invalidate the buffer when we're * done. */ goingaway = 1; indirdep = NULL; if (freework->fw_indir != NULL) { goingaway = 0; indirdep = freework->fw_indir; bp = indirdep->ir_savebp; if (bp == NULL || bp->b_blkno != dbn) panic("indir_trunc: Bad saved buf %p blkno %jd", bp, (intmax_t)dbn); } else if ((bp = incore(&freeblks->fb_devvp->v_bufobj, dbn)) != NULL) { /* * The lock prevents the buf dep list from changing and * indirects on devvp should only ever have one dependency. */ indirdep = WK_INDIRDEP(LIST_FIRST(&bp->b_dep)); if (indirdep == NULL || (indirdep->ir_state & GOINGAWAY) == 0) panic("indir_trunc: Bad indirdep %p from buf %p", indirdep, bp); } else if (bread(freeblks->fb_devvp, dbn, (int)fs->fs_bsize, NOCRED, &bp) != 0) { brelse(bp); return; } ACQUIRE_LOCK(ump); /* Protects against a race with complete_trunc_indir(). */ freework->fw_state &= ~INPROGRESS; /* * If we have an indirdep we need to enforce the truncation order * and discard it when it is complete. */ if (indirdep) { if (freework != TAILQ_FIRST(&indirdep->ir_trunc) && !TAILQ_EMPTY(&indirdep->ir_trunc)) { /* * Add the complete truncate to the list on the * indirdep to enforce in-order processing. 
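 * Only a full truncation (fw_indir == NULL) is queued here; a partial
 * truncation was already placed on ir_trunc by setup_trunc_indir().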
*/ if (freework->fw_indir == NULL) TAILQ_INSERT_TAIL(&indirdep->ir_trunc, freework, fw_next); FREE_LOCK(ump); return; } /* * If we're goingaway, free the indirdep. Otherwise it will * linger until the write completes. */ if (goingaway) free_indirdep(indirdep); } FREE_LOCK(ump); /* Initialize pointers depending on block size. */ if (ump->um_fstype == UFS1) { bap1 = (ufs1_daddr_t *)bp->b_data; nb = bap1[freework->fw_off]; ufs1fmt = 1; bap2 = NULL; } else { bap2 = (ufs2_daddr_t *)bp->b_data; nb = bap2[freework->fw_off]; ufs1fmt = 0; bap1 = NULL; } level = lbn_level(lbn); needj = MOUNTEDSUJ(UFSTOVFS(ump)) != 0; lbnadd = lbn_offset(fs, level); nblocks = btodb(fs->fs_bsize); nfreework = freework; freedeps = 0; cnt = 0; /* * Reclaim blocks. Traverses into nested indirect levels and * arranges for the current level to be freed when subordinates * are free when journaling. */ for (i = freework->fw_off; i < NINDIR(fs); i++, nb = nnb) { if (i != NINDIR(fs) - 1) { if (ufs1fmt) nnb = bap1[i+1]; else nnb = bap2[i+1]; } else nnb = 0; if (nb == 0) continue; cnt++; if (level != 0) { nlbn = (lbn + 1) - (i * lbnadd); if (needj != 0) { nfreework = newfreework(ump, freeblks, freework, nlbn, nb, fs->fs_frag, 0, 0); freedeps++; } indir_trunc(nfreework, fsbtodb(fs, nb), nlbn); } else { struct freedep *freedep; /* * Attempt to aggregate freedep dependencies for * all blocks being released to the same CG. */ LIST_INIT(&wkhd); if (needj != 0 && (nnb == 0 || (dtog(fs, nb) != dtog(fs, nnb)))) { freedep = newfreedep(freework); WORKLIST_INSERT_UNLOCKED(&wkhd, &freedep->fd_list); freedeps++; } CTR3(KTR_SUJ, "indir_trunc: ino %d blkno %jd size %ld", freeblks->fb_inum, nb, fs->fs_bsize); ffs_blkfree(ump, fs, freeblks->fb_devvp, nb, fs->fs_bsize, freeblks->fb_inum, freeblks->fb_vtype, &wkhd); } } if (goingaway) { bp->b_flags |= B_INVAL | B_NOCACHE; brelse(bp); } freedblocks = 0; if (level == 0) freedblocks = (nblocks * cnt); if (needj == 0) freedblocks += nblocks; freeblks_free(ump, freeblks, freedblocks); /* * If we are journaling set up the ref counts and offset so this * indirect can be completed when its children are free. */ if (needj) { ACQUIRE_LOCK(ump); freework->fw_off = i; freework->fw_ref += freedeps; freework->fw_ref -= NINDIR(fs) + 1; if (level == 0) freeblks->fb_cgwait += freedeps; if (freework->fw_ref == 0) freework_freeblock(freework); FREE_LOCK(ump); return; } /* * If we're not journaling we can free the indirect now. */ dbn = dbtofsb(fs, dbn); CTR3(KTR_SUJ, "indir_trunc 2: ino %d blkno %jd size %ld", freeblks->fb_inum, dbn, fs->fs_bsize); ffs_blkfree(ump, fs, freeblks->fb_devvp, dbn, fs->fs_bsize, freeblks->fb_inum, freeblks->fb_vtype, NULL); /* Non SUJ softdep does single-threaded truncations. */ if (freework->fw_blkno == dbn) { freework->fw_state |= ALLCOMPLETE; ACQUIRE_LOCK(ump); handle_written_freework(freework); FREE_LOCK(ump); } return; } /* * Cancel an allocindir when it is removed via truncation. When bp is not * NULL the indirect never appeared on disk and is scheduled to be freed * independently of the indir so we can more easily track journal work. */ static void cancel_allocindir(aip, bp, freeblks, trunc) struct allocindir *aip; struct buf *bp; struct freeblks *freeblks; int trunc; { struct indirdep *indirdep; struct freefrag *freefrag; struct newblk *newblk; newblk = (struct newblk *)aip; LIST_REMOVE(aip, ai_next); /* * We must eliminate the pointer in bp if it must be freed on its * own due to partial truncate or pending journal work. 
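 * Marking the allocindir DELAYEDFREE below lets
 * handle_workitem_freeblocks() create a dedicated freework for the
 * block when the freeblks is finally processed.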
*/ if (bp && (trunc || newblk->nb_jnewblk)) { /* * Clear the pointer and mark the aip to be freed * directly if it never existed on disk. */ aip->ai_state |= DELAYEDFREE; indirdep = aip->ai_indirdep; if (indirdep->ir_state & UFS1FMT) ((ufs1_daddr_t *)bp->b_data)[aip->ai_offset] = 0; else ((ufs2_daddr_t *)bp->b_data)[aip->ai_offset] = 0; } /* * When truncating the previous pointer will be freed via * savedbp. Eliminate the freefrag which would dup free. */ if (trunc && (freefrag = newblk->nb_freefrag) != NULL) { newblk->nb_freefrag = NULL; if (freefrag->ff_jdep) cancel_jfreefrag( WK_JFREEFRAG(freefrag->ff_jdep)); jwork_move(&freeblks->fb_jwork, &freefrag->ff_jwork); WORKITEM_FREE(freefrag, D_FREEFRAG); } /* * If the journal hasn't been written the jnewblk must be passed * to the call to ffs_blkfree that reclaims the space. We accomplish * this by leaving the journal dependency on the newblk to be freed * when a freework is created in handle_workitem_freeblocks(). */ cancel_newblk(newblk, NULL, &freeblks->fb_jwork); WORKLIST_INSERT(&freeblks->fb_freeworkhd, &newblk->nb_list); } /* * Create the mkdir dependencies for . and .. in a new directory. Link them * in to a newdirblk so any subsequent additions are tracked properly. The * caller is responsible for adding the mkdir1 dependency to the journal * and updating id_mkdiradd. This function returns with the per-filesystem * lock held. */ static struct mkdir * setup_newdir(dap, newinum, dinum, newdirbp, mkdirp) struct diradd *dap; ino_t newinum; ino_t dinum; struct buf *newdirbp; struct mkdir **mkdirp; { struct newblk *newblk; struct pagedep *pagedep; struct inodedep *inodedep; struct newdirblk *newdirblk; struct mkdir *mkdir1, *mkdir2; struct worklist *wk; struct jaddref *jaddref; struct ufsmount *ump; struct mount *mp; mp = dap->da_list.wk_mp; ump = VFSTOUFS(mp); newdirblk = malloc(sizeof(struct newdirblk), M_NEWDIRBLK, M_SOFTDEP_FLAGS); workitem_alloc(&newdirblk->db_list, D_NEWDIRBLK, mp); LIST_INIT(&newdirblk->db_mkdir); mkdir1 = malloc(sizeof(struct mkdir), M_MKDIR, M_SOFTDEP_FLAGS); workitem_alloc(&mkdir1->md_list, D_MKDIR, mp); mkdir1->md_state = ATTACHED | MKDIR_BODY; mkdir1->md_diradd = dap; mkdir1->md_jaddref = NULL; mkdir2 = malloc(sizeof(struct mkdir), M_MKDIR, M_SOFTDEP_FLAGS); workitem_alloc(&mkdir2->md_list, D_MKDIR, mp); mkdir2->md_state = ATTACHED | MKDIR_PARENT; mkdir2->md_diradd = dap; mkdir2->md_jaddref = NULL; if (MOUNTEDSUJ(mp) == 0) { mkdir1->md_state |= DEPCOMPLETE; mkdir2->md_state |= DEPCOMPLETE; } /* * Dependency on "." and ".." being written to disk. */ mkdir1->md_buf = newdirbp; ACQUIRE_LOCK(VFSTOUFS(mp)); LIST_INSERT_HEAD(&ump->softdep_mkdirlisthd, mkdir1, md_mkdirs); /* * We must link the pagedep, allocdirect, and newdirblk for * the initial file page so the pointer to the new directory * is not written until the directory contents are live and * any subsequent additions are not marked live until the * block is reachable via the inode. 
*/ if (pagedep_lookup(mp, newdirbp, newinum, 0, 0, &pagedep) == 0) panic("setup_newdir: lost pagedep"); LIST_FOREACH(wk, &newdirbp->b_dep, wk_list) if (wk->wk_type == D_ALLOCDIRECT) break; if (wk == NULL) panic("setup_newdir: lost allocdirect"); if (pagedep->pd_state & NEWBLOCK) panic("setup_newdir: NEWBLOCK already set"); newblk = WK_NEWBLK(wk); pagedep->pd_state |= NEWBLOCK; pagedep->pd_newdirblk = newdirblk; newdirblk->db_pagedep = pagedep; WORKLIST_INSERT(&newblk->nb_newdirblk, &newdirblk->db_list); WORKLIST_INSERT(&newdirblk->db_mkdir, &mkdir1->md_list); /* * Look up the inodedep for the parent directory so that we * can link mkdir2 into the pending dotdot jaddref or * the inode write if there is none. If the inode is * ALLCOMPLETE and no jaddref is present all dependencies have * been satisfied and mkdir2 can be freed. */ inodedep_lookup(mp, dinum, 0, &inodedep); if (MOUNTEDSUJ(mp)) { if (inodedep == NULL) panic("setup_newdir: Lost parent."); jaddref = (struct jaddref *)TAILQ_LAST(&inodedep->id_inoreflst, inoreflst); KASSERT(jaddref != NULL && jaddref->ja_parent == newinum && (jaddref->ja_state & MKDIR_PARENT), ("setup_newdir: bad dotdot jaddref %p", jaddref)); LIST_INSERT_HEAD(&ump->softdep_mkdirlisthd, mkdir2, md_mkdirs); mkdir2->md_jaddref = jaddref; jaddref->ja_mkdir = mkdir2; } else if (inodedep == NULL || (inodedep->id_state & ALLCOMPLETE) == ALLCOMPLETE) { dap->da_state &= ~MKDIR_PARENT; WORKITEM_FREE(mkdir2, D_MKDIR); mkdir2 = NULL; } else { LIST_INSERT_HEAD(&ump->softdep_mkdirlisthd, mkdir2, md_mkdirs); WORKLIST_INSERT(&inodedep->id_bufwait, &mkdir2->md_list); } *mkdirp = mkdir2; return (mkdir1); } /* * Directory entry addition dependencies. * * When adding a new directory entry, the inode (with its incremented link * count) must be written to disk before the directory entry's pointer to it. * Also, if the inode is newly allocated, the corresponding freemap must be * updated (on disk) before the directory entry's pointer. These requirements * are met via undo/redo on the directory entry's pointer, which consists * simply of the inode number. * * As directory entries are added and deleted, the free space within a * directory block can become fragmented. The ufs filesystem will compact * a fragmented directory block to make space for a new entry. When this * occurs, the offsets of previously added entries change. Any "diradd" * dependency structures corresponding to these entries must be updated with * the new offsets. */ /* * This routine is called after the in-memory inode's link * count has been incremented, but before the directory entry's * pointer to the inode has been set. 
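 *
 * The function returns 1 only when the entry begins a newly allocated
 * directory block that is indirectly addressed
 * (bp->b_lblkno >= UFS_NDADDR), signalling the caller to sync the
 * directory; every other path returns 0.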
*/ int softdep_setup_directory_add(bp, dp, diroffset, newinum, newdirbp, isnewblk) struct buf *bp; /* buffer containing directory block */ struct inode *dp; /* inode for directory */ off_t diroffset; /* offset of new entry in directory */ ino_t newinum; /* inode referenced by new directory entry */ struct buf *newdirbp; /* non-NULL => contents of new mkdir */ int isnewblk; /* entry is in a newly allocated block */ { int offset; /* offset of new entry within directory block */ ufs_lbn_t lbn; /* block in directory containing new entry */ struct fs *fs; struct diradd *dap; struct newblk *newblk; struct pagedep *pagedep; struct inodedep *inodedep; struct newdirblk *newdirblk; struct mkdir *mkdir1, *mkdir2; struct jaddref *jaddref; struct ufsmount *ump; struct mount *mp; int isindir; mp = ITOVFS(dp); ump = VFSTOUFS(mp); KASSERT(MOUNTEDSOFTDEP(mp) != 0, ("softdep_setup_directory_add called on non-softdep filesystem")); /* * Whiteouts have no dependencies. */ if (newinum == UFS_WINO) { if (newdirbp != NULL) bdwrite(newdirbp); return (0); } jaddref = NULL; mkdir1 = mkdir2 = NULL; fs = ump->um_fs; lbn = lblkno(fs, diroffset); offset = blkoff(fs, diroffset); dap = malloc(sizeof(struct diradd), M_DIRADD, M_SOFTDEP_FLAGS|M_ZERO); workitem_alloc(&dap->da_list, D_DIRADD, mp); dap->da_offset = offset; dap->da_newinum = newinum; dap->da_state = ATTACHED; LIST_INIT(&dap->da_jwork); isindir = bp->b_lblkno >= UFS_NDADDR; newdirblk = NULL; if (isnewblk && (isindir ? blkoff(fs, diroffset) : fragoff(fs, diroffset)) == 0) { newdirblk = malloc(sizeof(struct newdirblk), M_NEWDIRBLK, M_SOFTDEP_FLAGS); workitem_alloc(&newdirblk->db_list, D_NEWDIRBLK, mp); LIST_INIT(&newdirblk->db_mkdir); } /* * If we're creating a new directory setup the dependencies and set * the dap state to wait for them. Otherwise it's COMPLETE and * we can move on. */ if (newdirbp == NULL) { dap->da_state |= DEPCOMPLETE; ACQUIRE_LOCK(ump); } else { dap->da_state |= MKDIR_BODY | MKDIR_PARENT; mkdir1 = setup_newdir(dap, newinum, dp->i_number, newdirbp, &mkdir2); } /* * Link into parent directory pagedep to await its being written. */ pagedep_lookup(mp, bp, dp->i_number, lbn, DEPALLOC, &pagedep); #ifdef DEBUG if (diradd_lookup(pagedep, offset) != NULL) panic("softdep_setup_directory_add: %p already at off %d\n", diradd_lookup(pagedep, offset), offset); #endif dap->da_pagedep = pagedep; LIST_INSERT_HEAD(&pagedep->pd_diraddhd[DIRADDHASH(offset)], dap, da_pdlist); inodedep_lookup(mp, newinum, DEPALLOC, &inodedep); /* * If we're journaling, link the diradd into the jaddref so it * may be completed after the journal entry is written. Otherwise, * link the diradd into its inodedep. If the inode is not yet * written place it on the bufwait list, otherwise do the post-inode * write processing to put it on the id_pendinghd list. */ if (MOUNTEDSUJ(mp)) { jaddref = (struct jaddref *)TAILQ_LAST(&inodedep->id_inoreflst, inoreflst); KASSERT(jaddref != NULL && jaddref->ja_parent == dp->i_number, ("softdep_setup_directory_add: bad jaddref %p", jaddref)); jaddref->ja_diroff = diroffset; jaddref->ja_diradd = dap; add_to_journal(&jaddref->ja_list); } else if ((inodedep->id_state & ALLCOMPLETE) == ALLCOMPLETE) diradd_inode_written(dap, inodedep); else WORKLIST_INSERT(&inodedep->id_bufwait, &dap->da_list); /* * Add the journal entries for . and .. links now that the primary * link is written. 
*/ if (mkdir1 != NULL && MOUNTEDSUJ(mp)) { jaddref = (struct jaddref *)TAILQ_PREV(&jaddref->ja_ref, inoreflst, if_deps); KASSERT(jaddref != NULL && jaddref->ja_ino == jaddref->ja_parent && (jaddref->ja_state & MKDIR_BODY), ("softdep_setup_directory_add: bad dot jaddref %p", jaddref)); mkdir1->md_jaddref = jaddref; jaddref->ja_mkdir = mkdir1; /* * It is important that the dotdot journal entry * is added prior to the dot entry since dot writes * both the dot and dotdot links. These both must * be added after the primary link for the journal * to remain consistent. */ add_to_journal(&mkdir2->md_jaddref->ja_list); add_to_journal(&jaddref->ja_list); } /* * If we are adding a new directory remember this diradd so that if * we rename it we can keep the dot and dotdot dependencies. If * we are adding a new name for an inode that has a mkdiradd we * must be in rename and we have to move the dot and dotdot * dependencies to this new name. The old name is being orphaned * soon. */ if (mkdir1 != NULL) { if (inodedep->id_mkdiradd != NULL) panic("softdep_setup_directory_add: Existing mkdir"); inodedep->id_mkdiradd = dap; } else if (inodedep->id_mkdiradd) merge_diradd(inodedep, dap); if (newdirblk != NULL) { /* * There is nothing to do if we are already tracking * this block. */ if ((pagedep->pd_state & NEWBLOCK) != 0) { WORKITEM_FREE(newdirblk, D_NEWDIRBLK); FREE_LOCK(ump); return (0); } if (newblk_lookup(mp, dbtofsb(fs, bp->b_blkno), 0, &newblk) == 0) panic("softdep_setup_directory_add: lost entry"); WORKLIST_INSERT(&newblk->nb_newdirblk, &newdirblk->db_list); pagedep->pd_state |= NEWBLOCK; pagedep->pd_newdirblk = newdirblk; newdirblk->db_pagedep = pagedep; FREE_LOCK(ump); /* * If we extended into an indirect signal direnter to sync. */ if (isindir) return (1); return (0); } FREE_LOCK(ump); return (0); } /* * This procedure is called to change the offset of a directory * entry when compacting a directory block which must be owned * exclusively by the caller. Note that the actual entry movement * must be done in this procedure to ensure that no I/O completions * occur while the move is in progress. */ void softdep_change_directoryentry_offset(bp, dp, base, oldloc, newloc, entrysize) struct buf *bp; /* Buffer holding directory block. */ struct inode *dp; /* inode for directory */ caddr_t base; /* address of dp->i_offset */ caddr_t oldloc; /* address of old directory location */ caddr_t newloc; /* address of new directory location */ int entrysize; /* size of directory entry */ { int offset, oldoffset, newoffset; struct pagedep *pagedep; struct jmvref *jmvref; struct diradd *dap; struct direct *de; struct mount *mp; struct ufsmount *ump; ufs_lbn_t lbn; int flags; mp = ITOVFS(dp); ump = VFSTOUFS(mp); KASSERT(MOUNTEDSOFTDEP(mp) != 0, ("softdep_change_directoryentry_offset called on " "non-softdep filesystem")); de = (struct direct *)oldloc; jmvref = NULL; flags = 0; /* * Moves are always journaled as it would be too complex to * determine if any affected adds or removes are present in the * journal. 
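 *
 * Concretely (the offsets are only illustrative): if compaction slides
 * an entry from byte offset 0x1a4 to 0x018 within the block, the
 * matching diradd must follow it, so da_offset is updated and the
 * diradd is moved to the new DIRADDHASH bucket when that changes, and
 * on a SUJ mount a jmvref recording the old and new offsets is added
 * to the journal so the move can be accounted for.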
*/ if (MOUNTEDSUJ(mp)) { flags = DEPALLOC; jmvref = newjmvref(dp, de->d_ino, dp->i_offset + (oldloc - base), dp->i_offset + (newloc - base)); } lbn = lblkno(ump->um_fs, dp->i_offset); offset = blkoff(ump->um_fs, dp->i_offset); oldoffset = offset + (oldloc - base); newoffset = offset + (newloc - base); ACQUIRE_LOCK(ump); if (pagedep_lookup(mp, bp, dp->i_number, lbn, flags, &pagedep) == 0) goto done; dap = diradd_lookup(pagedep, oldoffset); if (dap) { dap->da_offset = newoffset; newoffset = DIRADDHASH(newoffset); oldoffset = DIRADDHASH(oldoffset); if ((dap->da_state & ALLCOMPLETE) != ALLCOMPLETE && newoffset != oldoffset) { LIST_REMOVE(dap, da_pdlist); LIST_INSERT_HEAD(&pagedep->pd_diraddhd[newoffset], dap, da_pdlist); } } done: if (jmvref) { jmvref->jm_pagedep = pagedep; LIST_INSERT_HEAD(&pagedep->pd_jmvrefhd, jmvref, jm_deps); add_to_journal(&jmvref->jm_list); } bcopy(oldloc, newloc, entrysize); FREE_LOCK(ump); } /* * Move the mkdir dependencies and journal work from one diradd to another * when renaming a directory. The new name must depend on the mkdir deps * completing as the old name did. Directories can only have one valid link * at a time so one must be canonical. */ static void merge_diradd(inodedep, newdap) struct inodedep *inodedep; struct diradd *newdap; { struct diradd *olddap; struct mkdir *mkdir, *nextmd; struct ufsmount *ump; short state; olddap = inodedep->id_mkdiradd; inodedep->id_mkdiradd = newdap; if ((olddap->da_state & (MKDIR_PARENT | MKDIR_BODY)) != 0) { newdap->da_state &= ~DEPCOMPLETE; ump = VFSTOUFS(inodedep->id_list.wk_mp); for (mkdir = LIST_FIRST(&ump->softdep_mkdirlisthd); mkdir; mkdir = nextmd) { nextmd = LIST_NEXT(mkdir, md_mkdirs); if (mkdir->md_diradd != olddap) continue; mkdir->md_diradd = newdap; state = mkdir->md_state & (MKDIR_PARENT | MKDIR_BODY); newdap->da_state |= state; olddap->da_state &= ~state; if ((olddap->da_state & (MKDIR_PARENT | MKDIR_BODY)) == 0) break; } if ((olddap->da_state & (MKDIR_PARENT | MKDIR_BODY)) != 0) panic("merge_diradd: unfound ref"); } /* * Any mkdir related journal items are not safe to be freed until * the new name is stable. */ jwork_move(&newdap->da_jwork, &olddap->da_jwork); olddap->da_state |= DEPCOMPLETE; complete_diradd(olddap); } /* * Move the diradd to the pending list when all diradd dependencies are * complete. */ static void complete_diradd(dap) struct diradd *dap; { struct pagedep *pagedep; if ((dap->da_state & ALLCOMPLETE) == ALLCOMPLETE) { if (dap->da_state & DIRCHG) pagedep = dap->da_previous->dm_pagedep; else pagedep = dap->da_pagedep; LIST_REMOVE(dap, da_pdlist); LIST_INSERT_HEAD(&pagedep->pd_pendinghd, dap, da_pdlist); } } /* * Cancel a diradd when a dirrem overlaps with it. We must cancel the journal * add entries and conditonally journal the remove. */ static void cancel_diradd(dap, dirrem, jremref, dotremref, dotdotremref) struct diradd *dap; struct dirrem *dirrem; struct jremref *jremref; struct jremref *dotremref; struct jremref *dotdotremref; { struct inodedep *inodedep; struct jaddref *jaddref; struct inoref *inoref; struct ufsmount *ump; struct mkdir *mkdir; /* * If no remove references were allocated we're on a non-journaled * filesystem and can skip the cancel step. */ if (jremref == NULL) { free_diradd(dap, NULL); return; } /* * Cancel the primary name an free it if it does not require * journaling. */ if (inodedep_lookup(dap->da_list.wk_mp, dap->da_newinum, 0, &inodedep) != 0) { /* Abort the addref that reference this diradd. 
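 *
 * If the addref had not yet been committed to the journal there is
 * nothing for replay to undo, so the remove does not need journaling
 * either; that is why a zero return from cancel_jaddref() lets the
 * paired jremref be freed here.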
*/ TAILQ_FOREACH(inoref, &inodedep->id_inoreflst, if_deps) { if (inoref->if_list.wk_type != D_JADDREF) continue; jaddref = (struct jaddref *)inoref; if (jaddref->ja_diradd != dap) continue; if (cancel_jaddref(jaddref, inodedep, &dirrem->dm_jwork) == 0) { free_jremref(jremref); jremref = NULL; } break; } } /* * Cancel subordinate names and free them if they do not require * journaling. */ if ((dap->da_state & (MKDIR_PARENT | MKDIR_BODY)) != 0) { ump = VFSTOUFS(dap->da_list.wk_mp); LIST_FOREACH(mkdir, &ump->softdep_mkdirlisthd, md_mkdirs) { if (mkdir->md_diradd != dap) continue; if ((jaddref = mkdir->md_jaddref) == NULL) continue; mkdir->md_jaddref = NULL; if (mkdir->md_state & MKDIR_PARENT) { if (cancel_jaddref(jaddref, NULL, &dirrem->dm_jwork) == 0) { free_jremref(dotdotremref); dotdotremref = NULL; } } else { if (cancel_jaddref(jaddref, inodedep, &dirrem->dm_jwork) == 0) { free_jremref(dotremref); dotremref = NULL; } } } } if (jremref) journal_jremref(dirrem, jremref, inodedep); if (dotremref) journal_jremref(dirrem, dotremref, inodedep); if (dotdotremref) journal_jremref(dirrem, dotdotremref, NULL); jwork_move(&dirrem->dm_jwork, &dap->da_jwork); free_diradd(dap, &dirrem->dm_jwork); } /* * Free a diradd dependency structure. This routine must be called * with splbio interrupts blocked. */ static void free_diradd(dap, wkhd) struct diradd *dap; struct workhead *wkhd; { struct dirrem *dirrem; struct pagedep *pagedep; struct inodedep *inodedep; struct mkdir *mkdir, *nextmd; struct ufsmount *ump; ump = VFSTOUFS(dap->da_list.wk_mp); LOCK_OWNED(ump); LIST_REMOVE(dap, da_pdlist); if (dap->da_state & ONWORKLIST) WORKLIST_REMOVE(&dap->da_list); if ((dap->da_state & DIRCHG) == 0) { pagedep = dap->da_pagedep; } else { dirrem = dap->da_previous; pagedep = dirrem->dm_pagedep; dirrem->dm_dirinum = pagedep->pd_ino; dirrem->dm_state |= COMPLETE; if (LIST_EMPTY(&dirrem->dm_jremrefhd)) add_to_worklist(&dirrem->dm_list, 0); } if (inodedep_lookup(pagedep->pd_list.wk_mp, dap->da_newinum, 0, &inodedep) != 0) if (inodedep->id_mkdiradd == dap) inodedep->id_mkdiradd = NULL; if ((dap->da_state & (MKDIR_PARENT | MKDIR_BODY)) != 0) { for (mkdir = LIST_FIRST(&ump->softdep_mkdirlisthd); mkdir; mkdir = nextmd) { nextmd = LIST_NEXT(mkdir, md_mkdirs); if (mkdir->md_diradd != dap) continue; dap->da_state &= ~(mkdir->md_state & (MKDIR_PARENT | MKDIR_BODY)); LIST_REMOVE(mkdir, md_mkdirs); if (mkdir->md_state & ONWORKLIST) WORKLIST_REMOVE(&mkdir->md_list); if (mkdir->md_jaddref != NULL) panic("free_diradd: Unexpected jaddref"); WORKITEM_FREE(mkdir, D_MKDIR); if ((dap->da_state & (MKDIR_PARENT | MKDIR_BODY)) == 0) break; } if ((dap->da_state & (MKDIR_PARENT | MKDIR_BODY)) != 0) panic("free_diradd: unfound ref"); } if (inodedep) free_inodedep(inodedep); /* * Free any journal segments waiting for the directory write. */ handle_jwork(&dap->da_jwork); WORKITEM_FREE(dap, D_DIRADD); } /* * Directory entry removal dependencies. * * When removing a directory entry, the entry's inode pointer must be * zero'ed on disk before the corresponding inode's link count is decremented * (possibly freeing the inode for re-use). This dependency is handled by * updating the directory entry but delaying the inode count reduction until * after the directory block has been written to disk. After this point, the * inode count can be decremented whenever it is convenient. */ /* * This routine should be called immediately after removing * a directory entry. 
The inode's link count should not be * decremented by the calling procedure -- the soft updates * code will do this task when it is safe. */ void softdep_setup_remove(bp, dp, ip, isrmdir) struct buf *bp; /* buffer containing directory block */ struct inode *dp; /* inode for the directory being modified */ struct inode *ip; /* inode for directory entry being removed */ int isrmdir; /* indicates if doing RMDIR */ { struct dirrem *dirrem, *prevdirrem; struct inodedep *inodedep; struct ufsmount *ump; int direct; ump = ITOUMP(ip); KASSERT(MOUNTEDSOFTDEP(UFSTOVFS(ump)) != 0, ("softdep_setup_remove called on non-softdep filesystem")); /* * Allocate a new dirrem if appropriate and ACQUIRE_LOCK. We want * newdirrem() to setup the full directory remove which requires * isrmdir > 1. */ dirrem = newdirrem(bp, dp, ip, isrmdir, &prevdirrem); /* * Add the dirrem to the inodedep's pending remove list for quick * discovery later. */ if (inodedep_lookup(UFSTOVFS(ump), ip->i_number, 0, &inodedep) == 0) panic("softdep_setup_remove: Lost inodedep."); KASSERT((inodedep->id_state & UNLINKED) == 0, ("inode unlinked")); dirrem->dm_state |= ONDEPLIST; LIST_INSERT_HEAD(&inodedep->id_dirremhd, dirrem, dm_inonext); /* * If the COMPLETE flag is clear, then there were no active * entries and we want to roll back to a zeroed entry until * the new inode is committed to disk. If the COMPLETE flag is * set then we have deleted an entry that never made it to * disk. If the entry we deleted resulted from a name change, * then the old name still resides on disk. We cannot delete * its inode (returned to us in prevdirrem) until the zeroed * directory entry gets to disk. The new inode has never been * referenced on the disk, so can be deleted immediately. */ if ((dirrem->dm_state & COMPLETE) == 0) { LIST_INSERT_HEAD(&dirrem->dm_pagedep->pd_dirremhd, dirrem, dm_next); FREE_LOCK(ump); } else { if (prevdirrem != NULL) LIST_INSERT_HEAD(&dirrem->dm_pagedep->pd_dirremhd, prevdirrem, dm_next); dirrem->dm_dirinum = dirrem->dm_pagedep->pd_ino; direct = LIST_EMPTY(&dirrem->dm_jremrefhd); FREE_LOCK(ump); if (direct) handle_workitem_remove(dirrem, 0); } } /* * Check for an entry matching 'offset' on both the pd_dirraddhd list and the * pd_pendinghd list of a pagedep. */ static struct diradd * diradd_lookup(pagedep, offset) struct pagedep *pagedep; int offset; { struct diradd *dap; LIST_FOREACH(dap, &pagedep->pd_diraddhd[DIRADDHASH(offset)], da_pdlist) if (dap->da_offset == offset) return (dap); LIST_FOREACH(dap, &pagedep->pd_pendinghd, da_pdlist) if (dap->da_offset == offset) return (dap); return (NULL); } /* * Search for a .. diradd dependency in a directory that is being removed. * If the directory was renamed to a new parent we have a diradd rather * than a mkdir for the .. entry. We need to cancel it now before * it is found in truncate(). */ static struct jremref * cancel_diradd_dotdot(ip, dirrem, jremref) struct inode *ip; struct dirrem *dirrem; struct jremref *jremref; { struct pagedep *pagedep; struct diradd *dap; struct worklist *wk; if (pagedep_lookup(ITOVFS(ip), NULL, ip->i_number, 0, 0, &pagedep) == 0) return (jremref); dap = diradd_lookup(pagedep, DOTDOT_OFFSET); if (dap == NULL) return (jremref); cancel_diradd(dap, dirrem, jremref, NULL, NULL); /* * Mark any journal work as belonging to the parent so it is freed * with the .. reference. 
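 *
 * The MKDIR_PARENT tag applied below is what handle_workitem_remove()
 * later keys on: work items carrying it are set aside and only
 * completed when the ".." reference itself is removed, rather than
 * with the removal of the victim's own name.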
*/ LIST_FOREACH(wk, &dirrem->dm_jwork, wk_list) wk->wk_state |= MKDIR_PARENT; return (NULL); } /* * Cancel the MKDIR_PARENT mkdir component of a diradd when we're going to * replace it with a dirrem/diradd pair as a result of re-parenting a * directory. This ensures that we don't simultaneously have a mkdir and * a diradd for the same .. entry. */ static struct jremref * cancel_mkdir_dotdot(ip, dirrem, jremref) struct inode *ip; struct dirrem *dirrem; struct jremref *jremref; { struct inodedep *inodedep; struct jaddref *jaddref; struct ufsmount *ump; struct mkdir *mkdir; struct diradd *dap; struct mount *mp; mp = ITOVFS(ip); if (inodedep_lookup(mp, ip->i_number, 0, &inodedep) == 0) return (jremref); dap = inodedep->id_mkdiradd; if (dap == NULL || (dap->da_state & MKDIR_PARENT) == 0) return (jremref); ump = VFSTOUFS(inodedep->id_list.wk_mp); for (mkdir = LIST_FIRST(&ump->softdep_mkdirlisthd); mkdir; mkdir = LIST_NEXT(mkdir, md_mkdirs)) if (mkdir->md_diradd == dap && mkdir->md_state & MKDIR_PARENT) break; if (mkdir == NULL) panic("cancel_mkdir_dotdot: Unable to find mkdir\n"); if ((jaddref = mkdir->md_jaddref) != NULL) { mkdir->md_jaddref = NULL; jaddref->ja_state &= ~MKDIR_PARENT; if (inodedep_lookup(mp, jaddref->ja_ino, 0, &inodedep) == 0) panic("cancel_mkdir_dotdot: Lost parent inodedep"); if (cancel_jaddref(jaddref, inodedep, &dirrem->dm_jwork)) { journal_jremref(dirrem, jremref, inodedep); jremref = NULL; } } if (mkdir->md_state & ONWORKLIST) WORKLIST_REMOVE(&mkdir->md_list); mkdir->md_state |= ALLCOMPLETE; complete_mkdir(mkdir); return (jremref); } static void journal_jremref(dirrem, jremref, inodedep) struct dirrem *dirrem; struct jremref *jremref; struct inodedep *inodedep; { if (inodedep == NULL) if (inodedep_lookup(jremref->jr_list.wk_mp, jremref->jr_ref.if_ino, 0, &inodedep) == 0) panic("journal_jremref: Lost inodedep"); LIST_INSERT_HEAD(&dirrem->dm_jremrefhd, jremref, jr_deps); TAILQ_INSERT_TAIL(&inodedep->id_inoreflst, &jremref->jr_ref, if_deps); add_to_journal(&jremref->jr_list); } static void dirrem_journal(dirrem, jremref, dotremref, dotdotremref) struct dirrem *dirrem; struct jremref *jremref; struct jremref *dotremref; struct jremref *dotdotremref; { struct inodedep *inodedep; if (inodedep_lookup(jremref->jr_list.wk_mp, jremref->jr_ref.if_ino, 0, &inodedep) == 0) panic("dirrem_journal: Lost inodedep"); journal_jremref(dirrem, jremref, inodedep); if (dotremref) journal_jremref(dirrem, dotremref, inodedep); if (dotdotremref) journal_jremref(dirrem, dotdotremref, NULL); } /* * Allocate a new dirrem if appropriate and return it along with * its associated pagedep. Called without a lock, returns with lock. */ static struct dirrem * newdirrem(bp, dp, ip, isrmdir, prevdirremp) struct buf *bp; /* buffer containing directory block */ struct inode *dp; /* inode for the directory being modified */ struct inode *ip; /* inode for directory entry being removed */ int isrmdir; /* indicates if doing RMDIR */ struct dirrem **prevdirremp; /* previously referenced inode, if any */ { int offset; ufs_lbn_t lbn; struct diradd *dap; struct dirrem *dirrem; struct pagedep *pagedep; struct jremref *jremref; struct jremref *dotremref; struct jremref *dotdotremref; struct vnode *dvp; struct ufsmount *ump; /* * Whiteouts have no deletion dependencies. */ if (ip == NULL) panic("newdirrem: whiteout"); dvp = ITOV(dp); ump = ITOUMP(dp); /* * If the system is over its limit and our filesystem is * responsible for more than our share of that usage and * we are not a snapshot, request some inodedep cleanup. 
* Limiting the number of dirrem structures will also limit * the number of freefile and freeblks structures. */ ACQUIRE_LOCK(ump); if (!IS_SNAPSHOT(ip) && softdep_excess_items(ump, D_DIRREM)) schedule_cleanup(UFSTOVFS(ump)); else FREE_LOCK(ump); dirrem = malloc(sizeof(struct dirrem), M_DIRREM, M_SOFTDEP_FLAGS | M_ZERO); workitem_alloc(&dirrem->dm_list, D_DIRREM, dvp->v_mount); LIST_INIT(&dirrem->dm_jremrefhd); LIST_INIT(&dirrem->dm_jwork); dirrem->dm_state = isrmdir ? RMDIR : 0; dirrem->dm_oldinum = ip->i_number; *prevdirremp = NULL; /* * Allocate remove reference structures to track journal write * dependencies. We will always have one for the link and * when doing directories we will always have one more for dot. * When renaming a directory we skip the dotdot link change so * this is not needed. */ jremref = dotremref = dotdotremref = NULL; if (DOINGSUJ(dvp)) { if (isrmdir) { jremref = newjremref(dirrem, dp, ip, dp->i_offset, ip->i_effnlink + 2); dotremref = newjremref(dirrem, ip, ip, DOT_OFFSET, ip->i_effnlink + 1); dotdotremref = newjremref(dirrem, ip, dp, DOTDOT_OFFSET, dp->i_effnlink + 1); dotdotremref->jr_state |= MKDIR_PARENT; } else jremref = newjremref(dirrem, dp, ip, dp->i_offset, ip->i_effnlink + 1); } ACQUIRE_LOCK(ump); lbn = lblkno(ump->um_fs, dp->i_offset); offset = blkoff(ump->um_fs, dp->i_offset); pagedep_lookup(UFSTOVFS(ump), bp, dp->i_number, lbn, DEPALLOC, &pagedep); dirrem->dm_pagedep = pagedep; dirrem->dm_offset = offset; /* * If we're renaming a .. link to a new directory, cancel any * existing MKDIR_PARENT mkdir. If it has already been canceled * the jremref is preserved for any potential diradd in this * location. This can not coincide with a rmdir. */ if (dp->i_offset == DOTDOT_OFFSET) { if (isrmdir) panic("newdirrem: .. directory change during remove?"); jremref = cancel_mkdir_dotdot(dp, dirrem, jremref); } /* * If we're removing a directory search for the .. dependency now and * cancel it. Any pending journal work will be added to the dirrem * to be completed when the workitem remove completes. */ if (isrmdir) dotdotremref = cancel_diradd_dotdot(ip, dirrem, dotdotremref); /* * Check for a diradd dependency for the same directory entry. * If present, then both dependencies become obsolete and can * be de-allocated. */ dap = diradd_lookup(pagedep, offset); if (dap == NULL) { /* * Link the jremref structures into the dirrem so they are * written prior to the pagedep. */ if (jremref) dirrem_journal(dirrem, jremref, dotremref, dotdotremref); return (dirrem); } /* * Must be ATTACHED at this point. */ if ((dap->da_state & ATTACHED) == 0) panic("newdirrem: not ATTACHED"); if (dap->da_newinum != ip->i_number) panic("newdirrem: inum %ju should be %ju", (uintmax_t)ip->i_number, (uintmax_t)dap->da_newinum); /* * If we are deleting a changed name that never made it to disk, * then return the dirrem describing the previous inode (which * represents the inode currently referenced from this entry on disk). */ if ((dap->da_state & DIRCHG) != 0) { *prevdirremp = dap->da_previous; dap->da_state &= ~DIRCHG; dap->da_pagedep = pagedep; } /* * We are deleting an entry that never made it to disk. * Mark it COMPLETE so we can delete its inode immediately. 
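 *
 * This is the create-then-unlink case, e.g. a scratch file that is
 * created and removed before its directory block is ever written out:
 * there is nothing on disk to roll back, so the dirrem is born
 * COMPLETE and the never-written inode can be reclaimed at once.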
*/ dirrem->dm_state |= COMPLETE; cancel_diradd(dap, dirrem, jremref, dotremref, dotdotremref); #ifdef SUJ_DEBUG if (isrmdir == 0) { struct worklist *wk; LIST_FOREACH(wk, &dirrem->dm_jwork, wk_list) if (wk->wk_state & (MKDIR_BODY | MKDIR_PARENT)) panic("bad wk %p (0x%X)\n", wk, wk->wk_state); } #endif return (dirrem); } /* * Directory entry change dependencies. * * Changing an existing directory entry requires that an add operation * be completed first followed by a deletion. The semantics for the addition * are identical to the description of adding a new entry above except * that the rollback is to the old inode number rather than zero. Once * the addition dependency is completed, the removal is done as described * in the removal routine above. */ /* * This routine should be called immediately after changing * a directory entry. The inode's link count should not be * decremented by the calling procedure -- the soft updates * code will perform this task when it is safe. */ void softdep_setup_directory_change(bp, dp, ip, newinum, isrmdir) struct buf *bp; /* buffer containing directory block */ struct inode *dp; /* inode for the directory being modified */ struct inode *ip; /* inode for directory entry being removed */ ino_t newinum; /* new inode number for changed entry */ int isrmdir; /* indicates if doing RMDIR */ { int offset; struct diradd *dap = NULL; struct dirrem *dirrem, *prevdirrem; struct pagedep *pagedep; struct inodedep *inodedep; struct jaddref *jaddref; struct mount *mp; struct ufsmount *ump; mp = ITOVFS(dp); ump = VFSTOUFS(mp); offset = blkoff(ump->um_fs, dp->i_offset); KASSERT(MOUNTEDSOFTDEP(mp) != 0, ("softdep_setup_directory_change called on non-softdep filesystem")); /* * Whiteouts do not need diradd dependencies. */ if (newinum != UFS_WINO) { dap = malloc(sizeof(struct diradd), M_DIRADD, M_SOFTDEP_FLAGS|M_ZERO); workitem_alloc(&dap->da_list, D_DIRADD, mp); dap->da_state = DIRCHG | ATTACHED | DEPCOMPLETE; dap->da_offset = offset; dap->da_newinum = newinum; LIST_INIT(&dap->da_jwork); } /* * Allocate a new dirrem and ACQUIRE_LOCK. */ dirrem = newdirrem(bp, dp, ip, isrmdir, &prevdirrem); pagedep = dirrem->dm_pagedep; /* * The possible values for isrmdir: * 0 - non-directory file rename * 1 - directory rename within same directory * inum - directory rename to new directory of given inode number * When renaming to a new directory, we are both deleting and * creating a new directory entry, so the link count on the new * directory should not change. Thus we do not need the followup * dirrem which is usually done in handle_workitem_remove. We set * the DIRCHG flag to tell handle_workitem_remove to skip the * followup dirrem. */ if (isrmdir > 1) dirrem->dm_state |= DIRCHG; /* * Whiteouts have no additional dependencies, * so just put the dirrem on the correct list. */ if (newinum == UFS_WINO) { if ((dirrem->dm_state & COMPLETE) == 0) { LIST_INSERT_HEAD(&pagedep->pd_dirremhd, dirrem, dm_next); } else { dirrem->dm_dirinum = pagedep->pd_ino; if (LIST_EMPTY(&dirrem->dm_jremrefhd)) add_to_worklist(&dirrem->dm_list, 0); } FREE_LOCK(ump); return; } /* * Add the dirrem to the inodedep's pending remove list for quick * discovery later. A valid nlinkdelta ensures that this lookup * will not fail. 
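 *
 * For example, an overwriting rename such as rename("a", "b") with "b"
 * already existing comes through this path: the diradd built above
 * rolls back to b's old inode number rather than to zero (DIRCHG),
 * and the dirrem just allocated describes that old inode so its link
 * count can be dropped once the rewritten entry is on disk.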
*/ if (inodedep_lookup(mp, ip->i_number, 0, &inodedep) == 0) panic("softdep_setup_directory_change: Lost inodedep."); dirrem->dm_state |= ONDEPLIST; LIST_INSERT_HEAD(&inodedep->id_dirremhd, dirrem, dm_inonext); /* * If the COMPLETE flag is clear, then there were no active * entries and we want to roll back to the previous inode until * the new inode is committed to disk. If the COMPLETE flag is * set, then we have deleted an entry that never made it to disk. * If the entry we deleted resulted from a name change, then the old * inode reference still resides on disk. Any rollback that we do * needs to be to that old inode (returned to us in prevdirrem). If * the entry we deleted resulted from a create, then there is * no entry on the disk, so we want to roll back to zero rather * than the uncommitted inode. In either of the COMPLETE cases we * want to immediately free the unwritten and unreferenced inode. */ if ((dirrem->dm_state & COMPLETE) == 0) { dap->da_previous = dirrem; } else { if (prevdirrem != NULL) { dap->da_previous = prevdirrem; } else { dap->da_state &= ~DIRCHG; dap->da_pagedep = pagedep; } dirrem->dm_dirinum = pagedep->pd_ino; if (LIST_EMPTY(&dirrem->dm_jremrefhd)) add_to_worklist(&dirrem->dm_list, 0); } /* * Lookup the jaddref for this journal entry. We must finish * initializing it and make the diradd write dependent on it. * If we're not journaling, put it on the id_bufwait list if the * inode is not yet written. If it is written, do the post-inode * write processing to put it on the id_pendinghd list. */ inodedep_lookup(mp, newinum, DEPALLOC, &inodedep); if (MOUNTEDSUJ(mp)) { jaddref = (struct jaddref *)TAILQ_LAST(&inodedep->id_inoreflst, inoreflst); KASSERT(jaddref != NULL && jaddref->ja_parent == dp->i_number, ("softdep_setup_directory_change: bad jaddref %p", jaddref)); jaddref->ja_diroff = dp->i_offset; jaddref->ja_diradd = dap; LIST_INSERT_HEAD(&pagedep->pd_diraddhd[DIRADDHASH(offset)], dap, da_pdlist); add_to_journal(&jaddref->ja_list); } else if ((inodedep->id_state & ALLCOMPLETE) == ALLCOMPLETE) { dap->da_state |= COMPLETE; LIST_INSERT_HEAD(&pagedep->pd_pendinghd, dap, da_pdlist); WORKLIST_INSERT(&inodedep->id_pendinghd, &dap->da_list); } else { LIST_INSERT_HEAD(&pagedep->pd_diraddhd[DIRADDHASH(offset)], dap, da_pdlist); WORKLIST_INSERT(&inodedep->id_bufwait, &dap->da_list); } /* * If we're making a new name for a directory that has not been * committed when need to move the dot and dotdot references to * this new name. */ if (inodedep->id_mkdiradd && dp->i_offset != DOTDOT_OFFSET) merge_diradd(inodedep, dap); FREE_LOCK(ump); } /* * Called whenever the link count on an inode is changed. * It creates an inode dependency so that the new reference(s) * to the inode cannot be committed to disk until the updated * inode has been written. */ void softdep_change_linkcnt(ip) struct inode *ip; /* the inode with the increased link count */ { struct inodedep *inodedep; struct ufsmount *ump; ump = ITOUMP(ip); KASSERT(MOUNTEDSOFTDEP(UFSTOVFS(ump)) != 0, ("softdep_change_linkcnt called on non-softdep filesystem")); ACQUIRE_LOCK(ump); inodedep_lookup(UFSTOVFS(ump), ip->i_number, DEPALLOC, &inodedep); if (ip->i_nlink < ip->i_effnlink) panic("softdep_change_linkcnt: bad delta"); inodedep->id_nlinkdelta = ip->i_nlink - ip->i_effnlink; FREE_LOCK(ump); } /* * Attach a sbdep dependency to the superblock buf so that we can keep * track of the head of the linked list of referenced but unlinked inodes. 
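 *
 * The on-disk head of that list is fs_sujfree.  Just before the
 * superblock is written, initiate_write_sbdep() below copies the
 * inode number of the first eligible unlinked inodedep into it, and
 * handle_written_sbdep() re-dirties the buffer if the head changed
 * while the write was in flight, so recovery never finds a stale head.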
*/ void softdep_setup_sbupdate(ump, fs, bp) struct ufsmount *ump; struct fs *fs; struct buf *bp; { struct sbdep *sbdep; struct worklist *wk; KASSERT(MOUNTEDSOFTDEP(UFSTOVFS(ump)) != 0, ("softdep_setup_sbupdate called on non-softdep filesystem")); LIST_FOREACH(wk, &bp->b_dep, wk_list) if (wk->wk_type == D_SBDEP) break; if (wk != NULL) return; sbdep = malloc(sizeof(struct sbdep), M_SBDEP, M_SOFTDEP_FLAGS); workitem_alloc(&sbdep->sb_list, D_SBDEP, UFSTOVFS(ump)); sbdep->sb_fs = fs; sbdep->sb_ump = ump; ACQUIRE_LOCK(ump); WORKLIST_INSERT(&bp->b_dep, &sbdep->sb_list); FREE_LOCK(ump); } /* * Return the first unlinked inodedep which is ready to be the head of the * list. The inodedep and all those after it must have valid next pointers. */ static struct inodedep * first_unlinked_inodedep(ump) struct ufsmount *ump; { struct inodedep *inodedep; struct inodedep *idp; LOCK_OWNED(ump); for (inodedep = TAILQ_LAST(&ump->softdep_unlinked, inodedeplst); inodedep; inodedep = idp) { if ((inodedep->id_state & UNLINKNEXT) == 0) return (NULL); idp = TAILQ_PREV(inodedep, inodedeplst, id_unlinked); if (idp == NULL || (idp->id_state & UNLINKNEXT) == 0) break; if ((inodedep->id_state & UNLINKPREV) == 0) break; } return (inodedep); } /* * Set the sujfree unlinked head pointer prior to writing a superblock. */ static void initiate_write_sbdep(sbdep) struct sbdep *sbdep; { struct inodedep *inodedep; struct fs *bpfs; struct fs *fs; bpfs = sbdep->sb_fs; fs = sbdep->sb_ump->um_fs; inodedep = first_unlinked_inodedep(sbdep->sb_ump); if (inodedep) { fs->fs_sujfree = inodedep->id_ino; inodedep->id_state |= UNLINKPREV; } else fs->fs_sujfree = 0; bpfs->fs_sujfree = fs->fs_sujfree; } /* * After a superblock is written determine whether it must be written again * due to a changing unlinked list head. */ static int handle_written_sbdep(sbdep, bp) struct sbdep *sbdep; struct buf *bp; { struct inodedep *inodedep; struct fs *fs; LOCK_OWNED(sbdep->sb_ump); fs = sbdep->sb_fs; /* * If the superblock doesn't match the in-memory list start over. */ inodedep = first_unlinked_inodedep(sbdep->sb_ump); if ((inodedep && fs->fs_sujfree != inodedep->id_ino) || (inodedep == NULL && fs->fs_sujfree != 0)) { bdirty(bp); return (1); } WORKITEM_FREE(sbdep, D_SBDEP); if (fs->fs_sujfree == 0) return (0); /* * Now that we have a record of this inode in stable store allow it * to be written to free up pending work. Inodes may see a lot of * write activity after they are unlinked which we must not hold up. */ for (; inodedep != NULL; inodedep = TAILQ_NEXT(inodedep, id_unlinked)) { if ((inodedep->id_state & UNLINKLINKS) != UNLINKLINKS) panic("handle_written_sbdep: Bad inodedep %p (0x%X)", inodedep, inodedep->id_state); if (inodedep->id_state & UNLINKONLIST) break; inodedep->id_state |= DEPCOMPLETE | UNLINKONLIST; } return (0); } /* * Mark an inodedep as unlinked and insert it into the in-memory unlinked list. */ static void unlinked_inodedep(mp, inodedep) struct mount *mp; struct inodedep *inodedep; { struct ufsmount *ump; ump = VFSTOUFS(mp); LOCK_OWNED(ump); if (MOUNTEDSUJ(mp) == 0) return; ump->um_fs->fs_fmod = 1; if (inodedep->id_state & UNLINKED) panic("unlinked_inodedep: %p already unlinked\n", inodedep); inodedep->id_state |= UNLINKED; TAILQ_INSERT_HEAD(&ump->softdep_unlinked, inodedep, id_unlinked); } /* * Remove an inodedep from the unlinked inodedep list. This may require * disk writes if the inode has made it that far. 
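 *
 * Disk writes are needed because the on-disk list is singly linked:
 * fs_sujfree names the first unlinked inode and each inode's
 * di_freelink names the next.  Unhooking an entry therefore requires
 * rewriting its predecessor first, either the previous inode, e.g. for
 * UFS2
 *
 *	((struct ufs2_dinode *)bp->b_data +
 *	    ino_to_fsbo(fs, pino))->di_freelink = nino;
 *
 * or the superblock when the entry is the current head, which is what
 * the loop below arranges.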
*/ static void clear_unlinked_inodedep(inodedep) struct inodedep *inodedep; { struct ufsmount *ump; struct inodedep *idp; struct inodedep *idn; struct fs *fs; struct buf *bp; ino_t ino; ino_t nino; ino_t pino; int error; ump = VFSTOUFS(inodedep->id_list.wk_mp); fs = ump->um_fs; ino = inodedep->id_ino; error = 0; for (;;) { LOCK_OWNED(ump); KASSERT((inodedep->id_state & UNLINKED) != 0, ("clear_unlinked_inodedep: inodedep %p not unlinked", inodedep)); /* * If nothing has yet been written simply remove us from * the in memory list and return. This is the most common * case where handle_workitem_remove() loses the final * reference. */ if ((inodedep->id_state & UNLINKLINKS) == 0) break; /* * If we have a NEXT pointer and no PREV pointer we can simply * clear NEXT's PREV and remove ourselves from the list. Be * careful not to clear PREV if the superblock points at * next as well. */ idn = TAILQ_NEXT(inodedep, id_unlinked); if ((inodedep->id_state & UNLINKLINKS) == UNLINKNEXT) { if (idn && fs->fs_sujfree != idn->id_ino) idn->id_state &= ~UNLINKPREV; break; } /* * Here we have an inodedep which is actually linked into * the list. We must remove it by forcing a write to the * link before us, whether it be the superblock or an inode. * Unfortunately the list may change while we're waiting * on the buf lock for either resource so we must loop until * we lock the right one. If both the superblock and an * inode point to this inode we must clear the inode first * followed by the superblock. */ idp = TAILQ_PREV(inodedep, inodedeplst, id_unlinked); pino = 0; if (idp && (idp->id_state & UNLINKNEXT)) pino = idp->id_ino; FREE_LOCK(ump); if (pino == 0) { bp = getblk(ump->um_devvp, btodb(fs->fs_sblockloc), (int)fs->fs_sbsize, 0, 0, 0); } else { error = bread(ump->um_devvp, fsbtodb(fs, ino_to_fsba(fs, pino)), (int)fs->fs_bsize, NOCRED, &bp); if (error) brelse(bp); } ACQUIRE_LOCK(ump); if (error) break; /* If the list has changed restart the loop. */ idp = TAILQ_PREV(inodedep, inodedeplst, id_unlinked); nino = 0; if (idp && (idp->id_state & UNLINKNEXT)) nino = idp->id_ino; if (nino != pino || (inodedep->id_state & UNLINKPREV) != UNLINKPREV) { FREE_LOCK(ump); brelse(bp); ACQUIRE_LOCK(ump); continue; } nino = 0; idn = TAILQ_NEXT(inodedep, id_unlinked); if (idn) nino = idn->id_ino; /* * Remove us from the in memory list. After this we cannot * access the inodedep. */ KASSERT((inodedep->id_state & UNLINKED) != 0, ("clear_unlinked_inodedep: inodedep %p not unlinked", inodedep)); inodedep->id_state &= ~(UNLINKED | UNLINKLINKS | UNLINKONLIST); TAILQ_REMOVE(&ump->softdep_unlinked, inodedep, id_unlinked); FREE_LOCK(ump); /* * The predecessor's next pointer is manually updated here * so that the NEXT flag is never cleared for an element * that is in the list. */ if (pino == 0) { bcopy((caddr_t)fs, bp->b_data, (u_int)fs->fs_sbsize); ffs_oldfscompat_write((struct fs *)bp->b_data, ump); softdep_setup_sbupdate(ump, (struct fs *)bp->b_data, bp); } else if (fs->fs_magic == FS_UFS1_MAGIC) ((struct ufs1_dinode *)bp->b_data + ino_to_fsbo(fs, pino))->di_freelink = nino; else ((struct ufs2_dinode *)bp->b_data + ino_to_fsbo(fs, pino))->di_freelink = nino; /* * If the bwrite fails we have no recourse to recover. The * filesystem is corrupted already. */ bwrite(bp); ACQUIRE_LOCK(ump); /* * If the superblock pointer still needs to be cleared force * a write here. 
*/ if (fs->fs_sujfree == ino) { FREE_LOCK(ump); bp = getblk(ump->um_devvp, btodb(fs->fs_sblockloc), (int)fs->fs_sbsize, 0, 0, 0); bcopy((caddr_t)fs, bp->b_data, (u_int)fs->fs_sbsize); ffs_oldfscompat_write((struct fs *)bp->b_data, ump); softdep_setup_sbupdate(ump, (struct fs *)bp->b_data, bp); bwrite(bp); ACQUIRE_LOCK(ump); } if (fs->fs_sujfree != ino) return; panic("clear_unlinked_inodedep: Failed to clear free head"); } if (inodedep->id_ino == fs->fs_sujfree) panic("clear_unlinked_inodedep: Freeing head of free list"); inodedep->id_state &= ~(UNLINKED | UNLINKLINKS | UNLINKONLIST); TAILQ_REMOVE(&ump->softdep_unlinked, inodedep, id_unlinked); return; } /* * This workitem decrements the inode's link count. * If the link count reaches zero, the file is removed. */ static int handle_workitem_remove(dirrem, flags) struct dirrem *dirrem; int flags; { struct inodedep *inodedep; struct workhead dotdotwk; struct worklist *wk; struct ufsmount *ump; struct mount *mp; struct vnode *vp; struct inode *ip; ino_t oldinum; if (dirrem->dm_state & ONWORKLIST) panic("handle_workitem_remove: dirrem %p still on worklist", dirrem); oldinum = dirrem->dm_oldinum; mp = dirrem->dm_list.wk_mp; ump = VFSTOUFS(mp); flags |= LK_EXCLUSIVE; if (ffs_vgetf(mp, oldinum, flags, &vp, FFSV_FORCEINSMQ) != 0) return (EBUSY); ip = VTOI(vp); ACQUIRE_LOCK(ump); if ((inodedep_lookup(mp, oldinum, 0, &inodedep)) == 0) panic("handle_workitem_remove: lost inodedep"); if (dirrem->dm_state & ONDEPLIST) LIST_REMOVE(dirrem, dm_inonext); KASSERT(LIST_EMPTY(&dirrem->dm_jremrefhd), ("handle_workitem_remove: Journal entries not written.")); /* * Move all dependencies waiting on the remove to complete * from the dirrem to the inode inowait list to be completed * after the inode has been updated and written to disk. Any * marked MKDIR_PARENT are saved to be completed when the .. ref * is removed. */ LIST_INIT(&dotdotwk); while ((wk = LIST_FIRST(&dirrem->dm_jwork)) != NULL) { WORKLIST_REMOVE(wk); if (wk->wk_state & MKDIR_PARENT) { wk->wk_state &= ~MKDIR_PARENT; WORKLIST_INSERT(&dotdotwk, wk); continue; } WORKLIST_INSERT(&inodedep->id_inowait, wk); } LIST_SWAP(&dirrem->dm_jwork, &dotdotwk, worklist, wk_list); /* * Normal file deletion. */ if ((dirrem->dm_state & RMDIR) == 0) { ip->i_nlink--; DIP_SET(ip, i_nlink, ip->i_nlink); ip->i_flag |= IN_CHANGE; if (ip->i_nlink < ip->i_effnlink) panic("handle_workitem_remove: bad file delta"); if (ip->i_nlink == 0) unlinked_inodedep(mp, inodedep); inodedep->id_nlinkdelta = ip->i_nlink - ip->i_effnlink; KASSERT(LIST_EMPTY(&dirrem->dm_jwork), ("handle_workitem_remove: worklist not empty. %s", TYPENAME(LIST_FIRST(&dirrem->dm_jwork)->wk_type))); WORKITEM_FREE(dirrem, D_DIRREM); FREE_LOCK(ump); goto out; } /* * Directory deletion. Decrement reference count for both the * just deleted parent directory entry and the reference for ".". * Arrange to have the reference count on the parent decremented * to account for the loss of "..". */ ip->i_nlink -= 2; DIP_SET(ip, i_nlink, ip->i_nlink); ip->i_flag |= IN_CHANGE; if (ip->i_nlink < ip->i_effnlink) panic("handle_workitem_remove: bad dir delta"); if (ip->i_nlink == 0) unlinked_inodedep(mp, inodedep); inodedep->id_nlinkdelta = ip->i_nlink - ip->i_effnlink; /* * Rename a directory to a new parent. Since, we are both deleting * and creating a new directory entry, the link count on the new * directory should not change. Thus we skip the followup dirrem. 
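 *
 * For an ordinary rmdir, by contrast, the followup does run: the
 * dirrem is retargeted at the parent below (dm_oldinum = dm_dirinum)
 * and queued on the parent's inodedep, so the link contributed by the
 * victim's ".." is dropped as a separate removal against the parent.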
*/ if (dirrem->dm_state & DIRCHG) { KASSERT(LIST_EMPTY(&dirrem->dm_jwork), ("handle_workitem_remove: DIRCHG and worklist not empty.")); WORKITEM_FREE(dirrem, D_DIRREM); FREE_LOCK(ump); goto out; } dirrem->dm_state = ONDEPLIST; dirrem->dm_oldinum = dirrem->dm_dirinum; /* * Place the dirrem on the parent's diremhd list. */ if (inodedep_lookup(mp, dirrem->dm_oldinum, 0, &inodedep) == 0) panic("handle_workitem_remove: lost dir inodedep"); LIST_INSERT_HEAD(&inodedep->id_dirremhd, dirrem, dm_inonext); /* * If the allocated inode has never been written to disk, then * the on-disk inode is zero'ed and we can remove the file * immediately. When journaling if the inode has been marked * unlinked and not DEPCOMPLETE we know it can never be written. */ inodedep_lookup(mp, oldinum, 0, &inodedep); if (inodedep == NULL || (inodedep->id_state & (DEPCOMPLETE | UNLINKED)) == UNLINKED || check_inode_unwritten(inodedep)) { FREE_LOCK(ump); vput(vp); return handle_workitem_remove(dirrem, flags); } WORKLIST_INSERT(&inodedep->id_inowait, &dirrem->dm_list); FREE_LOCK(ump); ip->i_flag |= IN_CHANGE; out: ffs_update(vp, 0); vput(vp); return (0); } /* * Inode de-allocation dependencies. * * When an inode's link count is reduced to zero, it can be de-allocated. We * found it convenient to postpone de-allocation until after the inode is * written to disk with its new link count (zero). At this point, all of the * on-disk inode's block pointers are nullified and, with careful dependency * list ordering, all dependencies related to the inode will be satisfied and * the corresponding dependency structures de-allocated. So, if/when the * inode is reused, there will be no mixing of old dependencies with new * ones. This artificial dependency is set up by the block de-allocation * procedure above (softdep_setup_freeblocks) and completed by the * following procedure. */ static void handle_workitem_freefile(freefile) struct freefile *freefile; { struct workhead wkhd; struct fs *fs; struct inodedep *idp; struct ufsmount *ump; int error; ump = VFSTOUFS(freefile->fx_list.wk_mp); fs = ump->um_fs; #ifdef DEBUG ACQUIRE_LOCK(ump); error = inodedep_lookup(UFSTOVFS(ump), freefile->fx_oldinum, 0, &idp); FREE_LOCK(ump); if (error) panic("handle_workitem_freefile: inodedep %p survived", idp); #endif UFS_LOCK(ump); fs->fs_pendinginodes -= 1; UFS_UNLOCK(ump); LIST_INIT(&wkhd); LIST_SWAP(&freefile->fx_jwork, &wkhd, worklist, wk_list); if ((error = ffs_freefile(ump, fs, freefile->fx_devvp, freefile->fx_oldinum, freefile->fx_mode, &wkhd)) != 0) softdep_error("handle_workitem_freefile", error); ACQUIRE_LOCK(ump); WORKITEM_FREE(freefile, D_FREEFILE); FREE_LOCK(ump); } /* * Helper function which unlinks marker element from work list and returns * the next element on the list. */ static __inline struct worklist * markernext(struct worklist *marker) { struct worklist *next; next = LIST_NEXT(marker, wk_list); LIST_REMOVE(marker, wk_list); return next; } /* * Disk writes. * * The dependency structures constructed above are most actively used when file * system blocks are written to disk. No constraints are placed on when a * block can be written, but unsatisfied update dependencies are made safe by * modifying (or replacing) the source memory for the duration of the disk * write. When the disk write completes, the memory block is again brought * up-to-date. * * In-core inode structure reclamation. * * Because there are a finite number of "in-core" inode structures, they are * reused regularly. 
By transferring all inode-related dependencies to the * in-memory inode block and indexing them separately (via "inodedep"s), we * can allow "in-core" inode structures to be reused at any time and avoid * any increase in contention. * * Called just before entering the device driver to initiate a new disk I/O. * The buffer must be locked, thus, no I/O completion operations can occur * while we are manipulating its associated dependencies. */ static void softdep_disk_io_initiation(bp) struct buf *bp; /* structure describing disk write to occur */ { struct worklist *wk; struct worklist marker; struct inodedep *inodedep; struct freeblks *freeblks; struct jblkdep *jblkdep; struct newblk *newblk; struct ufsmount *ump; /* * We only care about write operations. There should never * be dependencies for reads. */ if (bp->b_iocmd != BIO_WRITE) panic("softdep_disk_io_initiation: not write"); if (bp->b_vflags & BV_BKGRDINPROG) panic("softdep_disk_io_initiation: Writing buffer with " "background write in progress: %p", bp); if ((wk = LIST_FIRST(&bp->b_dep)) == NULL) return; ump = VFSTOUFS(wk->wk_mp); marker.wk_type = D_LAST + 1; /* Not a normal workitem */ PHOLD(curproc); /* Don't swap out kernel stack */ ACQUIRE_LOCK(ump); /* * Do any necessary pre-I/O processing. */ for (wk = LIST_FIRST(&bp->b_dep); wk != NULL; wk = markernext(&marker)) { LIST_INSERT_AFTER(wk, &marker, wk_list); switch (wk->wk_type) { case D_PAGEDEP: initiate_write_filepage(WK_PAGEDEP(wk), bp); continue; case D_INODEDEP: inodedep = WK_INODEDEP(wk); if (inodedep->id_fs->fs_magic == FS_UFS1_MAGIC) initiate_write_inodeblock_ufs1(inodedep, bp); else initiate_write_inodeblock_ufs2(inodedep, bp); continue; case D_INDIRDEP: initiate_write_indirdep(WK_INDIRDEP(wk), bp); continue; case D_BMSAFEMAP: initiate_write_bmsafemap(WK_BMSAFEMAP(wk), bp); continue; case D_JSEG: WK_JSEG(wk)->js_buf = NULL; continue; case D_FREEBLKS: freeblks = WK_FREEBLKS(wk); jblkdep = LIST_FIRST(&freeblks->fb_jblkdephd); /* * We have to wait for the freeblks to be journaled * before we can write an inodeblock with updated * pointers. Be careful to arrange the marker so * we revisit the freeblks if it's not removed by * the first jwait(). */ if (jblkdep != NULL) { LIST_REMOVE(&marker, wk_list); LIST_INSERT_BEFORE(wk, &marker, wk_list); jwait(&jblkdep->jb_list, MNT_WAIT); } continue; case D_ALLOCDIRECT: case D_ALLOCINDIR: /* * We have to wait for the jnewblk to be journaled * before we can write to a block if the contents * may be confused with an earlier file's indirect * at recovery time. Handle the marker as described * above. */ newblk = WK_NEWBLK(wk); if (newblk->nb_jnewblk != NULL && indirblk_lookup(newblk->nb_list.wk_mp, newblk->nb_newblkno)) { LIST_REMOVE(&marker, wk_list); LIST_INSERT_BEFORE(wk, &marker, wk_list); jwait(&newblk->nb_jnewblk->jn_list, MNT_WAIT); } continue; case D_SBDEP: initiate_write_sbdep(WK_SBDEP(wk)); continue; case D_MKDIR: case D_FREEWORK: case D_FREEDEP: case D_JSEGDEP: continue; default: panic("handle_disk_io_initiation: Unexpected type %s", TYPENAME(wk->wk_type)); /* NOTREACHED */ } } FREE_LOCK(ump); PRELE(curproc); /* Allow swapout of kernel stack */ } /* * Called from within the procedure above to deal with unsatisfied * allocation dependencies in a directory. The buffer must be locked, * thus, no I/O completion operations can occur while we are * manipulating its associated dependencies. 
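 *
 * The rollback done here is per directory entry: every diradd that is
 * not yet safe has its on-disk inode number replaced for the duration
 * of the write,
 *
 *	ep->d_ino = dap->da_previous->dm_oldinum;	(DIRCHG case)
 *	ep->d_ino = 0;					(otherwise)
 *
 * and is flagged UNDONE, after any pending journal remove and move
 * records have been waited on so that a conflicting add can never
 * become visible before its remove.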
*/ static void initiate_write_filepage(pagedep, bp) struct pagedep *pagedep; struct buf *bp; { struct jremref *jremref; struct jmvref *jmvref; struct dirrem *dirrem; struct diradd *dap; struct direct *ep; int i; if (pagedep->pd_state & IOSTARTED) { /* * This can only happen if there is a driver that does not * understand chaining. Here biodone will reissue the call * to strategy for the incomplete buffers. */ printf("initiate_write_filepage: already started\n"); return; } pagedep->pd_state |= IOSTARTED; /* * Wait for all journal remove dependencies to hit the disk. * We can not allow any potentially conflicting directory adds * to be visible before removes and rollback is too difficult. * The per-filesystem lock may be dropped and re-acquired, however * we hold the buf locked so the dependency can not go away. */ LIST_FOREACH(dirrem, &pagedep->pd_dirremhd, dm_next) while ((jremref = LIST_FIRST(&dirrem->dm_jremrefhd)) != NULL) jwait(&jremref->jr_list, MNT_WAIT); while ((jmvref = LIST_FIRST(&pagedep->pd_jmvrefhd)) != NULL) jwait(&jmvref->jm_list, MNT_WAIT); for (i = 0; i < DAHASHSZ; i++) { LIST_FOREACH(dap, &pagedep->pd_diraddhd[i], da_pdlist) { ep = (struct direct *) ((char *)bp->b_data + dap->da_offset); if (ep->d_ino != dap->da_newinum) panic("%s: dir inum %ju != new %ju", "initiate_write_filepage", (uintmax_t)ep->d_ino, (uintmax_t)dap->da_newinum); if (dap->da_state & DIRCHG) ep->d_ino = dap->da_previous->dm_oldinum; else ep->d_ino = 0; dap->da_state &= ~ATTACHED; dap->da_state |= UNDONE; } } } /* * Version of initiate_write_inodeblock that handles UFS1 dinodes. * Note that any bug fixes made to this routine must be done in the * version found below. * * Called from within the procedure above to deal with unsatisfied * allocation dependencies in an inodeblock. The buffer must be * locked, thus, no I/O completion operations can occur while we * are manipulating its associated dependencies. */ static void initiate_write_inodeblock_ufs1(inodedep, bp) struct inodedep *inodedep; struct buf *bp; /* The inode block */ { struct allocdirect *adp, *lastadp; struct ufs1_dinode *dp; struct ufs1_dinode *sip; struct inoref *inoref; struct ufsmount *ump; struct fs *fs; ufs_lbn_t i; #ifdef INVARIANTS ufs_lbn_t prevlbn = 0; #endif int deplist; if (inodedep->id_state & IOSTARTED) panic("initiate_write_inodeblock_ufs1: already started"); inodedep->id_state |= IOSTARTED; fs = inodedep->id_fs; ump = VFSTOUFS(inodedep->id_list.wk_mp); LOCK_OWNED(ump); dp = (struct ufs1_dinode *)bp->b_data + ino_to_fsbo(fs, inodedep->id_ino); /* * If we're on the unlinked list but have not yet written our * next pointer initialize it here. */ if ((inodedep->id_state & (UNLINKED | UNLINKNEXT)) == UNLINKED) { struct inodedep *inon; inon = TAILQ_NEXT(inodedep, id_unlinked); dp->di_freelink = inon ? inon->id_ino : 0; } /* * If the bitmap is not yet written, then the allocated * inode cannot be written to disk. */ if ((inodedep->id_state & DEPCOMPLETE) == 0) { if (inodedep->id_savedino1 != NULL) panic("initiate_write_inodeblock_ufs1: I/O underway"); FREE_LOCK(ump); sip = malloc(sizeof(struct ufs1_dinode), M_SAVEDINO, M_SOFTDEP_FLAGS); ACQUIRE_LOCK(ump); inodedep->id_savedino1 = sip; *inodedep->id_savedino1 = *dp; bzero((caddr_t)dp, sizeof(struct ufs1_dinode)); dp->di_gen = inodedep->id_savedino1->di_gen; dp->di_freelink = inodedep->id_savedino1->di_freelink; return; } /* * If no dependencies, then there is nothing to roll back. 
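 *
 * Otherwise the rollback that follows saves the current size and link
 * count (id_savedsize, id_savednlink), reverts di_nlink to that of the
 * first unwritten journal entry, marks the pending allocdirects
 * UNDONE, and rolls block pointers and di_size back so that the
 * on-disk inode never claims blocks whose contents are not yet safe.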
*/ inodedep->id_savedsize = dp->di_size; inodedep->id_savedextsize = 0; inodedep->id_savednlink = dp->di_nlink; if (TAILQ_EMPTY(&inodedep->id_inoupdt) && TAILQ_EMPTY(&inodedep->id_inoreflst)) return; /* * Revert the link count to that of the first unwritten journal entry. */ inoref = TAILQ_FIRST(&inodedep->id_inoreflst); if (inoref) dp->di_nlink = inoref->if_nlink; /* * Set the dependencies to busy. */ for (deplist = 0, adp = TAILQ_FIRST(&inodedep->id_inoupdt); adp; adp = TAILQ_NEXT(adp, ad_next)) { #ifdef INVARIANTS if (deplist != 0 && prevlbn >= adp->ad_offset) panic("softdep_write_inodeblock: lbn order"); prevlbn = adp->ad_offset; if (adp->ad_offset < UFS_NDADDR && dp->di_db[adp->ad_offset] != adp->ad_newblkno) panic("%s: direct pointer #%jd mismatch %d != %jd", "softdep_write_inodeblock", (intmax_t)adp->ad_offset, dp->di_db[adp->ad_offset], (intmax_t)adp->ad_newblkno); if (adp->ad_offset >= UFS_NDADDR && dp->di_ib[adp->ad_offset - UFS_NDADDR] != adp->ad_newblkno) panic("%s: indirect pointer #%jd mismatch %d != %jd", "softdep_write_inodeblock", (intmax_t)adp->ad_offset - UFS_NDADDR, dp->di_ib[adp->ad_offset - UFS_NDADDR], (intmax_t)adp->ad_newblkno); deplist |= 1 << adp->ad_offset; if ((adp->ad_state & ATTACHED) == 0) panic("softdep_write_inodeblock: Unknown state 0x%x", adp->ad_state); #endif /* INVARIANTS */ adp->ad_state &= ~ATTACHED; adp->ad_state |= UNDONE; } /* * The on-disk inode cannot claim to be any larger than the last * fragment that has been written. Otherwise, the on-disk inode * might have fragments that were not the last block in the file * which would corrupt the filesystem. */ for (lastadp = NULL, adp = TAILQ_FIRST(&inodedep->id_inoupdt); adp; lastadp = adp, adp = TAILQ_NEXT(adp, ad_next)) { if (adp->ad_offset >= UFS_NDADDR) break; dp->di_db[adp->ad_offset] = adp->ad_oldblkno; /* keep going until hitting a rollback to a frag */ if (adp->ad_oldsize == 0 || adp->ad_oldsize == fs->fs_bsize) continue; dp->di_size = fs->fs_bsize * adp->ad_offset + adp->ad_oldsize; for (i = adp->ad_offset + 1; i < UFS_NDADDR; i++) { #ifdef INVARIANTS if (dp->di_db[i] != 0 && (deplist & (1 << i)) == 0) panic("softdep_write_inodeblock: lost dep1"); #endif /* INVARIANTS */ dp->di_db[i] = 0; } for (i = 0; i < UFS_NIADDR; i++) { #ifdef INVARIANTS if (dp->di_ib[i] != 0 && (deplist & ((1 << UFS_NDADDR) << i)) == 0) panic("softdep_write_inodeblock: lost dep2"); #endif /* INVARIANTS */ dp->di_ib[i] = 0; } return; } /* * If we have zero'ed out the last allocated block of the file, * roll back the size to the last currently allocated block. * We know that this last allocated block is a full-sized as * we already checked for fragments in the loop above. */ if (lastadp != NULL && dp->di_size <= (lastadp->ad_offset + 1) * fs->fs_bsize) { for (i = lastadp->ad_offset; i >= 0; i--) if (dp->di_db[i] != 0) break; dp->di_size = (i + 1) * fs->fs_bsize; } /* * The only dependencies are for indirect blocks. * * The file size for indirect block additions is not guaranteed. * Such a guarantee would be non-trivial to achieve. The conventional * synchronous write implementation also does not make this guarantee. * Fsck should catch and fix discrepancies. Arguably, the file size * can be over-estimated without destroying integrity when the file * moves into the indirect blocks (i.e., is large). If we want to * postpone fsck, we are stuck with this argument. */ for (; adp; adp = TAILQ_NEXT(adp, ad_next)) dp->di_ib[adp->ad_offset - UFS_NDADDR] = 0; } /* * Version of initiate_write_inodeblock that handles UFS2 dinodes. 
* Note that any bug fixes made to this routine must be done in the * version found above. * * Called from within the procedure above to deal with unsatisfied * allocation dependencies in an inodeblock. The buffer must be * locked, thus, no I/O completion operations can occur while we * are manipulating its associated dependencies. */ static void initiate_write_inodeblock_ufs2(inodedep, bp) struct inodedep *inodedep; struct buf *bp; /* The inode block */ { struct allocdirect *adp, *lastadp; struct ufs2_dinode *dp; struct ufs2_dinode *sip; struct inoref *inoref; struct ufsmount *ump; struct fs *fs; ufs_lbn_t i; #ifdef INVARIANTS ufs_lbn_t prevlbn = 0; #endif int deplist; if (inodedep->id_state & IOSTARTED) panic("initiate_write_inodeblock_ufs2: already started"); inodedep->id_state |= IOSTARTED; fs = inodedep->id_fs; ump = VFSTOUFS(inodedep->id_list.wk_mp); LOCK_OWNED(ump); dp = (struct ufs2_dinode *)bp->b_data + ino_to_fsbo(fs, inodedep->id_ino); /* * If we're on the unlinked list but have not yet written our * next pointer initialize it here. */ if ((inodedep->id_state & (UNLINKED | UNLINKNEXT)) == UNLINKED) { struct inodedep *inon; inon = TAILQ_NEXT(inodedep, id_unlinked); dp->di_freelink = inon ? inon->id_ino : 0; } /* * If the bitmap is not yet written, then the allocated * inode cannot be written to disk. */ if ((inodedep->id_state & DEPCOMPLETE) == 0) { if (inodedep->id_savedino2 != NULL) panic("initiate_write_inodeblock_ufs2: I/O underway"); FREE_LOCK(ump); sip = malloc(sizeof(struct ufs2_dinode), M_SAVEDINO, M_SOFTDEP_FLAGS); ACQUIRE_LOCK(ump); inodedep->id_savedino2 = sip; *inodedep->id_savedino2 = *dp; bzero((caddr_t)dp, sizeof(struct ufs2_dinode)); dp->di_gen = inodedep->id_savedino2->di_gen; dp->di_freelink = inodedep->id_savedino2->di_freelink; return; } /* * If no dependencies, then there is nothing to roll back. */ inodedep->id_savedsize = dp->di_size; inodedep->id_savedextsize = dp->di_extsize; inodedep->id_savednlink = dp->di_nlink; if (TAILQ_EMPTY(&inodedep->id_inoupdt) && TAILQ_EMPTY(&inodedep->id_extupdt) && TAILQ_EMPTY(&inodedep->id_inoreflst)) return; /* * Revert the link count to that of the first unwritten journal entry. */ inoref = TAILQ_FIRST(&inodedep->id_inoreflst); if (inoref) dp->di_nlink = inoref->if_nlink; /* * Set the ext data dependencies to busy. */ for (deplist = 0, adp = TAILQ_FIRST(&inodedep->id_extupdt); adp; adp = TAILQ_NEXT(adp, ad_next)) { #ifdef INVARIANTS if (deplist != 0 && prevlbn >= adp->ad_offset) panic("softdep_write_inodeblock: lbn order"); prevlbn = adp->ad_offset; if (dp->di_extb[adp->ad_offset] != adp->ad_newblkno) panic("%s: direct pointer #%jd mismatch %jd != %jd", "softdep_write_inodeblock", (intmax_t)adp->ad_offset, (intmax_t)dp->di_extb[adp->ad_offset], (intmax_t)adp->ad_newblkno); deplist |= 1 << adp->ad_offset; if ((adp->ad_state & ATTACHED) == 0) panic("softdep_write_inodeblock: Unknown state 0x%x", adp->ad_state); #endif /* INVARIANTS */ adp->ad_state &= ~ATTACHED; adp->ad_state |= UNDONE; } /* * The on-disk inode cannot claim to be any larger than the last * fragment that has been written. Otherwise, the on-disk inode * might have fragments that were not the last block in the ext * data which would corrupt the filesystem. 
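 *
 * As a worked example (the sizes are only illustrative): with 32K
 * blocks, if the last safely written ext allocation at ad_offset 3 was
 * an 8K fragment that has since grown to a full block, the rollback
 * below records
 *
 *	di_extsize = fs_bsize * ad_offset + ad_oldsize
 *	           = 32768 * 3 + 8192 = 106496
 *
 * rather than the new length, so recovery never sees ext data that was
 * never actually written.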
*/ for (lastadp = NULL, adp = TAILQ_FIRST(&inodedep->id_extupdt); adp; lastadp = adp, adp = TAILQ_NEXT(adp, ad_next)) { dp->di_extb[adp->ad_offset] = adp->ad_oldblkno; /* keep going until hitting a rollback to a frag */ if (adp->ad_oldsize == 0 || adp->ad_oldsize == fs->fs_bsize) continue; dp->di_extsize = fs->fs_bsize * adp->ad_offset + adp->ad_oldsize; for (i = adp->ad_offset + 1; i < UFS_NXADDR; i++) { #ifdef INVARIANTS if (dp->di_extb[i] != 0 && (deplist & (1 << i)) == 0) panic("softdep_write_inodeblock: lost dep1"); #endif /* INVARIANTS */ dp->di_extb[i] = 0; } lastadp = NULL; break; } /* * If we have zero'ed out the last allocated block of the ext * data, roll back the size to the last currently allocated block. * We know that this last allocated block is a full-sized as * we already checked for fragments in the loop above. */ if (lastadp != NULL && dp->di_extsize <= (lastadp->ad_offset + 1) * fs->fs_bsize) { for (i = lastadp->ad_offset; i >= 0; i--) if (dp->di_extb[i] != 0) break; dp->di_extsize = (i + 1) * fs->fs_bsize; } /* * Set the file data dependencies to busy. */ for (deplist = 0, adp = TAILQ_FIRST(&inodedep->id_inoupdt); adp; adp = TAILQ_NEXT(adp, ad_next)) { #ifdef INVARIANTS if (deplist != 0 && prevlbn >= adp->ad_offset) panic("softdep_write_inodeblock: lbn order"); if ((adp->ad_state & ATTACHED) == 0) panic("inodedep %p and adp %p not attached", inodedep, adp); prevlbn = adp->ad_offset; if (adp->ad_offset < UFS_NDADDR && dp->di_db[adp->ad_offset] != adp->ad_newblkno) panic("%s: direct pointer #%jd mismatch %jd != %jd", "softdep_write_inodeblock", (intmax_t)adp->ad_offset, (intmax_t)dp->di_db[adp->ad_offset], (intmax_t)adp->ad_newblkno); if (adp->ad_offset >= UFS_NDADDR && dp->di_ib[adp->ad_offset - UFS_NDADDR] != adp->ad_newblkno) panic("%s indirect pointer #%jd mismatch %jd != %jd", "softdep_write_inodeblock:", (intmax_t)adp->ad_offset - UFS_NDADDR, (intmax_t)dp->di_ib[adp->ad_offset - UFS_NDADDR], (intmax_t)adp->ad_newblkno); deplist |= 1 << adp->ad_offset; if ((adp->ad_state & ATTACHED) == 0) panic("softdep_write_inodeblock: Unknown state 0x%x", adp->ad_state); #endif /* INVARIANTS */ adp->ad_state &= ~ATTACHED; adp->ad_state |= UNDONE; } /* * The on-disk inode cannot claim to be any larger than the last * fragment that has been written. Otherwise, the on-disk inode * might have fragments that were not the last block in the file * which would corrupt the filesystem. */ for (lastadp = NULL, adp = TAILQ_FIRST(&inodedep->id_inoupdt); adp; lastadp = adp, adp = TAILQ_NEXT(adp, ad_next)) { if (adp->ad_offset >= UFS_NDADDR) break; dp->di_db[adp->ad_offset] = adp->ad_oldblkno; /* keep going until hitting a rollback to a frag */ if (adp->ad_oldsize == 0 || adp->ad_oldsize == fs->fs_bsize) continue; dp->di_size = fs->fs_bsize * adp->ad_offset + adp->ad_oldsize; for (i = adp->ad_offset + 1; i < UFS_NDADDR; i++) { #ifdef INVARIANTS if (dp->di_db[i] != 0 && (deplist & (1 << i)) == 0) panic("softdep_write_inodeblock: lost dep2"); #endif /* INVARIANTS */ dp->di_db[i] = 0; } for (i = 0; i < UFS_NIADDR; i++) { #ifdef INVARIANTS if (dp->di_ib[i] != 0 && (deplist & ((1 << UFS_NDADDR) << i)) == 0) panic("softdep_write_inodeblock: lost dep3"); #endif /* INVARIANTS */ dp->di_ib[i] = 0; } return; } /* * If we have zero'ed out the last allocated block of the file, * roll back the size to the last currently allocated block. * We know that this last allocated block is a full-sized as * we already checked for fragments in the loop above. 
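 * The scan below walks di_db[] backwards from lastadp->ad_offset to the
 * last remaining non-zero direct pointer and clamps di_size to
 * (i + 1) * fs_bsize.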
*/ if (lastadp != NULL && dp->di_size <= (lastadp->ad_offset + 1) * fs->fs_bsize) { for (i = lastadp->ad_offset; i >= 0; i--) if (dp->di_db[i] != 0) break; dp->di_size = (i + 1) * fs->fs_bsize; } /* * The only dependencies are for indirect blocks. * * The file size for indirect block additions is not guaranteed. * Such a guarantee would be non-trivial to achieve. The conventional * synchronous write implementation also does not make this guarantee. * Fsck should catch and fix discrepancies. Arguably, the file size * can be over-estimated without destroying integrity when the file * moves into the indirect blocks (i.e., is large). If we want to * postpone fsck, we are stuck with this argument. */ for (; adp; adp = TAILQ_NEXT(adp, ad_next)) dp->di_ib[adp->ad_offset - UFS_NDADDR] = 0; } /* * Cancel an indirdep as a result of truncation. Release all of the * children allocindirs and place their journal work on the appropriate * list. */ static void cancel_indirdep(indirdep, bp, freeblks) struct indirdep *indirdep; struct buf *bp; struct freeblks *freeblks; { struct allocindir *aip; /* * None of the indirect pointers will ever be visible, * so they can simply be tossed. GOINGAWAY ensures * that allocated pointers will be saved in the buffer * cache until they are freed. Note that they will * only be able to be found by their physical address * since the inode mapping the logical address will * be gone. The save buffer used for the safe copy * was allocated in setup_allocindir_phase2 using * the physical address so it could be used for this * purpose. Hence we swap the safe copy with the real * copy, allowing the safe copy to be freed and holding * on to the real copy for later use in indir_trunc. */ if (indirdep->ir_state & GOINGAWAY) panic("cancel_indirdep: already gone"); if ((indirdep->ir_state & DEPCOMPLETE) == 0) { indirdep->ir_state |= DEPCOMPLETE; LIST_REMOVE(indirdep, ir_next); } indirdep->ir_state |= GOINGAWAY; /* * Pass in bp for blocks still have journal writes * pending so we can cancel them on their own. */ while ((aip = LIST_FIRST(&indirdep->ir_deplisthd)) != NULL) cancel_allocindir(aip, bp, freeblks, 0); while ((aip = LIST_FIRST(&indirdep->ir_donehd)) != NULL) cancel_allocindir(aip, NULL, freeblks, 0); while ((aip = LIST_FIRST(&indirdep->ir_writehd)) != NULL) cancel_allocindir(aip, NULL, freeblks, 0); while ((aip = LIST_FIRST(&indirdep->ir_completehd)) != NULL) cancel_allocindir(aip, NULL, freeblks, 0); /* * If there are pending partial truncations we need to keep the * old block copy around until they complete. This is because * the current b_data is not a perfect superset of the available * blocks. */ if (TAILQ_EMPTY(&indirdep->ir_trunc)) bcopy(bp->b_data, indirdep->ir_savebp->b_data, bp->b_bcount); else bcopy(bp->b_data, indirdep->ir_saveddata, bp->b_bcount); WORKLIST_REMOVE(&indirdep->ir_list); WORKLIST_INSERT(&indirdep->ir_savebp->b_dep, &indirdep->ir_list); indirdep->ir_bp = NULL; indirdep->ir_freeblks = freeblks; } /* * Free an indirdep once it no longer has new pointers to track. 
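 * All of its allocindir lists and its truncation list must already be
 * empty; the KASSERTs below verify this before the work item is freed.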
*/ static void free_indirdep(indirdep) struct indirdep *indirdep; { KASSERT(TAILQ_EMPTY(&indirdep->ir_trunc), ("free_indirdep: Indir trunc list not empty.")); KASSERT(LIST_EMPTY(&indirdep->ir_completehd), ("free_indirdep: Complete head not empty.")); KASSERT(LIST_EMPTY(&indirdep->ir_writehd), ("free_indirdep: write head not empty.")); KASSERT(LIST_EMPTY(&indirdep->ir_donehd), ("free_indirdep: done head not empty.")); KASSERT(LIST_EMPTY(&indirdep->ir_deplisthd), ("free_indirdep: deplist head not empty.")); KASSERT((indirdep->ir_state & DEPCOMPLETE), ("free_indirdep: %p still on newblk list.", indirdep)); KASSERT(indirdep->ir_saveddata == NULL, ("free_indirdep: %p still has saved data.", indirdep)); if (indirdep->ir_state & ONWORKLIST) WORKLIST_REMOVE(&indirdep->ir_list); WORKITEM_FREE(indirdep, D_INDIRDEP); } /* * Called before a write to an indirdep. This routine is responsible for * rolling back pointers to a safe state which includes only those * allocindirs which have been completed. */ static void initiate_write_indirdep(indirdep, bp) struct indirdep *indirdep; struct buf *bp; { struct ufsmount *ump; indirdep->ir_state |= IOSTARTED; if (indirdep->ir_state & GOINGAWAY) panic("disk_io_initiation: indirdep gone"); /* * If there are no remaining dependencies, this will be writing * the real pointers. */ if (LIST_EMPTY(&indirdep->ir_deplisthd) && TAILQ_EMPTY(&indirdep->ir_trunc)) return; /* * Replace up-to-date version with safe version. */ if (indirdep->ir_saveddata == NULL) { ump = VFSTOUFS(indirdep->ir_list.wk_mp); LOCK_OWNED(ump); FREE_LOCK(ump); indirdep->ir_saveddata = malloc(bp->b_bcount, M_INDIRDEP, M_SOFTDEP_FLAGS); ACQUIRE_LOCK(ump); } indirdep->ir_state &= ~ATTACHED; indirdep->ir_state |= UNDONE; bcopy(bp->b_data, indirdep->ir_saveddata, bp->b_bcount); bcopy(indirdep->ir_savebp->b_data, bp->b_data, bp->b_bcount); } /* * Called when an inode has been cleared in a cg bitmap. This finally * eliminates any canceled jaddrefs */ void softdep_setup_inofree(mp, bp, ino, wkhd) struct mount *mp; struct buf *bp; ino_t ino; struct workhead *wkhd; { struct worklist *wk, *wkn; struct inodedep *inodedep; struct ufsmount *ump; uint8_t *inosused; struct cg *cgp; struct fs *fs; KASSERT(MOUNTEDSOFTDEP(mp) != 0, ("softdep_setup_inofree called on non-softdep filesystem")); ump = VFSTOUFS(mp); ACQUIRE_LOCK(ump); fs = ump->um_fs; cgp = (struct cg *)bp->b_data; inosused = cg_inosused(cgp); if (isset(inosused, ino % fs->fs_ipg)) panic("softdep_setup_inofree: inode %ju not freed.", (uintmax_t)ino); if (inodedep_lookup(mp, ino, 0, &inodedep)) panic("softdep_setup_inofree: ino %ju has existing inodedep %p", (uintmax_t)ino, inodedep); if (wkhd) { LIST_FOREACH_SAFE(wk, wkhd, wk_list, wkn) { if (wk->wk_type != D_JADDREF) continue; WORKLIST_REMOVE(wk); /* * We can free immediately even if the jaddref * isn't attached in a background write as now * the bitmaps are reconciled. */ wk->wk_state |= COMPLETE | ATTACHED; free_jaddref(WK_JADDREF(wk)); } jwork_move(&bp->b_dep, wkhd); } FREE_LOCK(ump); } /* * Called via ffs_blkfree() after a set of frags has been cleared from a cg * map. Any dependencies waiting for the write to clear are added to the * buf's list and any jnewblks that are being canceled are discarded * immediately. 
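 * Canceled jnewblks are verified (under SUJ_DEBUG) to already be free in
 * the bitmap and are then released immediately, because the new bitmap
 * is already correct.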
*/ void softdep_setup_blkfree(mp, bp, blkno, frags, wkhd) struct mount *mp; struct buf *bp; ufs2_daddr_t blkno; int frags; struct workhead *wkhd; { struct bmsafemap *bmsafemap; struct jnewblk *jnewblk; struct ufsmount *ump; struct worklist *wk; struct fs *fs; #ifdef SUJ_DEBUG uint8_t *blksfree; struct cg *cgp; ufs2_daddr_t jstart; ufs2_daddr_t jend; ufs2_daddr_t end; long bno; int i; #endif CTR3(KTR_SUJ, "softdep_setup_blkfree: blkno %jd frags %d wk head %p", blkno, frags, wkhd); ump = VFSTOUFS(mp); KASSERT(MOUNTEDSOFTDEP(UFSTOVFS(ump)) != 0, ("softdep_setup_blkfree called on non-softdep filesystem")); ACQUIRE_LOCK(ump); /* Lookup the bmsafemap so we track when it is dirty. */ fs = ump->um_fs; bmsafemap = bmsafemap_lookup(mp, bp, dtog(fs, blkno), NULL); /* * Detach any jnewblks which have been canceled. They must linger * until the bitmap is cleared again by ffs_blkfree() to prevent * an unjournaled allocation from hitting the disk. */ if (wkhd) { while ((wk = LIST_FIRST(wkhd)) != NULL) { CTR2(KTR_SUJ, "softdep_setup_blkfree: blkno %jd wk type %d", blkno, wk->wk_type); WORKLIST_REMOVE(wk); if (wk->wk_type != D_JNEWBLK) { WORKLIST_INSERT(&bmsafemap->sm_freehd, wk); continue; } jnewblk = WK_JNEWBLK(wk); KASSERT(jnewblk->jn_state & GOINGAWAY, ("softdep_setup_blkfree: jnewblk not canceled.")); #ifdef SUJ_DEBUG /* * Assert that this block is free in the bitmap * before we discard the jnewblk. */ cgp = (struct cg *)bp->b_data; blksfree = cg_blksfree(cgp); bno = dtogd(fs, jnewblk->jn_blkno); for (i = jnewblk->jn_oldfrags; i < jnewblk->jn_frags; i++) { if (isset(blksfree, bno + i)) continue; panic("softdep_setup_blkfree: not free"); } #endif /* * Even if it's not attached we can free immediately * as the new bitmap is correct. */ wk->wk_state |= COMPLETE | ATTACHED; free_jnewblk(jnewblk); } } #ifdef SUJ_DEBUG /* * Assert that we are not freeing a block which has an outstanding * allocation dependency. */ fs = VFSTOUFS(mp)->um_fs; bmsafemap = bmsafemap_lookup(mp, bp, dtog(fs, blkno), NULL); end = blkno + frags; LIST_FOREACH(jnewblk, &bmsafemap->sm_jnewblkhd, jn_deps) { /* * Don't match against blocks that will be freed when the * background write is done. */ if ((jnewblk->jn_state & (ATTACHED | COMPLETE | DEPCOMPLETE)) == (COMPLETE | DEPCOMPLETE)) continue; jstart = jnewblk->jn_blkno + jnewblk->jn_oldfrags; jend = jnewblk->jn_blkno + jnewblk->jn_frags; if ((blkno >= jstart && blkno < jend) || (end > jstart && end <= jend)) { printf("state 0x%X %jd - %d %d dep %p\n", jnewblk->jn_state, jnewblk->jn_blkno, jnewblk->jn_oldfrags, jnewblk->jn_frags, jnewblk->jn_dep); panic("softdep_setup_blkfree: " "%jd-%jd(%d) overlaps with %jd-%jd", blkno, end, frags, jstart, jend); } } #endif FREE_LOCK(ump); } /* * Revert a block allocation when the journal record that describes it * is not yet written. */ static int jnewblk_rollback(jnewblk, fs, cgp, blksfree) struct jnewblk *jnewblk; struct fs *fs; struct cg *cgp; uint8_t *blksfree; { ufs1_daddr_t fragno; long cgbno, bbase; int frags, blk; int i; frags = 0; cgbno = dtogd(fs, jnewblk->jn_blkno); /* * We have to test which frags need to be rolled back. We may * be operating on a stale copy when doing background writes. */ for (i = jnewblk->jn_oldfrags; i < jnewblk->jn_frags; i++) if (isclr(blksfree, cgbno + i)) frags++; if (frags == 0) return (0); /* * This is mostly ffs_blkfree() sans some validation and * superblock updates. 
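 * A full block (frags == fs_frag) is handed back with ffs_setblock() and
 * counted in cs_nbfree; otherwise the individual fragments are marked
 * free and the fragment and cluster summaries are readjusted.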
*/ if (frags == fs->fs_frag) { fragno = fragstoblks(fs, cgbno); ffs_setblock(fs, blksfree, fragno); ffs_clusteracct(fs, cgp, fragno, 1); cgp->cg_cs.cs_nbfree++; } else { cgbno += jnewblk->jn_oldfrags; bbase = cgbno - fragnum(fs, cgbno); /* Decrement the old frags. */ blk = blkmap(fs, blksfree, bbase); ffs_fragacct(fs, blk, cgp->cg_frsum, -1); /* Deallocate the fragment */ for (i = 0; i < frags; i++) setbit(blksfree, cgbno + i); cgp->cg_cs.cs_nffree += frags; /* Add back in counts associated with the new frags */ blk = blkmap(fs, blksfree, bbase); ffs_fragacct(fs, blk, cgp->cg_frsum, 1); /* If a complete block has been reassembled, account for it. */ fragno = fragstoblks(fs, bbase); if (ffs_isblock(fs, blksfree, fragno)) { cgp->cg_cs.cs_nffree -= fs->fs_frag; ffs_clusteracct(fs, cgp, fragno, 1); cgp->cg_cs.cs_nbfree++; } } stat_jnewblk++; jnewblk->jn_state &= ~ATTACHED; jnewblk->jn_state |= UNDONE; return (frags); } static void initiate_write_bmsafemap(bmsafemap, bp) struct bmsafemap *bmsafemap; struct buf *bp; /* The cg block. */ { struct jaddref *jaddref; struct jnewblk *jnewblk; uint8_t *inosused; uint8_t *blksfree; struct cg *cgp; struct fs *fs; ino_t ino; /* * If this is a background write, we did this at the time that * the copy was made, so do not need to do it again. */ if (bmsafemap->sm_state & IOSTARTED) return; bmsafemap->sm_state |= IOSTARTED; /* * Clear any inode allocations which are pending journal writes. */ if (LIST_FIRST(&bmsafemap->sm_jaddrefhd) != NULL) { cgp = (struct cg *)bp->b_data; fs = VFSTOUFS(bmsafemap->sm_list.wk_mp)->um_fs; inosused = cg_inosused(cgp); LIST_FOREACH(jaddref, &bmsafemap->sm_jaddrefhd, ja_bmdeps) { ino = jaddref->ja_ino % fs->fs_ipg; if (isset(inosused, ino)) { if ((jaddref->ja_mode & IFMT) == IFDIR) cgp->cg_cs.cs_ndir--; cgp->cg_cs.cs_nifree++; clrbit(inosused, ino); jaddref->ja_state &= ~ATTACHED; jaddref->ja_state |= UNDONE; stat_jaddref++; } else panic("initiate_write_bmsafemap: inode %ju " "marked free", (uintmax_t)jaddref->ja_ino); } } /* * Clear any block allocations which are pending journal writes. */ if (LIST_FIRST(&bmsafemap->sm_jnewblkhd) != NULL) { cgp = (struct cg *)bp->b_data; fs = VFSTOUFS(bmsafemap->sm_list.wk_mp)->um_fs; blksfree = cg_blksfree(cgp); LIST_FOREACH(jnewblk, &bmsafemap->sm_jnewblkhd, jn_deps) { if (jnewblk_rollback(jnewblk, fs, cgp, blksfree)) continue; panic("initiate_write_bmsafemap: block %jd " "marked free", jnewblk->jn_blkno); } } /* * Move allocation lists to the written lists so they can be * cleared once the block write is complete. */ LIST_SWAP(&bmsafemap->sm_inodedephd, &bmsafemap->sm_inodedepwr, inodedep, id_deps); LIST_SWAP(&bmsafemap->sm_newblkhd, &bmsafemap->sm_newblkwr, newblk, nb_deps); LIST_SWAP(&bmsafemap->sm_freehd, &bmsafemap->sm_freewr, worklist, wk_list); } /* * This routine is called during the completion interrupt * service routine for a disk write (from the procedure called * by the device driver to inform the filesystem caches of * a request completion). It should be called early in this * procedure, before the block is made available to other * processes or other routines are called. * */ static void softdep_disk_write_complete(bp) struct buf *bp; /* describes the completed disk write */ { struct worklist *wk; struct worklist *owk; struct ufsmount *ump; struct workhead reattach; struct freeblks *freeblks; struct buf *sbp; /* * If an error occurred while doing the write, then the data * has not hit the disk and the dependencies cannot be processed. 
* But we do have to go through and roll forward any dependencies * that were rolled back before the disk write. */ if ((bp->b_ioflags & BIO_ERROR) != 0 && (bp->b_flags & B_INVAL) == 0) { LIST_FOREACH(wk, &bp->b_dep, wk_list) { switch (wk->wk_type) { case D_PAGEDEP: handle_written_filepage(WK_PAGEDEP(wk), bp, 0); continue; case D_INODEDEP: handle_written_inodeblock(WK_INODEDEP(wk), bp, 0); continue; case D_BMSAFEMAP: handle_written_bmsafemap(WK_BMSAFEMAP(wk), bp, 0); continue; case D_INDIRDEP: handle_written_indirdep(WK_INDIRDEP(wk), bp, &sbp, 0); continue; default: /* nothing to roll forward */ continue; } } return; } if ((wk = LIST_FIRST(&bp->b_dep)) == NULL) return; ump = VFSTOUFS(wk->wk_mp); LIST_INIT(&reattach); /* * This lock must not be released anywhere in this code segment. */ sbp = NULL; owk = NULL; ACQUIRE_LOCK(ump); while ((wk = LIST_FIRST(&bp->b_dep)) != NULL) { WORKLIST_REMOVE(wk); atomic_add_long(&dep_write[wk->wk_type], 1); if (wk == owk) panic("duplicate worklist: %p\n", wk); owk = wk; switch (wk->wk_type) { case D_PAGEDEP: if (handle_written_filepage(WK_PAGEDEP(wk), bp, WRITESUCCEEDED)) WORKLIST_INSERT(&reattach, wk); continue; case D_INODEDEP: if (handle_written_inodeblock(WK_INODEDEP(wk), bp, WRITESUCCEEDED)) WORKLIST_INSERT(&reattach, wk); continue; case D_BMSAFEMAP: if (handle_written_bmsafemap(WK_BMSAFEMAP(wk), bp, WRITESUCCEEDED)) WORKLIST_INSERT(&reattach, wk); continue; case D_MKDIR: handle_written_mkdir(WK_MKDIR(wk), MKDIR_BODY); continue; case D_ALLOCDIRECT: wk->wk_state |= COMPLETE; handle_allocdirect_partdone(WK_ALLOCDIRECT(wk), NULL); continue; case D_ALLOCINDIR: wk->wk_state |= COMPLETE; handle_allocindir_partdone(WK_ALLOCINDIR(wk)); continue; case D_INDIRDEP: if (handle_written_indirdep(WK_INDIRDEP(wk), bp, &sbp, WRITESUCCEEDED)) WORKLIST_INSERT(&reattach, wk); continue; case D_FREEBLKS: wk->wk_state |= COMPLETE; freeblks = WK_FREEBLKS(wk); if ((wk->wk_state & ALLCOMPLETE) == ALLCOMPLETE && LIST_EMPTY(&freeblks->fb_jblkdephd)) add_to_worklist(wk, WK_NODELAY); continue; case D_FREEWORK: handle_written_freework(WK_FREEWORK(wk)); break; case D_JSEGDEP: free_jsegdep(WK_JSEGDEP(wk)); continue; case D_JSEG: handle_written_jseg(WK_JSEG(wk), bp); continue; case D_SBDEP: if (handle_written_sbdep(WK_SBDEP(wk), bp)) WORKLIST_INSERT(&reattach, wk); continue; case D_FREEDEP: free_freedep(WK_FREEDEP(wk)); continue; default: panic("handle_disk_write_complete: Unknown type %s", TYPENAME(wk->wk_type)); /* NOTREACHED */ } } /* * Reattach any requests that must be redone. */ while ((wk = LIST_FIRST(&reattach)) != NULL) { WORKLIST_REMOVE(wk); WORKLIST_INSERT(&bp->b_dep, wk); } FREE_LOCK(ump); if (sbp) brelse(sbp); } /* * Called from within softdep_disk_write_complete above. Note that * this routine is always called from interrupt level with further * splbio interrupts blocked. */ static void handle_allocdirect_partdone(adp, wkhd) struct allocdirect *adp; /* the completed allocdirect */ struct workhead *wkhd; /* Work to do when inode is writtne. */ { struct allocdirectlst *listhead; struct allocdirect *listadp; struct inodedep *inodedep; long bsize; if ((adp->ad_state & ALLCOMPLETE) != ALLCOMPLETE) return; /* * The on-disk inode cannot claim to be any larger than the last * fragment that has been written. Otherwise, the on-disk inode * might have fragments that were not the last block in the file * which would corrupt the filesystem. 
Thus, we cannot free any * allocdirects after one whose ad_oldblkno claims a fragment as * these blocks must be rolled back to zero before writing the inode. * We check the currently active set of allocdirects in id_inoupdt * or id_extupdt as appropriate. */ inodedep = adp->ad_inodedep; bsize = inodedep->id_fs->fs_bsize; if (adp->ad_state & EXTDATA) listhead = &inodedep->id_extupdt; else listhead = &inodedep->id_inoupdt; TAILQ_FOREACH(listadp, listhead, ad_next) { /* found our block */ if (listadp == adp) break; /* continue if ad_oldlbn is not a fragment */ if (listadp->ad_oldsize == 0 || listadp->ad_oldsize == bsize) continue; /* hit a fragment */ return; } /* * If we have reached the end of the current list without * finding the just finished dependency, then it must be * on the future dependency list. Future dependencies cannot * be freed until they are moved to the current list. */ if (listadp == NULL) { #ifdef DEBUG if (adp->ad_state & EXTDATA) listhead = &inodedep->id_newextupdt; else listhead = &inodedep->id_newinoupdt; TAILQ_FOREACH(listadp, listhead, ad_next) /* found our block */ if (listadp == adp) break; if (listadp == NULL) panic("handle_allocdirect_partdone: lost dep"); #endif /* DEBUG */ return; } /* * If we have found the just finished dependency, then queue * it along with anything that follows it that is complete. * Since the pointer has not yet been written in the inode * as the dependency prevents it, place the allocdirect on the * bufwait list where it will be freed once the pointer is * valid. */ if (wkhd == NULL) wkhd = &inodedep->id_bufwait; for (; adp; adp = listadp) { listadp = TAILQ_NEXT(adp, ad_next); if ((adp->ad_state & ALLCOMPLETE) != ALLCOMPLETE) return; TAILQ_REMOVE(listhead, adp, ad_next); WORKLIST_INSERT(wkhd, &adp->ad_block.nb_list); } } /* * Called from within softdep_disk_write_complete above. This routine * completes successfully written allocindirs. */ static void handle_allocindir_partdone(aip) struct allocindir *aip; /* the completed allocindir */ { struct indirdep *indirdep; if ((aip->ai_state & ALLCOMPLETE) != ALLCOMPLETE) return; indirdep = aip->ai_indirdep; LIST_REMOVE(aip, ai_next); /* * Don't set a pointer while the buffer is undergoing IO or while * we have active truncations. */ if (indirdep->ir_state & UNDONE || !TAILQ_EMPTY(&indirdep->ir_trunc)) { LIST_INSERT_HEAD(&indirdep->ir_donehd, aip, ai_next); return; } if (indirdep->ir_state & UFS1FMT) ((ufs1_daddr_t *)indirdep->ir_savebp->b_data)[aip->ai_offset] = aip->ai_newblkno; else ((ufs2_daddr_t *)indirdep->ir_savebp->b_data)[aip->ai_offset] = aip->ai_newblkno; /* * Await the pointer write before freeing the allocindir. */ LIST_INSERT_HEAD(&indirdep->ir_writehd, aip, ai_next); } /* * Release segments held on a jwork list. */ static void handle_jwork(wkhd) struct workhead *wkhd; { struct worklist *wk; while ((wk = LIST_FIRST(wkhd)) != NULL) { WORKLIST_REMOVE(wk); switch (wk->wk_type) { case D_JSEGDEP: free_jsegdep(WK_JSEGDEP(wk)); continue; case D_FREEDEP: free_freedep(WK_FREEDEP(wk)); continue; case D_FREEFRAG: rele_jseg(WK_JSEG(WK_FREEFRAG(wk)->ff_jdep)); WORKITEM_FREE(wk, D_FREEFRAG); continue; case D_FREEWORK: handle_written_freework(WK_FREEWORK(wk)); continue; default: panic("handle_jwork: Unknown type %s\n", TYPENAME(wk->wk_type)); } } } /* * Handle the bufwait list on an inode when it is safe to release items * held there. 
This normally happens after an inode block is written but * may be delayed and handled later if there are pending journal items that * are not yet safe to be released. */ static struct freefile * handle_bufwait(inodedep, refhd) struct inodedep *inodedep; struct workhead *refhd; { struct jaddref *jaddref; struct freefile *freefile; struct worklist *wk; freefile = NULL; while ((wk = LIST_FIRST(&inodedep->id_bufwait)) != NULL) { WORKLIST_REMOVE(wk); switch (wk->wk_type) { case D_FREEFILE: /* * We defer adding freefile to the worklist * until all other additions have been made to * ensure that it will be done after all the * old blocks have been freed. */ if (freefile != NULL) panic("handle_bufwait: freefile"); freefile = WK_FREEFILE(wk); continue; case D_MKDIR: handle_written_mkdir(WK_MKDIR(wk), MKDIR_PARENT); continue; case D_DIRADD: diradd_inode_written(WK_DIRADD(wk), inodedep); continue; case D_FREEFRAG: wk->wk_state |= COMPLETE; if ((wk->wk_state & ALLCOMPLETE) == ALLCOMPLETE) add_to_worklist(wk, 0); continue; case D_DIRREM: wk->wk_state |= COMPLETE; add_to_worklist(wk, 0); continue; case D_ALLOCDIRECT: case D_ALLOCINDIR: free_newblk(WK_NEWBLK(wk)); continue; case D_JNEWBLK: wk->wk_state |= COMPLETE; free_jnewblk(WK_JNEWBLK(wk)); continue; /* * Save freed journal segments and add references on * the supplied list which will delay their release * until the cg bitmap is cleared on disk. */ case D_JSEGDEP: if (refhd == NULL) free_jsegdep(WK_JSEGDEP(wk)); else WORKLIST_INSERT(refhd, wk); continue; case D_JADDREF: jaddref = WK_JADDREF(wk); TAILQ_REMOVE(&inodedep->id_inoreflst, &jaddref->ja_ref, if_deps); /* * Transfer any jaddrefs to the list to be freed with * the bitmap if we're handling a removed file. */ if (refhd == NULL) { wk->wk_state |= COMPLETE; free_jaddref(jaddref); } else WORKLIST_INSERT(refhd, wk); continue; default: panic("handle_bufwait: Unknown type %p(%s)", wk, TYPENAME(wk->wk_type)); /* NOTREACHED */ } } return (freefile); } /* * Called from within softdep_disk_write_complete above to restore * in-memory inode block contents to their most up-to-date state. Note * that this routine is always called from interrupt level with further * interrupts from this device blocked. * * If the write did not succeed, we will do all the roll-forward * operations, but we will not take the actions that will allow its * dependencies to be processed. */ static int handle_written_inodeblock(inodedep, bp, flags) struct inodedep *inodedep; struct buf *bp; /* buffer containing the inode block */ int flags; { struct freefile *freefile; struct allocdirect *adp, *nextadp; struct ufs1_dinode *dp1 = NULL; struct ufs2_dinode *dp2 = NULL; struct workhead wkhd; int hadchanges, fstype; ino_t freelink; LIST_INIT(&wkhd); hadchanges = 0; freefile = NULL; if ((inodedep->id_state & IOSTARTED) == 0) panic("handle_written_inodeblock: not started"); inodedep->id_state &= ~IOSTARTED; if (inodedep->id_fs->fs_magic == FS_UFS1_MAGIC) { fstype = UFS1; dp1 = (struct ufs1_dinode *)bp->b_data + ino_to_fsbo(inodedep->id_fs, inodedep->id_ino); freelink = dp1->di_freelink; } else { fstype = UFS2; dp2 = (struct ufs2_dinode *)bp->b_data + ino_to_fsbo(inodedep->id_fs, inodedep->id_ino); freelink = dp2->di_freelink; } /* * Leave this inodeblock dirty until it's in the list. 
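 * If the freelink that just reached the disk matches the next inodedep
 * on the in-core unlinked list, UNLINKNEXT is set below to record that
 * this entry's on-disk next pointer is now valid.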
*/ if ((inodedep->id_state & (UNLINKED | UNLINKONLIST)) == UNLINKED && (flags & WRITESUCCEEDED)) { struct inodedep *inon; inon = TAILQ_NEXT(inodedep, id_unlinked); if ((inon == NULL && freelink == 0) || (inon && inon->id_ino == freelink)) { if (inon) inon->id_state |= UNLINKPREV; inodedep->id_state |= UNLINKNEXT; } hadchanges = 1; } /* * If we had to rollback the inode allocation because of * bitmaps being incomplete, then simply restore it. * Keep the block dirty so that it will not be reclaimed until * all associated dependencies have been cleared and the * corresponding updates written to disk. */ if (inodedep->id_savedino1 != NULL) { hadchanges = 1; if (fstype == UFS1) *dp1 = *inodedep->id_savedino1; else *dp2 = *inodedep->id_savedino2; free(inodedep->id_savedino1, M_SAVEDINO); inodedep->id_savedino1 = NULL; if ((bp->b_flags & B_DELWRI) == 0) stat_inode_bitmap++; bdirty(bp); /* * If the inode is clear here and GOINGAWAY it will never * be written. Process the bufwait and clear any pending * work which may include the freefile. */ if (inodedep->id_state & GOINGAWAY) goto bufwait; return (1); } if (flags & WRITESUCCEEDED) inodedep->id_state |= COMPLETE; /* * Roll forward anything that had to be rolled back before * the inode could be updated. */ for (adp = TAILQ_FIRST(&inodedep->id_inoupdt); adp; adp = nextadp) { nextadp = TAILQ_NEXT(adp, ad_next); if (adp->ad_state & ATTACHED) panic("handle_written_inodeblock: new entry"); if (fstype == UFS1) { if (adp->ad_offset < UFS_NDADDR) { if (dp1->di_db[adp->ad_offset]!=adp->ad_oldblkno) panic("%s %s #%jd mismatch %d != %jd", "handle_written_inodeblock:", "direct pointer", (intmax_t)adp->ad_offset, dp1->di_db[adp->ad_offset], (intmax_t)adp->ad_oldblkno); dp1->di_db[adp->ad_offset] = adp->ad_newblkno; } else { if (dp1->di_ib[adp->ad_offset - UFS_NDADDR] != 0) panic("%s: %s #%jd allocated as %d", "handle_written_inodeblock", "indirect pointer", (intmax_t)adp->ad_offset - UFS_NDADDR, dp1->di_ib[adp->ad_offset - UFS_NDADDR]); dp1->di_ib[adp->ad_offset - UFS_NDADDR] = adp->ad_newblkno; } } else { if (adp->ad_offset < UFS_NDADDR) { if (dp2->di_db[adp->ad_offset]!=adp->ad_oldblkno) panic("%s: %s #%jd %s %jd != %jd", "handle_written_inodeblock", "direct pointer", (intmax_t)adp->ad_offset, "mismatch", (intmax_t)dp2->di_db[adp->ad_offset], (intmax_t)adp->ad_oldblkno); dp2->di_db[adp->ad_offset] = adp->ad_newblkno; } else { if (dp2->di_ib[adp->ad_offset - UFS_NDADDR] != 0) panic("%s: %s #%jd allocated as %jd", "handle_written_inodeblock", "indirect pointer", (intmax_t)adp->ad_offset - UFS_NDADDR, (intmax_t) dp2->di_ib[adp->ad_offset - UFS_NDADDR]); dp2->di_ib[adp->ad_offset - UFS_NDADDR] = adp->ad_newblkno; } } adp->ad_state &= ~UNDONE; adp->ad_state |= ATTACHED; hadchanges = 1; } for (adp = TAILQ_FIRST(&inodedep->id_extupdt); adp; adp = nextadp) { nextadp = TAILQ_NEXT(adp, ad_next); if (adp->ad_state & ATTACHED) panic("handle_written_inodeblock: new entry"); if (dp2->di_extb[adp->ad_offset] != adp->ad_oldblkno) panic("%s: direct pointers #%jd %s %jd != %jd", "handle_written_inodeblock", (intmax_t)adp->ad_offset, "mismatch", (intmax_t)dp2->di_extb[adp->ad_offset], (intmax_t)adp->ad_oldblkno); dp2->di_extb[adp->ad_offset] = adp->ad_newblkno; adp->ad_state &= ~UNDONE; adp->ad_state |= ATTACHED; hadchanges = 1; } if (hadchanges && (bp->b_flags & B_DELWRI) == 0) stat_direct_blk_ptrs++; /* * Reset the file size to its most up-to-date value. 
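 * Any saved size, ext size, or link count that differs from what was
 * just written is copied back into the dinode and hadchanges is set so
 * the buffer gets redirtied.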
*/ if (inodedep->id_savedsize == -1 || inodedep->id_savedextsize == -1) panic("handle_written_inodeblock: bad size"); if (inodedep->id_savednlink > LINK_MAX) panic("handle_written_inodeblock: Invalid link count " "%jd for inodedep %p", (uintmax_t)inodedep->id_savednlink, inodedep); if (fstype == UFS1) { if (dp1->di_nlink != inodedep->id_savednlink) { dp1->di_nlink = inodedep->id_savednlink; hadchanges = 1; } if (dp1->di_size != inodedep->id_savedsize) { dp1->di_size = inodedep->id_savedsize; hadchanges = 1; } } else { if (dp2->di_nlink != inodedep->id_savednlink) { dp2->di_nlink = inodedep->id_savednlink; hadchanges = 1; } if (dp2->di_size != inodedep->id_savedsize) { dp2->di_size = inodedep->id_savedsize; hadchanges = 1; } if (dp2->di_extsize != inodedep->id_savedextsize) { dp2->di_extsize = inodedep->id_savedextsize; hadchanges = 1; } } inodedep->id_savedsize = -1; inodedep->id_savedextsize = -1; inodedep->id_savednlink = -1; /* * If there were any rollbacks in the inode block, then it must be * marked dirty so that its will eventually get written back in * its correct form. */ if (hadchanges) bdirty(bp); bufwait: /* * If the write did not succeed, we have done all the roll-forward * operations, but we cannot take the actions that will allow its * dependencies to be processed. */ if ((flags & WRITESUCCEEDED) == 0) return (hadchanges); /* * Process any allocdirects that completed during the update. */ if ((adp = TAILQ_FIRST(&inodedep->id_inoupdt)) != NULL) handle_allocdirect_partdone(adp, &wkhd); if ((adp = TAILQ_FIRST(&inodedep->id_extupdt)) != NULL) handle_allocdirect_partdone(adp, &wkhd); /* * Process deallocations that were held pending until the * inode had been written to disk. Freeing of the inode * is delayed until after all blocks have been freed to * avoid creation of new triples * before the old ones have been deleted. Completely * unlinked inodes are not processed until the unlinked * inode list is written or the last reference is removed. */ if ((inodedep->id_state & (UNLINKED | UNLINKONLIST)) != UNLINKED) { freefile = handle_bufwait(inodedep, NULL); if (freefile && !LIST_EMPTY(&wkhd)) { WORKLIST_INSERT(&wkhd, &freefile->fx_list); freefile = NULL; } } /* * Move rolled forward dependency completions to the bufwait list * now that those that were already written have been processed. */ if (!LIST_EMPTY(&wkhd) && hadchanges == 0) panic("handle_written_inodeblock: bufwait but no changes"); jwork_move(&inodedep->id_bufwait, &wkhd); if (freefile != NULL) { /* * If the inode is goingaway it was never written. Fake up * the state here so free_inodedep() can succeed. */ if (inodedep->id_state & GOINGAWAY) inodedep->id_state |= COMPLETE | DEPCOMPLETE; if (free_inodedep(inodedep) == 0) panic("handle_written_inodeblock: live inodedep %p", inodedep); add_to_worklist(&freefile->fx_list, 0); return (0); } /* * If no outstanding dependencies, free it. */ if (free_inodedep(inodedep) || (TAILQ_FIRST(&inodedep->id_inoreflst) == 0 && TAILQ_FIRST(&inodedep->id_inoupdt) == 0 && TAILQ_FIRST(&inodedep->id_extupdt) == 0 && LIST_FIRST(&inodedep->id_bufwait) == 0)) return (0); return (hadchanges); } /* * Perform needed roll-forwards and kick off any dependencies that * can now be processed. * * If the write did not succeed, we will do all the roll-forward * operations, but we will not take the actions that will allow its * dependencies to be processed. 
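 * A nonzero return value indicates that rollbacks remain and the buffer
 * has been redirtied; zero means the indirdep no longer needs the buffer.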
*/ static int handle_written_indirdep(indirdep, bp, bpp, flags) struct indirdep *indirdep; struct buf *bp; struct buf **bpp; int flags; { struct allocindir *aip; struct buf *sbp; int chgs; if (indirdep->ir_state & GOINGAWAY) panic("handle_written_indirdep: indirdep gone"); if ((indirdep->ir_state & IOSTARTED) == 0) panic("handle_written_indirdep: IO not started"); chgs = 0; /* * If there were rollbacks revert them here. */ if (indirdep->ir_saveddata) { bcopy(indirdep->ir_saveddata, bp->b_data, bp->b_bcount); if (TAILQ_EMPTY(&indirdep->ir_trunc)) { free(indirdep->ir_saveddata, M_INDIRDEP); indirdep->ir_saveddata = NULL; } chgs = 1; } indirdep->ir_state &= ~(UNDONE | IOSTARTED); indirdep->ir_state |= ATTACHED; /* * If the write did not succeed, we have done all the roll-forward * operations, but we cannot take the actions that will allow its * dependencies to be processed. */ if ((flags & WRITESUCCEEDED) == 0) { stat_indir_blk_ptrs++; bdirty(bp); return (1); } /* * Move allocindirs with written pointers to the completehd if * the indirdep's pointer is not yet written. Otherwise * free them here. */ while ((aip = LIST_FIRST(&indirdep->ir_writehd)) != NULL) { LIST_REMOVE(aip, ai_next); if ((indirdep->ir_state & DEPCOMPLETE) == 0) { LIST_INSERT_HEAD(&indirdep->ir_completehd, aip, ai_next); newblk_freefrag(&aip->ai_block); continue; } free_newblk(&aip->ai_block); } /* * Move allocindirs that have finished dependency processing from * the done list to the write list after updating the pointers. */ if (TAILQ_EMPTY(&indirdep->ir_trunc)) { while ((aip = LIST_FIRST(&indirdep->ir_donehd)) != NULL) { handle_allocindir_partdone(aip); if (aip == LIST_FIRST(&indirdep->ir_donehd)) panic("disk_write_complete: not gone"); chgs = 1; } } /* * Preserve the indirdep if there were any changes or if it is not * yet valid on disk. */ if (chgs) { stat_indir_blk_ptrs++; bdirty(bp); return (1); } /* * If there were no changes we can discard the savedbp and detach * ourselves from the buf. We are only carrying completed pointers * in this case. */ sbp = indirdep->ir_savebp; sbp->b_flags |= B_INVAL | B_NOCACHE; indirdep->ir_savebp = NULL; indirdep->ir_bp = NULL; if (*bpp != NULL) panic("handle_written_indirdep: bp already exists."); *bpp = sbp; /* * The indirdep may not be freed until its parent points at it. */ if (indirdep->ir_state & DEPCOMPLETE) free_indirdep(indirdep); return (0); } /* * Process a diradd entry after its dependent inode has been written. * This routine must be called with splbio interrupts blocked. */ static void diradd_inode_written(dap, inodedep) struct diradd *dap; struct inodedep *inodedep; { dap->da_state |= COMPLETE; complete_diradd(dap); WORKLIST_INSERT(&inodedep->id_pendinghd, &dap->da_list); } /* * Returns true if the bmsafemap will have rollbacks when written. Must only * be called with the per-filesystem lock and the buf lock on the cg held. */ static int bmsafemap_backgroundwrite(bmsafemap, bp) struct bmsafemap *bmsafemap; struct buf *bp; { int dirty; LOCK_OWNED(VFSTOUFS(bmsafemap->sm_list.wk_mp)); dirty = !LIST_EMPTY(&bmsafemap->sm_jaddrefhd) | !LIST_EMPTY(&bmsafemap->sm_jnewblkhd); /* * If we're initiating a background write we need to process the * rollbacks as they exist now, not as they exist when IO starts. * No other consumers will look at the contents of the shadowed * buf so this is safe to do here. */ if (bp->b_xflags & BX_BKGRDMARKER) initiate_write_bmsafemap(bmsafemap, bp); return (dirty); } /* * Re-apply an allocation when a cg write is complete. 
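 * This is the inverse of jnewblk_rollback(): the fragments described by
 * the jnewblk are marked allocated again in blksfree and the cg free
 * counts are adjusted to match.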
*/ static int jnewblk_rollforward(jnewblk, fs, cgp, blksfree) struct jnewblk *jnewblk; struct fs *fs; struct cg *cgp; uint8_t *blksfree; { ufs1_daddr_t fragno; ufs2_daddr_t blkno; long cgbno, bbase; int frags, blk; int i; frags = 0; cgbno = dtogd(fs, jnewblk->jn_blkno); for (i = jnewblk->jn_oldfrags; i < jnewblk->jn_frags; i++) { if (isclr(blksfree, cgbno + i)) panic("jnewblk_rollforward: re-allocated fragment"); frags++; } if (frags == fs->fs_frag) { blkno = fragstoblks(fs, cgbno); ffs_clrblock(fs, blksfree, (long)blkno); ffs_clusteracct(fs, cgp, blkno, -1); cgp->cg_cs.cs_nbfree--; } else { bbase = cgbno - fragnum(fs, cgbno); cgbno += jnewblk->jn_oldfrags; /* If a complete block had been reassembled, account for it. */ fragno = fragstoblks(fs, bbase); if (ffs_isblock(fs, blksfree, fragno)) { cgp->cg_cs.cs_nffree += fs->fs_frag; ffs_clusteracct(fs, cgp, fragno, -1); cgp->cg_cs.cs_nbfree--; } /* Decrement the old frags. */ blk = blkmap(fs, blksfree, bbase); ffs_fragacct(fs, blk, cgp->cg_frsum, -1); /* Allocate the fragment */ for (i = 0; i < frags; i++) clrbit(blksfree, cgbno + i); cgp->cg_cs.cs_nffree -= frags; /* Add back in counts associated with the new frags */ blk = blkmap(fs, blksfree, bbase); ffs_fragacct(fs, blk, cgp->cg_frsum, 1); } return (frags); } /* * Complete a write to a bmsafemap structure. Roll forward any bitmap * changes if it's not a background write. Set all written dependencies * to DEPCOMPLETE and free the structure if possible. * * If the write did not succeed, we will do all the roll-forward * operations, but we will not take the actions that will allow its * dependencies to be processed. */ static int handle_written_bmsafemap(bmsafemap, bp, flags) struct bmsafemap *bmsafemap; struct buf *bp; int flags; { struct newblk *newblk; struct inodedep *inodedep; struct jaddref *jaddref, *jatmp; struct jnewblk *jnewblk, *jntmp; struct ufsmount *ump; uint8_t *inosused; uint8_t *blksfree; struct cg *cgp; struct fs *fs; ino_t ino; int foreground; int chgs; if ((bmsafemap->sm_state & IOSTARTED) == 0) panic("handle_written_bmsafemap: Not started\n"); ump = VFSTOUFS(bmsafemap->sm_list.wk_mp); chgs = 0; bmsafemap->sm_state &= ~IOSTARTED; foreground = (bp->b_xflags & BX_BKGRDMARKER) == 0; /* * If write was successful, release journal work that was waiting * on the write. Otherwise move the work back. */ if (flags & WRITESUCCEEDED) handle_jwork(&bmsafemap->sm_freewr); else LIST_CONCAT(&bmsafemap->sm_freehd, &bmsafemap->sm_freewr, worklist, wk_list); /* * Restore unwritten inode allocation pending jaddref writes. */ if (!LIST_EMPTY(&bmsafemap->sm_jaddrefhd)) { cgp = (struct cg *)bp->b_data; fs = VFSTOUFS(bmsafemap->sm_list.wk_mp)->um_fs; inosused = cg_inosused(cgp); LIST_FOREACH_SAFE(jaddref, &bmsafemap->sm_jaddrefhd, ja_bmdeps, jatmp) { if ((jaddref->ja_state & UNDONE) == 0) continue; ino = jaddref->ja_ino % fs->fs_ipg; if (isset(inosused, ino)) panic("handle_written_bmsafemap: " "re-allocated inode"); /* Do the roll-forward only if it's a real copy. */ if (foreground) { if ((jaddref->ja_mode & IFMT) == IFDIR) cgp->cg_cs.cs_ndir++; cgp->cg_cs.cs_nifree--; setbit(inosused, ino); chgs = 1; } jaddref->ja_state &= ~UNDONE; jaddref->ja_state |= ATTACHED; free_jaddref(jaddref); } } /* * Restore any block allocations which are pending journal writes. 
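 * Only entries still marked UNDONE are rolled forward, and the bitmap is
 * only modified when this is the foreground copy of the buffer.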
*/ if (LIST_FIRST(&bmsafemap->sm_jnewblkhd) != NULL) { cgp = (struct cg *)bp->b_data; fs = VFSTOUFS(bmsafemap->sm_list.wk_mp)->um_fs; blksfree = cg_blksfree(cgp); LIST_FOREACH_SAFE(jnewblk, &bmsafemap->sm_jnewblkhd, jn_deps, jntmp) { if ((jnewblk->jn_state & UNDONE) == 0) continue; /* Do the roll-forward only if it's a real copy. */ if (foreground && jnewblk_rollforward(jnewblk, fs, cgp, blksfree)) chgs = 1; jnewblk->jn_state &= ~(UNDONE | NEWBLOCK); jnewblk->jn_state |= ATTACHED; free_jnewblk(jnewblk); } } /* * If the write did not succeed, we have done all the roll-forward * operations, but we cannot take the actions that will allow its * dependencies to be processed. */ if ((flags & WRITESUCCEEDED) == 0) { LIST_CONCAT(&bmsafemap->sm_newblkhd, &bmsafemap->sm_newblkwr, newblk, nb_deps); LIST_CONCAT(&bmsafemap->sm_freehd, &bmsafemap->sm_freewr, worklist, wk_list); if (foreground) bdirty(bp); return (1); } while ((newblk = LIST_FIRST(&bmsafemap->sm_newblkwr))) { newblk->nb_state |= DEPCOMPLETE; newblk->nb_state &= ~ONDEPLIST; newblk->nb_bmsafemap = NULL; LIST_REMOVE(newblk, nb_deps); if (newblk->nb_list.wk_type == D_ALLOCDIRECT) handle_allocdirect_partdone( WK_ALLOCDIRECT(&newblk->nb_list), NULL); else if (newblk->nb_list.wk_type == D_ALLOCINDIR) handle_allocindir_partdone( WK_ALLOCINDIR(&newblk->nb_list)); else if (newblk->nb_list.wk_type != D_NEWBLK) panic("handle_written_bmsafemap: Unexpected type: %s", TYPENAME(newblk->nb_list.wk_type)); } while ((inodedep = LIST_FIRST(&bmsafemap->sm_inodedepwr)) != NULL) { inodedep->id_state |= DEPCOMPLETE; inodedep->id_state &= ~ONDEPLIST; LIST_REMOVE(inodedep, id_deps); inodedep->id_bmsafemap = NULL; } LIST_REMOVE(bmsafemap, sm_next); if (chgs == 0 && LIST_EMPTY(&bmsafemap->sm_jaddrefhd) && LIST_EMPTY(&bmsafemap->sm_jnewblkhd) && LIST_EMPTY(&bmsafemap->sm_newblkhd) && LIST_EMPTY(&bmsafemap->sm_inodedephd) && LIST_EMPTY(&bmsafemap->sm_freehd)) { LIST_REMOVE(bmsafemap, sm_hash); WORKITEM_FREE(bmsafemap, D_BMSAFEMAP); return (0); } LIST_INSERT_HEAD(&ump->softdep_dirtycg, bmsafemap, sm_next); if (foreground) bdirty(bp); return (1); } /* * Try to free a mkdir dependency. */ static void complete_mkdir(mkdir) struct mkdir *mkdir; { struct diradd *dap; if ((mkdir->md_state & ALLCOMPLETE) != ALLCOMPLETE) return; LIST_REMOVE(mkdir, md_mkdirs); dap = mkdir->md_diradd; dap->da_state &= ~(mkdir->md_state & (MKDIR_PARENT | MKDIR_BODY)); if ((dap->da_state & (MKDIR_PARENT | MKDIR_BODY)) == 0) { dap->da_state |= DEPCOMPLETE; complete_diradd(dap); } WORKITEM_FREE(mkdir, D_MKDIR); } /* * Handle the completion of a mkdir dependency. */ static void handle_written_mkdir(mkdir, type) struct mkdir *mkdir; int type; { if ((mkdir->md_state & (MKDIR_PARENT | MKDIR_BODY)) != type) panic("handle_written_mkdir: bad type"); mkdir->md_state |= COMPLETE; complete_mkdir(mkdir); } static int free_pagedep(pagedep) struct pagedep *pagedep; { int i; if (pagedep->pd_state & NEWBLOCK) return (0); if (!LIST_EMPTY(&pagedep->pd_dirremhd)) return (0); for (i = 0; i < DAHASHSZ; i++) if (!LIST_EMPTY(&pagedep->pd_diraddhd[i])) return (0); if (!LIST_EMPTY(&pagedep->pd_pendinghd)) return (0); if (!LIST_EMPTY(&pagedep->pd_jmvrefhd)) return (0); if (pagedep->pd_state & ONWORKLIST) WORKLIST_REMOVE(&pagedep->pd_list); LIST_REMOVE(pagedep, pd_hash); WORKITEM_FREE(pagedep, D_PAGEDEP); return (1); } /* * Called from within softdep_disk_write_complete above. * A write operation was just completed. Removed inodes can * now be freed and associated block pointers may be committed. 
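 * A nonzero return value asks the caller to reattach the work item and
 * keep the page buffer dirty because rollbacks are still outstanding.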
* Note that this routine is always called from interrupt level * with further interrupts from this device blocked. * * If the write did not succeed, we will do all the roll-forward * operations, but we will not take the actions that will allow its * dependencies to be processed. */ static int handle_written_filepage(pagedep, bp, flags) struct pagedep *pagedep; struct buf *bp; /* buffer containing the written page */ int flags; { struct dirrem *dirrem; struct diradd *dap, *nextdap; struct direct *ep; int i, chgs; if ((pagedep->pd_state & IOSTARTED) == 0) panic("handle_written_filepage: not started"); pagedep->pd_state &= ~IOSTARTED; if ((flags & WRITESUCCEEDED) == 0) goto rollforward; /* * Process any directory removals that have been committed. */ while ((dirrem = LIST_FIRST(&pagedep->pd_dirremhd)) != NULL) { LIST_REMOVE(dirrem, dm_next); dirrem->dm_state |= COMPLETE; dirrem->dm_dirinum = pagedep->pd_ino; KASSERT(LIST_EMPTY(&dirrem->dm_jremrefhd), ("handle_written_filepage: Journal entries not written.")); add_to_worklist(&dirrem->dm_list, 0); } /* * Free any directory additions that have been committed. * If it is a newly allocated block, we have to wait until * the on-disk directory inode claims the new block. */ if ((pagedep->pd_state & NEWBLOCK) == 0) while ((dap = LIST_FIRST(&pagedep->pd_pendinghd)) != NULL) free_diradd(dap, NULL); rollforward: /* * Uncommitted directory entries must be restored. */ for (chgs = 0, i = 0; i < DAHASHSZ; i++) { for (dap = LIST_FIRST(&pagedep->pd_diraddhd[i]); dap; dap = nextdap) { nextdap = LIST_NEXT(dap, da_pdlist); if (dap->da_state & ATTACHED) panic("handle_written_filepage: attached"); ep = (struct direct *) ((char *)bp->b_data + dap->da_offset); ep->d_ino = dap->da_newinum; dap->da_state &= ~UNDONE; dap->da_state |= ATTACHED; chgs = 1; /* * If the inode referenced by the directory has * been written out, then the dependency can be * moved to the pending list. */ if ((dap->da_state & ALLCOMPLETE) == ALLCOMPLETE) { LIST_REMOVE(dap, da_pdlist); LIST_INSERT_HEAD(&pagedep->pd_pendinghd, dap, da_pdlist); } } } /* * If there were any rollbacks in the directory, then it must be * marked dirty so that its will eventually get written back in * its correct form. */ if (chgs || (flags & WRITESUCCEEDED) == 0) { if ((bp->b_flags & B_DELWRI) == 0) stat_dir_entry++; bdirty(bp); return (1); } /* * If we are not waiting for a new directory block to be * claimed by its inode, then the pagedep will be freed. * Otherwise it will remain to track any new entries on * the page in case they are fsync'ed. */ free_pagedep(pagedep); return (0); } /* * Writing back in-core inode structures. * * The filesystem only accesses an inode's contents when it occupies an * "in-core" inode structure. These "in-core" structures are separate from * the page frames used to cache inode blocks. Only the latter are * transferred to/from the disk. So, when the updated contents of the * "in-core" inode structure are copied to the corresponding in-memory inode * block, the dependencies are also transferred. The following procedure is * called when copying a dirty "in-core" inode to a cached inode block. */ /* * Called when an inode is loaded from disk. If the effective link count * differed from the actual link count when it was last flushed, then we * need to ensure that the correct effective link count is put back. 
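 * In effect, i_effnlink is set to i_nlink minus id_nlinkdelta whenever
 * an inodedep is found for the inode; otherwise the two counts are equal.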
*/ void softdep_load_inodeblock(ip) struct inode *ip; /* the "in_core" copy of the inode */ { struct inodedep *inodedep; struct ufsmount *ump; ump = ITOUMP(ip); KASSERT(MOUNTEDSOFTDEP(UFSTOVFS(ump)) != 0, ("softdep_load_inodeblock called on non-softdep filesystem")); /* * Check for alternate nlink count. */ ip->i_effnlink = ip->i_nlink; ACQUIRE_LOCK(ump); if (inodedep_lookup(UFSTOVFS(ump), ip->i_number, 0, &inodedep) == 0) { FREE_LOCK(ump); return; } ip->i_effnlink -= inodedep->id_nlinkdelta; FREE_LOCK(ump); } /* * This routine is called just before the "in-core" inode * information is to be copied to the in-memory inode block. * Recall that an inode block contains several inodes. If * the force flag is set, then the dependencies will be * cleared so that the update can always be made. Note that * the buffer is locked when this routine is called, so we * will never be in the middle of writing the inode block * to disk. */ void softdep_update_inodeblock(ip, bp, waitfor) struct inode *ip; /* the "in_core" copy of the inode */ struct buf *bp; /* the buffer containing the inode block */ int waitfor; /* nonzero => update must be allowed */ { struct inodedep *inodedep; struct inoref *inoref; struct ufsmount *ump; struct worklist *wk; struct mount *mp; struct buf *ibp; struct fs *fs; int error; ump = ITOUMP(ip); mp = UFSTOVFS(ump); KASSERT(MOUNTEDSOFTDEP(mp) != 0, ("softdep_update_inodeblock called on non-softdep filesystem")); fs = ump->um_fs; /* * Preserve the freelink that is on disk. clear_unlinked_inodedep() * does not have access to the in-core ip so must write directly into * the inode block buffer when setting freelink. */ if (fs->fs_magic == FS_UFS1_MAGIC) DIP_SET(ip, i_freelink, ((struct ufs1_dinode *)bp->b_data + ino_to_fsbo(fs, ip->i_number))->di_freelink); else DIP_SET(ip, i_freelink, ((struct ufs2_dinode *)bp->b_data + ino_to_fsbo(fs, ip->i_number))->di_freelink); /* * If the effective link count is not equal to the actual link * count, then we must track the difference in an inodedep while * the inode is (potentially) tossed out of the cache. Otherwise, * if there is no existing inodedep, then there are no dependencies * to track. */ ACQUIRE_LOCK(ump); again: if (inodedep_lookup(mp, ip->i_number, 0, &inodedep) == 0) { FREE_LOCK(ump); if (ip->i_effnlink != ip->i_nlink) panic("softdep_update_inodeblock: bad link count"); return; } if (inodedep->id_nlinkdelta != ip->i_nlink - ip->i_effnlink) panic("softdep_update_inodeblock: bad delta"); /* * If we're flushing all dependencies we must also move any waiting * for journal writes onto the bufwait list prior to I/O. */ if (waitfor) { TAILQ_FOREACH(inoref, &inodedep->id_inoreflst, if_deps) { if ((inoref->if_state & (DEPCOMPLETE | GOINGAWAY)) == DEPCOMPLETE) { jwait(&inoref->if_list, MNT_WAIT); goto again; } } } /* * Changes have been initiated. Anything depending on these * changes cannot occur until this inode has been written. */ inodedep->id_state &= ~COMPLETE; if ((inodedep->id_state & ONWORKLIST) == 0) WORKLIST_INSERT(&bp->b_dep, &inodedep->id_list); /* * Any new dependencies associated with the incore inode must * now be moved to the list associated with the buffer holding * the in-memory copy of the inode. Once merged process any * allocdirects that are completed by the merger. 
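 * merge_inode_lists() keeps both lists ordered by logical block number
 * and lets allocdirect_merge() coalesce entries that refer to the same
 * offset.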
*/ merge_inode_lists(&inodedep->id_newinoupdt, &inodedep->id_inoupdt); if (!TAILQ_EMPTY(&inodedep->id_inoupdt)) handle_allocdirect_partdone(TAILQ_FIRST(&inodedep->id_inoupdt), NULL); merge_inode_lists(&inodedep->id_newextupdt, &inodedep->id_extupdt); if (!TAILQ_EMPTY(&inodedep->id_extupdt)) handle_allocdirect_partdone(TAILQ_FIRST(&inodedep->id_extupdt), NULL); /* * Now that the inode has been pushed into the buffer, the * operations dependent on the inode being written to disk * can be moved to the id_bufwait so that they will be * processed when the buffer I/O completes. */ while ((wk = LIST_FIRST(&inodedep->id_inowait)) != NULL) { WORKLIST_REMOVE(wk); WORKLIST_INSERT(&inodedep->id_bufwait, wk); } /* * Newly allocated inodes cannot be written until the bitmap * that allocates them have been written (indicated by * DEPCOMPLETE being set in id_state). If we are doing a * forced sync (e.g., an fsync on a file), we force the bitmap * to be written so that the update can be done. */ if (waitfor == 0) { FREE_LOCK(ump); return; } retry: if ((inodedep->id_state & (DEPCOMPLETE | GOINGAWAY)) != 0) { FREE_LOCK(ump); return; } ibp = inodedep->id_bmsafemap->sm_buf; ibp = getdirtybuf(ibp, LOCK_PTR(ump), MNT_WAIT); if (ibp == NULL) { /* * If ibp came back as NULL, the dependency could have been * freed while we slept. Look it up again, and check to see * that it has completed. */ if (inodedep_lookup(mp, ip->i_number, 0, &inodedep) != 0) goto retry; FREE_LOCK(ump); return; } FREE_LOCK(ump); if ((error = bwrite(ibp)) != 0) softdep_error("softdep_update_inodeblock: bwrite", error); } /* * Merge the a new inode dependency list (such as id_newinoupdt) into an * old inode dependency list (such as id_inoupdt). This routine must be * called with splbio interrupts blocked. */ static void merge_inode_lists(newlisthead, oldlisthead) struct allocdirectlst *newlisthead; struct allocdirectlst *oldlisthead; { struct allocdirect *listadp, *newadp; newadp = TAILQ_FIRST(newlisthead); for (listadp = TAILQ_FIRST(oldlisthead); listadp && newadp;) { if (listadp->ad_offset < newadp->ad_offset) { listadp = TAILQ_NEXT(listadp, ad_next); continue; } TAILQ_REMOVE(newlisthead, newadp, ad_next); TAILQ_INSERT_BEFORE(listadp, newadp, ad_next); if (listadp->ad_offset == newadp->ad_offset) { allocdirect_merge(oldlisthead, newadp, listadp); listadp = newadp; } newadp = TAILQ_FIRST(newlisthead); } while ((newadp = TAILQ_FIRST(newlisthead)) != NULL) { TAILQ_REMOVE(newlisthead, newadp, ad_next); TAILQ_INSERT_TAIL(oldlisthead, newadp, ad_next); } } /* * If we are doing an fsync, then we must ensure that any directory * entries for the inode have been written after the inode gets to disk. 
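 * In practice this means flushing the directory page (and, if necessary,
 * the parent directory itself) that holds the entry naming this inode.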
*/ int softdep_fsync(vp) struct vnode *vp; /* the "in_core" copy of the inode */ { struct inodedep *inodedep; struct pagedep *pagedep; struct inoref *inoref; struct ufsmount *ump; struct worklist *wk; struct diradd *dap; struct mount *mp; struct vnode *pvp; struct inode *ip; struct buf *bp; struct fs *fs; struct thread *td = curthread; int error, flushparent, pagedep_new_block; ino_t parentino; ufs_lbn_t lbn; ip = VTOI(vp); mp = vp->v_mount; ump = VFSTOUFS(mp); fs = ump->um_fs; if (MOUNTEDSOFTDEP(mp) == 0) return (0); ACQUIRE_LOCK(ump); restart: if (inodedep_lookup(mp, ip->i_number, 0, &inodedep) == 0) { FREE_LOCK(ump); return (0); } TAILQ_FOREACH(inoref, &inodedep->id_inoreflst, if_deps) { if ((inoref->if_state & (DEPCOMPLETE | GOINGAWAY)) == DEPCOMPLETE) { jwait(&inoref->if_list, MNT_WAIT); goto restart; } } if (!LIST_EMPTY(&inodedep->id_inowait) || !TAILQ_EMPTY(&inodedep->id_extupdt) || !TAILQ_EMPTY(&inodedep->id_newextupdt) || !TAILQ_EMPTY(&inodedep->id_inoupdt) || !TAILQ_EMPTY(&inodedep->id_newinoupdt)) panic("softdep_fsync: pending ops %p", inodedep); for (error = 0, flushparent = 0; ; ) { if ((wk = LIST_FIRST(&inodedep->id_pendinghd)) == NULL) break; if (wk->wk_type != D_DIRADD) panic("softdep_fsync: Unexpected type %s", TYPENAME(wk->wk_type)); dap = WK_DIRADD(wk); /* * Flush our parent if this directory entry has a MKDIR_PARENT * dependency or is contained in a newly allocated block. */ if (dap->da_state & DIRCHG) pagedep = dap->da_previous->dm_pagedep; else pagedep = dap->da_pagedep; parentino = pagedep->pd_ino; lbn = pagedep->pd_lbn; if ((dap->da_state & (MKDIR_BODY | COMPLETE)) != COMPLETE) panic("softdep_fsync: dirty"); if ((dap->da_state & MKDIR_PARENT) || (pagedep->pd_state & NEWBLOCK)) flushparent = 1; else flushparent = 0; /* * If we are being fsync'ed as part of vgone'ing this vnode, * then we will not be able to release and recover the * vnode below, so we just have to give up on writing its * directory entry out. It will eventually be written, just * not now, but then the user was not asking to have it * written, so we are not breaking any promises. */ if (vp->v_iflag & VI_DOOMED) break; /* * We prevent deadlock by always fetching inodes from the * root, moving down the directory tree. Thus, when fetching * our parent directory, we first try to get the lock. If * that fails, we must unlock ourselves before requesting * the lock on our parent. See the comment in ufs_lookup * for details on possible races. */ FREE_LOCK(ump); if (ffs_vgetf(mp, parentino, LK_NOWAIT | LK_EXCLUSIVE, &pvp, FFSV_FORCEINSMQ)) { error = vfs_busy(mp, MBF_NOWAIT); if (error != 0) { vfs_ref(mp); VOP_UNLOCK(vp, 0); error = vfs_busy(mp, 0); vn_lock(vp, LK_EXCLUSIVE | LK_RETRY); vfs_rel(mp); if (error != 0) return (ENOENT); if (vp->v_iflag & VI_DOOMED) { vfs_unbusy(mp); return (ENOENT); } } VOP_UNLOCK(vp, 0); error = ffs_vgetf(mp, parentino, LK_EXCLUSIVE, &pvp, FFSV_FORCEINSMQ); vfs_unbusy(mp); vn_lock(vp, LK_EXCLUSIVE | LK_RETRY); if (vp->v_iflag & VI_DOOMED) { if (error == 0) vput(pvp); error = ENOENT; } if (error != 0) return (error); } /* * All MKDIR_PARENT dependencies and all the NEWBLOCK pagedeps * that are contained in direct blocks will be resolved by * doing a ffs_update. Pagedeps contained in indirect blocks * may require a complete sync'ing of the directory. So, we * try the cheap and fast ffs_update first, and if that fails, * then we do the slower ffs_syncvnode of the directory. 
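 * The full ffs_syncvnode() pass below is only attempted when the pagedep
 * is still flagged NEWBLOCK after the parent has been updated.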
*/ if (flushparent) { int locked; if ((error = ffs_update(pvp, 1)) != 0) { vput(pvp); return (error); } ACQUIRE_LOCK(ump); locked = 1; if (inodedep_lookup(mp, ip->i_number, 0, &inodedep) != 0) { if ((wk = LIST_FIRST(&inodedep->id_pendinghd)) != NULL) { if (wk->wk_type != D_DIRADD) panic("softdep_fsync: Unexpected type %s", TYPENAME(wk->wk_type)); dap = WK_DIRADD(wk); if (dap->da_state & DIRCHG) pagedep = dap->da_previous->dm_pagedep; else pagedep = dap->da_pagedep; pagedep_new_block = pagedep->pd_state & NEWBLOCK; FREE_LOCK(ump); locked = 0; if (pagedep_new_block && (error = ffs_syncvnode(pvp, MNT_WAIT, 0))) { vput(pvp); return (error); } } } if (locked) FREE_LOCK(ump); } /* * Flush directory page containing the inode's name. */ error = bread(pvp, lbn, blksize(fs, VTOI(pvp), lbn), td->td_ucred, &bp); if (error == 0) error = bwrite(bp); else brelse(bp); vput(pvp); if (error != 0) return (error); ACQUIRE_LOCK(ump); if (inodedep_lookup(mp, ip->i_number, 0, &inodedep) == 0) break; } FREE_LOCK(ump); return (0); } /* * Flush all the dirty bitmaps associated with the block device * before flushing the rest of the dirty blocks so as to reduce * the number of dependencies that will have to be rolled back. * * XXX Unused? */ void softdep_fsync_mountdev(vp) struct vnode *vp; { struct buf *bp, *nbp; struct worklist *wk; struct bufobj *bo; if (!vn_isdisk(vp, NULL)) panic("softdep_fsync_mountdev: vnode not a disk"); bo = &vp->v_bufobj; restart: BO_LOCK(bo); TAILQ_FOREACH_SAFE(bp, &bo->bo_dirty.bv_hd, b_bobufs, nbp) { /* * If it is already scheduled, skip to the next buffer. */ if (BUF_LOCK(bp, LK_EXCLUSIVE | LK_NOWAIT, NULL)) continue; if ((bp->b_flags & B_DELWRI) == 0) panic("softdep_fsync_mountdev: not dirty"); /* * We are only interested in bitmaps with outstanding * dependencies. */ if ((wk = LIST_FIRST(&bp->b_dep)) == NULL || wk->wk_type != D_BMSAFEMAP || (bp->b_vflags & BV_BKGRDINPROG)) { BUF_UNLOCK(bp); continue; } BO_UNLOCK(bo); bremfree(bp); (void) bawrite(bp); goto restart; } drain_output(vp); BO_UNLOCK(bo); } /* * Sync all cylinder groups that were dirty at the time this function is * called. Newly dirtied cgs will be inserted before the sentinel. This * is used to flush freedep activity that may be holding up writes to a * indirect block. */ static int sync_cgs(mp, waitfor) struct mount *mp; int waitfor; { struct bmsafemap *bmsafemap; struct bmsafemap *sentinel; struct ufsmount *ump; struct buf *bp; int error; sentinel = malloc(sizeof(*sentinel), M_BMSAFEMAP, M_ZERO | M_WAITOK); sentinel->sm_cg = -1; ump = VFSTOUFS(mp); error = 0; ACQUIRE_LOCK(ump); LIST_INSERT_HEAD(&ump->softdep_dirtycg, sentinel, sm_next); for (bmsafemap = LIST_NEXT(sentinel, sm_next); bmsafemap != NULL; bmsafemap = LIST_NEXT(sentinel, sm_next)) { /* Skip sentinels and cgs with no work to release. */ if (bmsafemap->sm_cg == -1 || (LIST_EMPTY(&bmsafemap->sm_freehd) && LIST_EMPTY(&bmsafemap->sm_freewr))) { LIST_REMOVE(sentinel, sm_next); LIST_INSERT_AFTER(bmsafemap, sentinel, sm_next); continue; } /* * If we don't get the lock and we're waiting try again, if * not move on to the next buf and try to sync it. 
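 * Re-inserting the sentinel after each cg keeps our place in the dirty
 * cg list stable while the softdep lock is dropped to issue the write.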
*/ bp = getdirtybuf(bmsafemap->sm_buf, LOCK_PTR(ump), waitfor); if (bp == NULL && waitfor == MNT_WAIT) continue; LIST_REMOVE(sentinel, sm_next); LIST_INSERT_AFTER(bmsafemap, sentinel, sm_next); if (bp == NULL) continue; FREE_LOCK(ump); if (waitfor == MNT_NOWAIT) bawrite(bp); else error = bwrite(bp); ACQUIRE_LOCK(ump); if (error) break; } LIST_REMOVE(sentinel, sm_next); FREE_LOCK(ump); free(sentinel, M_BMSAFEMAP); return (error); } /* * This routine is called when we are trying to synchronously flush a * file. This routine must eliminate any filesystem metadata dependencies * so that the syncing routine can succeed. */ int softdep_sync_metadata(struct vnode *vp) { struct inode *ip; int error; ip = VTOI(vp); KASSERT(MOUNTEDSOFTDEP(vp->v_mount) != 0, ("softdep_sync_metadata called on non-softdep filesystem")); /* * Ensure that any direct block dependencies have been cleared, * truncations are started, and inode references are journaled. */ ACQUIRE_LOCK(VFSTOUFS(vp->v_mount)); /* * Write all journal records to prevent rollbacks on devvp. */ if (vp->v_type == VCHR) softdep_flushjournal(vp->v_mount); error = flush_inodedep_deps(vp, vp->v_mount, ip->i_number); /* * Ensure that all truncates are written so we won't find deps on * indirect blocks. */ process_truncates(vp); FREE_LOCK(VFSTOUFS(vp->v_mount)); return (error); } /* * This routine is called when we are attempting to sync a buf with * dependencies. If waitfor is MNT_NOWAIT it attempts to schedule any * other IO it can but returns EBUSY if the buffer is not yet able to * be written. Dependencies which will not cause rollbacks will always * return 0. */ int softdep_sync_buf(struct vnode *vp, struct buf *bp, int waitfor) { struct indirdep *indirdep; struct pagedep *pagedep; struct allocindir *aip; struct newblk *newblk; struct ufsmount *ump; struct buf *nbp; struct worklist *wk; int i, error; KASSERT(MOUNTEDSOFTDEP(vp->v_mount) != 0, ("softdep_sync_buf called on non-softdep filesystem")); /* * For VCHR we just don't want to force flush any dependencies that * will cause rollbacks. */ if (vp->v_type == VCHR) { if (waitfor == MNT_NOWAIT && softdep_count_dependencies(bp, 0)) return (EBUSY); return (0); } ump = VFSTOUFS(vp->v_mount); ACQUIRE_LOCK(ump); /* * As we hold the buffer locked, none of its dependencies * will disappear. 
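 */

/*
 * Editor's sketch (illustrative only, not FreeBSD code): the MNT_NOWAIT
 * contract of softdep_sync_buf() above, report EBUSY rather than sleep
 * when a dependency would force a rollback, follows this shape.  The
 * names (struct dep, resolve) are hypothetical and the resolve callback
 * stands in for the jwait()/bwrite() work; guarded with #if 0.
 */
#if 0
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct dep {
	struct dep *d_next;
	bool	d_would_rollback;	/* writing now would be rolled back */
};

/* waitok == false models MNT_NOWAIT: never sleep, report EBUSY instead. */
static int
sync_buf_sketch(struct dep *deps, bool waitok, void (*resolve)(struct dep *))
{
	struct dep *d;

	for (d = deps; d != NULL; d = d->d_next) {
		if (!d->d_would_rollback)
			continue;
		if (!waitok)
			return (EBUSY);
		resolve(d);		/* may sleep until the dep clears */
	}
	return (0);
}
#endif

/*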
*/ error = 0; top: LIST_FOREACH(wk, &bp->b_dep, wk_list) { switch (wk->wk_type) { case D_ALLOCDIRECT: case D_ALLOCINDIR: newblk = WK_NEWBLK(wk); if (newblk->nb_jnewblk != NULL) { if (waitfor == MNT_NOWAIT) { error = EBUSY; goto out_unlock; } jwait(&newblk->nb_jnewblk->jn_list, waitfor); goto top; } if (newblk->nb_state & DEPCOMPLETE || waitfor == MNT_NOWAIT) continue; nbp = newblk->nb_bmsafemap->sm_buf; nbp = getdirtybuf(nbp, LOCK_PTR(ump), waitfor); if (nbp == NULL) goto top; FREE_LOCK(ump); if ((error = bwrite(nbp)) != 0) goto out; ACQUIRE_LOCK(ump); continue; case D_INDIRDEP: indirdep = WK_INDIRDEP(wk); if (waitfor == MNT_NOWAIT) { if (!TAILQ_EMPTY(&indirdep->ir_trunc) || !LIST_EMPTY(&indirdep->ir_deplisthd)) { error = EBUSY; goto out_unlock; } } if (!TAILQ_EMPTY(&indirdep->ir_trunc)) panic("softdep_sync_buf: truncation pending."); restart: LIST_FOREACH(aip, &indirdep->ir_deplisthd, ai_next) { newblk = (struct newblk *)aip; if (newblk->nb_jnewblk != NULL) { jwait(&newblk->nb_jnewblk->jn_list, waitfor); goto restart; } if (newblk->nb_state & DEPCOMPLETE) continue; nbp = newblk->nb_bmsafemap->sm_buf; nbp = getdirtybuf(nbp, LOCK_PTR(ump), waitfor); if (nbp == NULL) goto restart; FREE_LOCK(ump); if ((error = bwrite(nbp)) != 0) goto out; ACQUIRE_LOCK(ump); goto restart; } continue; case D_PAGEDEP: /* * Only flush directory entries in synchronous passes. */ if (waitfor != MNT_WAIT) { error = EBUSY; goto out_unlock; } /* * While syncing snapshots, we must allow recursive * lookups. */ BUF_AREC(bp); /* * We are trying to sync a directory that may * have dependencies on both its own metadata * and/or dependencies on the inodes of any * recently allocated files. We walk its diradd * lists pushing out the associated inode. */ pagedep = WK_PAGEDEP(wk); for (i = 0; i < DAHASHSZ; i++) { if (LIST_FIRST(&pagedep->pd_diraddhd[i]) == 0) continue; if ((error = flush_pagedep_deps(vp, wk->wk_mp, &pagedep->pd_diraddhd[i]))) { BUF_NOREC(bp); goto out_unlock; } } BUF_NOREC(bp); continue; case D_FREEWORK: case D_FREEDEP: case D_JSEGDEP: case D_JNEWBLK: continue; default: panic("softdep_sync_buf: Unknown type %s", TYPENAME(wk->wk_type)); /* NOTREACHED */ } } out_unlock: FREE_LOCK(ump); out: return (error); } /* * Flush the dependencies associated with an inodedep. * Called with splbio blocked. */ static int flush_inodedep_deps(vp, mp, ino) struct vnode *vp; struct mount *mp; ino_t ino; { struct inodedep *inodedep; struct inoref *inoref; struct ufsmount *ump; int error, waitfor; /* * This work is done in two passes. The first pass grabs most * of the buffers and begins asynchronously writing them. The * only way to wait for these asynchronous writes is to sleep * on the filesystem vnode which may stay busy for a long time * if the filesystem is active. So, instead, we make a second * pass over the dependencies blocking on each write. In the * usual case we will be blocking against a write that we * initiated, so when it is done the dependency will have been * resolved. Thus the second pass is expected to end quickly. * We give a brief window at the top of the loop to allow * any pending I/O to complete. 
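 */

/*
 * Editor's sketch (illustrative only, not FreeBSD code): the two-pass
 * scheme described above, an asynchronous first pass followed by a
 * blocking second pass over writes the first pass most likely started
 * itself, can be reduced to the loop below.  flush_list() is a
 * hypothetical helper (left undefined) that returns true whenever it
 * issued a write, meaning the current pass should rescan; guarded with
 * #if 0 so it is never compiled.
 */
#if 0
#include <stdbool.h>

enum waitmode { NOWAIT, WAIT };

struct workq;					/* hypothetical dependency list */
static bool	flush_list(struct workq *, enum waitmode);

static void
two_pass_flush_sketch(struct workq *wq)
{
	enum waitmode mode;

	for (mode = NOWAIT;;) {
		if (flush_list(wq, mode))
			continue;	/* progress made, rescan this pass */
		if (mode == WAIT)
			break;		/* blocking pass found nothing left */
		mode = WAIT;		/* switch to the blocking second pass */
	}
}
#endif

/*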
*/ ump = VFSTOUFS(mp); LOCK_OWNED(ump); for (error = 0, waitfor = MNT_NOWAIT; ; ) { if (error) return (error); FREE_LOCK(ump); ACQUIRE_LOCK(ump); restart: if (inodedep_lookup(mp, ino, 0, &inodedep) == 0) return (0); TAILQ_FOREACH(inoref, &inodedep->id_inoreflst, if_deps) { if ((inoref->if_state & (DEPCOMPLETE | GOINGAWAY)) == DEPCOMPLETE) { jwait(&inoref->if_list, MNT_WAIT); goto restart; } } if (flush_deplist(&inodedep->id_inoupdt, waitfor, &error) || flush_deplist(&inodedep->id_newinoupdt, waitfor, &error) || flush_deplist(&inodedep->id_extupdt, waitfor, &error) || flush_deplist(&inodedep->id_newextupdt, waitfor, &error)) continue; /* * If pass2, we are done, otherwise do pass 2. */ if (waitfor == MNT_WAIT) break; waitfor = MNT_WAIT; } /* * Try freeing inodedep in case all dependencies have been removed. */ if (inodedep_lookup(mp, ino, 0, &inodedep) != 0) (void) free_inodedep(inodedep); return (0); } /* * Flush an inode dependency list. * Called with splbio blocked. */ static int flush_deplist(listhead, waitfor, errorp) struct allocdirectlst *listhead; int waitfor; int *errorp; { struct allocdirect *adp; struct newblk *newblk; struct ufsmount *ump; struct buf *bp; if ((adp = TAILQ_FIRST(listhead)) == NULL) return (0); ump = VFSTOUFS(adp->ad_list.wk_mp); LOCK_OWNED(ump); TAILQ_FOREACH(adp, listhead, ad_next) { newblk = (struct newblk *)adp; if (newblk->nb_jnewblk != NULL) { jwait(&newblk->nb_jnewblk->jn_list, MNT_WAIT); return (1); } if (newblk->nb_state & DEPCOMPLETE) continue; bp = newblk->nb_bmsafemap->sm_buf; bp = getdirtybuf(bp, LOCK_PTR(ump), waitfor); if (bp == NULL) { if (waitfor == MNT_NOWAIT) continue; return (1); } FREE_LOCK(ump); if (waitfor == MNT_NOWAIT) bawrite(bp); else *errorp = bwrite(bp); ACQUIRE_LOCK(ump); return (1); } return (0); } /* * Flush dependencies associated with an allocdirect block. */ static int flush_newblk_dep(vp, mp, lbn) struct vnode *vp; struct mount *mp; ufs_lbn_t lbn; { struct newblk *newblk; struct ufsmount *ump; struct bufobj *bo; struct inode *ip; struct buf *bp; ufs2_daddr_t blkno; int error; error = 0; bo = &vp->v_bufobj; ip = VTOI(vp); blkno = DIP(ip, i_db[lbn]); if (blkno == 0) panic("flush_newblk_dep: Missing block"); ump = VFSTOUFS(mp); ACQUIRE_LOCK(ump); /* * Loop until all dependencies related to this block are satisfied. * We must be careful to restart after each sleep in case a write * completes some part of this process for us. */ for (;;) { if (newblk_lookup(mp, blkno, 0, &newblk) == 0) { FREE_LOCK(ump); break; } if (newblk->nb_list.wk_type != D_ALLOCDIRECT) panic("flush_newblk_deps: Bad newblk %p", newblk); /* * Flush the journal. */ if (newblk->nb_jnewblk != NULL) { jwait(&newblk->nb_jnewblk->jn_list, MNT_WAIT); continue; } /* * Write the bitmap dependency. */ if ((newblk->nb_state & DEPCOMPLETE) == 0) { bp = newblk->nb_bmsafemap->sm_buf; bp = getdirtybuf(bp, LOCK_PTR(ump), MNT_WAIT); if (bp == NULL) continue; FREE_LOCK(ump); error = bwrite(bp); if (error) break; ACQUIRE_LOCK(ump); continue; } /* * Write the buffer. */ FREE_LOCK(ump); BO_LOCK(bo); bp = gbincore(bo, lbn); if (bp != NULL) { error = BUF_LOCK(bp, LK_EXCLUSIVE | LK_SLEEPFAIL | LK_INTERLOCK, BO_LOCKPTR(bo)); if (error == ENOLCK) { ACQUIRE_LOCK(ump); error = 0; continue; /* Slept, retry */ } if (error != 0) break; /* Failed */ if (bp->b_flags & B_DELWRI) { bremfree(bp); error = bwrite(bp); if (error) break; } else BUF_UNLOCK(bp); } else BO_UNLOCK(bo); /* * We have to wait for the direct pointers to * point at the newdirblk before the dependency * will go away. 
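 */

/*
 * Editor's sketch (illustrative only, not FreeBSD code): the loop in
 * flush_newblk_dep() below restarts from the top after every step that
 * may have slept, because an unrelated write completion can resolve
 * part or all of the remaining work while we were blocked.  Stripped of
 * the details, the control flow is the retry loop below; dep_exists()
 * and resolve_one_step() are hypothetical helpers, and the sketch is
 * kept out of the build with #if 0.
 */
#if 0
#include <stdbool.h>

struct blockdep;				/* hypothetical dependency state */
static bool	dep_exists(struct blockdep *);	/* re-looked-up every iteration */
static void	resolve_one_step(struct blockdep *);	/* may sleep */

static void
flush_until_clean_sketch(struct blockdep *bd)
{

	for (;;) {
		if (!dep_exists(bd))
			break;		/* resolved, possibly by someone else */
		resolve_one_step(bd);
		/* Restart: never trust state cached from before the sleep. */
	}
}
#endif

/*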
*/ error = ffs_update(vp, 1); if (error) break; ACQUIRE_LOCK(ump); } return (error); } /* * Eliminate a pagedep dependency by flushing out all its diradd dependencies. * Called with splbio blocked. */ static int flush_pagedep_deps(pvp, mp, diraddhdp) struct vnode *pvp; struct mount *mp; struct diraddhd *diraddhdp; { struct inodedep *inodedep; struct inoref *inoref; struct ufsmount *ump; struct diradd *dap; struct vnode *vp; int error = 0; struct buf *bp; ino_t inum; struct diraddhd unfinished; LIST_INIT(&unfinished); ump = VFSTOUFS(mp); LOCK_OWNED(ump); restart: while ((dap = LIST_FIRST(diraddhdp)) != NULL) { /* * Flush ourselves if this directory entry * has a MKDIR_PARENT dependency. */ if (dap->da_state & MKDIR_PARENT) { FREE_LOCK(ump); if ((error = ffs_update(pvp, 1)) != 0) break; ACQUIRE_LOCK(ump); /* * If that cleared dependencies, go on to next. */ if (dap != LIST_FIRST(diraddhdp)) continue; /* * All MKDIR_PARENT dependencies and all the * NEWBLOCK pagedeps that are contained in direct * blocks were resolved by doing above ffs_update. * Pagedeps contained in indirect blocks may * require a complete sync'ing of the directory. * We are in the midst of doing a complete sync, * so if they are not resolved in this pass we * defer them for now as they will be sync'ed by * our caller shortly. */ LIST_REMOVE(dap, da_pdlist); LIST_INSERT_HEAD(&unfinished, dap, da_pdlist); continue; } /* * A newly allocated directory must have its "." and * ".." entries written out before its name can be * committed in its parent. */ inum = dap->da_newinum; if (inodedep_lookup(UFSTOVFS(ump), inum, 0, &inodedep) == 0) panic("flush_pagedep_deps: lost inode1"); /* * Wait for any pending journal adds to complete so we don't * cause rollbacks while syncing. */ TAILQ_FOREACH(inoref, &inodedep->id_inoreflst, if_deps) { if ((inoref->if_state & (DEPCOMPLETE | GOINGAWAY)) == DEPCOMPLETE) { jwait(&inoref->if_list, MNT_WAIT); goto restart; } } if (dap->da_state & MKDIR_BODY) { FREE_LOCK(ump); if ((error = ffs_vgetf(mp, inum, LK_EXCLUSIVE, &vp, FFSV_FORCEINSMQ))) break; error = flush_newblk_dep(vp, mp, 0); /* * If we still have the dependency we might need to * update the vnode to sync the new link count to * disk. */ if (error == 0 && dap == LIST_FIRST(diraddhdp)) error = ffs_update(vp, 1); vput(vp); if (error != 0) break; ACQUIRE_LOCK(ump); /* * If that cleared dependencies, go on to next. */ if (dap != LIST_FIRST(diraddhdp)) continue; if (dap->da_state & MKDIR_BODY) { inodedep_lookup(UFSTOVFS(ump), inum, 0, &inodedep); panic("flush_pagedep_deps: MKDIR_BODY " "inodedep %p dap %p vp %p", inodedep, dap, vp); } } /* * Flush the inode on which the directory entry depends. * Having accounted for MKDIR_PARENT and MKDIR_BODY above, * the only remaining dependency is that the updated inode * count must get pushed to disk. The inode has already * been pushed into its inode buffer (via VOP_UPDATE) at * the time of the reference count change. So we need only * locate that buffer, ensure that there will be no rollback * caused by a bitmap dependency, then write the inode buffer. */ retry: if (inodedep_lookup(UFSTOVFS(ump), inum, 0, &inodedep) == 0) panic("flush_pagedep_deps: lost inode"); /* * If the inode still has bitmap dependencies, * push them to disk. 
*/ if ((inodedep->id_state & (DEPCOMPLETE | GOINGAWAY)) == 0) { bp = inodedep->id_bmsafemap->sm_buf; bp = getdirtybuf(bp, LOCK_PTR(ump), MNT_WAIT); if (bp == NULL) goto retry; FREE_LOCK(ump); if ((error = bwrite(bp)) != 0) break; ACQUIRE_LOCK(ump); if (dap != LIST_FIRST(diraddhdp)) continue; } /* * If the inode is still sitting in a buffer waiting * to be written or waiting for the link count to be * adjusted update it here to flush it to disk. */ if (dap == LIST_FIRST(diraddhdp)) { FREE_LOCK(ump); if ((error = ffs_vgetf(mp, inum, LK_EXCLUSIVE, &vp, FFSV_FORCEINSMQ))) break; error = ffs_update(vp, 1); vput(vp); if (error) break; ACQUIRE_LOCK(ump); } /* * If we have failed to get rid of all the dependencies * then something is seriously wrong. */ if (dap == LIST_FIRST(diraddhdp)) { inodedep_lookup(UFSTOVFS(ump), inum, 0, &inodedep); panic("flush_pagedep_deps: failed to flush " "inodedep %p ino %ju dap %p", inodedep, (uintmax_t)inum, dap); } } if (error) ACQUIRE_LOCK(ump); while ((dap = LIST_FIRST(&unfinished)) != NULL) { LIST_REMOVE(dap, da_pdlist); LIST_INSERT_HEAD(diraddhdp, dap, da_pdlist); } return (error); } /* * A large burst of file addition or deletion activity can drive the * memory load excessively high. First attempt to slow things down * using the techniques below. If that fails, this routine requests * the offending operations to fall back to running synchronously * until the memory load returns to a reasonable level. */ int softdep_slowdown(vp) struct vnode *vp; { struct ufsmount *ump; int jlow; int max_softdeps_hard; KASSERT(MOUNTEDSOFTDEP(vp->v_mount) != 0, ("softdep_slowdown called on non-softdep filesystem")); ump = VFSTOUFS(vp->v_mount); ACQUIRE_LOCK(ump); jlow = 0; /* * Check for journal space if needed. */ if (DOINGSUJ(vp)) { if (journal_space(ump, 0) == 0) jlow = 1; } /* * If the system is under its limits and our filesystem is * not responsible for more than our share of the usage and * we are not low on journal space, then no need to slow down. */ max_softdeps_hard = max_softdeps * 11 / 10; if (dep_current[D_DIRREM] < max_softdeps_hard / 2 && dep_current[D_INODEDEP] < max_softdeps_hard && dep_current[D_INDIRDEP] < max_softdeps_hard / 1000 && dep_current[D_FREEBLKS] < max_softdeps_hard && jlow == 0 && ump->softdep_curdeps[D_DIRREM] < (max_softdeps_hard / 2) / stat_flush_threads && ump->softdep_curdeps[D_INODEDEP] < max_softdeps_hard / stat_flush_threads && ump->softdep_curdeps[D_INDIRDEP] < (max_softdeps_hard / 1000) / stat_flush_threads && ump->softdep_curdeps[D_FREEBLKS] < max_softdeps_hard / stat_flush_threads) { FREE_LOCK(ump); return (0); } /* * If the journal is low or our filesystem is over its limit * then speedup the cleanup. */ if (ump->softdep_curdeps[D_INDIRDEP] < (max_softdeps_hard / 1000) / stat_flush_threads || jlow) softdep_speedup(ump); stat_sync_limit_hit += 1; FREE_LOCK(ump); /* * We only slow down the rate at which new dependencies are * generated if we are not using journaling. With journaling, * the cleanup should always be sufficient to keep things * under control. */ if (DOINGSUJ(vp)) return (0); return (1); } /* * Called by the allocation routines when they are about to fail * in the hope that we can free up the requested resource (inodes * or disk space). * * First check to see if the work list has anything on it. If it has, * clean up entries until we successfully free the requested resource. * Because this process holds inodes locked, we cannot handle any remove * requests that might block on a locked inode as that could lead to * deadlock. 
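 */

/*
 * Editor's sketch (illustrative only, not FreeBSD code): the first
 * stage described in this comment, reaping worklist entries until the
 * requested resource (inodes or blocks) is freed, has the shape below.
 * The struct and helper names (cleanup_state, process_one_item) are
 * hypothetical bookkeeping, not kernel APIs; guarded with #if 0.
 */
#if 0
#include <stdbool.h>

struct cleanup_state {
	long	cs_free;		/* blocks (or inodes) currently free */
	long	cs_needed;		/* target before declaring success */
	int	cs_worklist_len;	/* pending dependency work items */
};

static void	process_one_item(struct cleanup_state *);	/* frees resources */

/* Returns true once enough of the resource has been reclaimed. */
static bool
reap_worklist_sketch(struct cleanup_state *cs)
{

	while (cs->cs_free < cs->cs_needed) {
		if (cs->cs_worklist_len == 0)
			return (false);	/* worklist exhausted, move to stage 2 */
		/* Must never block on a locked inode, to avoid deadlock. */
		process_one_item(cs);
	}
	return (true);
}
#endif

/*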
If the worklist yields none of the requested resource, * start syncing out vnodes to free up the needed space. */ int softdep_request_cleanup(fs, vp, cred, resource) struct fs *fs; struct vnode *vp; struct ucred *cred; int resource; { struct ufsmount *ump; struct mount *mp; long starttime; ufs2_daddr_t needed; int error, failed_vnode; /* * If we are being called because of a process doing a * copy-on-write, then it is not safe to process any * worklist items as we will recurse into the copyonwrite * routine. This will result in an incoherent snapshot. * If the vnode that we hold is a snapshot, we must avoid * handling other resources that could cause deadlock. */ if ((curthread->td_pflags & TDP_COWINPROGRESS) || IS_SNAPSHOT(VTOI(vp))) return (0); if (resource == FLUSH_BLOCKS_WAIT) stat_cleanup_blkrequests += 1; else stat_cleanup_inorequests += 1; mp = vp->v_mount; ump = VFSTOUFS(mp); mtx_assert(UFS_MTX(ump), MA_OWNED); UFS_UNLOCK(ump); error = ffs_update(vp, 1); if (error != 0 || MOUNTEDSOFTDEP(mp) == 0) { UFS_LOCK(ump); return (0); } /* * If we are in need of resources, start by cleaning up * any block removals associated with our inode. */ ACQUIRE_LOCK(ump); process_removes(vp); process_truncates(vp); FREE_LOCK(ump); /* * Now clean up at least as many resources as we will need. * * When requested to clean up inodes, the number that are needed * is set by the number of simultaneous writers (mnt_writeopcount) * plus a bit of slop (2) in case some more writers show up while * we are cleaning. * * When requested to free up space, the amount of space that * we need is enough blocks to allocate a full-sized segment * (fs_contigsumsize). The number of such segments that will * be needed is set by the number of simultaneous writers * (mnt_writeopcount) plus a bit of slop (2) in case some more * writers show up while we are cleaning. * * Additionally, if we are unpriviledged and allocating space, * we need to ensure that we clean up enough blocks to get the * needed number of blocks over the threshold of the minimum * number of blocks required to be kept free by the filesystem * (fs_minfree). */ if (resource == FLUSH_INODES_WAIT) { needed = vp->v_mount->mnt_writeopcount + 2; } else if (resource == FLUSH_BLOCKS_WAIT) { needed = (vp->v_mount->mnt_writeopcount + 2) * fs->fs_contigsumsize; if (priv_check_cred(cred, PRIV_VFS_BLOCKRESERVE, 0)) needed += fragstoblks(fs, roundup((fs->fs_dsize * fs->fs_minfree / 100) - fs->fs_cstotal.cs_nffree, fs->fs_frag)); } else { UFS_LOCK(ump); printf("softdep_request_cleanup: Unknown resource type %d\n", resource); return (0); } starttime = time_second; retry: if ((resource == FLUSH_BLOCKS_WAIT && ump->softdep_on_worklist > 0 && fs->fs_cstotal.cs_nbfree <= needed) || (resource == FLUSH_INODES_WAIT && fs->fs_pendinginodes > 0 && fs->fs_cstotal.cs_nifree <= needed)) { ACQUIRE_LOCK(ump); if (ump->softdep_on_worklist > 0 && process_worklist_item(UFSTOVFS(ump), ump->softdep_on_worklist, LK_NOWAIT) != 0) stat_worklist_push += 1; FREE_LOCK(ump); } /* * If we still need resources and there are no more worklist * entries to process to obtain them, we have to start flushing * the dirty vnodes to force the release of additional requests * to the worklist that we can then process to reap addition * resources. We walk the vnodes associated with the mount point * until we get the needed worklist requests that we can reap. * * If there are several threads all needing to clean the same * mount point, only one is allowed to walk the mount list. 
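 */

/*
 * Editor's sketch (illustrative only, not FreeBSD code): the
 * "only one thread walks the mount list" rule described here is a
 * classic single-flusher flag, which the code below models in userland
 * with a pthread mutex standing in for the softdep lock and
 * flush_active standing in for FLUSH_RC_ACTIVE.  The helper
 * walk_and_flush_vnodes() is hypothetical and left undefined; the
 * sketch is guarded with #if 0.
 */
#if 0
#include <pthread.h>
#include <stdbool.h>

static pthread_mutex_t	flush_lock = PTHREAD_MUTEX_INITIALIZER;
static bool		flush_active;	/* models FLUSH_RC_ACTIVE */

static void	walk_and_flush_vnodes(void);	/* expensive vnode scan */

/* Returns true if this thread performed the scan, false if it skipped. */
static bool
try_flush_sketch(void)
{
	bool we_run;

	pthread_mutex_lock(&flush_lock);
	we_run = !flush_active;
	if (we_run)
		flush_active = true;
	pthread_mutex_unlock(&flush_lock);

	if (!we_run)
		return (false);		/* someone else is already flushing */

	walk_and_flush_vnodes();

	pthread_mutex_lock(&flush_lock);
	flush_active = false;
	pthread_mutex_unlock(&flush_lock);
	return (true);
}
#endif

/*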
* When several threads all try to walk the same mount list, * they end up competing with each other and often end up in * livelock. This approach ensures that forward progress is * made at the cost of occational ENOSPC errors being returned * that might otherwise have been avoided. */ error = 1; if ((resource == FLUSH_BLOCKS_WAIT && fs->fs_cstotal.cs_nbfree <= needed) || (resource == FLUSH_INODES_WAIT && fs->fs_pendinginodes > 0 && fs->fs_cstotal.cs_nifree <= needed)) { ACQUIRE_LOCK(ump); if ((ump->um_softdep->sd_flags & FLUSH_RC_ACTIVE) == 0) { ump->um_softdep->sd_flags |= FLUSH_RC_ACTIVE; FREE_LOCK(ump); failed_vnode = softdep_request_cleanup_flush(mp, ump); ACQUIRE_LOCK(ump); ump->um_softdep->sd_flags &= ~FLUSH_RC_ACTIVE; FREE_LOCK(ump); if (ump->softdep_on_worklist > 0) { stat_cleanup_retries += 1; if (!failed_vnode) goto retry; } } else { FREE_LOCK(ump); error = 0; } stat_cleanup_failures += 1; } if (time_second - starttime > stat_cleanup_high_delay) stat_cleanup_high_delay = time_second - starttime; UFS_LOCK(ump); return (error); } /* * Scan the vnodes for the specified mount point flushing out any * vnodes that can be locked without waiting. Finally, try to flush * the device associated with the mount point if it can be locked * without waiting. * * We return 0 if we were able to lock every vnode in our scan. * If we had to skip one or more vnodes, we return 1. */ static int softdep_request_cleanup_flush(mp, ump) struct mount *mp; struct ufsmount *ump; { struct thread *td; struct vnode *lvp, *mvp; int failed_vnode; failed_vnode = 0; td = curthread; MNT_VNODE_FOREACH_ALL(lvp, mp, mvp) { if (TAILQ_FIRST(&lvp->v_bufobj.bo_dirty.bv_hd) == 0) { VI_UNLOCK(lvp); continue; } if (vget(lvp, LK_EXCLUSIVE | LK_INTERLOCK | LK_NOWAIT, td) != 0) { failed_vnode = 1; continue; } if (lvp->v_vflag & VV_NOSYNC) { /* unlinked */ vput(lvp); continue; } (void) ffs_syncvnode(lvp, MNT_NOWAIT, 0); vput(lvp); } lvp = ump->um_devvp; if (vn_lock(lvp, LK_EXCLUSIVE | LK_NOWAIT) == 0) { VOP_FSYNC(lvp, MNT_NOWAIT, td); VOP_UNLOCK(lvp, 0); } return (failed_vnode); } static bool softdep_excess_items(struct ufsmount *ump, int item) { KASSERT(item >= 0 && item < D_LAST, ("item %d", item)); return (dep_current[item] > max_softdeps && ump->softdep_curdeps[item] > max_softdeps / stat_flush_threads); } static void schedule_cleanup(struct mount *mp) { struct ufsmount *ump; struct thread *td; ump = VFSTOUFS(mp); LOCK_OWNED(ump); FREE_LOCK(ump); td = curthread; if ((td->td_pflags & TDP_KTHREAD) != 0 && (td->td_proc->p_flag2 & P2_AST_SU) == 0) { /* * No ast is delivered to kernel threads, so nobody * would deref the mp. Some kernel threads * explicitely check for AST, e.g. NFS daemon does * this in the serving loop. 
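 */

/*
 * Editor's sketch (illustrative only, not FreeBSD code): the
 * schedule_cleanup()/softdep_ast_cleanup_proc() pair below defers the
 * heavy cleanup to a per-thread slot that is drained later from a safe
 * context (the AST taken on return to user mode).  A userland analogue
 * uses a C11 _Thread_local pointer in place of td_su; mount_ref,
 * heavy_cleanup() and mount_ref_drop() are hypothetical names, and the
 * #if 0 guard keeps the sketch out of any build.
 */
#if 0
#include <stddef.h>

struct mount_ref;				/* hypothetical mount reference */

/* One deferred-cleanup slot per thread, drained from a safe context. */
static _Thread_local struct mount_ref *pending_cleanup;

static void	heavy_cleanup(struct mount_ref *);	/* may sleep or recurse */
static void	mount_ref_drop(struct mount_ref *);

/* Called from deep in the allocation path: just remember the work. */
static void
schedule_cleanup_sketch(struct mount_ref *mp)
{

	if (pending_cleanup != NULL)
		mount_ref_drop(pending_cleanup);
	pending_cleanup = mp;
}

/* Called later, from the deferred hook, with no awkward locks held. */
static void
deferred_cleanup_sketch(void)
{
	struct mount_ref *mp;

	while ((mp = pending_cleanup) != NULL) {
		pending_cleanup = NULL;
		heavy_cleanup(mp);	/* may schedule further work */
		mount_ref_drop(mp);
	}
}
#endif

/*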
*/ return; } if (td->td_su != NULL) vfs_rel(td->td_su); vfs_ref(mp); td->td_su = mp; thread_lock(td); td->td_flags |= TDF_ASTPENDING; thread_unlock(td); } static void softdep_ast_cleanup_proc(struct thread *td) { struct mount *mp; struct ufsmount *ump; int error; bool req; while ((mp = td->td_su) != NULL) { td->td_su = NULL; error = vfs_busy(mp, MBF_NOWAIT); vfs_rel(mp); if (error != 0) return; if (ffs_own_mount(mp) && MOUNTEDSOFTDEP(mp)) { ump = VFSTOUFS(mp); for (;;) { req = false; ACQUIRE_LOCK(ump); if (softdep_excess_items(ump, D_INODEDEP)) { req = true; request_cleanup(mp, FLUSH_INODES); } if (softdep_excess_items(ump, D_DIRREM)) { req = true; request_cleanup(mp, FLUSH_BLOCKS); } FREE_LOCK(ump); if (softdep_excess_items(ump, D_NEWBLK) || softdep_excess_items(ump, D_ALLOCDIRECT) || softdep_excess_items(ump, D_ALLOCINDIR)) { error = vn_start_write(NULL, &mp, V_WAIT); if (error == 0) { req = true; VFS_SYNC(mp, MNT_WAIT); vn_finished_write(mp); } } if ((td->td_pflags & TDP_KTHREAD) != 0 || !req) break; } } vfs_unbusy(mp); } if ((mp = td->td_su) != NULL) { td->td_su = NULL; vfs_rel(mp); } } /* * If memory utilization has gotten too high, deliberately slow things * down and speed up the I/O processing. */ static int request_cleanup(mp, resource) struct mount *mp; int resource; { struct thread *td = curthread; struct ufsmount *ump; ump = VFSTOUFS(mp); LOCK_OWNED(ump); /* * We never hold up the filesystem syncer or buf daemon. */ if (td->td_pflags & (TDP_SOFTDEP|TDP_NORUNNINGBUF)) return (0); /* * First check to see if the work list has gotten backlogged. * If it has, co-opt this process to help clean up two entries. * Because this process may hold inodes locked, we cannot * handle any remove requests that might block on a locked * inode as that could lead to deadlock. We set TDP_SOFTDEP * to avoid recursively processing the worklist. */ if (ump->softdep_on_worklist > max_softdeps / 10) { td->td_pflags |= TDP_SOFTDEP; process_worklist_item(mp, 2, LK_NOWAIT); td->td_pflags &= ~TDP_SOFTDEP; stat_worklist_push += 2; return(1); } /* * Next, we attempt to speed up the syncer process. If that * is successful, then we allow the process to continue. */ if (softdep_speedup(ump) && resource != FLUSH_BLOCKS_WAIT && resource != FLUSH_INODES_WAIT) return(0); /* * If we are resource constrained on inode dependencies, try * flushing some dirty inodes. Otherwise, we are constrained * by file deletions, so try accelerating flushes of directories * with removal dependencies. We would like to do the cleanup * here, but we probably hold an inode locked at this point and * that might deadlock against one that we try to clean. So, * the best that we can do is request the syncer daemon to do * the cleanup for us. */ switch (resource) { case FLUSH_INODES: case FLUSH_INODES_WAIT: ACQUIRE_GBLLOCK(&lk); stat_ino_limit_push += 1; req_clear_inodedeps += 1; FREE_GBLLOCK(&lk); stat_countp = &stat_ino_limit_hit; break; case FLUSH_BLOCKS: case FLUSH_BLOCKS_WAIT: ACQUIRE_GBLLOCK(&lk); stat_blk_limit_push += 1; req_clear_remove += 1; FREE_GBLLOCK(&lk); stat_countp = &stat_blk_limit_hit; break; default: panic("request_cleanup: unknown type"); } /* * Hopefully the syncer daemon will catch up and awaken us. * We wait at most tickdelay before proceeding in any case. */ ACQUIRE_GBLLOCK(&lk); FREE_LOCK(ump); proc_waiting += 1; if (callout_pending(&softdep_callout) == FALSE) callout_reset(&softdep_callout, tickdelay > 2 ? 
tickdelay : 2, pause_timer, 0); if ((td->td_pflags & TDP_KTHREAD) == 0) msleep((caddr_t)&proc_waiting, &lk, PPAUSE, "softupdate", 0); proc_waiting -= 1; FREE_GBLLOCK(&lk); ACQUIRE_LOCK(ump); return (1); } /* * Awaken processes pausing in request_cleanup and clear proc_waiting * to indicate that there is no longer a timer running. Pause_timer * will be called with the global softdep mutex (&lk) locked. */ static void pause_timer(arg) void *arg; { GBLLOCK_OWNED(&lk); /* * The callout_ API has acquired mtx and will hold it around this * function call. */ *stat_countp += proc_waiting; wakeup(&proc_waiting); } /* * If requested, try removing inode or removal dependencies. */ static void check_clear_deps(mp) struct mount *mp; { /* * If we are suspended, it may be because of our using * too many inodedeps, so help clear them out. */ if (MOUNTEDSUJ(mp) && VFSTOUFS(mp)->softdep_jblocks->jb_suspended) clear_inodedeps(mp); /* * General requests for cleanup of backed up dependencies */ ACQUIRE_GBLLOCK(&lk); if (req_clear_inodedeps) { req_clear_inodedeps -= 1; FREE_GBLLOCK(&lk); clear_inodedeps(mp); ACQUIRE_GBLLOCK(&lk); wakeup(&proc_waiting); } if (req_clear_remove) { req_clear_remove -= 1; FREE_GBLLOCK(&lk); clear_remove(mp); ACQUIRE_GBLLOCK(&lk); wakeup(&proc_waiting); } FREE_GBLLOCK(&lk); } /* * Flush out a directory with at least one removal dependency in an effort to * reduce the number of dirrem, freefile, and freeblks dependency structures. */ static void clear_remove(mp) struct mount *mp; { struct pagedep_hashhead *pagedephd; struct pagedep *pagedep; struct ufsmount *ump; struct vnode *vp; struct bufobj *bo; int error, cnt; ino_t ino; ump = VFSTOUFS(mp); LOCK_OWNED(ump); for (cnt = 0; cnt <= ump->pagedep_hash_size; cnt++) { pagedephd = &ump->pagedep_hashtbl[ump->pagedep_nextclean++]; if (ump->pagedep_nextclean > ump->pagedep_hash_size) ump->pagedep_nextclean = 0; LIST_FOREACH(pagedep, pagedephd, pd_hash) { if (LIST_EMPTY(&pagedep->pd_dirremhd)) continue; ino = pagedep->pd_ino; if (vn_start_write(NULL, &mp, V_NOWAIT) != 0) continue; FREE_LOCK(ump); /* * Let unmount clear deps */ error = vfs_busy(mp, MBF_NOWAIT); if (error != 0) goto finish_write; error = ffs_vgetf(mp, ino, LK_EXCLUSIVE, &vp, FFSV_FORCEINSMQ); vfs_unbusy(mp); if (error != 0) { softdep_error("clear_remove: vget", error); goto finish_write; } if ((error = ffs_syncvnode(vp, MNT_NOWAIT, 0))) softdep_error("clear_remove: fsync", error); bo = &vp->v_bufobj; BO_LOCK(bo); drain_output(vp); BO_UNLOCK(bo); vput(vp); finish_write: vn_finished_write(mp); ACQUIRE_LOCK(ump); return; } } } /* * Clear out a block of dirty inodes in an effort to reduce * the number of inodedep dependency structures. */ static void clear_inodedeps(mp) struct mount *mp; { struct inodedep_hashhead *inodedephd; struct inodedep *inodedep; struct ufsmount *ump; struct vnode *vp; struct fs *fs; int error, cnt; ino_t firstino, lastino, ino; ump = VFSTOUFS(mp); fs = ump->um_fs; LOCK_OWNED(ump); /* * Pick a random inode dependency to be cleared. * We will then gather up all the inodes in its block * that have dependencies and flush them out. */ for (cnt = 0; cnt <= ump->inodedep_hash_size; cnt++) { inodedephd = &ump->inodedep_hashtbl[ump->inodedep_nextclean++]; if (ump->inodedep_nextclean > ump->inodedep_hash_size) ump->inodedep_nextclean = 0; if ((inodedep = LIST_FIRST(inodedephd)) != NULL) break; } if (inodedep == NULL) return; /* * Find the last inode in the block with dependencies. 
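 */

/*
 * Editor's sketch (illustrative only, not FreeBSD code): the push loop
 * that follows flushes every inode sharing one on-disk inode block,
 * asynchronously for all but the last and synchronously for the last,
 * which forces the shared block itself to disk and releases its
 * inodedeps.  Reduced to its essentials it looks like the loop below;
 * inode_ref and the four helpers are hypothetical and left undefined,
 * and the sketch is guarded with #if 0.
 */
#if 0
#include <stdbool.h>

struct inode_ref;				/* hypothetical in-core inode handle */
static bool	get_inode(int ino, struct inode_ref **);
static void	start_async_flush(struct inode_ref *);	/* does not wait */
static void	flush_and_wait(struct inode_ref *);	/* waits for the write */
static void	release(struct inode_ref *);

static void
flush_inode_block_sketch(int firstino, int lastino)
{
	struct inode_ref *ip;
	int ino;

	for (ino = firstino; ino <= lastino; ino++) {
		if (!get_inode(ino, &ip))
			continue;	/* no dependencies on this inode */
		if (ino == lastino)
			flush_and_wait(ip);	/* forces the block out */
		else
			start_async_flush(ip);
		release(ip);
	}
}
#endif

/*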
*/ firstino = rounddown2(inodedep->id_ino, INOPB(fs)); for (lastino = firstino + INOPB(fs) - 1; lastino > firstino; lastino--) if (inodedep_lookup(mp, lastino, 0, &inodedep) != 0) break; /* * Asynchronously push all but the last inode with dependencies. * Synchronously push the last inode with dependencies to ensure * that the inode block gets written to free up the inodedeps. */ for (ino = firstino; ino <= lastino; ino++) { if (inodedep_lookup(mp, ino, 0, &inodedep) == 0) continue; if (vn_start_write(NULL, &mp, V_NOWAIT) != 0) continue; FREE_LOCK(ump); error = vfs_busy(mp, MBF_NOWAIT); /* Let unmount clear deps */ if (error != 0) { vn_finished_write(mp); ACQUIRE_LOCK(ump); return; } if ((error = ffs_vgetf(mp, ino, LK_EXCLUSIVE, &vp, FFSV_FORCEINSMQ)) != 0) { softdep_error("clear_inodedeps: vget", error); vfs_unbusy(mp); vn_finished_write(mp); ACQUIRE_LOCK(ump); return; } vfs_unbusy(mp); if (ino == lastino) { if ((error = ffs_syncvnode(vp, MNT_WAIT, 0))) softdep_error("clear_inodedeps: fsync1", error); } else { if ((error = ffs_syncvnode(vp, MNT_NOWAIT, 0))) softdep_error("clear_inodedeps: fsync2", error); BO_LOCK(&vp->v_bufobj); drain_output(vp); BO_UNLOCK(&vp->v_bufobj); } vput(vp); vn_finished_write(mp); ACQUIRE_LOCK(ump); } } void softdep_buf_append(bp, wkhd) struct buf *bp; struct workhead *wkhd; { struct worklist *wk; struct ufsmount *ump; if ((wk = LIST_FIRST(wkhd)) == NULL) return; KASSERT(MOUNTEDSOFTDEP(wk->wk_mp) != 0, ("softdep_buf_append called on non-softdep filesystem")); ump = VFSTOUFS(wk->wk_mp); ACQUIRE_LOCK(ump); while ((wk = LIST_FIRST(wkhd)) != NULL) { WORKLIST_REMOVE(wk); WORKLIST_INSERT(&bp->b_dep, wk); } FREE_LOCK(ump); } void softdep_inode_append(ip, cred, wkhd) struct inode *ip; struct ucred *cred; struct workhead *wkhd; { struct buf *bp; struct fs *fs; struct ufsmount *ump; int error; ump = ITOUMP(ip); KASSERT(MOUNTEDSOFTDEP(UFSTOVFS(ump)) != 0, ("softdep_inode_append called on non-softdep filesystem")); fs = ump->um_fs; error = bread(ump->um_devvp, fsbtodb(fs, ino_to_fsba(fs, ip->i_number)), (int)fs->fs_bsize, cred, &bp); if (error) { bqrelse(bp); softdep_freework(wkhd); return; } softdep_buf_append(bp, wkhd); bqrelse(bp); } void softdep_freework(wkhd) struct workhead *wkhd; { struct worklist *wk; struct ufsmount *ump; if ((wk = LIST_FIRST(wkhd)) == NULL) return; KASSERT(MOUNTEDSOFTDEP(wk->wk_mp) != 0, ("softdep_freework called on non-softdep filesystem")); ump = VFSTOUFS(wk->wk_mp); ACQUIRE_LOCK(ump); handle_jwork(wkhd); FREE_LOCK(ump); } /* * Function to determine if the buffer has outstanding dependencies * that will cause a roll-back if the buffer is written. If wantcount * is set, return number of dependencies, otherwise just yes or no. */ static int softdep_count_dependencies(bp, wantcount) struct buf *bp; int wantcount; { struct worklist *wk; struct ufsmount *ump; struct bmsafemap *bmsafemap; struct freework *freework; struct inodedep *inodedep; struct indirdep *indirdep; struct freeblks *freeblks; struct allocindir *aip; struct pagedep *pagedep; struct dirrem *dirrem; struct newblk *newblk; struct mkdir *mkdir; struct diradd *dap; struct vnode *vp; struct mount *mp; int i, retval; retval = 0; if (LIST_EMPTY(&bp->b_dep)) return (0); vp = bp->b_vp; /* * The ump mount point is stable after we get a correct * pointer, since bp is locked and this prevents unmount from * proceed. But to get to it, we cannot dereference bp->b_dep * head wk_mp, because we do not yet own SU ump lock and * workitem might be freed while dereferenced. 
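 */

/*
 * Editor's sketch (illustrative only, not FreeBSD code): the VCHR case
 * just below reads the mount pointer only while a short interlock is
 * held and retries if it observes NULL, rather than taking the heavier
 * vnode lock.  A userland analogue of that "read a pointer under an
 * interlock, retry until stable" pattern is shown here; devnode,
 * mountpt and the field names are hypothetical, a pthread mutex stands
 * in for the vnode interlock, and #if 0 keeps it out of any build.
 */
#if 0
#include <pthread.h>
#include <stddef.h>

struct mountpt;					/* hypothetical mount point */

struct devnode {
	pthread_mutex_t	dn_interlock;	/* cheap per-node interlock */
	struct mountpt	*dn_mountpt;	/* set and cleared under dn_interlock */
};

static struct mountpt *
stable_mountpt_sketch(struct devnode *dn)
{
	struct mountpt *mp;

	for (;;) {
		pthread_mutex_lock(&dn->dn_interlock);
		mp = dn->dn_mountpt;
		pthread_mutex_unlock(&dn->dn_interlock);
		if (mp != NULL)
			return (mp);	/* caller's own locks keep it alive */
		/* Raced with a mount or unmount in progress; try again. */
	}
}
#endif

/*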
*/ retry: if (vp->v_type == VCHR) { - VOP_LOCK(vp, LK_RETRY | LK_EXCLUSIVE); + VI_LOCK(vp); mp = vp->v_type == VCHR ? vp->v_rdev->si_mountpt : NULL; - VOP_UNLOCK(vp, 0); + VI_UNLOCK(vp); if (mp == NULL) goto retry; } else if (vp->v_type == VREG) { mp = vp->v_mount; } else { return (0); } ump = VFSTOUFS(mp); ACQUIRE_LOCK(ump); LIST_FOREACH(wk, &bp->b_dep, wk_list) { switch (wk->wk_type) { case D_INODEDEP: inodedep = WK_INODEDEP(wk); if ((inodedep->id_state & DEPCOMPLETE) == 0) { /* bitmap allocation dependency */ retval += 1; if (!wantcount) goto out; } if (TAILQ_FIRST(&inodedep->id_inoupdt)) { /* direct block pointer dependency */ retval += 1; if (!wantcount) goto out; } if (TAILQ_FIRST(&inodedep->id_extupdt)) { /* direct block pointer dependency */ retval += 1; if (!wantcount) goto out; } if (TAILQ_FIRST(&inodedep->id_inoreflst)) { /* Add reference dependency. */ retval += 1; if (!wantcount) goto out; } continue; case D_INDIRDEP: indirdep = WK_INDIRDEP(wk); TAILQ_FOREACH(freework, &indirdep->ir_trunc, fw_next) { /* indirect truncation dependency */ retval += 1; if (!wantcount) goto out; } LIST_FOREACH(aip, &indirdep->ir_deplisthd, ai_next) { /* indirect block pointer dependency */ retval += 1; if (!wantcount) goto out; } continue; case D_PAGEDEP: pagedep = WK_PAGEDEP(wk); LIST_FOREACH(dirrem, &pagedep->pd_dirremhd, dm_next) { if (LIST_FIRST(&dirrem->dm_jremrefhd)) { /* Journal remove ref dependency. */ retval += 1; if (!wantcount) goto out; } } for (i = 0; i < DAHASHSZ; i++) { LIST_FOREACH(dap, &pagedep->pd_diraddhd[i], da_pdlist) { /* directory entry dependency */ retval += 1; if (!wantcount) goto out; } } continue; case D_BMSAFEMAP: bmsafemap = WK_BMSAFEMAP(wk); if (LIST_FIRST(&bmsafemap->sm_jaddrefhd)) { /* Add reference dependency. */ retval += 1; if (!wantcount) goto out; } if (LIST_FIRST(&bmsafemap->sm_jnewblkhd)) { /* Allocate block dependency. */ retval += 1; if (!wantcount) goto out; } continue; case D_FREEBLKS: freeblks = WK_FREEBLKS(wk); if (LIST_FIRST(&freeblks->fb_jblkdephd)) { /* Freeblk journal dependency. */ retval += 1; if (!wantcount) goto out; } continue; case D_ALLOCDIRECT: case D_ALLOCINDIR: newblk = WK_NEWBLK(wk); if (newblk->nb_jnewblk) { /* Journal allocate dependency. */ retval += 1; if (!wantcount) goto out; } continue; case D_MKDIR: mkdir = WK_MKDIR(wk); if (mkdir->md_jaddref) { /* Journal reference dependency. */ retval += 1; if (!wantcount) goto out; } continue; case D_FREEWORK: case D_FREEDEP: case D_JSEGDEP: case D_JSEG: case D_SBDEP: /* never a dependency on these blocks */ continue; default: panic("softdep_count_dependencies: Unexpected type %s", TYPENAME(wk->wk_type)); /* NOTREACHED */ } } out: FREE_LOCK(ump); return (retval); } /* * Acquire exclusive access to a buffer. * Must be called with a locked mtx parameter. * Return acquired buffer or NULL on failure. */ static struct buf * getdirtybuf(bp, lock, waitfor) struct buf *bp; struct rwlock *lock; int waitfor; { int error; if (BUF_LOCK(bp, LK_EXCLUSIVE | LK_NOWAIT, NULL) != 0) { if (waitfor != MNT_WAIT) return (NULL); error = BUF_LOCK(bp, LK_EXCLUSIVE | LK_SLEEPFAIL | LK_INTERLOCK, lock); /* * Even if we successfully acquire bp here, we have dropped * lock, which may violates our guarantee. 
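 */

/*
 * Editor's sketch (illustrative only, not FreeBSD code): getdirtybuf()
 * above is an instance of a common lock-ordering dance: called with a
 * list lock held, it may only sleep for the buffer lock after dropping
 * the list lock, and once it has dropped that lock it must hand back
 * NULL so the caller re-validates everything, even if the buffer lock
 * was eventually obtained.  A userland rendition with pthread mutexes
 * standing in for the kernel lock primitives follows; bufref and the
 * field names are hypothetical, and #if 0 keeps it out of any build.
 */
#if 0
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

struct bufref {
	pthread_mutex_t	br_lock;	/* per-buffer lock */
	bool		br_dirty;
};

/* Called with *listlock held; returns the locked buffer or NULL. */
static struct bufref *
getdirtybuf_sketch(struct bufref *bp, pthread_mutex_t *listlock, bool waitok)
{

	if (pthread_mutex_trylock(&bp->br_lock) != 0) {
		if (!waitok)
			return (NULL);
		/* Honor lock order: drop the list lock before sleeping. */
		pthread_mutex_unlock(listlock);
		pthread_mutex_lock(&bp->br_lock);
		pthread_mutex_unlock(&bp->br_lock);
		pthread_mutex_lock(listlock);
		return (NULL);		/* slept: caller must re-validate */
	}
	if (!bp->br_dirty) {
		pthread_mutex_unlock(&bp->br_lock);
		return (NULL);		/* nothing to write */
	}
	return (bp);			/* returned locked */
}
#endif

/*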
*/ if (error == 0) BUF_UNLOCK(bp); else if (error != ENOLCK) panic("getdirtybuf: inconsistent lock: %d", error); rw_wlock(lock); return (NULL); } if ((bp->b_vflags & BV_BKGRDINPROG) != 0) { if (lock != BO_LOCKPTR(bp->b_bufobj) && waitfor == MNT_WAIT) { rw_wunlock(lock); BO_LOCK(bp->b_bufobj); BUF_UNLOCK(bp); if ((bp->b_vflags & BV_BKGRDINPROG) != 0) { bp->b_vflags |= BV_BKGRDWAIT; msleep(&bp->b_xflags, BO_LOCKPTR(bp->b_bufobj), PRIBIO | PDROP, "getbuf", 0); } else BO_UNLOCK(bp->b_bufobj); rw_wlock(lock); return (NULL); } BUF_UNLOCK(bp); if (waitfor != MNT_WAIT) return (NULL); /* * The lock argument must be bp->b_vp's mutex in * this case. */ #ifdef DEBUG_VFS_LOCKS if (bp->b_vp->v_type != VCHR) ASSERT_BO_WLOCKED(bp->b_bufobj); #endif bp->b_vflags |= BV_BKGRDWAIT; rw_sleep(&bp->b_xflags, lock, PRIBIO, "getbuf", 0); return (NULL); } if ((bp->b_flags & B_DELWRI) == 0) { BUF_UNLOCK(bp); return (NULL); } bremfree(bp); return (bp); } /* * Check if it is safe to suspend the file system now. On entry, * the vnode interlock for devvp should be held. Return 0 with * the mount interlock held if the file system can be suspended now, * otherwise return EAGAIN with the mount interlock held. */ int softdep_check_suspend(struct mount *mp, struct vnode *devvp, int softdep_depcnt, int softdep_accdepcnt, int secondary_writes, int secondary_accwrites) { struct bufobj *bo; struct ufsmount *ump; struct inodedep *inodedep; int error, unlinked; bo = &devvp->v_bufobj; ASSERT_BO_WLOCKED(bo); /* * If we are not running with soft updates, then we need only * deal with secondary writes as we try to suspend. */ if (MOUNTEDSOFTDEP(mp) == 0) { MNT_ILOCK(mp); while (mp->mnt_secondary_writes != 0) { BO_UNLOCK(bo); msleep(&mp->mnt_secondary_writes, MNT_MTX(mp), (PUSER - 1) | PDROP, "secwr", 0); BO_LOCK(bo); MNT_ILOCK(mp); } /* * Reasons for needing more work before suspend: * - Dirty buffers on devvp. * - Secondary writes occurred after start of vnode sync loop */ error = 0; if (bo->bo_numoutput > 0 || bo->bo_dirty.bv_cnt > 0 || secondary_writes != 0 || mp->mnt_secondary_writes != 0 || secondary_accwrites != mp->mnt_secondary_accwrites) error = EAGAIN; BO_UNLOCK(bo); return (error); } /* * If we are running with soft updates, then we need to coordinate * with them as we try to suspend. */ ump = VFSTOUFS(mp); for (;;) { if (!TRY_ACQUIRE_LOCK(ump)) { BO_UNLOCK(bo); ACQUIRE_LOCK(ump); FREE_LOCK(ump); BO_LOCK(bo); continue; } MNT_ILOCK(mp); if (mp->mnt_secondary_writes != 0) { FREE_LOCK(ump); BO_UNLOCK(bo); msleep(&mp->mnt_secondary_writes, MNT_MTX(mp), (PUSER - 1) | PDROP, "secwr", 0); BO_LOCK(bo); continue; } break; } unlinked = 0; if (MOUNTEDSUJ(mp)) { for (inodedep = TAILQ_FIRST(&ump->softdep_unlinked); inodedep != NULL; inodedep = TAILQ_NEXT(inodedep, id_unlinked)) { if ((inodedep->id_state & (UNLINKED | UNLINKLINKS | UNLINKONLIST)) != (UNLINKED | UNLINKLINKS | UNLINKONLIST) || !check_inodedep_free(inodedep)) continue; unlinked++; } } /* * Reasons for needing more work before suspend: * - Dirty buffers on devvp. 
* - Softdep activity occurred after start of vnode sync loop * - Secondary writes occurred after start of vnode sync loop */ error = 0; if (bo->bo_numoutput > 0 || bo->bo_dirty.bv_cnt > 0 || softdep_depcnt != unlinked || ump->softdep_deps != unlinked || softdep_accdepcnt != ump->softdep_accdeps || secondary_writes != 0 || mp->mnt_secondary_writes != 0 || secondary_accwrites != mp->mnt_secondary_accwrites) error = EAGAIN; FREE_LOCK(ump); BO_UNLOCK(bo); return (error); } /* * Get the number of dependency structures for the file system, both * the current number and the total number allocated. These will * later be used to detect that softdep processing has occurred. */ void softdep_get_depcounts(struct mount *mp, int *softdep_depsp, int *softdep_accdepsp) { struct ufsmount *ump; if (MOUNTEDSOFTDEP(mp) == 0) { *softdep_depsp = 0; *softdep_accdepsp = 0; return; } ump = VFSTOUFS(mp); ACQUIRE_LOCK(ump); *softdep_depsp = ump->softdep_deps; *softdep_accdepsp = ump->softdep_accdeps; FREE_LOCK(ump); } /* * Wait for pending output on a vnode to complete. * Must be called with vnode lock and interlock locked. * * XXX: Should just be a call to bufobj_wwait(). */ static void drain_output(vp) struct vnode *vp; { struct bufobj *bo; bo = &vp->v_bufobj; ASSERT_VOP_LOCKED(vp, "drain_output"); ASSERT_BO_WLOCKED(bo); while (bo->bo_numoutput) { bo->bo_flag |= BO_WWAIT; msleep((caddr_t)&bo->bo_numoutput, BO_LOCKPTR(bo), PRIBIO + 1, "drainvp", 0); } } /* * Called whenever a buffer that is being invalidated or reallocated * contains dependencies. This should only happen if an I/O error has * occurred. The routine is called with the buffer locked. */ static void softdep_deallocate_dependencies(bp) struct buf *bp; { if ((bp->b_ioflags & BIO_ERROR) == 0) panic("softdep_deallocate_dependencies: dangling deps"); if (bp->b_vp != NULL && bp->b_vp->v_mount != NULL) softdep_error(bp->b_vp->v_mount->mnt_stat.f_mntonname, bp->b_error); else printf("softdep_deallocate_dependencies: " "got error %d while accessing filesystem\n", bp->b_error); if (bp->b_error != ENXIO) panic("softdep_deallocate_dependencies: unrecovered I/O error"); } /* * Function to handle asynchronous write errors in the filesystem. */ static void softdep_error(func, error) char *func; int error; { /* XXX should do something better! 
*/ printf("%s: got error %d while accessing filesystem\n", func, error); } #ifdef DDB static void inodedep_print(struct inodedep *inodedep, int verbose) { db_printf("%p fs %p st %x ino %jd inoblk %jd delta %jd nlink %jd" " saveino %p\n", inodedep, inodedep->id_fs, inodedep->id_state, (intmax_t)inodedep->id_ino, (intmax_t)fsbtodb(inodedep->id_fs, ino_to_fsba(inodedep->id_fs, inodedep->id_ino)), (intmax_t)inodedep->id_nlinkdelta, (intmax_t)inodedep->id_savednlink, inodedep->id_savedino1); if (verbose == 0) return; db_printf("\tpendinghd %p, bufwait %p, inowait %p, inoreflst %p, " "mkdiradd %p\n", LIST_FIRST(&inodedep->id_pendinghd), LIST_FIRST(&inodedep->id_bufwait), LIST_FIRST(&inodedep->id_inowait), TAILQ_FIRST(&inodedep->id_inoreflst), inodedep->id_mkdiradd); db_printf("\tinoupdt %p, newinoupdt %p, extupdt %p, newextupdt %p\n", TAILQ_FIRST(&inodedep->id_inoupdt), TAILQ_FIRST(&inodedep->id_newinoupdt), TAILQ_FIRST(&inodedep->id_extupdt), TAILQ_FIRST(&inodedep->id_newextupdt)); } DB_SHOW_COMMAND(inodedep, db_show_inodedep) { if (have_addr == 0) { db_printf("Address required\n"); return; } inodedep_print((struct inodedep*)addr, 1); } DB_SHOW_COMMAND(inodedeps, db_show_inodedeps) { struct inodedep_hashhead *inodedephd; struct inodedep *inodedep; struct ufsmount *ump; int cnt; if (have_addr == 0) { db_printf("Address required\n"); return; } ump = (struct ufsmount *)addr; for (cnt = 0; cnt < ump->inodedep_hash_size; cnt++) { inodedephd = &ump->inodedep_hashtbl[cnt]; LIST_FOREACH(inodedep, inodedephd, id_hash) { inodedep_print(inodedep, 0); } } } DB_SHOW_COMMAND(worklist, db_show_worklist) { struct worklist *wk; if (have_addr == 0) { db_printf("Address required\n"); return; } wk = (struct worklist *)addr; printf("worklist: %p type %s state 0x%X\n", wk, TYPENAME(wk->wk_type), wk->wk_state); } DB_SHOW_COMMAND(workhead, db_show_workhead) { struct workhead *wkhd; struct worklist *wk; int i; if (have_addr == 0) { db_printf("Address required\n"); return; } wkhd = (struct workhead *)addr; wk = LIST_FIRST(wkhd); for (i = 0; i < 100 && wk != NULL; i++, wk = LIST_NEXT(wk, wk_list)) db_printf("worklist: %p type %s state 0x%X", wk, TYPENAME(wk->wk_type), wk->wk_state); if (i == 100) db_printf("workhead overflow"); printf("\n"); } DB_SHOW_COMMAND(mkdirs, db_show_mkdirs) { struct mkdirlist *mkdirlisthd; struct jaddref *jaddref; struct diradd *diradd; struct mkdir *mkdir; if (have_addr == 0) { db_printf("Address required\n"); return; } mkdirlisthd = (struct mkdirlist *)addr; LIST_FOREACH(mkdir, mkdirlisthd, md_mkdirs) { diradd = mkdir->md_diradd; db_printf("mkdir: %p state 0x%X dap %p state 0x%X", mkdir, mkdir->md_state, diradd, diradd->da_state); if ((jaddref = mkdir->md_jaddref) != NULL) db_printf(" jaddref %p jaddref state 0x%X", jaddref, jaddref->ja_state); db_printf("\n"); } } /* exported to ffs_vfsops.c */ extern void db_print_ffs(struct ufsmount *ump); void db_print_ffs(struct ufsmount *ump) { db_printf("mp %p %s devvp %p fs %p su_wl %d su_deps %d su_req %d\n", ump->um_mountp, ump->um_mountp->mnt_stat.f_mntonname, ump->um_devvp, ump->um_fs, ump->softdep_on_worklist, ump->softdep_deps, ump->softdep_req); } #endif /* DDB */ #endif /* SOFTUPDATES */ Index: projects/runtime-coverage/sys/vm/swap_pager.c =================================================================== --- projects/runtime-coverage/sys/vm/swap_pager.c (revision 322921) +++ projects/runtime-coverage/sys/vm/swap_pager.c (revision 322922) @@ -1,2799 +1,2784 @@ /*- * Copyright (c) 1998 Matthew Dillon, * Copyright (c) 1994 John S. 
Dyson * Copyright (c) 1990 University of Utah. * Copyright (c) 1982, 1986, 1989, 1993 * The Regents of the University of California. All rights reserved. * * This code is derived from software contributed to Berkeley by * the Systems Programming Group of the University of Utah Computer * Science Department. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. All advertising materials mentioning features or use of this software * must display the following acknowledgement: * This product includes software developed by the University of * California, Berkeley and its contributors. * 4. Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * New Swap System * Matthew Dillon * * Radix Bitmap 'blists'. * * - The new swapper uses the new radix bitmap code. This should scale * to arbitrarily small or arbitrarily large swap spaces and an almost * arbitrary degree of fragmentation. * * Features: * * - on the fly reallocation of swap during putpages. The new system * does not try to keep previously allocated swap blocks for dirty * pages. * * - on the fly deallocation of swap * * - No more garbage collection required. Unnecessarily allocated swap * blocks only exist for dirty vm_page_t's now and these are already * cycled (in a high-load system) by the pager. We also do on-the-fly * removal of invalidated swap blocks when a page is destroyed * or renamed. * * from: Utah $Hdr: swap_pager.c 1.4 91/04/30$ * * @(#)swap_pager.c 8.9 (Berkeley) 3/21/94 * @(#)vm_swap.c 8.5 (Berkeley) 2/17/94 */ #include __FBSDID("$FreeBSD$"); #include "opt_compat.h" #include "opt_swap.h" #include "opt_vm.h" #include #include #include #include #include #include #include #include #include #include #include #include #include #include +#include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include /* * MAX_PAGEOUT_CLUSTER must be a power of 2 between 1 and 64. * The 64-page limit is due to the radix code (kern/subr_blist.c). 
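 */

/*
 * Editor's note (illustrative only, not part of this change): a
 * constraint like "a power of 2 between 1 and 64" can be enforced at
 * compile time rather than by comment alone, e.g. with the kernel's
 * CTASSERT() or, in standard C11, _Static_assert as sketched below.
 * MY_CLUSTER is a hypothetical stand-in for MAX_PAGEOUT_CLUSTER, and
 * the #if 0 guard keeps the sketch out of any build.
 */
#if 0
#define	MY_CLUSTER	32

_Static_assert(MY_CLUSTER >= 1 && MY_CLUSTER <= 64 &&
    (MY_CLUSTER & (MY_CLUSTER - 1)) == 0,
    "cluster size must be a power of 2 between 1 and 64");
#endif

/*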
*/ #ifndef MAX_PAGEOUT_CLUSTER #define MAX_PAGEOUT_CLUSTER 32 #endif #if !defined(SWB_NPAGES) #define SWB_NPAGES MAX_PAGEOUT_CLUSTER #endif +#define SWAP_META_PAGES PCTRIE_COUNT + /* - * The swblock structure maps an object and a small, fixed-size range - * of page indices to disk addresses within a swap area. - * The collection of these mappings is implemented as a hash table. - * Unused disk addresses within a swap area are allocated and managed - * using a blist. + * A swblk structure maps each page index within a + * SWAP_META_PAGES-aligned and sized range to the address of an + * on-disk swap block (or SWAPBLK_NONE). The collection of these + * mappings for an entire vm object is implemented as a pc-trie. */ -#define SWAP_META_PAGES 32 -#define SWAP_META_MASK (SWAP_META_PAGES - 1) - -struct swblock { - struct swblock *swb_hnext; - vm_object_t swb_object; - vm_pindex_t swb_index; - int swb_count; - daddr_t swb_pages[SWAP_META_PAGES]; +struct swblk { + vm_pindex_t p; + daddr_t d[SWAP_META_PAGES]; }; static MALLOC_DEFINE(M_VMPGDATA, "vm_pgdata", "swap pager private data"); static struct mtx sw_dev_mtx; static TAILQ_HEAD(, swdevt) swtailq = TAILQ_HEAD_INITIALIZER(swtailq); static struct swdevt *swdevhd; /* Allocate from here next */ static int nswapdev; /* Number of swap devices */ int swap_pager_avail; static struct sx swdev_syscall_lock; /* serialize swap(on|off) */ static vm_ooffset_t swap_total; SYSCTL_QUAD(_vm, OID_AUTO, swap_total, CTLFLAG_RD, &swap_total, 0, "Total amount of available swap storage."); static vm_ooffset_t swap_reserved; SYSCTL_QUAD(_vm, OID_AUTO, swap_reserved, CTLFLAG_RD, &swap_reserved, 0, "Amount of swap storage needed to back all allocated anonymous memory."); static int overcommit = 0; SYSCTL_INT(_vm, OID_AUTO, overcommit, CTLFLAG_RW, &overcommit, 0, "Configure virtual memory overcommit behavior. 
See tuning(7) " "for details."); static unsigned long swzone; SYSCTL_ULONG(_vm, OID_AUTO, swzone, CTLFLAG_RD, &swzone, 0, "Actual size of swap metadata zone"); static unsigned long swap_maxpages; SYSCTL_ULONG(_vm, OID_AUTO, swap_maxpages, CTLFLAG_RD, &swap_maxpages, 0, "Maximum amount of swap supported"); /* bits from overcommit */ #define SWAP_RESERVE_FORCE_ON (1 << 0) #define SWAP_RESERVE_RLIMIT_ON (1 << 1) #define SWAP_RESERVE_ALLOW_NONWIRED (1 << 2) int swap_reserve(vm_ooffset_t incr) { return (swap_reserve_by_cred(incr, curthread->td_ucred)); } int swap_reserve_by_cred(vm_ooffset_t incr, struct ucred *cred) { vm_ooffset_t r, s; int res, error; static int curfail; static struct timeval lastfail; struct uidinfo *uip; uip = cred->cr_ruidinfo; if (incr & PAGE_MASK) panic("swap_reserve: & PAGE_MASK"); #ifdef RACCT if (racct_enable) { PROC_LOCK(curproc); error = racct_add(curproc, RACCT_SWAP, incr); PROC_UNLOCK(curproc); if (error != 0) return (0); } #endif res = 0; mtx_lock(&sw_dev_mtx); r = swap_reserved + incr; if (overcommit & SWAP_RESERVE_ALLOW_NONWIRED) { s = vm_cnt.v_page_count - vm_cnt.v_free_reserved - vm_cnt.v_wire_count; s *= PAGE_SIZE; } else s = 0; s += swap_total; if ((overcommit & SWAP_RESERVE_FORCE_ON) == 0 || r <= s || (error = priv_check(curthread, PRIV_VM_SWAP_NOQUOTA)) == 0) { res = 1; swap_reserved = r; } mtx_unlock(&sw_dev_mtx); if (res) { UIDINFO_VMSIZE_LOCK(uip); if ((overcommit & SWAP_RESERVE_RLIMIT_ON) != 0 && uip->ui_vmsize + incr > lim_cur(curthread, RLIMIT_SWAP) && priv_check(curthread, PRIV_VM_SWAP_NORLIMIT)) res = 0; else uip->ui_vmsize += incr; UIDINFO_VMSIZE_UNLOCK(uip); if (!res) { mtx_lock(&sw_dev_mtx); swap_reserved -= incr; mtx_unlock(&sw_dev_mtx); } } if (!res && ppsratecheck(&lastfail, &curfail, 1)) { printf("uid %d, pid %d: swap reservation for %jd bytes failed\n", uip->ui_uid, curproc->p_pid, incr); } #ifdef RACCT if (!res) { PROC_LOCK(curproc); racct_sub(curproc, RACCT_SWAP, incr); PROC_UNLOCK(curproc); } #endif return (res); } void swap_reserve_force(vm_ooffset_t incr) { struct uidinfo *uip; mtx_lock(&sw_dev_mtx); swap_reserved += incr; mtx_unlock(&sw_dev_mtx); #ifdef RACCT PROC_LOCK(curproc); racct_add_force(curproc, RACCT_SWAP, incr); PROC_UNLOCK(curproc); #endif uip = curthread->td_ucred->cr_ruidinfo; PROC_LOCK(curproc); UIDINFO_VMSIZE_LOCK(uip); uip->ui_vmsize += incr; UIDINFO_VMSIZE_UNLOCK(uip); PROC_UNLOCK(curproc); } void swap_release(vm_ooffset_t decr) { struct ucred *cred; PROC_LOCK(curproc); cred = curthread->td_ucred; swap_release_by_cred(decr, cred); PROC_UNLOCK(curproc); } void swap_release_by_cred(vm_ooffset_t decr, struct ucred *cred) { struct uidinfo *uip; uip = cred->cr_ruidinfo; if (decr & PAGE_MASK) panic("swap_release: & PAGE_MASK"); mtx_lock(&sw_dev_mtx); if (swap_reserved < decr) panic("swap_reserved < decr"); swap_reserved -= decr; mtx_unlock(&sw_dev_mtx); UIDINFO_VMSIZE_LOCK(uip); if (uip->ui_vmsize < decr) printf("negative vmsize for uid = %d\n", uip->ui_uid); uip->ui_vmsize -= decr; UIDINFO_VMSIZE_UNLOCK(uip); racct_sub_cred(cred, RACCT_SWAP, decr); } #define SWM_FREE 0x02 /* free, period */ #define SWM_POP 0x04 /* pop out */ int swap_pager_full = 2; /* swap space exhaustion (task killing) */ static int swap_pager_almost_full = 1; /* swap space exhaustion (w/hysteresis)*/ static int nsw_rcount; /* free read buffers */ static int nsw_wcount_sync; /* limit write buffers / synchronous */ static int nsw_wcount_async; /* limit write buffers / asynchronous */ static int nsw_wcount_async_max;/* assigned maximum */ static int 
nsw_cluster_max; /* maximum VOP I/O allowed */ static int sysctl_swap_async_max(SYSCTL_HANDLER_ARGS); SYSCTL_PROC(_vm, OID_AUTO, swap_async_max, CTLTYPE_INT | CTLFLAG_RW | CTLFLAG_MPSAFE, NULL, 0, sysctl_swap_async_max, "I", "Maximum running async swap ops"); -static struct swblock **swhash; -static int swhash_mask; -static struct mtx swhash_mtx; - static struct sx sw_alloc_sx; /* * "named" and "unnamed" anon region objects. Try to reduce the overhead * of searching a named list by hashing it just a little. */ #define NOBJLISTS 8 #define NOBJLIST(handle) \ (&swap_pager_object_list[((int)(intptr_t)handle >> 4) & (NOBJLISTS-1)]) static struct pagerlst swap_pager_object_list[NOBJLISTS]; -static uma_zone_t swap_zone; +static uma_zone_t swblk_zone; +static uma_zone_t swpctrie_zone; /* * pagerops for OBJT_SWAP - "swap pager". Some ops are also global procedure * calls hooked from other parts of the VM system and do not appear here. * (see vm/swap_pager.h). */ static vm_object_t swap_pager_alloc(void *handle, vm_ooffset_t size, vm_prot_t prot, vm_ooffset_t offset, struct ucred *); static void swap_pager_dealloc(vm_object_t object); static int swap_pager_getpages(vm_object_t, vm_page_t *, int, int *, int *); static int swap_pager_getpages_async(vm_object_t, vm_page_t *, int, int *, int *, pgo_getpages_iodone_t, void *); static void swap_pager_putpages(vm_object_t, vm_page_t *, int, boolean_t, int *); static boolean_t swap_pager_haspage(vm_object_t object, vm_pindex_t pindex, int *before, int *after); static void swap_pager_init(void); static void swap_pager_unswapped(vm_page_t); static void swap_pager_swapoff(struct swdevt *sp); struct pagerops swappagerops = { .pgo_init = swap_pager_init, /* early system initialization of pager */ .pgo_alloc = swap_pager_alloc, /* allocate an OBJT_SWAP object */ .pgo_dealloc = swap_pager_dealloc, /* deallocate an OBJT_SWAP object */ .pgo_getpages = swap_pager_getpages, /* pagein */ .pgo_getpages_async = swap_pager_getpages_async, /* pagein (async) */ .pgo_putpages = swap_pager_putpages, /* pageout */ .pgo_haspage = swap_pager_haspage, /* get backing store status for page */ .pgo_pageunswapped = swap_pager_unswapped, /* remove swap related to page */ }; /* * swap_*() routines are externally accessible. swp_*() routines are * internal. */ static int nswap_lowat = 128; /* in pages, swap_pager_almost_full warn */ static int nswap_hiwat = 512; /* in pages, swap_pager_almost_full warn */ SYSCTL_INT(_vm, OID_AUTO, dmmax, CTLFLAG_RD, &nsw_cluster_max, 0, "Maximum size of a swap block in pages"); static void swp_sizecheck(void); static void swp_pager_async_iodone(struct buf *bp); static int swapongeom(struct vnode *); static int swaponvp(struct thread *, struct vnode *, u_long); static int swapoff_one(struct swdevt *sp, struct ucred *cred); /* * Swap bitmap functions */ static void swp_pager_freeswapspace(daddr_t blk, int npages); static daddr_t swp_pager_getswapspace(int npages); /* * Metadata functions */ -static struct swblock **swp_pager_hash(vm_object_t object, vm_pindex_t index); static void swp_pager_meta_build(vm_object_t, vm_pindex_t, daddr_t); static void swp_pager_meta_free(vm_object_t, vm_pindex_t, vm_pindex_t); static void swp_pager_meta_free_all(vm_object_t); static daddr_t swp_pager_meta_ctl(vm_object_t, vm_pindex_t, int); +static void * +swblk_trie_alloc(struct pctrie *ptree) +{ + + return (uma_zalloc(swpctrie_zone, M_NOWAIT | (curproc == pageproc ? 
+ M_USE_RESERVE : 0))); +} + +static void +swblk_trie_free(struct pctrie *ptree, void *node) +{ + + uma_zfree(swpctrie_zone, node); +} + +PCTRIE_DEFINE(SWAP, swblk, p, swblk_trie_alloc, swblk_trie_free); + /* * SWP_SIZECHECK() - update swap_pager_full indication * * update the swap_pager_almost_full indication and warn when we are * about to run out of swap space, using lowat/hiwat hysteresis. * * Clear swap_pager_full ( task killing ) indication when lowat is met. * * No restrictions on call * This routine may not block. */ static void swp_sizecheck(void) { if (swap_pager_avail < nswap_lowat) { if (swap_pager_almost_full == 0) { printf("swap_pager: out of swap space\n"); swap_pager_almost_full = 1; } } else { swap_pager_full = 0; if (swap_pager_avail > nswap_hiwat) swap_pager_almost_full = 0; } } /* - * SWP_PAGER_HASH() - hash swap meta data - * - * This is an helper function which hashes the swapblk given - * the object and page index. It returns a pointer to a pointer - * to the object, or a pointer to a NULL pointer if it could not - * find a swapblk. - */ -static struct swblock ** -swp_pager_hash(vm_object_t object, vm_pindex_t index) -{ - struct swblock **pswap; - struct swblock *swap; - - index &= ~(vm_pindex_t)SWAP_META_MASK; - pswap = &swhash[(index ^ (int)(intptr_t)object) & swhash_mask]; - while ((swap = *pswap) != NULL) { - if (swap->swb_object == object && - swap->swb_index == index - ) { - break; - } - pswap = &swap->swb_hnext; - } - return (pswap); -} - -/* * SWAP_PAGER_INIT() - initialize the swap pager! * * Expected to be started from system init. NOTE: This code is run * before much else so be careful what you depend on. Most of the VM * system has yet to be initialized at this point. */ static void swap_pager_init(void) { /* * Initialize object lists */ int i; for (i = 0; i < NOBJLISTS; ++i) TAILQ_INIT(&swap_pager_object_list[i]); mtx_init(&sw_dev_mtx, "swapdev", NULL, MTX_DEF); sx_init(&sw_alloc_sx, "swspsx"); sx_init(&swdev_syscall_lock, "swsysc"); } /* * SWAP_PAGER_SWAP_INIT() - swap pager initialization from pageout process * * Expected to be started from pageout process once, prior to entering * its main loop. */ void swap_pager_swap_init(void) { unsigned long n, n2; /* * Number of in-transit swap bp operations. Don't * exhaust the pbufs completely. Make sure we * initialize workable values (0 will work for hysteresis * but it isn't very efficient). * * The nsw_cluster_max is constrained by the bp->b_pages[] * array (MAXPHYS/PAGE_SIZE) and our locally defined * MAX_PAGEOUT_CLUSTER. Also be aware that swap ops are * constrained by the swap device interleave stripe size. * * Currently we hardwire nsw_wcount_async to 4. This limit is * designed to prevent other I/O from having high latencies due to * our pageout I/O. The value 4 works well for one or two active swap * devices but is probably a little low if you have more. Even so, * a higher value would probably generate only a limited improvement * with three or four active swap devices since the system does not * typically have to pageout at extreme bandwidths. We will want * at least 2 per swap devices, and 4 is a pretty good value if you * have one NFS swap device due to the command/ack latency over NFS. * So it all works out pretty well. */ nsw_cluster_max = min((MAXPHYS/PAGE_SIZE), MAX_PAGEOUT_CLUSTER); mtx_lock(&pbuf_mtx); nsw_rcount = (nswbuf + 1) / 2; nsw_wcount_sync = (nswbuf + 3) / 4; nsw_wcount_async = 4; nsw_wcount_async_max = nsw_wcount_async; mtx_unlock(&pbuf_mtx); /* - * Initialize our zone. 
Right now I'm just guessing on the number - * we need based on the number of pages in the system. Each swblock - * can hold 32 pages, so this is probably overkill. This reservation - * is typically limited to around 32MB by default. + * Initialize our zone, guessing on the number we need based + * on the number of pages in the system. */ n = vm_cnt.v_page_count / 2; - if (maxswzone && n > maxswzone / sizeof(struct swblock)) - n = maxswzone / sizeof(struct swblock); + if (maxswzone && n > maxswzone / sizeof(struct swblk)) + n = maxswzone / sizeof(struct swblk); + swpctrie_zone = uma_zcreate("swpctrie", pctrie_node_size(), NULL, NULL, + pctrie_zone_init, NULL, UMA_ALIGN_PTR, + UMA_ZONE_NOFREE | UMA_ZONE_VM); + if (swpctrie_zone == NULL) + panic("failed to create swap pctrie zone."); + swblk_zone = uma_zcreate("swblk", sizeof(struct swblk), NULL, NULL, + NULL, NULL, _Alignof(struct swblk) - 1, + UMA_ZONE_NOFREE | UMA_ZONE_VM); + if (swblk_zone == NULL) + panic("failed to create swap blk zone."); n2 = n; - swap_zone = uma_zcreate("SWAPMETA", sizeof(struct swblock), NULL, NULL, - NULL, NULL, UMA_ALIGN_PTR, UMA_ZONE_NOFREE | UMA_ZONE_VM); - if (swap_zone == NULL) - panic("failed to create swap_zone."); do { - if (uma_zone_reserve_kva(swap_zone, n)) + if (uma_zone_reserve_kva(swblk_zone, n)) break; /* * if the allocation failed, try a zone two thirds the * size of the previous attempt. */ n -= ((n + 2) / 3); } while (n > 0); if (n2 != n) - printf("Swap zone entries reduced from %lu to %lu.\n", n2, n); + printf("Swap blk zone entries reduced from %lu to %lu.\n", + n2, n); swap_maxpages = n * SWAP_META_PAGES; - swzone = n * sizeof(struct swblock); - n2 = n; - - /* - * Initialize our meta-data hash table. The swapper does not need to - * be quite as efficient as the VM system, so we do not use an - * oversized hash table. - * - * n: size of hash table, must be power of 2 - * swhash_mask: hash table index mask - */ - for (n = 1; n < n2 / 8; n *= 2) - ; - swhash = malloc(sizeof(struct swblock *) * n, M_VMPGDATA, M_WAITOK | M_ZERO); - swhash_mask = n - 1; - mtx_init(&swhash_mtx, "swap_pager swhash", NULL, MTX_DEF); + swzone = n * sizeof(struct swblk); + if (!uma_zone_reserve_kva(swpctrie_zone, n)) + printf("Cannot reserve swap pctrie zone, " + "reduce kern.maxswzone.\n"); } static vm_object_t swap_pager_alloc_init(void *handle, struct ucred *cred, vm_ooffset_t size, vm_ooffset_t offset) { vm_object_t object; if (cred != NULL) { if (!swap_reserve_by_cred(size, cred)) return (NULL); crhold(cred); } + + /* + * The un_pager.swp.swp_blks trie is initialized by + * vm_object_allocate() to ensure the correct order of + * visibility to other threads. + */ object = vm_object_allocate(OBJT_SWAP, OFF_TO_IDX(offset + PAGE_MASK + size)); + object->handle = handle; if (cred != NULL) { object->cred = cred; object->charge = size; } - object->un_pager.swp.swp_bcount = 0; return (object); } /* * SWAP_PAGER_ALLOC() - allocate a new OBJT_SWAP VM object and instantiate * its metadata structures. * * This routine is called from the mmap and fork code to create a new * OBJT_SWAP object. * * This routine must ensure that no live duplicate is created for * the named object request, which is protected against by * holding the sw_alloc_sx lock in case handle != NULL. */ static vm_object_t swap_pager_alloc(void *handle, vm_ooffset_t size, vm_prot_t prot, vm_ooffset_t offset, struct ucred *cred) { vm_object_t object; if (handle != NULL) { /* * Reference existing named region or allocate new one. 
There * should not be a race here against swp_pager_meta_build() * as called from vm_page_remove() in regards to the lookup * of the handle. */ sx_xlock(&sw_alloc_sx); object = vm_pager_object_lookup(NOBJLIST(handle), handle); if (object == NULL) { object = swap_pager_alloc_init(handle, cred, size, offset); if (object != NULL) { TAILQ_INSERT_TAIL(NOBJLIST(object->handle), object, pager_object_list); } } sx_xunlock(&sw_alloc_sx); } else { object = swap_pager_alloc_init(handle, cred, size, offset); } return (object); } /* * SWAP_PAGER_DEALLOC() - remove swap metadata from object * * The swap backing for the object is destroyed. The code is * designed such that we can reinstantiate it later, but this * routine is typically called only when the entire object is * about to be destroyed. * * The object must be locked. */ static void swap_pager_dealloc(vm_object_t object) { VM_OBJECT_ASSERT_WLOCKED(object); KASSERT((object->flags & OBJ_DEAD) != 0, ("dealloc of reachable obj")); /* * Remove from list right away so lookups will fail if we block for * pageout completion. */ if (object->handle != NULL) { VM_OBJECT_WUNLOCK(object); sx_xlock(&sw_alloc_sx); TAILQ_REMOVE(NOBJLIST(object->handle), object, pager_object_list); sx_xunlock(&sw_alloc_sx); VM_OBJECT_WLOCK(object); } vm_object_pip_wait(object, "swpdea"); /* * Free all remaining metadata. We only bother to free it from * the swap meta data. We do not attempt to free swapblk's still * associated with vm_page_t's for this object. We do not care * if paging is still in progress on some objects. */ swp_pager_meta_free_all(object); object->handle = NULL; object->type = OBJT_DEAD; } /************************************************************************ * SWAP PAGER BITMAP ROUTINES * ************************************************************************/ /* * SWP_PAGER_GETSWAPSPACE() - allocate raw swap space * * Allocate swap for the requested number of pages. The starting * swap block number (a page index) is returned or SWAPBLK_NONE * if the allocation failed. * * Also has the side effect of advising that somebody made a mistake * when they configured swap and didn't configure enough. * * This routine may not sleep. * * We allocate in round-robin fashion from the configured devices. 
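 *
 * [Editor's illustration, not part of this change: a minimal,
 *  self-contained sketch of the same round-robin walk.  The toy_*
 *  names are hypothetical stand-ins: toy_head for swdevhd, TOY_NDEV
 *  for nswapdev, and the -1 failure value for SWAPBLK_NONE.]
 */

#define	TOY_NDEV	4
static int	toy_head;			/* analogue of swdevhd */
static long	toy_free[TOY_NDEV];		/* free pages per device */

/*
 * Starting at toy_head and wrapping once around the table, find the
 * first device that can satisfy npages, charge it, and advance the
 * head past it so successive allocations rotate across devices.
 * Returns the device index, or -1 when every device is exhausted.
 */
static int
toy_getswapspace(long npages)
{
	int i, idx;

	for (i = 0; i < TOY_NDEV; i++) {
		idx = (toy_head + i) % TOY_NDEV;
		if (toy_free[idx] >= npages) {
			toy_free[idx] -= npages;
			toy_head = (idx + 1) % TOY_NDEV;
			return (idx);
		}
	}
	return (-1);
}

/*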
*/ static daddr_t swp_pager_getswapspace(int npages) { daddr_t blk; struct swdevt *sp; int i; blk = SWAPBLK_NONE; mtx_lock(&sw_dev_mtx); sp = swdevhd; for (i = 0; i < nswapdev; i++) { if (sp == NULL) sp = TAILQ_FIRST(&swtailq); if (!(sp->sw_flags & SW_CLOSING)) { blk = blist_alloc(sp->sw_blist, npages); if (blk != SWAPBLK_NONE) { blk += sp->sw_first; sp->sw_used += npages; swap_pager_avail -= npages; swp_sizecheck(); swdevhd = TAILQ_NEXT(sp, sw_list); goto done; } } sp = TAILQ_NEXT(sp, sw_list); } if (swap_pager_full != 2) { printf("swap_pager_getswapspace(%d): failed\n", npages); swap_pager_full = 2; swap_pager_almost_full = 1; } swdevhd = NULL; done: mtx_unlock(&sw_dev_mtx); return (blk); } static int swp_pager_isondev(daddr_t blk, struct swdevt *sp) { return (blk >= sp->sw_first && blk < sp->sw_end); } static void swp_pager_strategy(struct buf *bp) { struct swdevt *sp; mtx_lock(&sw_dev_mtx); TAILQ_FOREACH(sp, &swtailq, sw_list) { if (bp->b_blkno >= sp->sw_first && bp->b_blkno < sp->sw_end) { mtx_unlock(&sw_dev_mtx); if ((sp->sw_flags & SW_UNMAPPED) != 0 && unmapped_buf_allowed) { bp->b_data = unmapped_buf; bp->b_offset = 0; } else { pmap_qenter((vm_offset_t)bp->b_data, &bp->b_pages[0], bp->b_bcount / PAGE_SIZE); } sp->sw_strategy(bp, sp); return; } } panic("Swapdev not found"); } /* * SWP_PAGER_FREESWAPSPACE() - free raw swap space * * This routine returns the specified swap blocks back to the bitmap. * * This routine may not sleep. */ static void swp_pager_freeswapspace(daddr_t blk, int npages) { struct swdevt *sp; mtx_lock(&sw_dev_mtx); TAILQ_FOREACH(sp, &swtailq, sw_list) { if (blk >= sp->sw_first && blk < sp->sw_end) { sp->sw_used -= npages; /* * If we are attempting to stop swapping on * this device, we don't want to mark any * blocks free lest they be reused. */ if ((sp->sw_flags & SW_CLOSING) == 0) { blist_free(sp->sw_blist, blk - sp->sw_first, npages); swap_pager_avail += npages; swp_sizecheck(); } mtx_unlock(&sw_dev_mtx); return; } } panic("Swapdev not found"); } /* * SWAP_PAGER_FREESPACE() - frees swap blocks associated with a page * range within an object. * * This is a globally accessible routine. * * This routine removes swapblk assignments from swap metadata. * * The external callers of this routine typically have already destroyed * or renamed vm_page_t's associated with this range in the object so * we should be ok. * * The object must be locked. */ void swap_pager_freespace(vm_object_t object, vm_pindex_t start, vm_size_t size) { swp_pager_meta_free(object, start, size); } /* * SWAP_PAGER_RESERVE() - reserve swap blocks in object * * Assigns swap blocks to the specified range within the object. The * swap blocks are not zeroed. Any previous swap assignment is destroyed. * * Returns 0 on success, -1 on failure. */ int swap_pager_reserve(vm_object_t object, vm_pindex_t start, vm_size_t size) { int n = 0; daddr_t blk = SWAPBLK_NONE; vm_pindex_t beg = start; /* save start index */ VM_OBJECT_WLOCK(object); while (size) { if (n == 0) { n = BLIST_MAX_ALLOC; while ((blk = swp_pager_getswapspace(n)) == SWAPBLK_NONE) { n >>= 1; if (n == 0) { swp_pager_meta_free(object, beg, start - beg); VM_OBJECT_WUNLOCK(object); return (-1); } } } swp_pager_meta_build(object, start, blk); --size; ++start; ++blk; --n; } swp_pager_meta_free(object, start, n); VM_OBJECT_WUNLOCK(object); return (0); } /* * SWAP_PAGER_COPY() - copy blocks from source pager to destination pager * and destroy the source. * * Copy any valid swapblks from the source to the destination. 
In * cases where both the source and destination have a valid swapblk, * we keep the destination's. * * This routine is allowed to sleep. It may sleep allocating metadata * indirectly through swp_pager_meta_build() or if paging is still in * progress on the source. * * The source object contains no vm_page_t's (which is just as well) * * The source object is of type OBJT_SWAP. * * The source and destination objects must be locked. * Both object locks may temporarily be released. */ void swap_pager_copy(vm_object_t srcobject, vm_object_t dstobject, vm_pindex_t offset, int destroysource) { vm_pindex_t i; VM_OBJECT_ASSERT_WLOCKED(srcobject); VM_OBJECT_ASSERT_WLOCKED(dstobject); /* * If destroysource is set, we remove the source object from the * swap_pager internal queue now. */ if (destroysource && srcobject->handle != NULL) { vm_object_pip_add(srcobject, 1); VM_OBJECT_WUNLOCK(srcobject); vm_object_pip_add(dstobject, 1); VM_OBJECT_WUNLOCK(dstobject); sx_xlock(&sw_alloc_sx); TAILQ_REMOVE(NOBJLIST(srcobject->handle), srcobject, pager_object_list); sx_xunlock(&sw_alloc_sx); VM_OBJECT_WLOCK(dstobject); vm_object_pip_wakeup(dstobject); VM_OBJECT_WLOCK(srcobject); vm_object_pip_wakeup(srcobject); } /* * transfer source to destination. */ for (i = 0; i < dstobject->size; ++i) { daddr_t dstaddr; /* * Locate (without changing) the swapblk on the destination, * unless it is invalid in which case free it silently, or * if the destination is a resident page, in which case the * source is thrown away. */ dstaddr = swp_pager_meta_ctl(dstobject, i, 0); if (dstaddr == SWAPBLK_NONE) { /* * Destination has no swapblk and is not resident, * copy source. */ daddr_t srcaddr; srcaddr = swp_pager_meta_ctl( srcobject, i + offset, SWM_POP ); if (srcaddr != SWAPBLK_NONE) { /* * swp_pager_meta_build() can sleep. */ vm_object_pip_add(srcobject, 1); VM_OBJECT_WUNLOCK(srcobject); vm_object_pip_add(dstobject, 1); swp_pager_meta_build(dstobject, i, srcaddr); vm_object_pip_wakeup(dstobject); VM_OBJECT_WLOCK(srcobject); vm_object_pip_wakeup(srcobject); } } else { /* * Destination has valid swapblk or it is represented * by a resident page. We destroy the sourceblock. */ swp_pager_meta_ctl(srcobject, i + offset, SWM_FREE); } } /* * Free left over swap blocks in source. * * We have to revert the type to OBJT_DEFAULT so we do not accidentally * double-remove the object from the swap queues. */ if (destroysource) { swp_pager_meta_free_all(srcobject); /* * Reverting the type is not necessary, the caller is going * to destroy srcobject directly, but I'm doing it here * for consistency since we've removed the object from its * queues. */ srcobject->type = OBJT_DEFAULT; } } /* * SWAP_PAGER_HASPAGE() - determine if we have good backing store for * the requested page. * * We determine whether good backing store exists for the requested * page and return TRUE if it does, FALSE if it doesn't. * * If TRUE, we also try to determine how much valid, contiguous backing * store exists before and after the requested page. */ static boolean_t swap_pager_haspage(vm_object_t object, vm_pindex_t pindex, int *before, int *after) { daddr_t blk, blk0; int i; VM_OBJECT_ASSERT_LOCKED(object); /* * do we have good backing store at the requested index ? 
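 *
 * [Worked example, editor's addition: with SWB_NPAGES == 4, if pindex
 *  10 maps to swap block 100 (blk0 == 100), the scans further down
 *  only extend the run while pindex 9 maps to block 99, pindex 11 to
 *  block 101, and so on; *before and *after therefore report how many
 *  physically contiguous swap blocks surround the requested page.]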
*/ blk0 = swp_pager_meta_ctl(object, pindex, 0); if (blk0 == SWAPBLK_NONE) { if (before) *before = 0; if (after) *after = 0; return (FALSE); } /* * find backwards-looking contiguous good backing store */ if (before != NULL) { for (i = 1; i < SWB_NPAGES; i++) { if (i > pindex) break; blk = swp_pager_meta_ctl(object, pindex - i, 0); if (blk != blk0 - i) break; } *before = i - 1; } /* * find forward-looking contiguous good backing store */ if (after != NULL) { for (i = 1; i < SWB_NPAGES; i++) { blk = swp_pager_meta_ctl(object, pindex + i, 0); if (blk != blk0 + i) break; } *after = i - 1; } return (TRUE); } /* * SWAP_PAGER_PAGE_UNSWAPPED() - remove swap backing store related to page * * This removes any associated swap backing store, whether valid or * not, from the page. * * This routine is typically called when a page is made dirty, at * which point any associated swap can be freed. MADV_FREE also * calls us in a special-case situation * * NOTE!!! If the page is clean and the swap was valid, the caller * should make the page dirty before calling this routine. This routine * does NOT change the m->dirty status of the page. Also: MADV_FREE * depends on it. * * This routine may not sleep. * * The object containing the page must be locked. */ static void swap_pager_unswapped(vm_page_t m) { swp_pager_meta_ctl(m->object, m->pindex, SWM_FREE); } /* * swap_pager_getpages() - bring pages in from swap * * Attempt to page in the pages in array "m" of length "count". The caller * may optionally specify that additional pages preceding and succeeding * the specified range be paged in. The number of such pages is returned * in the "rbehind" and "rahead" parameters, and they will be in the * inactive queue upon return. * * The pages in "m" must be busied and will remain busied upon return. */ static int swap_pager_getpages(vm_object_t object, vm_page_t *m, int count, int *rbehind, int *rahead) { struct buf *bp; vm_page_t mpred, msucc, p; vm_pindex_t pindex; daddr_t blk; int i, j, maxahead, maxbehind, reqcount, shift; reqcount = count; VM_OBJECT_WUNLOCK(object); bp = getpbuf(&nsw_rcount); VM_OBJECT_WLOCK(object); if (!swap_pager_haspage(object, m[0]->pindex, &maxbehind, &maxahead)) { relpbuf(bp, &nsw_rcount); return (VM_PAGER_FAIL); } /* * Clip the readahead and readbehind ranges to exclude resident pages. */ if (rahead != NULL) { KASSERT(reqcount - 1 <= maxahead, ("page count %d extends beyond swap block", reqcount)); *rahead = imin(*rahead, maxahead - (reqcount - 1)); pindex = m[reqcount - 1]->pindex; msucc = TAILQ_NEXT(m[reqcount - 1], listq); if (msucc != NULL && msucc->pindex - pindex - 1 < *rahead) *rahead = msucc->pindex - pindex - 1; } if (rbehind != NULL) { *rbehind = imin(*rbehind, maxbehind); pindex = m[0]->pindex; mpred = TAILQ_PREV(m[0], pglist, listq); if (mpred != NULL && pindex - mpred->pindex - 1 < *rbehind) *rbehind = pindex - mpred->pindex - 1; } /* * Allocate readahead and readbehind pages. */ shift = rbehind != NULL ? *rbehind : 0; if (shift != 0) { for (i = 1; i <= shift; i++) { p = vm_page_alloc(object, m[0]->pindex - i, VM_ALLOC_NORMAL); if (p == NULL) { /* Shift allocated pages to the left. 
*/ for (j = 0; j < i - 1; j++) bp->b_pages[j] = bp->b_pages[j + shift - i + 1]; break; } bp->b_pages[shift - i] = p; } shift = i - 1; *rbehind = shift; } for (i = 0; i < reqcount; i++) bp->b_pages[i + shift] = m[i]; if (rahead != NULL) { for (i = 0; i < *rahead; i++) { p = vm_page_alloc(object, m[reqcount - 1]->pindex + i + 1, VM_ALLOC_NORMAL); if (p == NULL) break; bp->b_pages[shift + reqcount + i] = p; } *rahead = i; } if (rbehind != NULL) count += *rbehind; if (rahead != NULL) count += *rahead; vm_object_pip_add(object, count); for (i = 0; i < count; i++) bp->b_pages[i]->oflags |= VPO_SWAPINPROG; pindex = bp->b_pages[0]->pindex; blk = swp_pager_meta_ctl(object, pindex, 0); KASSERT(blk != SWAPBLK_NONE, ("no swap blocking containing %p(%jx)", object, (uintmax_t)pindex)); VM_OBJECT_WUNLOCK(object); bp->b_flags |= B_PAGING; bp->b_iocmd = BIO_READ; bp->b_iodone = swp_pager_async_iodone; bp->b_rcred = crhold(thread0.td_ucred); bp->b_wcred = crhold(thread0.td_ucred); bp->b_blkno = blk; bp->b_bcount = PAGE_SIZE * count; bp->b_bufsize = PAGE_SIZE * count; bp->b_npages = count; bp->b_pgbefore = rbehind != NULL ? *rbehind : 0; bp->b_pgafter = rahead != NULL ? *rahead : 0; VM_CNT_INC(v_swapin); VM_CNT_ADD(v_swappgsin, count); /* * perform the I/O. NOTE!!! bp cannot be considered valid after * this point because we automatically release it on completion. * Instead, we look at the one page we are interested in which we * still hold a lock on even through the I/O completion. * * The other pages in our m[] array are also released on completion, * so we cannot assume they are valid anymore either. * * NOTE: b_blkno is destroyed by the call to swapdev_strategy */ BUF_KERNPROC(bp); swp_pager_strategy(bp); /* * Wait for the pages we want to complete. VPO_SWAPINPROG is always * cleared on completion. If an I/O error occurs, SWAPBLK_NONE * is set in the metadata for each page in the request. */ VM_OBJECT_WLOCK(object); while ((m[0]->oflags & VPO_SWAPINPROG) != 0) { m[0]->oflags |= VPO_SWAPSLEEP; VM_CNT_INC(v_intrans); if (VM_OBJECT_SLEEP(object, &object->paging_in_progress, PSWP, "swread", hz * 20)) { printf( "swap_pager: indefinite wait buffer: bufobj: %p, blkno: %jd, size: %ld\n", bp->b_bufobj, (intmax_t)bp->b_blkno, bp->b_bcount); } } /* * If we had an unrecoverable read error pages will not be valid. */ for (i = 0; i < reqcount; i++) if (m[i]->valid != VM_PAGE_BITS_ALL) return (VM_PAGER_ERROR); return (VM_PAGER_OK); /* * A final note: in a low swap situation, we cannot deallocate swap * and mark a page dirty here because the caller is likely to mark * the page clean when we return, causing the page to possibly revert * to all-zero's later. */ } /* * swap_pager_getpages_async(): * * Right now this is emulation of asynchronous operation on top of * swap_pager_getpages(). */ static int swap_pager_getpages_async(vm_object_t object, vm_page_t *m, int count, int *rbehind, int *rahead, pgo_getpages_iodone_t iodone, void *arg) { int r, error; r = swap_pager_getpages(object, m, count, rbehind, rahead); VM_OBJECT_WUNLOCK(object); switch (r) { case VM_PAGER_OK: error = 0; break; case VM_PAGER_ERROR: error = EIO; break; case VM_PAGER_FAIL: error = EINVAL; break; default: panic("unhandled swap_pager_getpages() error %d", r); } (iodone)(arg, m, count, error); VM_OBJECT_WLOCK(object); return (r); } /* * swap_pager_putpages: * * Assign swap (if necessary) and initiate I/O on the specified pages. * * We support both OBJT_DEFAULT and OBJT_SWAP objects. DEFAULT objects * are automatically converted to SWAP objects. 
* * In a low memory situation we may block in VOP_STRATEGY(), but the new * vm_page reservation system coupled with properly written VFS devices * should ensure that no low-memory deadlock occurs. This is an area * which needs work. * * The parent has N vm_object_pip_add() references prior to * calling us and will remove references for rtvals[] that are * not set to VM_PAGER_PEND. We need to remove the rest on I/O * completion. * * The parent has soft-busy'd the pages it passes us and will unbusy * those whos rtvals[] entry is not set to VM_PAGER_PEND on return. * We need to unbusy the rest on I/O completion. */ static void swap_pager_putpages(vm_object_t object, vm_page_t *m, int count, int flags, int *rtvals) { int i, n; boolean_t sync; if (count && m[0]->object != object) { panic("swap_pager_putpages: object mismatch %p/%p", object, m[0]->object ); } /* * Step 1 * * Turn object into OBJT_SWAP * check for bogus sysops * force sync if not pageout process */ if (object->type != OBJT_SWAP) swp_pager_meta_build(object, 0, SWAPBLK_NONE); VM_OBJECT_WUNLOCK(object); n = 0; if (curproc != pageproc) sync = TRUE; else sync = (flags & VM_PAGER_PUT_SYNC) != 0; /* * Step 2 * * Assign swap blocks and issue I/O. We reallocate swap on the fly. * The page is left dirty until the pageout operation completes * successfully. */ for (i = 0; i < count; i += n) { int j; struct buf *bp; daddr_t blk; /* * Maximum I/O size is limited by a number of factors. */ n = min(BLIST_MAX_ALLOC, count - i); n = min(n, nsw_cluster_max); /* * Get biggest block of swap we can. If we fail, fall * back and try to allocate a smaller block. Don't go * overboard trying to allocate space if it would overly * fragment swap. */ while ( (blk = swp_pager_getswapspace(n)) == SWAPBLK_NONE && n > 4 ) { n >>= 1; } if (blk == SWAPBLK_NONE) { for (j = 0; j < n; ++j) rtvals[i+j] = VM_PAGER_FAIL; continue; } /* * All I/O parameters have been satisfied, build the I/O * request and assign the swap space. */ if (sync == TRUE) { bp = getpbuf(&nsw_wcount_sync); } else { bp = getpbuf(&nsw_wcount_async); bp->b_flags = B_ASYNC; } bp->b_flags |= B_PAGING; bp->b_iocmd = BIO_WRITE; bp->b_rcred = crhold(thread0.td_ucred); bp->b_wcred = crhold(thread0.td_ucred); bp->b_bcount = PAGE_SIZE * n; bp->b_bufsize = PAGE_SIZE * n; bp->b_blkno = blk; VM_OBJECT_WLOCK(object); for (j = 0; j < n; ++j) { vm_page_t mreq = m[i+j]; swp_pager_meta_build( mreq->object, mreq->pindex, blk + j ); MPASS(mreq->dirty == VM_PAGE_BITS_ALL); mreq->oflags |= VPO_SWAPINPROG; bp->b_pages[j] = mreq; } VM_OBJECT_WUNLOCK(object); bp->b_npages = n; /* * Must set dirty range for NFS to work. */ bp->b_dirtyoff = 0; bp->b_dirtyend = bp->b_bcount; VM_CNT_INC(v_swapout); VM_CNT_ADD(v_swappgsout, bp->b_npages); /* * We unconditionally set rtvals[] to VM_PAGER_PEND so that we * can call the async completion routine at the end of a * synchronous I/O operation. Otherwise, our caller would * perform duplicate unbusy and wakeup operations on the page * and object, respectively. */ for (j = 0; j < n; j++) rtvals[i + j] = VM_PAGER_PEND; /* * asynchronous * * NOTE: b_blkno is destroyed by the call to swapdev_strategy */ if (sync == FALSE) { bp->b_iodone = swp_pager_async_iodone; BUF_KERNPROC(bp); swp_pager_strategy(bp); continue; } /* * synchronous * * NOTE: b_blkno is destroyed by the call to swapdev_strategy */ bp->b_iodone = bdone; swp_pager_strategy(bp); /* * Wait for the sync I/O to complete. 
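 *
 * [Editor's note: the two completion paths above reduce to
 *	async:	bp->b_iodone = swp_pager_async_iodone;
 *		BUF_KERNPROC(bp); swp_pager_strategy(bp);
 *		(the driver calls the cleanup routine from biodone())
 *	sync:	bp->b_iodone = bdone; swp_pager_strategy(bp);
 *		bwait(bp, PVM, "swwrt");
 *		(the cleanup routine is then invoked by hand below)
 *  so both paths funnel into swp_pager_async_iodone() exactly once.]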
*/ bwait(bp, PVM, "swwrt"); /* * Now that we are through with the bp, we can call the * normal async completion, which frees everything up. */ swp_pager_async_iodone(bp); } VM_OBJECT_WLOCK(object); } /* * swp_pager_async_iodone: * * Completion routine for asynchronous reads and writes from/to swap. * Also called manually by synchronous code to finish up a bp. * * This routine may not sleep. */ static void swp_pager_async_iodone(struct buf *bp) { int i; vm_object_t object = NULL; /* * report error */ if (bp->b_ioflags & BIO_ERROR) { printf( "swap_pager: I/O error - %s failed; blkno %ld," "size %ld, error %d\n", ((bp->b_iocmd == BIO_READ) ? "pagein" : "pageout"), (long)bp->b_blkno, (long)bp->b_bcount, bp->b_error ); } /* * remove the mapping for kernel virtual */ if (buf_mapped(bp)) pmap_qremove((vm_offset_t)bp->b_data, bp->b_npages); else bp->b_data = bp->b_kvabase; if (bp->b_npages) { object = bp->b_pages[0]->object; VM_OBJECT_WLOCK(object); } /* * cleanup pages. If an error occurs writing to swap, we are in * very serious trouble. If it happens to be a disk error, though, * we may be able to recover by reassigning the swap later on. So * in this case we remove the m->swapblk assignment for the page * but do not free it in the rlist. The errornous block(s) are thus * never reallocated as swap. Redirty the page and continue. */ for (i = 0; i < bp->b_npages; ++i) { vm_page_t m = bp->b_pages[i]; m->oflags &= ~VPO_SWAPINPROG; if (m->oflags & VPO_SWAPSLEEP) { m->oflags &= ~VPO_SWAPSLEEP; wakeup(&object->paging_in_progress); } if (bp->b_ioflags & BIO_ERROR) { /* * If an error occurs I'd love to throw the swapblk * away without freeing it back to swapspace, so it * can never be used again. But I can't from an * interrupt. */ if (bp->b_iocmd == BIO_READ) { /* * NOTE: for reads, m->dirty will probably * be overridden by the original caller of * getpages so don't play cute tricks here. */ m->valid = 0; } else { /* * If a write error occurs, reactivate page * so it doesn't clog the inactive list, * then finish the I/O. */ vm_page_dirty(m); vm_page_lock(m); vm_page_activate(m); vm_page_unlock(m); vm_page_sunbusy(m); } } else if (bp->b_iocmd == BIO_READ) { /* * NOTE: for reads, m->dirty will probably be * overridden by the original caller of getpages so * we cannot set them in order to free the underlying * swap in a low-swap situation. I don't think we'd * want to do that anyway, but it was an optimization * that existed in the old swapper for a time before * it got ripped out due to precisely this problem. */ KASSERT(!pmap_page_is_mapped(m), ("swp_pager_async_iodone: page %p is mapped", m)); KASSERT(m->dirty == 0, ("swp_pager_async_iodone: page %p is dirty", m)); m->valid = VM_PAGE_BITS_ALL; if (i < bp->b_pgbefore || i >= bp->b_npages - bp->b_pgafter) vm_page_readahead_finish(m); } else { /* * For write success, clear the dirty * status, then finish the I/O ( which decrements the * busy count and possibly wakes waiter's up ). * A page is only written to swap after a period of * inactivity. Therefore, we do not expect it to be * reused. */ KASSERT(!pmap_page_is_write_mapped(m), ("swp_pager_async_iodone: page %p is not write" " protected", m)); vm_page_undirty(m); vm_page_lock(m); vm_page_deactivate_noreuse(m); vm_page_unlock(m); vm_page_sunbusy(m); } } /* * adjust pip. NOTE: the original parent may still have its own * pip refs on the object. 
*/ if (object != NULL) { vm_object_pip_wakeupn(object, bp->b_npages); VM_OBJECT_WUNLOCK(object); } /* * swapdev_strategy() manually sets b_vp and b_bufobj before calling * bstrategy(). Set them back to NULL now we're done with it, or we'll * trigger a KASSERT in relpbuf(). */ if (bp->b_vp) { bp->b_vp = NULL; bp->b_bufobj = NULL; } /* * release the physical I/O buffer */ relpbuf( bp, ((bp->b_iocmd == BIO_READ) ? &nsw_rcount : ((bp->b_flags & B_ASYNC) ? &nsw_wcount_async : &nsw_wcount_sync ) ) ); } int swap_pager_nswapdev(void) { return (nswapdev); } /* * SWP_PAGER_FORCE_PAGEIN() - force a swap block to be paged in * * This routine dissociates the page at the given index within an object * from its backing store, paging it in if it does not reside in memory. * If the page is paged in, it is marked dirty and placed in the laundry * queue. The page is marked dirty because it no longer has backing * store. It is placed in the laundry queue because it has not been * accessed recently. Otherwise, it would already reside in memory. * * We also attempt to swap in all other pages in the swap block. * However, we only guarantee that the one at the specified index is * paged in. * * XXX - The code to page the whole block in doesn't work, so we * revert to the one-by-one behavior for now. Sigh. */ static inline void swp_pager_force_pagein(vm_object_t object, vm_pindex_t pindex) { vm_page_t m; vm_object_pip_add(object, 1); m = vm_page_grab(object, pindex, VM_ALLOC_NORMAL); if (m->valid == VM_PAGE_BITS_ALL) { vm_object_pip_wakeup(object); vm_page_dirty(m); vm_page_lock(m); vm_page_activate(m); vm_page_unlock(m); vm_page_xunbusy(m); vm_pager_page_unswapped(m); return; } if (swap_pager_getpages(object, &m, 1, NULL, NULL) != VM_PAGER_OK) panic("swap_pager_force_pagein: read from swap failed");/*XXX*/ vm_object_pip_wakeup(object); vm_page_dirty(m); vm_page_lock(m); vm_page_launder(m); vm_page_unlock(m); vm_page_xunbusy(m); vm_pager_page_unswapped(m); } /* * swap_pager_swapoff: * * Page in all of the pages that have been paged out to the * given device. The corresponding blocks in the bitmap must be * marked as allocated and the device must be flagged SW_CLOSING. * There may be no processes swapped out to the device. * * This routine may block. */ static void swap_pager_swapoff(struct swdevt *sp) { - struct swblock *swap; - vm_object_t locked_obj, object; - vm_pindex_t pindex; - int i, j, retries; + struct swblk *sb; + vm_object_t object; + vm_pindex_t pi; + int i, retries; sx_assert(&swdev_syscall_lock, SA_XLOCKED); retries = 0; - locked_obj = NULL; full_rescan: - mtx_lock(&swhash_mtx); - for (i = 0; i <= swhash_mask; i++) { /* '<=' is correct here */ -restart: - for (swap = swhash[i]; swap != NULL; swap = swap->swb_hnext) { - object = swap->swb_object; - pindex = swap->swb_index; - for (j = 0; j < SWAP_META_PAGES; ++j) { - if (!swp_pager_isondev(swap->swb_pages[j], sp)) + mtx_lock(&vm_object_list_mtx); + TAILQ_FOREACH(object, &vm_object_list, object_list) { + if (object->type != OBJT_SWAP) + continue; + mtx_unlock(&vm_object_list_mtx); + /* Depends on type-stability. */ + VM_OBJECT_WLOCK(object); + + /* + * Dead objects are eventually terminated on their own. + */ + if ((object->flags & OBJ_DEAD) != 0) + goto next_obj; + + /* + * Sync with fences placed after pctrie + * initialization. We must not access pctrie below + * unless we checked that our object is swap and not + * dead. 
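 *
 * [Illustrative pairing, editor's addition: the writer side in
 *  swp_pager_meta_build() performs
 *	pctrie_init(&object->un_pager.swp.swp_blks);
 *	atomic_thread_fence_rel();
 *	object->type = OBJT_SWAP;
 *  so a reader that first observes type == OBJT_SWAP and then issues
 *  atomic_thread_fence_acq() below is guaranteed to see the fully
 *  initialized trie before performing any lookup on it.]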
+ */ + atomic_thread_fence_acq(); + if (object->type != OBJT_SWAP) + goto next_obj; + + for (pi = 0; (sb = SWAP_PCTRIE_LOOKUP_GE( + &object->un_pager.swp.swp_blks, pi)) != NULL; ) { + pi = sb->p + SWAP_META_PAGES; + for (i = 0; i < SWAP_META_PAGES; i++) { + if (sb->d[i] == SWAPBLK_NONE) continue; - if (locked_obj != object) { - if (locked_obj != NULL) - VM_OBJECT_WUNLOCK(locked_obj); - locked_obj = object; - if (!VM_OBJECT_TRYWLOCK(object)) { - mtx_unlock(&swhash_mtx); - /* Depends on type-stability. */ - VM_OBJECT_WLOCK(object); - mtx_lock(&swhash_mtx); - goto restart; - } - } - MPASS(locked_obj == object); - mtx_unlock(&swhash_mtx); - swp_pager_force_pagein(object, pindex + j); - mtx_lock(&swhash_mtx); - goto restart; + if (swp_pager_isondev(sb->d[i], sp)) + swp_pager_force_pagein(object, + sb->p + i); } } +next_obj: + VM_OBJECT_WUNLOCK(object); + mtx_lock(&vm_object_list_mtx); } - mtx_unlock(&swhash_mtx); - if (locked_obj != NULL) { - VM_OBJECT_WUNLOCK(locked_obj); - locked_obj = NULL; - } + mtx_unlock(&vm_object_list_mtx); + if (sp->sw_used) { /* * Objects may be locked or paging to the device being * removed, so we will miss their pages and need to * make another pass. We have marked this device as * SW_CLOSING, so the activity should finish soon. */ retries++; if (retries > 100) { panic("swapoff: failed to locate %d swap blocks", sp->sw_used); } pause("swpoff", hz / 20); goto full_rescan; } EVENTHANDLER_INVOKE(swapoff, sp); } /************************************************************************ * SWAP META DATA * ************************************************************************ * * These routines manipulate the swap metadata stored in the * OBJT_SWAP object. * * Swap metadata is implemented with a global hash and not directly * linked into the object. Instead the object simply contains * appropriate tracking counters. */ /* * SWP_PAGER_META_BUILD() - add swap block to swap meta data for object * * We first convert the object to a swap object if it is a default * object. * * The specified swapblk is added to the object's swap metadata. If * the swapblk is not valid, it is freed instead. Any previously * assigned swapblk is freed. */ static void swp_pager_meta_build(vm_object_t object, vm_pindex_t pindex, daddr_t swapblk) { - static volatile int exhausted; - struct swblock *swap; - struct swblock **pswap; - int idx; + static volatile int swblk_zone_exhausted, swpctrie_zone_exhausted; + struct swblk *sb; + vm_pindex_t modpi, rdpi; + int error, i; VM_OBJECT_ASSERT_WLOCKED(object); + /* * Convert default object to swap object if necessary */ if (object->type != OBJT_SWAP) { + pctrie_init(&object->un_pager.swp.swp_blks); + + /* + * Ensure that swap_pager_swapoff()'s iteration over + * object_list does not see a garbage pctrie. + */ + atomic_thread_fence_rel(); + object->type = OBJT_SWAP; - object->un_pager.swp.swp_bcount = 0; KASSERT(object->handle == NULL, ("default pager with handle")); } - /* - * Locate hash entry. If not found create, but if we aren't adding - * anything just return. If we run out of space in the map we wait - * and, since the hash table may have changed, retry. - */ -retry: - mtx_lock(&swhash_mtx); - pswap = swp_pager_hash(object, pindex); - - if ((swap = *pswap) == NULL) { - int i; - + rdpi = rounddown(pindex, SWAP_META_PAGES); + sb = SWAP_PCTRIE_LOOKUP(&object->un_pager.swp.swp_blks, rdpi); + if (sb == NULL) { if (swapblk == SWAPBLK_NONE) - goto done; - - swap = *pswap = uma_zalloc(swap_zone, M_NOWAIT | - (curproc == pageproc ? 
M_USE_RESERVE : 0)); - if (swap == NULL) { - mtx_unlock(&swhash_mtx); + return; + for (;;) { + sb = uma_zalloc(swblk_zone, M_NOWAIT | (curproc == + pageproc ? M_USE_RESERVE : 0)); + if (sb != NULL) { + sb->p = rdpi; + for (i = 0; i < SWAP_META_PAGES; i++) + sb->d[i] = SWAPBLK_NONE; + if (atomic_cmpset_int(&swblk_zone_exhausted, + 1, 0)) + printf("swblk zone ok\n"); + break; + } VM_OBJECT_WUNLOCK(object); - if (uma_zone_exhausted(swap_zone)) { - if (atomic_cmpset_int(&exhausted, 0, 1)) - printf("swap zone exhausted, " + if (uma_zone_exhausted(swblk_zone)) { + if (atomic_cmpset_int(&swblk_zone_exhausted, + 0, 1)) + printf("swap blk zone exhausted, " "increase kern.maxswzone\n"); vm_pageout_oom(VM_OOM_SWAPZ); - pause("swzonex", 10); + pause("swzonxb", 10); } else VM_WAIT; VM_OBJECT_WLOCK(object); - goto retry; } - - if (atomic_cmpset_int(&exhausted, 1, 0)) - printf("swap zone ok\n"); - - swap->swb_hnext = NULL; - swap->swb_object = object; - swap->swb_index = pindex & ~(vm_pindex_t)SWAP_META_MASK; - swap->swb_count = 0; - - ++object->un_pager.swp.swp_bcount; - - for (i = 0; i < SWAP_META_PAGES; ++i) - swap->swb_pages[i] = SWAPBLK_NONE; + for (;;) { + error = SWAP_PCTRIE_INSERT( + &object->un_pager.swp.swp_blks, sb); + if (error == 0) { + if (atomic_cmpset_int(&swpctrie_zone_exhausted, + 1, 0)) + printf("swpctrie zone ok\n"); + break; + } + VM_OBJECT_WUNLOCK(object); + if (uma_zone_exhausted(swpctrie_zone)) { + if (atomic_cmpset_int(&swpctrie_zone_exhausted, + 0, 1)) + printf("swap pctrie zone exhausted, " + "increase kern.maxswzone\n"); + vm_pageout_oom(VM_OOM_SWAPZ); + pause("swzonxp", 10); + } else + VM_WAIT; + VM_OBJECT_WLOCK(object); + } } + MPASS(sb->p == rdpi); - /* - * Delete prior contents of metadata - */ - idx = pindex & SWAP_META_MASK; - - if (swap->swb_pages[idx] != SWAPBLK_NONE) { - swp_pager_freeswapspace(swap->swb_pages[idx], 1); - --swap->swb_count; - } - - /* - * Enter block into metadata - */ - swap->swb_pages[idx] = swapblk; - if (swapblk != SWAPBLK_NONE) - ++swap->swb_count; -done: - mtx_unlock(&swhash_mtx); + modpi = pindex % SWAP_META_PAGES; + /* Delete prior contents of metadata. */ + if (sb->d[modpi] != SWAPBLK_NONE) + swp_pager_freeswapspace(sb->d[modpi], 1); + /* Enter block into metadata. */ + sb->d[modpi] = swapblk; } /* * SWP_PAGER_META_FREE() - free a range of blocks in the object's swap metadata * * The requested range of blocks is freed, with any associated swap * returned to the swap bitmap. * * This routine will free swap metadata structures as they are cleaned * out. This routine does *NOT* operate on swap metadata associated * with resident pages. 
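 *
 * [Editor's illustration, not part of this change: a self-contained
 *  sketch of the chunk arithmetic used by the loop below.  The toy_*
 *  names and TOY_META_PAGES are hypothetical stand-ins for
 *  SWAP_META_PAGES; the real code walks struct swblk entries found via
 *  SWAP_PCTRIE_LOOKUP_GE() instead of merely counting chunks.]
 */

#define	TOY_META_PAGES	32

/*
 * Number of TOY_META_PAGES-sized metadata chunks touched by the page
 * index range [pindex, pindex + count).  Each chunk's key is the
 * rounddown of its first page index, mirroring sb->p below.
 */
static unsigned long
toy_meta_chunks(unsigned long pindex, unsigned long count)
{
	unsigned long first, last;

	if (count == 0)
		return (0);
	first = pindex - (pindex % TOY_META_PAGES);
	last = (pindex + count - 1) - ((pindex + count - 1) % TOY_META_PAGES);
	return ((last - first) / TOY_META_PAGES + 1);
}

/*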
*/ static void -swp_pager_meta_free(vm_object_t object, vm_pindex_t index, vm_pindex_t count) +swp_pager_meta_free(vm_object_t object, vm_pindex_t pindex, vm_pindex_t count) { - struct swblock **pswap, *swap; - vm_pindex_t c; - daddr_t v; - int n, sidx; + struct swblk *sb; + vm_pindex_t last; + int i; + bool empty; VM_OBJECT_ASSERT_LOCKED(object); if (object->type != OBJT_SWAP || count == 0) return; - mtx_lock(&swhash_mtx); - for (c = 0; c < count;) { - pswap = swp_pager_hash(object, index); - sidx = index & SWAP_META_MASK; - n = SWAP_META_PAGES - sidx; - index += n; - if ((swap = *pswap) == NULL) { - c += n; - continue; - } - for (; c < count && sidx < SWAP_META_PAGES; ++c, ++sidx) { - if ((v = swap->swb_pages[sidx]) == SWAPBLK_NONE) + last = pindex + count - 1; + for (;;) { + sb = SWAP_PCTRIE_LOOKUP_GE(&object->un_pager.swp.swp_blks, + rounddown(pindex, SWAP_META_PAGES)); + if (sb == NULL || sb->p > last) + break; + empty = true; + for (i = 0; i < SWAP_META_PAGES; i++) { + if (sb->d[i] == SWAPBLK_NONE) continue; - swp_pager_freeswapspace(v, 1); - swap->swb_pages[sidx] = SWAPBLK_NONE; - if (--swap->swb_count == 0) { - *pswap = swap->swb_hnext; - uma_zfree(swap_zone, swap); - --object->un_pager.swp.swp_bcount; - c += SWAP_META_PAGES - sidx; - break; - } + if (pindex <= sb->p + i && sb->p + i <= last) { + swp_pager_freeswapspace(sb->d[i], 1); + sb->d[i] = SWAPBLK_NONE; + } else + empty = false; } + pindex = sb->p + SWAP_META_PAGES; + if (empty) { + SWAP_PCTRIE_REMOVE(&object->un_pager.swp.swp_blks, + sb->p); + uma_zfree(swblk_zone, sb); + } } - mtx_unlock(&swhash_mtx); } /* * SWP_PAGER_META_FREE_ALL() - destroy all swap metadata associated with object * * This routine locates and destroys all swap metadata associated with * an object. */ static void swp_pager_meta_free_all(vm_object_t object) { - struct swblock **pswap, *swap; - vm_pindex_t index; - daddr_t v; + struct swblk *sb; + vm_pindex_t pindex; int i; VM_OBJECT_ASSERT_WLOCKED(object); if (object->type != OBJT_SWAP) return; - index = 0; - while (object->un_pager.swp.swp_bcount != 0) { - mtx_lock(&swhash_mtx); - pswap = swp_pager_hash(object, index); - if ((swap = *pswap) != NULL) { - for (i = 0; i < SWAP_META_PAGES; ++i) { - v = swap->swb_pages[i]; - if (v != SWAPBLK_NONE) { - --swap->swb_count; - swp_pager_freeswapspace(v, 1); - } - } - if (swap->swb_count != 0) - panic( - "swap_pager_meta_free_all: swb_count != 0"); - *pswap = swap->swb_hnext; - uma_zfree(swap_zone, swap); - --object->un_pager.swp.swp_bcount; + for (pindex = 0; (sb = SWAP_PCTRIE_LOOKUP_GE( + &object->un_pager.swp.swp_blks, pindex)) != NULL;) { + pindex = sb->p + SWAP_META_PAGES; + for (i = 0; i < SWAP_META_PAGES; i++) { + if (sb->d[i] != SWAPBLK_NONE) + swp_pager_freeswapspace(sb->d[i], 1); } - mtx_unlock(&swhash_mtx); - index += SWAP_META_PAGES; + SWAP_PCTRIE_REMOVE(&object->un_pager.swp.swp_blks, sb->p); + uma_zfree(swblk_zone, sb); } } /* * SWP_PAGER_METACTL() - misc control of swap and vm_page_t meta data. * * This routine is capable of looking up, popping, or freeing * swapblk assignments in the swap meta data or in the vm_page_t. * The routine typically returns the swapblk being looked-up, or popped, * or SWAPBLK_NONE if the block was freed, or SWAPBLK_NONE if the block * was invalid. This routine will automatically free any invalid * meta-data swapblks. * - * It is not possible to store invalid swapblks in the swap meta data - * (other then a literal 'SWAPBLK_NONE'), so we don't bother checking. 
- * * When acting on a busy resident page and paging is in progress, we * have to wait until paging is complete but otherwise can act on the * busy page. * * SWM_FREE remove and free swap block from metadata * SWM_POP remove from meta data but do not free.. pop it out */ static daddr_t swp_pager_meta_ctl(vm_object_t object, vm_pindex_t pindex, int flags) { - struct swblock **pswap; - struct swblock *swap; + struct swblk *sb; daddr_t r1; - int idx; + int i; VM_OBJECT_ASSERT_LOCKED(object); /* * The meta data only exists of the object is OBJT_SWAP * and even then might not be allocated yet. */ if (object->type != OBJT_SWAP) return (SWAPBLK_NONE); - r1 = SWAPBLK_NONE; - mtx_lock(&swhash_mtx); - pswap = swp_pager_hash(object, pindex); - - if ((swap = *pswap) != NULL) { - idx = pindex & SWAP_META_MASK; - r1 = swap->swb_pages[idx]; - - if (r1 != SWAPBLK_NONE) { - if (flags & SWM_FREE) { - swp_pager_freeswapspace(r1, 1); - r1 = SWAPBLK_NONE; - } - if (flags & (SWM_FREE|SWM_POP)) { - swap->swb_pages[idx] = SWAPBLK_NONE; - if (--swap->swb_count == 0) { - *pswap = swap->swb_hnext; - uma_zfree(swap_zone, swap); - --object->un_pager.swp.swp_bcount; - } - } + sb = SWAP_PCTRIE_LOOKUP(&object->un_pager.swp.swp_blks, + rounddown(pindex, SWAP_META_PAGES)); + if (sb == NULL) + return (SWAPBLK_NONE); + r1 = sb->d[pindex % SWAP_META_PAGES]; + if (r1 == SWAPBLK_NONE) + return (SWAPBLK_NONE); + if ((flags & (SWM_FREE | SWM_POP)) != 0) { + sb->d[pindex % SWAP_META_PAGES] = SWAPBLK_NONE; + for (i = 0; i < SWAP_META_PAGES; i++) { + if (sb->d[i] != SWAPBLK_NONE) + break; } + if (i == SWAP_META_PAGES) { + SWAP_PCTRIE_REMOVE(&object->un_pager.swp.swp_blks, + rounddown(pindex, SWAP_META_PAGES)); + uma_zfree(swblk_zone, sb); + } } - mtx_unlock(&swhash_mtx); + if ((flags & SWM_FREE) != 0) { + swp_pager_freeswapspace(r1, 1); + r1 = SWAPBLK_NONE; + } return (r1); } /* * Returns the least page index which is greater than or equal to the * parameter pindex and for which there is a swap block allocated. * Returns object's size if the object's type is not swap or if there * are no allocated swap blocks for the object after the requested * pindex. */ vm_pindex_t swap_pager_find_least(vm_object_t object, vm_pindex_t pindex) { - struct swblock **pswap, *swap; - vm_pindex_t i, j, lim; - int idx; + struct swblk *sb; + int i; VM_OBJECT_ASSERT_LOCKED(object); - if (object->type != OBJT_SWAP || object->un_pager.swp.swp_bcount == 0) + if (object->type != OBJT_SWAP) return (object->size); - mtx_lock(&swhash_mtx); - for (j = pindex; j < object->size; j = lim) { - pswap = swp_pager_hash(object, j); - lim = rounddown2(j + SWAP_META_PAGES, SWAP_META_PAGES); - if (lim > object->size) - lim = object->size; - if ((swap = *pswap) != NULL) { - for (idx = j & SWAP_META_MASK, i = j; i < lim; - i++, idx++) { - if (swap->swb_pages[idx] != SWAPBLK_NONE) - goto found; - } + sb = SWAP_PCTRIE_LOOKUP_GE(&object->un_pager.swp.swp_blks, + rounddown(pindex, SWAP_META_PAGES)); + if (sb == NULL) + return (object->size); + if (sb->p < pindex) { + for (i = pindex % SWAP_META_PAGES; i < SWAP_META_PAGES; i++) { + if (sb->d[i] != SWAPBLK_NONE) + return (sb->p + i); } + sb = SWAP_PCTRIE_LOOKUP_GE(&object->un_pager.swp.swp_blks, + roundup(pindex, SWAP_META_PAGES)); + if (sb == NULL) + return (object->size); } - i = object->size; -found: - mtx_unlock(&swhash_mtx); - return (i); + for (i = 0; i < SWAP_META_PAGES; i++) { + if (sb->d[i] != SWAPBLK_NONE) + return (sb->p + i); + } + + /* + * We get here if a swblk is present in the trie but it + * doesn't map any blocks. 
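 *
 * [Editor's note: SWAP_PCTRIE_LOOKUP_GE() returns the chunk with the
 *  smallest ->p that is greater than or equal to its key, so the search
 *  above inspects at most two chunks: the one that may straddle pindex
 *  and, failing that, the first chunk at or after
 *  roundup(pindex, SWAP_META_PAGES).]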
+ */ + MPASS(0); + return (object->size); } /* * System call swapon(name) enables swapping on device name, * which must be in the swdevsw. Return EBUSY * if already swapping on this device. */ #ifndef _SYS_SYSPROTO_H_ struct swapon_args { char *name; }; #endif /* * MPSAFE */ /* ARGSUSED */ int sys_swapon(struct thread *td, struct swapon_args *uap) { struct vattr attr; struct vnode *vp; struct nameidata nd; int error; error = priv_check(td, PRIV_SWAPON); if (error) return (error); sx_xlock(&swdev_syscall_lock); /* * Swap metadata may not fit in the KVM if we have physical * memory of >1GB. */ - if (swap_zone == NULL) { + if (swblk_zone == NULL) { error = ENOMEM; goto done; } NDINIT(&nd, LOOKUP, ISOPEN | FOLLOW | AUDITVNODE1, UIO_USERSPACE, uap->name, td); error = namei(&nd); if (error) goto done; NDFREE(&nd, NDF_ONLY_PNBUF); vp = nd.ni_vp; if (vn_isdisk(vp, &error)) { error = swapongeom(vp); } else if (vp->v_type == VREG && (vp->v_mount->mnt_vfc->vfc_flags & VFCF_NETWORK) != 0 && (error = VOP_GETATTR(vp, &attr, td->td_ucred)) == 0) { /* * Allow direct swapping to NFS regular files in the same * way that nfs_mountroot() sets up diskless swapping. */ error = swaponvp(td, vp, attr.va_size / DEV_BSIZE); } if (error) vrele(vp); done: sx_xunlock(&swdev_syscall_lock); return (error); } /* * Check that the total amount of swap currently configured does not * exceed half the theoretical maximum. If it does, print a warning * message and return -1; otherwise, return 0. */ static int swapon_check_swzone(unsigned long npages) { unsigned long maxpages; /* absolute maximum we can handle assuming 100% efficiency */ - maxpages = uma_zone_get_max(swap_zone) * SWAP_META_PAGES; + maxpages = uma_zone_get_max(swblk_zone) * SWAP_META_PAGES; /* recommend using no more than half that amount */ if (npages > maxpages / 2) { printf("warning: total configured swap (%lu pages) " "exceeds maximum recommended amount (%lu pages).\n", npages, maxpages / 2); printf("warning: increase kern.maxswzone " "or reduce amount of swap.\n"); return (-1); } return (0); } static void swaponsomething(struct vnode *vp, void *id, u_long nblks, sw_strategy_t *strategy, sw_close_t *close, dev_t dev, int flags) { struct swdevt *sp, *tsp; swblk_t dvbase; u_long mblocks; /* * nblks is in DEV_BSIZE'd chunks, convert to PAGE_SIZE'd chunks. * First chop nblks off to page-align it, then convert. * * sw->sw_nblks is in page-sized chunks now too. */ nblks &= ~(ctodb(1) - 1); nblks = dbtoc(nblks); /* * If we go beyond this, we get overflows in the radix * tree bitmap code. 
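 *
 * [Worked example, editor's addition: with PAGE_SIZE 4096 and
 *  DEV_BSIZE 512, ctodb(1) == 8, so the mask above trims nblks to a
 *  multiple of 8 DEV_BSIZE blocks and dbtoc() then divides by 8 to
 *  yield whole pages.  The clamp below allows at most
 *  0x40000000 / BLIST_META_RADIX pages per device; e.g. a radix of 16
 *  gives 64M pages, i.e. 256GB of swap with 4KB pages.]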
*/ mblocks = 0x40000000 / BLIST_META_RADIX; if (nblks > mblocks) { printf( "WARNING: reducing swap size to maximum of %luMB per unit\n", mblocks / 1024 / 1024 * PAGE_SIZE); nblks = mblocks; } sp = malloc(sizeof *sp, M_VMPGDATA, M_WAITOK | M_ZERO); sp->sw_vp = vp; sp->sw_id = id; sp->sw_dev = dev; sp->sw_flags = 0; sp->sw_nblks = nblks; sp->sw_used = 0; sp->sw_strategy = strategy; sp->sw_close = close; sp->sw_flags = flags; sp->sw_blist = blist_create(nblks, M_WAITOK); /* * Do not free the first two block in order to avoid overwriting * any bsd label at the front of the partition */ blist_free(sp->sw_blist, 2, nblks - 2); dvbase = 0; mtx_lock(&sw_dev_mtx); TAILQ_FOREACH(tsp, &swtailq, sw_list) { if (tsp->sw_end >= dvbase) { /* * We put one uncovered page between the devices * in order to definitively prevent any cross-device * I/O requests */ dvbase = tsp->sw_end + 1; } } sp->sw_first = dvbase; sp->sw_end = dvbase + nblks; TAILQ_INSERT_TAIL(&swtailq, sp, sw_list); nswapdev++; swap_pager_avail += nblks - 2; swap_total += (vm_ooffset_t)nblks * PAGE_SIZE; swapon_check_swzone(swap_total / PAGE_SIZE); swp_sizecheck(); mtx_unlock(&sw_dev_mtx); EVENTHANDLER_INVOKE(swapon, sp); } /* * SYSCALL: swapoff(devname) * * Disable swapping on the given device. * * XXX: Badly designed system call: it should use a device index * rather than filename as specification. We keep sw_vp around * only to make this work. */ #ifndef _SYS_SYSPROTO_H_ struct swapoff_args { char *name; }; #endif /* * MPSAFE */ /* ARGSUSED */ int sys_swapoff(struct thread *td, struct swapoff_args *uap) { struct vnode *vp; struct nameidata nd; struct swdevt *sp; int error; error = priv_check(td, PRIV_SWAPOFF); if (error) return (error); sx_xlock(&swdev_syscall_lock); NDINIT(&nd, LOOKUP, FOLLOW | AUDITVNODE1, UIO_USERSPACE, uap->name, td); error = namei(&nd); if (error) goto done; NDFREE(&nd, NDF_ONLY_PNBUF); vp = nd.ni_vp; mtx_lock(&sw_dev_mtx); TAILQ_FOREACH(sp, &swtailq, sw_list) { if (sp->sw_vp == vp) break; } mtx_unlock(&sw_dev_mtx); if (sp == NULL) { error = EINVAL; goto done; } error = swapoff_one(sp, td->td_ucred); done: sx_xunlock(&swdev_syscall_lock); return (error); } static int swapoff_one(struct swdevt *sp, struct ucred *cred) { u_long nblks; #ifdef MAC int error; #endif sx_assert(&swdev_syscall_lock, SA_XLOCKED); #ifdef MAC (void) vn_lock(sp->sw_vp, LK_EXCLUSIVE | LK_RETRY); error = mac_system_check_swapoff(cred, sp->sw_vp); (void) VOP_UNLOCK(sp->sw_vp, 0); if (error != 0) return (error); #endif nblks = sp->sw_nblks; /* * We can turn off this swap device safely only if the * available virtual memory in the system will fit the amount * of data we will have to page back in, plus an epsilon so * the system doesn't become critically low on swap space. */ if (vm_cnt.v_free_count + swap_pager_avail < nblks + nswap_lowat) return (ENOMEM); /* * Prevent further allocations on this device. */ mtx_lock(&sw_dev_mtx); sp->sw_flags |= SW_CLOSING; swap_pager_avail -= blist_fill(sp->sw_blist, 0, nblks); swap_total -= (vm_ooffset_t)nblks * PAGE_SIZE; mtx_unlock(&sw_dev_mtx); /* * Page in the contents of the device and close it. 
*/ swap_pager_swapoff(sp); sp->sw_close(curthread, sp); mtx_lock(&sw_dev_mtx); sp->sw_id = NULL; TAILQ_REMOVE(&swtailq, sp, sw_list); nswapdev--; if (nswapdev == 0) { swap_pager_full = 2; swap_pager_almost_full = 1; } if (swdevhd == sp) swdevhd = NULL; mtx_unlock(&sw_dev_mtx); blist_destroy(sp->sw_blist); free(sp, M_VMPGDATA); return (0); } void swapoff_all(void) { struct swdevt *sp, *spt; const char *devname; int error; sx_xlock(&swdev_syscall_lock); mtx_lock(&sw_dev_mtx); TAILQ_FOREACH_SAFE(sp, &swtailq, sw_list, spt) { mtx_unlock(&sw_dev_mtx); if (vn_isdisk(sp->sw_vp, NULL)) devname = devtoname(sp->sw_vp->v_rdev); else devname = "[file]"; error = swapoff_one(sp, thread0.td_ucred); if (error != 0) { printf("Cannot remove swap device %s (error=%d), " "skipping.\n", devname, error); } else if (bootverbose) { printf("Swap device %s removed.\n", devname); } mtx_lock(&sw_dev_mtx); } mtx_unlock(&sw_dev_mtx); sx_xunlock(&swdev_syscall_lock); } void swap_pager_status(int *total, int *used) { struct swdevt *sp; *total = 0; *used = 0; mtx_lock(&sw_dev_mtx); TAILQ_FOREACH(sp, &swtailq, sw_list) { *total += sp->sw_nblks; *used += sp->sw_used; } mtx_unlock(&sw_dev_mtx); } int swap_dev_info(int name, struct xswdev *xs, char *devname, size_t len) { struct swdevt *sp; const char *tmp_devname; int error, n; n = 0; error = ENOENT; mtx_lock(&sw_dev_mtx); TAILQ_FOREACH(sp, &swtailq, sw_list) { if (n != name) { n++; continue; } xs->xsw_version = XSWDEV_VERSION; xs->xsw_dev = sp->sw_dev; xs->xsw_flags = sp->sw_flags; xs->xsw_nblks = sp->sw_nblks; xs->xsw_used = sp->sw_used; if (devname != NULL) { if (vn_isdisk(sp->sw_vp, NULL)) tmp_devname = devtoname(sp->sw_vp->v_rdev); else tmp_devname = "[file]"; strncpy(devname, tmp_devname, len); } error = 0; break; } mtx_unlock(&sw_dev_mtx); return (error); } #if defined(COMPAT_FREEBSD11) #define XSWDEV_VERSION_11 1 struct xswdev11 { u_int xsw_version; uint32_t xsw_dev; int xsw_flags; int xsw_nblks; int xsw_used; }; #endif static int sysctl_vm_swap_info(SYSCTL_HANDLER_ARGS) { struct xswdev xs; #if defined(COMPAT_FREEBSD11) struct xswdev11 xs11; #endif int error; if (arg2 != 1) /* name length */ return (EINVAL); error = swap_dev_info(*(int *)arg1, &xs, NULL, 0); if (error != 0) return (error); #if defined(COMPAT_FREEBSD11) if (req->oldlen == sizeof(xs11)) { xs11.xsw_version = XSWDEV_VERSION_11; xs11.xsw_dev = xs.xsw_dev; /* truncation */ xs11.xsw_flags = xs.xsw_flags; xs11.xsw_nblks = xs.xsw_nblks; xs11.xsw_used = xs.xsw_used; error = SYSCTL_OUT(req, &xs11, sizeof(xs11)); } else #endif error = SYSCTL_OUT(req, &xs, sizeof(xs)); return (error); } SYSCTL_INT(_vm, OID_AUTO, nswapdev, CTLFLAG_RD, &nswapdev, 0, "Number of swap devices"); SYSCTL_NODE(_vm, OID_AUTO, swap_info, CTLFLAG_RD | CTLFLAG_MPSAFE, sysctl_vm_swap_info, "Swap statistics by device"); /* - * vmspace_swap_count() - count the approximate swap usage in pages for a - * vmspace. - * - * The map must be locked. - * - * Swap usage is determined by taking the proportional swap used by - * VM objects backing the VM map. To make up for fractional losses, - * if the VM object has any swap use at all the associated map entries - * count for at least 1 swap page. + * Count the approximate swap usage in pages for a vmspace. The + * shadowed or not yet copied on write swap blocks are not accounted. + * The map must be locked. 
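 *
 * [Editor's illustration, not part of this change: a self-contained
 *  sketch of the per-chunk counting that the loop below applies to
 *  each struct swblk found in the trie.  The toy_swblk type,
 *  TOY_SWBLK_PAGES and the zero "no swap" value are hypothetical
 *  stand-ins for struct swblk, SWAP_META_PAGES and SWAPBLK_NONE.]
 */

#define	TOY_SWBLK_PAGES	32

struct toy_swblk {
	unsigned long	p;			/* first page index (aligned) */
	long		d[TOY_SWBLK_PAGES];	/* 0 == no swap assigned */
};

/*
 * Count the slots of one metadata chunk that fall inside the page
 * index range [pi, e) and have swap assigned.
 */
static long
toy_count_in_range(const struct toy_swblk *sb, unsigned long pi,
    unsigned long e)
{
	long count;
	int i;

	count = 0;
	for (i = 0; i < TOY_SWBLK_PAGES; i++)
		if (sb->p + i >= pi && sb->p + i < e && sb->d[i] != 0)
			count++;
	return (count);
}

/*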
*/ long vmspace_swap_count(struct vmspace *vmspace) { vm_map_t map; vm_map_entry_t cur; vm_object_t object; - long count, n; + struct swblk *sb; + vm_pindex_t e, pi; + long count; + int i; map = &vmspace->vm_map; count = 0; for (cur = map->header.next; cur != &map->header; cur = cur->next) { - if ((cur->eflags & MAP_ENTRY_IS_SUB_MAP) == 0 && - (object = cur->object.vm_object) != NULL) { - VM_OBJECT_WLOCK(object); - if (object->type == OBJT_SWAP && - object->un_pager.swp.swp_bcount != 0) { - n = (cur->end - cur->start) / PAGE_SIZE; - count += object->un_pager.swp.swp_bcount * - SWAP_META_PAGES * n / object->size + 1; + if ((cur->eflags & MAP_ENTRY_IS_SUB_MAP) != 0) + continue; + object = cur->object.vm_object; + if (object == NULL || object->type != OBJT_SWAP) + continue; + VM_OBJECT_RLOCK(object); + if (object->type != OBJT_SWAP) + goto unlock; + pi = OFF_TO_IDX(cur->offset); + e = pi + OFF_TO_IDX(cur->end - cur->start); + for (;; pi = sb->p + SWAP_META_PAGES) { + sb = SWAP_PCTRIE_LOOKUP_GE( + &object->un_pager.swp.swp_blks, pi); + if (sb == NULL || sb->p >= e) + break; + for (i = 0; i < SWAP_META_PAGES; i++) { + if (sb->p + i < e && + sb->d[i] != SWAPBLK_NONE) + count++; } - VM_OBJECT_WUNLOCK(object); } +unlock: + VM_OBJECT_RUNLOCK(object); } return (count); } /* * GEOM backend * * Swapping onto disk devices. * */ static g_orphan_t swapgeom_orphan; static struct g_class g_swap_class = { .name = "SWAP", .version = G_VERSION, .orphan = swapgeom_orphan, }; DECLARE_GEOM_CLASS(g_swap_class, g_class); static void swapgeom_close_ev(void *arg, int flags) { struct g_consumer *cp; cp = arg; g_access(cp, -1, -1, 0); g_detach(cp); g_destroy_consumer(cp); } /* * Add a reference to the g_consumer for an inflight transaction. */ static void swapgeom_acquire(struct g_consumer *cp) { mtx_assert(&sw_dev_mtx, MA_OWNED); cp->index++; } /* * Remove a reference from the g_consumer. Post a close event if all * references go away, since the function might be called from the * biodone context. 
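 *
 * [Editor's illustration, not part of this change: a self-contained
 *  sketch of the reference-counting pattern used by swapgeom_acquire()
 *  and swapgeom_release().  The toy_consumer type and the callback
 *  pointer are hypothetical; in the real code the deferred work is a
 *  g_post_event() call to swapgeom_close_ev().]
 */

struct toy_consumer {
	int	refs;				/* active I/Os, +1 while open */
	void	(*deferred_close)(struct toy_consumer *);
};

static void
toy_acquire(struct toy_consumer *cp)
{
	cp->refs++;
}

/*
 * Drop one reference.  The last dropper must not tear the consumer
 * down inline, because it may run from an I/O completion context, so
 * it only schedules the close callback.  Returns 1 if that happened.
 */
static int
toy_release(struct toy_consumer *cp)
{
	cp->refs--;
	if (cp->refs == 0) {
		cp->deferred_close(cp);
		return (1);
	}
	return (0);
}

/*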
*/ static void swapgeom_release(struct g_consumer *cp, struct swdevt *sp) { mtx_assert(&sw_dev_mtx, MA_OWNED); cp->index--; if (cp->index == 0) { if (g_post_event(swapgeom_close_ev, cp, M_NOWAIT, NULL) == 0) sp->sw_id = NULL; } } static void swapgeom_done(struct bio *bp2) { struct swdevt *sp; struct buf *bp; struct g_consumer *cp; bp = bp2->bio_caller2; cp = bp2->bio_from; bp->b_ioflags = bp2->bio_flags; if (bp2->bio_error) bp->b_ioflags |= BIO_ERROR; bp->b_resid = bp->b_bcount - bp2->bio_completed; bp->b_error = bp2->bio_error; bufdone(bp); sp = bp2->bio_caller1; mtx_lock(&sw_dev_mtx); swapgeom_release(cp, sp); mtx_unlock(&sw_dev_mtx); g_destroy_bio(bp2); } static void swapgeom_strategy(struct buf *bp, struct swdevt *sp) { struct bio *bio; struct g_consumer *cp; mtx_lock(&sw_dev_mtx); cp = sp->sw_id; if (cp == NULL) { mtx_unlock(&sw_dev_mtx); bp->b_error = ENXIO; bp->b_ioflags |= BIO_ERROR; bufdone(bp); return; } swapgeom_acquire(cp); mtx_unlock(&sw_dev_mtx); if (bp->b_iocmd == BIO_WRITE) bio = g_new_bio(); else bio = g_alloc_bio(); if (bio == NULL) { mtx_lock(&sw_dev_mtx); swapgeom_release(cp, sp); mtx_unlock(&sw_dev_mtx); bp->b_error = ENOMEM; bp->b_ioflags |= BIO_ERROR; bufdone(bp); return; } bio->bio_caller1 = sp; bio->bio_caller2 = bp; bio->bio_cmd = bp->b_iocmd; bio->bio_offset = (bp->b_blkno - sp->sw_first) * PAGE_SIZE; bio->bio_length = bp->b_bcount; bio->bio_done = swapgeom_done; if (!buf_mapped(bp)) { bio->bio_ma = bp->b_pages; bio->bio_data = unmapped_buf; bio->bio_ma_offset = (vm_offset_t)bp->b_offset & PAGE_MASK; bio->bio_ma_n = bp->b_npages; bio->bio_flags |= BIO_UNMAPPED; } else { bio->bio_data = bp->b_data; bio->bio_ma = NULL; } g_io_request(bio, cp); return; } static void swapgeom_orphan(struct g_consumer *cp) { struct swdevt *sp; int destroy; mtx_lock(&sw_dev_mtx); TAILQ_FOREACH(sp, &swtailq, sw_list) { if (sp->sw_id == cp) { sp->sw_flags |= SW_CLOSING; break; } } /* * Drop reference we were created with. Do directly since we're in a * special context where we don't have to queue the call to * swapgeom_close_ev(). */ cp->index--; destroy = ((sp != NULL) && (cp->index == 0)); if (destroy) sp->sw_id = NULL; mtx_unlock(&sw_dev_mtx); if (destroy) swapgeom_close_ev(cp, 0); } static void swapgeom_close(struct thread *td, struct swdevt *sw) { struct g_consumer *cp; mtx_lock(&sw_dev_mtx); cp = sw->sw_id; sw->sw_id = NULL; mtx_unlock(&sw_dev_mtx); /* * swapgeom_close() may be called from the biodone context, * where we cannot perform topology changes. Delegate the * work to the events thread. */ if (cp != NULL) g_waitfor_event(swapgeom_close_ev, cp, M_WAITOK, NULL); } static int swapongeom_locked(struct cdev *dev, struct vnode *vp) { struct g_provider *pp; struct g_consumer *cp; static struct g_geom *gp; struct swdevt *sp; u_long nblks; int error; pp = g_dev_getprovider(dev); if (pp == NULL) return (ENODEV); mtx_lock(&sw_dev_mtx); TAILQ_FOREACH(sp, &swtailq, sw_list) { cp = sp->sw_id; if (cp != NULL && cp->provider == pp) { mtx_unlock(&sw_dev_mtx); return (EBUSY); } } mtx_unlock(&sw_dev_mtx); if (gp == NULL) gp = g_new_geomf(&g_swap_class, "swap"); cp = g_new_consumer(gp); cp->index = 1; /* Number of active I/Os, plus one for being active. 
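The acquire/release pair above, together with swapgeom_done() and swapgeom_orphan() below, implements a small reference-counting protocol on the GEOM consumer: every in-flight bio holds a reference, and because the biodone context may not perform topology changes, the final release defers g_access()/g_detach()/g_destroy_consumer() to the GEOM event thread. A condensed, illustrative restatement of that protocol (not a drop-in replacement; it reuses swapgeom_close_ev and the locks declared earlier in this file):

/* One reference per in-flight I/O, plus one for the configured device. */
static void
consumer_acquire(struct g_consumer *cp)
{
	mtx_assert(&sw_dev_mtx, MA_OWNED);
	cp->index++;
}

static void
consumer_release(struct g_consumer *cp, struct swdevt *sp)
{
	mtx_assert(&sw_dev_mtx, MA_OWNED);
	if (--cp->index != 0)
		return;
	/*
	 * May be running from biodone context: queue the close on the
	 * event thread instead of detaching here.  M_NOWAIT because
	 * sleeping is not allowed in this context.
	 */
	if (g_post_event(swapgeom_close_ev, cp, M_NOWAIT, NULL) == 0)
		sp->sw_id = NULL;
}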
*/ cp->flags |= G_CF_DIRECT_SEND | G_CF_DIRECT_RECEIVE; g_attach(cp, pp); /* * XXX: Every time you think you can improve the margin for * footshooting, somebody depends on the ability to do so: * savecore(8) wants to write to our swapdev so we cannot * set an exclusive count :-( */ error = g_access(cp, 1, 1, 0); if (error != 0) { g_detach(cp); g_destroy_consumer(cp); return (error); } nblks = pp->mediasize / DEV_BSIZE; swaponsomething(vp, cp, nblks, swapgeom_strategy, swapgeom_close, dev2udev(dev), (pp->flags & G_PF_ACCEPT_UNMAPPED) != 0 ? SW_UNMAPPED : 0); return (0); } static int swapongeom(struct vnode *vp) { int error; vn_lock(vp, LK_EXCLUSIVE | LK_RETRY); if (vp->v_type != VCHR || (vp->v_iflag & VI_DOOMED) != 0) { error = ENOENT; } else { g_topology_lock(); error = swapongeom_locked(vp->v_rdev, vp); g_topology_unlock(); } VOP_UNLOCK(vp, 0); return (error); } /* * VNODE backend * * This is used mainly for network filesystem (read: probably only tested * with NFS) swapfiles. * */ static void swapdev_strategy(struct buf *bp, struct swdevt *sp) { struct vnode *vp2; bp->b_blkno = ctodb(bp->b_blkno - sp->sw_first); vp2 = sp->sw_id; vhold(vp2); if (bp->b_iocmd == BIO_WRITE) { if (bp->b_bufobj) bufobj_wdrop(bp->b_bufobj); bufobj_wref(&vp2->v_bufobj); } if (bp->b_bufobj != &vp2->v_bufobj) bp->b_bufobj = &vp2->v_bufobj; bp->b_vp = vp2; bp->b_iooffset = dbtob(bp->b_blkno); bstrategy(bp); return; } static void swapdev_close(struct thread *td, struct swdevt *sp) { VOP_CLOSE(sp->sw_vp, FREAD | FWRITE, td->td_ucred, td); vrele(sp->sw_vp); } static int swaponvp(struct thread *td, struct vnode *vp, u_long nblks) { struct swdevt *sp; int error; if (nblks == 0) return (ENXIO); mtx_lock(&sw_dev_mtx); TAILQ_FOREACH(sp, &swtailq, sw_list) { if (sp->sw_id == vp) { mtx_unlock(&sw_dev_mtx); return (EBUSY); } } mtx_unlock(&sw_dev_mtx); (void) vn_lock(vp, LK_EXCLUSIVE | LK_RETRY); #ifdef MAC error = mac_system_check_swapon(td->td_ucred, vp); if (error == 0) #endif error = VOP_OPEN(vp, FREAD | FWRITE, td->td_ucred, td, NULL); (void) VOP_UNLOCK(vp, 0); if (error) return (error); swaponsomething(vp, vp, nblks, swapdev_strategy, swapdev_close, NODEV, 0); return (0); } static int sysctl_swap_async_max(SYSCTL_HANDLER_ARGS) { int error, new, n; new = nsw_wcount_async_max; error = sysctl_handle_int(oidp, &new, 0, req); if (error != 0 || req->newptr == NULL) return (error); if (new > nswbuf / 2 || new < 1) return (EINVAL); mtx_lock(&pbuf_mtx); while (nsw_wcount_async_max != new) { /* * Adjust difference. If the current async count is too low, * we will need to sqeeze our update slowly in. Sleep with a * higher priority than getpbuf() to finish faster. */ n = new - nsw_wcount_async_max; if (nsw_wcount_async + n >= 0) { nsw_wcount_async += n; nsw_wcount_async_max += n; wakeup(&nsw_wcount_async); } else { nsw_wcount_async_max -= nsw_wcount_async; nsw_wcount_async = 0; msleep(&nsw_wcount_async, &pbuf_mtx, PSWP, "swpsysctl", 0); } } mtx_unlock(&pbuf_mtx); return (0); } Index: projects/runtime-coverage/sys/vm/vm_object.c =================================================================== --- projects/runtime-coverage/sys/vm/vm_object.c (revision 322921) +++ projects/runtime-coverage/sys/vm/vm_object.c (revision 322922) @@ -1,2663 +1,2675 @@ /*- * Copyright (c) 1991, 1993 * The Regents of the University of California. All rights reserved. * * This code is derived from software contributed to Berkeley by * The Mach Operating System project at Carnegie-Mellon University. 
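sysctl_swap_async_max() above follows the standard shape of a writable integer sysctl handler: copy the current value, let sysctl_handle_int() do the user I/O, return early on read-only access, validate, and only then commit. A generic, hypothetical instance of the same pattern (the OID name, variable, and bounds are made up):

#include <sys/param.h>
#include <sys/kernel.h>
#include <sys/sysctl.h>
#include <sys/systm.h>

static int example_limit = 16;

static int
sysctl_example_limit(SYSCTL_HANDLER_ARGS)
{
	int error, new;

	new = example_limit;
	error = sysctl_handle_int(oidp, &new, 0, req);
	if (error != 0 || req->newptr == NULL)
		return (error);		/* read, or copy error */
	if (new < 1 || new > 1024)
		return (EINVAL);	/* reject out-of-range writes */
	example_limit = new;
	return (0);
}
SYSCTL_PROC(_vm, OID_AUTO, example_limit,
    CTLTYPE_INT | CTLFLAG_RW | CTLFLAG_MPSAFE, NULL, 0,
    sysctl_example_limit, "I", "Hypothetical writable limit");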
* * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * from: @(#)vm_object.c 8.5 (Berkeley) 3/22/94 * * * Copyright (c) 1987, 1990 Carnegie-Mellon University. * All rights reserved. * * Authors: Avadis Tevanian, Jr., Michael Wayne Young * * Permission to use, copy, modify and distribute this software and * its documentation is hereby granted, provided that both the copyright * notice and this permission notice appear in all copies of the * software, derivative works or modified versions, and any portions * thereof, and that both notices appear in supporting documentation. * * CARNEGIE MELLON ALLOWS FREE USE OF THIS SOFTWARE IN ITS "AS IS" * CONDITION. CARNEGIE MELLON DISCLAIMS ANY LIABILITY OF ANY KIND * FOR ANY DAMAGES WHATSOEVER RESULTING FROM THE USE OF THIS SOFTWARE. * * Carnegie Mellon requests users of this software to return to * * Software Distribution Coordinator or Software.Distribution@CS.CMU.EDU * School of Computer Science * Carnegie Mellon University * Pittsburgh PA 15213-3890 * * any improvements or extensions that they make and grant Carnegie the * rights to redistribute these changes. */ /* * Virtual memory object module. */ #include __FBSDID("$FreeBSD$"); #include "opt_vm.h" #include #include #include #include #include #include +#include #include #include #include /* for curproc, pageproc */ #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include static int old_msync; SYSCTL_INT(_vm, OID_AUTO, old_msync, CTLFLAG_RW, &old_msync, 0, "Use old (insecure) msync behavior"); static int vm_object_page_collect_flush(vm_object_t object, vm_page_t p, int pagerflags, int flags, boolean_t *clearobjflags, boolean_t *eio); static boolean_t vm_object_page_remove_write(vm_page_t p, int flags, boolean_t *clearobjflags); static void vm_object_qcollapse(vm_object_t object); static void vm_object_vndeallocate(vm_object_t object); /* * Virtual memory objects maintain the actual data * associated with allocated virtual memory. A given * page of memory exists within exactly one object. 
* * An object is only deallocated when all "references" * are given up. Only one "reference" to a given * region of an object should be writeable. * * Associated with each object is a list of all resident * memory pages belonging to that object; this list is * maintained by the "vm_page" module, and locked by the object's * lock. * * Each object also records a "pager" routine which is * used to retrieve (and store) pages to the proper backing * storage. In addition, objects may be backed by other * objects from which they were virtual-copied. * * The only items within the object structure which are * modified after time of creation are: * reference count locked by object's lock * pager routine locked by object's lock * */ struct object_q vm_object_list; struct mtx vm_object_list_mtx; /* lock for object list and count */ struct vm_object kernel_object_store; struct vm_object kmem_object_store; static SYSCTL_NODE(_vm_stats, OID_AUTO, object, CTLFLAG_RD, 0, "VM object stats"); static long object_collapses; SYSCTL_LONG(_vm_stats_object, OID_AUTO, collapses, CTLFLAG_RD, &object_collapses, 0, "VM object collapses"); static long object_bypasses; SYSCTL_LONG(_vm_stats_object, OID_AUTO, bypasses, CTLFLAG_RD, &object_bypasses, 0, "VM object bypasses"); static uma_zone_t obj_zone; static int vm_object_zinit(void *mem, int size, int flags); #ifdef INVARIANTS static void vm_object_zdtor(void *mem, int size, void *arg); static void vm_object_zdtor(void *mem, int size, void *arg) { vm_object_t object; object = (vm_object_t)mem; KASSERT(object->ref_count == 0, ("object %p ref_count = %d", object, object->ref_count)); KASSERT(TAILQ_EMPTY(&object->memq), ("object %p has resident pages in its memq", object)); KASSERT(vm_radix_is_empty(&object->rtree), ("object %p has resident pages in its trie", object)); #if VM_NRESERVLEVEL > 0 KASSERT(LIST_EMPTY(&object->rvq), ("object %p has reservations", object)); #endif KASSERT(object->paging_in_progress == 0, ("object %p paging_in_progress = %d", object, object->paging_in_progress)); KASSERT(object->resident_page_count == 0, ("object %p resident_page_count = %d", object, object->resident_page_count)); KASSERT(object->shadow_count == 0, ("object %p shadow_count = %d", object, object->shadow_count)); KASSERT(object->type == OBJT_DEAD, ("object %p has non-dead type %d", object, object->type)); } #endif static int vm_object_zinit(void *mem, int size, int flags) { vm_object_t object; object = (vm_object_t)mem; rw_init_flags(&object->lock, "vm object", RW_DUPOK | RW_NEW); /* These are true for any object that has been freed */ object->type = OBJT_DEAD; object->ref_count = 0; vm_radix_init(&object->rtree); object->paging_in_progress = 0; object->resident_page_count = 0; object->shadow_count = 0; + object->flags = OBJ_DEAD; mtx_lock(&vm_object_list_mtx); TAILQ_INSERT_TAIL(&vm_object_list, object, object_list); mtx_unlock(&vm_object_list_mtx); return (0); } static void _vm_object_allocate(objtype_t type, vm_pindex_t size, vm_object_t object) { TAILQ_INIT(&object->memq); LIST_INIT(&object->shadow_head); object->type = type; + if (type == OBJT_SWAP) + pctrie_init(&object->un_pager.swp.swp_blks); + + /* + * Ensure that swap_pager_swapoff() iteration over object_list + * sees up to date type and pctrie head if it observed + * non-dead object. 
+ */ + atomic_thread_fence_rel(); + switch (type) { case OBJT_DEAD: panic("_vm_object_allocate: can't create OBJT_DEAD"); case OBJT_DEFAULT: case OBJT_SWAP: object->flags = OBJ_ONEMAPPING; break; case OBJT_DEVICE: case OBJT_SG: object->flags = OBJ_FICTITIOUS | OBJ_UNMANAGED; break; case OBJT_MGTDEVICE: object->flags = OBJ_FICTITIOUS; break; case OBJT_PHYS: object->flags = OBJ_UNMANAGED; break; case OBJT_VNODE: object->flags = 0; break; default: panic("_vm_object_allocate: type %d is undefined", type); } object->size = size; object->generation = 1; object->ref_count = 1; object->memattr = VM_MEMATTR_DEFAULT; object->cred = NULL; object->charge = 0; object->handle = NULL; object->backing_object = NULL; object->backing_object_offset = (vm_ooffset_t) 0; #if VM_NRESERVLEVEL > 0 LIST_INIT(&object->rvq); #endif umtx_shm_object_init(object); } /* * vm_object_init: * * Initialize the VM objects module. */ void vm_object_init(void) { TAILQ_INIT(&vm_object_list); mtx_init(&vm_object_list_mtx, "vm object_list", NULL, MTX_DEF); rw_init(&kernel_object->lock, "kernel vm object"); _vm_object_allocate(OBJT_PHYS, atop(VM_MAX_KERNEL_ADDRESS - VM_MIN_KERNEL_ADDRESS), kernel_object); #if VM_NRESERVLEVEL > 0 kernel_object->flags |= OBJ_COLORED; kernel_object->pg_color = (u_short)atop(VM_MIN_KERNEL_ADDRESS); #endif rw_init(&kmem_object->lock, "kmem vm object"); _vm_object_allocate(OBJT_PHYS, atop(VM_MAX_KERNEL_ADDRESS - VM_MIN_KERNEL_ADDRESS), kmem_object); #if VM_NRESERVLEVEL > 0 kmem_object->flags |= OBJ_COLORED; kmem_object->pg_color = (u_short)atop(VM_MIN_KERNEL_ADDRESS); #endif /* * The lock portion of struct vm_object must be type stable due * to vm_pageout_fallback_object_lock locking a vm object * without holding any references to it. */ obj_zone = uma_zcreate("VM OBJECT", sizeof (struct vm_object), NULL, #ifdef INVARIANTS vm_object_zdtor, #else NULL, #endif vm_object_zinit, NULL, UMA_ALIGN_PTR, UMA_ZONE_NOFREE); vm_radix_zinit(); } void vm_object_clear_flag(vm_object_t object, u_short bits) { VM_OBJECT_ASSERT_WLOCKED(object); object->flags &= ~bits; } /* * Sets the default memory attribute for the specified object. Pages * that are allocated to this object are by default assigned this memory * attribute. * * Presently, this function must be called before any pages are allocated * to the object. In the future, this requirement may be relaxed for * "default" and "swap" objects. 
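The atomic_thread_fence_rel() added above pairs a release fence on the initialization side with an acquire on the lock-free reader side (swap_pager_swapoff()'s walk of object_list), so a reader that observes a non-dead object also observes its type and initialized pctrie head. A generic publish/consume sketch of that pairing, with purely illustrative names:

#include <sys/types.h>
#include <machine/atomic.h>

struct widget {
	void	*payload;
	int	 ready;		/* read without holding a lock */
};

static void
widget_publish(struct widget *w, void *p)
{
	w->payload = p;
	atomic_thread_fence_rel();	/* order payload store before ready */
	w->ready = 1;
}

static void *
widget_peek(struct widget *w)
{
	if (w->ready == 0)
		return (NULL);
	atomic_thread_fence_acq();	/* order ready load before payload */
	return (w->payload);
}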
*/ int vm_object_set_memattr(vm_object_t object, vm_memattr_t memattr) { VM_OBJECT_ASSERT_WLOCKED(object); switch (object->type) { case OBJT_DEFAULT: case OBJT_DEVICE: case OBJT_MGTDEVICE: case OBJT_PHYS: case OBJT_SG: case OBJT_SWAP: case OBJT_VNODE: if (!TAILQ_EMPTY(&object->memq)) return (KERN_FAILURE); break; case OBJT_DEAD: return (KERN_INVALID_ARGUMENT); default: panic("vm_object_set_memattr: object %p is of undefined type", object); } object->memattr = memattr; return (KERN_SUCCESS); } void vm_object_pip_add(vm_object_t object, short i) { VM_OBJECT_ASSERT_WLOCKED(object); object->paging_in_progress += i; } void vm_object_pip_subtract(vm_object_t object, short i) { VM_OBJECT_ASSERT_WLOCKED(object); object->paging_in_progress -= i; } void vm_object_pip_wakeup(vm_object_t object) { VM_OBJECT_ASSERT_WLOCKED(object); object->paging_in_progress--; if ((object->flags & OBJ_PIPWNT) && object->paging_in_progress == 0) { vm_object_clear_flag(object, OBJ_PIPWNT); wakeup(object); } } void vm_object_pip_wakeupn(vm_object_t object, short i) { VM_OBJECT_ASSERT_WLOCKED(object); if (i) object->paging_in_progress -= i; if ((object->flags & OBJ_PIPWNT) && object->paging_in_progress == 0) { vm_object_clear_flag(object, OBJ_PIPWNT); wakeup(object); } } void vm_object_pip_wait(vm_object_t object, char *waitid) { VM_OBJECT_ASSERT_WLOCKED(object); while (object->paging_in_progress) { object->flags |= OBJ_PIPWNT; VM_OBJECT_SLEEP(object, object, PVM, waitid, 0); } } /* * vm_object_allocate: * * Returns a new object with the given size. */ vm_object_t vm_object_allocate(objtype_t type, vm_pindex_t size) { vm_object_t object; object = (vm_object_t)uma_zalloc(obj_zone, M_WAITOK); _vm_object_allocate(type, size, object); return (object); } /* * vm_object_reference: * * Gets another reference to the given object. Note: OBJ_DEAD * objects can be referenced during final cleaning. */ void vm_object_reference(vm_object_t object) { if (object == NULL) return; VM_OBJECT_WLOCK(object); vm_object_reference_locked(object); VM_OBJECT_WUNLOCK(object); } /* * vm_object_reference_locked: * * Gets another reference to the given object. * * The object must be locked. */ void vm_object_reference_locked(vm_object_t object) { struct vnode *vp; VM_OBJECT_ASSERT_WLOCKED(object); object->ref_count++; if (object->type == OBJT_VNODE) { vp = object->handle; vref(vp); } } /* * Handle deallocating an object of type OBJT_VNODE. */ static void vm_object_vndeallocate(vm_object_t object) { struct vnode *vp = (struct vnode *) object->handle; VM_OBJECT_ASSERT_WLOCKED(object); KASSERT(object->type == OBJT_VNODE, ("vm_object_vndeallocate: not a vnode object")); KASSERT(vp != NULL, ("vm_object_vndeallocate: missing vp")); #ifdef INVARIANTS if (object->ref_count == 0) { vn_printf(vp, "vm_object_vndeallocate "); panic("vm_object_vndeallocate: bad object reference count"); } #endif if (!umtx_shm_vnobj_persistent && object->ref_count == 1) umtx_shm_object_terminated(object); /* * The test for text of vp vnode does not need a bypass to * reach right VV_TEXT there, since it is obtained from * object->handle. */ if (object->ref_count > 1 || (vp->v_vflag & VV_TEXT) == 0) { object->ref_count--; VM_OBJECT_WUNLOCK(object); /* vrele may need the vnode lock. 
*/ vrele(vp); } else { vhold(vp); VM_OBJECT_WUNLOCK(object); vn_lock(vp, LK_EXCLUSIVE | LK_RETRY); vdrop(vp); VM_OBJECT_WLOCK(object); object->ref_count--; if (object->type == OBJT_DEAD) { VM_OBJECT_WUNLOCK(object); VOP_UNLOCK(vp, 0); } else { if (object->ref_count == 0) VOP_UNSET_TEXT(vp); VM_OBJECT_WUNLOCK(object); vput(vp); } } } /* * vm_object_deallocate: * * Release a reference to the specified object, * gained either through a vm_object_allocate * or a vm_object_reference call. When all references * are gone, storage associated with this object * may be relinquished. * * No object may be locked. */ void vm_object_deallocate(vm_object_t object) { vm_object_t temp; struct vnode *vp; while (object != NULL) { VM_OBJECT_WLOCK(object); if (object->type == OBJT_VNODE) { vm_object_vndeallocate(object); return; } KASSERT(object->ref_count != 0, ("vm_object_deallocate: object deallocated too many times: %d", object->type)); /* * If the reference count goes to 0 we start calling * vm_object_terminate() on the object chain. * A ref count of 1 may be a special case depending on the * shadow count being 0 or 1. */ object->ref_count--; if (object->ref_count > 1) { VM_OBJECT_WUNLOCK(object); return; } else if (object->ref_count == 1) { if (object->type == OBJT_SWAP && (object->flags & OBJ_TMPFS) != 0) { vp = object->un_pager.swp.swp_tmpfs; vhold(vp); VM_OBJECT_WUNLOCK(object); vn_lock(vp, LK_EXCLUSIVE | LK_RETRY); VM_OBJECT_WLOCK(object); if (object->type == OBJT_DEAD || object->ref_count != 1) { VM_OBJECT_WUNLOCK(object); VOP_UNLOCK(vp, 0); vdrop(vp); return; } if ((object->flags & OBJ_TMPFS) != 0) VOP_UNSET_TEXT(vp); VOP_UNLOCK(vp, 0); vdrop(vp); } if (object->shadow_count == 0 && object->handle == NULL && (object->type == OBJT_DEFAULT || (object->type == OBJT_SWAP && (object->flags & OBJ_TMPFS_NODE) == 0))) { vm_object_set_flag(object, OBJ_ONEMAPPING); } else if ((object->shadow_count == 1) && (object->handle == NULL) && (object->type == OBJT_DEFAULT || object->type == OBJT_SWAP)) { vm_object_t robject; robject = LIST_FIRST(&object->shadow_head); KASSERT(robject != NULL, ("vm_object_deallocate: ref_count: %d, shadow_count: %d", object->ref_count, object->shadow_count)); KASSERT((robject->flags & OBJ_TMPFS_NODE) == 0, ("shadowed tmpfs v_object %p", object)); if (!VM_OBJECT_TRYWLOCK(robject)) { /* * Avoid a potential deadlock. */ object->ref_count++; VM_OBJECT_WUNLOCK(object); /* * More likely than not the thread * holding robject's lock has lower * priority than the current thread. * Let the lower priority thread run. */ pause("vmo_de", 1); continue; } /* * Collapse object into its shadow unless its * shadow is dead. In that case, object will * be deallocated by the thread that is * deallocating its shadow. 
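The VM_OBJECT_TRYWLOCK()/pause() dance in the shadow-collapse path above is the usual way to take a second lock whose ordering relative to the one already held is not safe: try it, and on failure back off completely and retry rather than risk a deadlock. A generic sketch of the idiom with mutexes (names illustrative):

#include <sys/param.h>
#include <sys/systm.h>
#include <sys/lock.h>
#include <sys/mutex.h>

static void
with_both_locks(struct mtx *a, struct mtx *b)
{
	for (;;) {
		mtx_lock(a);
		if (mtx_trylock(b) != 0)
			break;		/* acquired both, a then b */
		mtx_unlock(a);		/* back off completely */
		pause("lckbk", 1);	/* let the holder of b make progress */
	}
	/* ... work with both locks held ... */
	mtx_unlock(b);
	mtx_unlock(a);
}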
*/ if ((robject->flags & OBJ_DEAD) == 0 && (robject->handle == NULL) && (robject->type == OBJT_DEFAULT || robject->type == OBJT_SWAP)) { robject->ref_count++; retry: if (robject->paging_in_progress) { VM_OBJECT_WUNLOCK(object); vm_object_pip_wait(robject, "objde1"); temp = robject->backing_object; if (object == temp) { VM_OBJECT_WLOCK(object); goto retry; } } else if (object->paging_in_progress) { VM_OBJECT_WUNLOCK(robject); object->flags |= OBJ_PIPWNT; VM_OBJECT_SLEEP(object, object, PDROP | PVM, "objde2", 0); VM_OBJECT_WLOCK(robject); temp = robject->backing_object; if (object == temp) { VM_OBJECT_WLOCK(object); goto retry; } } else VM_OBJECT_WUNLOCK(object); if (robject->ref_count == 1) { robject->ref_count--; object = robject; goto doterm; } object = robject; vm_object_collapse(object); VM_OBJECT_WUNLOCK(object); continue; } VM_OBJECT_WUNLOCK(robject); } VM_OBJECT_WUNLOCK(object); return; } doterm: umtx_shm_object_terminated(object); temp = object->backing_object; if (temp != NULL) { KASSERT((object->flags & OBJ_TMPFS_NODE) == 0, ("shadowed tmpfs v_object 2 %p", object)); VM_OBJECT_WLOCK(temp); LIST_REMOVE(object, shadow_list); temp->shadow_count--; VM_OBJECT_WUNLOCK(temp); object->backing_object = NULL; } /* * Don't double-terminate, we could be in a termination * recursion due to the terminate having to sync data * to disk. */ if ((object->flags & OBJ_DEAD) == 0) vm_object_terminate(object); else VM_OBJECT_WUNLOCK(object); object = temp; } } /* * vm_object_destroy removes the object from the global object list * and frees the space for the object. */ void vm_object_destroy(vm_object_t object) { /* * Release the allocation charge. */ if (object->cred != NULL) { swap_release_by_cred(object->charge, object->cred); object->charge = 0; crfree(object->cred); object->cred = NULL; } /* * Free the space for the object. */ uma_zfree(obj_zone, object); } /* * vm_object_terminate_pages removes any remaining pageable pages * from the object and resets the object to an empty state. */ static void vm_object_terminate_pages(vm_object_t object) { vm_page_t p, p_next; VM_OBJECT_ASSERT_WLOCKED(object); /* * Free any remaining pageable pages. This also removes them from the * paging queues. However, don't free wired pages, just remove them * from the object. Rather than incrementally removing each page from * the object, the page and object are reset to any empty state. */ TAILQ_FOREACH_SAFE(p, &object->memq, listq, p_next) { vm_page_assert_unbusied(p); vm_page_lock(p); /* * Optimize the page's removal from the object by resetting * its "object" field. Specifically, if the page is not * wired, then the effect of this assignment is that * vm_page_free()'s call to vm_page_remove() will return * immediately without modifying the page or the object. */ p->object = NULL; if (p->wire_count == 0) { vm_page_free(p); VM_CNT_INC(v_pfree); } vm_page_unlock(p); } /* * If the object contained any pages, then reset it to an empty state. * None of the object's fields, including "resident_page_count", were * modified by the preceding loop. */ if (object->resident_page_count != 0) { vm_radix_reclaim_allnodes(&object->rtree); TAILQ_INIT(&object->memq); object->resident_page_count = 0; if (object->type == OBJT_VNODE) vdrop(object->handle); } } /* * vm_object_terminate actually destroys the specified object, freeing * up all previously used resources. * * The object must be locked. * This routine may block. */ void vm_object_terminate(vm_object_t object) { VM_OBJECT_ASSERT_WLOCKED(object); /* * Make sure no one uses us. 
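vm_object_terminate_pages() below frees pages while walking the object's memq, which is only safe with the _SAFE iterator variants that latch the next element before the loop body runs. A tiny self-contained illustration of the same idiom with a hypothetical item queue:

#include <sys/param.h>
#include <sys/malloc.h>
#include <sys/queue.h>

struct item {
	TAILQ_ENTRY(item) link;
	int value;
};
TAILQ_HEAD(itemq, item);

static void
itemq_drain(struct itemq *q)
{
	struct item *it, *tmp;

	/* "tmp" keeps the walk valid while "it" is removed and freed. */
	TAILQ_FOREACH_SAFE(it, q, link, tmp) {
		TAILQ_REMOVE(q, it, link);
		free(it, M_TEMP);
	}
}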
*/ vm_object_set_flag(object, OBJ_DEAD); /* * wait for the pageout daemon to be done with the object */ vm_object_pip_wait(object, "objtrm"); KASSERT(!object->paging_in_progress, ("vm_object_terminate: pageout in progress")); /* * Clean and free the pages, as appropriate. All references to the * object are gone, so we don't need to lock it. */ if (object->type == OBJT_VNODE) { struct vnode *vp = (struct vnode *)object->handle; /* * Clean pages and flush buffers. */ vm_object_page_clean(object, 0, 0, OBJPC_SYNC); VM_OBJECT_WUNLOCK(object); vinvalbuf(vp, V_SAVE, 0, 0); BO_LOCK(&vp->v_bufobj); vp->v_bufobj.bo_flag |= BO_DEAD; BO_UNLOCK(&vp->v_bufobj); VM_OBJECT_WLOCK(object); } KASSERT(object->ref_count == 0, ("vm_object_terminate: object with references, ref_count=%d", object->ref_count)); if ((object->flags & OBJ_PG_DTOR) == 0) vm_object_terminate_pages(object); #if VM_NRESERVLEVEL > 0 if (__predict_false(!LIST_EMPTY(&object->rvq))) vm_reserv_break_all(object); #endif KASSERT(object->cred == NULL || object->type == OBJT_DEFAULT || object->type == OBJT_SWAP, ("%s: non-swap obj %p has cred", __func__, object)); /* * Let the pager know object is dead. */ vm_pager_deallocate(object); VM_OBJECT_WUNLOCK(object); vm_object_destroy(object); } /* * Make the page read-only so that we can clear the object flags. However, if * this is a nosync mmap then the object is likely to stay dirty so do not * mess with the page and do not clear the object flags. Returns TRUE if the * page should be flushed, and FALSE otherwise. */ static boolean_t vm_object_page_remove_write(vm_page_t p, int flags, boolean_t *clearobjflags) { /* * If we have been asked to skip nosync pages and this is a * nosync page, skip it. Note that the object flags were not * cleared in this case so we do not have to set them. */ if ((flags & OBJPC_NOSYNC) != 0 && (p->oflags & VPO_NOSYNC) != 0) { *clearobjflags = FALSE; return (FALSE); } else { pmap_remove_write(p); return (p->dirty != 0); } } /* * vm_object_page_clean * * Clean all dirty pages in the specified range of object. Leaves page * on whatever queue it is currently on. If NOSYNC is set then do not * write out pages with VPO_NOSYNC set (originally comes from MAP_NOSYNC), * leaving the object dirty. * * When stuffing pages asynchronously, allow clustering. XXX we need a * synchronous clustering mode implementation. * * Odd semantics: if start == end, we clean everything. * * The object must be locked. * * Returns FALSE if some page from the range was not written, as * reported by the pager, and TRUE otherwise. */ boolean_t vm_object_page_clean(vm_object_t object, vm_ooffset_t start, vm_ooffset_t end, int flags) { vm_page_t np, p; vm_pindex_t pi, tend, tstart; int curgeneration, n, pagerflags; boolean_t clearobjflags, eio, res; VM_OBJECT_ASSERT_WLOCKED(object); /* * The OBJ_MIGHTBEDIRTY flag is only set for OBJT_VNODE * objects. The check below prevents the function from * operating on non-vnode objects. */ if ((object->flags & OBJ_MIGHTBEDIRTY) == 0 || object->resident_page_count == 0) return (TRUE); pagerflags = (flags & (OBJPC_SYNC | OBJPC_INVAL)) != 0 ? VM_PAGER_PUT_SYNC : VM_PAGER_CLUSTER_OK; pagerflags |= (flags & OBJPC_INVAL) != 0 ? VM_PAGER_PUT_INVAL : 0; tstart = OFF_TO_IDX(start); tend = (end == 0) ? 
object->size : OFF_TO_IDX(end + PAGE_MASK); clearobjflags = tstart == 0 && tend >= object->size; res = TRUE; rescan: curgeneration = object->generation; for (p = vm_page_find_least(object, tstart); p != NULL; p = np) { pi = p->pindex; if (pi >= tend) break; np = TAILQ_NEXT(p, listq); if (p->valid == 0) continue; if (vm_page_sleep_if_busy(p, "vpcwai")) { if (object->generation != curgeneration) { if ((flags & OBJPC_SYNC) != 0) goto rescan; else clearobjflags = FALSE; } np = vm_page_find_least(object, pi); continue; } if (!vm_object_page_remove_write(p, flags, &clearobjflags)) continue; n = vm_object_page_collect_flush(object, p, pagerflags, flags, &clearobjflags, &eio); if (eio) { res = FALSE; clearobjflags = FALSE; } if (object->generation != curgeneration) { if ((flags & OBJPC_SYNC) != 0) goto rescan; else clearobjflags = FALSE; } /* * If the VOP_PUTPAGES() did a truncated write, so * that even the first page of the run is not fully * written, vm_pageout_flush() returns 0 as the run * length. Since the condition that caused truncated * write may be permanent, e.g. exhausted free space, * accepting n == 0 would cause an infinite loop. * * Forwarding the iterator leaves the unwritten page * behind, but there is not much we can do there if * filesystem refuses to write it. */ if (n == 0) { n = 1; clearobjflags = FALSE; } np = vm_page_find_least(object, pi + n); } #if 0 VOP_FSYNC(vp, (pagerflags & VM_PAGER_PUT_SYNC) ? MNT_WAIT : 0); #endif if (clearobjflags) vm_object_clear_flag(object, OBJ_MIGHTBEDIRTY); return (res); } static int vm_object_page_collect_flush(vm_object_t object, vm_page_t p, int pagerflags, int flags, boolean_t *clearobjflags, boolean_t *eio) { vm_page_t ma[vm_pageout_page_count], p_first, tp; int count, i, mreq, runlen; vm_page_lock_assert(p, MA_NOTOWNED); VM_OBJECT_ASSERT_WLOCKED(object); count = 1; mreq = 0; for (tp = p; count < vm_pageout_page_count; count++) { tp = vm_page_next(tp); if (tp == NULL || vm_page_busied(tp)) break; if (!vm_object_page_remove_write(tp, flags, clearobjflags)) break; } for (p_first = p; count < vm_pageout_page_count; count++) { tp = vm_page_prev(p_first); if (tp == NULL || vm_page_busied(tp)) break; if (!vm_object_page_remove_write(tp, flags, clearobjflags)) break; p_first = tp; mreq++; } for (tp = p_first, i = 0; i < count; tp = TAILQ_NEXT(tp, listq), i++) ma[i] = tp; vm_pageout_flush(ma, count, pagerflags, mreq, &runlen, eio); return (runlen); } /* * Note that there is absolutely no sense in writing out * anonymous objects, so we track down the vnode object * to write out. * We invalidate (remove) all pages from the address space * for semantic correctness. * * If the backing object is a device object with unmanaged pages, then any * mappings to the specified range of pages must be removed before this * function is called. * * Note: certain anonymous maps, such as MAP_NOSYNC maps, * may start out with a NULL object. 
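A quick worked example of the index arithmetic at the top of vm_object_page_clean(), assuming 4 KB pages (PAGE_MASK = 0xfff): for start = 0x1800 and end = 0x5801, tstart = OFF_TO_IDX(0x1800) = 1 and tend = OFF_TO_IDX(0x5801 + 0xfff) = OFF_TO_IDX(0x6800) = 6, so pages 1 through 5 are scanned. Truncating start downward keeps the page containing the first byte, while rounding end up keeps the partially covered last page; without the PAGE_MASK term, OFF_TO_IDX(0x5801) = 5 and the dirty tail page 5 would be skipped.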
*/ boolean_t vm_object_sync(vm_object_t object, vm_ooffset_t offset, vm_size_t size, boolean_t syncio, boolean_t invalidate) { vm_object_t backing_object; struct vnode *vp; struct mount *mp; int error, flags, fsync_after; boolean_t res; if (object == NULL) return (TRUE); res = TRUE; error = 0; VM_OBJECT_WLOCK(object); while ((backing_object = object->backing_object) != NULL) { VM_OBJECT_WLOCK(backing_object); offset += object->backing_object_offset; VM_OBJECT_WUNLOCK(object); object = backing_object; if (object->size < OFF_TO_IDX(offset + size)) size = IDX_TO_OFF(object->size) - offset; } /* * Flush pages if writing is allowed, invalidate them * if invalidation requested. Pages undergoing I/O * will be ignored by vm_object_page_remove(). * * We cannot lock the vnode and then wait for paging * to complete without deadlocking against vm_fault. * Instead we simply call vm_object_page_remove() and * allow it to block internally on a page-by-page * basis when it encounters pages undergoing async * I/O. */ if (object->type == OBJT_VNODE && (object->flags & OBJ_MIGHTBEDIRTY) != 0) { vp = object->handle; VM_OBJECT_WUNLOCK(object); (void) vn_start_write(vp, &mp, V_WAIT); vn_lock(vp, LK_EXCLUSIVE | LK_RETRY); if (syncio && !invalidate && offset == 0 && atop(size) == object->size) { /* * If syncing the whole mapping of the file, * it is faster to schedule all the writes in * async mode, also allowing the clustering, * and then wait for i/o to complete. */ flags = 0; fsync_after = TRUE; } else { flags = (syncio || invalidate) ? OBJPC_SYNC : 0; flags |= invalidate ? (OBJPC_SYNC | OBJPC_INVAL) : 0; fsync_after = FALSE; } VM_OBJECT_WLOCK(object); res = vm_object_page_clean(object, offset, offset + size, flags); VM_OBJECT_WUNLOCK(object); if (fsync_after) error = VOP_FSYNC(vp, MNT_WAIT, curthread); VOP_UNLOCK(vp, 0); vn_finished_write(mp); if (error != 0) res = FALSE; VM_OBJECT_WLOCK(object); } if ((object->type == OBJT_VNODE || object->type == OBJT_DEVICE) && invalidate) { if (object->type == OBJT_DEVICE) /* * The option OBJPR_NOTMAPPED must be passed here * because vm_object_page_remove() cannot remove * unmanaged mappings. */ flags = OBJPR_NOTMAPPED; else if (old_msync) flags = 0; else flags = OBJPR_CLEANONLY; vm_object_page_remove(object, OFF_TO_IDX(offset), OFF_TO_IDX(offset + size + PAGE_MASK), flags); } VM_OBJECT_WUNLOCK(object); return (res); } /* * Determine whether the given advice can be applied to the object. Advice is * not applied to unmanaged pages since they never belong to page queues, and * since MADV_FREE is destructive, it can apply only to anonymous pages that * have been mapped at most once. */ static bool vm_object_advice_applies(vm_object_t object, int advice) { if ((object->flags & OBJ_UNMANAGED) != 0) return (false); if (advice != MADV_FREE) return (true); return ((object->type == OBJT_DEFAULT || object->type == OBJT_SWAP) && (object->flags & OBJ_ONEMAPPING) != 0); } static void vm_object_madvise_freespace(vm_object_t object, int advice, vm_pindex_t pindex, vm_size_t size) { if (advice == MADV_FREE && object->type == OBJT_SWAP) swap_pager_freespace(object, pindex, size); } /* * vm_object_madvise: * * Implements the madvise function at the object/page level. * * MADV_WILLNEED (any object) * * Activate the specified pages if they are resident. * * MADV_DONTNEED (any object) * * Deactivate the specified pages if they are resident. * * MADV_FREE (OBJT_DEFAULT/OBJT_SWAP objects, * OBJ_ONEMAPPING only) * * Deactivate and clean the specified pages if they are * resident. 
This permits the process to reuse the pages * without faulting or the kernel to reclaim the pages * without I/O. */ void vm_object_madvise(vm_object_t object, vm_pindex_t pindex, vm_pindex_t end, int advice) { vm_pindex_t tpindex; vm_object_t backing_object, tobject; vm_page_t m, tm; if (object == NULL) return; relookup: VM_OBJECT_WLOCK(object); if (!vm_object_advice_applies(object, advice)) { VM_OBJECT_WUNLOCK(object); return; } for (m = vm_page_find_least(object, pindex); pindex < end; pindex++) { tobject = object; /* * If the next page isn't resident in the top-level object, we * need to search the shadow chain. When applying MADV_FREE, we * take care to release any swap space used to store * non-resident pages. */ if (m == NULL || pindex < m->pindex) { /* * Optimize a common case: if the top-level object has * no backing object, we can skip over the non-resident * range in constant time. */ if (object->backing_object == NULL) { tpindex = (m != NULL && m->pindex < end) ? m->pindex : end; vm_object_madvise_freespace(object, advice, pindex, tpindex - pindex); if ((pindex = tpindex) == end) break; goto next_page; } tpindex = pindex; do { vm_object_madvise_freespace(tobject, advice, tpindex, 1); /* * Prepare to search the next object in the * chain. */ backing_object = tobject->backing_object; if (backing_object == NULL) goto next_pindex; VM_OBJECT_WLOCK(backing_object); tpindex += OFF_TO_IDX(tobject->backing_object_offset); if (tobject != object) VM_OBJECT_WUNLOCK(tobject); tobject = backing_object; if (!vm_object_advice_applies(tobject, advice)) goto next_pindex; } while ((tm = vm_page_lookup(tobject, tpindex)) == NULL); } else { next_page: tm = m; m = TAILQ_NEXT(m, listq); } /* * If the page is not in a normal state, skip it. */ if (tm->valid != VM_PAGE_BITS_ALL) goto next_pindex; vm_page_lock(tm); if (tm->hold_count != 0 || tm->wire_count != 0) { vm_page_unlock(tm); goto next_pindex; } KASSERT((tm->flags & PG_FICTITIOUS) == 0, ("vm_object_madvise: page %p is fictitious", tm)); KASSERT((tm->oflags & VPO_UNMANAGED) == 0, ("vm_object_madvise: page %p is not managed", tm)); if (vm_page_busied(tm)) { if (object != tobject) VM_OBJECT_WUNLOCK(tobject); VM_OBJECT_WUNLOCK(object); if (advice == MADV_WILLNEED) { /* * Reference the page before unlocking and * sleeping so that the page daemon is less * likely to reclaim it. */ vm_page_aflag_set(tm, PGA_REFERENCED); } vm_page_busy_sleep(tm, "madvpo", false); goto relookup; } vm_page_advise(tm, advice); vm_page_unlock(tm); vm_object_madvise_freespace(tobject, advice, tm->pindex, 1); next_pindex: if (tobject != object) VM_OBJECT_WUNLOCK(tobject); } VM_OBJECT_WUNLOCK(object); } /* * vm_object_shadow: * * Create a new object which is backed by the * specified existing object range. The source * object reference is deallocated. * * The new object and offset into that object * are returned in the source parameters. */ void vm_object_shadow( vm_object_t *object, /* IN/OUT */ vm_ooffset_t *offset, /* IN/OUT */ vm_size_t length) { vm_object_t source; vm_object_t result; source = *object; /* * Don't create the new object if the old object isn't shared. */ if (source != NULL) { VM_OBJECT_WLOCK(source); if (source->ref_count == 1 && source->handle == NULL && (source->type == OBJT_DEFAULT || source->type == OBJT_SWAP)) { VM_OBJECT_WUNLOCK(source); return; } VM_OBJECT_WUNLOCK(source); } /* * Allocate a new object with the given length. 
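From userland, the MADV_FREE case handled above is reached through madvise(2) on anonymous, privately mapped memory; it tells the kernel the contents are disposable, so both the resident pages and any swap space backing them can be reclaimed without write-back. A minimal, hypothetical usage sketch:

#include <sys/mman.h>
#include <err.h>
#include <string.h>

int
main(void)
{
	size_t len = 16 * 4096;
	char *p;

	p = mmap(NULL, len, PROT_READ | PROT_WRITE,
	    MAP_ANON | MAP_PRIVATE, -1, 0);
	if (p == MAP_FAILED)
		err(1, "mmap");
	memset(p, 0xa5, len);			/* dirty the pages */
	if (madvise(p, len, MADV_FREE) != 0)	/* contents now disposable */
		err(1, "madvise");
	/* Later reads may return zero-filled pages or the old data. */
	return (0);
}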
*/ result = vm_object_allocate(OBJT_DEFAULT, atop(length)); /* * The new object shadows the source object, adding a reference to it. * Our caller changes his reference to point to the new object, * removing a reference to the source object. Net result: no change * of reference count. * * Try to optimize the result object's page color when shadowing * in order to maintain page coloring consistency in the combined * shadowed object. */ result->backing_object = source; /* * Store the offset into the source object, and fix up the offset into * the new object. */ result->backing_object_offset = *offset; if (source != NULL) { VM_OBJECT_WLOCK(source); LIST_INSERT_HEAD(&source->shadow_head, result, shadow_list); source->shadow_count++; #if VM_NRESERVLEVEL > 0 result->flags |= source->flags & OBJ_COLORED; result->pg_color = (source->pg_color + OFF_TO_IDX(*offset)) & ((1 << (VM_NFREEORDER - 1)) - 1); #endif VM_OBJECT_WUNLOCK(source); } /* * Return the new things */ *offset = 0; *object = result; } /* * vm_object_split: * * Split the pages in a map entry into a new object. This affords * easier removal of unused pages, and keeps object inheritance from * being a negative impact on memory usage. */ void vm_object_split(vm_map_entry_t entry) { vm_page_t m, m_next; vm_object_t orig_object, new_object, source; vm_pindex_t idx, offidxstart; vm_size_t size; orig_object = entry->object.vm_object; if (orig_object->type != OBJT_DEFAULT && orig_object->type != OBJT_SWAP) return; if (orig_object->ref_count <= 1) return; VM_OBJECT_WUNLOCK(orig_object); offidxstart = OFF_TO_IDX(entry->offset); size = atop(entry->end - entry->start); /* * If swap_pager_copy() is later called, it will convert new_object * into a swap object. */ new_object = vm_object_allocate(OBJT_DEFAULT, size); /* * At this point, the new object is still private, so the order in * which the original and new objects are locked does not matter. */ VM_OBJECT_WLOCK(new_object); VM_OBJECT_WLOCK(orig_object); source = orig_object->backing_object; if (source != NULL) { VM_OBJECT_WLOCK(source); if ((source->flags & OBJ_DEAD) != 0) { VM_OBJECT_WUNLOCK(source); VM_OBJECT_WUNLOCK(orig_object); VM_OBJECT_WUNLOCK(new_object); vm_object_deallocate(new_object); VM_OBJECT_WLOCK(orig_object); return; } LIST_INSERT_HEAD(&source->shadow_head, new_object, shadow_list); source->shadow_count++; vm_object_reference_locked(source); /* for new_object */ vm_object_clear_flag(source, OBJ_ONEMAPPING); VM_OBJECT_WUNLOCK(source); new_object->backing_object_offset = orig_object->backing_object_offset + entry->offset; new_object->backing_object = source; } if (orig_object->cred != NULL) { new_object->cred = orig_object->cred; crhold(orig_object->cred); new_object->charge = ptoa(size); KASSERT(orig_object->charge >= ptoa(size), ("orig_object->charge < 0")); orig_object->charge -= ptoa(size); } retry: m = vm_page_find_least(orig_object, offidxstart); for (; m != NULL && (idx = m->pindex - offidxstart) < size; m = m_next) { m_next = TAILQ_NEXT(m, listq); /* * We must wait for pending I/O to complete before we can * rename the page. * * We do not have to VM_PROT_NONE the page as mappings should * not be changed by this operation. */ if (vm_page_busied(m)) { VM_OBJECT_WUNLOCK(new_object); vm_page_lock(m); VM_OBJECT_WUNLOCK(orig_object); vm_page_busy_sleep(m, "spltwt", false); VM_OBJECT_WLOCK(orig_object); VM_OBJECT_WLOCK(new_object); goto retry; } /* vm_page_rename() will dirty the page. 
*/ if (vm_page_rename(m, new_object, idx)) { VM_OBJECT_WUNLOCK(new_object); VM_OBJECT_WUNLOCK(orig_object); VM_WAIT; VM_OBJECT_WLOCK(orig_object); VM_OBJECT_WLOCK(new_object); goto retry; } #if VM_NRESERVLEVEL > 0 /* * If some of the reservation's allocated pages remain with * the original object, then transferring the reservation to * the new object is neither particularly beneficial nor * particularly harmful as compared to leaving the reservation * with the original object. If, however, all of the * reservation's allocated pages are transferred to the new * object, then transferring the reservation is typically * beneficial. Determining which of these two cases applies * would be more costly than unconditionally renaming the * reservation. */ vm_reserv_rename(m, new_object, orig_object, offidxstart); #endif if (orig_object->type == OBJT_SWAP) vm_page_xbusy(m); } if (orig_object->type == OBJT_SWAP) { /* * swap_pager_copy() can sleep, in which case the orig_object's * and new_object's locks are released and reacquired. */ swap_pager_copy(orig_object, new_object, offidxstart, 0); TAILQ_FOREACH(m, &new_object->memq, listq) vm_page_xunbusy(m); } VM_OBJECT_WUNLOCK(orig_object); VM_OBJECT_WUNLOCK(new_object); entry->object.vm_object = new_object; entry->offset = 0LL; vm_object_deallocate(orig_object); VM_OBJECT_WLOCK(new_object); } #define OBSC_COLLAPSE_NOWAIT 0x0002 #define OBSC_COLLAPSE_WAIT 0x0004 static vm_page_t vm_object_collapse_scan_wait(vm_object_t object, vm_page_t p, vm_page_t next, int op) { vm_object_t backing_object; VM_OBJECT_ASSERT_WLOCKED(object); backing_object = object->backing_object; VM_OBJECT_ASSERT_WLOCKED(backing_object); KASSERT(p == NULL || vm_page_busied(p), ("unbusy page %p", p)); KASSERT(p == NULL || p->object == object || p->object == backing_object, ("invalid ownership %p %p %p", p, object, backing_object)); if ((op & OBSC_COLLAPSE_NOWAIT) != 0) return (next); if (p != NULL) vm_page_lock(p); VM_OBJECT_WUNLOCK(object); VM_OBJECT_WUNLOCK(backing_object); if (p == NULL) VM_WAIT; else vm_page_busy_sleep(p, "vmocol", false); VM_OBJECT_WLOCK(object); VM_OBJECT_WLOCK(backing_object); return (TAILQ_FIRST(&backing_object->memq)); } static bool vm_object_scan_all_shadowed(vm_object_t object) { vm_object_t backing_object; vm_page_t p, pp; vm_pindex_t backing_offset_index, new_pindex, pi, ps; VM_OBJECT_ASSERT_WLOCKED(object); VM_OBJECT_ASSERT_WLOCKED(object->backing_object); backing_object = object->backing_object; if (backing_object->type != OBJT_DEFAULT && backing_object->type != OBJT_SWAP) return (false); pi = backing_offset_index = OFF_TO_IDX(object->backing_object_offset); p = vm_page_find_least(backing_object, pi); ps = swap_pager_find_least(backing_object, pi); /* * Only check pages inside the parent object's range and * inside the parent object's mapping of the backing object. */ for (;; pi++) { if (p != NULL && p->pindex < pi) p = TAILQ_NEXT(p, listq); if (ps < pi) ps = swap_pager_find_least(backing_object, pi); if (p == NULL && ps >= backing_object->size) break; else if (p == NULL) pi = ps; else pi = MIN(p->pindex, ps); new_pindex = pi - backing_offset_index; if (new_pindex >= object->size) break; /* * See if the parent has the page or if the parent's object * pager has the page. If the parent has the page but the page * is not valid, the parent's object pager must have the page. * * If this fails, the parent does not completely shadow the * object and we might as well give up now. 
*/ pp = vm_page_lookup(object, new_pindex); if ((pp == NULL || pp->valid == 0) && !vm_pager_has_page(object, new_pindex, NULL, NULL)) return (false); } return (true); } static bool vm_object_collapse_scan(vm_object_t object, int op) { vm_object_t backing_object; vm_page_t next, p, pp; vm_pindex_t backing_offset_index, new_pindex; VM_OBJECT_ASSERT_WLOCKED(object); VM_OBJECT_ASSERT_WLOCKED(object->backing_object); backing_object = object->backing_object; backing_offset_index = OFF_TO_IDX(object->backing_object_offset); /* * Initial conditions */ if ((op & OBSC_COLLAPSE_WAIT) != 0) vm_object_set_flag(backing_object, OBJ_DEAD); /* * Our scan */ for (p = TAILQ_FIRST(&backing_object->memq); p != NULL; p = next) { next = TAILQ_NEXT(p, listq); new_pindex = p->pindex - backing_offset_index; /* * Check for busy page */ if (vm_page_busied(p)) { next = vm_object_collapse_scan_wait(object, p, next, op); continue; } KASSERT(p->object == backing_object, ("vm_object_collapse_scan: object mismatch")); if (p->pindex < backing_offset_index || new_pindex >= object->size) { if (backing_object->type == OBJT_SWAP) swap_pager_freespace(backing_object, p->pindex, 1); /* * Page is out of the parent object's range, we can * simply destroy it. */ vm_page_lock(p); KASSERT(!pmap_page_is_mapped(p), ("freeing mapped page %p", p)); if (p->wire_count == 0) vm_page_free(p); else vm_page_remove(p); vm_page_unlock(p); continue; } pp = vm_page_lookup(object, new_pindex); if (pp != NULL && vm_page_busied(pp)) { /* * The page in the parent is busy and possibly not * (yet) valid. Until its state is finalized by the * busy bit owner, we can't tell whether it shadows the * original page. Therefore, we must either skip it * and the original (backing_object) page or wait for * its state to be finalized. * * This is due to a race with vm_fault() where we must * unbusy the original (backing_obj) page before we can * (re)lock the parent. Hence we can get here. */ next = vm_object_collapse_scan_wait(object, pp, next, op); continue; } KASSERT(pp == NULL || pp->valid != 0, ("unbusy invalid page %p", pp)); if (pp != NULL || vm_pager_has_page(object, new_pindex, NULL, NULL)) { /* * The page already exists in the parent OR swap exists * for this location in the parent. Leave the parent's * page alone. Destroy the original page from the * backing object. */ if (backing_object->type == OBJT_SWAP) swap_pager_freespace(backing_object, p->pindex, 1); vm_page_lock(p); KASSERT(!pmap_page_is_mapped(p), ("freeing mapped page %p", p)); if (p->wire_count == 0) vm_page_free(p); else vm_page_remove(p); vm_page_unlock(p); continue; } /* * Page does not exist in parent, rename the page from the * backing object to the main object. * * If the page was mapped to a process, it can remain mapped * through the rename. vm_page_rename() will dirty the page. */ if (vm_page_rename(p, object, new_pindex)) { next = vm_object_collapse_scan_wait(object, NULL, next, op); continue; } /* Use the old pindex to free the right page. */ if (backing_object->type == OBJT_SWAP) swap_pager_freespace(backing_object, new_pindex + backing_offset_index, 1); #if VM_NRESERVLEVEL > 0 /* * Rename the reservation. */ vm_reserv_rename(p, object, backing_object, backing_offset_index); #endif } return (true); } /* * this version of collapse allows the operation to occur earlier and * when paging_in_progress is true for an object... This is not a complete * operation, but should plug 99.9% of the rest of the leaks. 
*/ static void vm_object_qcollapse(vm_object_t object) { vm_object_t backing_object = object->backing_object; VM_OBJECT_ASSERT_WLOCKED(object); VM_OBJECT_ASSERT_WLOCKED(backing_object); if (backing_object->ref_count != 1) return; vm_object_collapse_scan(object, OBSC_COLLAPSE_NOWAIT); } /* * vm_object_collapse: * * Collapse an object with the object backing it. * Pages in the backing object are moved into the * parent, and the backing object is deallocated. */ void vm_object_collapse(vm_object_t object) { vm_object_t backing_object, new_backing_object; VM_OBJECT_ASSERT_WLOCKED(object); while (TRUE) { /* * Verify that the conditions are right for collapse: * * The object exists and the backing object exists. */ if ((backing_object = object->backing_object) == NULL) break; /* * we check the backing object first, because it is most likely * not collapsable. */ VM_OBJECT_WLOCK(backing_object); if (backing_object->handle != NULL || (backing_object->type != OBJT_DEFAULT && backing_object->type != OBJT_SWAP) || (backing_object->flags & OBJ_DEAD) || object->handle != NULL || (object->type != OBJT_DEFAULT && object->type != OBJT_SWAP) || (object->flags & OBJ_DEAD)) { VM_OBJECT_WUNLOCK(backing_object); break; } if (object->paging_in_progress != 0 || backing_object->paging_in_progress != 0) { vm_object_qcollapse(object); VM_OBJECT_WUNLOCK(backing_object); break; } /* * We know that we can either collapse the backing object (if * the parent is the only reference to it) or (perhaps) have * the parent bypass the object if the parent happens to shadow * all the resident pages in the entire backing object. * * This is ignoring pager-backed pages such as swap pages. * vm_object_collapse_scan fails the shadowing test in this * case. */ if (backing_object->ref_count == 1) { vm_object_pip_add(object, 1); vm_object_pip_add(backing_object, 1); /* * If there is exactly one reference to the backing * object, we can collapse it into the parent. */ vm_object_collapse_scan(object, OBSC_COLLAPSE_WAIT); #if VM_NRESERVLEVEL > 0 /* * Break any reservations from backing_object. */ if (__predict_false(!LIST_EMPTY(&backing_object->rvq))) vm_reserv_break_all(backing_object); #endif /* * Move the pager from backing_object to object. */ if (backing_object->type == OBJT_SWAP) { /* * swap_pager_copy() can sleep, in which case * the backing_object's and object's locks are * released and reacquired. * Since swap_pager_copy() is being asked to * destroy the source, it will change the * backing_object's type to OBJT_DEFAULT. */ swap_pager_copy( backing_object, object, OFF_TO_IDX(object->backing_object_offset), TRUE); } /* * Object now shadows whatever backing_object did. * Note that the reference to * backing_object->backing_object moves from within * backing_object to within object. */ LIST_REMOVE(object, shadow_list); backing_object->shadow_count--; if (backing_object->backing_object) { VM_OBJECT_WLOCK(backing_object->backing_object); LIST_REMOVE(backing_object, shadow_list); LIST_INSERT_HEAD( &backing_object->backing_object->shadow_head, object, shadow_list); /* * The shadow_count has not changed. */ VM_OBJECT_WUNLOCK(backing_object->backing_object); } object->backing_object = backing_object->backing_object; object->backing_object_offset += backing_object->backing_object_offset; /* * Discard backing_object. * * Since the backing object has no pages, no pager left, * and no object references within it, all that is * necessary is to dispose of it. 
*/ KASSERT(backing_object->ref_count == 1, ( "backing_object %p was somehow re-referenced during collapse!", backing_object)); vm_object_pip_wakeup(backing_object); backing_object->type = OBJT_DEAD; backing_object->ref_count = 0; VM_OBJECT_WUNLOCK(backing_object); vm_object_destroy(backing_object); vm_object_pip_wakeup(object); object_collapses++; } else { /* * If we do not entirely shadow the backing object, * there is nothing we can do so we give up. */ if (object->resident_page_count != object->size && !vm_object_scan_all_shadowed(object)) { VM_OBJECT_WUNLOCK(backing_object); break; } /* * Make the parent shadow the next object in the * chain. Deallocating backing_object will not remove * it, since its reference count is at least 2. */ LIST_REMOVE(object, shadow_list); backing_object->shadow_count--; new_backing_object = backing_object->backing_object; if ((object->backing_object = new_backing_object) != NULL) { VM_OBJECT_WLOCK(new_backing_object); LIST_INSERT_HEAD( &new_backing_object->shadow_head, object, shadow_list ); new_backing_object->shadow_count++; vm_object_reference_locked(new_backing_object); VM_OBJECT_WUNLOCK(new_backing_object); object->backing_object_offset += backing_object->backing_object_offset; } /* * Drop the reference count on backing_object. Since * its ref_count was at least 2, it will not vanish. */ backing_object->ref_count--; VM_OBJECT_WUNLOCK(backing_object); object_bypasses++; } /* * Try again with this object's new backing object. */ } } /* * vm_object_page_remove: * * For the given object, either frees or invalidates each of the * specified pages. In general, a page is freed. However, if a page is * wired for any reason other than the existence of a managed, wired * mapping, then it may be invalidated but not removed from the object. * Pages are specified by the given range ["start", "end") and the option * OBJPR_CLEANONLY. As a special case, if "end" is zero, then the range * extends from "start" to the end of the object. If the option * OBJPR_CLEANONLY is specified, then only the non-dirty pages within the * specified range are affected. If the option OBJPR_NOTMAPPED is * specified, then the pages within the specified range must have no * mappings. Otherwise, if this option is not specified, any mappings to * the specified pages are removed before the pages are freed or * invalidated. * * In general, this operation should only be performed on objects that * contain managed pages. There are, however, two exceptions. First, it * is performed on the kernel and kmem objects by vm_map_entry_delete(). * Second, it is used by msync(..., MS_INVALIDATE) to invalidate device- * backed pages. In both of these cases, the option OBJPR_CLEANONLY must * not be specified and the option OBJPR_NOTMAPPED must be specified. * * The object must be locked. */ void vm_object_page_remove(vm_object_t object, vm_pindex_t start, vm_pindex_t end, int options) { vm_page_t p, next; VM_OBJECT_ASSERT_WLOCKED(object); KASSERT((object->flags & OBJ_UNMANAGED) == 0 || (options & (OBJPR_CLEANONLY | OBJPR_NOTMAPPED)) == OBJPR_NOTMAPPED, ("vm_object_page_remove: illegal options for object %p", object)); if (object->resident_page_count == 0) return; vm_object_pip_add(object, 1); again: p = vm_page_find_least(object, start); /* * Here, the variable "p" is either (1) the page with the least pindex * greater than or equal to the parameter "start" or (2) NULL. 
*/ for (; p != NULL && (p->pindex < end || end == 0); p = next) { next = TAILQ_NEXT(p, listq); /* * If the page is wired for any reason besides the existence * of managed, wired mappings, then it cannot be freed. For * example, fictitious pages, which represent device memory, * are inherently wired and cannot be freed. They can, * however, be invalidated if the option OBJPR_CLEANONLY is * not specified. */ vm_page_lock(p); if (vm_page_xbusied(p)) { VM_OBJECT_WUNLOCK(object); vm_page_busy_sleep(p, "vmopax", true); VM_OBJECT_WLOCK(object); goto again; } if (p->wire_count != 0) { if ((options & OBJPR_NOTMAPPED) == 0) pmap_remove_all(p); if ((options & OBJPR_CLEANONLY) == 0) { p->valid = 0; vm_page_undirty(p); } goto next; } if (vm_page_busied(p)) { VM_OBJECT_WUNLOCK(object); vm_page_busy_sleep(p, "vmopar", false); VM_OBJECT_WLOCK(object); goto again; } KASSERT((p->flags & PG_FICTITIOUS) == 0, ("vm_object_page_remove: page %p is fictitious", p)); if ((options & OBJPR_CLEANONLY) != 0 && p->valid != 0) { if ((options & OBJPR_NOTMAPPED) == 0) pmap_remove_write(p); if (p->dirty) goto next; } if ((options & OBJPR_NOTMAPPED) == 0) pmap_remove_all(p); vm_page_free(p); next: vm_page_unlock(p); } vm_object_pip_wakeup(object); } /* * vm_object_page_noreuse: * * For the given object, attempt to move the specified pages to * the head of the inactive queue. This bypasses regular LRU * operation and allows the pages to be reused quickly under memory * pressure. If a page is wired for any reason, then it will not * be queued. Pages are specified by the range ["start", "end"). * As a special case, if "end" is zero, then the range extends from * "start" to the end of the object. * * This operation should only be performed on objects that * contain non-fictitious, managed pages. * * The object must be locked. */ void vm_object_page_noreuse(vm_object_t object, vm_pindex_t start, vm_pindex_t end) { struct mtx *mtx, *new_mtx; vm_page_t p, next; VM_OBJECT_ASSERT_LOCKED(object); KASSERT((object->flags & (OBJ_FICTITIOUS | OBJ_UNMANAGED)) == 0, ("vm_object_page_noreuse: illegal object %p", object)); if (object->resident_page_count == 0) return; p = vm_page_find_least(object, start); /* * Here, the variable "p" is either (1) the page with the least pindex * greater than or equal to the parameter "start" or (2) NULL. */ mtx = NULL; for (; p != NULL && (p->pindex < end || end == 0); p = next) { next = TAILQ_NEXT(p, listq); /* * Avoid releasing and reacquiring the same page lock. */ new_mtx = vm_page_lockptr(p); if (mtx != new_mtx) { if (mtx != NULL) mtx_unlock(mtx); mtx = new_mtx; mtx_lock(mtx); } vm_page_deactivate_noreuse(p); } if (mtx != NULL) mtx_unlock(mtx); } /* * Populate the specified range of the object with valid pages. Returns * TRUE if the range is successfully populated and FALSE otherwise. * * Note: This function should be optimized to pass a larger array of * pages to vm_pager_get_pages() before it is applied to a non- * OBJT_DEVICE object. * * The object must be locked. */ boolean_t vm_object_populate(vm_object_t object, vm_pindex_t start, vm_pindex_t end) { vm_page_t m; vm_pindex_t pindex; int rv; VM_OBJECT_ASSERT_WLOCKED(object); for (pindex = start; pindex < end; pindex++) { m = vm_page_grab(object, pindex, VM_ALLOC_NORMAL); if (m->valid != VM_PAGE_BITS_ALL) { rv = vm_pager_get_pages(object, &m, 1, NULL, NULL); if (rv != VM_PAGER_OK) { vm_page_lock(m); vm_page_free(m); vm_page_unlock(m); break; } } /* * Keep "m" busy because a subsequent iteration may unlock * the object. 
*/ } if (pindex > start) { m = vm_page_lookup(object, start); while (m != NULL && m->pindex < pindex) { vm_page_xunbusy(m); m = TAILQ_NEXT(m, listq); } } return (pindex == end); } /* * Routine: vm_object_coalesce * Function: Coalesces two objects backing up adjoining * regions of memory into a single object. * * returns TRUE if objects were combined. * * NOTE: Only works at the moment if the second object is NULL - * if it's not, which object do we lock first? * * Parameters: * prev_object First object to coalesce * prev_offset Offset into prev_object * prev_size Size of reference to prev_object * next_size Size of reference to the second object * reserved Indicator that extension region has * swap accounted for * * Conditions: * The object must *not* be locked. */ boolean_t vm_object_coalesce(vm_object_t prev_object, vm_ooffset_t prev_offset, vm_size_t prev_size, vm_size_t next_size, boolean_t reserved) { vm_pindex_t next_pindex; if (prev_object == NULL) return (TRUE); VM_OBJECT_WLOCK(prev_object); if ((prev_object->type != OBJT_DEFAULT && prev_object->type != OBJT_SWAP) || (prev_object->flags & OBJ_TMPFS_NODE) != 0) { VM_OBJECT_WUNLOCK(prev_object); return (FALSE); } /* * Try to collapse the object first */ vm_object_collapse(prev_object); /* * Can't coalesce if: . more than one reference . paged out . shadows * another object . has a copy elsewhere (any of which mean that the * pages not mapped to prev_entry may be in use anyway) */ if (prev_object->backing_object != NULL) { VM_OBJECT_WUNLOCK(prev_object); return (FALSE); } prev_size >>= PAGE_SHIFT; next_size >>= PAGE_SHIFT; next_pindex = OFF_TO_IDX(prev_offset) + prev_size; if ((prev_object->ref_count > 1) && (prev_object->size != next_pindex)) { VM_OBJECT_WUNLOCK(prev_object); return (FALSE); } /* * Account for the charge. */ if (prev_object->cred != NULL) { /* * If prev_object was charged, then this mapping, * although not charged now, may become writable * later. Non-NULL cred in the object would prevent * swap reservation during enabling of the write * access, so reserve swap now. Failed reservation * cause allocation of the separate object for the map * entry, and swap reservation for this entry is * managed in appropriate time. */ if (!reserved && !swap_reserve_by_cred(ptoa(next_size), prev_object->cred)) { VM_OBJECT_WUNLOCK(prev_object); return (FALSE); } prev_object->charge += ptoa(next_size); } /* * Remove any pages that may still be in the object from a previous * deallocation. */ if (next_pindex < prev_object->size) { vm_object_page_remove(prev_object, next_pindex, next_pindex + next_size, 0); if (prev_object->type == OBJT_SWAP) swap_pager_freespace(prev_object, next_pindex, next_size); #if 0 if (prev_object->cred != NULL) { KASSERT(prev_object->charge >= ptoa(prev_object->size - next_pindex), ("object %p overcharged 1 %jx %jx", prev_object, (uintmax_t)next_pindex, (uintmax_t)next_size)); prev_object->charge -= ptoa(prev_object->size - next_pindex); } #endif } /* * Extend the object if necessary. 
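 *
 * For example, if prev_object currently spans 4 pages and the coalesced
 * region covers pindexes 4 through 7, then next_pindex + next_size is 8
 * and the object grows to 8 pages.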
*/ if (next_pindex + next_size > prev_object->size) prev_object->size = next_pindex + next_size; VM_OBJECT_WUNLOCK(prev_object); return (TRUE); } void vm_object_set_writeable_dirty(vm_object_t object) { VM_OBJECT_ASSERT_WLOCKED(object); if (object->type != OBJT_VNODE) { if ((object->flags & OBJ_TMPFS_NODE) != 0) { KASSERT(object->type == OBJT_SWAP, ("non-swap tmpfs")); vm_object_set_flag(object, OBJ_TMPFS_DIRTY); } return; } object->generation++; if ((object->flags & OBJ_MIGHTBEDIRTY) != 0) return; vm_object_set_flag(object, OBJ_MIGHTBEDIRTY); } /* * vm_object_unwire: * * For each page offset within the specified range of the given object, * find the highest-level page in the shadow chain and unwire it. A page * must exist at every page offset, and the highest-level page must be * wired. */ void vm_object_unwire(vm_object_t object, vm_ooffset_t offset, vm_size_t length, uint8_t queue) { vm_object_t tobject; vm_page_t m, tm; vm_pindex_t end_pindex, pindex, tpindex; int depth, locked_depth; KASSERT((offset & PAGE_MASK) == 0, ("vm_object_unwire: offset is not page aligned")); KASSERT((length & PAGE_MASK) == 0, ("vm_object_unwire: length is not a multiple of PAGE_SIZE")); /* The wired count of a fictitious page never changes. */ if ((object->flags & OBJ_FICTITIOUS) != 0) return; pindex = OFF_TO_IDX(offset); end_pindex = pindex + atop(length); locked_depth = 1; VM_OBJECT_RLOCK(object); m = vm_page_find_least(object, pindex); while (pindex < end_pindex) { if (m == NULL || pindex < m->pindex) { /* * The first object in the shadow chain doesn't * contain a page at the current index. Therefore, * the page must exist in a backing object. */ tobject = object; tpindex = pindex; depth = 0; do { tpindex += OFF_TO_IDX(tobject->backing_object_offset); tobject = tobject->backing_object; KASSERT(tobject != NULL, ("vm_object_unwire: missing page")); if ((tobject->flags & OBJ_FICTITIOUS) != 0) goto next_page; depth++; if (depth == locked_depth) { locked_depth++; VM_OBJECT_RLOCK(tobject); } } while ((tm = vm_page_lookup(tobject, tpindex)) == NULL); } else { tm = m; m = TAILQ_NEXT(m, listq); } vm_page_lock(tm); vm_page_unwire(tm, queue); vm_page_unlock(tm); next_page: pindex++; } /* Release the accumulated object locks. */ for (depth = 0; depth < locked_depth; depth++) { tobject = object->backing_object; VM_OBJECT_RUNLOCK(object); object = tobject; } } struct vnode * vm_object_vnode(vm_object_t object) { VM_OBJECT_ASSERT_LOCKED(object); if (object->type == OBJT_VNODE) return (object->handle); if (object->type == OBJT_SWAP && (object->flags & OBJ_TMPFS) != 0) return (object->un_pager.swp.swp_tmpfs); return (NULL); } static int sysctl_vm_object_list(SYSCTL_HANDLER_ARGS) { struct kinfo_vmobject *kvo; char *fullpath, *freepath; struct vnode *vp; struct vattr va; vm_object_t obj; vm_page_t m; int count, error; if (req->oldptr == NULL) { /* * If an old buffer has not been provided, generate an * estimate of the space needed for a subsequent call. */ mtx_lock(&vm_object_list_mtx); count = 0; TAILQ_FOREACH(obj, &vm_object_list, object_list) { if (obj->type == OBJT_DEAD) continue; count++; } mtx_unlock(&vm_object_list_mtx); return (SYSCTL_OUT(req, NULL, sizeof(struct kinfo_vmobject) * count * 11 / 10)); } kvo = malloc(sizeof(*kvo), M_TEMP, M_WAITOK); error = 0; /* * VM objects are type stable and are never removed from the * list once added. This allows us to safely read obj->object_list * after reacquiring the VM object lock. 
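 *
 * Concretely, the loop below drops vm_object_list_mtx while it fills in
 * each kinfo_vmobject record and copies it out with SYSCTL_OUT(), then
 * reacquires the mutex before advancing to the next entry; type
 * stability is what keeps that traversal of the list safe.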
*/ mtx_lock(&vm_object_list_mtx); TAILQ_FOREACH(obj, &vm_object_list, object_list) { if (obj->type == OBJT_DEAD) continue; VM_OBJECT_RLOCK(obj); if (obj->type == OBJT_DEAD) { VM_OBJECT_RUNLOCK(obj); continue; } mtx_unlock(&vm_object_list_mtx); kvo->kvo_size = ptoa(obj->size); kvo->kvo_resident = obj->resident_page_count; kvo->kvo_ref_count = obj->ref_count; kvo->kvo_shadow_count = obj->shadow_count; kvo->kvo_memattr = obj->memattr; kvo->kvo_active = 0; kvo->kvo_inactive = 0; TAILQ_FOREACH(m, &obj->memq, listq) { /* * A page may belong to the object but be * dequeued and set to PQ_NONE while the * object lock is not held. This makes the * reads of m->queue below racy, and we do not * count pages set to PQ_NONE. However, this * sysctl is only meant to give an * approximation of the system anyway. */ if (vm_page_active(m)) kvo->kvo_active++; else if (vm_page_inactive(m)) kvo->kvo_inactive++; } kvo->kvo_vn_fileid = 0; kvo->kvo_vn_fsid = 0; kvo->kvo_vn_fsid_freebsd11 = 0; freepath = NULL; fullpath = ""; vp = NULL; switch (obj->type) { case OBJT_DEFAULT: kvo->kvo_type = KVME_TYPE_DEFAULT; break; case OBJT_VNODE: kvo->kvo_type = KVME_TYPE_VNODE; vp = obj->handle; vref(vp); break; case OBJT_SWAP: kvo->kvo_type = KVME_TYPE_SWAP; break; case OBJT_DEVICE: kvo->kvo_type = KVME_TYPE_DEVICE; break; case OBJT_PHYS: kvo->kvo_type = KVME_TYPE_PHYS; break; case OBJT_DEAD: kvo->kvo_type = KVME_TYPE_DEAD; break; case OBJT_SG: kvo->kvo_type = KVME_TYPE_SG; break; case OBJT_MGTDEVICE: kvo->kvo_type = KVME_TYPE_MGTDEVICE; break; default: kvo->kvo_type = KVME_TYPE_UNKNOWN; break; } VM_OBJECT_RUNLOCK(obj); if (vp != NULL) { vn_fullpath(curthread, vp, &fullpath, &freepath); vn_lock(vp, LK_SHARED | LK_RETRY); if (VOP_GETATTR(vp, &va, curthread->td_ucred) == 0) { kvo->kvo_vn_fileid = va.va_fileid; kvo->kvo_vn_fsid = va.va_fsid; kvo->kvo_vn_fsid_freebsd11 = va.va_fsid; /* truncate */ } vput(vp); } strlcpy(kvo->kvo_path, fullpath, sizeof(kvo->kvo_path)); if (freepath != NULL) free(freepath, M_TEMP); /* Pack record size down */ kvo->kvo_structsize = offsetof(struct kinfo_vmobject, kvo_path) + strlen(kvo->kvo_path) + 1; kvo->kvo_structsize = roundup(kvo->kvo_structsize, sizeof(uint64_t)); error = SYSCTL_OUT(req, kvo, kvo->kvo_structsize); mtx_lock(&vm_object_list_mtx); if (error) break; } mtx_unlock(&vm_object_list_mtx); free(kvo, M_TEMP); return (error); } SYSCTL_PROC(_vm, OID_AUTO, objects, CTLTYPE_STRUCT | CTLFLAG_RW | CTLFLAG_SKIP | CTLFLAG_MPSAFE, NULL, 0, sysctl_vm_object_list, "S,kinfo_vmobject", "List of VM objects"); #include "opt_ddb.h" #ifdef DDB #include #include #include static int _vm_object_in_map(vm_map_t map, vm_object_t object, vm_map_entry_t entry) { vm_map_t tmpm; vm_map_entry_t tmpe; vm_object_t obj; int entcount; if (map == 0) return 0; if (entry == 0) { tmpe = map->header.next; entcount = map->nentries; while (entcount-- && (tmpe != &map->header)) { if (_vm_object_in_map(map, object, tmpe)) { return 1; } tmpe = tmpe->next; } } else if (entry->eflags & MAP_ENTRY_IS_SUB_MAP) { tmpm = entry->object.sub_map; tmpe = tmpm->header.next; entcount = tmpm->nentries; while (entcount-- && tmpe != &tmpm->header) { if (_vm_object_in_map(tmpm, object, tmpe)) { return 1; } tmpe = tmpe->next; } } else if ((obj = entry->object.vm_object) != NULL) { for (; obj; obj = obj->backing_object) if (obj == object) { return 1; } } return 0; } static int vm_object_in_map(vm_object_t object) { struct proc *p; /* sx_slock(&allproc_lock); */ FOREACH_PROC_IN_SYSTEM(p) { if (!p->p_vmspace /* || (p->p_flag & (P_SYSTEM|P_WEXIT)) */) 
continue; if (_vm_object_in_map(&p->p_vmspace->vm_map, object, 0)) { /* sx_sunlock(&allproc_lock); */ return 1; } } /* sx_sunlock(&allproc_lock); */ if (_vm_object_in_map(kernel_map, object, 0)) return 1; return 0; } DB_SHOW_COMMAND(vmochk, vm_object_check) { vm_object_t object; /* * make sure that internal objs are in a map somewhere * and none have zero ref counts. */ TAILQ_FOREACH(object, &vm_object_list, object_list) { if (object->handle == NULL && (object->type == OBJT_DEFAULT || object->type == OBJT_SWAP)) { if (object->ref_count == 0) { db_printf("vmochk: internal obj has zero ref count: %ld\n", (long)object->size); } if (!vm_object_in_map(object)) { db_printf( "vmochk: internal obj is not in a map: " "ref: %d, size: %lu: 0x%lx, backing_object: %p\n", object->ref_count, (u_long)object->size, (u_long)object->size, (void *)object->backing_object); } } } } /* * vm_object_print: [ debug ] */ DB_SHOW_COMMAND(object, vm_object_print_static) { /* XXX convert args. */ vm_object_t object = (vm_object_t)addr; boolean_t full = have_addr; vm_page_t p; /* XXX count is an (unused) arg. Avoid shadowing it. */ #define count was_count int count; if (object == NULL) return; db_iprintf( "Object %p: type=%d, size=0x%jx, res=%d, ref=%d, flags=0x%x ruid %d charge %jx\n", object, (int)object->type, (uintmax_t)object->size, object->resident_page_count, object->ref_count, object->flags, object->cred ? object->cred->cr_ruid : -1, (uintmax_t)object->charge); db_iprintf(" sref=%d, backing_object(%d)=(%p)+0x%jx\n", object->shadow_count, object->backing_object ? object->backing_object->ref_count : 0, object->backing_object, (uintmax_t)object->backing_object_offset); if (!full) return; db_indent += 2; count = 0; TAILQ_FOREACH(p, &object->memq, listq) { if (count == 0) db_iprintf("memory:="); else if (count == 6) { db_printf("\n"); db_iprintf(" ..."); count = 0; } else db_printf(","); count++; db_printf("(off=0x%jx,page=0x%jx)", (uintmax_t)p->pindex, (uintmax_t)VM_PAGE_TO_PHYS(p)); } if (count != 0) db_printf("\n"); db_indent -= 2; } /* XXX. */ #undef count /* XXX need this non-static entry for calling from vm_map_print. 
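 *
 * vm_object_print() below is a thin wrapper that simply forwards its
 * arguments to vm_object_print_static(), the body of the DDB "show
 * object" command defined above.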
*/ void vm_object_print( /* db_expr_t */ long addr, boolean_t have_addr, /* db_expr_t */ long count, char *modif) { vm_object_print_static(addr, have_addr, count, modif); } DB_SHOW_COMMAND(vmopag, vm_object_print_pages) { vm_object_t object; vm_pindex_t fidx; vm_paddr_t pa; vm_page_t m, prev_m; int rcount, nl, c; nl = 0; TAILQ_FOREACH(object, &vm_object_list, object_list) { db_printf("new object: %p\n", (void *)object); if (nl > 18) { c = cngetc(); if (c != ' ') return; nl = 0; } nl++; rcount = 0; fidx = 0; pa = -1; TAILQ_FOREACH(m, &object->memq, listq) { if (m->pindex > 128) break; if ((prev_m = TAILQ_PREV(m, pglist, listq)) != NULL && prev_m->pindex + 1 != m->pindex) { if (rcount) { db_printf(" index(%ld)run(%d)pa(0x%lx)\n", (long)fidx, rcount, (long)pa); if (nl > 18) { c = cngetc(); if (c != ' ') return; nl = 0; } nl++; rcount = 0; } } if (rcount && (VM_PAGE_TO_PHYS(m) == pa + rcount * PAGE_SIZE)) { ++rcount; continue; } if (rcount) { db_printf(" index(%ld)run(%d)pa(0x%lx)\n", (long)fidx, rcount, (long)pa); if (nl > 18) { c = cngetc(); if (c != ' ') return; nl = 0; } nl++; } fidx = m->pindex; pa = VM_PAGE_TO_PHYS(m); rcount = 1; } if (rcount) { db_printf(" index(%ld)run(%d)pa(0x%lx)\n", (long)fidx, rcount, (long)pa); if (nl > 18) { c = cngetc(); if (c != ' ') return; nl = 0; } nl++; } } } #endif /* DDB */ Index: projects/runtime-coverage/sys/vm/vm_object.h =================================================================== --- projects/runtime-coverage/sys/vm/vm_object.h (revision 322921) +++ projects/runtime-coverage/sys/vm/vm_object.h (revision 322922) @@ -1,332 +1,332 @@ /*- * Copyright (c) 1991, 1993 * The Regents of the University of California. All rights reserved. * * This code is derived from software contributed to Berkeley by * The Mach Operating System project at Carnegie-Mellon University. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * from: @(#)vm_object.h 8.3 (Berkeley) 1/12/94 * * * Copyright (c) 1987, 1990 Carnegie-Mellon University. * All rights reserved. 
* * Authors: Avadis Tevanian, Jr., Michael Wayne Young * * Permission to use, copy, modify and distribute this software and * its documentation is hereby granted, provided that both the copyright * notice and this permission notice appear in all copies of the * software, derivative works or modified versions, and any portions * thereof, and that both notices appear in supporting documentation. * * CARNEGIE MELLON ALLOWS FREE USE OF THIS SOFTWARE IN ITS "AS IS" * CONDITION. CARNEGIE MELLON DISCLAIMS ANY LIABILITY OF ANY KIND * FOR ANY DAMAGES WHATSOEVER RESULTING FROM THE USE OF THIS SOFTWARE. * * Carnegie Mellon requests users of this software to return to * * Software Distribution Coordinator or Software.Distribution@CS.CMU.EDU * School of Computer Science * Carnegie Mellon University * Pittsburgh PA 15213-3890 * * any improvements or extensions that they make and grant Carnegie the * rights to redistribute these changes. * * $FreeBSD$ */ /* * Virtual memory object module definitions. */ #ifndef _VM_OBJECT_ #define _VM_OBJECT_ #include #include #include +#include #include #include /* * Types defined: * * vm_object_t Virtual memory object. * * List of locks * (c) const until freed * (o) per-object lock * (f) free pages queue mutex * */ struct vm_object { struct rwlock lock; TAILQ_ENTRY(vm_object) object_list; /* list of all objects */ LIST_HEAD(, vm_object) shadow_head; /* objects that this is a shadow for */ LIST_ENTRY(vm_object) shadow_list; /* chain of shadow objects */ TAILQ_HEAD(respgs, vm_page) memq; /* list of resident pages */ struct vm_radix rtree; /* root of the resident page radix trie*/ vm_pindex_t size; /* Object size */ int generation; /* generation ID */ int ref_count; /* How many refs?? */ int shadow_count; /* how many objects that this is a shadow for */ vm_memattr_t memattr; /* default memory attribute for pages */ objtype_t type; /* type of pager */ u_short flags; /* see below */ u_short pg_color; /* (c) color of first page in obj */ u_int paging_in_progress; /* Paging (in or out) so don't collapse or destroy */ int resident_page_count; /* number of resident pages */ struct vm_object *backing_object; /* object that I'm a shadow of */ vm_ooffset_t backing_object_offset;/* Offset in backing object */ TAILQ_ENTRY(vm_object) pager_object_list; /* list of all objects of this pager type */ LIST_HEAD(, vm_reserv) rvq; /* list of reservations */ void *handle; union { /* * VNode pager * * vnp_size - current size of file */ struct { off_t vnp_size; vm_ooffset_t writemappings; } vnp; /* * Device pager * * devp_pglist - list of allocated pages */ struct { TAILQ_HEAD(, vm_page) devp_pglist; struct cdev_pager_ops *ops; struct cdev *dev; } devp; /* * SG pager * * sgp_pglist - list of allocated pages */ struct { TAILQ_HEAD(, vm_page) sgp_pglist; } sgp; /* * Swap pager * * swp_tmpfs - back-pointer to the tmpfs vnode, * if any, which uses the vm object * as backing store. The handle * cannot be reused for linking, * because the vnode can be * reclaimed and recreated, making * the handle changed and hash-chain * invalid. * - * swp_bcount - number of swap 'swblock' metablocks, each - * contains up to 16 swapblk assignments. - * see vm/swap_pager.h + * swp_blks - pc-trie of the allocated swap blocks. 
+ * */ struct { void *swp_tmpfs; - int swp_bcount; + struct pctrie swp_blks; } swp; } un_pager; struct ucred *cred; vm_ooffset_t charge; void *umtx_data; }; /* * Flags */ #define OBJ_FICTITIOUS 0x0001 /* (c) contains fictitious pages */ #define OBJ_UNMANAGED 0x0002 /* (c) contains unmanaged pages */ #define OBJ_POPULATE 0x0004 /* pager implements populate() */ #define OBJ_DEAD 0x0008 /* dead objects (during rundown) */ #define OBJ_NOSPLIT 0x0010 /* dont split this object */ #define OBJ_UMTXDEAD 0x0020 /* umtx pshared was terminated */ #define OBJ_PIPWNT 0x0040 /* paging in progress wanted */ #define OBJ_PG_DTOR 0x0080 /* dont reset object, leave that for dtor */ #define OBJ_MIGHTBEDIRTY 0x0100 /* object might be dirty, only for vnode */ #define OBJ_TMPFS_NODE 0x0200 /* object belongs to tmpfs VREG node */ #define OBJ_TMPFS_DIRTY 0x0400 /* dirty tmpfs obj */ #define OBJ_COLORED 0x1000 /* pg_color is defined */ #define OBJ_ONEMAPPING 0x2000 /* One USE (a single, non-forked) mapping flag */ #define OBJ_DISCONNECTWNT 0x4000 /* disconnect from vnode wanted */ #define OBJ_TMPFS 0x8000 /* has tmpfs vnode allocated */ /* * Helpers to perform conversion between vm_object page indexes and offsets. * IDX_TO_OFF() converts an index into an offset. * OFF_TO_IDX() converts an offset into an index. Since offsets are signed * by default, the sign propagation in OFF_TO_IDX(), when applied to * negative offsets, is intentional and returns a vm_object page index * that cannot be created by a userspace mapping. * UOFF_TO_IDX() treats the offset as an unsigned value and converts it * into an index accordingly. Use it only when the full range of offset * values are allowed. Currently, this only applies to device mappings. * OBJ_MAX_SIZE specifies the maximum page index corresponding to the * maximum unsigned offset. */ #define IDX_TO_OFF(idx) (((vm_ooffset_t)(idx)) << PAGE_SHIFT) #define OFF_TO_IDX(off) ((vm_pindex_t)(((vm_ooffset_t)(off)) >> PAGE_SHIFT)) #define UOFF_TO_IDX(off) (((vm_pindex_t)(off)) >> PAGE_SHIFT) #define OBJ_MAX_SIZE (UOFF_TO_IDX(UINT64_MAX) + 1) #ifdef _KERNEL #define OBJPC_SYNC 0x1 /* sync I/O */ #define OBJPC_INVAL 0x2 /* invalidate */ #define OBJPC_NOSYNC 0x4 /* skip if VPO_NOSYNC */ /* * The following options are supported by vm_object_page_remove(). */ #define OBJPR_CLEANONLY 0x1 /* Don't remove dirty pages. */ #define OBJPR_NOTMAPPED 0x2 /* Don't unmap pages. 
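 * See the description above vm_object_page_remove() in vm_object.c:
 * OBJPR_NOTMAPPED means the caller guarantees the pages in the range
 * have no mappings, and the kernel/kmem object and
 * msync(..., MS_INVALIDATE) device-pager cases must pass
 * OBJPR_NOTMAPPED without OBJPR_CLEANONLY.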
*/ TAILQ_HEAD(object_q, vm_object); extern struct object_q vm_object_list; /* list of allocated objects */ extern struct mtx vm_object_list_mtx; /* lock for object list and count */ extern struct vm_object kernel_object_store; extern struct vm_object kmem_object_store; #define kernel_object (&kernel_object_store) #define kmem_object (&kmem_object_store) #define VM_OBJECT_ASSERT_LOCKED(object) \ rw_assert(&(object)->lock, RA_LOCKED) #define VM_OBJECT_ASSERT_RLOCKED(object) \ rw_assert(&(object)->lock, RA_RLOCKED) #define VM_OBJECT_ASSERT_WLOCKED(object) \ rw_assert(&(object)->lock, RA_WLOCKED) #define VM_OBJECT_ASSERT_UNLOCKED(object) \ rw_assert(&(object)->lock, RA_UNLOCKED) #define VM_OBJECT_LOCK_DOWNGRADE(object) \ rw_downgrade(&(object)->lock) #define VM_OBJECT_RLOCK(object) \ rw_rlock(&(object)->lock) #define VM_OBJECT_RUNLOCK(object) \ rw_runlock(&(object)->lock) #define VM_OBJECT_SLEEP(object, wchan, pri, wmesg, timo) \ rw_sleep((wchan), &(object)->lock, (pri), (wmesg), (timo)) #define VM_OBJECT_TRYRLOCK(object) \ rw_try_rlock(&(object)->lock) #define VM_OBJECT_TRYWLOCK(object) \ rw_try_wlock(&(object)->lock) #define VM_OBJECT_TRYUPGRADE(object) \ rw_try_upgrade(&(object)->lock) #define VM_OBJECT_WLOCK(object) \ rw_wlock(&(object)->lock) #define VM_OBJECT_WOWNED(object) \ rw_wowned(&(object)->lock) #define VM_OBJECT_WUNLOCK(object) \ rw_wunlock(&(object)->lock) /* * The object must be locked or thread private. */ static __inline void vm_object_set_flag(vm_object_t object, u_short bits) { object->flags |= bits; } /* * Conditionally set the object's color, which (1) enables the allocation * of physical memory reservations for anonymous objects and larger-than- * superpage-sized named objects and (2) determines the first page offset * within the object at which a reservation may be allocated. In other * words, the color determines the alignment of the object with respect * to the largest superpage boundary. When mapping named objects, like * files or POSIX shared memory objects, the color should be set to zero * before a virtual address is selected for the mapping. In contrast, * for anonymous objects, the color may be set after the virtual address * is selected. * * The object must be locked. 
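 *
 * Note that vm_object_color() below only takes effect once per object:
 * after OBJ_COLORED has been set, subsequent calls leave pg_color
 * unchanged.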
*/ static __inline void vm_object_color(vm_object_t object, u_short color) { if ((object->flags & OBJ_COLORED) == 0) { object->pg_color = color; object->flags |= OBJ_COLORED; } } void vm_object_clear_flag(vm_object_t object, u_short bits); void vm_object_pip_add(vm_object_t object, short i); void vm_object_pip_subtract(vm_object_t object, short i); void vm_object_pip_wakeup(vm_object_t object); void vm_object_pip_wakeupn(vm_object_t object, short i); void vm_object_pip_wait(vm_object_t object, char *waitid); void umtx_shm_object_init(vm_object_t object); void umtx_shm_object_terminated(vm_object_t object); extern int umtx_shm_vnobj_persistent; vm_object_t vm_object_allocate (objtype_t, vm_pindex_t); boolean_t vm_object_coalesce(vm_object_t, vm_ooffset_t, vm_size_t, vm_size_t, boolean_t); void vm_object_collapse (vm_object_t); void vm_object_deallocate (vm_object_t); void vm_object_destroy (vm_object_t); void vm_object_terminate (vm_object_t); void vm_object_set_writeable_dirty (vm_object_t); void vm_object_init (void); void vm_object_madvise(vm_object_t, vm_pindex_t, vm_pindex_t, int); boolean_t vm_object_page_clean(vm_object_t object, vm_ooffset_t start, vm_ooffset_t end, int flags); void vm_object_page_noreuse(vm_object_t object, vm_pindex_t start, vm_pindex_t end); void vm_object_page_remove(vm_object_t object, vm_pindex_t start, vm_pindex_t end, int options); boolean_t vm_object_populate(vm_object_t, vm_pindex_t, vm_pindex_t); void vm_object_print(long addr, boolean_t have_addr, long count, char *modif); void vm_object_reference (vm_object_t); void vm_object_reference_locked(vm_object_t); int vm_object_set_memattr(vm_object_t object, vm_memattr_t memattr); void vm_object_shadow (vm_object_t *, vm_ooffset_t *, vm_size_t); void vm_object_split(vm_map_entry_t); boolean_t vm_object_sync(vm_object_t, vm_ooffset_t, vm_size_t, boolean_t, boolean_t); void vm_object_unwire(vm_object_t object, vm_ooffset_t offset, vm_size_t length, uint8_t queue); struct vnode *vm_object_vnode(vm_object_t object); #endif /* _KERNEL */ #endif /* _VM_OBJECT_ */ Index: projects/runtime-coverage/usr.bin/calendar/calendars/calendar.freebsd =================================================================== --- projects/runtime-coverage/usr.bin/calendar/calendars/calendar.freebsd (revision 322921) +++ projects/runtime-coverage/usr.bin/calendar/calendars/calendar.freebsd (revision 322922) @@ -1,452 +1,454 @@ /* * FreeBSD * * $FreeBSD$ */ #ifndef _calendar_freebsd_ #define _calendar_freebsd_ 01/01 Dimitry Andric born in Utrecht, the Netherlands, 1969 01/01 Lev Serebryakov born in Leningrad, USSR, 1979 01/01 Alexander Langer born in Duesseldorf, Nordrhein-Westfalen, Germany, 1981 01/02 Ion-Mihai "IOnut" Tetcu born in Bucharest, Romania, 1980 01/02 Patrick Li born in Beijing, People's Republic of China, 1985 01/03 Tetsurou Okazaki born in Mobara, Chiba, Japan, 1972 01/04 Hiroyuki Hanai born in Kagawa pre., Japan, 1969 01/06 Philippe Audeoud born in Bretigny-Sur-Orge, France, 1980 01/08 Michael L. 
Hostbaek born in Copenhagen, Denmark, 1977 01/10 Jean-Yves Lefort born in Charleroi, Belgium, 1980 01/12 Yen-Ming Lee born in Taipei, Taiwan, Republic of China, 1977 01/12 Ying-Chieh Liao born in Taipei, Taiwan, Republic of China, 1979 01/12 Kristof Provost born in Aalst, Belgium, 1983 01/13 Ruslan Bukin born in Dudinka, Russian Federation, 1985 01/14 Yi-Jheng Lin born in Taichung, Taiwan, Republic of China, 1985 01/15 Anne Dickison born in Madison, Indiana, United States, 1976 01/16 Ariff Abdullah born in Kuala Lumpur, Malaysia, 1978 01/16 Dmitry Sivachenko born in Moscow, USSR, 1978 01/16 Vanilla I. Shu born in Taipei, Taiwan, Republic of China, 1978 01/17 Raphael Kubo da Costa born in Sao Paulo, Sao Paulo, Brazil, 1989 01/18 Dejan Lesjak born in Ljubljana, Slovenia, Yugoslavia, 1977 01/19 Marshall Kirk McKusick born in Wilmington, Delaware, United States, 1954 01/19 Ruslan Ermilov born in Simferopol, USSR, 1974 01/19 Marcelo S. Araujo born in Joinville, Santa Catarina, Brazil, 1981 01/20 Poul-Henning Kamp born in Korsoer, Denmark, 1966 01/21 Mahdi Mokhtari born in Tehran, Iran, 1995 01/22 Johann Visagie born in Cape Town, South Africa, 1970 01/23 Hideyuki KURASHINA born in Niigata, Japan, 1982 01/24 Fabien Thomas born in Avignon, France, 1971 01/24 Matteo Riondato born in Padova, Italy, 1986 01/25 Nick Hibma born in Groningen, the Netherlands, 1972 01/25 Bernd Walter born in Moers, Nordrhein-Westfalen, Germany, 1974 01/26 Andrew Gallatin born in Buffalo, New York, United States, 1970 01/27 Nick Sayer born in San Diego, California, United States, 1968 01/27 Jacques Anthony Vidrine born in Baton Rouge, Louisiana, United States, 1971 01/27 Ngie Cooper born in Seattle, Washington, United States, 1984 01/31 Hidetoshi Shimokawa born in Yokohama, Kanagawa, Japan, 1970 02/01 Doug Rabson born in London, England, 1966 02/01 Nicola Vitale born in Busto Arsizio, Varese, Italy, 1976 02/01 Paul Saab born in Champaign-Urbana, Illinois, United States, 1978 02/01 Martin Wilke born in Ludwigsfelde, Brandenburg, Germany, 1980 02/01 Christian Brueffer born in Gronau, Nordrhein-Westfalen, Germany, 1982 02/01 Steven Kreuzer born in Oceanside, New York, United States, 1982 02/01 Juli Mallett born in Washington, Pennsylvania, United States, 1985 02/02 Diomidis D. Spinellis born in Athens, Greece, 1967 02/02 Michael W Lucas born in Detroit, Michigan, United States, 1967 02/02 Dmitry Chagin born in Stalingrad, USSR, 1976 02/02 Yoichi Nakayama born in Tsu, Mie, Japan, 1976 02/02 Yoshihiro Takahashi born in Yokohama, Kanagawa, Japan, 1976 02/03 Jason Helfman born in Royal Oak, Michigan, United States, 1972 02/04 Eitan Adler born in West Hempstead, New York, United States, 1991 02/05 Frank Laszlo born in Howell, Michigan, United States, 1983 02/06 Julien Charbon born in Saint Etienne, Loire, France, 1978 02/07 Bjoern Heidotting born in Uelsen, Germany, 1980 02/10 David Greenman born in Portland, Oregon, United States, 1968 02/10 Paul Richards born in Ammanford, Carmarthenshire, United Kingdom, 1968 02/10 Simon Barner born in Rosenheim, Bayern, Germany, 1980 02/10 Jason E. Hale born in Pittsburgh, Pennsylvania, United States, 1982 02/13 Jesper Skriver born in Aarhus, Denmark, 1975 02/13 Steve Wills born in Lynchburg, Virginia, United States, 1975 02/13 Andrey Slusar born in Odessa, USSR, 1979 02/13 David W. Chapman Jr. 
born in Bethel, Connecticut, United States, 1981 02/14 Manolis Kiagias born in Chania, Greece, 1970 02/14 Erwin Lansing born in 's-Hertogenbosch, the Netherlands, 1975 02/14 Martin Blapp born in Olten, Switzerland, 1976 02/15 Hiren Panchasara born in Ahmedabad, Gujarat, India, 1984 02/16 Justin Hibbits born in Toledo, Ohio, United States, 1983 02/16 Tobias Christian Berner born in Bern, Switzerland, 1985 02/18 Christoph Moench-Tegeder born in Hannover, Niedersachsen, Germany, 1980 02/19 Murray Stokely born in Jacksonville, Florida, United States, 1979 02/20 Anders Nordby born in Oslo, Norway, 1976 02/21 Alexey Zelkin born in Simferopol, Ukraine, 1978 02/22 Brooks Davis born in Longview, Washington, United States, 1976 02/22 Jake Burkholder born in Maynooth, Ontario, Canada, 1979 02/23 Peter Wemm born in Perth, Western Australia, Australia, 1971 02/23 Mathieu Arnold born in Champigny sur Marne, Val de Marne, France, 1978 02/24 Johan Karlsson born in Mariannelund, Sweden, 1974 02/24 Colin Percival born in Burnaby, Canada, 1981 02/26 Pietro Cerutti born in Faido, Switzerland, 1984 02/28 Daichi GOTO born in Shimizu Suntou, Shizuoka, Japan, 1980 02/28 Ruslan Makhmatkhanov born in Rostov-on-Don, USSR, 1984 03/01 Hye-Shik Chang born in Daejeon, Republic of Korea, 1980 03/02 Cy Schubert born in Edmonton, Alberta, Canada, 1956 03/03 Sergey Matveychuk born in Moscow, Russian Federation, 1973 03/03 Doug White born in Eugene, Oregon, United States, 1977 03/03 Gordon Tetlow born in Reno, Nevada, United States, 1978 03/04 Oleksandr Tymoshenko born in Chernihiv, Ukraine, 1980 03/05 Baptiste Daroussin born in Beauvais, France, 1980 03/05 Philip Paeps born in Leuven, Belgium, 1983 03/05 Ulf Lilleengen born in Hamar, Norway, 1985 03/06 Christopher Piazza born in Kamloops, British Columbia, Canada, 1981 03/07 Michael P. Pritchard born in Los Angeles, California, United States, 1964 03/07 Giorgos Keramidas born in Athens, Greece, 1976 03/10 Andreas Klemm born in Duesseldorf, Nordrhein-Westfalen, Germany, 1963 03/10 Luiz Otavio O Souza born in Bauru, Sao Paulo, Brazil, 1978 03/10 Nikolai Lifanov born in Moscow, Russian Federation, 1989 03/11 Soeren Straarup born in Andst, Denmark, 1978 03/12 Greg Lewis born in Adelaide, South Australia, Australia, 1969 03/13 Alexander Leidinger born in Neunkirchen, Saarland, Germany, 1976 03/13 Will Andrews born in Pontiac, Michigan, United States, 1982 03/14 Bernhard Froehlich born in Graz, Styria, Austria, 1985 03/15 Paolo Pisati born in Lodi, Italy, 1977 03/15 Brian Fundakowski Feldman born in Alexandria, Virginia, United States, 1983 03/17 Michael Smith born in Bankstown, New South Wales, Australia, 1971 03/17 Alexander Motin born in Simferopol, Ukraine, 1979 03/18 Koop Mast born in Dokkum, the Netherlands, 1981 03/19 Mikhail Teterin born in Kyiv, Ukraine, 1972 03/20 Joseph S. Atkinson born in Batesville, Arkansas, United States, 1977 03/20 Henrik Brix Andersen born in Aarhus, Denmark, 1978 03/20 MANTANI Nobutaka born in Hiroshima, Japan, 1978 03/20 Cameron Grant died in Hemel Hempstead, United Kingdom, 2005 03/22 Brad Davis born in Farmington, New Mexico, United States, 1983 03/23 Daniel C. Sobral born in Brasilia, Distrito Federal, Brazil, 1971 03/23 Benno Rice born in Adelaide, South Australia, Australia, 1977 03/24 Marcel Moolenaar born in Hilversum, the Netherlands, 1968 03/24 Emanuel Haupt born in Zurich, Switzerland, 1979 03/25 Andrew R. 
Reiter born in Springfield, Massachusetts, United States, 1980 03/26 Jonathan Anderson born in Ottawa, Ontario, Canada, 1983 03/27 Josef El-Rayes born in Linz, Austria, 1982 03/28 Sean C. Farley born in Indianapolis, Indiana, United States, 1970 03/29 Thierry Thomas born in Luxeuil les Bains, France, 1961 03/30 Po-Chuan Hsieh born in Taipei, Taiwan, Republic of China, 1978 03/31 First quarter status reports are due on 04/15 04/01 Matthew Jacob born in San Francisco, California, United States, 1958 +04/01 Alexander V. Chernikov born in Moscow, Russian Federation, 1984 04/01 Bill Fenner born in Bellefonte, Pennsylvania, United States, 1971 04/01 Peter Edwards born in Dublin, Ireland, 1973 04/03 Hellmuth Michaelis born in Kiel, Schleswig-Holstein, Germany, 1958 04/03 Tong Liu born in Beijing, People's Republic of China, 1981 04/03 Gabor Pali born in Kunhegyes, Hungary, 1982 04/04 Jason Unovitch born in Scranton, Pennsylvania, United States, 1986 04/05 Stacey Son born in Burley, Idaho, United States, 1967 04/06 Peter Jeremy born in Sydney, New South Wales, Australia, 1961 04/07 Edward Tomasz Napierala born in Wolsztyn, Poland, 1981 04/08 Jordan K. Hubbard born in Honolulu, Hawaii, United States, 1963 04/09 Ceri Davies born in Haverfordwest, Pembrokeshire, United Kingdom, 1976 04/11 Bruce A. Mah born in Fresno, California, United States, 1969 04/12 Patrick Gardella born in Columbus, Ohio, United States, 1967 04/12 Ed Schouten born in Oss, the Netherlands, 1986 04/12 Ruey-Cherng Yu born in Keelung, Taiwan, 1978 04/13 Oliver Braun born in Nuremberg, Bavaria, Germany, 1972 04/14 Crist J. Clark born in Milwaukee, Wisconsin, United States, 1970 04/14 Glen J. Barber born in Wilkes-Barre, Pennsylvania, United States, 1981 04/15 David Malone born in Dublin, Ireland, 1973 04/17 Alexey Degtyarev born in Ahtubinsk, Russian Federation, 1984 04/17 Dryice Liu born in Jinan, Shandong, China, 1975 04/22 Joerg Wunsch born in Dresden, Sachsen, Germany, 1962 04/22 Jun Kuriyama born in Matsue, Shimane, Japan, 1973 04/22 Jakub Klama born in Blachownia, Silesia, Poland, 1989 04/25 Richard Gallamore born in Kissimmee, Florida, United States, 1987 04/26 Rene Ladan born in Geldrop, the Netherlands, 1980 04/28 Oleg Bulyzhin born in Kharkov, USSR, 1976 04/28 Andriy Voskoboinyk born in Bila Tserkva, Ukraine, 1992 04/29 Adam Weinberger born in Berkeley, California, United States, 1980 04/29 Eric Anholt born in Portland, Oregon, United States, 1983 05/01 Randall Stewart born in Spokane, Washington, United States, 1959 05/02 Danilo G. Baio born in Maringa, Parana, Brazil, 1986 05/02 Wojciech A. 
Koszek born in Czestochowa, Poland, 1987 05/03 Brian Dean born in Elkins, West Virginia, United States, 1966 05/03 Patrick Kelsey born in Freehold, New Jersey, United States, 1976 05/03 Robert Nicholas Maxwell Watson born in Harrow, Middlesex, United Kingdom, 1977 05/04 Denis Peplin born in Nizhniy Novgorod, Russian Federation, 1977 05/08 Kirill Ponomarew born in Volgograd, Russian Federation, 1977 05/08 Sean Kelly born in Walnut Creek, California, United States, 1982 05/09 Daniel Eischen born in Syracuse, New York, United States, 1963 05/09 Aaron Dalton born in Boise, Idaho, United States, 1973 05/09 Jase Thew born in Abergavenny, Gwent, United Kingdom, 1974 05/10 Markus Brueffer born in Gronau, Nordrhein-Westfalen, Germany, 1977 05/11 Kurt Lidl born in Rockville, Maryland, United States, 1968 05/11 Jesus Rodriguez born in Barcelona, Spain, 1972 05/11 Marcin Wojtas born in Krakow, Poland, 1986 05/11 Roman Kurakin born in Moscow, USSR, 1979 05/11 Ulrich Spoerlein born in Schesslitz, Bayern, Germany, 1981 05/13 Pete Fritchman born in Lansdale, Pennsylvania, United States, 1983 05/14 Tatsumi Hosokawa born in Tokyo, Japan, 1968 05/14 Shigeyuku Fukushima born in Osaka, Japan, 1974 05/14 Bruce Cran born in Cambridge, United Kingdom, 1981 05/15 Hans Petter Selasky born in Flekkefjord, Norway, 1982 05/16 Johann Kois born in Wolfsberg, Austria, 1975 05/16 Marcus Alves Grando born in Florianopolis, Santa Catarina, Brazil, 1979 05/17 Thomas Abthorpe born in Port Arthur, Ontario, Canada, 1968 05/19 Philippe Charnier born in Fontainebleau, France, 1966 05/19 Ian Dowse born in Dublin, Ireland, 1975 05/19 Sofian Brabez born in Toulouse, France, 1984 05/20 Dan Moschuk died in Burlington, Ontario, Canada, 2010 05/21 Kris Kennaway born in Winnipeg, Manitoba, Canada, 1978 05/22 James Gritton born in San Francisco, California, United States, 1967 05/22 Clive Tong-I Lin born in Changhua, Taiwan, Republic of China, 1978 05/22 Michael Bushkov born in Rostov-on-Don, Russian Federation, 1985 05/22 Rui Paulo born in Evora, Portugal, 1986 05/22 David Naylor born in Johannesburg, South Africa, 1988 05/23 Munechika Sumikawa born in Osaka, Osaka, Japan, 1972 05/24 Duncan McLennan Barclay born in London, Middlesex, United Kingdom, 1970 05/24 Oliver Lehmann born in Karlsburg, Germany, 1981 05/25 Pawel Pekala born in Swidnica, Poland, 1980 05/25 Tom Rhodes born in Ellwood City, Pennsylvania, United States, 1981 05/25 Roman Divacky born in Brno, Czech Republic, 1983 05/26 Jim Pirzyk born in Chicago, Illinois, United States, 1968 05/26 Florian Smeets born in Schwerte, Nordrhein-Westfalen, Germany, 1982 05/27 Ollivier Robert born in Paris, France, 1967 05/29 Wilko Bulte born in Arnhem, the Netherlands, 1965 05/29 Seigo Tanimura born in Kitakyushu, Fukuoka, Japan, 1976 05/30 Wen Heping born in Xiangxiang, Hunan, China, 1970 05/31 Ville Skytta born in Helsinki, Finland, 1974 06/02 Jean-Marc Zucconi born in Pontarlier, France, 1954 06/02 Alexander Botero-Lowry born in Austin, Texas, United States, 1986 06/03 CHOI Junho born in Seoul, Korea, 1974 06/03 Wesley Shields born in Binghamton, New York, United States, 1981 06/04 Julian Elischer born in Perth, Australia, 1959 06/04 Justin Gibbs born in San Pedro, California, United States, 1973 06/04 Jason Evans born in Greeley, Colorado, United States, 1973 06/04 Thomas Moestl born in Braunschweig, Niedersachsen, Germany, 1980 06/04 Devin Teske born in Arcadia, California, United States, 1982 06/04 Zack Kirsch born in Memphis, Tennessee, United States, 1982 06/04 Johannes Jost Meixner 
born in Wiesbaden, Germany, 1987 06/06 Sergei Kolobov born in Karpinsk, Russian Federation, 1972 06/06 Ryan Libby born in Kirkland, Washington, United States, 1985 -06/06 Alan Eldridge died in Denver, Colorado, 2003 +06/06 Alan Eldridge died in Denver, Colorado, United States, 2003 06/07 Jimmy Olgeni born in Milano, Italy, 1976 06/07 Benjamin Close born in Adelaide, Australia, 1978 06/07 Roger Pau Monne born in Reus, Catalunya, Spain, 1986 06/08 Ravi Pokala born in Royal Oak, Michigan, United States, 1980 06/09 Stanislav Galabov born in Sofia, Bulgaria 1978 06/11 Alonso Cardenas Marquez born in Arequipa, Peru, 1979 06/14 Josh Paetzel born in Minneapolis, Minnesota, United States, 1973 06/17 Tilman Linneweh born in Weinheim, Baden-Wuerttemberg, Germany, 1978 06/18 Li-Wen Hsu born in Taipei, Taiwan, Republic of China, 1984 06/18 Roman Bogorodskiy born in Saratov, Russian Federation, 1986 06/19 Charlie Root born in Portland, Oregon, United States, 1993 06/21 Ganbold Tsagaankhuu born in Ulaanbaatar, Mongolia, 1971 06/21 Niels Heinen born in Markelo, the Netherlands, 1978 06/22 Andreas Tobler born in Heiden, Switzerland, 1968 06/24 Chris Faulhaber born in Springfield, Illinois, United States, 1971 06/26 Brian Somers born in Dundrum, Dublin, Ireland, 1967 06/28 Mark Santcroos born in Rotterdam, the Netherlands, 1979 06/28 Xin Li born in Beijing, People's Republic of China, 1982 06/28 Bradley T. Hughes born in Amarillo, Texas, United States, 1977 06/29 Wilfredo Sanchez Vega born in Majaguez, Puerto Rico, United States, 1972 06/29 Daniel Harris born in Lubbock, Texas, United States, 1985 06/29 Andrew Pantyukhin born in Moscow, Russian Federation, 1985 06/30 Guido van Rooij born in Best, Noord-Brabant, the Netherlands, 1965 06/30 Second quarter status reports are due on 07/15 07/01 Matthew Dillon born in San Francisco, California, United States, 1966 07/01 Mateusz Guzik born in Nowy Targ, Poland, 1986 07/02 Mark Christopher Ovens born in Preston, Lancashire, United Kingdom, 1958 07/02 Vasil Venelinov Dimov born in Shumen, Bulgaria, 1982 07/04 Motoyuki Konno born in Musashino, Tokyo, Japan, 1969 07/04 Florent Thoumie born in Montmorency, Val d'Oise, France, 1982 07/05 Olivier Cochard-Labbe born in Brest, France, 1977 07/05 Sergey Kandaurov born in Gubkin, Russian Federation, 1985 07/07 Andrew Thompson born in Lower Hutt, Wellington, New Zealand, 1979 07/07 Maxime Henrion born in Metz, France, 1981 07/07 George Reid born in Frimley, Hampshire, United Kingdom, 1983 07/10 Jung-uk Kim born in Seoul, Korea, 1971 07/10 Justin Seger born in Harvard, Massachusetts, United States, 1981 07/10 David Schultz born in Oakland, California, United States, 1982 07/10 Ben Woods born in Perth, Western Australia, Australia, 1984 07/11 Jesus R. Camou born in Hermosillo, Sonora, Mexico, 1983 07/15 Gary Jennejohn born in Rochester, New York, United States, 1950 07/16 Suleiman Souhlal born in Roma, Italy, 1983 07/16 Davide Italiano born in Milazzo, Italy, 1989 07/17 Michael Chin-Yuan Wu born in Taipei, Taiwan, Republic of China, 1980 07/19 Masafumi NAKANE born in Okazaki, Aichi, Japan, 1972 07/19 Simon L. Nielsen born in Copenhagen, Denmark, 1980 07/19 Gleb Smirnoff born in Kharkov, USSR, 1981 07/20 Dru Lavigne born in Kingston, Ontario, Canada, 1965 07/20 Andrey V. 
Elsukov born in Kotelnich, Russian Federation, 1981 07/22 James Housley born in Chicago, Illinois, United States, 1965 07/22 Jens Schweikhardt born in Waiblingen, Baden-Wuerttemberg, Germany, 1967 07/22 Lukas Ertl born in Weissenbach/Enns, Steiermark, Austria, 1976 07/23 Sergey A. Osokin born in Krasnogorsky, Stepnogorsk, Akmolinskaya region, Kazakhstan, 1972 07/23 Andrey Zonov born in Kirov, Russian Federation, 1985 07/24 Alexander Nedotsukov born in Ulyanovsk, Russian Federation, 1974 07/24 Alberto Villa born in Vercelli, Italy, 1987 07/27 Andriy Gapon born in Kyrykivka, Sumy region, Ukraine, 1976 07/28 Jim Mock born in Bethlehem, Pennsylvania, United States, 1974 07/28 Tom Hukins born in Manchester, United Kingdom, 1976 07/29 Dirk Meyer born in Kassel, Hessen, Germany, 1965 07/29 Felippe M. Motta born in Maceio, Alagoas, Brazil, 1988 08/02 Gabor Kovesdan born in Budapest, Hungary, 1987 08/03 Peter Holm born in Copenhagen, Denmark, 1955 08/05 Alfred Perlstein born in Brooklyn, New York, United States, 1978 08/06 Anton Berezin born in Dnepropetrovsk, Ukraine, 1970 08/06 John-Mark Gurney born in Detroit, Michigan, United States, 1978 08/06 Damjan Marion born in Rijeka, Croatia, 1978 08/07 Jonathan Mini born in San Mateo, California, United States, 1979 08/08 Mikolaj Golub born in Kharkov, USSR, 1977 08/08 Juergen Lock died in Bremen, Germany, 2015 08/09 Stefan Farfeleder died in Wien, Austria, 2015 08/10 Julio Merino born in Barcelona, Spain, 1984 08/10 Peter Pentchev born in Sofia, Bulgaria, 1977 08/12 Joe Marcus Clarke born in Lakeland, Florida, United States, 1976 08/12 Max Brazhnikov born in Leningradskaya, Russian Federation, 1979 08/14 Stefan Esser born in Cologne, Nordrhein-Westfalen, Germany, 1961 08/16 Andrey Chernov died in Moscow, Russian Federation, 2017 08/17 Olivier Houchard born in Nancy, France, 1980 08/19 Chin-San Huang born in Yi-Lan, Taiwan, Republic of China, 1979 08/19 Pav Lucistnik born in Kutna Hora, Czech Republic, 1980 08/20 Michael Heffner born in Cleona, Pennsylvania, United States, 1981 08/21 Jason A. 
Harmening born in Fort Wayne, Indiana, United States, 1981 08/24 Mark Linimon born in Houston, Texas, United States, 1955 08/24 Alexander Botero-Lowry died in San Francisco, California, United States, 2012 08/25 Beech Rintoul born in Oakland, California, United States, 1952 08/25 Jean Milanez Melo born in Divinopolis, Minas Gerais, Brazil, 1982 08/26 Scott Long born in Chicago, Illinois, United States, 1974 08/26 Dima Ruban born in Nalchik, USSR, 1970 08/26 Marc Fonvieille born in Avignon, France, 1972 08/26 Herve Quiroz born in Aix-en-Provence, France, 1977 08/27 Andrey Chernov born in Moscow, USSR, 1966 08/27 Tony Finch born in London, United Kingdom, 1974 08/27 Michael Johnson born in Morganton, North Carolina, United States, 1980 08/28 Norikatsu Shigemura born in Fujisawa, Kanagawa, Japan, 1974 08/29 Thomas Gellekum born in Moenchengladbach, Nordrhein-Westfalen, Germany, 1967 08/29 Max Laier born in Karlsruhe, Germany, 1981 09/01 Pyun YongHyeon born in Kimcheon, Korea, 1968 09/01 William Grzybowski born in Parana, Brazil, 1988 09/03 Max Khon born in Novosibirsk, USSR, 1976 09/03 Allan Jude born in Hamilton, Ontario, Canada, 1984 09/03 Cheng-Lung Sung born in Taipei, Taiwan, Republic of China, 1977 09/05 Mark Robert Vaughan Murray born in Harare, Mashonaland, Zimbabwe, 1961 09/05 Adrian Harold Chadd born in Perth, Western Australia, Australia, 1979 09/05 Rodrigo Osorio born in Montevideo, Uruguay, 1975 09/06 Eric Joyner born in Fairfax, Virginia, United States, 1991 09/07 Tim Bishop born in Cornwall, United Kingdom, 1978 09/07 Chris Rees born in Kettering, United Kingdom, 1987 09/08 Boris Samorodov born in Krasnodar, Russian Federation, 1963 09/09 Yoshio Mita born in Hiroshima, Japan, 1972 +09/09 Steven Hartland born in Wordsley, United Kingdom, 1973 09/10 Wesley R. Peters born in Hartford, Alabama, United States, 1961 09/12 Weongyo Jeong born in Haman, Korea, 1980 09/12 Benedict Christopher Reuschling born in Darmstadt, Germany, 1981 09/12 William C. Fumerola II born in Detroit, Michigan, United States, 1981 09/14 Matthew Seaman born in Bristol, United Kingdom, 1965 09/15 Aleksandr Rybalko born in Odessa, Ukraine, 1977 09/15 Dima Panov born in Khabarovsk, Russian Federation, 1978 09/16 Maksim Yevmenkin born in Taganrog, USSR, 1974 09/17 Maxim Bolotin born in Rostov-on-Don, Russian Federation, 1976 09/18 Matthew Fleming born in Cleveland, Ohio, United States, 1975 09/20 Kevin Lo born in Taipei, Taiwan, Republic of China, 1972 09/21 Alex Kozlov born in Bila Tserkva, Ukraine, 1970 09/21 Gleb Kurtsou born in Minsk, Belarus, 1984 09/22 Alan Somers born in San Antonio, Texas, United States, 1982 09/22 Bryan Drewery born in San Diego, California, United States, 1984 09/23 Martin Matuska born in Bratislava, Slovakia, 1979 09/24 Larry Rosenman born in Queens, New York, United States, 1957 09/27 Kyle Evans born in Oklahoma City, Oklahoma, United States, 1991 09/27 Neil Blakey-Milner born in Port Elizabeth, South Africa, 1978 09/27 Renato Botelho born in Araras, Sao Paulo, Brazil, 1979 09/28 Greg Lehey born in Melbourne, Victoria, Australia, 1948 09/28 Alex Dupre born in Milano, Italy, 1980 09/29 Matthew Hunt born in Johnstown, Pennsylvania, United States, 1976 09/30 Mark Felder born in Prairie du Chien, Wisconsin, United States, 1985 09/30 Hiten M. 
Pandya born in Dar-es-Salaam, Tanzania, East Africa, 1986 09/30 Third quarter status reports are due on 10/15 10/02 Beat Gaetzi born in Zurich, Switzerland, 1980 10/02 Grzegorz Blach born in Starachowice, Poland, 1985 10/05 Hiroki Sato born in Yamagata, Japan, 1977 10/05 Chris Costello born in Houston, Texas, United States, 1985 10/09 Stefan Walter born in Werne, Nordrhein-Westfalen, Germany, 1978 10/11 Rick Macklem born in Ontario, Canada, 1955 10/12 Pawel Jakub Dawidek born in Radzyn Podlaski, Poland, 1980 10/15 Maxim Konovalov born in Khabarovsk, USSR, 1973 10/15 Eugene Grosbein born in Novokuznetsk, Russian Republic, USSR, 1976 10/16 Remko Lodder born in Rotterdam, the Netherlands, 1983 10/17 Maho NAKATA born in Osaka, Japan, 1974 10/18 Sheldon Hearn born in Cape Town, Western Cape, South Africa, 1974 10/18 Vladimir Kondratyev born in Ryazan, USSR, 1975 10/19 Nicholas Souchu born in Suresnes, Hauts-de-Seine, France, 1972 10/19 Nick Barkas born in Longview, Washington, United States, 1981 10/19 Pedro Giffuni born in Bogotá, Colombia, 1968 10/20 Joel Dahl born in Bitterna, Skaraborg, Sweden, 1983 10/20 Dmitry Marakasov born in Moscow, Russian Federation, 1984 10/21 Ben Smithurst born in Sheffield, South Yorkshire, United Kingdom, 1981 10/22 Jean-Sebastien Pedron born in Redon, Ille-et-Vilaine, France, 1980 10/23 Mario Sergio Fujikawa Ferreira born in Brasilia, Distrito Federal, Brazil, 1976 10/23 Romain Tartière born in Clermont-Ferrand, France, 1984 10/25 Eric Melville born in Los Gatos, California, United States, 1980 10/25 Julien Laffaye born in Toulouse, France, 1988 10/25 Ashish SHUKLA born in Kanpur, India, 1985 10/25 Toomas Soome born Estonia, 1971 10/26 Matthew Ahrens born in United States, 1979 10/26 Philip M. Gollucci born in Silver Spring, Maryland, United States, 1979 10/27 Takanori Watanabe born in Numazu, Shizuoka, Japan, 1972 10/30 Olli Hauer born in Sindelfingen, Germany, 1968 10/31 Taras Korenko born in Cherkasy region, Ukraine, 1980 11/03 Ryan Stone born in Ottawa, Ontario, Canada, 1985 11/05 M. Warner Losh born in Kansas City, Kansas, United States, 1966 11/06 Michael Zhilin born in Stary Oskol, USSR, 1985 11/08 Joseph R. Mingrone born in Charlottetown, Prince Edward Island, Canada, 1976 11/09 Coleman Kane born in Cincinnati, Ohio, United States, 1980 11/09 Antoine Brodin born in Bagnolet, France, 1981 11/10 Gregory Neil Shapiro born in Providence, Rhode Island, United States, 1970 11/11 Danilo E. Gondolfo born in Lobato, Parana, Brazil, 1987 11/13 John Baldwin born in Stuart, Virginia, United States, 1977 11/14 Jeremie Le Hen born in Nancy, France, 1980 11/15 Lars Engels born in Hilden, Nordrhein-Westfalen, Germany, 1980 11/15 Tijl Coosemans born in Duffel, Belgium, 1983 11/16 Jose Maria Alcaide Salinas born in Madrid, Spain, 1962 11/16 Matt Joras born in Evanston, Illinois, United States, 1992 11/17 Ralf S. 
Engelschall born in Dachau, Bavaria, Germany, 1972 11/18 Thomas Quinot born in Paris, France, 1977 11/19 Konstantin Belousov born in Kiev, USSR, 1972 11/20 Dmitry Morozovsky born in Moscow, USSR, 1968 11/20 Gavin Atkinson born in Middlesbrough, United Kingdom, 1979 11/21 Mark Johnston born in Toronto, Ontario, Canada, 1989 11/22 Frederic Culot born in Saint-Germain-En-Laye, France, 1976 11/23 Josef Lawrence Karthauser born in Pembury, Kent, United Kingdom, 1972 11/23 Sepherosa Ziehau born in Shanghai, China, 1980 11/24 Andrey Zakhvatov born in Chelyabinsk, Russian Federation, 1974 11/24 Daniel Gerzo born in Bratislava, Slovakia, 1986 11/28 Nik Clayton born in Peterborough, United Kingdom, 1973 11/28 Stanislav Sedov born in Chelyabinsk, USSR, 1985 12/01 Hajimu Umemoto born in Nara, Japan, 1961 12/01 Alexey Dokuchaev born in Magadan, USSR, 1980 12/02 Ermal Luçi born in Tirane, Albania, 1980 12/03 Diane Bruce born in Ottawa, Ontario, Canada, 1952 12/04 Mariusz Zaborski born in Skierniewice, Poland, 1990 12/05 Ivan Voras born in Slavonski Brod, Croatia, 1981 12/06 Stefan Farfeleder born in Wien, Austria, 1980 12/08 Michael Tuexen born in Oldenburg, Germany, 1966 12/11 Ganael Laplanche born in Reims, France, 1980 12/15 James FitzGibbon born in Amersham, Buckinghamshire, United Kingdom, 1974 12/15 Timur I. Bakeyev born in Kazan, Republic of Tatarstan, USSR, 1974 12/18 Chris Timmons born in Ellensburg, Washington, United States, 1964 12/18 Dag-Erling Smorgrav born in Brussels, Belgium, 1977 12/18 Muhammad Moinur Rahman born in Dhaka, Bangladesh, 1983 12/18 Semen Ustimenko born in Novosibirsk, Russian Federation, 1979 12/19 Stephen Hurd born in Estevan, Saskatchewan, Canada, 1975 12/19 Emmanuel Vadot born in Decines-Charpieu, France, 1983 -12/20 Sean Bruno born in Monterey, California, USA, 1974 +12/20 Sean Bruno born in Monterey, California, United States, 1974 12/21 Rong-En Fan born in Taipei, Taiwan, Republic of China, 1982 12/22 Alan L. Cox born in Warren, Ohio, United States, 1964 12/22 Maxim Sobolev born in Dnepropetrovsk, Ukraine, 1976 12/23 Sean Chittenden born in Seattle, Washington, United States, 1979 12/23 Alejandro Pulver born in Buenos Aires, Argentina, 1989 12/24 Jochen Neumeister born in Heidenheim, Germany, 1975 12/24 Guido Falsi born in Firenze, Italy, 1978 12/25 Niclas Zeising born in Stockholm, Sweden, 1986 12/28 Soren Schmidt born in Maribo, Denmark, 1960 12/28 Ade Lovett born in London, England, 1969 12/28 Marius Strobl born in Cham, Bavaria, Germany, 1978 12/31 Edwin Groothuis born in Geldrop, the Netherlands, 1970 12/31 Fourth quarter status reports are due on 01/15 #endif /* !_calendar_freebsd_ */ Index: projects/runtime-coverage/usr.bin/netstat/inet.c =================================================================== --- projects/runtime-coverage/usr.bin/netstat/inet.c (revision 322921) +++ projects/runtime-coverage/usr.bin/netstat/inet.c (revision 322922) @@ -1,1388 +1,1400 @@ /*- * Copyright (c) 1983, 1988, 1993, 1995 * The Regents of the University of California. All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. 
Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ #if 0 #ifndef lint static char sccsid[] = "@(#)inet.c 8.5 (Berkeley) 5/24/95"; #endif /* not lint */ #endif #include __FBSDID("$FreeBSD$"); #include #include #include #include #include #include #include #include #include #include #include #include #include #ifdef INET6 #include #endif /* INET6 */ #include #include #include #include #include #include #include #include #include #define TCPSTATES #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include "netstat.h" #include "nl_defs.h" void inetprint(const char *, struct in_addr *, int, const char *, int, const int); #ifdef INET6 static int udp_done, tcp_done, sdp_done; #endif /* INET6 */ static int pcblist_sysctl(int proto, const char *name, char **bufp) { const char *mibvar; char *buf; size_t len; switch (proto) { case IPPROTO_TCP: mibvar = "net.inet.tcp.pcblist"; break; case IPPROTO_UDP: mibvar = "net.inet.udp.pcblist"; break; case IPPROTO_DIVERT: mibvar = "net.inet.divert.pcblist"; break; default: mibvar = "net.inet.raw.pcblist"; break; } if (strncmp(name, "sdp", 3) == 0) mibvar = "net.inet.sdp.pcblist"; len = 0; if (sysctlbyname(mibvar, 0, &len, 0, 0) < 0) { if (errno != ENOENT) xo_warn("sysctl: %s", mibvar); return (0); } if ((buf = malloc(len)) == NULL) { xo_warnx("malloc %lu bytes", (u_long)len); return (0); } if (sysctlbyname(mibvar, buf, &len, 0, 0) < 0) { xo_warn("sysctl: %s", mibvar); free(buf); return (0); } *bufp = buf; return (1); } /* * Copied directly from uipc_socket2.c. We leave out some fields that are in * nested structures that aren't used to avoid extra work. 
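 *
 * sbtoxsockbuf() below copies the in-kernel socket buffer counters into
 * the exported xsockbuf structure that netstat reports; note that the
 * kernel's sb_ccc field is exported under the traditional sb_cc name.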
*/ static void sbtoxsockbuf(struct sockbuf *sb, struct xsockbuf *xsb) { xsb->sb_cc = sb->sb_ccc; xsb->sb_hiwat = sb->sb_hiwat; xsb->sb_mbcnt = sb->sb_mbcnt; xsb->sb_mcnt = sb->sb_mcnt; xsb->sb_ccnt = sb->sb_ccnt; xsb->sb_mbmax = sb->sb_mbmax; xsb->sb_lowat = sb->sb_lowat; xsb->sb_flags = sb->sb_flags; xsb->sb_timeo = sb->sb_timeo; } int sotoxsocket(struct socket *so, struct xsocket *xso) { struct protosw proto; struct domain domain; bzero(xso, sizeof *xso); xso->xso_len = sizeof *xso; xso->xso_so = so; xso->so_type = so->so_type; xso->so_options = so->so_options; xso->so_linger = so->so_linger; xso->so_state = so->so_state; xso->so_pcb = so->so_pcb; if (kread((uintptr_t)so->so_proto, &proto, sizeof(proto)) != 0) return (-1); xso->xso_protocol = proto.pr_protocol; if (kread((uintptr_t)proto.pr_domain, &domain, sizeof(domain)) != 0) return (-1); xso->xso_family = domain.dom_family; xso->so_timeo = so->so_timeo; xso->so_error = so->so_error; if (SOLISTENING(so)) { xso->so_qlen = so->sol_qlen; xso->so_incqlen = so->sol_incqlen; xso->so_qlimit = so->sol_qlimit; } else { sbtoxsockbuf(&so->so_snd, &xso->so_snd); sbtoxsockbuf(&so->so_rcv, &xso->so_rcv); xso->so_oobmark = so->so_oobmark; } return (0); } /* * Print a summary of connections related to an Internet * protocol. For TCP, also give state of connection. * Listening processes (aflag) are suppressed unless the * -a (all) flag is specified. */ void protopr(u_long off, const char *name, int af1, int proto) { static int first = 1; int istcp; char *buf; const char *vchar; struct xtcpcb *tp; struct xinpcb *inp; struct xinpgen *xig, *oxig; struct xsocket *so; istcp = 0; switch (proto) { case IPPROTO_TCP: #ifdef INET6 if (strncmp(name, "sdp", 3) != 0) { if (tcp_done != 0) return; else tcp_done = 1; } else { if (sdp_done != 0) return; else sdp_done = 1; } #endif istcp = 1; break; case IPPROTO_UDP: #ifdef INET6 if (udp_done != 0) return; else udp_done = 1; #endif break; } if (!pcblist_sysctl(proto, name, &buf)) return; oxig = xig = (struct xinpgen *)buf; for (xig = (struct xinpgen *)((char *)xig + xig->xig_len); xig->xig_len > sizeof(struct xinpgen); xig = (struct xinpgen *)((char *)xig + xig->xig_len)) { if (istcp) { tp = (struct xtcpcb *)xig; inp = &tp->xt_inp; } else { inp = (struct xinpcb *)xig; } so = &inp->xi_socket; /* Ignore sockets for protocols other than the desired one. */ if (so->xso_protocol != proto) continue; /* Ignore PCBs which were freed during copyout. */ if (inp->inp_gencnt > oxig->xig_gen) continue; if ((af1 == AF_INET && (inp->inp_vflag & INP_IPV4) == 0) #ifdef INET6 || (af1 == AF_INET6 && (inp->inp_vflag & INP_IPV6) == 0) #endif /* INET6 */ || (af1 == AF_UNSPEC && ((inp->inp_vflag & INP_IPV4) == 0 #ifdef INET6 && (inp->inp_vflag & INP_IPV6) == 0 #endif /* INET6 */ )) ) continue; if (!aflag && ( (istcp && tp->t_state == TCPS_LISTEN) || (af1 == AF_INET && inet_lnaof(inp->inp_laddr) == INADDR_ANY) #ifdef INET6 || (af1 == AF_INET6 && IN6_IS_ADDR_UNSPECIFIED(&inp->in6p_laddr)) #endif /* INET6 */ || (af1 == AF_UNSPEC && (((inp->inp_vflag & INP_IPV4) != 0 && inet_lnaof(inp->inp_laddr) == INADDR_ANY) #ifdef INET6 || ((inp->inp_vflag & INP_IPV6) != 0 && IN6_IS_ADDR_UNSPECIFIED(&inp->in6p_laddr)) #endif )) )) continue; if (first) { if (!Lflag) { xo_emit("Active Internet connections"); if (aflag) xo_emit(" (including servers)"); } else xo_emit( "Current listen queue sizes (qlen/incqlen/maxqlen)"); xo_emit("\n"); if (Aflag) xo_emit("{T:/%-*s} ", 2 * (int)sizeof(void *), "Tcpcb"); if (Lflag) xo_emit((Aflag && !Wflag) ? 
"{T:/%-5.5s} {T:/%-32.32s} {T:/%-18.18s}" : ((!Wflag || af1 == AF_INET) ? "{T:/%-5.5s} {T:/%-32.32s} {T:/%-22.22s}" : "{T:/%-5.5s} {T:/%-32.32s} {T:/%-45.45s}"), "Proto", "Listen", "Local Address"); else if (Tflag) xo_emit((Aflag && !Wflag) ? "{T:/%-5.5s} {T:/%-6.6s} {T:/%-6.6s} {T:/%-6.6s} {T:/%-18.18s} {T:/%s}" : ((!Wflag || af1 == AF_INET) ? "{T:/%-5.5s} {T:/%-6.6s} {T:/%-6.6s} {T:/%-6.6s} {T:/%-22.22s} {T:/%s}" : "{T:/%-5.5s} {T:/%-6.6s} {T:/%-6.6s} {T:/%-6.6s} {T:/%-45.45s} {T:/%s}"), "Proto", "Rexmit", "OOORcv", "0-win", "Local Address", "Foreign Address"); else { xo_emit((Aflag && !Wflag) ? "{T:/%-5.5s} {T:/%-6.6s} {T:/%-6.6s} {T:/%-18.18s} {T:/%-18.18s}" : ((!Wflag || af1 == AF_INET) ? "{T:/%-5.5s} {T:/%-6.6s} {T:/%-6.6s} {T:/%-22.22s} {T:/%-22.22s}" : "{T:/%-5.5s} {T:/%-6.6s} {T:/%-6.6s} {T:/%-45.45s} {T:/%-45.45s}"), "Proto", "Recv-Q", "Send-Q", "Local Address", "Foreign Address"); if (!xflag && !Rflag) xo_emit(" (state)"); } if (xflag) { xo_emit(" {T:/%-6.6s} {T:/%-6.6s} {T:/%-6.6s} " "{T:/%-6.6s} {T:/%-6.6s} {T:/%-6.6s} " "{T:/%-6.6s} {T:/%-6.6s} {T:/%-6.6s} " "{T:/%-6.6s} {T:/%-6.6s} {T:/%-6.6s}", "R-MBUF", "S-MBUF", "R-CLUS", "S-CLUS", "R-HIWA", "S-HIWA", "R-LOWA", "S-LOWA", "R-BCNT", "S-BCNT", "R-BMAX", "S-BMAX"); xo_emit(" {T:/%7.7s} {T:/%7.7s} {T:/%7.7s} " "{T:/%7.7s} {T:/%7.7s} {T:/%7.7s}", "rexmt", "persist", "keep", "2msl", "delack", "rcvtime"); } else if (Rflag) { xo_emit(" {T:/%8.8s} {T:/%5.5s}", "flowid", "ftype"); } xo_emit("\n"); first = 0; } if (Lflag && so->so_qlimit == 0) continue; xo_open_instance("socket"); if (Aflag) { if (istcp) xo_emit("{q:address/%*lx} ", 2 * (int)sizeof(void *), (u_long)inp->inp_ppcb); else xo_emit("{q:address/%*lx} ", 2 * (int)sizeof(void *), (u_long)so->so_pcb); } #ifdef INET6 if ((inp->inp_vflag & INP_IPV6) != 0) vchar = ((inp->inp_vflag & INP_IPV4) != 0) ? "46" : "6"; else #endif vchar = ((inp->inp_vflag & INP_IPV4) != 0) ? 
"4" : ""; if (istcp && (tp->t_flags & TF_TOE) != 0) xo_emit("{:protocol/%-3.3s%-2.2s/%s%s} ", "toe", vchar); else xo_emit("{:protocol/%-3.3s%-2.2s/%s%s} ", name, vchar); if (Lflag) { char buf1[33]; snprintf(buf1, sizeof buf1, "%u/%u/%u", so->so_qlen, so->so_incqlen, so->so_qlimit); xo_emit("{:listen-queue-sizes/%-32.32s} ", buf1); } else if (Tflag) { if (istcp) xo_emit("{:sent-retransmit-packets/%6u} " "{:received-out-of-order-packets/%6u} " "{:sent-zero-window/%6u} ", tp->t_sndrexmitpack, tp->t_rcvoopack, tp->t_sndzerowin); else xo_emit("{P:/%21s}", ""); } else { xo_emit("{:receive-bytes-waiting/%6u} " "{:send-bytes-waiting/%6u} ", so->so_rcv.sb_cc, so->so_snd.sb_cc); } if (numeric_port) { if (inp->inp_vflag & INP_IPV4) { inetprint("local", &inp->inp_laddr, (int)inp->inp_lport, name, 1, af1); if (!Lflag) inetprint("remote", &inp->inp_faddr, (int)inp->inp_fport, name, 1, af1); } #ifdef INET6 else if (inp->inp_vflag & INP_IPV6) { inet6print("local", &inp->in6p_laddr, (int)inp->inp_lport, name, 1); if (!Lflag) inet6print("remote", &inp->in6p_faddr, (int)inp->inp_fport, name, 1); } /* else nothing printed now */ #endif /* INET6 */ } else if (inp->inp_flags & INP_ANONPORT) { if (inp->inp_vflag & INP_IPV4) { inetprint("local", &inp->inp_laddr, (int)inp->inp_lport, name, 1, af1); if (!Lflag) inetprint("remote", &inp->inp_faddr, (int)inp->inp_fport, name, 0, af1); } #ifdef INET6 else if (inp->inp_vflag & INP_IPV6) { inet6print("local", &inp->in6p_laddr, (int)inp->inp_lport, name, 1); if (!Lflag) inet6print("remote", &inp->in6p_faddr, (int)inp->inp_fport, name, 0); } /* else nothing printed now */ #endif /* INET6 */ } else { if (inp->inp_vflag & INP_IPV4) { inetprint("local", &inp->inp_laddr, (int)inp->inp_lport, name, 0, af1); if (!Lflag) inetprint("remote", &inp->inp_faddr, (int)inp->inp_fport, name, inp->inp_lport != inp->inp_fport, af1); } #ifdef INET6 else if (inp->inp_vflag & INP_IPV6) { inet6print("local", &inp->in6p_laddr, (int)inp->inp_lport, name, 0); if (!Lflag) inet6print("remote", &inp->in6p_faddr, (int)inp->inp_fport, name, inp->inp_lport != inp->inp_fport); } /* else nothing printed now */ #endif /* INET6 */ } if (xflag) { xo_emit("{:receive-mbufs/%6u} {:send-mbufs/%6u} " "{:receive-clusters/%6u} {:send-clusters/%6u} " "{:receive-high-water/%6u} {:send-high-water/%6u} " "{:receive-low-water/%6u} {:send-low-water/%6u} " "{:receive-mbuf-bytes/%6u} {:send-mbuf-bytes/%6u} " "{:receive-mbuf-bytes-max/%6u} " "{:send-mbuf-bytes-max/%6u}", so->so_rcv.sb_mcnt, so->so_snd.sb_mcnt, so->so_rcv.sb_ccnt, so->so_snd.sb_ccnt, so->so_rcv.sb_hiwat, so->so_snd.sb_hiwat, so->so_rcv.sb_lowat, so->so_snd.sb_lowat, so->so_rcv.sb_mbcnt, so->so_snd.sb_mbcnt, so->so_rcv.sb_mbmax, so->so_snd.sb_mbmax); if (istcp) xo_emit(" {:retransmit-timer/%4d.%02d} " "{:persist-timer/%4d.%02d} " "{:keepalive-timer/%4d.%02d} " "{:msl2-timer/%4d.%02d} " "{:delay-ack-timer/%4d.%02d} " "{:inactivity-timer/%4d.%02d}", tp->tt_rexmt / 1000, (tp->tt_rexmt % 1000) / 10, tp->tt_persist / 1000, (tp->tt_persist % 1000) / 10, tp->tt_keep / 1000, (tp->tt_keep % 1000) / 10, tp->tt_2msl / 1000, (tp->tt_2msl % 1000) / 10, tp->tt_delack / 1000, (tp->tt_delack % 1000) / 10, tp->t_rcvtime / 1000, (tp->t_rcvtime % 1000) / 10); } if (istcp && !Lflag && !xflag && !Tflag && !Rflag) { if (tp->t_state < 0 || tp->t_state >= TCP_NSTATES) xo_emit("{:tcp-state/%d}", tp->t_state); else { xo_emit("{:tcp-state/%s}", tcpstates[tp->t_state]); #if defined(TF_NEEDSYN) && defined(TF_NEEDFIN) /* Show T/TCP `hidden state' */ if (tp->t_flags & 
(TF_NEEDSYN|TF_NEEDFIN)) xo_emit("{:need-syn-or-fin/*}"); #endif /* defined(TF_NEEDSYN) && defined(TF_NEEDFIN) */ } } if (Rflag) { /* XXX: is this right Alfred */ xo_emit(" {:flow-id/%08x} {:flow-type/%5d}", inp->inp_flowid, inp->inp_flowtype); } xo_emit("\n"); xo_close_instance("socket"); } if (xig != oxig && xig->xig_gen != oxig->xig_gen) { if (oxig->xig_count > xig->xig_count) { xo_emit("Some {d:lost/%s} sockets may have been " "deleted.\n", name); } else if (oxig->xig_count < xig->xig_count) { xo_emit("Some {d:created/%s} sockets may have been " "created.\n", name); } else { xo_emit("Some {d:changed/%s} sockets may have been " "created or deleted.\n", name); } } free(buf); } /* * Dump TCP statistics structure. */ void tcp_stats(u_long off, const char *name, int af1 __unused, int proto __unused) { struct tcpstat tcpstat; uint64_t tcps_states[TCP_NSTATES]; #ifdef INET6 if (tcp_done != 0) return; else tcp_done = 1; #endif if (fetch_stats("net.inet.tcp.stats", off, &tcpstat, sizeof(tcpstat), kread_counters) != 0) return; if (fetch_stats_ro("net.inet.tcp.states", nl[N_TCPS_STATES].n_value, &tcps_states, sizeof(tcps_states), kread_counters) != 0) return; xo_open_container("tcp"); xo_emit("{T:/%s}:\n", name); #define p(f, m) if (tcpstat.f || sflag <= 1) \ xo_emit(m, (uintmax_t )tcpstat.f, plural(tcpstat.f)) #define p1a(f, m) if (tcpstat.f || sflag <= 1) \ xo_emit(m, (uintmax_t )tcpstat.f) #define p2(f1, f2, m) if (tcpstat.f1 || tcpstat.f2 || sflag <= 1) \ xo_emit(m, (uintmax_t )tcpstat.f1, plural(tcpstat.f1), \ (uintmax_t )tcpstat.f2, plural(tcpstat.f2)) #define p2a(f1, f2, m) if (tcpstat.f1 || tcpstat.f2 || sflag <= 1) \ xo_emit(m, (uintmax_t )tcpstat.f1, plural(tcpstat.f1), \ (uintmax_t )tcpstat.f2) #define p3(f, m) if (tcpstat.f || sflag <= 1) \ xo_emit(m, (uintmax_t )tcpstat.f, pluralies(tcpstat.f)) p(tcps_sndtotal, "\t{:sent-packets/%ju} {N:/packet%s sent}\n"); p2(tcps_sndpack,tcps_sndbyte, "\t\t{:sent-data-packets/%ju} " "{N:/data packet%s} ({:sent-data-bytes/%ju} {N:/byte%s})\n"); p2(tcps_sndrexmitpack, tcps_sndrexmitbyte, "\t\t" "{:sent-retransmitted-packets/%ju} {N:/data packet%s} " "({:sent-retransmitted-bytes/%ju} {N:/byte%s}) " "{N:retransmitted}\n"); p(tcps_sndrexmitbad, "\t\t" "{:sent-unnecessary-retransmitted-packets/%ju} " "{N:/data packet%s unnecessarily retransmitted}\n"); p(tcps_mturesent, "\t\t{:sent-resends-by-mtu-discovery/%ju} " "{N:/resend%s initiated by MTU discovery}\n"); p2a(tcps_sndacks, tcps_delack, "\t\t{:sent-ack-only-packets/%ju} " "{N:/ack-only packet%s/} ({:sent-packets-delayed/%ju} " "{N:delayed})\n"); p(tcps_sndurg, "\t\t{:sent-urg-only-packets/%ju} " "{N:/URG only packet%s}\n"); p(tcps_sndprobe, "\t\t{:sent-window-probe-packets/%ju} " "{N:/window probe packet%s}\n"); p(tcps_sndwinup, "\t\t{:sent-window-update-packets/%ju} " "{N:/window update packet%s}\n"); p(tcps_sndctrl, "\t\t{:sent-control-packets/%ju} " "{N:/control packet%s}\n"); p(tcps_rcvtotal, "\t{:received-packets/%ju} " "{N:/packet%s received}\n"); p2(tcps_rcvackpack, tcps_rcvackbyte, "\t\t" "{:received-ack-packets/%ju} {N:/ack%s} " "{N:(for} {:received-ack-bytes/%ju} {N:/byte%s})\n"); p(tcps_rcvdupack, "\t\t{:received-duplicate-acks/%ju} " "{N:/duplicate ack%s}\n"); p(tcps_rcvacktoomuch, "\t\t{:received-acks-for-unsent-data/%ju} " "{N:/ack%s for unsent data}\n"); p2(tcps_rcvpack, tcps_rcvbyte, "\t\t" "{:received-in-sequence-packets/%ju} {N:/packet%s} " "({:received-in-sequence-bytes/%ju} {N:/byte%s}) " "{N:received in-sequence}\n"); p2(tcps_rcvduppack, tcps_rcvdupbyte, "\t\t" 
"{:received-completely-duplicate-packets/%ju} " "{N:/completely duplicate packet%s} " "({:received-completely-duplicate-bytes/%ju} {N:/byte%s})\n"); p(tcps_pawsdrop, "\t\t{:received-old-duplicate-packets/%ju} " "{N:/old duplicate packet%s}\n"); p2(tcps_rcvpartduppack, tcps_rcvpartdupbyte, "\t\t" "{:received-some-duplicate-packets/%ju} " "{N:/packet%s with some dup. data} " "({:received-some-duplicate-bytes/%ju} {N:/byte%s duped/})\n"); p2(tcps_rcvoopack, tcps_rcvoobyte, "\t\t{:received-out-of-order/%ju} " "{N:/out-of-order packet%s} " "({:received-out-of-order-bytes/%ju} {N:/byte%s})\n"); p2(tcps_rcvpackafterwin, tcps_rcvbyteafterwin, "\t\t" "{:received-after-window-packets/%ju} {N:/packet%s} " "({:received-after-window-bytes/%ju} {N:/byte%s}) " "{N:of data after window}\n"); p(tcps_rcvwinprobe, "\t\t{:received-window-probes/%ju} " "{N:/window probe%s}\n"); p(tcps_rcvwinupd, "\t\t{:receive-window-update-packets/%ju} " "{N:/window update packet%s}\n"); p(tcps_rcvafterclose, "\t\t{:received-after-close-packets/%ju} " "{N:/packet%s received after close}\n"); p(tcps_rcvbadsum, "\t\t{:discard-bad-checksum/%ju} " "{N:/discarded for bad checksum%s}\n"); p(tcps_rcvbadoff, "\t\t{:discard-bad-header-offset/%ju} " "{N:/discarded for bad header offset field%s}\n"); p1a(tcps_rcvshort, "\t\t{:discard-too-short/%ju} " "{N:discarded because packet too short}\n"); p1a(tcps_rcvmemdrop, "\t\t{:discard-memory-problems/%ju} " "{N:discarded due to memory problems}\n"); p(tcps_connattempt, "\t{:connection-requests/%ju} " "{N:/connection request%s}\n"); p(tcps_accepts, "\t{:connections-accepts/%ju} " "{N:/connection accept%s}\n"); p(tcps_badsyn, "\t{:bad-connection-attempts/%ju} " "{N:/bad connection attempt%s}\n"); p(tcps_listendrop, "\t{:listen-queue-overflows/%ju} " "{N:/listen queue overflow%s}\n"); p(tcps_badrst, "\t{:ignored-in-window-resets/%ju} " "{N:/ignored RSTs in the window%s}\n"); p(tcps_connects, "\t{:connections-established/%ju} " "{N:/connection%s established (including accepts)}\n"); p(tcps_usedrtt, "\t\t{:connections-hostcache-rtt/%ju} " "{N:/time%s used RTT from hostcache}\n"); p(tcps_usedrttvar, "\t\t{:connections-hostcache-rttvar/%ju} " "{N:/time%s used RTT variance from hostcache}\n"); p(tcps_usedssthresh, "\t\t{:connections-hostcache-ssthresh/%ju} " "{N:/time%s used slow-start threshold from hostcache}\n"); p2(tcps_closed, tcps_drops, "\t{:connections-closed/%ju} " "{N:/connection%s closed (including} " "{:connection-drops/%ju} {N:/drop%s})\n"); p(tcps_cachedrtt, "\t\t{:connections-updated-rtt-on-close/%ju} " "{N:/connection%s updated cached RTT on close}\n"); p(tcps_cachedrttvar, "\t\t" "{:connections-updated-variance-on-close/%ju} " "{N:/connection%s updated cached RTT variance on close}\n"); p(tcps_cachedssthresh, "\t\t" "{:connections-updated-ssthresh-on-close/%ju} " "{N:/connection%s updated cached ssthresh on close}\n"); p(tcps_conndrops, "\t{:embryonic-connections-dropped/%ju} " "{N:/embryonic connection%s dropped}\n"); p2(tcps_rttupdated, tcps_segstimed, "\t{:segments-updated-rtt/%ju} " "{N:/segment%s updated rtt (of} " "{:segment-update-attempts/%ju} {N:/attempt%s})\n"); p(tcps_rexmttimeo, "\t{:retransmit-timeouts/%ju} " "{N:/retransmit timeout%s}\n"); p(tcps_timeoutdrop, "\t\t" "{:connections-dropped-by-retransmit-timeout/%ju} " "{N:/connection%s dropped by rexmit timeout}\n"); p(tcps_persisttimeo, "\t{:persist-timeout/%ju} " "{N:/persist timeout%s}\n"); p(tcps_persistdrop, "\t\t" "{:connections-dropped-by-persist-timeout/%ju} " "{N:/connection%s dropped by persist timeout}\n"); 
p(tcps_finwait2_drops, "\t" "{:connections-dropped-by-finwait2-timeout/%ju} " "{N:/Connection%s (fin_wait_2) dropped because of timeout}\n"); p(tcps_keeptimeo, "\t{:keepalive-timeout/%ju} " "{N:/keepalive timeout%s}\n"); p(tcps_keepprobe, "\t\t{:keepalive-probes/%ju} " "{N:/keepalive probe%s sent}\n"); p(tcps_keepdrops, "\t\t{:connections-dropped-by-keepalives/%ju} " "{N:/connection%s dropped by keepalive}\n"); p(tcps_predack, "\t{:ack-header-predictions/%ju} " "{N:/correct ACK header prediction%s}\n"); p(tcps_preddat, "\t{:data-packet-header-predictions/%ju} " "{N:/correct data packet header prediction%s}\n"); xo_open_container("syncache"); p3(tcps_sc_added, "\t{:entries-added/%ju} " "{N:/syncache entr%s added}\n"); p1a(tcps_sc_retransmitted, "\t\t{:retransmitted/%ju} " "{N:/retransmitted}\n"); p1a(tcps_sc_dupsyn, "\t\t{:duplicates/%ju} {N:/dupsyn}\n"); p1a(tcps_sc_dropped, "\t\t{:dropped/%ju} {N:/dropped}\n"); p1a(tcps_sc_completed, "\t\t{:completed/%ju} {N:/completed}\n"); p1a(tcps_sc_bucketoverflow, "\t\t{:bucket-overflow/%ju} " "{N:/bucket overflow}\n"); p1a(tcps_sc_cacheoverflow, "\t\t{:cache-overflow/%ju} " "{N:/cache overflow}\n"); p1a(tcps_sc_reset, "\t\t{:reset/%ju} {N:/reset}\n"); p1a(tcps_sc_stale, "\t\t{:stale/%ju} {N:/stale}\n"); p1a(tcps_sc_aborted, "\t\t{:aborted/%ju} {N:/aborted}\n"); p1a(tcps_sc_badack, "\t\t{:bad-ack/%ju} {N:/badack}\n"); p1a(tcps_sc_unreach, "\t\t{:unreachable/%ju} {N:/unreach}\n"); p(tcps_sc_zonefail, "\t\t{:zone-failures/%ju} {N:/zone failure%s}\n"); p(tcps_sc_sendcookie, "\t{:sent-cookies/%ju} {N:/cookie%s sent}\n"); p(tcps_sc_recvcookie, "\t{:receivd-cookies/%ju} " "{N:/cookie%s received}\n"); xo_close_container("syncache"); xo_open_container("hostcache"); p3(tcps_hc_added, "\t{:entries-added/%ju} " "{N:/hostcache entr%s added}\n"); p1a(tcps_hc_bucketoverflow, "\t\t{:buffer-overflows/%ju} " "{N:/bucket overflow}\n"); xo_close_container("hostcache"); xo_open_container("sack"); p(tcps_sack_recovery_episode, "\t{:recovery-episodes/%ju} " "{N:/SACK recovery episode%s}\n"); p(tcps_sack_rexmits, "\t{:segment-retransmits/%ju} " "{N:/segment rexmit%s in SACK recovery episodes}\n"); p(tcps_sack_rexmit_bytes, "\t{:byte-retransmits/%ju} " "{N:/byte rexmit%s in SACK recovery episodes}\n"); p(tcps_sack_rcv_blocks, "\t{:received-blocks/%ju} " "{N:/SACK option%s (SACK blocks) received}\n"); p(tcps_sack_send_blocks, "\t{:sent-option-blocks/%ju} " "{N:/SACK option%s (SACK blocks) sent}\n"); p1a(tcps_sack_sboverflow, "\t{:scoreboard-overflows/%ju} " "{N:/SACK scoreboard overflow}\n"); xo_close_container("sack"); xo_open_container("ecn"); p(tcps_ecn_ce, "\t{:ce-packets/%ju} " "{N:/packet%s with ECN CE bit set}\n"); p(tcps_ecn_ect0, "\t{:ect0-packets/%ju} " "{N:/packet%s with ECN ECT(0) bit set}\n"); p(tcps_ecn_ect1, "\t{:ect1-packets/%ju} " "{N:/packet%s with ECN ECT(1) bit set}\n"); p(tcps_ecn_shs, "\t{:handshakes/%ju} " "{N:/successful ECN handshake%s}\n"); p(tcps_ecn_rcwnd, "\t{:congestion-reductions/%ju} " "{N:/time%s ECN reduced the congestion window}\n"); xo_close_container("ecn"); xo_open_container("tcp-signature"); p(tcps_sig_rcvgoodsig, "\t{:received-good-signature/%ju} " "{N:/packet%s with matching signature received}\n"); p(tcps_sig_rcvbadsig, "\t{:received-bad-signature/%ju} " "{N:/packet%s with bad signature received}\n"); p(tcps_sig_err_buildsig, "\t{:failed-make-signature/%ju} " "{N:/time%s failed to make signature due to no SA}\n"); p(tcps_sig_err_sigopt, "\t{:no-signature-expected/%ju} " "{N:/time%s unexpected signature received}\n"); 
p(tcps_sig_err_nosigopt, "\t{:no-signature-provided/%ju} " "{N:/time%s no signature provided by segment}\n"); + + xo_close_container("tcp-signature"); + xo_open_container("pmtud"); + + p(tcps_pmtud_blackhole_activated, "\t{:pmtud-activated/%ju} " + "{N:/Path MTU discovery black hole detection activation%s}\n"); + p(tcps_pmtud_blackhole_activated_min_mss, + "\t{:pmtud-activated-min-mss/%ju} " + "{N:/Path MTU discovery black hole detection min MSS activation%s}\n"); + p(tcps_pmtud_blackhole_failed, "\t{:pmtud-failed/%ju} " + "{N:/Path MTU discovery black hole detection failure%s}\n"); #undef p #undef p1a #undef p2 #undef p2a #undef p3 - xo_close_container("tcp-signature"); + xo_close_container("pmtud"); + xo_open_container("TCP connection count by state"); xo_emit("{T:/TCP connection count by state}:\n"); for (int i = 0; i < TCP_NSTATES; i++) { /* * XXXGL: is there a way in libxo to use %s * in the "content string" of a format * string? I failed to do that, that's why * a temporary buffer is used to construct * format string for xo_emit(). */ char fmtbuf[80]; if (sflag > 1 && tcps_states[i] == 0) continue; snprintf(fmtbuf, sizeof(fmtbuf), "\t{:%s/%%ju} " "{Np:/connection ,connections} in %s state\n", tcpstates[i], tcpstates[i]); xo_emit(fmtbuf, (uintmax_t )tcps_states[i]); } xo_close_container("TCP connection count by state"); xo_close_container("tcp"); } /* * Dump UDP statistics structure. */ void udp_stats(u_long off, const char *name, int af1 __unused, int proto __unused) { struct udpstat udpstat; uint64_t delivered; #ifdef INET6 if (udp_done != 0) return; else udp_done = 1; #endif if (fetch_stats("net.inet.udp.stats", off, &udpstat, sizeof(udpstat), kread_counters) != 0) return; xo_open_container("udp"); xo_emit("{T:/%s}:\n", name); #define p(f, m) if (udpstat.f || sflag <= 1) \ xo_emit("\t" m, (uintmax_t)udpstat.f, plural(udpstat.f)) #define p1a(f, m) if (udpstat.f || sflag <= 1) \ xo_emit("\t" m, (uintmax_t)udpstat.f) p(udps_ipackets, "{:received-datagrams/%ju} " "{N:/datagram%s received}\n"); p1a(udps_hdrops, "{:dropped-incomplete-headers/%ju} " "{N:/with incomplete header}\n"); p1a(udps_badlen, "{:dropped-bad-data-length/%ju} " "{N:/with bad data length field}\n"); p1a(udps_badsum, "{:dropped-bad-checksum/%ju} " "{N:/with bad checksum}\n"); p1a(udps_nosum, "{:dropped-no-checksum/%ju} " "{N:/with no checksum}\n"); p1a(udps_noport, "{:dropped-no-socket/%ju} " "{N:/dropped due to no socket}\n"); p(udps_noportbcast, "{:dropped-broadcast-multicast/%ju} " "{N:/broadcast\\/multicast datagram%s undelivered}\n"); p1a(udps_fullsock, "{:dropped-full-socket-buffer/%ju} " "{N:/dropped due to full socket buffers}\n"); p1a(udpps_pcbhashmiss, "{:not-for-hashed-pcb/%ju} " "{N:/not for hashed pcb}\n"); delivered = udpstat.udps_ipackets - udpstat.udps_hdrops - udpstat.udps_badlen - udpstat.udps_badsum - udpstat.udps_noport - udpstat.udps_noportbcast - udpstat.udps_fullsock; if (delivered || sflag <= 1) xo_emit("\t{:delivered-packets/%ju} {N:/delivered}\n", (uint64_t)delivered); p(udps_opackets, "{:output-packets/%ju} {N:/datagram%s output}\n"); /* the next statistic is cumulative in udps_noportbcast */ p(udps_filtermcast, "{:multicast-source-filter-matches/%ju} " "{N:/time%s multicast source filter matched}\n"); #undef p #undef p1a xo_close_container("udp"); } /* * Dump CARP statistics structure. 
*/ void carp_stats(u_long off, const char *name, int af1 __unused, int proto __unused) { struct carpstats carpstat; if (fetch_stats("net.inet.carp.stats", off, &carpstat, sizeof(carpstat), kread_counters) != 0) return; xo_open_container(name); xo_emit("{T:/%s}:\n", name); #define p(f, m) if (carpstat.f || sflag <= 1) \ xo_emit(m, (uintmax_t)carpstat.f, plural(carpstat.f)) #define p2(f, m) if (carpstat.f || sflag <= 1) \ xo_emit(m, (uintmax_t)carpstat.f) p(carps_ipackets, "\t{:received-inet-packets/%ju} " "{N:/packet%s received (IPv4)}\n"); p(carps_ipackets6, "\t{:received-inet6-packets/%ju} " "{N:/packet%s received (IPv6)}\n"); p(carps_badttl, "\t\t{:dropped-wrong-ttl/%ju} " "{N:/packet%s discarded for wrong TTL}\n"); p(carps_hdrops, "\t\t{:dropped-short-header/%ju} " "{N:/packet%s shorter than header}\n"); p(carps_badsum, "\t\t{:dropped-bad-checksum/%ju} " "{N:/discarded for bad checksum%s}\n"); p(carps_badver, "\t\t{:dropped-bad-version/%ju} " "{N:/discarded packet%s with a bad version}\n"); p2(carps_badlen, "\t\t{:dropped-short-packet/%ju} " "{N:/discarded because packet too short}\n"); p2(carps_badauth, "\t\t{:dropped-bad-authentication/%ju} " "{N:/discarded for bad authentication}\n"); p2(carps_badvhid, "\t\t{:dropped-bad-vhid/%ju} " "{N:/discarded for bad vhid}\n"); p2(carps_badaddrs, "\t\t{:dropped-bad-address-list/%ju} " "{N:/discarded because of a bad address list}\n"); p(carps_opackets, "\t{:sent-inet-packets/%ju} " "{N:/packet%s sent (IPv4)}\n"); p(carps_opackets6, "\t{:sent-inet6-packets/%ju} " "{N:/packet%s sent (IPv6)}\n"); p2(carps_onomem, "\t\t{:send-failed-memory-error/%ju} " "{N:/send failed due to mbuf memory error}\n"); #if notyet p(carps_ostates, "\t\t{:send-state-updates/%s} " "{N:/state update%s sent}\n"); #endif #undef p #undef p2 xo_close_container(name); } /* * Dump IP statistics structure. 
*/ void ip_stats(u_long off, const char *name, int af1 __unused, int proto __unused) { struct ipstat ipstat; if (fetch_stats("net.inet.ip.stats", off, &ipstat, sizeof(ipstat), kread_counters) != 0) return; xo_open_container(name); xo_emit("{T:/%s}:\n", name); #define p(f, m) if (ipstat.f || sflag <= 1) \ xo_emit(m, (uintmax_t )ipstat.f, plural(ipstat.f)) #define p1a(f, m) if (ipstat.f || sflag <= 1) \ xo_emit(m, (uintmax_t )ipstat.f) p(ips_total, "\t{:received-packets/%ju} " "{N:/total packet%s received}\n"); p(ips_badsum, "\t{:dropped-bad-checksum/%ju} " "{N:/bad header checksum%s}\n"); p1a(ips_toosmall, "\t{:dropped-below-minimum-size/%ju} " "{N:/with size smaller than minimum}\n"); p1a(ips_tooshort, "\t{:dropped-short-packets/%ju} " "{N:/with data size < data length}\n"); p1a(ips_toolong, "\t{:dropped-too-long/%ju} " "{N:/with ip length > max ip packet size}\n"); p1a(ips_badhlen, "\t{:dropped-short-header-length/%ju} " "{N:/with header length < data size}\n"); p1a(ips_badlen, "\t{:dropped-short-data/%ju} " "{N:/with data length < header length}\n"); p1a(ips_badoptions, "\t{:dropped-bad-options/%ju} " "{N:/with bad options}\n"); p1a(ips_badvers, "\t{:dropped-bad-version/%ju} " "{N:/with incorrect version number}\n"); p(ips_fragments, "\t{:received-fragments/%ju} " "{N:/fragment%s received}\n"); p(ips_fragdropped, "\t{:dropped-fragments/%ju} " "{N:/fragment%s dropped (dup or out of space)}\n"); p(ips_fragtimeout, "\t{:dropped-fragments-after-timeout/%ju} " "{N:/fragment%s dropped after timeout}\n"); p(ips_reassembled, "\t{:reassembled-packets/%ju} " "{N:/packet%s reassembled ok}\n"); p(ips_delivered, "\t{:received-local-packets/%ju} " "{N:/packet%s for this host}\n"); p(ips_noproto, "\t{:dropped-unknown-protocol/%ju} " "{N:/packet%s for unknown\\/unsupported protocol}\n"); p(ips_forward, "\t{:forwarded-packets/%ju} " "{N:/packet%s forwarded}"); p(ips_fastforward, " ({:fast-forwarded-packets/%ju} " "{N:/packet%s fast forwarded})"); if (ipstat.ips_forward || sflag <= 1) xo_emit("\n"); p(ips_cantforward, "\t{:packets-cannot-forward/%ju} " "{N:/packet%s not forwardable}\n"); p(ips_notmember, "\t{:received-unknown-multicast-group/%ju} " "{N:/packet%s received for unknown multicast group}\n"); p(ips_redirectsent, "\t{:redirects-sent/%ju} " "{N:/redirect%s sent}\n"); p(ips_localout, "\t{:sent-packets/%ju} " "{N:/packet%s sent from this host}\n"); p(ips_rawout, "\t{:send-packets-fabricated-header/%ju} " "{N:/packet%s sent with fabricated ip header}\n"); p(ips_odropped, "\t{:discard-no-mbufs/%ju} " "{N:/output packet%s dropped due to no bufs, etc.}\n"); p(ips_noroute, "\t{:discard-no-route/%ju} " "{N:/output packet%s discarded due to no route}\n"); p(ips_fragmented, "\t{:sent-fragments/%ju} " "{N:/output datagram%s fragmented}\n"); p(ips_ofragments, "\t{:fragments-created/%ju} " "{N:/fragment%s created}\n"); p(ips_cantfrag, "\t{:discard-cannot-fragment/%ju} " "{N:/datagram%s that can't be fragmented}\n"); p(ips_nogif, "\t{:discard-tunnel-no-gif/%ju} " "{N:/tunneling packet%s that can't find gif}\n"); p(ips_badaddr, "\t{:discard-bad-address/%ju} " "{N:/datagram%s with bad address in header}\n"); #undef p #undef p1a xo_close_container(name); } /* * Dump ARP statistics structure. 
*/ void arp_stats(u_long off, const char *name, int af1 __unused, int proto __unused) { struct arpstat arpstat; if (fetch_stats("net.link.ether.arp.stats", off, &arpstat, sizeof(arpstat), kread_counters) != 0) return; xo_open_container(name); xo_emit("{T:/%s}:\n", name); #define p(f, m) if (arpstat.f || sflag <= 1) \ xo_emit("\t" m, (uintmax_t)arpstat.f, plural(arpstat.f)) #define p2(f, m) if (arpstat.f || sflag <= 1) \ xo_emit("\t" m, (uintmax_t)arpstat.f, pluralies(arpstat.f)) p(txrequests, "{:sent-requests/%ju} {N:/ARP request%s sent}\n"); p2(txreplies, "{:sent-replies/%ju} {N:/ARP repl%s sent}\n"); p(rxrequests, "{:received-requests/%ju} " "{N:/ARP request%s received}\n"); p2(rxreplies, "{:received-replies/%ju} " "{N:/ARP repl%s received}\n"); p(received, "{:received-packers/%ju} " "{N:/ARP packet%s received}\n"); p(dropped, "{:dropped-no-entry/%ju} " "{N:/total packet%s dropped due to no ARP entry}\n"); p(timeouts, "{:entries-timeout/%ju} " "{N:/ARP entry%s timed out}\n"); p(dupips, "{:dropped-duplicate-address/%ju} " "{N:/Duplicate IP%s seen}\n"); #undef p #undef p2 xo_close_container(name); } static const char *icmpnames[ICMP_MAXTYPE + 1] = { "echo reply", /* RFC 792 */ "#1", "#2", "destination unreachable", /* RFC 792 */ "source quench", /* RFC 792 */ "routing redirect", /* RFC 792 */ "#6", "#7", "echo", /* RFC 792 */ "router advertisement", /* RFC 1256 */ "router solicitation", /* RFC 1256 */ "time exceeded", /* RFC 792 */ "parameter problem", /* RFC 792 */ "time stamp", /* RFC 792 */ "time stamp reply", /* RFC 792 */ "information request", /* RFC 792 */ "information request reply", /* RFC 792 */ "address mask request", /* RFC 950 */ "address mask reply", /* RFC 950 */ "#19", "#20", "#21", "#22", "#23", "#24", "#25", "#26", "#27", "#28", "#29", "icmp traceroute", /* RFC 1393 */ "datagram conversion error", /* RFC 1475 */ "mobile host redirect", "IPv6 where-are-you", "IPv6 i-am-here", "mobile registration req", "mobile registration reply", "domain name request", /* RFC 1788 */ "domain name reply", /* RFC 1788 */ "icmp SKIP", "icmp photuris", /* RFC 2521 */ }; /* * Dump ICMP statistics. 
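 *
 * Illustrative note, not part of the original comment: the per-type
 * histograms below are emitted as libxo lists ("output-histogram" and
 * "input-histogram"), with one instance per ICMP type whose counter is
 * non-zero, keyed by the name from icmpnames[] or "unknown ICMP #<n>"
 * for types without a name.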
*/ void icmp_stats(u_long off, const char *name, int af1 __unused, int proto __unused) { struct icmpstat icmpstat; size_t len; int i, first; if (fetch_stats("net.inet.icmp.stats", off, &icmpstat, sizeof(icmpstat), kread_counters) != 0) return; xo_open_container(name); xo_emit("{T:/%s}:\n", name); #define p(f, m) if (icmpstat.f || sflag <= 1) \ xo_emit(m, icmpstat.f, plural(icmpstat.f)) #define p1a(f, m) if (icmpstat.f || sflag <= 1) \ xo_emit(m, icmpstat.f) #define p2(f, m) if (icmpstat.f || sflag <= 1) \ xo_emit(m, icmpstat.f, plurales(icmpstat.f)) p(icps_error, "\t{:icmp-calls/%lu} " "{N:/call%s to icmp_error}\n"); p(icps_oldicmp, "\t{:errors-not-from-message/%lu} " "{N:/error%s not generated in response to an icmp message}\n"); for (first = 1, i = 0; i < ICMP_MAXTYPE + 1; i++) { if (icmpstat.icps_outhist[i] != 0) { if (first) { xo_open_list("output-histogram"); xo_emit("\tOutput histogram:\n"); first = 0; } xo_open_instance("output-histogram"); if (icmpnames[i] != NULL) xo_emit("\t\t{k:name/%s}: {:count/%lu}\n", icmpnames[i], icmpstat.icps_outhist[i]); else xo_emit("\t\tunknown ICMP #{k:name/%d}: " "{:count/%lu}\n", i, icmpstat.icps_outhist[i]); xo_close_instance("output-histogram"); } } if (!first) xo_close_list("output-histogram"); p(icps_badcode, "\t{:dropped-bad-code/%lu} " "{N:/message%s with bad code fields}\n"); p(icps_tooshort, "\t{:dropped-too-short/%lu} " "{N:/message%s less than the minimum length}\n"); p(icps_checksum, "\t{:dropped-bad-checksum/%lu} " "{N:/message%s with bad checksum}\n"); p(icps_badlen, "\t{:dropped-bad-length/%lu} " "{N:/message%s with bad length}\n"); p1a(icps_bmcastecho, "\t{:dropped-multicast-echo/%lu} " "{N:/multicast echo requests ignored}\n"); p1a(icps_bmcasttstamp, "\t{:dropped-multicast-timestamp/%lu} " "{N:/multicast timestamp requests ignored}\n"); for (first = 1, i = 0; i < ICMP_MAXTYPE + 1; i++) { if (icmpstat.icps_inhist[i] != 0) { if (first) { xo_open_list("input-histogram"); xo_emit("\tInput histogram:\n"); first = 0; } xo_open_instance("input-histogram"); if (icmpnames[i] != NULL) xo_emit("\t\t{k:name/%s}: {:count/%lu}\n", icmpnames[i], icmpstat.icps_inhist[i]); else xo_emit( "\t\tunknown ICMP #{k:name/%d}: {:count/%lu}\n", i, icmpstat.icps_inhist[i]); xo_close_instance("input-histogram"); } } if (!first) xo_close_list("input-histogram"); p(icps_reflect, "\t{:sent-packets/%lu} " "{N:/message response%s generated}\n"); p2(icps_badaddr, "\t{:discard-invalid-return-address/%lu} " "{N:/invalid return address%s}\n"); p(icps_noroute, "\t{:discard-no-route/%lu} " "{N:/no return route%s}\n"); #undef p #undef p1a #undef p2 if (live) { len = sizeof i; if (sysctlbyname("net.inet.icmp.maskrepl", &i, &len, NULL, 0) < 0) return; xo_emit("\tICMP address mask responses are " "{q:icmp-address-responses/%sabled}\n", i ? "en" : "dis"); } xo_close_container(name); } /* * Dump IGMP statistics structure. 
*/ void igmp_stats(u_long off, const char *name, int af1 __unused, int proto __unused) { struct igmpstat igmpstat; if (fetch_stats("net.inet.igmp.stats", 0, &igmpstat, sizeof(igmpstat), kread) != 0) return; if (igmpstat.igps_version != IGPS_VERSION_3) { xo_warnx("%s: version mismatch (%d != %d)", __func__, igmpstat.igps_version, IGPS_VERSION_3); } if (igmpstat.igps_len != IGPS_VERSION3_LEN) { xo_warnx("%s: size mismatch (%d != %d)", __func__, igmpstat.igps_len, IGPS_VERSION3_LEN); } xo_open_container(name); xo_emit("{T:/%s}:\n", name); #define p64(f, m) if (igmpstat.f || sflag <= 1) \ xo_emit(m, (uintmax_t) igmpstat.f, plural(igmpstat.f)) #define py64(f, m) if (igmpstat.f || sflag <= 1) \ xo_emit(m, (uintmax_t) igmpstat.f, pluralies(igmpstat.f)) p64(igps_rcv_total, "\t{:received-messages/%ju} " "{N:/message%s received}\n"); p64(igps_rcv_tooshort, "\t{:dropped-too-short/%ju} " "{N:/message%s received with too few bytes}\n"); p64(igps_rcv_badttl, "\t{:dropped-wrong-ttl/%ju} " "{N:/message%s received with wrong TTL}\n"); p64(igps_rcv_badsum, "\t{:dropped-bad-checksum/%ju} " "{N:/message%s received with bad checksum}\n"); py64(igps_rcv_v1v2_queries, "\t{:received-membership-queries/%ju} " "{N:/V1\\/V2 membership quer%s received}\n"); py64(igps_rcv_v3_queries, "\t{:received-v3-membership-queries/%ju} " "{N:/V3 membership quer%s received}\n"); py64(igps_rcv_badqueries, "\t{:dropped-membership-queries/%ju} " "{N:/membership quer%s received with invalid field(s)}\n"); py64(igps_rcv_gen_queries, "\t{:received-general-queries/%ju} " "{N:/general quer%s received}\n"); py64(igps_rcv_group_queries, "\t{:received-group-queries/%ju} " "{N:/group quer%s received}\n"); py64(igps_rcv_gsr_queries, "\t{:received-group-source-queries/%ju} " "{N:/group-source quer%s received}\n"); py64(igps_drop_gsr_queries, "\t{:dropped-group-source-queries/%ju} " "{N:/group-source quer%s dropped}\n"); p64(igps_rcv_reports, "\t{:received-membership-requests/%ju} " "{N:/membership report%s received}\n"); p64(igps_rcv_badreports, "\t{:dropped-membership-reports/%ju} " "{N:/membership report%s received with invalid field(s)}\n"); p64(igps_rcv_ourreports, "\t" "{:received-membership-reports-matching/%ju} " "{N:/membership report%s received for groups to which we belong}" "\n"); p64(igps_rcv_nora, "\t{:received-v3-reports-no-router-alert/%ju} " "{N:/V3 report%s received without Router Alert}\n"); p64(igps_snd_reports, "\t{:sent-membership-reports/%ju} " "{N:/membership report%s sent}\n"); #undef p64 #undef py64 xo_close_container(name); } /* * Dump PIM statistics structure. */ void pim_stats(u_long off __unused, const char *name, int af1 __unused, int proto __unused) { struct pimstat pimstat; if (fetch_stats("net.inet.pim.stats", off, &pimstat, sizeof(pimstat), kread_counters) != 0) return; xo_open_container(name); xo_emit("{T:/%s}:\n", name); #define p(f, m) if (pimstat.f || sflag <= 1) \ xo_emit(m, (uintmax_t)pimstat.f, plural(pimstat.f)) #define py(f, m) if (pimstat.f || sflag <= 1) \ xo_emit(m, (uintmax_t)pimstat.f, pimstat.f != 1 ? 
"ies" : "y") p(pims_rcv_total_msgs, "\t{:received-messages/%ju} " "{N:/message%s received}\n"); p(pims_rcv_total_bytes, "\t{:received-bytes/%ju} " "{N:/byte%s received}\n"); p(pims_rcv_tooshort, "\t{:dropped-too-short/%ju} " "{N:/message%s received with too few bytes}\n"); p(pims_rcv_badsum, "\t{:dropped-bad-checksum/%ju} " "{N:/message%s received with bad checksum}\n"); p(pims_rcv_badversion, "\t{:dropped-bad-version/%ju} " "{N:/message%s received with bad version}\n"); p(pims_rcv_registers_msgs, "\t{:received-data-register-messages/%ju} " "{N:/data register message%s received}\n"); p(pims_rcv_registers_bytes, "\t{:received-data-register-bytes/%ju} " "{N:/data register byte%s received}\n"); p(pims_rcv_registers_wrongiif, "\t" "{:received-data-register-wrong-interface/%ju} " "{N:/data register message%s received on wrong iif}\n"); p(pims_rcv_badregisters, "\t{:received-bad-registers/%ju} " "{N:/bad register%s received}\n"); p(pims_snd_registers_msgs, "\t{:sent-data-register-messages/%ju} " "{N:/data register message%s sent}\n"); p(pims_snd_registers_bytes, "\t{:sent-data-register-bytes/%ju} " "{N:/data register byte%s sent}\n"); #undef p #undef py xo_close_container(name); } /* * Pretty print an Internet address (net address + port). */ void inetprint(const char *container, struct in_addr *in, int port, const char *proto, int num_port, const int af1) { struct servent *sp = 0; char line[80], *cp; int width; size_t alen, plen; if (container) xo_open_container(container); if (Wflag) snprintf(line, sizeof(line), "%s.", inetname(in)); else snprintf(line, sizeof(line), "%.*s.", (Aflag && !num_port) ? 12 : 16, inetname(in)); alen = strlen(line); cp = line + alen; if (!num_port && port) sp = getservbyport((int)port, proto); if (sp || port == 0) snprintf(cp, sizeof(line) - alen, "%.15s ", sp ? sp->s_name : "*"); else snprintf(cp, sizeof(line) - alen, "%d ", ntohs((u_short)port)); width = (Aflag && !Wflag) ? 18 : ((!Wflag || af1 == AF_INET) ? 22 : 45); if (Wflag) xo_emit("{d:target/%-*s} ", width, line); else xo_emit("{d:target/%-*.*s} ", width, width, line); plen = strlen(cp) - 1; alen--; xo_emit("{e:address/%*.*s}{e:port/%*.*s}", alen, alen, line, plen, plen, cp); if (container) xo_close_container(container); } /* * Construct an Internet address representation. * If numeric_addr has been supplied, give * numeric value, otherwise try for symbolic name. 
*/ char * inetname(struct in_addr *inp) { char *cp; static char line[MAXHOSTNAMELEN]; struct hostent *hp; struct netent *np; cp = 0; if (!numeric_addr && inp->s_addr != INADDR_ANY) { int net = inet_netof(*inp); int lna = inet_lnaof(*inp); if (lna == INADDR_ANY) { np = getnetbyaddr(net, AF_INET); if (np) cp = np->n_name; } if (cp == NULL) { hp = gethostbyaddr((char *)inp, sizeof (*inp), AF_INET); if (hp) { cp = hp->h_name; trimdomain(cp, strlen(cp)); } } } if (inp->s_addr == INADDR_ANY) strcpy(line, "*"); else if (cp) { strlcpy(line, cp, sizeof(line)); } else { inp->s_addr = ntohl(inp->s_addr); #define C(x) ((u_int)((x) & 0xff)) snprintf(line, sizeof(line), "%u.%u.%u.%u", C(inp->s_addr >> 24), C(inp->s_addr >> 16), C(inp->s_addr >> 8), C(inp->s_addr)); } return (line); } Index: projects/runtime-coverage/usr.bin/truss/syscalls.c =================================================================== --- projects/runtime-coverage/usr.bin/truss/syscalls.c (revision 322921) +++ projects/runtime-coverage/usr.bin/truss/syscalls.c (revision 322922) @@ -1,2414 +1,2410 @@ /* * Copyright 1997 Sean Eric Fagan * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. All advertising materials mentioning features or use of this software * must display the following acknowledgement: * This product includes software developed by Sean Eric Fagan * 4. Neither the name of the author may be used to endorse or promote * products derived from this software without specific prior written * permission. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ #include __FBSDID("$FreeBSD$"); /* * This file has routines used to print out system calls and their * arguments. */ #include #include #include #include #include #include #include #include #define _WANT_FREEBSD11_STAT #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include "truss.h" #include "extern.h" #include "syscall.h" /* * This should probably be in its own file, sorted alphabetically. 
*/ static struct syscall decoded_syscalls[] = { /* Native ABI */ { .name = "__acl_aclcheck_fd", .ret_type = 1, .nargs = 3, .args = { { Int, 0 }, { Acltype, 1 }, { Ptr, 2 } } }, { .name = "__acl_aclcheck_file", .ret_type = 1, .nargs = 3, .args = { { Name, 0 }, { Acltype, 1 }, { Ptr, 2 } } }, { .name = "__acl_aclcheck_link", .ret_type = 1, .nargs = 3, .args = { { Name, 0 }, { Acltype, 1 }, { Ptr, 2 } } }, { .name = "__acl_delete_fd", .ret_type = 1, .nargs = 2, .args = { { Int, 0 }, { Acltype, 1 } } }, { .name = "__acl_delete_file", .ret_type = 1, .nargs = 2, .args = { { Name, 0 }, { Acltype, 1 } } }, { .name = "__acl_delete_link", .ret_type = 1, .nargs = 2, .args = { { Name, 0 }, { Acltype, 1 } } }, { .name = "__acl_get_fd", .ret_type = 1, .nargs = 3, .args = { { Int, 0 }, { Acltype, 1 }, { Ptr, 2 } } }, { .name = "__acl_get_file", .ret_type = 1, .nargs = 3, .args = { { Name, 0 }, { Acltype, 1 }, { Ptr, 2 } } }, { .name = "__acl_get_link", .ret_type = 1, .nargs = 3, .args = { { Name, 0 }, { Acltype, 1 }, { Ptr, 2 } } }, { .name = "__acl_set_fd", .ret_type = 1, .nargs = 3, .args = { { Int, 0 }, { Acltype, 1 }, { Ptr, 2 } } }, { .name = "__acl_set_file", .ret_type = 1, .nargs = 3, .args = { { Name, 0 }, { Acltype, 1 }, { Ptr, 2 } } }, { .name = "__acl_set_link", .ret_type = 1, .nargs = 3, .args = { { Name, 0 }, { Acltype, 1 }, { Ptr, 2 } } }, { .name = "__cap_rights_get", .ret_type = 1, .nargs = 3, .args = { { Int, 0 }, { Int, 1 }, { CapRights | OUT, 2 } } }, { .name = "__getcwd", .ret_type = 1, .nargs = 2, .args = { { Name | OUT, 0 }, { Int, 1 } } }, { .name = "_umtx_op", .ret_type = 1, .nargs = 5, .args = { { Ptr, 0 }, { Umtxop, 1 }, { LongHex, 2 }, { Ptr, 3 }, { Ptr, 4 } } }, { .name = "accept", .ret_type = 1, .nargs = 3, .args = { { Int, 0 }, { Sockaddr | OUT, 1 }, { Ptr | OUT, 2 } } }, { .name = "access", .ret_type = 1, .nargs = 2, .args = { { Name | IN, 0 }, { Accessmode, 1 } } }, { .name = "bind", .ret_type = 1, .nargs = 3, .args = { { Int, 0 }, { Sockaddr | IN, 1 }, { Socklent, 2 } } }, { .name = "bindat", .ret_type = 1, .nargs = 4, .args = { { Atfd, 0 }, { Int, 1 }, { Sockaddr | IN, 2 }, { Int, 3 } } }, { .name = "break", .ret_type = 1, .nargs = 1, .args = { { Ptr, 0 } } }, { .name = "cap_fcntls_get", .ret_type = 1, .nargs = 2, .args = { { Int, 0 }, { CapFcntlRights | OUT, 1 } } }, { .name = "cap_fcntls_limit", .ret_type = 1, .nargs = 2, .args = { { Int, 0 }, { CapFcntlRights, 1 } } }, { .name = "cap_getmode", .ret_type = 1, .nargs = 1, .args = { { PUInt | OUT, 0 } } }, { .name = "cap_rights_limit", .ret_type = 1, .nargs = 2, .args = { { Int, 0 }, { CapRights, 1 } } }, { .name = "chdir", .ret_type = 1, .nargs = 1, .args = { { Name, 0 } } }, { .name = "chflags", .ret_type = 1, .nargs = 2, .args = { { Name | IN, 0 }, { FileFlags, 1 } } }, { .name = "chflagsat", .ret_type = 1, .nargs = 4, .args = { { Atfd, 0 }, { Name | IN, 1 }, { FileFlags, 2 }, { Atflags, 3 } } }, { .name = "chmod", .ret_type = 1, .nargs = 2, .args = { { Name, 0 }, { Octal, 1 } } }, { .name = "chown", .ret_type = 1, .nargs = 3, .args = { { Name, 0 }, { Int, 1 }, { Int, 2 } } }, { .name = "chroot", .ret_type = 1, .nargs = 1, .args = { { Name, 0 } } }, { .name = "clock_gettime", .ret_type = 1, .nargs = 2, .args = { { Int, 0 }, { Timespec | OUT, 1 } } }, { .name = "close", .ret_type = 1, .nargs = 1, .args = { { Int, 0 } } }, { .name = "compat11.fstat", .ret_type = 1, .nargs = 2, .args = { { Int, 0 }, { Stat11 | OUT, 1 } } }, { .name = "compat11.fstatat", .ret_type = 1, .nargs = 4, .args = { { Atfd, 0 }, { Name | IN, 1 }, 
{ Stat11 | OUT, 2 }, { Atflags, 3 } } }, { .name = "compat11.lstat", .ret_type = 1, .nargs = 2, .args = { { Name | IN, 0 }, { Stat11 | OUT, 1 } } }, { .name = "compat11.stat", .ret_type = 1, .nargs = 2, .args = { { Name | IN, 0 }, { Stat11 | OUT, 1 } } }, { .name = "connect", .ret_type = 1, .nargs = 3, .args = { { Int, 0 }, { Sockaddr | IN, 1 }, { Socklent, 2 } } }, { .name = "connectat", .ret_type = 1, .nargs = 4, .args = { { Atfd, 0 }, { Int, 1 }, { Sockaddr | IN, 2 }, { Int, 3 } } }, { .name = "dup", .ret_type = 1, .nargs = 1, .args = { { Int, 0 } } }, { .name = "dup2", .ret_type = 1, .nargs = 2, .args = { { Int, 0 }, { Int, 1 } } }, { .name = "eaccess", .ret_type = 1, .nargs = 2, .args = { { Name | IN, 0 }, { Accessmode, 1 } } }, { .name = "execve", .ret_type = 1, .nargs = 3, .args = { { Name | IN, 0 }, { ExecArgs | IN, 1 }, { ExecEnv | IN, 2 } } }, { .name = "exit", .ret_type = 0, .nargs = 1, .args = { { Hex, 0 } } }, { .name = "extattr_delete_fd", .ret_type = 1, .nargs = 3, .args = { { Int, 0 }, { Extattrnamespace, 1 }, { Name, 2 } } }, { .name = "extattr_delete_file", .ret_type = 1, .nargs = 3, .args = { { Name, 0 }, { Extattrnamespace, 1 }, { Name, 2 } } }, { .name = "extattr_delete_link", .ret_type = 1, .nargs = 3, .args = { { Name, 0 }, { Extattrnamespace, 1 }, { Name, 2 } } }, { .name = "extattr_get_fd", .ret_type = 1, .nargs = 5, .args = { { Int, 0 }, { Extattrnamespace, 1 }, { Name, 2 }, { BinString | OUT, 3 }, { Sizet, 4 } } }, { .name = "extattr_get_file", .ret_type = 1, .nargs = 5, .args = { { Name, 0 }, { Extattrnamespace, 1 }, { Name, 2 }, { BinString | OUT, 3 }, { Sizet, 4 } } }, { .name = "extattr_get_link", .ret_type = 1, .nargs = 5, .args = { { Name, 0 }, { Extattrnamespace, 1 }, { Name, 2 }, { BinString | OUT, 3 }, { Sizet, 4 } } }, { .name = "extattr_list_fd", .ret_type = 1, .nargs = 4, .args = { { Int, 0 }, { Extattrnamespace, 1 }, { BinString | OUT, 2 }, { Sizet, 3 } } }, { .name = "extattr_list_file", .ret_type = 1, .nargs = 4, .args = { { Name, 0 }, { Extattrnamespace, 1 }, { BinString | OUT, 2 }, { Sizet, 3 } } }, { .name = "extattr_list_link", .ret_type = 1, .nargs = 4, .args = { { Name, 0 }, { Extattrnamespace, 1 }, { BinString | OUT, 2 }, { Sizet, 3 } } }, { .name = "extattr_set_fd", .ret_type = 1, .nargs = 5, .args = { { Int, 0 }, { Extattrnamespace, 1 }, { Name, 2 }, { BinString | IN, 3 }, { Sizet, 4 } } }, { .name = "extattr_set_file", .ret_type = 1, .nargs = 5, .args = { { Name, 0 }, { Extattrnamespace, 1 }, { Name, 2 }, { BinString | IN, 3 }, { Sizet, 4 } } }, { .name = "extattr_set_link", .ret_type = 1, .nargs = 5, .args = { { Name, 0 }, { Extattrnamespace, 1 }, { Name, 2 }, { BinString | IN, 3 }, { Sizet, 4 } } }, { .name = "extattrctl", .ret_type = 1, .nargs = 5, .args = { { Name, 0 }, { Hex, 1 }, { Name, 2 }, { Extattrnamespace, 3 }, { Name, 4 } } }, { .name = "faccessat", .ret_type = 1, .nargs = 4, .args = { { Atfd, 0 }, { Name | IN, 1 }, { Accessmode, 2 }, { Atflags, 3 } } }, { .name = "fchflags", .ret_type = 1, .nargs = 2, .args = { { Int, 0 }, { FileFlags, 1 } } }, { .name = "fchmod", .ret_type = 1, .nargs = 2, .args = { { Int, 0 }, { Octal, 1 } } }, { .name = "fchmodat", .ret_type = 1, .nargs = 4, .args = { { Atfd, 0 }, { Name, 1 }, { Octal, 2 }, { Atflags, 3 } } }, { .name = "fchown", .ret_type = 1, .nargs = 3, .args = { { Int, 0 }, { Int, 1 }, { Int, 2 } } }, { .name = "fchownat", .ret_type = 1, .nargs = 5, .args = { { Atfd, 0 }, { Name, 1 }, { Int, 2 }, { Int, 3 }, { Atflags, 4 } } }, { .name = "fcntl", .ret_type = 1, .nargs = 3, .args = { { 
Int, 0 }, { Fcntl, 1 }, { Fcntlflag, 2 } } }, { .name = "flock", .ret_type = 1, .nargs = 2, .args = { { Int, 0 }, { Flockop, 1 } } }, { .name = "fstat", .ret_type = 1, .nargs = 2, .args = { { Int, 0 }, { Stat | OUT, 1 } } }, { .name = "fstatat", .ret_type = 1, .nargs = 4, .args = { { Atfd, 0 }, { Name | IN, 1 }, { Stat | OUT, 2 }, { Atflags, 3 } } }, { .name = "fstatfs", .ret_type = 1, .nargs = 2, .args = { { Int, 0 }, { StatFs | OUT, 1 } } }, { .name = "ftruncate", .ret_type = 1, .nargs = 2, .args = { { Int | IN, 0 }, { QuadHex | IN, 1 } } }, { .name = "futimens", .ret_type = 1, .nargs = 2, .args = { { Int, 0 }, { Timespec2 | IN, 1 } } }, { .name = "futimes", .ret_type = 1, .nargs = 2, .args = { { Int, 0 }, { Timeval2 | IN, 1 } } }, { .name = "futimesat", .ret_type = 1, .nargs = 3, .args = { { Atfd, 0 }, { Name | IN, 1 }, { Timeval2 | IN, 2 } } }, { .name = "getdirentries", .ret_type = 1, .nargs = 4, .args = { { Int, 0 }, { BinString | OUT, 1 }, { Int, 2 }, { PQuadHex | OUT, 3 } } }, { .name = "getfsstat", .ret_type = 1, .nargs = 3, .args = { { Ptr, 0 }, { Long, 1 }, { Getfsstatmode, 2 } } }, { .name = "getitimer", .ret_type = 1, .nargs = 2, .args = { { Int, 0 }, { Itimerval | OUT, 2 } } }, { .name = "getpeername", .ret_type = 1, .nargs = 3, .args = { { Int, 0 }, { Sockaddr | OUT, 1 }, { Ptr | OUT, 2 } } }, { .name = "getpgid", .ret_type = 1, .nargs = 1, .args = { { Int, 0 } } }, { .name = "getpriority", .ret_type = 1, .nargs = 2, .args = { { Priowhich, 0 }, { Int, 1 } } }, { .name = "getrlimit", .ret_type = 1, .nargs = 2, .args = { { Resource, 0 }, { Rlimit | OUT, 1 } } }, { .name = "getrusage", .ret_type = 1, .nargs = 2, .args = { { RusageWho, 0 }, { Rusage | OUT, 1 } } }, { .name = "getsid", .ret_type = 1, .nargs = 1, .args = { { Int, 0 } } }, { .name = "getsockname", .ret_type = 1, .nargs = 3, .args = { { Int, 0 }, { Sockaddr | OUT, 1 }, { Ptr | OUT, 2 } } }, { .name = "getsockopt", .ret_type = 1, .nargs = 5, .args = { { Int, 0 }, { Sockoptlevel, 1 }, { Sockoptname, 2 }, { Ptr | OUT, 3 }, { Ptr | OUT, 4 } } }, { .name = "gettimeofday", .ret_type = 1, .nargs = 2, .args = { { Timeval | OUT, 0 }, { Ptr, 1 } } }, { .name = "ioctl", .ret_type = 1, .nargs = 3, .args = { { Int, 0 }, { Ioctl, 1 }, { Ptr, 2 } } }, { .name = "kevent", .ret_type = 1, .nargs = 6, .args = { { Int, 0 }, { Kevent, 1 }, { Int, 2 }, { Kevent | OUT, 3 }, { Int, 4 }, { Timespec, 5 } } }, { .name = "kill", .ret_type = 1, .nargs = 2, .args = { { Int | IN, 0 }, { Signal | IN, 1 } } }, { .name = "kldfind", .ret_type = 1, .nargs = 1, .args = { { Name | IN, 0 } } }, { .name = "kldfirstmod", .ret_type = 1, .nargs = 1, .args = { { Int, 0 } } }, { .name = "kldload", .ret_type = 1, .nargs = 1, .args = { { Name | IN, 0 } } }, { .name = "kldnext", .ret_type = 1, .nargs = 1, .args = { { Int, 0 } } }, { .name = "kldstat", .ret_type = 1, .nargs = 2, .args = { { Int, 0 }, { Ptr, 1 } } }, { .name = "kldsym", .ret_type = 1, .nargs = 3, .args = { { Int, 0 }, { Kldsymcmd, 1 }, { Ptr, 2 } } }, { .name = "kldunload", .ret_type = 1, .nargs = 1, .args = { { Int, 0 } } }, { .name = "kldunloadf", .ret_type = 1, .nargs = 2, .args = { { Int, 0 }, { Kldunloadflags, 1 } } }, { .name = "kse_release", .ret_type = 0, .nargs = 1, .args = { { Timespec, 0 } } }, { .name = "lchflags", .ret_type = 1, .nargs = 2, .args = { { Name | IN, 0 }, { FileFlags, 1 } } }, { .name = "lchmod", .ret_type = 1, .nargs = 2, .args = { { Name, 0 }, { Octal, 1 } } }, { .name = "lchown", .ret_type = 1, .nargs = 3, .args = { { Name, 0 }, { Int, 1 }, { Int, 2 } } }, { .name = 
"link", .ret_type = 1, .nargs = 2, .args = { { Name, 0 }, { Name, 1 } } }, { .name = "linkat", .ret_type = 1, .nargs = 5, .args = { { Atfd, 0 }, { Name, 1 }, { Atfd, 2 }, { Name, 3 }, { Atflags, 4 } } }, { .name = "listen", .ret_type = 1, .nargs = 2, .args = { { Int, 0 }, { Int, 1 } } }, { .name = "lseek", .ret_type = 2, .nargs = 3, .args = { { Int, 0 }, { QuadHex, 1 }, { Whence, 2 } } }, { .name = "lstat", .ret_type = 1, .nargs = 2, .args = { { Name | IN, 0 }, { Stat | OUT, 1 } } }, { .name = "lutimes", .ret_type = 1, .nargs = 2, .args = { { Name | IN, 0 }, { Timeval2 | IN, 1 } } }, { .name = "madvise", .ret_type = 1, .nargs = 3, .args = { { Ptr, 0 }, { Sizet, 1 }, { Madvice, 2 } } }, { .name = "minherit", .ret_type = 1, .nargs = 3, .args = { { Ptr, 0 }, { Sizet, 1 }, { Minherit, 2 } } }, { .name = "mkdir", .ret_type = 1, .nargs = 2, .args = { { Name, 0 }, { Octal, 1 } } }, { .name = "mkdirat", .ret_type = 1, .nargs = 3, .args = { { Atfd, 0 }, { Name, 1 }, { Octal, 2 } } }, { .name = "mkfifo", .ret_type = 1, .nargs = 2, .args = { { Name, 0 }, { Octal, 1 } } }, { .name = "mkfifoat", .ret_type = 1, .nargs = 3, .args = { { Atfd, 0 }, { Name, 1 }, { Octal, 2 } } }, { .name = "mknod", .ret_type = 1, .nargs = 3, .args = { { Name, 0 }, { Octal, 1 }, { Int, 2 } } }, { .name = "mknodat", .ret_type = 1, .nargs = 4, .args = { { Atfd, 0 }, { Name, 1 }, { Octal, 2 }, { Int, 3 } } }, { .name = "mlock", .ret_type = 1, .nargs = 2, .args = { { Ptr, 0 }, { Sizet, 1 } } }, { .name = "mlockall", .ret_type = 1, .nargs = 1, .args = { { Mlockall, 0 } } }, { .name = "mmap", .ret_type = 1, .nargs = 6, .args = { { Ptr, 0 }, { Sizet, 1 }, { Mprot, 2 }, { Mmapflags, 3 }, { Int, 4 }, { QuadHex, 5 } } }, { .name = "modfind", .ret_type = 1, .nargs = 1, .args = { { Name | IN, 0 } } }, { .name = "mount", .ret_type = 1, .nargs = 4, .args = { { Name, 0 }, { Name, 1 }, { Mountflags, 2 }, { Ptr, 3 } } }, { .name = "mprotect", .ret_type = 1, .nargs = 3, .args = { { Ptr, 0 }, { Sizet, 1 }, { Mprot, 2 } } }, { .name = "msync", .ret_type = 1, .nargs = 3, .args = { { Ptr, 0 }, { Sizet, 1 }, { Msync, 2 } } }, { .name = "munlock", .ret_type = 1, .nargs = 2, .args = { { Ptr, 0 }, { Sizet, 1 } } }, { .name = "munmap", .ret_type = 1, .nargs = 2, .args = { { Ptr, 0 }, { Sizet, 1 } } }, { .name = "nanosleep", .ret_type = 1, .nargs = 1, .args = { { Timespec, 0 } } }, { .name = "nmount", .ret_type = 1, .nargs = 3, .args = { { Ptr, 0 }, { UInt, 1 }, { Mountflags, 2 } } }, { .name = "open", .ret_type = 1, .nargs = 3, .args = { { Name | IN, 0 }, { Open, 1 }, { Octal, 2 } } }, { .name = "openat", .ret_type = 1, .nargs = 4, .args = { { Atfd, 0 }, { Name | IN, 1 }, { Open, 2 }, { Octal, 3 } } }, { .name = "pathconf", .ret_type = 1, .nargs = 2, .args = { { Name | IN, 0 }, { Pathconf, 1 } } }, { .name = "pipe", .ret_type = 1, .nargs = 1, .args = { { PipeFds | OUT, 0 } } }, { .name = "pipe2", .ret_type = 1, .nargs = 2, .args = { { Ptr, 0 }, { Pipe2, 1 } } }, { .name = "poll", .ret_type = 1, .nargs = 3, .args = { { Pollfd, 0 }, { Int, 1 }, { Int, 2 } } }, { .name = "posix_fadvise", .ret_type = 1, .nargs = 4, .args = { { Int, 0 }, { QuadHex, 1 }, { QuadHex, 2 }, { Fadvice, 3 } } }, { .name = "posix_openpt", .ret_type = 1, .nargs = 1, .args = { { Open, 0 } } }, { .name = "pread", .ret_type = 1, .nargs = 4, .args = { { Int, 0 }, { BinString | OUT, 1 }, { Sizet, 2 }, { QuadHex, 3 } } }, { .name = "procctl", .ret_type = 1, .nargs = 4, .args = { { Idtype, 0 }, { Quad, 1 }, { Procctl, 2 }, { Ptr, 3 } } }, { .name = "ptrace", .ret_type = 1, .nargs = 4, 
.args = { { Ptraceop, 0 }, { Int, 1 }, { Ptr, 2 }, { Int, 3 } } }, { .name = "pwrite", .ret_type = 1, .nargs = 4, .args = { { Int, 0 }, { BinString | IN, 1 }, { Sizet, 2 }, { QuadHex, 3 } } }, { .name = "quotactl", .ret_type = 1, .nargs = 4, .args = { { Name, 0 }, { Quotactlcmd, 1 }, { Int, 2 }, { Ptr, 3 } } }, { .name = "read", .ret_type = 1, .nargs = 3, .args = { { Int, 0 }, { BinString | OUT, 1 }, { Sizet, 2 } } }, { .name = "readlink", .ret_type = 1, .nargs = 3, .args = { { Name, 0 }, { Readlinkres | OUT, 1 }, { Sizet, 2 } } }, { .name = "readlinkat", .ret_type = 1, .nargs = 4, .args = { { Atfd, 0 }, { Name, 1 }, { Readlinkres | OUT, 2 }, { Sizet, 3 } } }, { .name = "reboot", .ret_type = 1, .nargs = 1, .args = { { Reboothowto, 0 } } }, { .name = "recvfrom", .ret_type = 1, .nargs = 6, .args = { { Int, 0 }, { BinString | OUT, 1 }, { Sizet, 2 }, { Msgflags, 3 }, { Sockaddr | OUT, 4 }, { Ptr | OUT, 5 } } }, { .name = "recvmsg", .ret_type = 1, .nargs = 3, .args = { { Int, 0 }, { Ptr, 1 }, { Msgflags, 2 } } }, { .name = "rename", .ret_type = 1, .nargs = 2, .args = { { Name, 0 }, { Name, 1 } } }, { .name = "renameat", .ret_type = 1, .nargs = 4, .args = { { Atfd, 0 }, { Name, 1 }, { Atfd, 2 }, { Name, 3 } } }, { .name = "rfork", .ret_type = 1, .nargs = 1, .args = { { Rforkflags, 0 } } }, { .name = "rmdir", .ret_type = 1, .nargs = 1, .args = { { Name, 0 } } }, { .name = "rtprio", .ret_type = 1, .nargs = 3, .args = { { Rtpriofunc, 0 }, { Int, 1 }, { Ptr, 2 } } }, { .name = "rtprio_thread", .ret_type = 1, .nargs = 3, .args = { { Rtpriofunc, 0 }, { Int, 1 }, { Ptr, 2 } } }, { .name = "sched_get_priority_max", .ret_type = 1, .nargs = 1, .args = { { Schedpolicy, 0 } } }, { .name = "sched_get_priority_min", .ret_type = 1, .nargs = 1, .args = { { Schedpolicy, 0 } } }, { .name = "sched_getparam", .ret_type = 1, .nargs = 2, .args = { { Int, 0 }, { Schedparam | OUT, 1 } } }, { .name = "sched_getscheduler", .ret_type = 1, .nargs = 1, .args = { { Int, 0 } } }, { .name = "sched_rr_get_interval", .ret_type = 1, .nargs = 2, .args = { { Int, 0 }, { Timespec | OUT, 1 } } }, { .name = "sched_setparam", .ret_type = 1, .nargs = 2, .args = { { Int, 0 }, { Schedparam, 1 } } }, { .name = "sched_setscheduler", .ret_type = 1, .nargs = 3, .args = { { Int, 0 }, { Schedpolicy, 1 }, { Schedparam, 2 } } }, { .name = "sctp_generic_recvmsg", .ret_type = 1, .nargs = 7, .args = { { Int, 0 }, { Ptr | IN, 1 }, { Int, 2 }, { Sockaddr | OUT, 3 }, { Ptr | OUT, 4 }, { Ptr | OUT, 5 }, { Ptr | OUT, 6 } } }, { .name = "sctp_generic_sendmsg", .ret_type = 1, .nargs = 7, .args = { { Int, 0 }, { BinString | IN, 1 }, { Int, 2 }, { Sockaddr | IN, 3 }, { Socklent, 4 }, { Ptr | IN, 5 }, { Msgflags, 6 } } }, { .name = "select", .ret_type = 1, .nargs = 5, .args = { { Int, 0 }, { Fd_set, 1 }, { Fd_set, 2 }, { Fd_set, 3 }, { Timeval, 4 } } }, { .name = "sendmsg", .ret_type = 1, .nargs = 3, .args = { { Int, 0 }, { Ptr, 1 }, { Msgflags, 2 } } }, { .name = "sendto", .ret_type = 1, .nargs = 6, .args = { { Int, 0 }, { BinString | IN, 1 }, { Sizet, 2 }, { Msgflags, 3 }, { Sockaddr | IN, 4 }, { Socklent | IN, 5 } } }, { .name = "setitimer", .ret_type = 1, .nargs = 3, .args = { { Int, 0 }, { Itimerval, 1 }, { Itimerval | OUT, 2 } } }, { .name = "setpriority", .ret_type = 1, .nargs = 3, .args = { { Priowhich, 0 }, { Int, 1 }, { Int, 2 } } }, { .name = "setrlimit", .ret_type = 1, .nargs = 2, .args = { { Resource, 0 }, { Rlimit | IN, 1 } } }, { .name = "setsockopt", .ret_type = 1, .nargs = 5, .args = { { Int, 0 }, { Sockoptlevel, 1 }, { Sockoptname, 2 }, { 
Ptr | IN, 3 }, { Socklent, 4 } } }, { .name = "shutdown", .ret_type = 1, .nargs = 2, .args = { { Int, 0 }, { Shutdown, 1 } } }, { .name = "sigaction", .ret_type = 1, .nargs = 3, .args = { { Signal, 0 }, { Sigaction | IN, 1 }, { Sigaction | OUT, 2 } } }, { .name = "sigpending", .ret_type = 1, .nargs = 1, .args = { { Sigset | OUT, 0 } } }, { .name = "sigprocmask", .ret_type = 1, .nargs = 3, .args = { { Sigprocmask, 0 }, { Sigset, 1 }, { Sigset | OUT, 2 } } }, { .name = "sigqueue", .ret_type = 1, .nargs = 3, .args = { { Int, 0 }, { Signal, 1 }, { LongHex, 2 } } }, { .name = "sigreturn", .ret_type = 1, .nargs = 1, .args = { { Ptr, 0 } } }, { .name = "sigsuspend", .ret_type = 1, .nargs = 1, .args = { { Sigset | IN, 0 } } }, { .name = "sigtimedwait", .ret_type = 1, .nargs = 3, .args = { { Sigset | IN, 0 }, { Ptr, 1 }, { Timespec | IN, 2 } } }, { .name = "sigwait", .ret_type = 1, .nargs = 2, .args = { { Sigset | IN, 0 }, { Ptr, 1 } } }, { .name = "sigwaitinfo", .ret_type = 1, .nargs = 2, .args = { { Sigset | IN, 0 }, { Ptr, 1 } } }, { .name = "socket", .ret_type = 1, .nargs = 3, .args = { { Sockdomain, 0 }, { Socktype, 1 }, { Sockprotocol, 2 } } }, { .name = "stat", .ret_type = 1, .nargs = 2, .args = { { Name | IN, 0 }, { Stat | OUT, 1 } } }, { .name = "statfs", .ret_type = 1, .nargs = 2, .args = { { Name | IN, 0 }, { StatFs | OUT, 1 } } }, { .name = "symlink", .ret_type = 1, .nargs = 2, .args = { { Name, 0 }, { Name, 1 } } }, { .name = "symlinkat", .ret_type = 1, .nargs = 3, .args = { { Name, 0 }, { Atfd, 1 }, { Name, 2 } } }, { .name = "sysarch", .ret_type = 1, .nargs = 2, .args = { { Sysarch, 0 }, { Ptr, 1 } } }, { .name = "thr_kill", .ret_type = 1, .nargs = 2, .args = { { Long, 0 }, { Signal, 1 } } }, { .name = "thr_self", .ret_type = 1, .nargs = 1, .args = { { Ptr, 0 } } }, + { .name = "thr_set_name", .ret_type = 1, .nargs = 2, + .args = { { Long, 0 }, { Name, 1 } } }, { .name = "truncate", .ret_type = 1, .nargs = 2, .args = { { Name | IN, 0 }, { QuadHex | IN, 1 } } }, #if 0 /* Does not exist */ { .name = "umount", .ret_type = 1, .nargs = 2, .args = { { Name, 0 }, { Int, 2 } } }, #endif { .name = "unlink", .ret_type = 1, .nargs = 1, .args = { { Name, 0 } } }, { .name = "unlinkat", .ret_type = 1, .nargs = 3, .args = { { Atfd, 0 }, { Name, 1 }, { Atflags, 2 } } }, { .name = "unmount", .ret_type = 1, .nargs = 2, .args = { { Name, 0 }, { Mountflags, 1 } } }, { .name = "utimensat", .ret_type = 1, .nargs = 4, .args = { { Atfd, 0 }, { Name | IN, 1 }, { Timespec2 | IN, 2 }, { Atflags, 3 } } }, { .name = "utimes", .ret_type = 1, .nargs = 2, .args = { { Name | IN, 0 }, { Timeval2 | IN, 1 } } }, { .name = "utrace", .ret_type = 1, .nargs = 1, .args = { { Utrace, 0 } } }, { .name = "wait4", .ret_type = 1, .nargs = 4, .args = { { Int, 0 }, { ExitStatus | OUT, 1 }, { Waitoptions, 2 }, { Rusage | OUT, 3 } } }, { .name = "wait6", .ret_type = 1, .nargs = 6, .args = { { Idtype, 0 }, { Quad, 1 }, { ExitStatus | OUT, 2 }, { Waitoptions, 3 }, { Rusage | OUT, 4 }, { Ptr, 5 } } }, { .name = "write", .ret_type = 1, .nargs = 3, .args = { { Int, 0 }, { BinString | IN, 1 }, { Sizet, 2 } } }, /* Linux ABI */ { .name = "linux_access", .ret_type = 1, .nargs = 2, .args = { { Name, 0 }, { Accessmode, 1 } } }, { .name = "linux_execve", .ret_type = 1, .nargs = 3, .args = { { Name | IN, 0 }, { ExecArgs | IN, 1 }, { ExecEnv | IN, 2 } } }, { .name = "linux_lseek", .ret_type = 2, .nargs = 3, .args = { { Int, 0 }, { Int, 1 }, { Whence, 2 } } }, { .name = "linux_mkdir", .ret_type = 1, .nargs = 2, .args = { { Name | IN, 0 }, { Int, 
1 } } }, { .name = "linux_newfstat", .ret_type = 1, .nargs = 2, .args = { { Int, 0 }, { Ptr | OUT, 1 } } }, { .name = "linux_newstat", .ret_type = 1, .nargs = 2, .args = { { Name | IN, 0 }, { Ptr | OUT, 1 } } }, { .name = "linux_open", .ret_type = 1, .nargs = 3, .args = { { Name, 0 }, { Hex, 1 }, { Octal, 2 } } }, { .name = "linux_readlink", .ret_type = 1, .nargs = 3, .args = { { Name, 0 }, { Name | OUT, 1 }, { Sizet, 2 } } }, { .name = "linux_socketcall", .ret_type = 1, .nargs = 2, .args = { { Int, 0 }, { LinuxSockArgs, 1 } } }, { .name = "linux_stat64", .ret_type = 1, .nargs = 2, .args = { { Name | IN, 0 }, { Ptr | OUT, 1 } } }, /* CloudABI system calls. */ { .name = "cloudabi_sys_clock_res_get", .ret_type = 1, .nargs = 1, .args = { { CloudABIClockID, 0 } } }, { .name = "cloudabi_sys_clock_time_get", .ret_type = 1, .nargs = 2, .args = { { CloudABIClockID, 0 }, { CloudABITimestamp, 1 } } }, { .name = "cloudabi_sys_condvar_signal", .ret_type = 1, .nargs = 3, .args = { { Ptr, 0 }, { CloudABIMFlags, 1 }, { UInt, 2 } } }, { .name = "cloudabi_sys_fd_close", .ret_type = 1, .nargs = 1, .args = { { Int, 0 } } }, { .name = "cloudabi_sys_fd_create1", .ret_type = 1, .nargs = 1, .args = { { CloudABIFileType, 0 } } }, { .name = "cloudabi_sys_fd_create2", .ret_type = 1, .nargs = 2, .args = { { CloudABIFileType, 0 }, { PipeFds | OUT, 0 } } }, { .name = "cloudabi_sys_fd_datasync", .ret_type = 1, .nargs = 1, .args = { { Int, 0 } } }, { .name = "cloudabi_sys_fd_dup", .ret_type = 1, .nargs = 1, .args = { { Int, 0 } } }, { .name = "cloudabi_sys_fd_replace", .ret_type = 1, .nargs = 2, .args = { { Int, 0 }, { Int, 1 } } }, { .name = "cloudabi_sys_fd_seek", .ret_type = 1, .nargs = 3, .args = { { Int, 0 }, { Int, 1 }, { CloudABIWhence, 2 } } }, { .name = "cloudabi_sys_fd_stat_get", .ret_type = 1, .nargs = 2, .args = { { Int, 0 }, { CloudABIFDStat | OUT, 1 } } }, { .name = "cloudabi_sys_fd_stat_put", .ret_type = 1, .nargs = 3, .args = { { Int, 0 }, { CloudABIFDStat | IN, 1 }, { ClouduABIFDSFlags, 2 } } }, { .name = "cloudabi_sys_fd_sync", .ret_type = 1, .nargs = 1, .args = { { Int, 0 } } }, { .name = "cloudabi_sys_file_advise", .ret_type = 1, .nargs = 4, .args = { { Int, 0 }, { Int, 1 }, { Int, 2 }, { CloudABIAdvice, 3 } } }, { .name = "cloudabi_sys_file_allocate", .ret_type = 1, .nargs = 3, .args = { { Int, 0 }, { Int, 1 }, { Int, 2 } } }, { .name = "cloudabi_sys_file_create", .ret_type = 1, .nargs = 3, .args = { { Int, 0 }, { BinString | IN, 1 }, { CloudABIFileType, 3 } } }, { .name = "cloudabi_sys_file_link", .ret_type = 1, .nargs = 4, .args = { { CloudABILookup, 0 }, { BinString | IN, 1 }, { Int, 3 }, { BinString | IN, 4 } } }, { .name = "cloudabi_sys_file_open", .ret_type = 1, .nargs = 4, .args = { { Int, 0 }, { BinString | IN, 1 }, { CloudABIOFlags, 3 }, { CloudABIFDStat | IN, 4 } } }, { .name = "cloudabi_sys_file_readdir", .ret_type = 1, .nargs = 4, .args = { { Int, 0 }, { BinString | OUT, 1 }, { Int, 2 }, { Int, 3 } } }, { .name = "cloudabi_sys_file_readlink", .ret_type = 1, .nargs = 4, .args = { { Int, 0 }, { BinString | IN, 1 }, { BinString | OUT, 3 }, { Int, 4 } } }, { .name = "cloudabi_sys_file_rename", .ret_type = 1, .nargs = 4, .args = { { Int, 0 }, { BinString | IN, 1 }, { Int, 3 }, { BinString | IN, 4 } } }, { .name = "cloudabi_sys_file_stat_fget", .ret_type = 1, .nargs = 2, .args = { { Int, 0 }, { CloudABIFileStat | OUT, 1 } } }, { .name = "cloudabi_sys_file_stat_fput", .ret_type = 1, .nargs = 3, .args = { { Int, 0 }, { CloudABIFileStat | IN, 1 }, { CloudABIFSFlags, 2 } } }, { .name = 
"cloudabi_sys_file_stat_get", .ret_type = 1, .nargs = 3, .args = { { CloudABILookup, 0 }, { BinString | IN, 1 }, { CloudABIFileStat | OUT, 3 } } }, { .name = "cloudabi_sys_file_stat_put", .ret_type = 1, .nargs = 4, .args = { { CloudABILookup, 0 }, { BinString | IN, 1 }, { CloudABIFileStat | IN, 3 }, { CloudABIFSFlags, 4 } } }, { .name = "cloudabi_sys_file_symlink", .ret_type = 1, .nargs = 3, .args = { { BinString | IN, 0 }, { Int, 2 }, { BinString | IN, 3 } } }, { .name = "cloudabi_sys_file_unlink", .ret_type = 1, .nargs = 3, .args = { { Int, 0 }, { BinString | IN, 1 }, { CloudABIULFlags, 3 } } }, { .name = "cloudabi_sys_lock_unlock", .ret_type = 1, .nargs = 2, .args = { { Ptr, 0 }, { CloudABIMFlags, 1 } } }, { .name = "cloudabi_sys_mem_advise", .ret_type = 1, .nargs = 3, .args = { { Ptr, 0 }, { Int, 1 }, { CloudABIAdvice, 2 } } }, { .name = "cloudabi_sys_mem_map", .ret_type = 1, .nargs = 6, .args = { { Ptr, 0 }, { Int, 1 }, { CloudABIMProt, 2 }, { CloudABIMFlags, 3 }, { Int, 4 }, { Int, 5 } } }, { .name = "cloudabi_sys_mem_protect", .ret_type = 1, .nargs = 3, .args = { { Ptr, 0 }, { Int, 1 }, { CloudABIMProt, 2 } } }, { .name = "cloudabi_sys_mem_sync", .ret_type = 1, .nargs = 3, .args = { { Ptr, 0 }, { Int, 1 }, { CloudABIMSFlags, 2 } } }, { .name = "cloudabi_sys_mem_unmap", .ret_type = 1, .nargs = 2, .args = { { Ptr, 0 }, { Int, 1 } } }, { .name = "cloudabi_sys_proc_exec", .ret_type = 1, .nargs = 5, .args = { { Int, 0 }, { BinString | IN, 1 }, { Int, 2 }, { IntArray, 3 }, { Int, 4 } } }, { .name = "cloudabi_sys_proc_exit", .ret_type = 1, .nargs = 1, .args = { { Int, 0 } } }, { .name = "cloudabi_sys_proc_fork", .ret_type = 1, .nargs = 0 }, { .name = "cloudabi_sys_proc_raise", .ret_type = 1, .nargs = 1, .args = { { CloudABISignal, 0 } } }, { .name = "cloudabi_sys_random_get", .ret_type = 1, .nargs = 2, .args = { { BinString | OUT, 0 }, { Int, 1 } } }, { .name = "cloudabi_sys_sock_accept", .ret_type = 1, .nargs = 2, .args = { { Int, 0 }, { CloudABISockStat | OUT, 1 } } }, - { .name = "cloudabi_sys_sock_bind", .ret_type = 1, .nargs = 3, - .args = { { Int, 0 }, { Int, 1 }, { BinString | IN, 2 } } }, - { .name = "cloudabi_sys_sock_connect", .ret_type = 1, .nargs = 3, - .args = { { Int, 0 }, { Int, 1 }, { BinString | IN, 2 } } }, - { .name = "cloudabi_sys_sock_listen", .ret_type = 1, .nargs = 2, - .args = { { Int, 0 }, { Int, 1 } } }, { .name = "cloudabi_sys_sock_shutdown", .ret_type = 1, .nargs = 2, .args = { { Int, 0 }, { CloudABISDFlags, 1 } } }, { .name = "cloudabi_sys_sock_stat_get", .ret_type = 1, .nargs = 3, .args = { { Int, 0 }, { CloudABISockStat | OUT, 1 }, { CloudABISSFlags, 2 } } }, { .name = "cloudabi_sys_thread_exit", .ret_type = 1, .nargs = 2, .args = { { Ptr, 0 }, { CloudABIMFlags, 1 } } }, { .name = "cloudabi_sys_thread_yield", .ret_type = 1, .nargs = 0 }, { .name = 0 }, }; static STAILQ_HEAD(, syscall) syscalls; /* Xlat idea taken from strace */ struct xlat { int val; const char *str; }; #define X(a) { a, #a }, #define XEND { 0, NULL } static struct xlat kevent_filters[] = { X(EVFILT_READ) X(EVFILT_WRITE) X(EVFILT_AIO) X(EVFILT_VNODE) X(EVFILT_PROC) X(EVFILT_SIGNAL) X(EVFILT_TIMER) X(EVFILT_PROCDESC) X(EVFILT_FS) X(EVFILT_LIO) X(EVFILT_USER) X(EVFILT_SENDFILE) XEND }; static struct xlat kevent_flags[] = { X(EV_ADD) X(EV_DELETE) X(EV_ENABLE) X(EV_DISABLE) X(EV_ONESHOT) X(EV_CLEAR) X(EV_RECEIPT) X(EV_DISPATCH) X(EV_FORCEONESHOT) X(EV_DROP) X(EV_FLAG1) X(EV_ERROR) X(EV_EOF) XEND }; static struct xlat kevent_user_ffctrl[] = { X(NOTE_FFNOP) X(NOTE_FFAND) X(NOTE_FFOR) X(NOTE_FFCOPY) 
XEND }; static struct xlat kevent_rdwr_fflags[] = { X(NOTE_LOWAT) X(NOTE_FILE_POLL) XEND }; static struct xlat kevent_vnode_fflags[] = { X(NOTE_DELETE) X(NOTE_WRITE) X(NOTE_EXTEND) X(NOTE_ATTRIB) X(NOTE_LINK) X(NOTE_RENAME) X(NOTE_REVOKE) XEND }; static struct xlat kevent_proc_fflags[] = { X(NOTE_EXIT) X(NOTE_FORK) X(NOTE_EXEC) X(NOTE_TRACK) X(NOTE_TRACKERR) X(NOTE_CHILD) XEND }; static struct xlat kevent_timer_fflags[] = { X(NOTE_SECONDS) X(NOTE_MSECONDS) X(NOTE_USECONDS) X(NOTE_NSECONDS) XEND }; static struct xlat poll_flags[] = { X(POLLSTANDARD) X(POLLIN) X(POLLPRI) X(POLLOUT) X(POLLERR) X(POLLHUP) X(POLLNVAL) X(POLLRDNORM) X(POLLRDBAND) X(POLLWRBAND) X(POLLINIGNEOF) XEND }; static struct xlat sigaction_flags[] = { X(SA_ONSTACK) X(SA_RESTART) X(SA_RESETHAND) X(SA_NOCLDSTOP) X(SA_NODEFER) X(SA_NOCLDWAIT) X(SA_SIGINFO) XEND }; static struct xlat pathconf_arg[] = { X(_PC_LINK_MAX) X(_PC_MAX_CANON) X(_PC_MAX_INPUT) X(_PC_NAME_MAX) X(_PC_PATH_MAX) X(_PC_PIPE_BUF) X(_PC_CHOWN_RESTRICTED) X(_PC_NO_TRUNC) X(_PC_VDISABLE) X(_PC_ASYNC_IO) X(_PC_PRIO_IO) X(_PC_SYNC_IO) X(_PC_ALLOC_SIZE_MIN) X(_PC_FILESIZEBITS) X(_PC_REC_INCR_XFER_SIZE) X(_PC_REC_MAX_XFER_SIZE) X(_PC_REC_MIN_XFER_SIZE) X(_PC_REC_XFER_ALIGN) X(_PC_SYMLINK_MAX) X(_PC_ACL_EXTENDED) X(_PC_ACL_PATH_MAX) X(_PC_CAP_PRESENT) X(_PC_INF_PRESENT) X(_PC_MAC_PRESENT) X(_PC_ACL_NFS4) X(_PC_MIN_HOLE_SIZE) XEND }; static struct xlat at_flags[] = { X(AT_EACCESS) X(AT_SYMLINK_NOFOLLOW) X(AT_SYMLINK_FOLLOW) X(AT_REMOVEDIR) XEND }; static struct xlat sysarch_ops[] = { #if defined(__i386__) || defined(__amd64__) X(I386_GET_LDT) X(I386_SET_LDT) X(I386_GET_IOPERM) X(I386_SET_IOPERM) X(I386_VM86) X(I386_GET_FSBASE) X(I386_SET_FSBASE) X(I386_GET_GSBASE) X(I386_SET_GSBASE) X(I386_GET_XFPUSTATE) X(AMD64_GET_FSBASE) X(AMD64_SET_FSBASE) X(AMD64_GET_GSBASE) X(AMD64_SET_GSBASE) X(AMD64_GET_XFPUSTATE) #endif XEND }; static struct xlat linux_socketcall_ops[] = { X(LINUX_SOCKET) X(LINUX_BIND) X(LINUX_CONNECT) X(LINUX_LISTEN) X(LINUX_ACCEPT) X(LINUX_GETSOCKNAME) X(LINUX_GETPEERNAME) X(LINUX_SOCKETPAIR) X(LINUX_SEND) X(LINUX_RECV) X(LINUX_SENDTO) X(LINUX_RECVFROM) X(LINUX_SHUTDOWN) X(LINUX_SETSOCKOPT) X(LINUX_GETSOCKOPT) X(LINUX_SENDMSG) X(LINUX_RECVMSG) XEND }; #undef X #define X(a) { CLOUDABI_##a, #a }, static struct xlat cloudabi_advice[] = { X(ADVICE_DONTNEED) X(ADVICE_NOREUSE) X(ADVICE_NORMAL) X(ADVICE_RANDOM) X(ADVICE_SEQUENTIAL) X(ADVICE_WILLNEED) XEND }; static struct xlat cloudabi_clockid[] = { X(CLOCK_MONOTONIC) X(CLOCK_PROCESS_CPUTIME_ID) X(CLOCK_REALTIME) X(CLOCK_THREAD_CPUTIME_ID) XEND }; static struct xlat cloudabi_errno[] = { X(E2BIG) X(EACCES) X(EADDRINUSE) X(EADDRNOTAVAIL) X(EAFNOSUPPORT) X(EAGAIN) X(EALREADY) X(EBADF) X(EBADMSG) X(EBUSY) X(ECANCELED) X(ECHILD) X(ECONNABORTED) X(ECONNREFUSED) X(ECONNRESET) X(EDEADLK) X(EDESTADDRREQ) X(EDOM) X(EDQUOT) X(EEXIST) X(EFAULT) X(EFBIG) X(EHOSTUNREACH) X(EIDRM) X(EILSEQ) X(EINPROGRESS) X(EINTR) X(EINVAL) X(EIO) X(EISCONN) X(EISDIR) X(ELOOP) X(EMFILE) X(EMLINK) X(EMSGSIZE) X(EMULTIHOP) X(ENAMETOOLONG) X(ENETDOWN) X(ENETRESET) X(ENETUNREACH) X(ENFILE) X(ENOBUFS) X(ENODEV) X(ENOENT) X(ENOEXEC) X(ENOLCK) X(ENOLINK) X(ENOMEM) X(ENOMSG) X(ENOPROTOOPT) X(ENOSPC) X(ENOSYS) X(ENOTCONN) X(ENOTDIR) X(ENOTEMPTY) X(ENOTRECOVERABLE) X(ENOTSOCK) X(ENOTSUP) X(ENOTTY) X(ENXIO) X(EOVERFLOW) X(EOWNERDEAD) X(EPERM) X(EPIPE) X(EPROTO) X(EPROTONOSUPPORT) X(EPROTOTYPE) X(ERANGE) X(EROFS) X(ESPIPE) X(ESRCH) X(ESTALE) X(ETIMEDOUT) X(ETXTBSY) X(EXDEV) X(ENOTCAPABLE) XEND }; static struct xlat cloudabi_fdflags[] = { X(FDFLAG_APPEND) 
X(FDFLAG_DSYNC) X(FDFLAG_NONBLOCK) X(FDFLAG_RSYNC) X(FDFLAG_SYNC) XEND }; static struct xlat cloudabi_fdsflags[] = { X(FDSTAT_FLAGS) X(FDSTAT_RIGHTS) XEND }; static struct xlat cloudabi_filetype[] = { X(FILETYPE_UNKNOWN) X(FILETYPE_BLOCK_DEVICE) X(FILETYPE_CHARACTER_DEVICE) X(FILETYPE_DIRECTORY) X(FILETYPE_FIFO) X(FILETYPE_POLL) X(FILETYPE_PROCESS) X(FILETYPE_REGULAR_FILE) X(FILETYPE_SHARED_MEMORY) X(FILETYPE_SOCKET_DGRAM) X(FILETYPE_SOCKET_STREAM) X(FILETYPE_SYMBOLIC_LINK) XEND }; static struct xlat cloudabi_fsflags[] = { X(FILESTAT_ATIM) X(FILESTAT_ATIM_NOW) X(FILESTAT_MTIM) X(FILESTAT_MTIM_NOW) X(FILESTAT_SIZE) XEND }; static struct xlat cloudabi_mflags[] = { X(MAP_ANON) X(MAP_FIXED) X(MAP_PRIVATE) X(MAP_SHARED) XEND }; static struct xlat cloudabi_mprot[] = { X(PROT_EXEC) X(PROT_WRITE) X(PROT_READ) XEND }; static struct xlat cloudabi_msflags[] = { X(MS_ASYNC) X(MS_INVALIDATE) X(MS_SYNC) XEND }; static struct xlat cloudabi_oflags[] = { X(O_CREAT) X(O_DIRECTORY) X(O_EXCL) X(O_TRUNC) XEND }; static struct xlat cloudabi_sdflags[] = { X(SHUT_RD) X(SHUT_WR) XEND }; static struct xlat cloudabi_signal[] = { X(SIGABRT) X(SIGALRM) X(SIGBUS) X(SIGCHLD) X(SIGCONT) X(SIGFPE) X(SIGHUP) X(SIGILL) X(SIGINT) X(SIGKILL) X(SIGPIPE) X(SIGQUIT) X(SIGSEGV) X(SIGSTOP) X(SIGSYS) X(SIGTERM) X(SIGTRAP) X(SIGTSTP) X(SIGTTIN) X(SIGTTOU) X(SIGURG) X(SIGUSR1) X(SIGUSR2) X(SIGVTALRM) X(SIGXCPU) X(SIGXFSZ) XEND }; static struct xlat cloudabi_ssflags[] = { X(SOCKSTAT_CLEAR_ERROR) XEND }; static struct xlat cloudabi_ssstate[] = { X(SOCKSTATE_ACCEPTCONN) XEND }; static struct xlat cloudabi_ulflags[] = { X(UNLINK_REMOVEDIR) XEND }; static struct xlat cloudabi_whence[] = { X(WHENCE_CUR) X(WHENCE_END) X(WHENCE_SET) XEND }; #undef X #undef XEND /* * Searches an xlat array for a value, and returns it if found. Otherwise * return a string representation. */ static const char * lookup(struct xlat *xlat, int val, int base) { static char tmp[16]; for (; xlat->str != NULL; xlat++) if (xlat->val == val) return (xlat->str); switch (base) { case 8: sprintf(tmp, "0%o", val); break; case 16: sprintf(tmp, "0x%x", val); break; case 10: sprintf(tmp, "%u", val); break; default: errx(1,"Unknown lookup base"); break; } return (tmp); } static const char * xlookup(struct xlat *xlat, int val) { return (lookup(xlat, val, 16)); } /* * Searches an xlat array containing bitfield values. Remaining bits * set after removing the known ones are printed at the end: * IN|0x400. */ static char * xlookup_bits(struct xlat *xlat, int val) { int len, rem; static char str[512]; len = 0; rem = val; for (; xlat->str != NULL; xlat++) { if ((xlat->val & rem) == xlat->val) { /* * Don't print the "all-bits-zero" string unless all * bits are really zero. */ if (xlat->val == 0 && val != 0) continue; len += sprintf(str + len, "%s|", xlat->str); rem &= ~(xlat->val); } } /* * If we have leftover bits or didn't match anything, print * the remainder. 
*/ if (rem || len == 0) len += sprintf(str + len, "0x%x", rem); if (len && str[len - 1] == '|') len--; str[len] = 0; return (str); } static void print_integer_arg(const char *(*decoder)(int), FILE *fp, int value) { const char *str; str = decoder(value); if (str != NULL) fputs(str, fp); else fprintf(fp, "%d", value); } static void print_mask_arg(bool (*decoder)(FILE *, int, int *), FILE *fp, int value) { int rem; if (!decoder(fp, value, &rem)) fprintf(fp, "0x%x", rem); else if (rem != 0) fprintf(fp, "|0x%x", rem); } static void print_mask_arg32(bool (*decoder)(FILE *, uint32_t, uint32_t *), FILE *fp, uint32_t value) { uint32_t rem; if (!decoder(fp, value, &rem)) fprintf(fp, "0x%x", rem); else if (rem != 0) fprintf(fp, "|0x%x", rem); } #ifndef __LP64__ /* * Add argument padding to subsequent system calls after a Quad * syscall argument as needed. This used to be done by hand in the * decoded_syscalls table which was ugly and error prone. It is * simpler to do the fixup of offsets at initialization time than when * decoding arguments. */ static void quad_fixup(struct syscall *sc) { int offset, prev; u_int i; offset = 0; prev = -1; for (i = 0; i < sc->nargs; i++) { /* This arg type is a dummy that doesn't use offset. */ if ((sc->args[i].type & ARG_MASK) == PipeFds) continue; assert(prev < sc->args[i].offset); prev = sc->args[i].offset; sc->args[i].offset += offset; switch (sc->args[i].type & ARG_MASK) { case Quad: case QuadHex: #ifdef __powerpc__ /* * 64-bit arguments on 32-bit powerpc must be * 64-bit aligned. If the current offset is * not aligned, the calling convention inserts * a 32-bit pad argument that should be skipped. */ if (sc->args[i].offset % 2 == 1) { sc->args[i].offset++; offset++; } #endif offset++; default: break; } } } #endif void init_syscalls(void) { struct syscall *sc; STAILQ_INIT(&syscalls); for (sc = decoded_syscalls; sc->name != NULL; sc++) { #ifndef __LP64__ quad_fixup(sc); #endif STAILQ_INSERT_HEAD(&syscalls, sc, entries); } } static struct syscall * find_syscall(struct procabi *abi, u_int number) { struct extra_syscall *es; if (number < nitems(abi->syscalls)) return (abi->syscalls[number]); STAILQ_FOREACH(es, &abi->extra_syscalls, entries) { if (es->number == number) return (es->sc); } return (NULL); } static void add_syscall(struct procabi *abi, u_int number, struct syscall *sc) { struct extra_syscall *es; if (number < nitems(abi->syscalls)) { assert(abi->syscalls[number] == NULL); abi->syscalls[number] = sc; } else { es = malloc(sizeof(*es)); es->sc = sc; es->number = number; STAILQ_INSERT_TAIL(&abi->extra_syscalls, es, entries); } } /* * If/when the list gets big, it might be desirable to do it * as a hash table or binary search. */ struct syscall * get_syscall(struct threadinfo *t, u_int number, u_int nargs) { struct syscall *sc; const char *name; char *new_name; u_int i; sc = find_syscall(t->proc->abi, number); if (sc != NULL) return (sc); name = sysdecode_syscallname(t->proc->abi->abi, number); if (name == NULL) { asprintf(&new_name, "#%d", number); name = new_name; } else new_name = NULL; STAILQ_FOREACH(sc, &syscalls, entries) { if (strcmp(name, sc->name) == 0) { add_syscall(t->proc->abi, number, sc); free(new_name); return (sc); } } /* It is unknown. Add it into the list.
*/ #if DEBUG fprintf(stderr, "unknown syscall %s -- setting args to %d\n", name, nargs); #endif sc = calloc(1, sizeof(struct syscall)); sc->name = name; if (new_name != NULL) sc->unknown = true; sc->ret_type = 1; sc->nargs = nargs; for (i = 0; i < nargs; i++) { sc->args[i].offset = i; /* Treat all unknown arguments as LongHex. */ sc->args[i].type = LongHex; } STAILQ_INSERT_HEAD(&syscalls, sc, entries); add_syscall(t->proc->abi, number, sc); return (sc); } /* * Copy a fixed amount of bytes from the process. */ static int get_struct(pid_t pid, void *offset, void *buf, int len) { struct ptrace_io_desc iorequest; iorequest.piod_op = PIOD_READ_D; iorequest.piod_offs = offset; iorequest.piod_addr = buf; iorequest.piod_len = len; if (ptrace(PT_IO, pid, (caddr_t)&iorequest, 0) < 0) return (-1); return (0); } #define MAXSIZE 4096 /* * Copy a string from the process. Note that it is * expected to be a C string, but if max is set, it will * only get that much. */ static char * get_string(pid_t pid, void *addr, int max) { struct ptrace_io_desc iorequest; char *buf, *nbuf; size_t offset, size, totalsize; offset = 0; if (max) size = max + 1; else { /* Read up to the end of the current page. */ size = PAGE_SIZE - ((uintptr_t)addr % PAGE_SIZE); if (size > MAXSIZE) size = MAXSIZE; } totalsize = size; buf = malloc(totalsize); if (buf == NULL) return (NULL); for (;;) { iorequest.piod_op = PIOD_READ_D; iorequest.piod_offs = (char *)addr + offset; iorequest.piod_addr = buf + offset; iorequest.piod_len = size; if (ptrace(PT_IO, pid, (caddr_t)&iorequest, 0) < 0) { free(buf); return (NULL); } if (memchr(buf + offset, '\0', size) != NULL) return (buf); offset += size; if (totalsize < MAXSIZE && max == 0) { size = MAXSIZE - totalsize; if (size > PAGE_SIZE) size = PAGE_SIZE; nbuf = realloc(buf, totalsize + size); if (nbuf == NULL) { buf[totalsize - 1] = '\0'; return (buf); } buf = nbuf; totalsize += size; } else { buf[totalsize - 1] = '\0'; return (buf); } } } static const char * strsig2(int sig) { static char tmp[32]; const char *signame; signame = sysdecode_signal(sig); if (signame == NULL) { snprintf(tmp, sizeof(tmp), "%d", sig); signame = tmp; } return (signame); } static void print_kevent(FILE *fp, struct kevent *ke, int input) { switch (ke->filter) { case EVFILT_READ: case EVFILT_WRITE: case EVFILT_VNODE: case EVFILT_PROC: case EVFILT_TIMER: case EVFILT_PROCDESC: fprintf(fp, "%ju", (uintmax_t)ke->ident); break; case EVFILT_SIGNAL: fputs(strsig2(ke->ident), fp); break; default: fprintf(fp, "%p", (void *)ke->ident); } fprintf(fp, ",%s,%s,", xlookup(kevent_filters, ke->filter), xlookup_bits(kevent_flags, ke->flags)); switch (ke->filter) { case EVFILT_READ: case EVFILT_WRITE: fputs(xlookup_bits(kevent_rdwr_fflags, ke->fflags), fp); break; case EVFILT_VNODE: fputs(xlookup_bits(kevent_vnode_fflags, ke->fflags), fp); break; case EVFILT_PROC: case EVFILT_PROCDESC: fputs(xlookup_bits(kevent_proc_fflags, ke->fflags), fp); break; case EVFILT_TIMER: fputs(xlookup_bits(kevent_timer_fflags, ke->fflags), fp); break; case EVFILT_USER: { int ctrl, data; ctrl = ke->fflags & NOTE_FFCTRLMASK; data = ke->fflags & NOTE_FFLAGSMASK; if (input) { fputs(xlookup(kevent_user_ffctrl, ctrl), fp); if (ke->fflags & NOTE_TRIGGER) fputs("|NOTE_TRIGGER", fp); if (data != 0) fprintf(fp, "|%#x", data); } else { fprintf(fp, "%#x", data); } break; } default: fprintf(fp, "%#x", ke->fflags); } fprintf(fp, ",%#jx,%p", (uintmax_t)ke->data, ke->udata); } static void print_utrace(FILE *fp, void *utrace_addr, size_t len) { unsigned char *utrace_buffer; 
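/*
 * The record is printed between braces: sysdecode_utrace(3) gets the first
 * chance to decode it, and only if that fails is the buffer emitted as its
 * length followed by the raw bytes in hex.  For example, a hypothetical
 * 3-byte record { 0x01, 0x02, 0x03 } that sysdecode_utrace(3) cannot decode
 * would be printed as "{ 3: 01 02 03 }".
 */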
fprintf(fp, "{ "); if (sysdecode_utrace(fp, utrace_addr, len)) { fprintf(fp, " }"); return; } utrace_buffer = utrace_addr; fprintf(fp, "%zu:", len); while (len--) fprintf(fp, " %02x", *utrace_buffer++); fprintf(fp, " }"); } /* * Converts a syscall argument into a string. Said string is * allocated via malloc(), so needs to be free()'d. sc is * a pointer to the syscall description (see above); args is * an array of all of the system call arguments. */ char * print_arg(struct syscall_args *sc, unsigned long *args, long *retval, struct trussinfo *trussinfo) { FILE *fp; char *tmp; size_t tmplen; pid_t pid; fp = open_memstream(&tmp, &tmplen); pid = trussinfo->curthread->proc->pid; switch (sc->type & ARG_MASK) { case Hex: fprintf(fp, "0x%x", (int)args[sc->offset]); break; case Octal: fprintf(fp, "0%o", (int)args[sc->offset]); break; case Int: fprintf(fp, "%d", (int)args[sc->offset]); break; case UInt: fprintf(fp, "%u", (unsigned int)args[sc->offset]); break; case PUInt: { unsigned int val; if (get_struct(pid, (void *)args[sc->offset], &val, sizeof(val)) == 0) fprintf(fp, "{ %u }", val); else fprintf(fp, "0x%lx", args[sc->offset]); break; } case LongHex: fprintf(fp, "0x%lx", args[sc->offset]); break; case Long: fprintf(fp, "%ld", args[sc->offset]); break; case Sizet: fprintf(fp, "%zu", (size_t)args[sc->offset]); break; case Name: { /* NULL-terminated string. */ char *tmp2; tmp2 = get_string(pid, (void*)args[sc->offset], 0); fprintf(fp, "\"%s\"", tmp2); free(tmp2); break; } case BinString: { /* * Binary block of data that might have printable characters. * XXX If type|OUT, assume that the length is the syscall's * return value. Otherwise, assume that the length of the block * is in the next syscall argument. */ int max_string = trussinfo->strsize; char tmp2[max_string + 1], *tmp3; int len; int truncated = 0; if (sc->type & OUT) len = retval[0]; else len = args[sc->offset + 1]; /* * Don't print more than max_string characters, to avoid word * wrap. If we have to truncate put some ... after the string. */ if (len > max_string) { len = max_string; truncated = 1; } if (len && get_struct(pid, (void*)args[sc->offset], &tmp2, len) != -1) { tmp3 = malloc(len * 4 + 1); while (len) { if (strvisx(tmp3, tmp2, len, VIS_CSTYLE|VIS_TAB|VIS_NL) <= max_string) break; len--; truncated = 1; } fprintf(fp, "\"%s\"%s", tmp3, truncated ? "..." : ""); free(tmp3); } else { fprintf(fp, "0x%lx", args[sc->offset]); } break; } case ExecArgs: case ExecEnv: case StringArray: { uintptr_t addr; union { char *strarray[0]; char buf[PAGE_SIZE]; } u; char *string; size_t len; u_int first, i; /* * Only parse argv[] and environment arrays from exec calls * if requested. */ if (((sc->type & ARG_MASK) == ExecArgs && (trussinfo->flags & EXECVEARGS) == 0) || ((sc->type & ARG_MASK) == ExecEnv && (trussinfo->flags & EXECVEENVS) == 0)) { fprintf(fp, "0x%lx", args[sc->offset]); break; } /* * Read a page of pointers at a time. Punt if the top-level * pointer is not aligned. Note that the first read is of * a partial page. */ addr = args[sc->offset]; if (addr % sizeof(char *) != 0) { fprintf(fp, "0x%lx", args[sc->offset]); break; } len = PAGE_SIZE - (addr & PAGE_MASK); if (get_struct(pid, (void *)addr, u.buf, len) == -1) { fprintf(fp, "0x%lx", args[sc->offset]); break; } fputc('[', fp); first = 1; i = 0; while (u.strarray[i] != NULL) { string = get_string(pid, u.strarray[i], 0); fprintf(fp, "%s \"%s\"", first ? 
"" : ",", string); free(string); first = 0; i++; if (i == len / sizeof(char *)) { addr += len; len = PAGE_SIZE; if (get_struct(pid, (void *)addr, u.buf, len) == -1) { fprintf(fp, ", "); break; } i = 0; } } fputs(" ]", fp); break; } #ifdef __LP64__ case Quad: fprintf(fp, "%ld", args[sc->offset]); break; case QuadHex: fprintf(fp, "0x%lx", args[sc->offset]); break; #else case Quad: case QuadHex: { unsigned long long ll; #if _BYTE_ORDER == _LITTLE_ENDIAN ll = (unsigned long long)args[sc->offset + 1] << 32 | args[sc->offset]; #else ll = (unsigned long long)args[sc->offset] << 32 | args[sc->offset + 1]; #endif if ((sc->type & ARG_MASK) == Quad) fprintf(fp, "%lld", ll); else fprintf(fp, "0x%llx", ll); break; } #endif case PQuadHex: { uint64_t val; if (get_struct(pid, (void *)args[sc->offset], &val, sizeof(val)) == 0) fprintf(fp, "{ 0x%jx }", (uintmax_t)val); else fprintf(fp, "0x%lx", args[sc->offset]); break; } case Ptr: fprintf(fp, "0x%lx", args[sc->offset]); break; case Readlinkres: { char *tmp2; if (retval[0] == -1) break; tmp2 = get_string(pid, (void*)args[sc->offset], retval[0]); fprintf(fp, "\"%s\"", tmp2); free(tmp2); break; } case Ioctl: { const char *temp; unsigned long cmd; cmd = args[sc->offset]; temp = sysdecode_ioctlname(cmd); if (temp) fputs(temp, fp); else { fprintf(fp, "0x%lx { IO%s%s 0x%lx('%c'), %lu, %lu }", cmd, cmd & IOC_OUT ? "R" : "", cmd & IOC_IN ? "W" : "", IOCGROUP(cmd), isprint(IOCGROUP(cmd)) ? (char)IOCGROUP(cmd) : '?', cmd & 0xFF, IOCPARM_LEN(cmd)); } break; } case Timespec: { struct timespec ts; if (get_struct(pid, (void *)args[sc->offset], &ts, sizeof(ts)) != -1) fprintf(fp, "{ %jd.%09ld }", (intmax_t)ts.tv_sec, ts.tv_nsec); else fprintf(fp, "0x%lx", args[sc->offset]); break; } case Timespec2: { struct timespec ts[2]; const char *sep; unsigned int i; if (get_struct(pid, (void *)args[sc->offset], &ts, sizeof(ts)) != -1) { fputs("{ ", fp); sep = ""; for (i = 0; i < nitems(ts); i++) { fputs(sep, fp); sep = ", "; switch (ts[i].tv_nsec) { case UTIME_NOW: fprintf(fp, "UTIME_NOW"); break; case UTIME_OMIT: fprintf(fp, "UTIME_OMIT"); break; default: fprintf(fp, "%jd.%09ld", (intmax_t)ts[i].tv_sec, ts[i].tv_nsec); break; } } fputs(" }", fp); } else fprintf(fp, "0x%lx", args[sc->offset]); break; } case Timeval: { struct timeval tv; if (get_struct(pid, (void *)args[sc->offset], &tv, sizeof(tv)) != -1) fprintf(fp, "{ %jd.%06ld }", (intmax_t)tv.tv_sec, tv.tv_usec); else fprintf(fp, "0x%lx", args[sc->offset]); break; } case Timeval2: { struct timeval tv[2]; if (get_struct(pid, (void *)args[sc->offset], &tv, sizeof(tv)) != -1) fprintf(fp, "{ %jd.%06ld, %jd.%06ld }", (intmax_t)tv[0].tv_sec, tv[0].tv_usec, (intmax_t)tv[1].tv_sec, tv[1].tv_usec); else fprintf(fp, "0x%lx", args[sc->offset]); break; } case Itimerval: { struct itimerval itv; if (get_struct(pid, (void *)args[sc->offset], &itv, sizeof(itv)) != -1) fprintf(fp, "{ %jd.%06ld, %jd.%06ld }", (intmax_t)itv.it_interval.tv_sec, itv.it_interval.tv_usec, (intmax_t)itv.it_value.tv_sec, itv.it_value.tv_usec); else fprintf(fp, "0x%lx", args[sc->offset]); break; } case LinuxSockArgs: { struct linux_socketcall_args largs; if (get_struct(pid, (void *)args[sc->offset], (void *)&largs, sizeof(largs)) != -1) fprintf(fp, "{ %s, 0x%lx }", lookup(linux_socketcall_ops, largs.what, 10), (long unsigned int)largs.args); else fprintf(fp, "0x%lx", args[sc->offset]); break; } case Pollfd: { /* * XXX: A Pollfd argument expects the /next/ syscall argument * to be the number of fds in the array. This matches the poll * syscall. 
*/ struct pollfd *pfd; int numfds = args[sc->offset + 1]; size_t bytes = sizeof(struct pollfd) * numfds; int i; if ((pfd = malloc(bytes)) == NULL) err(1, "Cannot malloc %zu bytes for pollfd array", bytes); if (get_struct(pid, (void *)args[sc->offset], pfd, bytes) != -1) { fputs("{", fp); for (i = 0; i < numfds; i++) { fprintf(fp, " %d/%s", pfd[i].fd, xlookup_bits(poll_flags, pfd[i].events)); } fputs(" }", fp); } else { fprintf(fp, "0x%lx", args[sc->offset]); } free(pfd); break; } case Fd_set: { /* * XXX: A Fd_set argument expects the /first/ syscall argument * to be the number of fds in the array. This matches the * select syscall. */ fd_set *fds; int numfds = args[0]; size_t bytes = _howmany(numfds, _NFDBITS) * _NFDBITS; int i; if ((fds = malloc(bytes)) == NULL) err(1, "Cannot malloc %zu bytes for fd_set array", bytes); if (get_struct(pid, (void *)args[sc->offset], fds, bytes) != -1) { fputs("{", fp); for (i = 0; i < numfds; i++) { if (FD_ISSET(i, fds)) fprintf(fp, " %d", i); } fputs(" }", fp); } else fprintf(fp, "0x%lx", args[sc->offset]); free(fds); break; } case Signal: fputs(strsig2(args[sc->offset]), fp); break; case Sigset: { long sig; sigset_t ss; int i, first; sig = args[sc->offset]; if (get_struct(pid, (void *)args[sc->offset], (void *)&ss, sizeof(ss)) == -1) { fprintf(fp, "0x%lx", args[sc->offset]); break; } fputs("{ ", fp); first = 1; for (i = 1; i < sys_nsig; i++) { if (sigismember(&ss, i)) { fprintf(fp, "%s%s", !first ? "|" : "", strsig2(i)); first = 0; } } if (!first) fputc(' ', fp); fputc('}', fp); break; } case Sigprocmask: print_integer_arg(sysdecode_sigprocmask_how, fp, args[sc->offset]); break; case Fcntlflag: /* XXX: Output depends on the value of the previous argument. */ if (sysdecode_fcntl_arg_p(args[sc->offset - 1])) sysdecode_fcntl_arg(fp, args[sc->offset - 1], args[sc->offset], 16); break; case Open: print_mask_arg(sysdecode_open_flags, fp, args[sc->offset]); break; case Fcntl: print_integer_arg(sysdecode_fcntl_cmd, fp, args[sc->offset]); break; case Mprot: print_mask_arg(sysdecode_mmap_prot, fp, args[sc->offset]); break; case Mmapflags: print_mask_arg(sysdecode_mmap_flags, fp, args[sc->offset]); break; case Whence: print_integer_arg(sysdecode_whence, fp, args[sc->offset]); break; case Sockdomain: print_integer_arg(sysdecode_socketdomain, fp, args[sc->offset]); break; case Socktype: print_mask_arg(sysdecode_socket_type, fp, args[sc->offset]); break; case Shutdown: print_integer_arg(sysdecode_shutdown_how, fp, args[sc->offset]); break; case Resource: print_integer_arg(sysdecode_rlimit, fp, args[sc->offset]); break; case RusageWho: print_integer_arg(sysdecode_getrusage_who, fp, args[sc->offset]); break; case Pathconf: fputs(xlookup(pathconf_arg, args[sc->offset]), fp); break; case Rforkflags: print_mask_arg(sysdecode_rfork_flags, fp, args[sc->offset]); break; case Sockaddr: { char addr[64]; struct sockaddr_in *lsin; struct sockaddr_in6 *lsin6; struct sockaddr_un *sun; struct sockaddr *sa; socklen_t len; u_char *q; if (args[sc->offset] == 0) { fputs("NULL", fp); break; } /* * Extract the address length from the next argument. If * this is an output sockaddr (OUT is set), then the * next argument is a pointer to a socklen_t. Otherwise * the next argument contains a socklen_t by value. */ if (sc->type & OUT) { if (get_struct(pid, (void *)args[sc->offset + 1], &len, sizeof(len)) == -1) { fprintf(fp, "0x%lx", args[sc->offset]); break; } } else len = args[sc->offset + 1]; /* If the length is too small, just bail. 
*/ if (len < sizeof(*sa)) { fprintf(fp, "0x%lx", args[sc->offset]); break; } sa = calloc(1, len); if (get_struct(pid, (void *)args[sc->offset], sa, len) == -1) { free(sa); fprintf(fp, "0x%lx", args[sc->offset]); break; } switch (sa->sa_family) { case AF_INET: if (len < sizeof(*lsin)) goto sockaddr_short; lsin = (struct sockaddr_in *)(void *)sa; inet_ntop(AF_INET, &lsin->sin_addr, addr, sizeof(addr)); fprintf(fp, "{ AF_INET %s:%d }", addr, htons(lsin->sin_port)); break; case AF_INET6: if (len < sizeof(*lsin6)) goto sockaddr_short; lsin6 = (struct sockaddr_in6 *)(void *)sa; inet_ntop(AF_INET6, &lsin6->sin6_addr, addr, sizeof(addr)); fprintf(fp, "{ AF_INET6 [%s]:%d }", addr, htons(lsin6->sin6_port)); break; case AF_UNIX: sun = (struct sockaddr_un *)sa; fprintf(fp, "{ AF_UNIX \"%.*s\" }", (int)(len - offsetof(struct sockaddr_un, sun_path)), sun->sun_path); break; default: sockaddr_short: fprintf(fp, "{ sa_len = %d, sa_family = %d, sa_data = {", (int)sa->sa_len, (int)sa->sa_family); for (q = (u_char *)sa->sa_data; q < (u_char *)sa + len; q++) fprintf(fp, "%s 0x%02x", q == (u_char *)sa->sa_data ? "" : ",", *q); fputs(" } }", fp); } free(sa); break; } case Sigaction: { struct sigaction sa; if (get_struct(pid, (void *)args[sc->offset], &sa, sizeof(sa)) != -1) { fputs("{ ", fp); if (sa.sa_handler == SIG_DFL) fputs("SIG_DFL", fp); else if (sa.sa_handler == SIG_IGN) fputs("SIG_IGN", fp); else fprintf(fp, "%p", sa.sa_handler); fprintf(fp, " %s ss_t }", xlookup_bits(sigaction_flags, sa.sa_flags)); } else fprintf(fp, "0x%lx", args[sc->offset]); break; } case Kevent: { /* * XXX XXX: The size of the array is determined by either the * next syscall argument, or by the syscall return value, * depending on which argument number we are. This matches the * kevent syscall, but luckily that's the only syscall that uses * them. 
*/ struct kevent *ke; int numevents = -1; size_t bytes; int i; if (sc->offset == 1) numevents = args[sc->offset+1]; else if (sc->offset == 3 && retval[0] != -1) numevents = retval[0]; if (numevents >= 0) { bytes = sizeof(struct kevent) * numevents; if ((ke = malloc(bytes)) == NULL) err(1, "Cannot malloc %zu bytes for kevent array", bytes); } else ke = NULL; if (numevents >= 0 && get_struct(pid, (void *)args[sc->offset], ke, bytes) != -1) { fputc('{', fp); for (i = 0; i < numevents; i++) { fputc(' ', fp); print_kevent(fp, &ke[i], sc->offset == 1); } fputs(" }", fp); } else { fprintf(fp, "0x%lx", args[sc->offset]); } free(ke); break; } case Stat: { struct stat st; if (get_struct(pid, (void *)args[sc->offset], &st, sizeof(st)) != -1) { char mode[12]; strmode(st.st_mode, mode); fprintf(fp, "{ mode=%s,inode=%ju,size=%jd,blksize=%ld }", mode, (uintmax_t)st.st_ino, (intmax_t)st.st_size, (long)st.st_blksize); } else { fprintf(fp, "0x%lx", args[sc->offset]); } break; } case Stat11: { struct freebsd11_stat st; if (get_struct(pid, (void *)args[sc->offset], &st, sizeof(st)) != -1) { char mode[12]; strmode(st.st_mode, mode); fprintf(fp, "{ mode=%s,inode=%ju,size=%jd,blksize=%ld }", mode, (uintmax_t)st.st_ino, (intmax_t)st.st_size, (long)st.st_blksize); } else { fprintf(fp, "0x%lx", args[sc->offset]); } break; } case StatFs: { unsigned int i; struct statfs buf; if (get_struct(pid, (void *)args[sc->offset], &buf, sizeof(buf)) != -1) { char fsid[17]; bzero(fsid, sizeof(fsid)); if (buf.f_fsid.val[0] != 0 || buf.f_fsid.val[1] != 0) { for (i = 0; i < sizeof(buf.f_fsid); i++) snprintf(&fsid[i*2], sizeof(fsid) - (i*2), "%02x", ((u_char *)&buf.f_fsid)[i]); } fprintf(fp, "{ fstypename=%s,mntonname=%s,mntfromname=%s," "fsid=%s }", buf.f_fstypename, buf.f_mntonname, buf.f_mntfromname, fsid); } else fprintf(fp, "0x%lx", args[sc->offset]); break; } case Rusage: { struct rusage ru; if (get_struct(pid, (void *)args[sc->offset], &ru, sizeof(ru)) != -1) { fprintf(fp, "{ u=%jd.%06ld,s=%jd.%06ld,in=%ld,out=%ld }", (intmax_t)ru.ru_utime.tv_sec, ru.ru_utime.tv_usec, (intmax_t)ru.ru_stime.tv_sec, ru.ru_stime.tv_usec, ru.ru_inblock, ru.ru_oublock); } else fprintf(fp, "0x%lx", args[sc->offset]); break; } case Rlimit: { struct rlimit rl; if (get_struct(pid, (void *)args[sc->offset], &rl, sizeof(rl)) != -1) { fprintf(fp, "{ cur=%ju,max=%ju }", rl.rlim_cur, rl.rlim_max); } else fprintf(fp, "0x%lx", args[sc->offset]); break; } case ExitStatus: { int status; if (get_struct(pid, (void *)args[sc->offset], &status, sizeof(status)) != -1) { fputs("{ ", fp); if (WIFCONTINUED(status)) fputs("CONTINUED", fp); else if (WIFEXITED(status)) fprintf(fp, "EXITED,val=%d", WEXITSTATUS(status)); else if (WIFSIGNALED(status)) fprintf(fp, "SIGNALED,sig=%s%s", strsig2(WTERMSIG(status)), WCOREDUMP(status) ? 
",cored" : ""); else fprintf(fp, "STOPPED,sig=%s", strsig2(WTERMSIG(status))); fputs(" }", fp); } else fprintf(fp, "0x%lx", args[sc->offset]); break; } case Waitoptions: print_mask_arg(sysdecode_wait6_options, fp, args[sc->offset]); break; case Idtype: print_integer_arg(sysdecode_idtype, fp, args[sc->offset]); break; case Procctl: print_integer_arg(sysdecode_procctl_cmd, fp, args[sc->offset]); break; case Umtxop: print_integer_arg(sysdecode_umtx_op, fp, args[sc->offset]); break; case Atfd: print_integer_arg(sysdecode_atfd, fp, args[sc->offset]); break; case Atflags: fputs(xlookup_bits(at_flags, args[sc->offset]), fp); break; case Accessmode: print_mask_arg(sysdecode_access_mode, fp, args[sc->offset]); break; case Sysarch: fputs(xlookup(sysarch_ops, args[sc->offset]), fp); break; case PipeFds: /* * The pipe() system call in the kernel returns its * two file descriptors via return values. However, * the interface exposed by libc is that pipe() * accepts a pointer to an array of descriptors. * Format the output to match the libc API by printing * the returned file descriptors as a fake argument. * * Overwrite the first retval to signal a successful * return as well. */ fprintf(fp, "{ %ld, %ld }", retval[0], retval[1]); retval[0] = 0; break; case Utrace: { size_t len; void *utrace_addr; len = args[sc->offset + 1]; utrace_addr = calloc(1, len); if (get_struct(pid, (void *)args[sc->offset], (void *)utrace_addr, len) != -1) print_utrace(fp, utrace_addr, len); else fprintf(fp, "0x%lx", args[sc->offset]); free(utrace_addr); break; } case IntArray: { int descriptors[16]; unsigned long i, ndescriptors; bool truncated; ndescriptors = args[sc->offset + 1]; truncated = false; if (ndescriptors > nitems(descriptors)) { ndescriptors = nitems(descriptors); truncated = true; } if (get_struct(pid, (void *)args[sc->offset], descriptors, ndescriptors * sizeof(descriptors[0])) != -1) { fprintf(fp, "{"); for (i = 0; i < ndescriptors; i++) fprintf(fp, i == 0 ? " %d" : ", %d", descriptors[i]); fprintf(fp, truncated ? ", ... 
}" : " }"); } else fprintf(fp, "0x%lx", args[sc->offset]); break; } case Pipe2: print_mask_arg(sysdecode_pipe2_flags, fp, args[sc->offset]); break; case CapFcntlRights: { uint32_t rights; if (sc->type & OUT) { if (get_struct(pid, (void *)args[sc->offset], &rights, sizeof(rights)) == -1) { fprintf(fp, "0x%lx", args[sc->offset]); break; } } else rights = args[sc->offset]; print_mask_arg32(sysdecode_cap_fcntlrights, fp, rights); break; } case Fadvice: print_integer_arg(sysdecode_fadvice, fp, args[sc->offset]); break; case FileFlags: { fflags_t rem; if (!sysdecode_fileflags(fp, args[sc->offset], &rem)) fprintf(fp, "0x%x", rem); else if (rem != 0) fprintf(fp, "|0x%x", rem); break; } case Flockop: print_mask_arg(sysdecode_flock_operation, fp, args[sc->offset]); break; case Getfsstatmode: print_integer_arg(sysdecode_getfsstat_mode, fp, args[sc->offset]); break; case Kldsymcmd: print_integer_arg(sysdecode_kldsym_cmd, fp, args[sc->offset]); break; case Kldunloadflags: print_integer_arg(sysdecode_kldunload_flags, fp, args[sc->offset]); break; case Madvice: print_integer_arg(sysdecode_madvice, fp, args[sc->offset]); break; case Socklent: fprintf(fp, "%u", (socklen_t)args[sc->offset]); break; case Sockprotocol: { const char *temp; int domain, protocol; domain = args[sc->offset - 2]; protocol = args[sc->offset]; if (protocol == 0) { fputs("0", fp); } else { temp = sysdecode_socket_protocol(domain, protocol); if (temp) { fputs(temp, fp); } else { fprintf(fp, "%d", protocol); } } break; } case Sockoptlevel: print_integer_arg(sysdecode_sockopt_level, fp, args[sc->offset]); break; case Sockoptname: { const char *temp; int level, name; level = args[sc->offset - 1]; name = args[sc->offset]; temp = sysdecode_sockopt_name(level, name); if (temp) { fputs(temp, fp); } else { fprintf(fp, "%d", name); } break; } case Msgflags: print_mask_arg(sysdecode_msg_flags, fp, args[sc->offset]); break; case CapRights: { cap_rights_t rights; if (get_struct(pid, (void *)args[sc->offset], &rights, sizeof(rights)) != -1) { fputs("{ ", fp); sysdecode_cap_rights(fp, &rights); fputs(" }", fp); } else fprintf(fp, "0x%lx", args[sc->offset]); break; } case Acltype: print_integer_arg(sysdecode_acltype, fp, args[sc->offset]); break; case Extattrnamespace: print_integer_arg(sysdecode_extattrnamespace, fp, args[sc->offset]); break; case Minherit: print_integer_arg(sysdecode_minherit_inherit, fp, args[sc->offset]); break; case Mlockall: print_mask_arg(sysdecode_mlockall_flags, fp, args[sc->offset]); break; case Mountflags: print_mask_arg(sysdecode_mount_flags, fp, args[sc->offset]); break; case Msync: print_mask_arg(sysdecode_msync_flags, fp, args[sc->offset]); break; case Priowhich: print_integer_arg(sysdecode_prio_which, fp, args[sc->offset]); break; case Ptraceop: print_integer_arg(sysdecode_ptrace_request, fp, args[sc->offset]); break; case Quotactlcmd: if (!sysdecode_quotactl_cmd(fp, args[sc->offset])) fprintf(fp, "%#x", (int)args[sc->offset]); break; case Reboothowto: print_mask_arg(sysdecode_reboot_howto, fp, args[sc->offset]); break; case Rtpriofunc: print_integer_arg(sysdecode_rtprio_function, fp, args[sc->offset]); break; case Schedpolicy: print_integer_arg(sysdecode_scheduler_policy, fp, args[sc->offset]); break; case Schedparam: { struct sched_param sp; if (get_struct(pid, (void *)args[sc->offset], &sp, sizeof(sp)) != -1) fprintf(fp, "{ %d }", sp.sched_priority); else fprintf(fp, "0x%lx", args[sc->offset]); break; } case CloudABIAdvice: fputs(xlookup(cloudabi_advice, args[sc->offset]), fp); break; case CloudABIClockID: 
fputs(xlookup(cloudabi_clockid, args[sc->offset]), fp); break; case ClouduABIFDSFlags: fputs(xlookup_bits(cloudabi_fdsflags, args[sc->offset]), fp); break; case CloudABIFDStat: { cloudabi_fdstat_t fds; if (get_struct(pid, (void *)args[sc->offset], &fds, sizeof(fds)) != -1) { fprintf(fp, "{ %s, ", xlookup(cloudabi_filetype, fds.fs_filetype)); fprintf(fp, "%s, ... }", xlookup_bits(cloudabi_fdflags, fds.fs_flags)); } else fprintf(fp, "0x%lx", args[sc->offset]); break; } case CloudABIFileStat: { cloudabi_filestat_t fsb; if (get_struct(pid, (void *)args[sc->offset], &fsb, sizeof(fsb)) != -1) fprintf(fp, "{ %s, %ju }", xlookup(cloudabi_filetype, fsb.st_filetype), (uintmax_t)fsb.st_size); else fprintf(fp, "0x%lx", args[sc->offset]); break; } case CloudABIFileType: fputs(xlookup(cloudabi_filetype, args[sc->offset]), fp); break; case CloudABIFSFlags: fputs(xlookup_bits(cloudabi_fsflags, args[sc->offset]), fp); break; case CloudABILookup: if ((args[sc->offset] & CLOUDABI_LOOKUP_SYMLINK_FOLLOW) != 0) fprintf(fp, "%d|LOOKUP_SYMLINK_FOLLOW", (int)args[sc->offset]); else fprintf(fp, "%d", (int)args[sc->offset]); break; case CloudABIMFlags: fputs(xlookup_bits(cloudabi_mflags, args[sc->offset]), fp); break; case CloudABIMProt: fputs(xlookup_bits(cloudabi_mprot, args[sc->offset]), fp); break; case CloudABIMSFlags: fputs(xlookup_bits(cloudabi_msflags, args[sc->offset]), fp); break; case CloudABIOFlags: fputs(xlookup_bits(cloudabi_oflags, args[sc->offset]), fp); break; case CloudABISDFlags: fputs(xlookup_bits(cloudabi_sdflags, args[sc->offset]), fp); break; case CloudABISignal: fputs(xlookup(cloudabi_signal, args[sc->offset]), fp); break; case CloudABISockStat: { cloudabi_sockstat_t ss; if (get_struct(pid, (void *)args[sc->offset], &ss, sizeof(ss)) != -1) { fprintf(fp, "%s, ", xlookup( cloudabi_errno, ss.ss_error)); fprintf(fp, "%s }", xlookup_bits( cloudabi_ssstate, ss.ss_state)); } else fprintf(fp, "0x%lx", args[sc->offset]); break; } case CloudABISSFlags: fputs(xlookup_bits(cloudabi_ssflags, args[sc->offset]), fp); break; case CloudABITimestamp: fprintf(fp, "%lu.%09lus", args[sc->offset] / 1000000000, args[sc->offset] % 1000000000); break; case CloudABIULFlags: fputs(xlookup_bits(cloudabi_ulflags, args[sc->offset]), fp); break; case CloudABIWhence: fputs(xlookup(cloudabi_whence, args[sc->offset]), fp); break; default: errx(1, "Invalid argument type %d\n", sc->type & ARG_MASK); } fclose(fp); return (tmp); } /* * Print (to outfile) the system call and its arguments. */ void print_syscall(struct trussinfo *trussinfo) { struct threadinfo *t; const char *name; char **s_args; int i, len, nargs; t = trussinfo->curthread; name = t->cs.sc->name; nargs = t->cs.nargs; s_args = t->cs.s_args; len = print_line_prefix(trussinfo); len += fprintf(trussinfo->outfile, "%s(", name); for (i = 0; i < nargs; i++) { if (s_args[i] != NULL) len += fprintf(trussinfo->outfile, "%s", s_args[i]); else len += fprintf(trussinfo->outfile, ""); len += fprintf(trussinfo->outfile, "%s", i < (nargs - 1) ? 
"," : ""); } len += fprintf(trussinfo->outfile, ")"); for (i = 0; i < 6 - (len / 8); i++) fprintf(trussinfo->outfile, "\t"); } void print_syscall_ret(struct trussinfo *trussinfo, int errorp, long *retval) { struct timespec timediff; struct threadinfo *t; struct syscall *sc; int error; t = trussinfo->curthread; sc = t->cs.sc; if (trussinfo->flags & COUNTONLY) { timespecsubt(&t->after, &t->before, &timediff); timespecadd(&sc->time, &timediff, &sc->time); sc->ncalls++; if (errorp) sc->nerror++; return; } print_syscall(trussinfo); fflush(trussinfo->outfile); if (retval == NULL) { /* * This system call resulted in the current thread's exit, * so there is no return value or error to display. */ fprintf(trussinfo->outfile, "\n"); return; } if (errorp) { error = sysdecode_abi_to_freebsd_errno(t->proc->abi->abi, retval[0]); fprintf(trussinfo->outfile, " ERR#%ld '%s'\n", retval[0], error == INT_MAX ? "Unknown error" : strerror(error)); } #ifndef __LP64__ else if (sc->ret_type == 2) { off_t off; #if _BYTE_ORDER == _LITTLE_ENDIAN off = (off_t)retval[1] << 32 | retval[0]; #else off = (off_t)retval[0] << 32 | retval[1]; #endif fprintf(trussinfo->outfile, " = %jd (0x%jx)\n", (intmax_t)off, (intmax_t)off); } #endif else fprintf(trussinfo->outfile, " = %ld (0x%lx)\n", retval[0], retval[0]); } void print_summary(struct trussinfo *trussinfo) { struct timespec total = {0, 0}; struct syscall *sc; int ncall, nerror; fprintf(trussinfo->outfile, "%-20s%15s%8s%8s\n", "syscall", "seconds", "calls", "errors"); ncall = nerror = 0; STAILQ_FOREACH(sc, &syscalls, entries) if (sc->ncalls) { fprintf(trussinfo->outfile, "%-20s%5jd.%09ld%8d%8d\n", sc->name, (intmax_t)sc->time.tv_sec, sc->time.tv_nsec, sc->ncalls, sc->nerror); timespecadd(&total, &sc->time, &total); ncall += sc->ncalls; nerror += sc->nerror; } fprintf(trussinfo->outfile, "%20s%15s%8s%8s\n", "", "-------------", "-------", "-------"); fprintf(trussinfo->outfile, "%-20s%5jd.%09ld%8d%8d\n", "", (intmax_t)total.tv_sec, total.tv_nsec, ncall, nerror); } Index: projects/runtime-coverage/usr.sbin/makefs/mtree.c =================================================================== --- projects/runtime-coverage/usr.sbin/makefs/mtree.c (revision 322921) +++ projects/runtime-coverage/usr.sbin/makefs/mtree.c (revision 322922) @@ -1,1116 +1,1104 @@ /*- * Copyright (c) 2011 Marcel Moolenaar * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. 
* IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT, * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. */ #include __FBSDID("$FreeBSD$"); #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include +#include #include "makefs.h" #ifndef ENOATTR #define ENOATTR ENOMSG #endif #define IS_DOT(nm) ((nm)[0] == '.' && (nm)[1] == '\0') #define IS_DOTDOT(nm) ((nm)[0] == '.' && (nm)[1] == '.' && (nm)[2] == '\0') struct mtree_fileinfo { SLIST_ENTRY(mtree_fileinfo) next; FILE *fp; const char *name; u_int line; }; /* Global state used while parsing. */ static SLIST_HEAD(, mtree_fileinfo) mtree_fileinfo = SLIST_HEAD_INITIALIZER(mtree_fileinfo); static fsnode *mtree_root; static fsnode *mtree_current; static fsnode mtree_global; static fsinode mtree_global_inode; static u_int errors, warnings; static void mtree_error(const char *, ...) __printflike(1, 2); static void mtree_warning(const char *, ...) __printflike(1, 2); static int mtree_file_push(const char *name, FILE *fp) { struct mtree_fileinfo *fi; fi = emalloc(sizeof(*fi)); if (strcmp(name, "-") == 0) fi->name = estrdup("(stdin)"); else fi->name = estrdup(name); if (fi->name == NULL) { free(fi); return (ENOMEM); } fi->fp = fp; fi->line = 0; SLIST_INSERT_HEAD(&mtree_fileinfo, fi, next); return (0); } static void mtree_print(const char *msgtype, const char *fmt, va_list ap) { struct mtree_fileinfo *fi; if (msgtype != NULL) { fi = SLIST_FIRST(&mtree_fileinfo); if (fi != NULL) fprintf(stderr, "%s:%u: ", fi->name, fi->line); fprintf(stderr, "%s: ", msgtype); } vfprintf(stderr, fmt, ap); } static void mtree_error(const char *fmt, ...) { va_list ap; va_start(ap, fmt); mtree_print("error", fmt, ap); va_end(ap); errors++; fputc('\n', stderr); } static void mtree_warning(const char *fmt, ...) { va_list ap; va_start(ap, fmt); mtree_print("warning", fmt, ap); va_end(ap); warnings++; fputc('\n', stderr); } #ifndef MAKEFS_MAX_TREE_DEPTH # define MAKEFS_MAX_TREE_DEPTH (MAXPATHLEN/2) #endif /* construct path to node->name */ static char * mtree_file_path(fsnode *node) { fsnode *pnode; struct sbuf *sb; char *res, *rp[MAKEFS_MAX_TREE_DEPTH]; int depth; depth = 0; rp[depth] = node->name; for (pnode = node->parent; pnode && depth < MAKEFS_MAX_TREE_DEPTH - 1; pnode = pnode->parent) { if (strcmp(pnode->name, ".") == 0) break; rp[++depth] = pnode->name; } sb = sbuf_new_auto(); if (sb == NULL) { errno = ENOMEM; return (NULL); } while (depth > 0) { sbuf_cat(sb, rp[depth--]); sbuf_putc(sb, '/'); } sbuf_cat(sb, rp[depth]); sbuf_finish(sb); res = estrdup(sbuf_data(sb)); sbuf_delete(sb); if (res == NULL) errno = ENOMEM; return res; } /* mtree_resolve() sets errno to indicate why NULL was returned. */ static char * mtree_resolve(const char *spec, int *istemp) { struct sbuf *sb; char *res, *var = NULL; const char *base, *p, *v; size_t len; int c, error, quoted, subst; len = strlen(spec); if (len == 0) { errno = EINVAL; return (NULL); } c = (len > 1) ? (spec[0] == spec[len - 1]) ? spec[0] : 0 : 0; *istemp = (c == '`') ? 1 : 0; subst = (c == '`' || c == '"') ? 
1 : 0; quoted = (subst || c == '\'') ? 1 : 0; if (!subst) { res = estrdup(spec + quoted); if (quoted) res[len - 2] = '\0'; return (res); } sb = sbuf_new_auto(); if (sb == NULL) { errno = ENOMEM; return (NULL); } base = spec + 1; len -= 2; error = 0; while (len > 0) { p = strchr(base, '$'); if (p == NULL) { sbuf_bcat(sb, base, len); base += len; len = 0; continue; } /* The following is safe. spec always starts with a quote. */ if (p[-1] == '\\') p--; if (base != p) { sbuf_bcat(sb, base, p - base); len -= p - base; base = p; } if (*p == '\\') { sbuf_putc(sb, '$'); base += 2; len -= 2; continue; } /* Skip the '$'. */ base++; len--; /* Handle ${X} vs $X. */ v = base; if (*base == '{') { p = strchr(v, '}'); if (p == NULL) p = v; } else p = v; len -= (p + 1) - base; base = p + 1; if (v == p) { sbuf_putc(sb, *v); continue; } error = ENOMEM; var = ecalloc(p - v, 1); memcpy(var, v + 1, p - v - 1); if (strcmp(var, ".CURDIR") == 0) { res = getcwd(NULL, 0); if (res == NULL) break; } else if (strcmp(var, ".PROG") == 0) { res = estrdup(getprogname()); } else { v = getenv(var); if (v != NULL) { res = estrdup(v); } else res = NULL; } error = 0; if (res != NULL) { sbuf_cat(sb, res); free(res); } free(var); var = NULL; } free(var); sbuf_finish(sb); res = (error == 0) ? strdup(sbuf_data(sb)) : NULL; sbuf_delete(sb); if (res == NULL) errno = ENOMEM; return (res); } static int skip_over(FILE *fp, const char *cs) { int c; c = getc(fp); while (c != EOF && strchr(cs, c) != NULL) c = getc(fp); if (c != EOF) { ungetc(c, fp); return (0); } return (ferror(fp) ? errno : -1); } static int skip_to(FILE *fp, const char *cs) { int c; c = getc(fp); while (c != EOF && strchr(cs, c) == NULL) c = getc(fp); if (c != EOF) { ungetc(c, fp); return (0); } return (ferror(fp) ? errno : -1); } static int read_word(FILE *fp, char *buf, size_t bufsz) { struct mtree_fileinfo *fi; size_t idx, qidx; int c, done, error, esc, qlvl; if (bufsz == 0) return (EINVAL); done = 0; esc = 0; idx = 0; qidx = -1; qlvl = 0; do { c = getc(fp); switch (c) { case EOF: buf[idx] = '\0'; error = ferror(fp) ? errno : -1; if (error == -1) mtree_error("unexpected end of file"); return (error); case '#': /* comment -- skip to end of line. */ if (!esc) { error = skip_to(fp, "\n"); if (!error) continue; } break; case '\\': esc++; - if (esc == 1) - continue; break; case '`': case '\'': case '"': if (esc) break; if (qlvl == 0) { qlvl++; qidx = idx; } else if (c == buf[qidx]) { qlvl--; if (qlvl > 0) { do { qidx--; } while (buf[qidx] != '`' && buf[qidx] != '\'' && buf[qidx] != '"'); } else qidx = -1; } else { qlvl++; qidx = idx; } break; case ' ': case '\t': case '\n': if (!esc && qlvl == 0) { ungetc(c, fp); c = '\0'; done = 1; break; } if (c == '\n') { /* * We going to eat the newline ourselves. 
*/ if (qlvl > 0) mtree_warning("quoted word straddles " "onto next line."); fi = SLIST_FIRST(&mtree_fileinfo); fi->line++; } break; - case 'a': + default: if (esc) - c = '\a'; + buf[idx++] = '\\'; break; - case 'b': - if (esc) - c = '\b'; - break; - case 'f': - if (esc) - c = '\f'; - break; - case 'n': - if (esc) - c = '\n'; - break; - case 'r': - if (esc) - c = '\r'; - break; - case 't': - if (esc) - c = '\t'; - break; - case 'v': - if (esc) - c = '\v'; - break; } buf[idx++] = c; esc = 0; } while (idx < bufsz && !done); if (idx >= bufsz) { mtree_error("word too long to fit buffer (max %zu characters)", bufsz); skip_to(fp, " \t\n"); } return (0); } static fsnode * create_node(const char *name, u_int type, fsnode *parent, fsnode *global) { fsnode *n; n = ecalloc(1, sizeof(*n)); n->name = estrdup(name); n->type = (type == 0) ? global->type : type; n->parent = parent; n->inode = ecalloc(1, sizeof(*n->inode)); /* Assign global options/defaults. */ memcpy(n->inode, global->inode, sizeof(*n->inode)); n->inode->st.st_mode = (n->inode->st.st_mode & ~S_IFMT) | n->type; if (n->type == S_IFLNK) n->symlink = global->symlink; else if (n->type == S_IFREG) n->contents = global->contents; return (n); } static void destroy_node(fsnode *n) { assert(n != NULL); assert(n->name != NULL); assert(n->inode != NULL); free(n->inode); free(n->name); free(n); } static int read_number(const char *tok, u_int base, intmax_t *res, intmax_t min, intmax_t max) { char *end; intmax_t val; val = strtoimax(tok, &end, base); if (end == tok || end[0] != '\0') return (EINVAL); if (val < min || val > max) return (EDOM); *res = val; return (0); } static int read_mtree_keywords(FILE *fp, fsnode *node) { char keyword[PATH_MAX]; char *name, *p, *value; gid_t gid; uid_t uid; struct stat *st, sb; intmax_t num; u_long flset, flclr; int error, istemp; uint32_t type; st = &node->inode->st; do { error = skip_over(fp, " \t"); if (error) break; error = read_word(fp, keyword, sizeof(keyword)); if (error) break; if (keyword[0] == '\0') break; value = strchr(keyword, '='); if (value != NULL) *value++ = '\0'; /* * We use EINVAL, ENOATTR, ENOSYS and ENXIO to signal * certain conditions: * EINVAL - Value provided for a keyword that does * not take a value. The value is ignored. * ENOATTR - Value missing for a keyword that needs * a value. The keyword is ignored. * ENOSYS - Unsupported keyword encountered. The * keyword is ignored. * ENXIO - Value provided for a keyword that does * not take a value. The value is ignored. 
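	 *
	 * For illustration only (hypothetical paths), a manifest line such as
	 *
	 *	./bin/sh type=file uname=root gname=wheel mode=0555 contents=./obj/bin/sh
	 *
	 * is consumed one keyword=value pair at a time by the switch below;
	 * an unsupported keyword is merely warned about (ENOSYS), so a
	 * manifest using newer keywords still parses.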
*/ switch (keyword[0]) { case 'c': if (strcmp(keyword, "contents") == 0) { if (value == NULL) { error = ENOATTR; break; } node->contents = estrdup(value); } else error = ENOSYS; break; case 'f': if (strcmp(keyword, "flags") == 0) { if (value == NULL) { error = ENOATTR; break; } flset = flclr = 0; if (!strtofflags(&value, &flset, &flclr)) { st->st_flags &= ~flclr; st->st_flags |= flset; } else error = errno; } else error = ENOSYS; break; case 'g': if (strcmp(keyword, "gid") == 0) { if (value == NULL) { error = ENOATTR; break; } error = read_number(value, 10, &num, 0, UINT_MAX); if (!error) st->st_gid = num; } else if (strcmp(keyword, "gname") == 0) { if (value == NULL) { error = ENOATTR; break; } if (gid_from_group(value, &gid) == 0) st->st_gid = gid; else error = EINVAL; } else error = ENOSYS; break; case 'l': if (strcmp(keyword, "link") == 0) { if (value == NULL) { error = ENOATTR; break; } - node->symlink = estrdup(value); + node->symlink = emalloc(strlen(value) + 1); + if (node->symlink == NULL) { + error = errno; + break; + } + if (strunvis(node->symlink, value) < 0) { + error = errno; + break; + } } else error = ENOSYS; break; case 'm': if (strcmp(keyword, "mode") == 0) { if (value == NULL) { error = ENOATTR; break; } if (value[0] >= '0' && value[0] <= '9') { error = read_number(value, 8, &num, 0, 07777); if (!error) { st->st_mode &= S_IFMT; st->st_mode |= num; } } else { /* Symbolic mode not supported. */ error = EINVAL; break; } } else error = ENOSYS; break; case 'o': if (strcmp(keyword, "optional") == 0) { if (value != NULL) error = ENXIO; node->flags |= FSNODE_F_OPTIONAL; } else error = ENOSYS; break; case 's': if (strcmp(keyword, "size") == 0) { if (value == NULL) { error = ENOATTR; break; } error = read_number(value, 10, &num, 0, INTMAX_MAX); if (!error) st->st_size = num; } else error = ENOSYS; break; case 't': if (strcmp(keyword, "time") == 0) { if (value == NULL) { error = ENOATTR; break; } p = strchr(value, '.'); if (p != NULL) *p++ = '\0'; error = read_number(value, 10, &num, 0, INTMAX_MAX); if (error) break; st->st_atime = num; st->st_ctime = num; st->st_mtime = num; if (p == NULL) break; error = read_number(p, 10, &num, 0, INTMAX_MAX); if (error) break; if (num != 0) error = EINVAL; } else if (strcmp(keyword, "type") == 0) { if (value == NULL) { error = ENOATTR; break; } if (strcmp(value, "dir") == 0) node->type = S_IFDIR; else if (strcmp(value, "file") == 0) node->type = S_IFREG; else if (strcmp(value, "link") == 0) node->type = S_IFLNK; else error = EINVAL; } else error = ENOSYS; break; case 'u': if (strcmp(keyword, "uid") == 0) { if (value == NULL) { error = ENOATTR; break; } error = read_number(value, 10, &num, 0, UINT_MAX); if (!error) st->st_uid = num; } else if (strcmp(keyword, "uname") == 0) { if (value == NULL) { error = ENOATTR; break; } if (uid_from_user(value, &uid) == 0) st->st_uid = uid; else error = EINVAL; } else error = ENOSYS; break; default: error = ENOSYS; break; } switch (error) { case EINVAL: mtree_error("%s: invalid value '%s'", keyword, value); break; case ENOATTR: mtree_error("%s: keyword needs a value", keyword); break; case ENOSYS: mtree_warning("%s: unsupported keyword", keyword); break; case ENXIO: mtree_error("%s: keyword does not take a value", keyword); break; } } while (1); if (error) return (error); st->st_mode = (st->st_mode & ~S_IFMT) | node->type; /* Nothing more to do for the global defaults. */ if (node->name == NULL) return (0); /* * Be intelligent about the file type. 
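 * A contents keyword implies a regular file (and may not be combined
 * with link); otherwise an explicit type keyword wins, a link keyword
 * implies a symlink, and the fallback is a directory.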
*/ if (node->contents != NULL) { if (node->symlink != NULL) { mtree_error("%s: both link and contents keywords " "defined", node->name); return (0); } type = S_IFREG; } else if (node->type != 0) { type = node->type; if (type == S_IFREG) { /* the named path is the default contents */ node->contents = mtree_file_path(node); } } else type = (node->symlink != NULL) ? S_IFLNK : S_IFDIR; if (node->type == 0) node->type = type; if (node->type != type) { mtree_error("%s: file type and defined keywords to not match", node->name); return (0); } st->st_mode = (st->st_mode & ~S_IFMT) | node->type; if (node->contents == NULL) return (0); name = mtree_resolve(node->contents, &istemp); if (name == NULL) return (errno); if (stat(name, &sb) != 0) { mtree_error("%s: contents file '%s' not found", node->name, name); free(name); return (0); } /* * Check for hardlinks. If the contents key is used, then the check * will only trigger if the contents file is a link even if it is used * by more than one file */ if (sb.st_nlink > 1) { fsinode *curino; st->st_ino = sb.st_ino; st->st_dev = sb.st_dev; curino = link_check(node->inode); if (curino != NULL) { free(node->inode); node->inode = curino; node->inode->nlink++; } } free(node->contents); node->contents = name; st->st_size = sb.st_size; return (0); } static int read_mtree_command(FILE *fp) { char cmd[10]; int error; error = read_word(fp, cmd, sizeof(cmd)); if (error) goto out; error = read_mtree_keywords(fp, &mtree_global); out: skip_to(fp, "\n"); (void)getc(fp); return (error); } static int read_mtree_spec1(FILE *fp, bool def, const char *name) { fsnode *last, *node, *parent; u_int type; int error; assert(name[0] != '\0'); /* * Treat '..' specially, because it only changes our current * directory. We don't create a node for it. We simply ignore * any keywords that may appear on the line as well. * Going up a directory is a little non-obvious. A directory * node has a corresponding '.' child. The parent of '.' is * not the '.' node of the parent directory, but the directory * node within the parent to which the child relates. However, * going up a directory means we need to find the '.' node to * which the directoy node is linked. This we can do via the * first * pointer, because '.' is always the first entry in a * directory. */ if (IS_DOTDOT(name)) { /* This deals with NULL pointers as well. */ if (mtree_current == mtree_root) { mtree_warning("ignoring .. in root directory"); return (0); } node = mtree_current; assert(node != NULL); assert(IS_DOT(node->name)); assert(node->first == node); /* Get the corresponding directory node in the parent. */ node = mtree_current->parent; assert(node != NULL); assert(!IS_DOT(node->name)); node = node->first; assert(node != NULL); assert(IS_DOT(node->name)); assert(node->first == node); mtree_current = node; return (0); } /* * If we don't have a current directory and the first specification * (either implicit or defined) is not '.', then we need to create * a '.' node first (using a recursive call). */ if (!IS_DOT(name) && mtree_current == NULL) { error = read_mtree_spec1(fp, false, "."); if (error) return (error); } /* * Lookup the name in the current directory (if we have a current * directory) to make sure we do not create multiple nodes for the * same component. For non-definitions, if we find a node with the * same name, simply change the current directory. For definitions * more happens. 
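 * (A definition that repeats an existing name is reported as a
 * duplicate: an error by default, or only a warning when duplicates
 * are allowed (dupsok); the original node is kept either way.)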
*/ last = NULL; node = mtree_current; while (node != NULL) { assert(node->first == mtree_current); if (strcmp(name, node->name) == 0) { if (def == true) { if (!dupsok) mtree_error( "duplicate definition of %s", name); else mtree_warning( "duplicate definition of %s", name); return (0); } if (node->type != S_IFDIR) { mtree_error("%s is not a directory", name); return (0); } assert(!IS_DOT(name)); node = node->child; assert(node != NULL); assert(IS_DOT(node->name)); mtree_current = node; return (0); } last = node; node = last->next; } parent = (mtree_current != NULL) ? mtree_current->parent : NULL; type = (def == false || IS_DOT(name)) ? S_IFDIR : 0; node = create_node(name, type, parent, &mtree_global); if (node == NULL) return (ENOMEM); if (def == true) { error = read_mtree_keywords(fp, node); if (error) { destroy_node(node); return (error); } } node->first = (mtree_current != NULL) ? mtree_current : node; if (last != NULL) last->next = node; if (node->type != S_IFDIR) return (0); if (!IS_DOT(node->name)) { parent = node; node = create_node(".", S_IFDIR, parent, parent); if (node == NULL) { last->next = NULL; destroy_node(parent); return (ENOMEM); } parent->child = node; node->first = node; } assert(node != NULL); assert(IS_DOT(node->name)); assert(node->first == node); mtree_current = node; if (mtree_root == NULL) mtree_root = node; return (0); } static int read_mtree_spec(FILE *fp) { - char pathspec[PATH_MAX]; + char pathspec[PATH_MAX], pathtmp[4*PATH_MAX + 1]; char *cp; int error; - error = read_word(fp, pathspec, sizeof(pathspec)); + error = read_word(fp, pathtmp, sizeof(pathtmp)); if (error) goto out; + if (strnunvis(pathspec, PATH_MAX, pathtmp) == -1) { + error = errno; + goto out; + } + error = 0; cp = strchr(pathspec, '/'); if (cp != NULL) { /* Absolute pathname */ mtree_current = mtree_root; do { *cp++ = '\0'; /* Disallow '..' as a component. */ if (IS_DOTDOT(pathspec)) { mtree_error("absolute path cannot contain " ".. component"); goto out; } /* Ignore multiple adjacent slashes and '.'. */ if (pathspec[0] != '\0' && !IS_DOT(pathspec)) error = read_mtree_spec1(fp, false, pathspec); memmove(pathspec, cp, strlen(cp) + 1); cp = strchr(pathspec, '/'); } while (!error && cp != NULL); /* Disallow '.' and '..' as the last component. */ if (!error && (IS_DOT(pathspec) || IS_DOTDOT(pathspec))) { mtree_error("absolute path cannot contain . or .. " "components"); goto out; } } /* Ignore absolute specfications that end with a slash. */ if (!error && pathspec[0] != '\0') error = read_mtree_spec1(fp, true, pathspec); out: skip_to(fp, "\n"); (void)getc(fp); return (error); } fsnode * read_mtree(const char *fname, fsnode *node) { struct mtree_fileinfo *fi; FILE *fp; int c, error; /* We do not yet support nesting... */ assert(node == NULL); if (strcmp(fname, "-") == 0) fp = stdin; else { fp = fopen(fname, "r"); if (fp == NULL) err(1, "Can't open `%s'", fname); } error = mtree_file_push(fname, fp); if (error) goto out; memset(&mtree_global, 0, sizeof(mtree_global)); memset(&mtree_global_inode, 0, sizeof(mtree_global_inode)); mtree_global.inode = &mtree_global_inode; mtree_global_inode.nlink = 1; mtree_global_inode.st.st_nlink = 1; mtree_global_inode.st.st_atime = mtree_global_inode.st.st_ctime = mtree_global_inode.st.st_mtime = time(NULL); errors = warnings = 0; setgroupent(1); setpassent(1); mtree_root = node; mtree_current = node; do { /* Start of a new line... 
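 * Blank lines and '#' comments are skipped, a leading '/' introduces a
 * special command (e.g. /set), and anything else is parsed as a path
 * specification.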
*/ fi = SLIST_FIRST(&mtree_fileinfo); fi->line++; error = skip_over(fp, " \t"); if (error) break; c = getc(fp); if (c == EOF) { error = ferror(fp) ? errno : -1; break; } switch (c) { case '\n': /* empty line */ error = 0; break; case '#': /* comment -- skip to end of line. */ error = skip_to(fp, "\n"); if (!error) (void)getc(fp); break; case '/': /* special commands */ error = read_mtree_command(fp); break; default: /* specification */ ungetc(c, fp); error = read_mtree_spec(fp); break; } } while (!error); endpwent(); endgrent(); if (error <= 0 && (errors || warnings)) { warnx("%u error(s) and %u warning(s) in mtree manifest", errors, warnings); if (errors) exit(1); } out: if (error > 0) errc(1, error, "Error reading mtree file"); if (fp != stdin) fclose(fp); if (mtree_root != NULL) return (mtree_root); /* Handle empty specifications. */ node = create_node(".", S_IFDIR, NULL, &mtree_global); node->first = node; return (node); } Index: projects/runtime-coverage/usr.sbin/vidcontrol/vidcontrol.1 =================================================================== --- projects/runtime-coverage/usr.sbin/vidcontrol/vidcontrol.1 (revision 322921) +++ projects/runtime-coverage/usr.sbin/vidcontrol/vidcontrol.1 (revision 322922) @@ -1,686 +1,728 @@ .\" .\" vidcontrol - a utility for manipulating the syscons or vt video driver .\" .\" Redistribution and use in source and binary forms, with or without .\" modification, are permitted provided that the following conditions .\" are met: .\" 1. Redistributions of source code must retain the above copyright .\" notice, this list of conditions and the following disclaimer. .\" 2. Redistributions in binary form must reproduce the above copyright .\" notice, this list of conditions and the following disclaimer in the .\" documentation and/or other materials provided with the distribution. .\" .\" @(#)vidcontrol.1 .\" $FreeBSD$ .\" .Dd January 19, 2016 .Dt VIDCONTROL 1 .Os .Sh NAME .Nm vidcontrol .Nd system console control and configuration utility .Sh SYNOPSIS .Nm .Op Fl CdLHPpx .Op Fl b Ar color .Op Fl c Ar appearance .Oo .Fl f .Oo .Op Ar size .Ar file .Oc .Oc .Op Fl g Ar geometry .Op Fl h Ar size .Op Fl i Cm active | adapter | mode .Op Fl l Ar screen_map .Op Fl M Ar char .Op Fl m Cm on | off .Op Fl r Ar foreground Ar background .Op Fl S Cm on | off .Op Fl s Ar number .Op Fl T Cm xterm | cons25 .Op Fl t Ar N | Cm off .Op Ar mode .Op Ar foreground Op Ar background .Op Cm show .Sh DESCRIPTION The .Nm utility is used to set various options for the .Xr syscons 4 or .Xr vt 4 console driver, such as video mode, colors, cursor shape, screen output map, font and screen saver timeout. Only a small subset of options is supported by .Xr vt 4 . Unsupported options lead to error messages, typically including the text "Inappropriate ioctl for device". .Pp The following command line options are supported: .Bl -tag -width indent .It Ar mode Select a new video mode. The modes currently recognized are: .Ar 80x25 , .Ar 80x30 , .Ar 80x43 , .Ar 80x50 , .Ar 80x60 , .Ar 132x25 , .Ar 132x30 , .Ar 132x43 , .Ar 132x50 , .Ar 132x60 , .Ar VGA_40x25 , .Ar VGA_80x25 , .Ar VGA_80x30 , .Ar VGA_80x50 , .Ar VGA_80x60 , .Ar VGA_90x25 , .Ar VGA_90x30 , .Ar VGA_90x43 , .Ar VGA_90x50 , .Ar VGA_90x60 , .Ar EGA_80x25 , .Ar EGA_80x43 , .Ar VESA_132x25 , .Ar VESA_132x43 , .Ar VESA_132x50 , .Ar VESA_132x60 . .\"The graphic mode .\".Ar VGA_320x200 .\"and The raster text mode .Ar VESA_800x600 can also be chosen. 
Alternatively, a mode can be specified with its number by using a mode name of the form .Li MODE_ Ns Aq Ar NUMBER . A list of valid mode numbers can be obtained with the .Fl i Cm mode option. See .Sx Video Mode Support below. .It Ar foreground Op Ar background Change colors when displaying text. Specify the foreground color (e.g.\& .Dq vidcontrol white ) , or both a foreground and background colors (e.g.\& .Dq vidcontrol yellow blue ) . Use the .Cm show command below to see available colors. .It Cm show See the supported colors on a given platform. .It Fl b Ar color Set border color to .Ar color . This option may not be always supported by the video driver. .It Fl C Clear the history buffer. .It Fl c Ar setting Ns Op , Ns Ar setting ... Change the cursor appearance. The change is specified by a non-empty comma-separated list of .Cm setting Ns s . Each .Cm setting overrides or modifies previous ones in left to right order. .Pp The following override .Cm setting Ns s are available: .Bl -tag -width indent .It Cm normal Set to a block covering 1 character cell, with a configuration-dependent coloring that should be at worst inverse video. .It Cm destructive Set to a blinking sub-block with .Cm height scanlines starting at .Cm base . The name .Dq destructive is bad for backwards compatibility. This .Cm setting should not force destructiveness, and it now only gives destructiveness in some configurations (typically for hardware cursors in text mode). Blinking limits destructiveness. This .Cm setting should now be spelled .Cm normal , Ns Cm blink , Ns Cm noblock . A non-blinking destructive cursor would be unusable, so old versions of .Nm didn't support it, and this version doesn't have an override for it. .It Cm base Ns = Ns Ar value, Cm height Ns = Ns Ar value Set the specified scanline parameters. These parameters are only active in .Cm noblock mode. .Cm value is an integer in any base supported by .Xr strtol 3 . Setting .Cm height to 0 turns off the cursor in .Cm noblock mode. Negative .Ar value Ns s are silently ignored. Positive .Ar value Ns s are clamped to fit in the character cell when the cursor is drawn. .El .Pp The following modifier .Cm setting Ns s are available: .Bl -tag -width indent .It Cm blink , noblink Set or clear the blinking attribute. This is not quite backwards compatible. In old versions of .Nm , Cm blink was an override to a blinking block. .It Cm block , noblock Set or clear the .Cm block attribute. This attribute is the inverse of the flag .Dv CONS_CHAR_CURSOR in the implementation. It deactivates the scanline parameters, and expresses a preference for using a simpler method of implementation. Its inverse does the opposite. When the scanline parameters give a full block, this attribute reduces to a method selection bit. The .Cm block method tends to give better coloring. .It Cm hidden , nohidden Set or clear the hidden attribute. .El .Pp The following (non-sticky) flags control application of the .Cm setting Ns s : .Bl -tag -width indent +.It Cm charcolors +Apply +.Cm base +and +.Cm height +to the (character) cursor's list of preferred colors instead of its shape. +Beware that the color numbers are raw VGA palette indexes, +not ANSI color numbers. +The indexes are reduced mod 8, 16 or 256, +or ignored, +depending on the video mode and renderer. +.It Cm mousecolors +Colors for the mouse cursor in graphics mode. +Like +.Cm charcolors , +except there is no preference or sequence; +.Cm base +gives the mouse border color and +.Cm height +gives the mouse interior color. 
+Together with +.Cm charcolors , +this gives 2 selection bits which select between +only 3 of 4 sub-destinations of the 4 destinations selected by +.Cm default +and +.Cm local +(by ignoring +.Cm mousecolors +if +.Cm charcolors +is also set). +.It Cm default +Apply the changes to the default settings and then to the active settings, +instead of only to the active settings. +Together with +.Cm local , +this gives 2 selection bits which select between 4 destinations. +.It Cm shapeonly +Ignore any changes to the +.Cm block +and +.Cm hidden +attributes. .It Cm local Apply the changes to the current vty. The default is to apply them to a global place and copy from there to all vtys. .It Cm reset Reset everything. The default is to not reset. When the .Cm local parameter is specified, the current local settings are reset to default local settings. Otherwise, the current global settings are reset to default global settings and then copied to the current and default settings for all vtys. -The global defaults are decided (not quite right) at boot time -and cannot be fixed up. -The local defaults are obtained as above and cannot be fixed up -locally. +.It Cm show +Show the current changes. .El .It Fl d Print out current output screen map. .It Xo .Fl f .Oo .Op Ar size .Ar file .Oc .Xc Load font .Ar file for .Ar size (currently, only .Cm 8x8 , .Cm 8x14 or .Cm 8x16 ) . The font file can be either uuencoded or in raw binary format. You can also use the menu-driven .Xr vidfont 1 command to load the font of your choice. .Pp .Ar Size may be omitted, in this case .Nm will try to guess it from the size of font file. .Pp When using .Xr vt 4 both .Ar size and .Ar font can be omitted, and the default font will be loaded. .Pp Note that older video cards, such as MDA and CGA, do not support software font. See also .Sx Video Mode Support and .Sx EXAMPLES below and the man page for either .Xr syscons 4 or .Xr vt 4 (depending on which driver you use). .It Fl g Ar geometry Set the .Ar geometry of the text mode for the modes with selectable geometry. Currently only raster modes, such as .Ar VESA_800x600 , support this option. See also .Sx Video Mode Support and .Sx EXAMPLES below. .It Fl h Ar size Set the size of the history (scrollback) buffer to .Ar size lines. .It Fl i Cm active Shows the active vty number. .It Fl i Cm adapter Shows info about the current video adapter. .It Fl i Cm mode Shows the possible video modes with the current video hardware. .It Fl l Ar screen_map Install screen output map file from .Ar screen_map . See also .Xr syscons 4 or .Xr vt 4 (depending on which driver you use). .It Fl L Install default screen output map. .It Fl M Ar char Sets the base character used to render the mouse pointer to .Ar char . .It Fl m Cm on | off Switch the mouse pointer .Cm on or .Cm off . Used together with the .Xr moused 8 daemon for text mode cut & paste functionality. .It Fl p Capture the current contents of the video buffer corresponding to the terminal device referred to by standard input. The .Nm utility writes contents of the video buffer to the standard output in a raw binary format. For details about that format see .Sx Format of Video Buffer Dump below. .It Fl P Same as .Fl p , but dump contents of the video buffer in a plain text format ignoring nonprintable characters and information about text attributes. .It Fl H When used with .Fl p or .Fl P , it instructs .Nm to dump full history buffer instead of visible portion of the video buffer only. 
.It Fl r Ar foreground background Change reverse mode colors to .Ar foreground and .Ar background . .It Fl S Cm on | off Turn vty switching on or off. When vty switching is off, attempts to switch to a different virtual terminal will fail. (The default is to permit vty switching.) This protection can be easily bypassed when the kernel is compiled with the .Dv DDB option. However, you probably should not compile the kernel debugger on a box which is supposed to be physically secure. .It Fl s Ar number Set the active vty to .Ar number . .It Fl T Cm xterm | cons25 Switch between xterm and cons25 style terminal emulation. .It Fl t Ar N | Cm off Set the screensaver timeout to .Ar N seconds, or turns it .Cm off . .It Fl x Use hexadecimal digits for output. .El .Ss Video Mode Support Note that not all modes listed above may be supported by the video hardware. You can verify which mode is supported by the video hardware, using the .Fl i Cm mode option. .Pp The VESA BIOS support must be linked to the kernel or loaded as a KLD module if you wish to use VESA video modes or 132 column modes (see .Xr vga 4 ) . .Pp You need to compile your kernel with the .Ar VGA_WIDTH90 option if you wish to use VGA 90 column modes (see .Xr vga 4 ) . .Pp Video modes other than 25 and 30 line modes may require specific size of font. Use .Fl f option above to load a font file to the kernel. If the required size of font has not been loaded to the kernel, .Nm will fail if the user attempts to set a new video mode. .Pp .Bl -column "25 line modes" "8x16 (VGA), 8x14 (EGA)" -compact .Sy Modes Ta Sy Font size .No 25 line modes Ta 8x16 (VGA), 8x14 (EGA) .No 30 line modes Ta 8x16 .No 43 line modes Ta 8x8 .No 50 line modes Ta 8x8 .No 60 line modes Ta 8x8 .El .Pp It is better to always load all three sizes (8x8, 8x14 and 8x16) of the same font. .Pp You may set variables in .Pa /etc/rc.conf or .Pa /etc/rc.conf.local so that desired font files will be automatically loaded when the system starts up. See below. .Pp If you want to use any of the raster text modes you need to recompile your kernel with the .Dv SC_PIXEL_MODE option. See .Xr syscons 4 or .Xr vt 4 (depending on which driver you use) for more details on this kernel option. .Ss Format of Video Buffer Dump The .Nm utility uses the .Xr syscons 4 .\" is it supported on vt(4)??? or .Xr vt 4 .Dv CONS_SCRSHOT .Xr ioctl 2 to capture the current contents of the video buffer. The .Nm utility writes version and additional information to the standard output, followed by the contents of the video buffer. .Pp VGA video memory is typically arranged in two byte tuples, one per character position. In each tuple, the first byte will be the character code, and the second byte is the character's color attribute. 
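.Pp
As a sketch only (hypothetical file name, assuming the version 1 header
layout described below), the following program splits such a dump back
into its character and attribute bytes:
.Bd -literal -offset indent
#include <stdio.h>

int
main(void)
{
	unsigned char hdr[10], ext[255], cell[2];
	FILE *fp;

	fp = fopen("shot.scr", "rb");	/* e.g. output of vidcontrol -p */
	if (fp == NULL || fread(hdr, 1, 10, fp) != 10 ||
	    fread(ext, 1, hdr[9], fp) != hdr[9])
		return (1);
	/* With format version 1, ext[0] is the width, ext[1] the depth. */
	while (fread(cell, 1, 2, fp) == 2)
		printf("char %3u attr 0x%02x\en", cell[0], cell[1]);
	fclose(fp);
	return (0);
}
.Ed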
.Pp The VGA color attribute byte looks like this: .Bl -column "X:X" "<00000000>" "width" "bright foreground color" .Sy "bits# width meaning" .Li "7 1 character blinking" .Li "6:4 <0XXX0000> 3 background color" .Li "3 <0000X000> 1 bright foreground color" .Li "2:0 <00000XXX> 3 foreground color" .El .Pp Here is a list of the three bit wide base colors: .Pp .Bl -hang -offset indent -compact .It 0 Black .It 1 Blue .It 2 Green .It 3 Cyan .It 4 Red .It 5 Magenta .It 6 Brown .It 7 Light Grey .El .Pp Base colors with bit 3 (the bright foreground flag) set: .Pp .Bl -hang -offset indent -compact .It 0 Dark Grey .It 1 Light Blue .It 2 Light Green .It 3 Light Cyan .It 4 Light Red .It 5 Light Magenta .It 6 Yellow .It 7 White .El .Pp For example, the two bytes .Pp .Dl "65 158" .Pp specify an uppercase A (character code 65), blinking (bit 7 set) in yellow (bits 3:0) on a blue background (bits 6:4). .Pp The .Nm output contains a small header which includes additional information which may be useful to utilities processing the output. .Pp The first 10 bytes are always arranged as follows: .Bl -column "Byte range" "Contents" -offset indent .It Sy "Byte Range Contents" .It "1 thru 8 Literal text" Dq Li SCRSHOT_ .It "9 File format version number" .It "10 Remaining number of bytes in the header" .El .Pp Subsequent bytes depend on the version number. .Bl -column "Version" "13 and up" -offset indent .It Sy "Version Byte Meaning" .It "1 11 Terminal width, in characters" .It " 12 Terminal depth, in characters" .It " 13 and up The snapshot data" .El .Pp So a dump of an 80x25 screen would start (in hex) .Bd -literal -offset indent 53 43 52 53 48 4f 54 5f 01 02 50 19 ----------------------- -- -- -- -- | | | | ` 25 decimal | | | `--- 80 decimal | | `------ 2 remaining bytes of header data | `--------- File format version 1 `------------------------ Literal "SCRSHOT_" .Ed .Sh VIDEO OUTPUT CONFIGURATION .Ss Boot Time Configuration You may set the following variables in .Pa /etc/rc.conf or .Pa /etc/rc.conf.local in order to configure the video output at boot time. .Pp .Bl -tag -width foo_bar_var -compact .It Ar blanktime Sets the timeout value for the .Fl t option. .It Ar font8x16 , font8x14 , font8x8 Specifies font files for the .Fl f option. .It Ar scrnmap Specifies a screen output map file for the .Fl l option. .El .Pp See .Xr rc.conf 5 for more details. .Ss Driver Configuration The video card driver may let you change default configuration options, such as the default font, so that you do not need to set up the options at boot time. See video card driver manuals, (e.g.\& .Xr vga 4 ) for details. .Sh FILES .Bl -tag -width /usr/share/syscons/scrnmaps/foo-bar -compact .It Pa /usr/share/syscons/fonts/* .It Pa /usr/share/vt/fonts/* font files. .It Pa /usr/share/syscons/scrnmaps/* screen output map files (relevant for .Xr syscons 4 only). 
.El .Sh EXAMPLES If you want to load .Pa /usr/share/syscons/fonts/iso-8x16.fnt to the kernel, run .Nm as: .Pp .Dl vidcontrol -f 8x16 /usr/share/syscons/fonts/iso-8x16.fnt .Pp So long as the font file is in .Pa /usr/share/syscons/fonts (if using syscons) or .Pa /usr/share/vt/fonts (if using vt), you may abbreviate the file name as .Pa iso-8x16 : .Pp .Dl vidcontrol -f 8x16 iso-8x16 .Pp Furthermore, you can also omit font size .Dq Li 8x16 : .Pp .Dl vidcontrol -f iso-8x16 .Pp Moreover, the suffix specifying the font size can be also omitted; in this case, .Nm will use the size of the currently displayed font to construct the suffix: .Pp .Dl vidcontrol -f iso .Pp Likewise, you can also abbreviate the screen output map file name for the .Fl l option if the file is found in .Pa /usr/share/syscons/scrnmaps . .Pp .Dl vidcontrol -l iso-8859-1_to_cp437 .Pp The above command will load .Pa /usr/share/syscons/scrnmaps/iso-8859-1_to_cp437.scm . .Pp The following command will set-up a 100x37 raster text mode (useful for some LCD models): .Pp .Dl vidcontrol -g 100x37 VESA_800x600 .Pp The following command will capture the contents of the first virtual terminal video buffer, and redirect the output to the .Pa shot.scr file: .Pp .Dl vidcontrol -p < /dev/ttyv0 > shot.scr .Pp The following command will dump contents of the fourth virtual terminal video buffer to the standard output in the human readable format: .Pp .Dl vidcontrol -P < /dev/ttyv3 .Sh SEE ALSO .Xr kbdcontrol 1 , .Xr vidfont 1 , .Xr keyboard 4 , .Xr screen 4 , .Xr syscons 4 , .Xr vga 4 , .Xr vt 4 , .Xr rc.conf 5 , .Xr kldload 8 , .Xr moused 8 , .Xr watch 8 .Pp The various .Pa scr2* utilities in the .Pa graphics and .Pa textproc categories of the .Em "Ports Collection" . .Sh AUTHORS .An S\(/oren Schmidt Aq Mt sos@FreeBSD.org .An Sascha Wildner Aq Mt saw@online.de .Sh CONTRIBUTORS .An -split .An Maxim Sobolev Aq Mt sobomax@FreeBSD.org .An Nik Clayton Aq Mt nik@FreeBSD.org Index: projects/runtime-coverage/usr.sbin/vidcontrol/vidcontrol.c =================================================================== --- projects/runtime-coverage/usr.sbin/vidcontrol/vidcontrol.c (revision 322921) +++ projects/runtime-coverage/usr.sbin/vidcontrol/vidcontrol.c (revision 322922) @@ -1,1518 +1,1531 @@ /*- * Copyright (c) 1994-1996 Søren Schmidt * All rights reserved. * * Portions of this software are based in part on the work of * Sascha Wildner contributed to The DragonFly Project * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer, * in this position and unchanged. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. The name of the author may not be used to endorse or promote products * derived from this software without specific prior written permission * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. 
* IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. * * $DragonFly: src/usr.sbin/vidcontrol/vidcontrol.c,v 1.10 2005/03/02 06:08:29 joerg Exp $ */ #ifndef lint static const char rcsid[] = "$FreeBSD$"; #endif /* not lint */ #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include "path.h" #include "decode.h" #define DATASIZE(x) ((x).w * (x).h * 256 / 8) /* Screen dump modes */ #define DUMP_FMT_RAW 1 #define DUMP_FMT_TXT 2 /* Screen dump options */ #define DUMP_FBF 0 #define DUMP_ALL 1 /* Screen dump file format revision */ #define DUMP_FMT_REV 1 static const char *legal_colors[16] = { "black", "blue", "green", "cyan", "red", "magenta", "brown", "white", "grey", "lightblue", "lightgreen", "lightcyan", "lightred", "lightmagenta", "yellow", "lightwhite" }; static struct { int active_vty; vid_info_t console_info; unsigned char screen_map[256]; int video_mode_number; struct video_info video_mode_info; } cur_info; struct vt4font_header { uint8_t magic[8]; uint8_t width; uint8_t height; uint16_t pad; uint32_t glyph_count; uint32_t map_count[4]; } __packed; static int hex = 0; static int vesa_cols; static int vesa_rows; static int font_height; static int vt4_mode = 0; static int video_mode_changed; static struct video_info new_mode_info; /* * Initialize revert data. * * NOTE: the following parameters are not yet saved/restored: * * screen saver timeout * cursor type * mouse character and mouse show/hide state * vty switching on/off state * history buffer size * history contents * font maps */ static void init(void) { if (ioctl(0, VT_GETACTIVE, &cur_info.active_vty) == -1) err(1, "getting active vty"); cur_info.console_info.size = sizeof(cur_info.console_info); if (ioctl(0, CONS_GETINFO, &cur_info.console_info) == -1) err(1, "getting console information"); /* vt(4) use unicode, so no screen mapping required. */ if (vt4_mode == 0 && ioctl(0, GIO_SCRNMAP, &cur_info.screen_map) == -1) err(1, "getting screen map"); if (ioctl(0, CONS_GET, &cur_info.video_mode_number) == -1) err(1, "getting video mode number"); cur_info.video_mode_info.vi_mode = cur_info.video_mode_number; if (ioctl(0, CONS_MODEINFO, &cur_info.video_mode_info) == -1) err(1, "getting video mode parameters"); } /* * If something goes wrong along the way we call revert() to go back to the * console state we came from (which is assumed to be working). * * NOTE: please also read the comments of init(). 
*/ static void revert(void) { int save_errno, size[3]; save_errno = errno; ioctl(0, VT_ACTIVATE, cur_info.active_vty); ioctl(0, KDSBORDER, cur_info.console_info.mv_ovscan); fprintf(stderr, "\033[=%dH", cur_info.console_info.mv_rev.fore); fprintf(stderr, "\033[=%dI", cur_info.console_info.mv_rev.back); if (vt4_mode == 0) ioctl(0, PIO_SCRNMAP, &cur_info.screen_map); if (video_mode_changed) { if (cur_info.video_mode_number >= M_VESA_BASE) ioctl(0, _IO('V', cur_info.video_mode_number - M_VESA_BASE), NULL); else ioctl(0, _IO('S', cur_info.video_mode_number), NULL); if (cur_info.video_mode_info.vi_flags & V_INFO_GRAPHICS) { size[0] = cur_info.video_mode_info.vi_width / 8; size[1] = cur_info.video_mode_info.vi_height / cur_info.console_info.font_size; size[2] = cur_info.console_info.font_size; ioctl(0, KDRASTER, size); } } /* Restore some colors last since mode setting forgets some. */ fprintf(stderr, "\033[=%dF", cur_info.console_info.mv_norm.fore); fprintf(stderr, "\033[=%dG", cur_info.console_info.mv_norm.back); errno = save_errno; } /* * Print a short usage string describing all options, then exit. */ static void usage(void) { if (vt4_mode) fprintf(stderr, "%s\n%s\n%s\n%s\n%s\n%s\n", "usage: vidcontrol [-CHPpx] [-b color] [-c appearance] [-f [[size] file]]", " [-g geometry] [-h size] [-i active | adapter | mode]", " [-M char] [-m on | off]", " [-r foreground background] [-S on | off] [-s number]", " [-T xterm | cons25] [-t N | off] [mode]", " [foreground [background]] [show]"); else fprintf(stderr, "%s\n%s\n%s\n%s\n%s\n%s\n", "usage: vidcontrol [-CdHLPpx] [-b color] [-c appearance] [-f [size] file]", " [-g geometry] [-h size] [-i active | adapter | mode]", " [-l screen_map] [-M char] [-m on | off]", " [-r foreground background] [-S on | off] [-s number]", " [-T xterm | cons25] [-t N | off] [mode]", " [foreground [background]] [show]"); exit(1); } /* Detect presence of vt(4). */ static int is_vt4(void) { char vty_name[4] = ""; size_t len = sizeof(vty_name); if (sysctlbyname("kern.vty", vty_name, &len, NULL, 0) != 0) return (0); return (strcmp(vty_name, "vt") == 0); } /* * Retrieve the next argument from the command line (for options that require * more than one argument). */ static char * nextarg(int ac, char **av, int *indp, int oc, int strict) { if (*indp < ac) return(av[(*indp)++]); if (strict != 0) { revert(); errx(1, "option requires two arguments -- %c", oc); } return(NULL); } /* * Guess which file to open. Try to open each combination of a specified set * of file name components. */ static FILE * openguess(const char *a[], const char *b[], const char *c[], const char *d[], char **name) { FILE *f; int i, j, k, l; for (i = 0; a[i] != NULL; i++) { for (j = 0; b[j] != NULL; j++) { for (k = 0; c[k] != NULL; k++) { for (l = 0; d[l] != NULL; l++) { asprintf(name, "%s%s%s%s", a[i], b[j], c[k], d[l]); f = fopen(*name, "r"); if (f != NULL) return (f); free(*name); } } } } return (NULL); } /* * Load a screenmap from a file and set it. 
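 * The name is tried both as given and under SCRNMAP_PATH, with and
 * without a ".scm" suffix; if decode() cannot parse the file it is
 * re-read as raw binary.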
*/ static void load_scrnmap(const char *filename) { FILE *fd; int size; char *name; scrmap_t scrnmap; const char *a[] = {"", SCRNMAP_PATH, NULL}; const char *b[] = {filename, NULL}; const char *c[] = {"", ".scm", NULL}; const char *d[] = {"", NULL}; fd = openguess(a, b, c, d, &name); if (fd == NULL) { revert(); errx(1, "screenmap file not found"); } size = sizeof(scrnmap); if (decode(fd, (char *)&scrnmap, size) != size) { rewind(fd); if (fread(&scrnmap, 1, size, fd) != (size_t)size) { fclose(fd); revert(); errx(1, "bad screenmap file"); } } if (ioctl(0, PIO_SCRNMAP, &scrnmap) == -1) { revert(); err(1, "loading screenmap"); } fclose(fd); } /* * Set the default screenmap. */ static void load_default_scrnmap(void) { scrmap_t scrnmap; int i; for (i=0; i<256; i++) *((char*)&scrnmap + i) = i; if (ioctl(0, PIO_SCRNMAP, &scrnmap) == -1) { revert(); err(1, "loading default screenmap"); } } /* * Print the current screenmap to stdout. */ static void print_scrnmap(void) { unsigned char map[256]; size_t i; if (ioctl(0, GIO_SCRNMAP, &map) == -1) { revert(); err(1, "getting screenmap"); } for (i=0; ishape[0]; while ((word = strsep(¶m, ",")) != NULL) { if (strcmp(word, "normal") == 0) type = 0; else if (strcmp(word, "destructive") == 0) type = CONS_BLINK_CURSOR | CONS_CHAR_CURSOR; else if (strcmp(word, "blink") == 0) type |= CONS_BLINK_CURSOR; else if (strcmp(word, "noblink") == 0) type &= ~CONS_BLINK_CURSOR; else if (strcmp(word, "block") == 0) type &= ~CONS_CHAR_CURSOR; else if (strcmp(word, "noblock") == 0) type |= CONS_CHAR_CURSOR; else if (strcmp(word, "hidden") == 0) type |= CONS_HIDDEN_CURSOR; else if (strcmp(word, "nohidden") == 0) type &= ~CONS_HIDDEN_CURSOR; else if (strncmp(word, "base=", 5) == 0) shape->shape[1] = strtol(word + 5, NULL, 0); else if (strncmp(word, "height=", 7) == 0) shape->shape[2] = strtol(word + 7, NULL, 0); + else if (strcmp(word, "charcolors") == 0) + type |= CONS_CHARCURSOR_COLORS; + else if (strcmp(word, "mousecolors") == 0) + type |= CONS_MOUSECURSOR_COLORS; + else if (strcmp(word, "default") == 0) + type |= CONS_DEFAULT_CURSOR; + else if (strcmp(word, "shapeonly") == 0) + type |= CONS_SHAPEONLY_CURSOR; else if (strcmp(word, "local") == 0) type |= CONS_LOCAL_CURSOR; else if (strcmp(word, "reset") == 0) type |= CONS_RESET_CURSOR; + else if (strcmp(word, "show") == 0) + printf("flags %#x, base %d, height %d\n", + type, shape->shape[1], shape->shape[2]); else { revert(); errx(1, "invalid parameters for -c starting at '%s%s%s'", word, param != NULL ? "," : "", param != NULL ? param : ""); } } free(dupparam); shape->shape[0] = type; } /* * Set the cursor's shape/type. */ static void set_cursor_type(char *param) { struct cshape shape; - /* Determine if the new setting is local (default to non-local). */ + /* Dry run to determine color, default and local flags. */ shape.shape[0] = 0; + shape.shape[1] = -1; + shape.shape[2] = -1; parse_cursor_params(param, &shape); - /* Get the relevant shape (the local flag is the only input arg). */ + /* Get the relevant old setting. */ if (ioctl(0, CONS_GETCURSORSHAPE, &shape) != 0) { revert(); err(1, "ioctl(CONS_GETCURSORSHAPE)"); } parse_cursor_params(param, &shape); if (ioctl(0, CONS_SETCURSORSHAPE, &shape) != 0) { revert(); err(1, "ioctl(CONS_SETCURSORSHAPE)"); } } /* * Set the video mode. 
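 * The mode argument is either a symbolic name from the table below or
 * a raw mode number written as MODE_<number>; raster (graphics) modes
 * additionally need the KDRASTER setup performed further down.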
*/ static void video_mode(int argc, char **argv, int *mode_index) { static struct { const char *name; unsigned long mode; unsigned long mode_num; } modes[] = { { "80x25", SW_TEXT_80x25, M_TEXT_80x25 }, { "80x30", SW_TEXT_80x30, M_TEXT_80x30 }, { "80x43", SW_TEXT_80x43, M_TEXT_80x43 }, { "80x50", SW_TEXT_80x50, M_TEXT_80x50 }, { "80x60", SW_TEXT_80x60, M_TEXT_80x60 }, { "132x25", SW_TEXT_132x25, M_TEXT_132x25 }, { "132x30", SW_TEXT_132x30, M_TEXT_132x30 }, { "132x43", SW_TEXT_132x43, M_TEXT_132x43 }, { "132x50", SW_TEXT_132x50, M_TEXT_132x50 }, { "132x60", SW_TEXT_132x60, M_TEXT_132x60 }, { "VGA_40x25", SW_VGA_C40x25, M_VGA_C40x25 }, { "VGA_80x25", SW_VGA_C80x25, M_VGA_C80x25 }, { "VGA_80x30", SW_VGA_C80x30, M_VGA_C80x30 }, { "VGA_80x50", SW_VGA_C80x50, M_VGA_C80x50 }, { "VGA_80x60", SW_VGA_C80x60, M_VGA_C80x60 }, #ifdef SW_VGA_C90x25 { "VGA_90x25", SW_VGA_C90x25, M_VGA_C90x25 }, { "VGA_90x30", SW_VGA_C90x30, M_VGA_C90x30 }, { "VGA_90x43", SW_VGA_C90x43, M_VGA_C90x43 }, { "VGA_90x50", SW_VGA_C90x50, M_VGA_C90x50 }, { "VGA_90x60", SW_VGA_C90x60, M_VGA_C90x60 }, #endif { "VGA_320x200", SW_VGA_CG320, M_CG320 }, { "EGA_80x25", SW_ENH_C80x25, M_ENH_C80x25 }, { "EGA_80x43", SW_ENH_C80x43, M_ENH_C80x43 }, { "VESA_132x25", SW_VESA_C132x25,M_VESA_C132x25 }, { "VESA_132x43", SW_VESA_C132x43,M_VESA_C132x43 }, { "VESA_132x50", SW_VESA_C132x50,M_VESA_C132x50 }, { "VESA_132x60", SW_VESA_C132x60,M_VESA_C132x60 }, { "VESA_800x600", SW_VESA_800x600,M_VESA_800x600 }, { NULL, 0, 0 }, }; int new_mode_num = 0; unsigned long mode = 0; int cur_mode; int save_errno; int size[3]; int i; if (ioctl(0, CONS_GET, &cur_mode) < 0) err(1, "cannot get the current video mode"); /* * Parse the video mode argument... */ if (*mode_index < argc) { if (!strncmp(argv[*mode_index], "MODE_", 5)) { if (!isdigit(argv[*mode_index][5])) errx(1, "invalid video mode number"); new_mode_num = atoi(&argv[*mode_index][5]); } else { for (i = 0; modes[i].name != NULL; ++i) { if (!strcmp(argv[*mode_index], modes[i].name)) { mode = modes[i].mode; new_mode_num = modes[i].mode_num; break; } } if (modes[i].name == NULL) return; if (ioctl(0, mode, NULL) < 0) { revert(); err(1, "cannot set videomode"); } video_mode_changed = 1; } /* * Collect enough information about the new video mode... */ new_mode_info.vi_mode = new_mode_num; if (ioctl(0, CONS_MODEINFO, &new_mode_info) == -1) { revert(); err(1, "obtaining new video mode parameters"); } if (mode == 0) { if (new_mode_num >= M_VESA_BASE) mode = _IO('V', new_mode_num - M_VESA_BASE); else mode = _IO('S', new_mode_num); } /* * Try setting the new mode. */ if (ioctl(0, mode, NULL) == -1) { revert(); err(1, "setting video mode"); } video_mode_changed = 1; /* * For raster modes it's not enough to just set the mode. * We also need to explicitly set the raster mode. 
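 * KDRASTER takes the text geometry in character cells (clamped to what
 * the mode can hold) plus the font height, derived from the -g and -f
 * arguments or, failing that, from the current console settings.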
*/ if (new_mode_info.vi_flags & V_INFO_GRAPHICS) { /* font size */ if (font_height == 0) font_height = cur_info.console_info.font_size; size[2] = font_height; /* adjust columns */ if ((vesa_cols * 8 > new_mode_info.vi_width) || (vesa_cols <= 0)) { size[0] = new_mode_info.vi_width / 8; } else { size[0] = vesa_cols; } /* adjust rows */ if ((vesa_rows * font_height > new_mode_info.vi_height) || (vesa_rows <= 0)) { size[1] = new_mode_info.vi_height / font_height; } else { size[1] = vesa_rows; } /* set raster mode */ if (ioctl(0, KDRASTER, size)) { save_errno = errno; if (cur_mode >= M_VESA_BASE) ioctl(0, _IO('V', cur_mode - M_VESA_BASE), NULL); else ioctl(0, _IO('S', cur_mode), NULL); revert(); errno = save_errno; err(1, "cannot activate raster display"); } } /* Recover from mode setting forgetting colors. */ fprintf(stderr, "\033[=%dF", cur_info.console_info.mv_norm.fore); fprintf(stderr, "\033[=%dG", cur_info.console_info.mv_norm.back); (*mode_index)++; } } /* * Return the number for a specified color name. */ static int get_color_number(char *color) { int i; for (i=0; i<16; i++) { if (!strcmp(color, legal_colors[i])) return i; } return -1; } /* * Set normal text and background colors. */ static void set_normal_colors(int argc, char **argv, int *_index) { int color; if (*_index < argc && (color = get_color_number(argv[*_index])) != -1) { (*_index)++; fprintf(stderr, "\033[=%dF", color); if (*_index < argc && (color = get_color_number(argv[*_index])) != -1) { (*_index)++; fprintf(stderr, "\033[=%dG", color); } } } /* * Set reverse text and background colors. */ static void set_reverse_colors(int argc, char **argv, int *_index) { int color; if ((color = get_color_number(argv[*(_index)-1])) != -1) { fprintf(stderr, "\033[=%dH", color); if (*_index < argc && (color = get_color_number(argv[*_index])) != -1) { (*_index)++; fprintf(stderr, "\033[=%dI", color); } } } /* * Switch to virtual terminal #arg. */ static void set_console(char *arg) { int n; if(!arg || strspn(arg,"0123456789") != strlen(arg)) { revert(); errx(1, "bad console number"); } n = atoi(arg); if (n < 1 || n > 16) { revert(); errx(1, "console number out of range"); } else if (ioctl(0, VT_ACTIVATE, n) == -1) { revert(); err(1, "switching vty"); } } /* * Sets the border color. */ static void set_border_color(char *arg) { int color; color = get_color_number(arg); if (color == -1) { revert(); errx(1, "invalid color '%s'", arg); } if (ioctl(0, KDSBORDER, color) != 0) { revert(); err(1, "ioctl(KD_SBORDER)"); } } static void set_mouse_char(char *arg) { struct mouse_info mouse; long l; l = strtol(arg, NULL, 0); if ((l < 0) || (l > UCHAR_MAX - 3)) { revert(); warnx("argument to -M must be 0 through %d", UCHAR_MAX - 3); return; } mouse.operation = MOUSE_MOUSECHAR; mouse.u.mouse_char = (int)l; if (ioctl(0, CONS_MOUSECTL, &mouse) == -1) { revert(); err(1, "setting mouse character"); } } /* * Show/hide the mouse. */ static void set_mouse(char *arg) { struct mouse_info mouse; if (!strcmp(arg, "on")) { mouse.operation = MOUSE_SHOW; } else if (!strcmp(arg, "off")) { mouse.operation = MOUSE_HIDE; } else { revert(); errx(1, "argument to -m must be either on or off"); } if (ioctl(0, CONS_MOUSECTL, &mouse) == -1) { revert(); err(1, "%sing the mouse", mouse.operation == MOUSE_SHOW ? 
"show" : "hid"); } } static void set_lockswitch(char *arg) { int data; if (!strcmp(arg, "off")) { data = 0x01; } else if (!strcmp(arg, "on")) { data = 0x02; } else { revert(); errx(1, "argument to -S must be either on or off"); } if (ioctl(0, VT_LOCKSWITCH, &data) == -1) { revert(); err(1, "turning %s vty switching", data == 0x01 ? "off" : "on"); } } /* * Return the adapter name for a specified type. */ static const char *adapter_name(int type) { static struct { int type; const char *name; } names[] = { { KD_MONO, "MDA" }, { KD_HERCULES, "Hercules" }, { KD_CGA, "CGA" }, { KD_EGA, "EGA" }, { KD_VGA, "VGA" }, { KD_TGA, "TGA" }, { -1, "Unknown" }, }; int i; for (i = 0; names[i].type != -1; ++i) if (names[i].type == type) break; return names[i].name; } /* * Show active VTY, ie current console number. */ static void show_active_info(void) { printf("%d\n", cur_info.active_vty); } /* * Show graphics adapter information. */ static void show_adapter_info(void) { struct video_adapter_info ad; ad.va_index = 0; if (ioctl(0, CONS_ADPINFO, &ad) == -1) { revert(); err(1, "obtaining adapter information"); } printf("fb%d:\n", ad.va_index); printf(" %.*s%d, type:%s%s (%d), flags:0x%x\n", (int)sizeof(ad.va_name), ad.va_name, ad.va_unit, (ad.va_flags & V_ADP_VESA) ? "VESA " : "", adapter_name(ad.va_type), ad.va_type, ad.va_flags); printf(" initial mode:%d, current mode:%d, BIOS mode:%d\n", ad.va_initial_mode, ad.va_mode, ad.va_initial_bios_mode); printf(" frame buffer window:0x%zx, buffer size:0x%zx\n", ad.va_window, ad.va_buffer_size); printf(" window size:0x%zx, origin:0x%x\n", ad.va_window_size, ad.va_window_orig); printf(" display start address (%d, %d), scan line width:%d\n", ad.va_disp_start.x, ad.va_disp_start.y, ad.va_line_width); printf(" reserved:0x%zx\n", ad.va_unused0); } /* * Show video mode information. 
*/ static void show_mode_info(void) { char buf[80]; struct video_info info; int c; int mm; int mode; printf(" mode# flags type size " "font window linear buffer\n"); printf("---------------------------------------" "---------------------------------------\n"); memset(&info, 0, sizeof(info)); for (mode = 0; mode <= M_VESA_MODE_MAX; ++mode) { info.vi_mode = mode; if (ioctl(0, CONS_MODEINFO, &info)) continue; if (info.vi_mode != mode) continue; if (info.vi_width == 0 && info.vi_height == 0 && info.vi_cwidth == 0 && info.vi_cheight == 0) continue; printf("%3d (0x%03x)", mode, mode); printf(" 0x%08x", info.vi_flags); if (info.vi_flags & V_INFO_GRAPHICS) { c = 'G'; if (info.vi_mem_model == V_INFO_MM_PLANAR) snprintf(buf, sizeof(buf), "%dx%dx%d %d", info.vi_width, info.vi_height, info.vi_depth, info.vi_planes); else { switch (info.vi_mem_model) { case V_INFO_MM_PACKED: mm = 'P'; break; case V_INFO_MM_DIRECT: mm = 'D'; break; case V_INFO_MM_CGA: mm = 'C'; break; case V_INFO_MM_HGC: mm = 'H'; break; case V_INFO_MM_VGAX: mm = 'V'; break; default: mm = ' '; break; } snprintf(buf, sizeof(buf), "%dx%dx%d %c", info.vi_width, info.vi_height, info.vi_depth, mm); } } else { c = 'T'; snprintf(buf, sizeof(buf), "%dx%d", info.vi_width, info.vi_height); } printf(" %c %-15s", c, buf); snprintf(buf, sizeof(buf), "%dx%d", info.vi_cwidth, info.vi_cheight); printf(" %-5s", buf); printf(" 0x%05zx %2dk %2dk", info.vi_window, (int)info.vi_window_size/1024, (int)info.vi_window_gran/1024); printf(" 0x%08zx %dk\n", info.vi_buffer, (int)info.vi_buffer_size/1024); } } static void show_info(char *arg) { if (!strcmp(arg, "active")) { show_active_info(); } else if (!strcmp(arg, "adapter")) { show_adapter_info(); } else if (!strcmp(arg, "mode")) { show_mode_info(); } else { revert(); errx(1, "argument to -i must be active, adapter, or mode"); } } static void test_frame(void) { vid_info_t info; const char *bg, *sep; int i, fore; info.size = sizeof(info); if (ioctl(0, CONS_GETINFO, &info) == -1) err(1, "getting console information"); fore = 15; if (info.mv_csz < 80) { bg = "BG"; sep = " "; } else { bg = "BACKGROUND"; sep = " "; } fprintf(stdout, "\033[=0G\n\n"); for (i=0; i<8; i++) { fprintf(stdout, "\033[=%dF\033[=0G%2d \033[=%dF%-7s%s" "\033[=%dF\033[=0G%2d \033[=%dF%-12s%s" "\033[=%dF%2d \033[=%dG%s\033[=0G%s" "\033[=%dF%2d \033[=%dG%s\033[=0G\n", fore, i, i, legal_colors[i], sep, fore, i + 8, i + 8, legal_colors[i + 8], sep, fore, i, i, bg, sep, fore, i + 8, i + 8, bg); } fprintf(stdout, "\033[=%dF\033[=%dG\033[=%dH\033[=%dI\n", info.mv_norm.fore, info.mv_norm.back, info.mv_rev.fore, info.mv_rev.back); } /* * Snapshot the video memory of that terminal, using the CONS_SCRSHOT * ioctl, and writes the results to stdout either in the special * binary format (see manual page for details), or in the plain * text format. 
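 * The raw format begins with the literal "SCRSHOT_", the format
 * revision, the count of remaining header bytes and the screen
 * geometry, followed by the 16-bit character/attribute cells.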
*/ static void dump_screen(int mode, int opt) { scrshot_t shot; vid_info_t info; info.size = sizeof(info); if (ioctl(0, CONS_GETINFO, &info) == -1) { revert(); err(1, "getting console information"); } shot.x = shot.y = 0; shot.xsize = info.mv_csz; shot.ysize = info.mv_rsz; if (opt == DUMP_ALL) shot.ysize += info.mv_hsz; shot.buf = alloca(shot.xsize * shot.ysize * sizeof(u_int16_t)); if (shot.buf == NULL) { revert(); errx(1, "failed to allocate memory for dump"); } if (ioctl(0, CONS_SCRSHOT, &shot) == -1) { revert(); err(1, "dumping screen"); } if (mode == DUMP_FMT_RAW) { printf("SCRSHOT_%c%c%c%c", DUMP_FMT_REV, 2, shot.xsize, shot.ysize); fflush(stdout); write(STDOUT_FILENO, shot.buf, shot.xsize * shot.ysize * sizeof(u_int16_t)); } else { char *line; int x, y; u_int16_t ch; line = alloca(shot.xsize + 1); if (line == NULL) { revert(); errx(1, "failed to allocate memory for line buffer"); } for (y = 0; y < shot.ysize; y++) { for (x = 0; x < shot.xsize; x++) { ch = shot.buf[x + (y * shot.xsize)]; ch &= 0xff; if (isprint(ch) == 0) ch = ' '; line[x] = (char)ch; } /* Trim trailing spaces */ do { line[x--] = '\0'; } while (line[x] == ' ' && x != 0); puts(line); } fflush(stdout); } } /* * Set the console history buffer size. */ static void set_history(char *opt) { int size; size = atoi(opt); if ((*opt == '\0') || size < 0) { revert(); errx(1, "argument must be a positive number"); } if (ioctl(0, CONS_HISTORY, &size) == -1) { revert(); err(1, "setting history buffer size"); } } /* * Clear the console history buffer. */ static void clear_history(void) { if (ioctl(0, CONS_CLRHIST) == -1) { revert(); err(1, "clearing history buffer"); } } static void set_terminal_mode(char *arg) { if (strcmp(arg, "xterm") == 0) fprintf(stderr, "\033[=T"); else if (strcmp(arg, "cons25") == 0) fprintf(stderr, "\033[=1T"); } int main(int argc, char **argv) { char *font, *type, *termmode; const char *opts; int dumpmod, dumpopt, opt; vt4_mode = is_vt4(); init(); dumpmod = 0; dumpopt = DUMP_FBF; termmode = NULL; if (vt4_mode) opts = "b:Cc:fg:h:Hi:M:m:pPr:S:s:T:t:x"; else opts = "b:Cc:dfg:h:Hi:l:LM:m:pPr:S:s:T:t:x"; while ((opt = getopt(argc, argv, opts)) != -1) switch(opt) { case 'b': set_border_color(optarg); break; case 'C': clear_history(); break; case 'c': set_cursor_type(optarg); break; case 'd': if (vt4_mode) break; print_scrnmap(); break; case 'f': optarg = nextarg(argc, argv, &optind, 'f', 0); if (optarg != NULL) { font = nextarg(argc, argv, &optind, 'f', 0); if (font == NULL) { type = NULL; font = optarg; } else type = optarg; load_font(type, font); } else { if (!vt4_mode) usage(); /* Switch syscons to ROM? 
*/ load_default_vt4font(); } break; case 'g': if (sscanf(optarg, "%dx%d", &vesa_cols, &vesa_rows) != 2) { revert(); warnx("incorrect geometry: %s", optarg); usage(); } break; case 'h': set_history(optarg); break; case 'H': dumpopt = DUMP_ALL; break; case 'i': show_info(optarg); break; case 'l': if (vt4_mode) break; load_scrnmap(optarg); break; case 'L': if (vt4_mode) break; load_default_scrnmap(); break; case 'M': set_mouse_char(optarg); break; case 'm': set_mouse(optarg); break; case 'p': dumpmod = DUMP_FMT_RAW; break; case 'P': dumpmod = DUMP_FMT_TXT; break; case 'r': set_reverse_colors(argc, argv, &optind); break; case 'S': set_lockswitch(optarg); break; case 's': set_console(optarg); break; case 'T': if (strcmp(optarg, "xterm") != 0 && strcmp(optarg, "cons25") != 0) usage(); termmode = optarg; break; case 't': set_screensaver_timeout(optarg); break; case 'x': hex = 1; break; default: usage(); } if (dumpmod != 0) dump_screen(dumpmod, dumpopt); video_mode(argc, argv, &optind); set_normal_colors(argc, argv, &optind); if (optind < argc && !strcmp(argv[optind], "show")) { test_frame(); optind++; } if (termmode != NULL) set_terminal_mode(termmode); if ((optind != argc) || (argc == 1)) usage(); return (0); } Index: projects/runtime-coverage =================================================================== --- projects/runtime-coverage (revision 322921) +++ projects/runtime-coverage (revision 322922) Property changes on: projects/runtime-coverage ___________________________________________________________________ Modified: svn:mergeinfo ## -0,0 +0,1 ## Merged /head:r322871-322921