Index: releng/11.1/UPDATING =================================================================== --- releng/11.1/UPDATING (revision 335464) +++ releng/11.1/UPDATING (revision 335465) @@ -1,1772 +1,1779 @@ Updating Information for FreeBSD current users. This file is maintained and copyrighted by M. Warner Losh . See end of file for further details. For commonly done items, please see the COMMON ITEMS: section later in the file. These instructions assume that you basically know what you are doing. If not, then please consult the FreeBSD handbook: https://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/updating-src.html Items affecting the ports and packages system can be found in /usr/ports/UPDATING. Please read that file before running portupgrade. NOTE: FreeBSD has switched from gcc to clang. If you have trouble bootstrapping from older versions of FreeBSD, try WITHOUT_CLANG and WITH_GCC to bootstrap to the tip of head, and then rebuild without this option. The bootstrap process from older version of current across the gcc/clang cutover is a bit fragile. +20180621 p11 FreeBSD-SA-18:07.lazyfpu + FreeBSD-EN-18:07.pmap + + Fix Lazy FPU information disclosure. [SA-18:07.lazyfpu] + + Fix TLB shootdown for Xen based guests. [EN-18:07.pmap] + 20180508 p10 FreeBSD-SA-18:06.debugreg FreeBSD-EN-18:05.mem FreeBSD-EN-18:06.tzdata Fix mishandling of x86 debug exceptions. [SA-18:06.debugreg] Fix multiple small kernel memory disclosures. [EN-18:05.mem] Update timezone database information. [EN-18:06.tzdata] 20180404 p9 FreeBSD-SA-18:04.vt FreeBSD-SA-18:05.ipsec FreeBSD-EN-18:03.tzdata FreeBSD-EN-18:04.mem Fix vt console memory disclosure. [SA-18:04.vt] Fix ipsec crash or denial of service. [SA-18:05.ipsec] Update timezone database information. [EN-18:03.tzdata] Fix multiple small kernel memory disclosures. [EN-18:04.mem] 20180314 p8 FreeBSD-SA-18:03.speculative_execution Add mitigations for two classes of speculative execution vulnerabilities on amd64. 20180307 p7 FreeBSD-SA-18:01.ipsec FreeBSD-SA-18:02.ntp FreeBSD-EN-18:01.tzdata FreeBSD-EN-18:02.file Fix ipsec validation and use-after-free. [SA-18:01.ipsec] Fix multiple vulnerabilities in ntp. [SA-18:02.ntp] Update timezone database information. [EN-18:01.tzdata] Update file(1) to new version with security update. [EN-18:02.file] 20171209 p6 FreeBSD-SA-17:12.openssl Fix multiple vulnerabilities of OpenSSL. 20171129 p5 FreeBSD-SA-17:11.openssl Fix multiple vulnerabilities of OpenSSL. 20171115 p4 FreeBSD-SA-17:08.ptrace FreeBSD-SA-17:10.kldstat Fix ptrace(2) vulnerability. [SA-17:08.ptrace] Fix kldstat(2) vulnerability. [SA-17:10.kldstat] 20171102 p3 FreeBSD-EN-17:09.tzdata Update timezone database information. [EN-17:09] 20171017 p2 FreeBSD-SA-17:07.wpa Fix WPA2 protocol vulnerability. [SA-17:07] 20170810 p1 FreeBSD-SA-17:06.openssh FreeBSD-EN-17:07.vnet FreeBSD-EN-17:08.pf Fix OpenSSH Denial of Service vulnerability. [SA-17:06] Fix VNET kernel panic with asynchronous I/O. [EN-17:07] Fix pf(4) housekeeping thread causes kernel panic. [EN-17:08] 20170725: 11.1-RELEASE. 20170518: arm64 builds now use the base system LLD 4.0.0 linker by default, instead of requiring that the aarch64-binutils port or package be installed. To continue using aarch64-binutils, set CROSS_BINUTILS_PREFIX=/usr/local/aarch64-freebsd/bin . 20170529: The ctl.ko module no longer implements the iSCSI target frontend: cfiscsi.ko does instead. If building cfiscsi.ko as a kernel module, the module can be loaded via one of the following methods: - `cfiscsi_load="YES"` in loader.conf(5). - Add `cfiscsi` to `$kld_list` in rc.conf(5). - ctladm(8)/ctld(8), when compiled with iSCSI support (`WITH_ISCSI=yes` in src.conf(5)) Please see cfiscsi(4) for more details. 20170511: The mmcsd.ko module now additionally depends on geom_flashmap.ko. Also, mmc.ko and mmcsd.ko need to be a matching pair built from the same source (previously, the dependency of mmcsd.ko on mmc.ko was missing, but mmcsd.ko now will refuse to load if it is incompatible with mmc.ko). 20170414: Binds and sends to the loopback addresses, IPv6 and IPv4, will now use any explicitly assigned loopback address available in the jail instead of using the first assigned address of the jail. 20170413: As of r316810 for ipfilter, keep frags is no longer assumed when keep state is specified in a rule. r316810 aligns ipfilter with documentation in man pages separating keep frags from keep state. This allows keep state to specified without forcing keep frags and allows keep frags to be specified independently of keep state. To maintain previous behaviour, also specify keep frags with keep state (as documented in ipf.conf.5). 20170402: Clang, llvm, lldb, compiler-rt and libc++ have been upgraded to 4.0.0. Please see the 20141231 entry below for information about prerequisites and upgrading, if you are not already using clang 3.5.0 or higher. 20170323: The code that provides support for ZFS .zfs/ directory functionality has been reimplemented. It's not possible now to create a snapshot by mkdir under .zfs/snapshot/. That should be the only user visible change. 20170319: Many changes in the IPsec code have been merged from the FreeBSD-CURRENT branch. The IPSEC_FILTERTUNNEL kernel option is removed in favour of corresponding sysctl. The IPSEC_NAT_T kernel option is also removed, and now NAT-T is supported by default. Security associations now use the single namespace for SPI allocation, so if you use several manually configured security associations with the same SPI, this configuration needs modification. 20161217: Clang, llvm, lldb, compiler-rt and libc++ have been upgraded to 3.9.1. Please see the 20141231 entry below for information about prerequisites and upgrading, if you are not already using clang 3.5.0 or higher. 20161124: Clang, llvm, lldb, compiler-rt and libc++ have been upgraded to 3.9.0. Please see the 20141231 entry below for information about prerequisites and upgrading, if you are not already using clang 3.5.0 or higher. 20161119: The layout of the pmap structure has changed for powerpc to put the pmap statistics at the front for all CPU variations. libkvm(3) and all tools that link against it need to be recompiled. 20161030: isl(4) and cyapa(4) drivers now require a new driver, chromebook_platform(4), to work properly on Chromebook-class hardware. On other types of hardware the drivers may need to be configured using device hints. Please see the corresponding manual pages for details. 20161210: Relocatable object files with the extension of .So have been renamed to use an extension of .pico instead. The purpose of this change is to avoid a name clash with shared libraries on case-insensitive file systems. On those file systems, foo.So is the same file as foo.so. 20160622: The libc stub for the pipe(2) system call has been replaced with a wrapper that calls the pipe2(2) system call and the pipe(2) system call is now only implemented by the kernels that include "options COMPAT_FREEBSD10" in their config file (this is the default). Users should ensure that this option is enabled in their kernel or upgrade userspace to r302092 before upgrading their kernel. 20160527: CAM will now strip leading spaces from SCSI disks' serial numbers. This will effect users who create UFS filesystems on SCSI disks using those disk's diskid device nodes. For example, if /etc/fstab previously contained a line like "/dev/diskid/DISK-%20%20%20%20%20%20%20ABCDEFG0123456", you should change it to "/dev/diskid/DISK-ABCDEFG0123456". Users of geom transforms like gmirror may also be affected. ZFS users should generally be fine. 20160523: The bitstring(3) API has been updated with new functionality and improved performance. But it is binary-incompatible with the old API. Objects built with the new headers may not be linked against objects built with the old headers. 20160520: The brk and sbrk functions have been removed from libc on arm64. Binutils from ports has been updated to not link to these functions and should be updated to the latest version before installing a new libc. 20160517: The armv6 port now defaults to hard float ABI. Limited support for running both hardfloat and soft float on the same system is available using the libraries installed with -DWITH_LIBSOFT. This has only been tested as an upgrade path for installworld and packages may fail or need manual intervention to run. New packages will be needed. To update an existing self-hosted armv6hf system, you must add TARGET_ARCH=armv6 on the make command line for both the build and the install steps. 20160510: Kernel modules compiled outside of a kernel build now default to installing to /boot/modules instead of /boot/kernel. Many kernel modules built this way (such as those in ports) already overrode KMODDIR explicitly to install into /boot/modules. However, manually building and installing a module from /sys/modules will now install to /boot/modules instead of /boot/kernel. 20160414: The CAM I/O scheduler has been committed to the kernel. There should be no user visible impact. This does enable NCQ Trim on ada SSDs. While the list of known rogues that claim support for this but actually corrupt data is believed to be complete, be on the lookout for data corruption. The known rogue list is believed to be complete: o Crucial MX100, M550 drives with MU01 firmware. o Micron M510 and M550 drives with MU01 firmware. o Micron M500 prior to MU07 firmware o Samsung 830, 840, and 850 all firmwares o FCCT M500 all firmwares Crucial has firmware http://www.crucial.com/usa/en/support-ssd-firmware with working NCQ TRIM. For Micron branded drives, see your sales rep for updated firmware. Black listed drives will work correctly because these drives work correctly so long as no NCQ TRIMs are sent to them. Given this list is the same as found in Linux, it's believed there are no other rogues in the market place. All other models from the above vendors work. To be safe, if you are at all concerned, you can quirk each of your drives to prevent NCQ from being sent by setting: kern.cam.ada.X.quirks="0x2" in loader.conf. If the drive requires the 4k sector quirk, set the quirks entry to 0x3. 20160330: The FAST_DEPEND build option has been removed and its functionality is now the one true way. The old mkdep(1) style of 'make depend' has been removed. See 20160311 for further details. 20160317: Resource range types have grown from unsigned long to uintmax_t. All drivers, and anything using libdevinfo, need to be recompiled. 20160311: WITH_FAST_DEPEND is now enabled by default for in-tree and out-of-tree builds. It no longer runs mkdep(1) during 'make depend', and the 'make depend' stage can safely be skipped now as it is auto ran when building 'make all' and will generate all SRCS and DPSRCS before building anything else. Dependencies are gathered at compile time with -MF flags kept in separate .depend files per object file. Users should run 'make cleandepend' once if using -DNO_CLEAN to clean out older stale .depend files. 20160306: On amd64, clang 3.8.0 can now insert sections of type AMD64_UNWIND into kernel modules. Therefore, if you load any kernel modules at boot time, please install the boot loaders after you install the kernel, but before rebooting, e.g.: make buildworld make kernel KERNCONF=YOUR_KERNEL_HERE make -C sys/boot install Then follow the usual steps, described in the General Notes section, below. 20160305: Clang, llvm, lldb and compiler-rt have been upgraded to 3.8.0. Please see the 20141231 entry below for information about prerequisites and upgrading, if you are not already using clang 3.5.0 or higher. 20160301: The AIO subsystem is now a standard part of the kernel. The VFS_AIO kernel option and aio.ko kernel module have been removed. Due to stability concerns, asynchronous I/O requests are only permitted on sockets and raw disks by default. To enable asynchronous I/O requests on all file types, set the vfs.aio.enable_unsafe sysctl to a non-zero value. 20160226: The ELF object manipulation tool objcopy is now provided by the ELF Tool Chain project rather than by GNU binutils. It should be a drop-in replacement, with the addition of arm64 support. The (temporary) src.conf knob WITHOUT_ELFCOPY_AS_OBJCOPY knob may be set to obtain the GNU version if necessary. 20160129: Building ZFS pools on top of zvols is prohibited by default. That feature has never worked safely; it's always been prone to deadlocks. Using a zvol as the backing store for a VM guest's virtual disk will still work, even if the guest is using ZFS. Legacy behavior can be restored by setting vfs.zfs.vol.recursive=1. 20160119: The NONE and HPN patches has been removed from OpenSSH. They are still available in the security/openssh-portable port. 20160113: With the addition of ypldap(8), a new _ypldap user is now required during installworld. "mergemaster -p" can be used to add the user prior to installworld, as documented in the handbook. 20151216: The tftp loader (pxeboot) now uses the option root-path directive. As a consequence it no longer looks for a pxeboot.4th file on the tftp server. Instead it uses the regular /boot infrastructure as with the other loaders. 20151211: The code to start recording plug and play data into the modules has been committed. While the old tools will properly build a new kernel, a number of warnings about "unknown metadata record 4" will be produced for an older kldxref. To avoid such warnings, make sure to rebuild the kernel toolchain (or world). Make sure that you have r292078 or later when trying to build 292077 or later before rebuilding. 20151207: Debug data files are now built by default with 'make buildworld' and installed with 'make installworld'. This facilitates debugging but requires more disk space both during the build and for the installed world. Debug files may be disabled by setting WITHOUT_DEBUG_FILES=yes in src.conf(5). 20151130: r291527 changed the internal interface between the nfsd.ko and nfscommon.ko modules. As such, they must both be upgraded to-gether. __FreeBSD_version has been bumped because of this. 20151108: Add support for unicode collation strings leads to a change of order of files listed by ls(1) for example. To get back to the old behaviour, set LC_COLLATE environment variable to "C". Databases administrators will need to reindex their databases given collation results will be different. Due to a bug in install(1) it is recommended to remove the ancient locales before running make installworld. rm -rf /usr/share/locale/* 20151030: The OpenSSL has been upgraded to 1.0.2d. Any binaries requiring libcrypto.so.7 or libssl.so.7 must be recompiled. 20151020: Qlogic 24xx/25xx firmware images were updated from 5.5.0 to 7.3.0. Kernel modules isp_2400_multi and isp_2500_multi were removed and should be replaced with isp_2400 and isp_2500 modules respectively. 20151017: The build previously allowed using 'make -n' to not recurse into sub-directories while showing what commands would be executed, and 'make -n -n' to recursively show commands. Now 'make -n' will recurse and 'make -N' will not. 20151012: If you specify SENDMAIL_MC or SENDMAIL_CF in make.conf, mergemaster and etcupdate will now use this file. A custom sendmail.cf is now updated via this mechanism rather than via installworld. If you had excluded sendmail.cf in mergemaster.rc or etcupdate.conf, you may want to remove the exclusion or change it to "always install". /etc/mail/sendmail.cf is now managed the same way regardless of whether SENDMAIL_MC/SENDMAIL_CF is used. If you are not using SENDMAIL_MC/SENDMAIL_CF there should be no change in behavior. 20151011: Compatibility shims for legacy ATA device names have been removed. It includes ATA_STATIC_ID kernel option, kern.cam.ada.legacy_aliases and kern.geom.raid.legacy_aliases loader tunables, kern.devalias.* environment variables, /dev/ad* and /dev/ar* symbolic links. 20151006: Clang, llvm, lldb, compiler-rt and libc++ have been upgraded to 3.7.0. Please see the 20141231 entry below for information about prerequisites and upgrading, if you are not already using clang 3.5.0 or higher. 20150924: Kernel debug files have been moved to /usr/lib/debug/boot/kernel/, and renamed from .symbols to .debug. This reduces the size requirements on the boot partition or file system and provides consistency with userland debug files. When using the supported kernel installation method the /usr/lib/debug/boot/kernel directory will be renamed (to kernel.old) as is done with /boot/kernel. Developers wishing to maintain the historical behavior of installing debug files in /boot/kernel/ can set KERN_DEBUGDIR="" in src.conf(5). 20150827: The wireless drivers had undergone changes that remove the 'parent interface' from the ifconfig -l output. The rc.d network scripts used to check presence of a parent interface in the list, so old scripts would fail to start wireless networking. Thus, etcupdate(3) or mergemaster(8) run is required after kernel update, to update your rc.d scripts in /etc. 20150827: pf no longer supports 'scrub fragment crop' or 'scrub fragment drop-ovl' These configurations are now automatically interpreted as 'scrub fragment reassemble'. 20150817: Kernel-loadable modules for the random(4) device are back. To use them, the kernel must have device random options RANDOM_LOADABLE kldload(8) can then be used to load random_fortuna.ko or random_yarrow.ko. Please note that due to the indirect function calls that the loadable modules need to provide, the build-in variants will be slightly more efficient. The random(4) kernel option RANDOM_DUMMY has been retired due to unpopularity. It was not all that useful anyway. 20150813: The WITHOUT_ELFTOOLCHAIN_TOOLS src.conf(5) knob has been retired. Control over building the ELF Tool Chain tools is now provided by the WITHOUT_TOOLCHAIN knob. 20150810: The polarity of Pulse Per Second (PPS) capture events with the uart(4) driver has been corrected. Prior to this change the PPS "assert" event corresponded to the trailing edge of a positive PPS pulse and the "clear" event was the leading edge of the next pulse. As the width of a PPS pulse in a typical GPS receiver is on the order of 1 millisecond, most users will not notice any significant difference with this change. Anyone who has compensated for the historical polarity reversal by configuring a negative offset equal to the pulse width will need to remove that workaround. 20150809: The default group assigned to /dev/dri entries has been changed from 'wheel' to 'video' with the id of '44'. If you want to have access to the dri devices please add yourself to the video group with: # pw groupmod video -m $USER 20150806: The menu.rc and loader.rc files will now be replaced during upgrades. Please migrate local changes to menu.rc.local and loader.rc.local instead. 20150805: GNU Binutils versions of addr2line, c++filt, nm, readelf, size, strings and strip have been removed. The src.conf(5) knob WITHOUT_ELFTOOLCHAIN_TOOLS no longer provides the binutils tools. 20150728: As ZFS requires more kernel stack pages than is the default on some architectures e.g. i386, it now warns if KSTACK_PAGES is less than ZFS_MIN_KSTACK_PAGES (which is 4 at the time of writing). Please consider using 'options KSTACK_PAGES=X' where X is greater than or equal to ZFS_MIN_KSTACK_PAGES i.e. 4 in such configurations. 20150706: sendmail has been updated to 8.15.2. Starting with FreeBSD 11.0 and sendmail 8.15, sendmail uses uncompressed IPv6 addresses by default, i.e., they will not contain "::". For example, instead of ::1, it will be 0:0:0:0:0:0:0:1. This permits a zero subnet to have a more specific match, such as different map entries for IPv6:0:0 vs IPv6:0. This change requires that configuration data (including maps, files, classes, custom ruleset, etc.) must use the same format, so make certain such configuration data is upgrading. As a very simple check search for patterns like 'IPv6:[0-9a-fA-F:]*::' and 'IPv6::'. To return to the old behavior, set the m4 option confUSE_COMPRESSED_IPV6_ADDRESSES or the cf option UseCompressedIPv6Addresses. 20150630: The default kernel entropy-processing algorithm is now Fortuna, replacing Yarrow. Assuming you have 'device random' in your kernel config file, the configurations allow a kernel option to override this default. You may choose *ONE* of: options RANDOM_YARROW # Legacy /dev/random algorithm. options RANDOM_DUMMY # Blocking-only driver. If you have neither, you get Fortuna. For most people, read no further, Fortuna will give a /dev/random that works like it always used to, and the difference will be irrelevant. If you remove 'device random', you get *NO* kernel-processed entropy at all. This may be acceptable to folks building embedded systems, but has complications. Carry on reading, and it is assumed you know what you need. *PLEASE* read random(4) and random(9) if you are in the habit of tweaking kernel configs, and/or if you are a member of the embedded community, wanting specific and not-usual behaviour from your security subsystems. NOTE!! If you use RANDOM_DUMMY and/or have no 'device random', you will NOT have a functioning /dev/random, and many cryptographic features will not work, including SSH. You may also find strange behaviour from the random(3) set of library functions, in particular sranddev(3), srandomdev(3) and arc4random(3). The reason for this is that the KERN_ARND sysctl only returns entropy if it thinks it has some to share, and with RANDOM_DUMMY or no 'device random' this will never happen. 20150623: An additional fix for the issue described in the 20150614 sendmail entry below has been been committed in revision 284717. 20150616: FreeBSD's old make (fmake) has been removed from the system. It is available as the devel/fmake port or via pkg install fmake. 20150615: The fix for the issue described in the 20150614 sendmail entry below has been been committed in revision 284436. The work around described in that entry is no longer needed unless the default setting is overridden by a confDH_PARAMETERS configuration setting of '5' or pointing to a 512 bit DH parameter file. 20150614: ALLOW_DEPRECATED_ATF_TOOLS/ATFFILE support has been removed from atf.test.mk (included from bsd.test.mk). Please upgrade devel/atf and devel/kyua to version 0.20+ and adjust any calling code to work with Kyuafile and kyua. 20150614: The import of openssl to address the FreeBSD-SA-15:10.openssl security advisory includes a change which rejects handshakes with DH parameters below 768 bits. sendmail releases prior to 8.15.2 (not yet released), defaulted to a 512 bit DH parameter setting for client connections. To work around this interoperability, sendmail can be configured to use a 2048 bit DH parameter by: 1. Edit /etc/mail/`hostname`.mc 2. If a setting for confDH_PARAMETERS does not exist or exists and is set to a string beginning with '5', replace it with '2'. 3. If a setting for confDH_PARAMETERS exists and is set to a file path, create a new file with: openssl dhparam -out /path/to/file 2048 4. Rebuild the .cf file: cd /etc/mail/; make; make install 5. Restart sendmail: cd /etc/mail/; make restart A sendmail patch is coming, at which time this file will be updated. 20150604: Generation of legacy formatted entries have been disabled by default in pwd_mkdb(8), as all base system consumers of the legacy formatted entries were converted to use the new format by default when the new, machine independent format have been added and supported since FreeBSD 5.x. Please see the pwd_mkdb(8) manual page for further details. 20150525: Clang and llvm have been upgraded to 3.6.1 release. Please see the 20141231 entry below for information about prerequisites and upgrading, if you are not already using 3.5.0 or higher. 20150521: TI platform code switched to using vendor DTS files and this update may break existing systems running on Beaglebone, Beaglebone Black, and Pandaboard: - dtb files should be regenerated/reinstalled. Filenames are the same but content is different now - GPIO addressing was changed, now each GPIO bank (32 pins per bank) has its own /dev/gpiocX device, e.g. pin 121 on /dev/gpioc0 in old addressing scheme is now pin 25 on /dev/gpioc3. - Pandaboard: /etc/ttys should be updated, serial console device is now /dev/ttyu2, not /dev/ttyu0 20150501: soelim(1) from gnu/usr.bin/groff has been replaced by usr.bin/soelim. If you need the GNU extension from groff soelim(1), install groff from package: pkg install groff, or via ports: textproc/groff. 20150423: chmod, chflags, chown and chgrp now affect symlinks in -R mode as defined in symlink(7); previously symlinks were silently ignored. 20150415: The const qualifier has been removed from iconv(3) to comply with POSIX. The ports tree is aware of this from r384038 onwards. 20150416: Libraries specified by LIBADD in Makefiles must have a corresponding DPADD_ variable to ensure correct dependencies. This is now enforced in src.libnames.mk. 20150324: From legacy ata(4) driver was removed support for SATA controllers supported by more functional drivers ahci(4), siis(4) and mvs(4). Kernel modules ataahci and ataadaptec were removed completely, replaced by ahci and mvs modules respectively. 20150315: Clang, llvm and lldb have been upgraded to 3.6.0 release. Please see the 20141231 entry below for information about prerequisites and upgrading, if you are not already using 3.5.0 or higher. 20150307: The 32-bit PowerPC kernel has been changed to a position-independent executable. This can only be booted with a version of loader(8) newer than January 31, 2015, so make sure to update both world and kernel before rebooting. 20150217: If you are running a -CURRENT kernel since r273872 (Oct 30th, 2014), but before r278950, the RNG was not seeded properly. Immediately upgrade the kernel to r278950 or later and regenerate any keys (e.g. ssh keys or openssl keys) that were generated w/ a kernel from that range. This does not affect programs that directly used /dev/random or /dev/urandom. All userland uses of arc4random(3) are affected. 20150210: The autofs(4) ABI was changed in order to restore binary compatibility with 10.1-RELEASE. The automountd(8) daemon needs to be rebuilt to work with the new kernel. 20150131: The powerpc64 kernel has been changed to a position-independent executable. This can only be booted with a new version of loader(8), so make sure to update both world and kernel before rebooting. 20150118: Clang and llvm have been upgraded to 3.5.1 release. This is a bugfix only release, no new features have been added. Please see the 20141231 entry below for information about prerequisites and upgrading, if you are not already using 3.5.0. 20150107: ELF tools addr2line, elfcopy (strip), nm, size, and strings are now taken from the ELF Tool Chain project rather than GNU binutils. They should be drop-in replacements, with the addition of arm64 support. The WITHOUT_ELFTOOLCHAIN_TOOLS= knob may be used to obtain the binutils tools, if necessary. See 20150805 for updated information. 20150105: The default Unbound configuration now enables remote control using a local socket. Users who have already enabled the local_unbound service should regenerate their configuration by running "service local_unbound setup" as root. 20150102: The GNU texinfo and GNU info pages have been removed. To be able to view GNU info pages please install texinfo from ports. 20141231: Clang, llvm and lldb have been upgraded to 3.5.0 release. As of this release, a prerequisite for building clang, llvm and lldb is a C++11 capable compiler and C++11 standard library. This means that to be able to successfully build the cross-tools stage of buildworld, with clang as the bootstrap compiler, your system compiler or cross compiler should either be clang 3.3 or later, or gcc 4.8 or later, and your system C++ library should be libc++, or libdstdc++ from gcc 4.8 or later. On any standard FreeBSD 10.x or 11.x installation, where clang and libc++ are on by default (that is, on x86 or arm), this should work out of the box. On 9.x installations where clang is enabled by default, e.g. on x86 and powerpc, libc++ will not be enabled by default, so libc++ should be built (with clang) and installed first. If both clang and libc++ are missing, build clang first, then use it to build libc++. On 8.x and earlier installations, upgrade to 9.x first, and then follow the instructions for 9.x above. Sparc64 and mips users are unaffected, as they still use gcc 4.2.1 by default, and do not build clang. Many embedded systems are resource constrained, and will not be able to build clang in a reasonable time, or in some cases at all. In those cases, cross building bootable systems on amd64 is a workaround. This new version of clang introduces a number of new warnings, of which the following are most likely to appear: -Wabsolute-value This warns in two cases, for both C and C++: * When the code is trying to take the absolute value of an unsigned quantity, which is effectively a no-op, and almost never what was intended. The code should be fixed, if at all possible. If you are sure that the unsigned quantity can be safely cast to signed, without loss of information or undefined behavior, you can add an explicit cast, or disable the warning. * When the code is trying to take an absolute value, but the called abs() variant is for the wrong type, which can lead to truncation. If you want to disable the warning instead of fixing the code, please make sure that truncation will not occur, or it might lead to unwanted side-effects. -Wtautological-undefined-compare and -Wundefined-bool-conversion These warn when C++ code is trying to compare 'this' against NULL, while 'this' should never be NULL in well-defined C++ code. However, there is some legacy (pre C++11) code out there, which actively abuses this feature, which was less strictly defined in previous C++ versions. Squid and openjdk do this, for example. The warning can be turned off for C++98 and earlier, but compiling the code in C++11 mode might result in unexpected behavior; for example, the parts of the program that are unreachable could be optimized away. 20141222: The old NFS client and server (kernel options NFSCLIENT, NFSSERVER) kernel sources have been removed. The .h files remain, since some utilities include them. This will need to be fixed later. If "mount -t oldnfs ..." is attempted, it will fail. If the "-o" option on mountd(8), nfsd(8) or nfsstat(1) is used, the utilities will report errors. 20141121: The handling of LOCAL_LIB_DIRS has been altered to skip addition of directories to top level SUBDIR variable when their parent directory is included in LOCAL_DIRS. Users with build systems with such hierarchies and without SUBDIR entries in the parent directory Makefiles should add them or add the directories to LOCAL_DIRS. 20141109: faith(4) and faithd(8) have been removed from the base system. Faith has been obsolete for a very long time. 20141104: vt(4), the new console driver, is enabled by default. It brings support for Unicode and double-width characters, as well as support for UEFI and integration with the KMS kernel video drivers. You may need to update your console settings in /etc/rc.conf, most probably the keymap. During boot, /etc/rc.d/syscons will indicate what you need to do. vt(4) still has issues and lacks some features compared to syscons(4). See the wiki for up-to-date information: https://wiki.freebsd.org/Newcons If you want to keep using syscons(4), you can do so by adding the following line to /boot/loader.conf: kern.vty=sc 20141102: pjdfstest has been integrated into kyua as an opt-in test suite. Please see share/doc/pjdfstest/README for more details on how to execute it. 20141009: gperf has been removed from the base system for architectures that use clang. Ports that require gperf will obtain it from the devel/gperf port. 20140923: pjdfstest has been moved from tools/regression/pjdfstest to contrib/pjdfstest . 20140922: At svn r271982, The default linux compat kernel ABI has been adjusted to 2.6.18 in support of the linux-c6 compat ports infrastructure update. If you wish to continue using the linux-f10 compat ports, add compat.linux.osrelease=2.6.16 to your local sysctl.conf. Users are encouraged to update their linux-compat packages to linux-c6 during their next update cycle. 20140729: The ofwfb driver, used to provide a graphics console on PowerPC when using vt(4), no longer allows mmap() of all physical memory. This will prevent Xorg on PowerPC with some ATI graphics cards from initializing properly unless x11-servers/xorg-server is updated to 1.12.4_8 or newer. 20140723: The xdev targets have been converted to using TARGET and TARGET_ARCH instead of XDEV and XDEV_ARCH. 20140719: The default unbound configuration has been modified to address issues with reverse lookups on networks that use private address ranges. If you use the local_unbound service, run "service local_unbound setup" as root to regenerate your configuration, then "service local_unbound reload" to load the new configuration. 20140709: The GNU texinfo and GNU info pages are not built and installed anymore, WITH_INFO knob has been added to allow to built and install them again. UPDATE: see 20150102 entry on texinfo's removal 20140708: The GNU readline library is now an INTERNALLIB - that is, it is statically linked into consumers (GDB and variants) in the base system, and the shared library is no longer installed. The devel/readline port is available for third party software that requires readline. 20140702: The Itanium architecture (ia64) has been removed from the list of known architectures. This is the first step in the removal of the architecture. 20140701: Commit r268115 has added NFSv4.1 server support, merged from projects/nfsv4.1-server. Since this includes changes to the internal interfaces between the NFS related modules, a full build of the kernel and modules will be necessary. __FreeBSD_version has been bumped. 20140629: The WITHOUT_VT_SUPPORT kernel config knob has been renamed WITHOUT_VT. (The other _SUPPORT knobs have a consistent meaning which differs from the behaviour controlled by this knob.) 20140619: Maximal length of the serial number in CTL was increased from 16 to 64 chars, that breaks ABI. All CTL-related tools, such as ctladm and ctld, need to be rebuilt to work with a new kernel. 20140606: The libatf-c and libatf-c++ major versions were downgraded to 0 and 1 respectively to match the upstream numbers. They were out of sync because, when they were originally added to FreeBSD, the upstream versions were not respected. These libraries are private and not yet built by default, so renumbering them should be a non-issue. However, unclean source trees will yield broken test programs once the operator executes "make delete-old-libs" after a "make installworld". Additionally, the atf-sh binary was made private by moving it into /usr/libexec/. Already-built shell test programs will keep the path to the old binary so they will break after "make delete-old" is run. If you are using WITH_TESTS=yes (not the default), wipe the object tree and rebuild from scratch to prevent spurious test failures. This is only needed once: the misnumbered libraries and misplaced binaries have been added to OptionalObsoleteFiles.inc so they will be removed during a clean upgrade. 20140512: Clang and llvm have been upgraded to 3.4.1 release. 20140508: We bogusly installed src.opts.mk in /usr/share/mk. This file should be removed to avoid issues in the future (and has been added to ObsoleteFiles.inc). 20140505: /etc/src.conf now affects only builds of the FreeBSD src tree. In the past, it affected all builds that used the bsd.*.mk files. The old behavior was a bug, but people may have relied upon it. To get this behavior back, you can .include /etc/src.conf from /etc/make.conf (which is still global and isn't changed). This also changes the behavior of incremental builds inside the tree of individual directories. Set MAKESYSPATH to ".../share/mk" to do that. Although this has survived make universe and some upgrade scenarios, other upgrade scenarios may have broken. At least one form of temporary breakage was fixed with MAKESYSPATH settings for buildworld as well... In cases where MAKESYSPATH isn't working with this setting, you'll need to set it to the full path to your tree. One side effect of all this cleaning up is that bsd.compiler.mk is no longer implicitly included by bsd.own.mk. If you wish to use COMPILER_TYPE, you must now explicitly include bsd.compiler.mk as well. 20140430: The lindev device has been removed since /dev/full has been made a standard device. __FreeBSD_version has been bumped. 20140424: The knob WITHOUT_VI was added to the base system, which controls building ex(1), vi(1), etc. Older releases of FreeBSD required ex(1) in order to reorder files share/termcap and didn't build ex(1) as a build tool, so building/installing with WITH_VI is highly advised for build hosts for older releases. This issue has been fixed in stable/9 and stable/10 in r277022 and r276991, respectively. 20140418: The YES_HESIOD knob has been removed. It has been obsolete for a decade. Please move to using WITH_HESIOD instead or your builds will silently lack HESIOD. 20140405: The uart(4) driver has been changed with respect to its handling of the low-level console. Previously the uart(4) driver prevented any process from changing the baudrate or the CLOCAL and HUPCL control flags. By removing the restrictions, operators can make changes to the serial console port without having to reboot. However, when getty(8) is started on the serial device that is associated with the low-level console, a misconfigured terminal line in /etc/ttys will now have a real impact. Before upgrading the kernel, make sure that /etc/ttys has the serial console device configured as 3wire without baudrate to preserve the previous behaviour. E.g: ttyu0 "/usr/libexec/getty 3wire" vt100 on secure 20140306: Support for libwrap (TCP wrappers) in rpcbind was disabled by default to improve performance. To re-enable it, if needed, run rpcbind with command line option -W. 20140226: Switched back to the GPL dtc compiler due to updates in the upstream dts files not being supported by the BSDL dtc compiler. You will need to rebuild your kernel toolchain to pick up the new compiler. Core dumps may result while building dtb files during a kernel build if you fail to do so. Set WITHOUT_GPL_DTC if you require the BSDL compiler. 20140216: Clang and llvm have been upgraded to 3.4 release. 20140216: The nve(4) driver has been removed. Please use the nfe(4) driver for NVIDIA nForce MCP Ethernet adapters instead. 20140212: An ABI incompatibility crept into the libc++ 3.4 import in r261283. This could cause certain C++ applications using shared libraries built against the previous version of libc++ to crash. The incompatibility has now been fixed, but any C++ applications or shared libraries built between r261283 and r261801 should be recompiled. 20140204: OpenSSH will now ignore errors caused by kernel lacking of Capsicum capability mode support. Please note that enabling the feature in kernel is still highly recommended. 20140131: OpenSSH is now built with sandbox support, and will use sandbox as the default privilege separation method. This requires Capsicum capability mode support in kernel. 20140128: The libelf and libdwarf libraries have been updated to newer versions from upstream. Shared library version numbers for these two libraries were bumped. Any ports or binaries requiring these two libraries should be recompiled. __FreeBSD_version is bumped to 1100006. 20140110: If a Makefile in a tests/ directory was auto-generating a Kyuafile instead of providing an explicit one, this would prevent such Makefile from providing its own Kyuafile in the future during NO_CLEAN builds. This has been fixed in the Makefiles but manual intervention is needed to clean an objdir if you use NO_CLEAN: # find /usr/obj -name Kyuafile | xargs rm -f 20131213: The behavior of gss_pseudo_random() for the krb5 mechanism has changed, for applications requesting a longer random string than produced by the underlying enctype's pseudo-random() function. In particular, the random string produced from a session key of enctype aes256-cts-hmac-sha1-96 or aes256-cts-hmac-sha1-96 will be different at the 17th octet and later, after this change. The counter used in the PRF+ construction is now encoded as a big-endian integer in accordance with RFC 4402. __FreeBSD_version is bumped to 1100004. 20131108: The WITHOUT_ATF build knob has been removed and its functionality has been subsumed into the more generic WITHOUT_TESTS. If you were using the former to disable the build of the ATF libraries, you should change your settings to use the latter. 20131025: The default version of mtree is nmtree which is obtained from NetBSD. The output is generally the same, but may vary slightly. If you found you need identical output adding "-F freebsd9" to the command line should do the trick. For the time being, the old mtree is available as fmtree. 20131014: libbsdyml has been renamed to libyaml and moved to /usr/lib/private. This will break ports-mgmt/pkg. Rebuild the port, or upgrade to pkg 1.1.4_8 and verify bsdyml not linked in, before running "make delete-old-libs": # make -C /usr/ports/ports-mgmt/pkg build deinstall install clean or # pkg install pkg; ldd /usr/local/sbin/pkg | grep bsdyml 20131010: The stable/10 branch has been created in subversion from head revision r256279. 20131010: The rc.d/jail script has been updated to support jail(8) configuration file. The "jail__*" rc.conf(5) variables for per-jail configuration are automatically converted to /var/run/jail..conf before the jail(8) utility is invoked. This is transparently backward compatible. See below about some incompatibilities and rc.conf(5) manual page for more details. These variables are now deprecated in favor of jail(8) configuration file. One can use "rc.d/jail config " command to generate a jail(8) configuration file in /var/run/jail..conf without running the jail(8) utility. The default pathname of the configuration file is /etc/jail.conf and can be specified by using $jail_conf or $jail__conf variables. Please note that jail_devfs_ruleset accepts an integer at this moment. Please consider to rewrite the ruleset name with an integer. 20130930: BIND has been removed from the base system. If all you need is a local resolver, simply enable and start the local_unbound service instead. Otherwise, several versions of BIND are available in the ports tree. The dns/bind99 port is one example. With this change, nslookup(1) and dig(1) are no longer in the base system. Users should instead use host(1) and drill(1) which are in the base system. Alternatively, nslookup and dig can be obtained by installing the dns/bind-tools port. 20130916: With the addition of unbound(8), a new unbound user is now required during installworld. "mergemaster -p" can be used to add the user prior to installworld, as documented in the handbook. 20130911: OpenSSH is now built with DNSSEC support, and will by default silently trust signed SSHFP records. This can be controlled with the VerifyHostKeyDNS client configuration setting. DNSSEC support can be disabled entirely with the WITHOUT_LDNS option in src.conf. 20130906: The GNU Compiler Collection and C++ standard library (libstdc++) are no longer built by default on platforms where clang is the system compiler. You can enable them with the WITH_GCC and WITH_GNUCXX options in src.conf. 20130905: The PROCDESC kernel option is now part of the GENERIC kernel configuration and is required for the rwhod(8) to work. If you are using custom kernel configuration, you should include 'options PROCDESC'. 20130905: The API and ABI related to the Capsicum framework was modified in backward incompatible way. The userland libraries and programs have to be recompiled to work with the new kernel. This includes the following libraries and programs, but the whole buildworld is advised: libc, libprocstat, dhclient, tcpdump, hastd, hastctl, kdump, procstat, rwho, rwhod, uniq. 20130903: AES-NI intrinsic support has been added to gcc. The AES-NI module has been updated to use this support. A new gcc is required to build the aesni module on both i386 and amd64. 20130821: The PADLOCK_RNG and RDRAND_RNG kernel options are now devices. Thus "device padlock_rng" and "device rdrand_rng" should be used instead of "options PADLOCK_RNG" & "options RDRAND_RNG". 20130813: WITH_ICONV has been split into two feature sets. WITH_ICONV now enables just the iconv* functionality and is now on by default. WITH_LIBICONV_COMPAT enables the libiconv api and link time compatibility. Set WITHOUT_ICONV to build the old way. If you have been using WITH_ICONV before, you will very likely need to turn on WITH_LIBICONV_COMPAT. 20130806: INVARIANTS option now enables DEBUG for code with OpenSolaris and Illumos origin, including ZFS. If you have INVARIANTS in your kernel configuration, then there is no need to set DEBUG or ZFS_DEBUG explicitly. DEBUG used to enable witness(9) tracking of OpenSolaris (mostly ZFS) locks if WITNESS option was set. Because that generated a lot of witness(9) reports and all of them were believed to be false positives, this is no longer done. New option OPENSOLARIS_WITNESS can be used to achieve the previous behavior. 20130806: Timer values in IPv6 data structures now use time_uptime instead of time_second. Although this is not a user-visible functional change, userland utilities which directly use them---ndp(8), rtadvd(8), and rtsold(8) in the base system---need to be updated to r253970 or later. 20130802: find -delete can now delete the pathnames given as arguments, instead of only files found below them or if the pathname did not contain any slashes. Formerly, the following error message would result: find: -delete: : relative path potentially not safe Deleting the pathnames given as arguments can be prevented without error messages using -mindepth 1 or by changing directory and passing "." as argument to find. This works in the old as well as the new version of find. 20130726: Behavior of devfs rules path matching has been changed. Pattern is now always matched against fully qualified devfs path and slash characters must be explicitly matched by slashes in pattern (FNM_PATHNAME). Rulesets involving devfs subdirectories must be reviewed. 20130716: The default ARM ABI has changed to the ARM EABI. The old ABI is incompatible with the ARM EABI and all programs and modules will need to be rebuilt to work with a new kernel. To keep using the old ABI ensure the WITHOUT_ARM_EABI knob is set. NOTE: Support for the old ABI will be removed in the future and users are advised to upgrade. 20130709: pkg_install has been disconnected from the build if you really need it you should add WITH_PKGTOOLS in your src.conf(5). 20130709: Most of network statistics structures were changed to be able keep 64-bits counters. Thus all tools, that work with networking statistics, must be rebuilt (netstat(1), bsnmpd(1), etc.) 20130618: Fix a bug that allowed a tracing process (e.g. gdb) to write to a memory-mapped file in the traced process's address space even if neither the traced process nor the tracing process had write access to that file. 20130615: CVS has been removed from the base system. An exact copy of the code is available from the devel/cvs port. 20130613: Some people report the following error after the switch to bmake: make: illegal option -- J usage: make [-BPSXeiknpqrstv] [-C directory] [-D variable] ... *** [buildworld] Error code 2 this likely due to an old instance of make in ${MAKEPATH} (${MAKEOBJDIRPREFIX}${.CURDIR}/make.${MACHINE}) which src/Makefile will use that blindly, if it exists, so if you see the above error: rm -rf `make -V MAKEPATH` should resolve it. 20130516: Use bmake by default. Whereas before one could choose to build with bmake via -DWITH_BMAKE one must now use -DWITHOUT_BMAKE to use the old make. The goal is to remove these knobs for 10-RELEASE. It is worth noting that bmake (like gmake) treats the command line as the unit of failure, rather than statements within the command line. Thus '(cd some/where && dosomething)' is safer than 'cd some/where; dosomething'. The '()' allows consistent behavior in parallel build. 20130429: Fix a bug that allows NFS clients to issue READDIR on files. 20130426: The WITHOUT_IDEA option has been removed because the IDEA patent expired. 20130426: The sysctl which controls TRIM support under ZFS has been renamed from vfs.zfs.trim_disable -> vfs.zfs.trim.enabled and has been enabled by default. 20130425: The mergemaster command now uses the default MAKEOBJDIRPREFIX rather than creating it's own in the temporary directory in order allow access to bootstrapped versions of tools such as install and mtree. When upgrading from version of FreeBSD where the install command does not support -l, you will need to install a new mergemaster command if mergemaster -p is required. This can be accomplished with the command (cd src/usr.sbin/mergemaster && make install). 20130404: Legacy ATA stack, disabled and replaced by new CAM-based one since FreeBSD 9.0, completely removed from the sources. Kernel modules atadisk and atapi*, user-level tools atacontrol and burncd are removed. Kernel option `options ATA_CAM` is now permanently enabled and removed. 20130319: SOCK_CLOEXEC and SOCK_NONBLOCK flags have been added to socket(2) and socketpair(2). Software, in particular Kerberos, may automatically detect and use these during building. The resulting binaries will not work on older kernels. 20130308: CTL_DISABLE has also been added to the sparc64 GENERIC (for further information, see the respective 20130304 entry). 20130304: Recent commits to callout(9) changed the size of struct callout, so the KBI is probably heavily disturbed. Also, some functions in callout(9)/sleep(9)/sleepqueue(9)/condvar(9) KPIs were replaced by macros. Every kernel module using it won't load, so rebuild is requested. The ctl device has been re-enabled in GENERIC for i386 and amd64, but does not initialize by default (because of the new CTL_DISABLE option) to save memory. To re-enable it, remove the CTL_DISABLE option from the kernel config file or set kern.cam.ctl.disable=0 in /boot/loader.conf. 20130301: The ctl device has been disabled in GENERIC for i386 and amd64. This was done due to the extra memory being allocated at system initialisation time by the ctl driver which was only used if a CAM target device was created. This makes a FreeBSD system unusable on 128MB or less of RAM. 20130208: A new compression method (lz4) has been merged to -HEAD. Please refer to zpool-features(7) for more information. Please refer to the "ZFS notes" section of this file for information on upgrading boot ZFS pools. 20130129: A BSD-licensed patch(1) variant has been added and is installed as bsdpatch, being the GNU version the default patch. To inverse the logic and use the BSD-licensed one as default, while having the GNU version installed as gnupatch, rebuild and install world with the WITH_BSD_PATCH knob set. 20130121: Due to the use of the new -l option to install(1) during build and install, you must take care not to directly set the INSTALL make variable in your /etc/make.conf, /etc/src.conf, or on the command line. If you wish to use the -C flag for all installs you may be able to add INSTALL+=-C to /etc/make.conf or /etc/src.conf. 20130118: The install(1) option -M has changed meaning and now takes an argument that is a file or path to append logs to. In the unlikely event that -M was the last option on the command line and the command line contained at least two files and a target directory the first file will have logs appended to it. The -M option served little practical purpose in the last decade so its use is expected to be extremely rare. 20121223: After switching to Clang as the default compiler some users of ZFS on i386 systems started to experience stack overflow kernel panics. Please consider using 'options KSTACK_PAGES=4' in such configurations. 20121222: GEOM_LABEL now mangles label names read from file system metadata. Mangling affect labels containing spaces, non-printable characters, '%' or '"'. Device names in /etc/fstab and other places may need to be updated. 20121217: By default, only the 10 most recent kernel dumps will be saved. To restore the previous behaviour (no limit on the number of kernel dumps stored in the dump directory) add the following line to /etc/rc.conf: savecore_flags="" 20121201: With the addition of auditdistd(8), a new auditdistd user is now required during installworld. "mergemaster -p" can be used to add the user prior to installworld, as documented in the handbook. 20121117: The sin6_scope_id member variable in struct sockaddr_in6 is now filled by the kernel before passing the structure to the userland via sysctl or routing socket. This means the KAME-specific embedded scope id in sin6_addr.s6_addr[2] is always cleared in userland application. This behavior can be controlled by net.inet6.ip6.deembed_scopeid. __FreeBSD_version is bumped to 1000025. 20121105: On i386 and amd64 systems WITH_CLANG_IS_CC is now the default. This means that the world and kernel will be compiled with clang and that clang will be installed as /usr/bin/cc, /usr/bin/c++, and /usr/bin/cpp. To disable this behavior and revert to building with gcc, compile with WITHOUT_CLANG_IS_CC. Really old versions of current may need to bootstrap WITHOUT_CLANG first if the clang build fails (its compatibility window doesn't extend to the 9 stable branch point). 20121102: The IPFIREWALL_FORWARD kernel option has been removed. Its functionality now turned on by default. 20121023: The ZERO_COPY_SOCKET kernel option has been removed and split into SOCKET_SEND_COW and SOCKET_RECV_PFLIP. NB: SOCKET_SEND_COW uses the VM page based copy-on-write mechanism which is not safe and may result in kernel crashes. NB: The SOCKET_RECV_PFLIP mechanism is useless as no current driver supports disposeable external page sized mbuf storage. Proper replacements for both zero-copy mechanisms are under consideration and will eventually lead to complete removal of the two kernel options. 20121023: The IPv4 network stack has been converted to network byte order. The following modules need to be recompiled together with kernel: carp(4), divert(4), gif(4), siftr(4), gre(4), pf(4), ipfw(4), ng_ipfw(4), stf(4). 20121022: Support for non-MPSAFE filesystems was removed from VFS. The VFS_VERSION was bumped, all filesystem modules shall be recompiled. 20121018: All the non-MPSAFE filesystems have been disconnected from the build. The full list includes: codafs, hpfs, ntfs, nwfs, portalfs, smbfs, xfs. 20121016: The interface cloning API and ABI has changed. The following modules need to be recompiled together with kernel: ipfw(4), pfsync(4), pflog(4), usb(4), wlan(4), stf(4), vlan(4), disc(4), edsc(4), if_bridge(4), gif(4), tap(4), faith(4), epair(4), enc(4), tun(4), if_lagg(4), gre(4). 20121015: The sdhci driver was split in two parts: sdhci (generic SD Host Controller logic) and sdhci_pci (actual hardware driver). No kernel config modifications are required, but if you load sdhc as a module you must switch to sdhci_pci instead. 20121014: Import the FUSE kernel and userland support into base system. 20121013: The GNU sort(1) program has been removed since the BSD-licensed sort(1) has been the default for quite some time and no serious problems have been reported. The corresponding WITH_GNU_SORT knob has also gone. 20121006: The pfil(9) API/ABI for AF_INET family has been changed. Packet filtering modules: pf(4), ipfw(4), ipfilter(4) need to be recompiled with new kernel. 20121001: The net80211(4) ABI has been changed to allow for improved driver PS-POLL and power-save support. All wireless drivers need to be recompiled to work with the new kernel. 20120913: The random(4) support for the VIA hardware random number generator (`PADLOCK') is no longer enabled unconditionally. Add the padlock_rng device in the custom kernel config if needed. The GENERIC kernels on i386 and amd64 do include the device, so the change only affects the custom kernel configurations. 20120908: The pf(4) packet filter ABI has been changed. pfctl(8) and snmp_pf module need to be recompiled to work with new kernel. 20120828: A new ZFS feature flag "com.delphix:empty_bpobj" has been merged to -HEAD. Pools that have empty_bpobj in active state can not be imported read-write with ZFS implementations that do not support this feature. For more information read the zpool-features(5) manual page. 20120727: The sparc64 ZFS loader has been changed to no longer try to auto- detect ZFS providers based on diskN aliases but now requires these to be explicitly listed in the OFW boot-device environment variable. 20120712: The OpenSSL has been upgraded to 1.0.1c. Any binaries requiring libcrypto.so.6 or libssl.so.6 must be recompiled. Also, there are configuration changes. Make sure to merge /etc/ssl/openssl.cnf. 20120712: The following sysctls and tunables have been renamed for consistency with other variables: kern.cam.da.da_send_ordered -> kern.cam.da.send_ordered kern.cam.ada.ada_send_ordered -> kern.cam.ada.send_ordered 20120628: The sort utility has been replaced with BSD sort. For now, GNU sort is also available as "gnusort" or the default can be set back to GNU sort by setting WITH_GNU_SORT. In this case, BSD sort will be installed as "bsdsort". 20120611: A new version of ZFS (pool version 5000) has been merged to -HEAD. Starting with this version the old system of ZFS pool versioning is superseded by "feature flags". This concept enables forward compatibility against certain future changes in functionality of ZFS pools. The first read-only compatible "feature flag" for ZFS pools is named "com.delphix:async_destroy". For more information read the new zpool-features(5) manual page. Please refer to the "ZFS notes" section of this file for information on upgrading boot ZFS pools. 20120417: The malloc(3) implementation embedded in libc now uses sources imported as contrib/jemalloc. The most disruptive API change is to /etc/malloc.conf. If your system has an old-style /etc/malloc.conf, delete it prior to installworld, and optionally re-create it using the new format after rebooting. See malloc.conf(5) for details (specifically the TUNING section and the "opt.*" entries in the MALLCTL NAMESPACE section). 20120328: Big-endian MIPS TARGET_ARCH values no longer end in "eb". mips64eb is now spelled mips64. mipsn32eb is now spelled mipsn32. mipseb is now spelled mips. This is to aid compatibility with third-party software that expects this naming scheme in uname(3). Little-endian settings are unchanged. If you are updating a big-endian mips64 machine from before this change, you may need to set MACHINE_ARCH=mips64 in your environment before the new build system will recognize your machine. 20120306: Disable by default the option VFS_ALLOW_NONMPSAFE for all supported platforms. 20120229: Now unix domain sockets behave "as expected" on nullfs(5). Previously nullfs(5) did not pass through all behaviours to the underlying layer, as a result if we bound to a socket on the lower layer we could connect only to the lower path; if we bound to the upper layer we could connect only to the upper path. The new behavior is one can connect to both the lower and the upper paths regardless what layer path one binds to. 20120211: The getifaddrs upgrade path broken with 20111215 has been restored. If you have upgraded in between 20111215 and 20120209 you need to recompile libc again with your kernel. You still need to recompile world to be able to configure CARP but this restriction already comes from 20111215. 20120114: The set_rcvar() function has been removed from /etc/rc.subr. All base and ports rc.d scripts have been updated, so if you have a port installed with a script in /usr/local/etc/rc.d you can either hand-edit the rcvar= line, or reinstall the port. An easy way to handle the mass-update of /etc/rc.d: rm /etc/rc.d/* && mergemaster -i 20120109: panic(9) now stops other CPUs in the SMP systems, disables interrupts on the current CPU and prevents other threads from running. This behavior can be reverted using the kern.stop_scheduler_on_panic tunable/sysctl. The new behavior can be incompatible with kern.sync_on_panic. 20111215: The carp(4) facility has been changed significantly. Configuration of the CARP protocol via ifconfig(8) has changed, as well as format of CARP events submitted to devd(8) has changed. See manual pages for more information. The arpbalance feature of carp(4) is currently not supported anymore. Size of struct in_aliasreq, struct in6_aliasreq has changed. User utilities using SIOCAIFADDR, SIOCAIFADDR_IN6, e.g. ifconfig(8), need to be recompiled. 20111122: The acpi_wmi(4) status device /dev/wmistat has been renamed to /dev/wmistat0. 20111108: The option VFS_ALLOW_NONMPSAFE option has been added in order to explicitely support non-MPSAFE filesystems. It is on by default for all supported platform at this present time. 20111101: The broken amd(4) driver has been replaced with esp(4) in the amd64, i386 and pc98 GENERIC kernel configuration files. 20110930: sysinstall has been removed 20110923: The stable/9 branch created in subversion. This corresponds to the RELENG_9 branch in CVS. COMMON ITEMS: General Notes ------------- Avoid using make -j when upgrading. While generally safe, there are sometimes problems using -j to upgrade. If your upgrade fails with -j, please try again without -j. From time to time in the past there have been problems using -j with buildworld and/or installworld. This is especially true when upgrading between "distant" versions (eg one that cross a major release boundary or several minor releases, or when several months have passed on the -current branch). Sometimes, obscure build problems are the result of environment poisoning. This can happen because the make utility reads its environment when searching for values for global variables. To run your build attempts in an "environmental clean room", prefix all make commands with 'env -i '. See the env(1) manual page for more details. When upgrading from one major version to another it is generally best to upgrade to the latest code in the currently installed branch first, then do an upgrade to the new branch. This is the best-tested upgrade path, and has the highest probability of being successful. Please try this approach before reporting problems with a major version upgrade. When upgrading a live system, having a root shell around before installing anything can help undo problems. Not having a root shell around can lead to problems if pam has changed too much from your starting point to allow continued authentication after the upgrade. This file should be read as a log of events. When a later event changes information of a prior event, the prior event should not be deleted. Instead, a pointer to the entry with the new information should be placed in the old entry. Readers of this file should also sanity check older entries before relying on them blindly. Authors of new entries should write them with this in mind. ZFS notes --------- When upgrading the boot ZFS pool to a new version, always follow these two steps: 1.) recompile and reinstall the ZFS boot loader and boot block (this is part of "make buildworld" and "make installworld") 2.) update the ZFS boot block on your boot drive The following example updates the ZFS boot block on the first partition (freebsd-boot) of a GPT partitioned drive ada0: "gpart bootcode -p /boot/gptzfsboot -i 1 ada0" Non-boot pools do not need these updates. To build a kernel ----------------- If you are updating from a prior version of FreeBSD (even one just a few days old), you should follow this procedure. It is the most failsafe as it uses a /usr/obj tree with a fresh mini-buildworld, make kernel-toolchain make -DALWAYS_CHECK_MAKE buildkernel KERNCONF=YOUR_KERNEL_HERE make -DALWAYS_CHECK_MAKE installkernel KERNCONF=YOUR_KERNEL_HERE To test a kernel once --------------------- If you just want to boot a kernel once (because you are not sure if it works, or if you want to boot a known bad kernel to provide debugging information) run make installkernel KERNCONF=YOUR_KERNEL_HERE KODIR=/boot/testkernel nextboot -k testkernel To just build a kernel when you know that it won't mess you up -------------------------------------------------------------- This assumes you are already running a CURRENT system. Replace ${arch} with the architecture of your machine (e.g. "i386", "arm", "amd64", "ia64", "pc98", "sparc64", "powerpc", "mips", etc). cd src/sys/${arch}/conf config KERNEL_NAME_HERE cd ../compile/KERNEL_NAME_HERE make depend make make install If this fails, go to the "To build a kernel" section. To rebuild everything and install it on the current system. ----------------------------------------------------------- # Note: sometimes if you are running current you gotta do more than # is listed here if you are upgrading from a really old current. make buildworld make kernel KERNCONF=YOUR_KERNEL_HERE [1] [3] mergemaster -Fp [5] make installworld mergemaster -Fi [4] make delete-old [6] To cross-install current onto a separate partition -------------------------------------------------- # In this approach we use a separate partition to hold # current's root, 'usr', and 'var' directories. A partition # holding "/", "/usr" and "/var" should be about 2GB in # size. make buildworld make buildkernel KERNCONF=YOUR_KERNEL_HERE make installworld DESTDIR=${CURRENT_ROOT} -DDB_FROM_SRC make distribution DESTDIR=${CURRENT_ROOT} # if newfs'd make installkernel KERNCONF=YOUR_KERNEL_HERE DESTDIR=${CURRENT_ROOT} cp /etc/fstab ${CURRENT_ROOT}/etc/fstab # if newfs'd To upgrade in-place from stable to current ---------------------------------------------- make buildworld [9] make kernel KERNCONF=YOUR_KERNEL_HERE [8] [1] [3] mergemaster -Fp [5] make installworld mergemaster -Fi [4] make delete-old [6] Make sure that you've read the UPDATING file to understand the tweaks to various things you need. At this point in the life cycle of current, things change often and you are on your own to cope. The defaults can also change, so please read ALL of the UPDATING entries. Also, if you are tracking -current, you must be subscribed to freebsd-current@freebsd.org. Make sure that before you update your sources that you have read and understood all the recent messages there. If in doubt, please track -stable which has much fewer pitfalls. [1] If you have third party modules, such as vmware, you should disable them at this point so they don't crash your system on reboot. [3] From the bootblocks, boot -s, and then do fsck -p mount -u / mount -a cd src adjkerntz -i # if CMOS is wall time Also, when doing a major release upgrade, it is required that you boot into single user mode to do the installworld. [4] Note: This step is non-optional. Failure to do this step can result in a significant reduction in the functionality of the system. Attempting to do it by hand is not recommended and those that pursue this avenue should read this file carefully, as well as the archives of freebsd-current and freebsd-hackers mailing lists for potential gotchas. The -U option is also useful to consider. See mergemaster(8) for more information. [5] Usually this step is a noop. However, from time to time you may need to do this if you get unknown user in the following step. It never hurts to do it all the time. You may need to install a new mergemaster (cd src/usr.sbin/mergemaster && make install) after the buildworld before this step if you last updated from current before 20130425 or from -stable before 20130430. [6] This only deletes old files and directories. Old libraries can be deleted by "make delete-old-libs", but you have to make sure that no program is using those libraries anymore. [8] In order to have a kernel that can run the 4.x binaries needed to do an installworld, you must include the COMPAT_FREEBSD4 option in your kernel. Failure to do so may leave you with a system that is hard to boot to recover. A similar kernel option COMPAT_FREEBSD5 is required to run the 5.x binaries on more recent kernels. And so on for COMPAT_FREEBSD6 and COMPAT_FREEBSD7. Make sure that you merge any new devices from GENERIC since the last time you updated your kernel config file. [9] When checking out sources, you must include the -P flag to have cvs prune empty directories. If CPUTYPE is defined in your /etc/make.conf, make sure to use the "?=" instead of the "=" assignment operator, so that buildworld can override the CPUTYPE if it needs to. MAKEOBJDIRPREFIX must be defined in an environment variable, and not on the command line, or in /etc/make.conf. buildworld will warn if it is improperly defined. FORMAT: This file contains a list, in reverse chronological order, of major breakages in tracking -current. It is not guaranteed to be a complete list of such breakages, and only contains entries since September 23, 2011. If you need to see UPDATING entries from before that date, you will need to fetch an UPDATING file from an older FreeBSD release. Copyright information: Copyright 1998-2009 M. Warner Losh. All Rights Reserved. Redistribution, publication, translation and use, with or without modification, in full or in part, in any form or format of this document are permitted without further permission from the author. THIS DOCUMENT IS PROVIDED BY WARNER LOSH ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL WARNER LOSH BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. Contact Warner Losh if you have any questions about your use of this document. $FreeBSD$ Index: releng/11.1/sys/amd64/amd64/cpu_switch.S =================================================================== --- releng/11.1/sys/amd64/amd64/cpu_switch.S (revision 335464) +++ releng/11.1/sys/amd64/amd64/cpu_switch.S (revision 335465) @@ -1,483 +1,480 @@ /*- * Copyright (c) 2003 Peter Wemm. * Copyright (c) 1990 The Regents of the University of California. * All rights reserved. * * This code is derived from software contributed to Berkeley by * William Jolitz. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 4. Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * $FreeBSD$ */ #include #include #include "assym.s" #include "opt_sched.h" /*****************************************************************************/ /* Scheduling */ /*****************************************************************************/ .text #ifdef SMP #define LK lock ; #else #define LK #endif #if defined(SCHED_ULE) && defined(SMP) #define SETLK xchgq #else #define SETLK movq #endif /* * cpu_throw() * * This is the second half of cpu_switch(). It is used when the current * thread is either a dummy or slated to die, and we no longer care * about its state. This is only a slight optimization and is probably * not worth it anymore. Note that we need to clear the pm_active bits so * we do need the old proc if it still exists. * %rdi = oldtd * %rsi = newtd */ ENTRY(cpu_throw) movq %rsi,%r12 movq %rsi,%rdi call pmap_activate_sw jmp sw1 END(cpu_throw) /* * cpu_switch(old, new, mtx) * * Save the current thread state, then select the next thread to run * and load its state. * %rdi = oldtd * %rsi = newtd * %rdx = mtx */ ENTRY(cpu_switch) /* Switch to new thread. First, save context. */ movq TD_PCB(%rdi),%r8 orl $PCB_FULL_IRET,PCB_FLAGS(%r8) movq (%rsp),%rax /* Hardware registers */ movq %r15,PCB_R15(%r8) movq %r14,PCB_R14(%r8) movq %r13,PCB_R13(%r8) movq %r12,PCB_R12(%r8) movq %rbp,PCB_RBP(%r8) movq %rsp,PCB_RSP(%r8) movq %rbx,PCB_RBX(%r8) movq %rax,PCB_RIP(%r8) testl $PCB_DBREGS,PCB_FLAGS(%r8) jnz store_dr /* static predict not taken */ done_store_dr: /* have we used fp, and need a save? */ cmpq %rdi,PCPU(FPCURTHREAD) - jne 3f + jne 2f movq PCB_SAVEFPU(%r8),%r8 clts - cmpl $0,use_xsave + cmpl $0,use_xsave(%rip) jne 1f fxsave (%r8) jmp 2f 1: movq %rdx,%rcx movl xsave_mask,%eax movl xsave_mask+4,%edx .globl ctx_switch_xsave ctx_switch_xsave: /* This is patched to xsaveopt if supported, see fpuinit_bsp1() */ xsave (%r8) movq %rcx,%rdx -2: smsw %ax - orb $CR0_TS,%al - lmsw %ax - xorl %eax,%eax - movq %rax,PCPU(FPCURTHREAD) -3: +2: /* Save is done. Now fire up new thread. Leave old vmspace. */ movq %rsi,%r12 movq %rdi,%r13 movq %rdx,%r15 movq %rsi,%rdi callq pmap_activate_sw SETLK %r15,TD_LOCK(%r13) /* Release the old thread */ sw1: movq TD_PCB(%r12),%r8 #if defined(SCHED_ULE) && defined(SMP) /* Wait for the new thread to become unblocked */ movq $blocked_lock, %rdx 1: movq TD_LOCK(%r12),%rcx cmpq %rcx, %rdx pause je 1b #endif /* * At this point, we've switched address spaces and are ready * to load up the rest of the next context. */ /* Skip loading user fsbase/gsbase for kthreads */ testl $TDP_KTHREAD,TD_PFLAGS(%r12) jnz do_kthread /* * Load ldt register */ movq TD_PROC(%r12),%rcx cmpq $0, P_MD+MD_LDT(%rcx) jne do_ldt xorl %eax,%eax ld_ldt: lldt %ax /* Restore fs base in GDT */ movl PCB_FSBASE(%r8),%eax movq PCPU(FS32P),%rdx movw %ax,2(%rdx) shrl $16,%eax movb %al,4(%rdx) shrl $8,%eax movb %al,7(%rdx) /* Restore gs base in GDT */ movl PCB_GSBASE(%r8),%eax movq PCPU(GS32P),%rdx movw %ax,2(%rdx) shrl $16,%eax movb %al,4(%rdx) shrl $8,%eax movb %al,7(%rdx) do_kthread: /* Do we need to reload tss ? */ movq PCPU(TSSP),%rax movq PCB_TSSP(%r8),%rdx testq %rdx,%rdx cmovzq PCPU(COMMONTSSP),%rdx cmpq %rax,%rdx jne do_tss done_tss: movq %r8,PCPU(RSP0) movq %r8,PCPU(CURPCB) /* Update the COMMON_TSS_RSP0 pointer for the next interrupt */ cmpb $0,pti(%rip) jne 1f movq %r8,COMMON_TSS_RSP0(%rdx) 1: movq %r12,PCPU(CURTHREAD) /* into next thread */ /* Test if debug registers should be restored. */ testl $PCB_DBREGS,PCB_FLAGS(%r8) jnz load_dr /* static predict not taken */ done_load_dr: /* Restore context. */ movq PCB_R15(%r8),%r15 movq PCB_R14(%r8),%r14 movq PCB_R13(%r8),%r13 movq PCB_R12(%r8),%r12 movq PCB_RBP(%r8),%rbp movq PCB_RSP(%r8),%rsp movq PCB_RBX(%r8),%rbx movq PCB_RIP(%r8),%rax movq %rax,(%rsp) + movq PCPU(CURTHREAD),%rdi + call fpu_activate_sw ret /* * We order these strangely for several reasons. * 1: I wanted to use static branch prediction hints * 2: Most athlon64/opteron cpus don't have them. They define * a forward branch as 'predict not taken'. Intel cores have * the 'rep' prefix to invert this. * So, to make it work on both forms of cpu we do the detour. * We use jumps rather than call in order to avoid the stack. */ store_dr: movq %dr7,%rax /* yes, do the save */ movq %dr0,%r15 movq %dr1,%r14 movq %dr2,%r13 movq %dr3,%r12 movq %dr6,%r11 movq %r15,PCB_DR0(%r8) movq %r14,PCB_DR1(%r8) movq %r13,PCB_DR2(%r8) movq %r12,PCB_DR3(%r8) movq %r11,PCB_DR6(%r8) movq %rax,PCB_DR7(%r8) andq $0x0000fc00, %rax /* disable all watchpoints */ movq %rax,%dr7 jmp done_store_dr load_dr: movq %dr7,%rax movq PCB_DR0(%r8),%r15 movq PCB_DR1(%r8),%r14 movq PCB_DR2(%r8),%r13 movq PCB_DR3(%r8),%r12 movq PCB_DR6(%r8),%r11 movq PCB_DR7(%r8),%rcx movq %r15,%dr0 movq %r14,%dr1 /* Preserve reserved bits in %dr7 */ andq $0x0000fc00,%rax andq $~0x0000fc00,%rcx movq %r13,%dr2 movq %r12,%dr3 orq %rcx,%rax movq %r11,%dr6 movq %rax,%dr7 jmp done_load_dr do_tss: movq %rdx,PCPU(TSSP) movq %rdx,%rcx movq PCPU(TSS),%rax movw %cx,2(%rax) shrq $16,%rcx movb %cl,4(%rax) shrq $8,%rcx movb %cl,7(%rax) shrq $8,%rcx movl %ecx,8(%rax) movb $0x89,5(%rax) /* unset busy */ cmpb $0,pti(%rip) je 1f movq PCPU(PRVSPACE),%rax addq $PC_PTI_STACK+PC_PTI_STACK_SZ*8,%rax movq %rax,COMMON_TSS_RSP0(%rdx) 1: movl $TSSSEL,%eax ltr %ax jmp done_tss do_ldt: movq PCPU(LDT),%rax movq P_MD+MD_LDT_SD(%rcx),%rdx movq %rdx,(%rax) movq P_MD+MD_LDT_SD+8(%rcx),%rdx movq %rdx,8(%rax) movl $LDTSEL,%eax jmp ld_ldt END(cpu_switch) /* * savectx(pcb) * Update pcb, saving current processor state. */ ENTRY(savectx) /* Save caller's return address. */ movq (%rsp),%rax movq %rax,PCB_RIP(%rdi) movq %rbx,PCB_RBX(%rdi) movq %rsp,PCB_RSP(%rdi) movq %rbp,PCB_RBP(%rdi) movq %r12,PCB_R12(%rdi) movq %r13,PCB_R13(%rdi) movq %r14,PCB_R14(%rdi) movq %r15,PCB_R15(%rdi) movq %cr0,%rax movq %rax,PCB_CR0(%rdi) movq %cr2,%rax movq %rax,PCB_CR2(%rdi) movq %cr3,%rax movq %rax,PCB_CR3(%rdi) movq %cr4,%rax movq %rax,PCB_CR4(%rdi) movq %dr0,%rax movq %rax,PCB_DR0(%rdi) movq %dr1,%rax movq %rax,PCB_DR1(%rdi) movq %dr2,%rax movq %rax,PCB_DR2(%rdi) movq %dr3,%rax movq %rax,PCB_DR3(%rdi) movq %dr6,%rax movq %rax,PCB_DR6(%rdi) movq %dr7,%rax movq %rax,PCB_DR7(%rdi) movl $MSR_FSBASE,%ecx rdmsr movl %eax,PCB_FSBASE(%rdi) movl %edx,PCB_FSBASE+4(%rdi) movl $MSR_GSBASE,%ecx rdmsr movl %eax,PCB_GSBASE(%rdi) movl %edx,PCB_GSBASE+4(%rdi) movl $MSR_KGSBASE,%ecx rdmsr movl %eax,PCB_KGSBASE(%rdi) movl %edx,PCB_KGSBASE+4(%rdi) movl $MSR_EFER,%ecx rdmsr movl %eax,PCB_EFER(%rdi) movl %edx,PCB_EFER+4(%rdi) movl $MSR_STAR,%ecx rdmsr movl %eax,PCB_STAR(%rdi) movl %edx,PCB_STAR+4(%rdi) movl $MSR_LSTAR,%ecx rdmsr movl %eax,PCB_LSTAR(%rdi) movl %edx,PCB_LSTAR+4(%rdi) movl $MSR_CSTAR,%ecx rdmsr movl %eax,PCB_CSTAR(%rdi) movl %edx,PCB_CSTAR+4(%rdi) movl $MSR_SF_MASK,%ecx rdmsr movl %eax,PCB_SFMASK(%rdi) movl %edx,PCB_SFMASK+4(%rdi) sgdt PCB_GDT(%rdi) sidt PCB_IDT(%rdi) sldt PCB_LDT(%rdi) str PCB_TR(%rdi) movl $1,%eax ret END(savectx) /* * resumectx(pcb) * Resuming processor state from pcb. */ ENTRY(resumectx) /* Switch to KPML4phys. */ movq KPML4phys,%rax movq %rax,%cr3 /* Force kernel segment registers. */ movl $KDSEL,%eax movw %ax,%ds movw %ax,%es movw %ax,%ss movl $KUF32SEL,%eax movw %ax,%fs movl $KUG32SEL,%eax movw %ax,%gs movl $MSR_FSBASE,%ecx movl PCB_FSBASE(%rdi),%eax movl 4 + PCB_FSBASE(%rdi),%edx wrmsr movl $MSR_GSBASE,%ecx movl PCB_GSBASE(%rdi),%eax movl 4 + PCB_GSBASE(%rdi),%edx wrmsr movl $MSR_KGSBASE,%ecx movl PCB_KGSBASE(%rdi),%eax movl 4 + PCB_KGSBASE(%rdi),%edx wrmsr /* Restore EFER one more time. */ movl $MSR_EFER,%ecx movl PCB_EFER(%rdi),%eax wrmsr /* Restore fast syscall stuff. */ movl $MSR_STAR,%ecx movl PCB_STAR(%rdi),%eax movl 4 + PCB_STAR(%rdi),%edx wrmsr movl $MSR_LSTAR,%ecx movl PCB_LSTAR(%rdi),%eax movl 4 + PCB_LSTAR(%rdi),%edx wrmsr movl $MSR_CSTAR,%ecx movl PCB_CSTAR(%rdi),%eax movl 4 + PCB_CSTAR(%rdi),%edx wrmsr movl $MSR_SF_MASK,%ecx movl PCB_SFMASK(%rdi),%eax wrmsr /* Restore CR0, CR2, CR4 and CR3. */ movq PCB_CR0(%rdi),%rax movq %rax,%cr0 movq PCB_CR2(%rdi),%rax movq %rax,%cr2 movq PCB_CR4(%rdi),%rax movq %rax,%cr4 movq PCB_CR3(%rdi),%rax movq %rax,%cr3 /* Restore descriptor tables. */ lidt PCB_IDT(%rdi) lldt PCB_LDT(%rdi) #define SDT_SYSTSS 9 #define SDT_SYSBSY 11 /* Clear "task busy" bit and reload TR. */ movq PCPU(TSS),%rax andb $(~SDT_SYSBSY | SDT_SYSTSS),5(%rax) movw PCB_TR(%rdi),%ax ltr %ax #undef SDT_SYSTSS #undef SDT_SYSBSY /* Restore debug registers. */ movq PCB_DR0(%rdi),%rax movq %rax,%dr0 movq PCB_DR1(%rdi),%rax movq %rax,%dr1 movq PCB_DR2(%rdi),%rax movq %rax,%dr2 movq PCB_DR3(%rdi),%rax movq %rax,%dr3 movq PCB_DR6(%rdi),%rax movq %rax,%dr6 movq PCB_DR7(%rdi),%rax movq %rax,%dr7 /* Restore other callee saved registers. */ movq PCB_R15(%rdi),%r15 movq PCB_R14(%rdi),%r14 movq PCB_R13(%rdi),%r13 movq PCB_R12(%rdi),%r12 movq PCB_RBP(%rdi),%rbp movq PCB_RSP(%rdi),%rsp movq PCB_RBX(%rdi),%rbx /* Restore return address. */ movq PCB_RIP(%rdi),%rax movq %rax,(%rsp) xorl %eax,%eax ret END(resumectx) Index: releng/11.1/sys/amd64/amd64/fpu.c =================================================================== --- releng/11.1/sys/amd64/amd64/fpu.c (revision 335464) +++ releng/11.1/sys/amd64/amd64/fpu.c (revision 335465) @@ -1,1109 +1,1145 @@ /*- * Copyright (c) 1990 William Jolitz. * Copyright (c) 1991 The Regents of the University of California. * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 4. Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * from: @(#)npx.c 7.2 (Berkeley) 5/12/91 */ #include __FBSDID("$FreeBSD$"); #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include /* * Floating point support. */ #if defined(__GNUCLIKE_ASM) && !defined(lint) #define fldcw(cw) __asm __volatile("fldcw %0" : : "m" (cw)) #define fnclex() __asm __volatile("fnclex") #define fninit() __asm __volatile("fninit") #define fnstcw(addr) __asm __volatile("fnstcw %0" : "=m" (*(addr))) #define fnstsw(addr) __asm __volatile("fnstsw %0" : "=am" (*(addr))) #define fxrstor(addr) __asm __volatile("fxrstor %0" : : "m" (*(addr))) #define fxsave(addr) __asm __volatile("fxsave %0" : "=m" (*(addr))) #define ldmxcsr(csr) __asm __volatile("ldmxcsr %0" : : "m" (csr)) #define stmxcsr(addr) __asm __volatile("stmxcsr %0" : : "m" (*(addr))) static __inline void xrstor(char *addr, uint64_t mask) { uint32_t low, hi; low = mask; hi = mask >> 32; __asm __volatile("xrstor %0" : : "m" (*addr), "a" (low), "d" (hi)); } static __inline void xsave(char *addr, uint64_t mask) { uint32_t low, hi; low = mask; hi = mask >> 32; __asm __volatile("xsave %0" : "=m" (*addr) : "a" (low), "d" (hi) : "memory"); } #else /* !(__GNUCLIKE_ASM && !lint) */ void fldcw(u_short cw); void fnclex(void); void fninit(void); void fnstcw(caddr_t addr); void fnstsw(caddr_t addr); void fxsave(caddr_t addr); void fxrstor(caddr_t addr); void ldmxcsr(u_int csr); void stmxcsr(u_int *csr); void xrstor(char *addr, uint64_t mask); void xsave(char *addr, uint64_t mask); #endif /* __GNUCLIKE_ASM && !lint */ #define start_emulating() load_cr0(rcr0() | CR0_TS) #define stop_emulating() clts() CTASSERT(sizeof(struct savefpu) == 512); CTASSERT(sizeof(struct xstate_hdr) == 64); CTASSERT(sizeof(struct savefpu_ymm) == 832); /* * This requirement is to make it easier for asm code to calculate * offset of the fpu save area from the pcb address. FPU save area * must be 64-byte aligned. */ CTASSERT(sizeof(struct pcb) % XSAVE_AREA_ALIGN == 0); /* * Ensure the copy of XCR0 saved in a core is contained in the padding * area. */ CTASSERT(X86_XSTATE_XCR0_OFFSET >= offsetof(struct savefpu, sv_pad) && X86_XSTATE_XCR0_OFFSET + sizeof(uint64_t) <= sizeof(struct savefpu)); static void fpu_clean_state(void); SYSCTL_INT(_hw, HW_FLOATINGPT, floatingpoint, CTLFLAG_RD, SYSCTL_NULL_INT_PTR, 1, "Floating point instructions executed in hardware"); +int lazy_fpu_switch = 0; +SYSCTL_INT(_hw, OID_AUTO, lazy_fpu_switch, CTLFLAG_RWTUN | CTLFLAG_NOFETCH, + &lazy_fpu_switch, 0, + "Lazily load FPU context after context switch"); + int use_xsave; /* non-static for cpu_switch.S */ uint64_t xsave_mask; /* the same */ static uma_zone_t fpu_save_area_zone; static struct savefpu *fpu_initialstate; struct xsave_area_elm_descr { u_int offset; u_int size; } *xsave_area_desc; void fpusave(void *addr) { if (use_xsave) xsave((char *)addr, xsave_mask); else fxsave((char *)addr); } void fpurestore(void *addr) { if (use_xsave) xrstor((char *)addr, xsave_mask); else fxrstor((char *)addr); } void fpususpend(void *addr) { u_long cr0; cr0 = rcr0(); stop_emulating(); fpusave(addr); load_cr0(cr0); } void fpuresume(void *addr) { u_long cr0; cr0 = rcr0(); stop_emulating(); fninit(); if (use_xsave) load_xcr(XCR0, xsave_mask); fpurestore(addr); load_cr0(cr0); } /* * Enable XSAVE if supported and allowed by user. * Calculate the xsave_mask. */ static void fpuinit_bsp1(void) { u_int cp[4]; uint64_t xsave_mask_user; + TUNABLE_INT_FETCH("hw.lazy_fpu_switch", &lazy_fpu_switch); if ((cpu_feature2 & CPUID2_XSAVE) != 0) { use_xsave = 1; TUNABLE_INT_FETCH("hw.use_xsave", &use_xsave); } if (!use_xsave) return; cpuid_count(0xd, 0x0, cp); xsave_mask = XFEATURE_ENABLED_X87 | XFEATURE_ENABLED_SSE; if ((cp[0] & xsave_mask) != xsave_mask) panic("CPU0 does not support X87 or SSE: %x", cp[0]); xsave_mask = ((uint64_t)cp[3] << 32) | cp[0]; xsave_mask_user = xsave_mask; TUNABLE_ULONG_FETCH("hw.xsave_mask", &xsave_mask_user); xsave_mask_user |= XFEATURE_ENABLED_X87 | XFEATURE_ENABLED_SSE; xsave_mask &= xsave_mask_user; if ((xsave_mask & XFEATURE_AVX512) != XFEATURE_AVX512) xsave_mask &= ~XFEATURE_AVX512; if ((xsave_mask & XFEATURE_MPX) != XFEATURE_MPX) xsave_mask &= ~XFEATURE_MPX; cpuid_count(0xd, 0x1, cp); if ((cp[0] & CPUID_EXTSTATE_XSAVEOPT) != 0) { /* * Patch the XSAVE instruction in the cpu_switch code * to XSAVEOPT. We assume that XSAVE encoding used * REX byte, and set the bit 4 of the r/m byte. */ ctx_switch_xsave[3] |= 0x10; } } /* * Calculate the fpu save area size. */ static void fpuinit_bsp2(void) { u_int cp[4]; if (use_xsave) { cpuid_count(0xd, 0x0, cp); cpu_max_ext_state_size = cp[1]; /* * Reload the cpu_feature2, since we enabled OSXSAVE. */ do_cpuid(1, cp); cpu_feature2 = cp[2]; } else cpu_max_ext_state_size = sizeof(struct savefpu); } /* * Initialize the floating point unit. */ void fpuinit(void) { register_t saveintr; u_int mxcsr; u_short control; if (IS_BSP()) fpuinit_bsp1(); if (use_xsave) { load_cr4(rcr4() | CR4_XSAVE); load_xcr(XCR0, xsave_mask); } /* * XCR0 shall be set up before CPU can report the save area size. */ if (IS_BSP()) fpuinit_bsp2(); /* * It is too early for critical_enter() to work on AP. */ saveintr = intr_disable(); stop_emulating(); fninit(); control = __INITIAL_FPUCW__; fldcw(control); mxcsr = __INITIAL_MXCSR__; ldmxcsr(mxcsr); start_emulating(); intr_restore(saveintr); } /* * On the boot CPU we generate a clean state that is used to * initialize the floating point unit when it is first used by a * process. */ static void fpuinitstate(void *arg __unused) { register_t saveintr; int cp[4], i, max_ext_n; fpu_initialstate = malloc(cpu_max_ext_state_size, M_DEVBUF, M_WAITOK | M_ZERO); saveintr = intr_disable(); stop_emulating(); fpusave(fpu_initialstate); if (fpu_initialstate->sv_env.en_mxcsr_mask) cpu_mxcsr_mask = fpu_initialstate->sv_env.en_mxcsr_mask; else cpu_mxcsr_mask = 0xFFBF; /* * The fninit instruction does not modify XMM registers or x87 * registers (MM/ST). The fpusave call dumped the garbage * contained in the registers after reset to the initial state * saved. Clear XMM and x87 registers file image to make the * startup program state and signal handler XMM/x87 register * content predictable. */ bzero(fpu_initialstate->sv_fp, sizeof(fpu_initialstate->sv_fp)); bzero(fpu_initialstate->sv_xmm, sizeof(fpu_initialstate->sv_xmm)); /* * Create a table describing the layout of the CPU Extended * Save Area. */ if (use_xsave) { max_ext_n = flsl(xsave_mask); xsave_area_desc = malloc(max_ext_n * sizeof(struct xsave_area_elm_descr), M_DEVBUF, M_WAITOK | M_ZERO); /* x87 state */ xsave_area_desc[0].offset = 0; xsave_area_desc[0].size = 160; /* XMM */ xsave_area_desc[1].offset = 160; xsave_area_desc[1].size = 288 - 160; for (i = 2; i < max_ext_n; i++) { cpuid_count(0xd, i, cp); xsave_area_desc[i].offset = cp[1]; xsave_area_desc[i].size = cp[0]; } } fpu_save_area_zone = uma_zcreate("FPU_save_area", cpu_max_ext_state_size, NULL, NULL, NULL, NULL, XSAVE_AREA_ALIGN - 1, 0); start_emulating(); intr_restore(saveintr); } SYSINIT(fpuinitstate, SI_SUB_DRIVERS, SI_ORDER_ANY, fpuinitstate, NULL); /* * Free coprocessor (if we have it). */ void fpuexit(struct thread *td) { critical_enter(); if (curthread == PCPU_GET(fpcurthread)) { stop_emulating(); fpusave(curpcb->pcb_save); start_emulating(); PCPU_SET(fpcurthread, NULL); } critical_exit(); } int fpuformat(void) { return (_MC_FPFMT_XMM); } /* * The following mechanism is used to ensure that the FPE_... value * that is passed as a trapcode to the signal handler of the user * process does not have more than one bit set. * * Multiple bits may be set if the user process modifies the control * word while a status word bit is already set. While this is a sign * of bad coding, we have no choise than to narrow them down to one * bit, since we must not send a trapcode that is not exactly one of * the FPE_ macros. * * The mechanism has a static table with 127 entries. Each combination * of the 7 FPU status word exception bits directly translates to a * position in this table, where a single FPE_... value is stored. * This FPE_... value stored there is considered the "most important" * of the exception bits and will be sent as the signal code. The * precedence of the bits is based upon Intel Document "Numerical * Applications", Chapter "Special Computational Situations". * * The macro to choose one of these values does these steps: 1) Throw * away status word bits that cannot be masked. 2) Throw away the bits * currently masked in the control word, assuming the user isn't * interested in them anymore. 3) Reinsert status word bit 7 (stack * fault) if it is set, which cannot be masked but must be presered. * 4) Use the remaining bits to point into the trapcode table. * * The 6 maskable bits in order of their preference, as stated in the * above referenced Intel manual: * 1 Invalid operation (FP_X_INV) * 1a Stack underflow * 1b Stack overflow * 1c Operand of unsupported format * 1d SNaN operand. * 2 QNaN operand (not an exception, irrelavant here) * 3 Any other invalid-operation not mentioned above or zero divide * (FP_X_INV, FP_X_DZ) * 4 Denormal operand (FP_X_DNML) * 5 Numeric over/underflow (FP_X_OFL, FP_X_UFL) * 6 Inexact result (FP_X_IMP) */ static char fpetable[128] = { 0, FPE_FLTINV, /* 1 - INV */ FPE_FLTUND, /* 2 - DNML */ FPE_FLTINV, /* 3 - INV | DNML */ FPE_FLTDIV, /* 4 - DZ */ FPE_FLTINV, /* 5 - INV | DZ */ FPE_FLTDIV, /* 6 - DNML | DZ */ FPE_FLTINV, /* 7 - INV | DNML | DZ */ FPE_FLTOVF, /* 8 - OFL */ FPE_FLTINV, /* 9 - INV | OFL */ FPE_FLTUND, /* A - DNML | OFL */ FPE_FLTINV, /* B - INV | DNML | OFL */ FPE_FLTDIV, /* C - DZ | OFL */ FPE_FLTINV, /* D - INV | DZ | OFL */ FPE_FLTDIV, /* E - DNML | DZ | OFL */ FPE_FLTINV, /* F - INV | DNML | DZ | OFL */ FPE_FLTUND, /* 10 - UFL */ FPE_FLTINV, /* 11 - INV | UFL */ FPE_FLTUND, /* 12 - DNML | UFL */ FPE_FLTINV, /* 13 - INV | DNML | UFL */ FPE_FLTDIV, /* 14 - DZ | UFL */ FPE_FLTINV, /* 15 - INV | DZ | UFL */ FPE_FLTDIV, /* 16 - DNML | DZ | UFL */ FPE_FLTINV, /* 17 - INV | DNML | DZ | UFL */ FPE_FLTOVF, /* 18 - OFL | UFL */ FPE_FLTINV, /* 19 - INV | OFL | UFL */ FPE_FLTUND, /* 1A - DNML | OFL | UFL */ FPE_FLTINV, /* 1B - INV | DNML | OFL | UFL */ FPE_FLTDIV, /* 1C - DZ | OFL | UFL */ FPE_FLTINV, /* 1D - INV | DZ | OFL | UFL */ FPE_FLTDIV, /* 1E - DNML | DZ | OFL | UFL */ FPE_FLTINV, /* 1F - INV | DNML | DZ | OFL | UFL */ FPE_FLTRES, /* 20 - IMP */ FPE_FLTINV, /* 21 - INV | IMP */ FPE_FLTUND, /* 22 - DNML | IMP */ FPE_FLTINV, /* 23 - INV | DNML | IMP */ FPE_FLTDIV, /* 24 - DZ | IMP */ FPE_FLTINV, /* 25 - INV | DZ | IMP */ FPE_FLTDIV, /* 26 - DNML | DZ | IMP */ FPE_FLTINV, /* 27 - INV | DNML | DZ | IMP */ FPE_FLTOVF, /* 28 - OFL | IMP */ FPE_FLTINV, /* 29 - INV | OFL | IMP */ FPE_FLTUND, /* 2A - DNML | OFL | IMP */ FPE_FLTINV, /* 2B - INV | DNML | OFL | IMP */ FPE_FLTDIV, /* 2C - DZ | OFL | IMP */ FPE_FLTINV, /* 2D - INV | DZ | OFL | IMP */ FPE_FLTDIV, /* 2E - DNML | DZ | OFL | IMP */ FPE_FLTINV, /* 2F - INV | DNML | DZ | OFL | IMP */ FPE_FLTUND, /* 30 - UFL | IMP */ FPE_FLTINV, /* 31 - INV | UFL | IMP */ FPE_FLTUND, /* 32 - DNML | UFL | IMP */ FPE_FLTINV, /* 33 - INV | DNML | UFL | IMP */ FPE_FLTDIV, /* 34 - DZ | UFL | IMP */ FPE_FLTINV, /* 35 - INV | DZ | UFL | IMP */ FPE_FLTDIV, /* 36 - DNML | DZ | UFL | IMP */ FPE_FLTINV, /* 37 - INV | DNML | DZ | UFL | IMP */ FPE_FLTOVF, /* 38 - OFL | UFL | IMP */ FPE_FLTINV, /* 39 - INV | OFL | UFL | IMP */ FPE_FLTUND, /* 3A - DNML | OFL | UFL | IMP */ FPE_FLTINV, /* 3B - INV | DNML | OFL | UFL | IMP */ FPE_FLTDIV, /* 3C - DZ | OFL | UFL | IMP */ FPE_FLTINV, /* 3D - INV | DZ | OFL | UFL | IMP */ FPE_FLTDIV, /* 3E - DNML | DZ | OFL | UFL | IMP */ FPE_FLTINV, /* 3F - INV | DNML | DZ | OFL | UFL | IMP */ FPE_FLTSUB, /* 40 - STK */ FPE_FLTSUB, /* 41 - INV | STK */ FPE_FLTUND, /* 42 - DNML | STK */ FPE_FLTSUB, /* 43 - INV | DNML | STK */ FPE_FLTDIV, /* 44 - DZ | STK */ FPE_FLTSUB, /* 45 - INV | DZ | STK */ FPE_FLTDIV, /* 46 - DNML | DZ | STK */ FPE_FLTSUB, /* 47 - INV | DNML | DZ | STK */ FPE_FLTOVF, /* 48 - OFL | STK */ FPE_FLTSUB, /* 49 - INV | OFL | STK */ FPE_FLTUND, /* 4A - DNML | OFL | STK */ FPE_FLTSUB, /* 4B - INV | DNML | OFL | STK */ FPE_FLTDIV, /* 4C - DZ | OFL | STK */ FPE_FLTSUB, /* 4D - INV | DZ | OFL | STK */ FPE_FLTDIV, /* 4E - DNML | DZ | OFL | STK */ FPE_FLTSUB, /* 4F - INV | DNML | DZ | OFL | STK */ FPE_FLTUND, /* 50 - UFL | STK */ FPE_FLTSUB, /* 51 - INV | UFL | STK */ FPE_FLTUND, /* 52 - DNML | UFL | STK */ FPE_FLTSUB, /* 53 - INV | DNML | UFL | STK */ FPE_FLTDIV, /* 54 - DZ | UFL | STK */ FPE_FLTSUB, /* 55 - INV | DZ | UFL | STK */ FPE_FLTDIV, /* 56 - DNML | DZ | UFL | STK */ FPE_FLTSUB, /* 57 - INV | DNML | DZ | UFL | STK */ FPE_FLTOVF, /* 58 - OFL | UFL | STK */ FPE_FLTSUB, /* 59 - INV | OFL | UFL | STK */ FPE_FLTUND, /* 5A - DNML | OFL | UFL | STK */ FPE_FLTSUB, /* 5B - INV | DNML | OFL | UFL | STK */ FPE_FLTDIV, /* 5C - DZ | OFL | UFL | STK */ FPE_FLTSUB, /* 5D - INV | DZ | OFL | UFL | STK */ FPE_FLTDIV, /* 5E - DNML | DZ | OFL | UFL | STK */ FPE_FLTSUB, /* 5F - INV | DNML | DZ | OFL | UFL | STK */ FPE_FLTRES, /* 60 - IMP | STK */ FPE_FLTSUB, /* 61 - INV | IMP | STK */ FPE_FLTUND, /* 62 - DNML | IMP | STK */ FPE_FLTSUB, /* 63 - INV | DNML | IMP | STK */ FPE_FLTDIV, /* 64 - DZ | IMP | STK */ FPE_FLTSUB, /* 65 - INV | DZ | IMP | STK */ FPE_FLTDIV, /* 66 - DNML | DZ | IMP | STK */ FPE_FLTSUB, /* 67 - INV | DNML | DZ | IMP | STK */ FPE_FLTOVF, /* 68 - OFL | IMP | STK */ FPE_FLTSUB, /* 69 - INV | OFL | IMP | STK */ FPE_FLTUND, /* 6A - DNML | OFL | IMP | STK */ FPE_FLTSUB, /* 6B - INV | DNML | OFL | IMP | STK */ FPE_FLTDIV, /* 6C - DZ | OFL | IMP | STK */ FPE_FLTSUB, /* 6D - INV | DZ | OFL | IMP | STK */ FPE_FLTDIV, /* 6E - DNML | DZ | OFL | IMP | STK */ FPE_FLTSUB, /* 6F - INV | DNML | DZ | OFL | IMP | STK */ FPE_FLTUND, /* 70 - UFL | IMP | STK */ FPE_FLTSUB, /* 71 - INV | UFL | IMP | STK */ FPE_FLTUND, /* 72 - DNML | UFL | IMP | STK */ FPE_FLTSUB, /* 73 - INV | DNML | UFL | IMP | STK */ FPE_FLTDIV, /* 74 - DZ | UFL | IMP | STK */ FPE_FLTSUB, /* 75 - INV | DZ | UFL | IMP | STK */ FPE_FLTDIV, /* 76 - DNML | DZ | UFL | IMP | STK */ FPE_FLTSUB, /* 77 - INV | DNML | DZ | UFL | IMP | STK */ FPE_FLTOVF, /* 78 - OFL | UFL | IMP | STK */ FPE_FLTSUB, /* 79 - INV | OFL | UFL | IMP | STK */ FPE_FLTUND, /* 7A - DNML | OFL | UFL | IMP | STK */ FPE_FLTSUB, /* 7B - INV | DNML | OFL | UFL | IMP | STK */ FPE_FLTDIV, /* 7C - DZ | OFL | UFL | IMP | STK */ FPE_FLTSUB, /* 7D - INV | DZ | OFL | UFL | IMP | STK */ FPE_FLTDIV, /* 7E - DNML | DZ | OFL | UFL | IMP | STK */ FPE_FLTSUB, /* 7F - INV | DNML | DZ | OFL | UFL | IMP | STK */ }; /* * Read the FP status and control words, then generate si_code value * for SIGFPE. The error code chosen will be one of the * FPE_... macros. It will be sent as the second argument to old * BSD-style signal handlers and as "siginfo_t->si_code" (second * argument) to SA_SIGINFO signal handlers. * * Some time ago, we cleared the x87 exceptions with FNCLEX there. * Clearing exceptions was necessary mainly to avoid IRQ13 bugs. The * usermode code which understands the FPU hardware enough to enable * the exceptions, can also handle clearing the exception state in the * handler. The only consequence of not clearing the exception is the * rethrow of the SIGFPE on return from the signal handler and * reexecution of the corresponding instruction. * * For XMM traps, the exceptions were never cleared. */ int fputrap_x87(void) { struct savefpu *pcb_save; u_short control, status; critical_enter(); /* * Interrupt handling (for another interrupt) may have pushed the * state to memory. Fetch the relevant parts of the state from * wherever they are. */ if (PCPU_GET(fpcurthread) != curthread) { pcb_save = curpcb->pcb_save; control = pcb_save->sv_env.en_cw; status = pcb_save->sv_env.en_sw; } else { fnstcw(&control); fnstsw(&status); } critical_exit(); return (fpetable[status & ((~control & 0x3f) | 0x40)]); } int fputrap_sse(void) { u_int mxcsr; critical_enter(); if (PCPU_GET(fpcurthread) != curthread) mxcsr = curpcb->pcb_save->sv_env.en_mxcsr; else stmxcsr(&mxcsr); critical_exit(); return (fpetable[(mxcsr & (~mxcsr >> 7)) & 0x3f]); } +static void +restore_fpu_curthread(struct thread *td) +{ + struct pcb *pcb; + + /* + * Record new context early in case frstor causes a trap. + */ + PCPU_SET(fpcurthread, td); + + stop_emulating(); + fpu_clean_state(); + pcb = td->td_pcb; + + if ((pcb->pcb_flags & PCB_FPUINITDONE) == 0) { + /* + * This is the first time this thread has used the FPU or + * the PCB doesn't contain a clean FPU state. Explicitly + * load an initial state. + * + * We prefer to restore the state from the actual save + * area in PCB instead of directly loading from + * fpu_initialstate, to ignite the XSAVEOPT + * tracking engine. + */ + bcopy(fpu_initialstate, pcb->pcb_save, + cpu_max_ext_state_size); + fpurestore(pcb->pcb_save); + if (pcb->pcb_initial_fpucw != __INITIAL_FPUCW__) + fldcw(pcb->pcb_initial_fpucw); + if (PCB_USER_FPU(pcb)) + set_pcb_flags(pcb, PCB_FPUINITDONE | + PCB_USERFPUINITDONE); + else + set_pcb_flags(pcb, PCB_FPUINITDONE); + } else + fpurestore(pcb->pcb_save); +} + /* * Device Not Available (DNA, #NM) exception handler. * * It would be better to switch FP context here (if curthread != * fpcurthread) and not necessarily for every context switch, but it * is too hard to access foreign pcb's. */ void fpudna(void) { + struct thread *td; + td = curthread; /* * This handler is entered with interrupts enabled, so context * switches may occur before critical_enter() is executed. If * a context switch occurs, then when we regain control, our * state will have been completely restored. The CPU may * change underneath us, but the only part of our context that * lives in the CPU is CR0.TS and that will be "restored" by * setting it on the new CPU. */ critical_enter(); KASSERT((curpcb->pcb_flags & PCB_FPUNOSAVE) == 0, ("fpudna while in fpu_kern_enter(FPU_KERN_NOCTX)")); - if (PCPU_GET(fpcurthread) == curthread) { - printf("fpudna: fpcurthread == curthread\n"); + if (__predict_false(PCPU_GET(fpcurthread) == td)) { + /* + * Some virtual machines seems to set %cr0.TS at + * arbitrary moments. Silently clear the TS bit + * regardless of the eager/lazy FPU context switch + * mode. + */ stop_emulating(); - critical_exit(); - return; + } else { + if (__predict_false(PCPU_GET(fpcurthread) != NULL)) { + panic( + "fpudna: fpcurthread = %p (%d), curthread = %p (%d)\n", + PCPU_GET(fpcurthread), + PCPU_GET(fpcurthread)->td_tid, td, td->td_tid); + } + restore_fpu_curthread(td); } - if (PCPU_GET(fpcurthread) != NULL) { - panic("fpudna: fpcurthread = %p (%d), curthread = %p (%d)\n", - PCPU_GET(fpcurthread), PCPU_GET(fpcurthread)->td_tid, - curthread, curthread->td_tid); - } - stop_emulating(); - /* - * Record new context early in case frstor causes a trap. - */ - PCPU_SET(fpcurthread, curthread); + critical_exit(); +} - fpu_clean_state(); +void fpu_activate_sw(struct thread *td); /* Called from the context switch */ +void +fpu_activate_sw(struct thread *td) +{ - if ((curpcb->pcb_flags & PCB_FPUINITDONE) == 0) { - /* - * This is the first time this thread has used the FPU or - * the PCB doesn't contain a clean FPU state. Explicitly - * load an initial state. - * - * We prefer to restore the state from the actual save - * area in PCB instead of directly loading from - * fpu_initialstate, to ignite the XSAVEOPT - * tracking engine. - */ - bcopy(fpu_initialstate, curpcb->pcb_save, - cpu_max_ext_state_size); - fpurestore(curpcb->pcb_save); - if (curpcb->pcb_initial_fpucw != __INITIAL_FPUCW__) - fldcw(curpcb->pcb_initial_fpucw); - if (PCB_USER_FPU(curpcb)) - set_pcb_flags(curpcb, - PCB_FPUINITDONE | PCB_USERFPUINITDONE); - else - set_pcb_flags(curpcb, PCB_FPUINITDONE); - } else - fpurestore(curpcb->pcb_save); - critical_exit(); + if (lazy_fpu_switch || (td->td_pflags & TDP_KTHREAD) != 0 || + !PCB_USER_FPU(td->td_pcb)) { + PCPU_SET(fpcurthread, NULL); + start_emulating(); + } else if (PCPU_GET(fpcurthread) != td) { + restore_fpu_curthread(td); + } } void fpudrop(void) { struct thread *td; td = PCPU_GET(fpcurthread); KASSERT(td == curthread, ("fpudrop: fpcurthread != curthread")); CRITICAL_ASSERT(td); PCPU_SET(fpcurthread, NULL); clear_pcb_flags(td->td_pcb, PCB_FPUINITDONE); start_emulating(); } /* * Get the user state of the FPU into pcb->pcb_user_save without * dropping ownership (if possible). It returns the FPU ownership * status. */ int fpugetregs(struct thread *td) { struct pcb *pcb; uint64_t *xstate_bv, bit; char *sa; int max_ext_n, i, owned; pcb = td->td_pcb; if ((pcb->pcb_flags & PCB_USERFPUINITDONE) == 0) { bcopy(fpu_initialstate, get_pcb_user_save_pcb(pcb), cpu_max_ext_state_size); get_pcb_user_save_pcb(pcb)->sv_env.en_cw = pcb->pcb_initial_fpucw; fpuuserinited(td); return (_MC_FPOWNED_PCB); } critical_enter(); if (td == PCPU_GET(fpcurthread) && PCB_USER_FPU(pcb)) { fpusave(get_pcb_user_save_pcb(pcb)); owned = _MC_FPOWNED_FPU; } else { owned = _MC_FPOWNED_PCB; } critical_exit(); if (use_xsave) { /* * Handle partially saved state. */ sa = (char *)get_pcb_user_save_pcb(pcb); xstate_bv = (uint64_t *)(sa + sizeof(struct savefpu) + offsetof(struct xstate_hdr, xstate_bv)); max_ext_n = flsl(xsave_mask); for (i = 0; i < max_ext_n; i++) { bit = 1ULL << i; if ((xsave_mask & bit) == 0 || (*xstate_bv & bit) != 0) continue; bcopy((char *)fpu_initialstate + xsave_area_desc[i].offset, sa + xsave_area_desc[i].offset, xsave_area_desc[i].size); *xstate_bv |= bit; } } return (owned); } void fpuuserinited(struct thread *td) { struct pcb *pcb; pcb = td->td_pcb; if (PCB_USER_FPU(pcb)) set_pcb_flags(pcb, PCB_FPUINITDONE | PCB_USERFPUINITDONE); else set_pcb_flags(pcb, PCB_FPUINITDONE); } int fpusetxstate(struct thread *td, char *xfpustate, size_t xfpustate_size) { struct xstate_hdr *hdr, *ehdr; size_t len, max_len; uint64_t bv; /* XXXKIB should we clear all extended state in xstate_bv instead ? */ if (xfpustate == NULL) return (0); if (!use_xsave) return (EOPNOTSUPP); len = xfpustate_size; if (len < sizeof(struct xstate_hdr)) return (EINVAL); max_len = cpu_max_ext_state_size - sizeof(struct savefpu); if (len > max_len) return (EINVAL); ehdr = (struct xstate_hdr *)xfpustate; bv = ehdr->xstate_bv; /* * Avoid #gp. */ if (bv & ~xsave_mask) return (EINVAL); hdr = (struct xstate_hdr *)(get_pcb_user_save_td(td) + 1); hdr->xstate_bv = bv; bcopy(xfpustate + sizeof(struct xstate_hdr), (char *)(hdr + 1), len - sizeof(struct xstate_hdr)); return (0); } /* * Set the state of the FPU. */ int fpusetregs(struct thread *td, struct savefpu *addr, char *xfpustate, size_t xfpustate_size) { struct pcb *pcb; int error; pcb = td->td_pcb; critical_enter(); if (td == PCPU_GET(fpcurthread) && PCB_USER_FPU(pcb)) { error = fpusetxstate(td, xfpustate, xfpustate_size); if (error != 0) { critical_exit(); return (error); } bcopy(addr, get_pcb_user_save_td(td), sizeof(*addr)); fpurestore(get_pcb_user_save_td(td)); critical_exit(); set_pcb_flags(pcb, PCB_FPUINITDONE | PCB_USERFPUINITDONE); } else { critical_exit(); error = fpusetxstate(td, xfpustate, xfpustate_size); if (error != 0) return (error); bcopy(addr, get_pcb_user_save_td(td), sizeof(*addr)); fpuuserinited(td); } return (0); } /* * On AuthenticAMD processors, the fxrstor instruction does not restore * the x87's stored last instruction pointer, last data pointer, and last * opcode values, except in the rare case in which the exception summary * (ES) bit in the x87 status word is set to 1. * * In order to avoid leaking this information across processes, we clean * these values by performing a dummy load before executing fxrstor(). */ static void fpu_clean_state(void) { static float dummy_variable = 0.0; u_short status; /* * Clear the ES bit in the x87 status word if it is currently * set, in order to avoid causing a fault in the upcoming load. */ fnstsw(&status); if (status & 0x80) fnclex(); /* * Load the dummy variable into the x87 stack. This mangles * the x87 stack, but we don't care since we're about to call * fxrstor() anyway. */ __asm __volatile("ffree %%st(7); flds %0" : : "m" (dummy_variable)); } /* * This really sucks. We want the acpi version only, but it requires * the isa_if.h file in order to get the definitions. */ #include "opt_isa.h" #ifdef DEV_ISA #include /* * This sucks up the legacy ISA support assignments from PNPBIOS/ACPI. */ static struct isa_pnp_id fpupnp_ids[] = { { 0x040cd041, "Legacy ISA coprocessor support" }, /* PNP0C04 */ { 0 } }; static int fpupnp_probe(device_t dev) { int result; result = ISA_PNP_PROBE(device_get_parent(dev), dev, fpupnp_ids); if (result <= 0) device_quiet(dev); return (result); } static int fpupnp_attach(device_t dev) { return (0); } static device_method_t fpupnp_methods[] = { /* Device interface */ DEVMETHOD(device_probe, fpupnp_probe), DEVMETHOD(device_attach, fpupnp_attach), DEVMETHOD(device_detach, bus_generic_detach), DEVMETHOD(device_shutdown, bus_generic_shutdown), DEVMETHOD(device_suspend, bus_generic_suspend), DEVMETHOD(device_resume, bus_generic_resume), { 0, 0 } }; static driver_t fpupnp_driver = { "fpupnp", fpupnp_methods, 1, /* no softc */ }; static devclass_t fpupnp_devclass; DRIVER_MODULE(fpupnp, acpi, fpupnp_driver, fpupnp_devclass, 0, 0); #endif /* DEV_ISA */ static MALLOC_DEFINE(M_FPUKERN_CTX, "fpukern_ctx", "Kernel contexts for FPU state"); #define FPU_KERN_CTX_FPUINITDONE 0x01 #define FPU_KERN_CTX_DUMMY 0x02 /* avoided save for the kern thread */ #define FPU_KERN_CTX_INUSE 0x04 struct fpu_kern_ctx { struct savefpu *prev; uint32_t flags; char hwstate1[]; }; struct fpu_kern_ctx * fpu_kern_alloc_ctx(u_int flags) { struct fpu_kern_ctx *res; size_t sz; sz = sizeof(struct fpu_kern_ctx) + XSAVE_AREA_ALIGN + cpu_max_ext_state_size; res = malloc(sz, M_FPUKERN_CTX, ((flags & FPU_KERN_NOWAIT) ? M_NOWAIT : M_WAITOK) | M_ZERO); return (res); } void fpu_kern_free_ctx(struct fpu_kern_ctx *ctx) { KASSERT((ctx->flags & FPU_KERN_CTX_INUSE) == 0, ("free'ing inuse ctx")); /* XXXKIB clear the memory ? */ free(ctx, M_FPUKERN_CTX); } static struct savefpu * fpu_kern_ctx_savefpu(struct fpu_kern_ctx *ctx) { vm_offset_t p; p = (vm_offset_t)&ctx->hwstate1; p = roundup2(p, XSAVE_AREA_ALIGN); return ((struct savefpu *)p); } int fpu_kern_enter(struct thread *td, struct fpu_kern_ctx *ctx, u_int flags) { struct pcb *pcb; pcb = td->td_pcb; KASSERT((flags & FPU_KERN_NOCTX) != 0 || ctx != NULL, ("ctx is required when !FPU_KERN_NOCTX")); KASSERT(ctx == NULL || (ctx->flags & FPU_KERN_CTX_INUSE) == 0, ("using inuse ctx")); KASSERT((pcb->pcb_flags & PCB_FPUNOSAVE) == 0, ("recursive fpu_kern_enter while in PCB_FPUNOSAVE state")); if ((flags & FPU_KERN_NOCTX) != 0) { critical_enter(); stop_emulating(); if (curthread == PCPU_GET(fpcurthread)) { fpusave(curpcb->pcb_save); PCPU_SET(fpcurthread, NULL); } else { KASSERT(PCPU_GET(fpcurthread) == NULL, ("invalid fpcurthread")); } /* * This breaks XSAVEOPT tracker, but * PCB_FPUNOSAVE state is supposed to never need to * save FPU context at all. */ fpurestore(fpu_initialstate); set_pcb_flags(pcb, PCB_KERNFPU | PCB_FPUNOSAVE | PCB_FPUINITDONE); return (0); } if ((flags & FPU_KERN_KTHR) != 0 && is_fpu_kern_thread(0)) { ctx->flags = FPU_KERN_CTX_DUMMY | FPU_KERN_CTX_INUSE; return (0); } KASSERT(!PCB_USER_FPU(pcb) || pcb->pcb_save == get_pcb_user_save_pcb(pcb), ("mangled pcb_save")); ctx->flags = FPU_KERN_CTX_INUSE; if ((pcb->pcb_flags & PCB_FPUINITDONE) != 0) ctx->flags |= FPU_KERN_CTX_FPUINITDONE; fpuexit(td); ctx->prev = pcb->pcb_save; pcb->pcb_save = fpu_kern_ctx_savefpu(ctx); set_pcb_flags(pcb, PCB_KERNFPU); clear_pcb_flags(pcb, PCB_FPUINITDONE); return (0); } int fpu_kern_leave(struct thread *td, struct fpu_kern_ctx *ctx) { struct pcb *pcb; pcb = td->td_pcb; if ((pcb->pcb_flags & PCB_FPUNOSAVE) != 0) { KASSERT(ctx == NULL, ("non-null ctx after FPU_KERN_NOCTX")); KASSERT(PCPU_GET(fpcurthread) == NULL, ("non-NULL fpcurthread for PCB_FPUNOSAVE")); CRITICAL_ASSERT(td); clear_pcb_flags(pcb, PCB_FPUNOSAVE | PCB_FPUINITDONE); start_emulating(); critical_exit(); } else { KASSERT((ctx->flags & FPU_KERN_CTX_INUSE) != 0, ("leaving not inuse ctx")); ctx->flags &= ~FPU_KERN_CTX_INUSE; if (is_fpu_kern_thread(0) && (ctx->flags & FPU_KERN_CTX_DUMMY) != 0) return (0); KASSERT((ctx->flags & FPU_KERN_CTX_DUMMY) == 0, ("dummy ctx")); critical_enter(); if (curthread == PCPU_GET(fpcurthread)) fpudrop(); critical_exit(); pcb->pcb_save = ctx->prev; } if (pcb->pcb_save == get_pcb_user_save_pcb(pcb)) { if ((pcb->pcb_flags & PCB_USERFPUINITDONE) != 0) { set_pcb_flags(pcb, PCB_FPUINITDONE); clear_pcb_flags(pcb, PCB_KERNFPU); } else clear_pcb_flags(pcb, PCB_FPUINITDONE | PCB_KERNFPU); } else { if ((ctx->flags & FPU_KERN_CTX_FPUINITDONE) != 0) set_pcb_flags(pcb, PCB_FPUINITDONE); else clear_pcb_flags(pcb, PCB_FPUINITDONE); KASSERT(!PCB_USER_FPU(pcb), ("unpaired fpu_kern_leave")); } return (0); } int fpu_kern_thread(u_int flags) { KASSERT((curthread->td_pflags & TDP_KTHREAD) != 0, ("Only kthread may use fpu_kern_thread")); KASSERT(curpcb->pcb_save == get_pcb_user_save_pcb(curpcb), ("mangled pcb_save")); KASSERT(PCB_USER_FPU(curpcb), ("recursive call")); set_pcb_flags(curpcb, PCB_KERNFPU); return (0); } int is_fpu_kern_thread(u_int flags) { if ((curthread->td_pflags & TDP_KTHREAD) == 0) return (0); return ((curpcb->pcb_flags & PCB_KERNFPU) != 0); } /* * FPU save area alloc/free/init utility routines */ struct savefpu * fpu_save_area_alloc(void) { return (uma_zalloc(fpu_save_area_zone, 0)); } void fpu_save_area_free(struct savefpu *fsa) { uma_zfree(fpu_save_area_zone, fsa); } void fpu_save_area_reset(struct savefpu *fsa) { bcopy(fpu_initialstate, fsa, cpu_max_ext_state_size); } Index: releng/11.1/sys/conf/newvers.sh =================================================================== --- releng/11.1/sys/conf/newvers.sh (revision 335464) +++ releng/11.1/sys/conf/newvers.sh (revision 335465) @@ -1,291 +1,291 @@ #!/bin/sh - # # Copyright (c) 1984, 1986, 1990, 1993 # The Regents of the University of California. All rights reserved. # # Redistribution and use in source and binary forms, with or without # modification, are permitted provided that the following conditions # are met: # 1. Redistributions of source code must retain the above copyright # notice, this list of conditions and the following disclaimer. # 2. Redistributions in binary form must reproduce the above copyright # notice, this list of conditions and the following disclaimer in the # documentation and/or other materials provided with the distribution. # 3. Neither the name of the University nor the names of its contributors # may be used to endorse or promote products derived from this software # without specific prior written permission. # # THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND # ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE # IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE # ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE # FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL # DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS # OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) # HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT # LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY # OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF # SUCH DAMAGE. # # @(#)newvers.sh 8.1 (Berkeley) 4/20/94 # $FreeBSD$ # Command line options: # # -r Reproducible build. Do not embed directory names, user # names, time stamps or other dynamic information into # the output file. This is intended to allow two builds # done at different times and even by different people on # different hosts to produce identical output. # # -R Reproducible build if the tree represents an unmodified # checkout from a version control system. Metadata is # included if the tree is modified. TYPE="FreeBSD" REVISION="11.1" -BRANCH="RELEASE-p10" +BRANCH="RELEASE-p11" if [ -n "${BRANCH_OVERRIDE}" ]; then BRANCH=${BRANCH_OVERRIDE} fi RELEASE="${REVISION}-${BRANCH}" VERSION="${TYPE} ${RELEASE}" if [ -z "${SYSDIR}" ]; then SYSDIR=$(dirname $0)/.. fi if [ -n "${PARAMFILE}" ]; then RELDATE=$(awk '/__FreeBSD_version.*propagated to newvers/ {print $3}' \ ${PARAMFILE}) else RELDATE=$(awk '/__FreeBSD_version.*propagated to newvers/ {print $3}' \ ${SYSDIR}/sys/param.h) fi b=share/examples/etc/bsd-style-copyright if [ -r "${SYSDIR}/../COPYRIGHT" ]; then year=$(sed -Ee '/^Copyright .* The FreeBSD Project/!d;s/^.*1992-([0-9]*) .*$/\1/g' ${SYSDIR}/../COPYRIGHT) else year=$(date +%Y) fi # look for copyright template for bsd_copyright in ../$b ../../$b ../../../$b /usr/src/$b /usr/$b do if [ -r "$bsd_copyright" ]; then COPYRIGHT=`sed \ -e "s/\[year\]/1992-$year/" \ -e 's/\[your name here\]\.* /The FreeBSD Project./' \ -e 's/\[your name\]\.*/The FreeBSD Project./' \ -e '/\[id for your version control system, if any\]/d' \ $bsd_copyright` break fi done # no copyright found, use a dummy if [ -z "$COPYRIGHT" ]; then COPYRIGHT="/*- * Copyright (c) 1992-$year The FreeBSD Project. * All rights reserved. * */" fi # add newline COPYRIGHT="$COPYRIGHT " # VARS_ONLY means no files should be generated, this is just being # included. if [ -n "$VARS_ONLY" ]; then return 0 fi LC_ALL=C; export LC_ALL if [ ! -r version ] then echo 0 > version fi touch version v=`cat version` u=${USER:-root} d=`pwd` h=${HOSTNAME:-`hostname`} if [ -n "$SOURCE_DATE_EPOCH" ]; then if ! t=`date -r $SOURCE_DATE_EPOCH 2>/dev/null`; then echo "Invalid SOURCE_DATE_EPOCH" >&2 exit 1 fi else t=`date` fi i=`${MAKE:-make} -V KERN_IDENT` compiler_v=$($(${MAKE:-make} -V CC) -v 2>&1 | grep -w 'version') for dir in /usr/bin /usr/local/bin; do if [ ! -z "${svnversion}" ] ; then break fi if [ -x "${dir}/svnversion" ] && [ -z ${svnversion} ] ; then # Run svnversion from ${dir} on this script; if return code # is not zero, the checkout might not be compatible with the # svnversion being used. ${dir}/svnversion $(realpath ${0}) >/dev/null 2>&1 if [ $? -eq 0 ]; then svnversion=${dir}/svnversion break fi fi done if [ -z "${svnversion}" ] && [ -x /usr/bin/svnliteversion ] ; then /usr/bin/svnliteversion $(realpath ${0}) >/dev/null 2>&1 if [ $? -eq 0 ]; then svnversion=/usr/bin/svnliteversion else svnversion= fi fi for dir in /usr/bin /usr/local/bin; do if [ -x "${dir}/p4" ] && [ -z ${p4_cmd} ] ; then p4_cmd=${dir}/p4 fi done if [ -d "${SYSDIR}/../.git" ] ; then for dir in /usr/bin /usr/local/bin; do if [ -x "${dir}/git" ] ; then git_cmd="${dir}/git --git-dir=${SYSDIR}/../.git" break fi done fi if [ -d "${SYSDIR}/../.hg" ] ; then for dir in /usr/bin /usr/local/bin; do if [ -x "${dir}/hg" ] ; then hg_cmd="${dir}/hg -R ${SYSDIR}/.." break fi done fi if [ -n "$svnversion" ] ; then svn=`cd ${SYSDIR} && $svnversion 2>/dev/null` case "$svn" in [0-9]*[MSP]|*:*) svn=" r${svn}" modified=true ;; [0-9]*) svn=" r${svn}" ;; *) unset svn ;; esac fi if [ -n "$git_cmd" ] ; then git=`$git_cmd rev-parse --verify --short HEAD 2>/dev/null` svn=`$git_cmd svn find-rev $git 2>/dev/null` if [ -n "$svn" ] ; then svn=" r${svn}" git="=${git}" else svn=`$git_cmd log | fgrep 'git-svn-id:' | head -1 | \ sed -n 's/^.*@\([0-9][0-9]*\).*$/\1/p'` if [ -z "$svn" ] ; then svn=`$git_cmd log --format='format:%N' | \ grep '^svn ' | head -1 | \ sed -n 's/^.*revision=\([0-9][0-9]*\).*$/\1/p'` fi if [ -n "$svn" ] ; then svn=" r${svn}" git="+${git}" else git=" ${git}" fi fi git_b=`$git_cmd rev-parse --abbrev-ref HEAD` if [ -n "$git_b" ] ; then git="${git}(${git_b})" fi if $git_cmd --work-tree=${SYSDIR}/.. diff-index \ --name-only HEAD | read dummy; then git="${git}-dirty" modified=true fi fi if [ -n "$p4_cmd" ] ; then p4version=`cd ${SYSDIR} && $p4_cmd changes -m1 "./...#have" 2>&1 | \ awk '{ print $2 }'` case "$p4version" in [0-9]*) p4version=" ${p4version}" p4opened=`cd ${SYSDIR} && $p4_cmd opened ./... 2>&1` case "$p4opened" in File*) ;; //*) p4version="${p4version}+edit" modified=true ;; esac ;; *) unset p4version ;; esac fi if [ -n "$hg_cmd" ] ; then hg=`$hg_cmd id 2>/dev/null` svn=`$hg_cmd svn info 2>/dev/null | \ awk -F': ' '/Revision/ { print $2 }'` if [ -n "$svn" ] ; then svn=" r${svn}" fi if [ -n "$hg" ] ; then hg=" ${hg}" fi fi include_metadata=true while getopts rR opt; do case "$opt" in r) include_metadata= ;; R) if [ -z "${modified}" ]; then include_metadata= fi esac done shift $((OPTIND - 1)) if [ -z "${include_metadata}" ]; then VERINFO="${VERSION} ${svn}${git}${hg}${p4version}" VERSTR="${VERINFO}\\n" else VERINFO="${VERSION} #${v}${svn}${git}${hg}${p4version}: ${t}" VERSTR="${VERINFO}\\n ${u}@${h}:${d}\\n" fi cat << EOF > vers.c $COPYRIGHT #define SCCSSTR "@(#)${VERINFO}" #define VERSTR "${VERSTR}" #define RELSTR "${RELEASE}" char sccs[sizeof(SCCSSTR) > 128 ? sizeof(SCCSSTR) : 128] = SCCSSTR; char version[sizeof(VERSTR) > 256 ? sizeof(VERSTR) : 256] = VERSTR; char compiler_version[] = "${compiler_v}"; char ostype[] = "${TYPE}"; char osrelease[sizeof(RELSTR) > 32 ? sizeof(RELSTR) : 32] = RELSTR; int osreldate = ${RELDATE}; char kern_ident[] = "${i}"; EOF echo $((v + 1)) > version Index: releng/11.1/sys/i386/i386/swtch.s =================================================================== --- releng/11.1/sys/i386/i386/swtch.s (revision 335464) +++ releng/11.1/sys/i386/i386/swtch.s (revision 335465) @@ -1,474 +1,480 @@ /*- * Copyright (c) 1990 The Regents of the University of California. * All rights reserved. * * This code is derived from software contributed to Berkeley by * William Jolitz. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 4. Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * $FreeBSD$ */ #include "opt_sched.h" #include #include "assym.s" #if defined(SMP) && defined(SCHED_ULE) #define SETOP xchgl #define BLOCK_SPIN(reg) \ movl $blocked_lock,%eax ; \ 100: ; \ lock ; \ cmpxchgl %eax,TD_LOCK(reg) ; \ jne 101f ; \ pause ; \ jmp 100b ; \ 101: #else #define SETOP movl #define BLOCK_SPIN(reg) #endif /*****************************************************************************/ /* Scheduling */ /*****************************************************************************/ .text /* * cpu_throw() * * This is the second half of cpu_switch(). It is used when the current * thread is either a dummy or slated to die, and we no longer care * about its state. This is only a slight optimization and is probably * not worth it anymore. Note that we need to clear the pm_active bits so * we do need the old proc if it still exists. * 0(%esp) = ret * 4(%esp) = oldtd * 8(%esp) = newtd */ ENTRY(cpu_throw) movl PCPU(CPUID), %esi movl 4(%esp),%ecx /* Old thread */ testl %ecx,%ecx /* no thread? */ jz 1f /* release bit from old pm_active */ movl PCPU(CURPMAP), %ebx #ifdef SMP lock #endif btrl %esi, PM_ACTIVE(%ebx) /* clear old */ 1: movl 8(%esp),%ecx /* New thread */ movl TD_PCB(%ecx),%edx movl PCB_CR3(%edx),%eax movl %eax,%cr3 /* set bit in new pm_active */ movl TD_PROC(%ecx),%eax movl P_VMSPACE(%eax), %ebx addl $VM_PMAP, %ebx movl %ebx, PCPU(CURPMAP) #ifdef SMP lock #endif btsl %esi, PM_ACTIVE(%ebx) /* set new */ jmp sw1 END(cpu_throw) /* * cpu_switch(old, new) * * Save the current thread state, then select the next thread to run * and load its state. * 0(%esp) = ret * 4(%esp) = oldtd * 8(%esp) = newtd * 12(%esp) = newlock */ ENTRY(cpu_switch) /* Switch to new thread. First, save context. */ movl 4(%esp),%ecx #ifdef INVARIANTS testl %ecx,%ecx /* no thread? */ jz badsw2 /* no, panic */ #endif movl TD_PCB(%ecx),%edx movl (%esp),%eax /* Hardware registers */ movl %eax,PCB_EIP(%edx) movl %ebx,PCB_EBX(%edx) movl %esp,PCB_ESP(%edx) movl %ebp,PCB_EBP(%edx) movl %esi,PCB_ESI(%edx) movl %edi,PCB_EDI(%edx) mov %gs,PCB_GS(%edx) pushfl /* PSL */ popl PCB_PSL(%edx) /* Test if debug registers should be saved. */ testl $PCB_DBREGS,PCB_FLAGS(%edx) jz 1f /* no, skip over */ movl %dr7,%eax /* yes, do the save */ movl %eax,PCB_DR7(%edx) andl $0x0000fc00, %eax /* disable all watchpoints */ movl %eax,%dr7 movl %dr6,%eax movl %eax,PCB_DR6(%edx) movl %dr3,%eax movl %eax,PCB_DR3(%edx) movl %dr2,%eax movl %eax,PCB_DR2(%edx) movl %dr1,%eax movl %eax,PCB_DR1(%edx) movl %dr0,%eax movl %eax,PCB_DR0(%edx) 1: /* have we used fp, and need a save? */ cmpl %ecx,PCPU(FPCURTHREAD) jne 1f pushl PCB_SAVEFPU(%edx) /* h/w bugs make saving complicated */ call npxsave /* do it in a big C function */ popl %eax 1: /* Save is done. Now fire up new thread. Leave old vmspace. */ movl 4(%esp),%edi movl 8(%esp),%ecx /* New thread */ movl 12(%esp),%esi /* New lock */ #ifdef INVARIANTS testl %ecx,%ecx /* no thread? */ jz badsw3 /* no, panic */ #endif movl TD_PCB(%ecx),%edx /* switch address space */ movl PCB_CR3(%edx),%eax movl %cr3,%ebx /* The same address space? */ cmpl %ebx,%eax je sw0 movl %eax,%cr3 /* new address space */ movl %esi,%eax movl PCPU(CPUID),%esi SETOP %eax,TD_LOCK(%edi) /* Switchout td_lock */ /* Release bit from old pmap->pm_active */ movl PCPU(CURPMAP), %ebx #ifdef SMP lock #endif btrl %esi, PM_ACTIVE(%ebx) /* clear old */ /* Set bit in new pmap->pm_active */ movl TD_PROC(%ecx),%eax /* newproc */ movl P_VMSPACE(%eax), %ebx addl $VM_PMAP, %ebx movl %ebx, PCPU(CURPMAP) #ifdef SMP lock #endif btsl %esi, PM_ACTIVE(%ebx) /* set new */ jmp sw1 sw0: SETOP %esi,TD_LOCK(%edi) /* Switchout td_lock */ sw1: BLOCK_SPIN(%ecx) /* * At this point, we've switched address spaces and are ready * to load up the rest of the next context. */ cmpl $0, PCB_EXT(%edx) /* has pcb extension? */ je 1f /* If not, use the default */ movl $1, PCPU(PRIVATE_TSS) /* mark use of private tss */ movl PCB_EXT(%edx), %edi /* new tss descriptor */ jmp 2f /* Load it up */ 1: /* * Use the common default TSS instead of our own. * Set our stack pointer into the TSS, it's set to just * below the PCB. In C, common_tss.tss_esp0 = &pcb - 16; */ leal -16(%edx), %ebx /* leave space for vm86 */ movl %ebx, PCPU(COMMON_TSS) + TSS_ESP0 /* * Test this CPU's bit in the bitmap to see if this * CPU was using a private TSS. */ cmpl $0, PCPU(PRIVATE_TSS) /* Already using the common? */ je 3f /* if so, skip reloading */ movl $0, PCPU(PRIVATE_TSS) PCPU_ADDR(COMMON_TSSD, %edi) 2: /* Move correct tss descriptor into GDT slot, then reload tr. */ movl PCPU(TSS_GDT), %ebx /* entry in GDT */ movl 0(%edi), %eax movl 4(%edi), %esi movl %eax, 0(%ebx) movl %esi, 4(%ebx) movl $GPROC0_SEL*8, %esi /* GSEL(GPROC0_SEL, SEL_KPL) */ ltr %si 3: /* Copy the %fs and %gs selectors into this pcpu gdt */ leal PCB_FSD(%edx), %esi movl PCPU(FSGS_GDT), %edi movl 0(%esi), %eax /* %fs selector */ movl 4(%esi), %ebx movl %eax, 0(%edi) movl %ebx, 4(%edi) movl 8(%esi), %eax /* %gs selector, comes straight after */ movl 12(%esi), %ebx movl %eax, 8(%edi) movl %ebx, 12(%edi) /* Restore context. */ movl PCB_EBX(%edx),%ebx movl PCB_ESP(%edx),%esp movl PCB_EBP(%edx),%ebp movl PCB_ESI(%edx),%esi movl PCB_EDI(%edx),%edi movl PCB_EIP(%edx),%eax movl %eax,(%esp) pushl PCB_PSL(%edx) popfl movl %edx, PCPU(CURPCB) movl TD_TID(%ecx),%eax movl %ecx, PCPU(CURTHREAD) /* into next thread */ /* * Determine the LDT to use and load it if is the default one and * that is not the current one. */ movl TD_PROC(%ecx),%eax cmpl $0,P_MD+MD_LDT(%eax) jnz 1f movl _default_ldt,%eax cmpl PCPU(CURRENTLDT),%eax je 2f lldt _default_ldt movl %eax,PCPU(CURRENTLDT) jmp 2f 1: /* Load the LDT when it is not the default one. */ pushl %edx /* Preserve pointer to pcb. */ addl $P_MD,%eax /* Pointer to mdproc is arg. */ pushl %eax call set_user_ldt addl $4,%esp popl %edx 2: /* This must be done after loading the user LDT. */ .globl cpu_switch_load_gs cpu_switch_load_gs: mov PCB_GS(%edx),%gs + pushl %edx + pushl PCPU(CURTHREAD) + call npxswitch + popl %edx + popl %edx + /* Test if debug registers should be restored. */ testl $PCB_DBREGS,PCB_FLAGS(%edx) jz 1f /* * Restore debug registers. The special code for dr7 is to * preserve the current values of its reserved bits. */ movl PCB_DR6(%edx),%eax movl %eax,%dr6 movl PCB_DR3(%edx),%eax movl %eax,%dr3 movl PCB_DR2(%edx),%eax movl %eax,%dr2 movl PCB_DR1(%edx),%eax movl %eax,%dr1 movl PCB_DR0(%edx),%eax movl %eax,%dr0 movl %dr7,%eax andl $0x0000fc00,%eax movl PCB_DR7(%edx),%ecx andl $~0x0000fc00,%ecx orl %ecx,%eax movl %eax,%dr7 1: ret #ifdef INVARIANTS badsw1: pushal pushl $sw0_1 call panic sw0_1: .asciz "cpu_throw: no newthread supplied" badsw2: pushal pushl $sw0_2 call panic sw0_2: .asciz "cpu_switch: no curthread supplied" badsw3: pushal pushl $sw0_3 call panic sw0_3: .asciz "cpu_switch: no newthread supplied" #endif END(cpu_switch) /* * savectx(pcb) * Update pcb, saving current processor state. */ ENTRY(savectx) /* Fetch PCB. */ movl 4(%esp),%ecx /* Save caller's return address. Child won't execute this routine. */ movl (%esp),%eax movl %eax,PCB_EIP(%ecx) movl %cr3,%eax movl %eax,PCB_CR3(%ecx) movl %ebx,PCB_EBX(%ecx) movl %esp,PCB_ESP(%ecx) movl %ebp,PCB_EBP(%ecx) movl %esi,PCB_ESI(%ecx) movl %edi,PCB_EDI(%ecx) mov %gs,PCB_GS(%ecx) pushfl popl PCB_PSL(%ecx) movl %cr0,%eax movl %eax,PCB_CR0(%ecx) movl %cr2,%eax movl %eax,PCB_CR2(%ecx) movl %cr4,%eax movl %eax,PCB_CR4(%ecx) movl %dr0,%eax movl %eax,PCB_DR0(%ecx) movl %dr1,%eax movl %eax,PCB_DR1(%ecx) movl %dr2,%eax movl %eax,PCB_DR2(%ecx) movl %dr3,%eax movl %eax,PCB_DR3(%ecx) movl %dr6,%eax movl %eax,PCB_DR6(%ecx) movl %dr7,%eax movl %eax,PCB_DR7(%ecx) mov %ds,PCB_DS(%ecx) mov %es,PCB_ES(%ecx) mov %fs,PCB_FS(%ecx) mov %ss,PCB_SS(%ecx) sgdt PCB_GDT(%ecx) sidt PCB_IDT(%ecx) sldt PCB_LDT(%ecx) str PCB_TR(%ecx) movl $1,%eax ret END(savectx) /* * resumectx(pcb) __fastcall * Resuming processor state from pcb. */ ENTRY(resumectx) /* Restore GDT. */ lgdt PCB_GDT(%ecx) /* Restore segment registers */ movzwl PCB_DS(%ecx),%eax mov %ax,%ds movzwl PCB_ES(%ecx),%eax mov %ax,%es movzwl PCB_FS(%ecx),%eax mov %ax,%fs movzwl PCB_GS(%ecx),%eax movw %ax,%gs movzwl PCB_SS(%ecx),%eax mov %ax,%ss /* Restore CR2, CR4, CR3 and CR0 */ movl PCB_CR2(%ecx),%eax movl %eax,%cr2 movl PCB_CR4(%ecx),%eax movl %eax,%cr4 movl PCB_CR3(%ecx),%eax movl %eax,%cr3 movl PCB_CR0(%ecx),%eax movl %eax,%cr0 jmp 1f 1: /* Restore descriptor tables */ lidt PCB_IDT(%ecx) lldt PCB_LDT(%ecx) #define SDT_SYS386TSS 9 #define SDT_SYS386BSY 11 /* Clear "task busy" bit and reload TR */ movl PCPU(TSS_GDT),%eax andb $(~SDT_SYS386BSY | SDT_SYS386TSS),5(%eax) movzwl PCB_TR(%ecx),%eax ltr %ax #undef SDT_SYS386TSS #undef SDT_SYS386BSY /* Restore debug registers */ movl PCB_DR0(%ecx),%eax movl %eax,%dr0 movl PCB_DR1(%ecx),%eax movl %eax,%dr1 movl PCB_DR2(%ecx),%eax movl %eax,%dr2 movl PCB_DR3(%ecx),%eax movl %eax,%dr3 movl PCB_DR6(%ecx),%eax movl %eax,%dr6 movl PCB_DR7(%ecx),%eax movl %eax,%dr7 /* Restore other registers */ movl PCB_EDI(%ecx),%edi movl PCB_ESI(%ecx),%esi movl PCB_EBP(%ecx),%ebp movl PCB_ESP(%ecx),%esp movl PCB_EBX(%ecx),%ebx /* reload code selector by turning return into intersegmental return */ pushl PCB_EIP(%ecx) movl $KCSEL,4(%esp) xorl %eax,%eax lret END(resumectx) Index: releng/11.1/sys/i386/isa/npx.c =================================================================== --- releng/11.1/sys/i386/isa/npx.c (revision 335464) +++ releng/11.1/sys/i386/isa/npx.c (revision 335465) @@ -1,1438 +1,1465 @@ /*- * Copyright (c) 1990 William Jolitz. * Copyright (c) 1991 The Regents of the University of California. * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 4. Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * from: @(#)npx.c 7.2 (Berkeley) 5/12/91 */ #include __FBSDID("$FreeBSD$"); #include "opt_cpu.h" #include "opt_isa.h" #include "opt_npx.h" #include #include #include #include #include #include #include #include #include #include #include #include #include #include #ifdef NPX_DEBUG #include #endif #include #include #include #include #include #include #include #include #include #include #include #include #include #ifdef DEV_ISA #include #endif /* * 387 and 287 Numeric Coprocessor Extension (NPX) Driver. */ #if defined(__GNUCLIKE_ASM) && !defined(lint) #define fldcw(cw) __asm __volatile("fldcw %0" : : "m" (cw)) #define fnclex() __asm __volatile("fnclex") #define fninit() __asm __volatile("fninit") #define fnsave(addr) __asm __volatile("fnsave %0" : "=m" (*(addr))) #define fnstcw(addr) __asm __volatile("fnstcw %0" : "=m" (*(addr))) #define fnstsw(addr) __asm __volatile("fnstsw %0" : "=am" (*(addr))) #define fp_divide_by_0() __asm __volatile( \ "fldz; fld1; fdiv %st,%st(1); fnop") #define frstor(addr) __asm __volatile("frstor %0" : : "m" (*(addr))) #define fxrstor(addr) __asm __volatile("fxrstor %0" : : "m" (*(addr))) #define fxsave(addr) __asm __volatile("fxsave %0" : "=m" (*(addr))) #define ldmxcsr(csr) __asm __volatile("ldmxcsr %0" : : "m" (csr)) #define stmxcsr(addr) __asm __volatile("stmxcsr %0" : : "m" (*(addr))) static __inline void xrstor(char *addr, uint64_t mask) { uint32_t low, hi; low = mask; hi = mask >> 32; __asm __volatile("xrstor %0" : : "m" (*addr), "a" (low), "d" (hi)); } static __inline void xsave(char *addr, uint64_t mask) { uint32_t low, hi; low = mask; hi = mask >> 32; __asm __volatile("xsave %0" : "=m" (*addr) : "a" (low), "d" (hi) : "memory"); } static __inline void xsaveopt(char *addr, uint64_t mask) { uint32_t low, hi; low = mask; hi = mask >> 32; __asm __volatile("xsaveopt %0" : "=m" (*addr) : "a" (low), "d" (hi) : "memory"); } #else /* !(__GNUCLIKE_ASM && !lint) */ void fldcw(u_short cw); void fnclex(void); void fninit(void); void fnsave(caddr_t addr); void fnstcw(caddr_t addr); void fnstsw(caddr_t addr); void fp_divide_by_0(void); void frstor(caddr_t addr); void fxsave(caddr_t addr); void fxrstor(caddr_t addr); void ldmxcsr(u_int csr); void stmxcsr(u_int *csr); void xrstor(char *addr, uint64_t mask); void xsave(char *addr, uint64_t mask); void xsaveopt(char *addr, uint64_t mask); #endif /* __GNUCLIKE_ASM && !lint */ #define start_emulating() load_cr0(rcr0() | CR0_TS) #define stop_emulating() clts() #define GET_FPU_CW(thread) \ (cpu_fxsr ? \ (thread)->td_pcb->pcb_save->sv_xmm.sv_env.en_cw : \ (thread)->td_pcb->pcb_save->sv_87.sv_env.en_cw) #define GET_FPU_SW(thread) \ (cpu_fxsr ? \ (thread)->td_pcb->pcb_save->sv_xmm.sv_env.en_sw : \ (thread)->td_pcb->pcb_save->sv_87.sv_env.en_sw) #define SET_FPU_CW(savefpu, value) do { \ if (cpu_fxsr) \ (savefpu)->sv_xmm.sv_env.en_cw = (value); \ else \ (savefpu)->sv_87.sv_env.en_cw = (value); \ } while (0) CTASSERT(sizeof(union savefpu) == 512); CTASSERT(sizeof(struct xstate_hdr) == 64); CTASSERT(sizeof(struct savefpu_ymm) == 832); /* * This requirement is to make it easier for asm code to calculate * offset of the fpu save area from the pcb address. FPU save area * must be 64-byte aligned. */ CTASSERT(sizeof(struct pcb) % XSAVE_AREA_ALIGN == 0); /* * Ensure the copy of XCR0 saved in a core is contained in the padding * area. */ CTASSERT(X86_XSTATE_XCR0_OFFSET >= offsetof(struct savexmm, sv_pad) && X86_XSTATE_XCR0_OFFSET + sizeof(uint64_t) <= sizeof(struct savexmm)); static void fpu_clean_state(void); static void fpusave(union savefpu *); static void fpurstor(union savefpu *); int hw_float; SYSCTL_INT(_hw, HW_FLOATINGPT, floatingpoint, CTLFLAG_RD, &hw_float, 0, "Floating point instructions executed in hardware"); +int lazy_fpu_switch = 0; +SYSCTL_INT(_hw, OID_AUTO, lazy_fpu_switch, CTLFLAG_RWTUN | CTLFLAG_NOFETCH, + &lazy_fpu_switch, 0, + "Lazily load FPU context after context switch"); + int use_xsave; uint64_t xsave_mask; static uma_zone_t fpu_save_area_zone; static union savefpu *npx_initialstate; struct xsave_area_elm_descr { u_int offset; u_int size; } *xsave_area_desc; static int use_xsaveopt; static volatile u_int npx_traps_while_probing; alias_for_inthand_t probetrap; __asm(" \n\ .text \n\ .p2align 2,0x90 \n\ .type " __XSTRING(CNAME(probetrap)) ",@function \n\ " __XSTRING(CNAME(probetrap)) ": \n\ ss \n\ incl " __XSTRING(CNAME(npx_traps_while_probing)) " \n\ fnclex \n\ iret \n\ "); /* * Determine if an FPU is present and how to use it. */ static int npx_probe(void) { struct gate_descriptor save_idt_npxtrap; u_short control, status; /* * Modern CPUs all have an FPU that uses the INT16 interface * and provide a simple way to verify that, so handle the * common case right away. */ if (cpu_feature & CPUID_FPU) { hw_float = 1; return (1); } save_idt_npxtrap = idt[IDT_MF]; setidt(IDT_MF, probetrap, SDT_SYS386TGT, SEL_KPL, GSEL(GCODE_SEL, SEL_KPL)); /* * Don't trap while we're probing. */ stop_emulating(); /* * Finish resetting the coprocessor, if any. If there is an error * pending, then we may get a bogus IRQ13, but npx_intr() will handle * it OK. Bogus halts have never been observed, but we enabled * IRQ13 and cleared the BUSY# latch early to handle them anyway. */ fninit(); /* * Don't use fwait here because it might hang. * Don't use fnop here because it usually hangs if there is no FPU. */ DELAY(1000); /* wait for any IRQ13 */ #ifdef DIAGNOSTIC if (npx_traps_while_probing != 0) printf("fninit caused %u bogus npx trap(s)\n", npx_traps_while_probing); #endif /* * Check for a status of mostly zero. */ status = 0x5a5a; fnstsw(&status); if ((status & 0xb8ff) == 0) { /* * Good, now check for a proper control word. */ control = 0x5a5a; fnstcw(&control); if ((control & 0x1f3f) == 0x033f) { /* * We have an npx, now divide by 0 to see if exception * 16 works. */ control &= ~(1 << 2); /* enable divide by 0 trap */ fldcw(control); #ifdef FPU_ERROR_BROKEN /* * FPU error signal doesn't work on some CPU * accelerator board. */ hw_float = 1; return (1); #endif npx_traps_while_probing = 0; fp_divide_by_0(); if (npx_traps_while_probing != 0) { /* * Good, exception 16 works. */ hw_float = 1; goto cleanup; } printf( "FPU does not use exception 16 for error reporting\n"); goto cleanup; } } /* * Probe failed. Floating point simply won't work. * Notify user and disable FPU/MMX/SSE instruction execution. */ printf("WARNING: no FPU!\n"); __asm __volatile("smsw %%ax; orb %0,%%al; lmsw %%ax" : : "n" (CR0_EM | CR0_MP) : "ax"); cleanup: idt[IDT_MF] = save_idt_npxtrap; return (hw_float); } /* * Enable XSAVE if supported and allowed by user. * Calculate the xsave_mask. */ static void npxinit_bsp1(void) { u_int cp[4]; uint64_t xsave_mask_user; + TUNABLE_INT_FETCH("hw.lazy_fpu_switch", &lazy_fpu_switch); if (cpu_fxsr && (cpu_feature2 & CPUID2_XSAVE) != 0) { use_xsave = 1; TUNABLE_INT_FETCH("hw.use_xsave", &use_xsave); } if (!use_xsave) return; cpuid_count(0xd, 0x0, cp); xsave_mask = XFEATURE_ENABLED_X87 | XFEATURE_ENABLED_SSE; if ((cp[0] & xsave_mask) != xsave_mask) panic("CPU0 does not support X87 or SSE: %x", cp[0]); xsave_mask = ((uint64_t)cp[3] << 32) | cp[0]; xsave_mask_user = xsave_mask; TUNABLE_QUAD_FETCH("hw.xsave_mask", &xsave_mask_user); xsave_mask_user |= XFEATURE_ENABLED_X87 | XFEATURE_ENABLED_SSE; xsave_mask &= xsave_mask_user; if ((xsave_mask & XFEATURE_AVX512) != XFEATURE_AVX512) xsave_mask &= ~XFEATURE_AVX512; if ((xsave_mask & XFEATURE_MPX) != XFEATURE_MPX) xsave_mask &= ~XFEATURE_MPX; cpuid_count(0xd, 0x1, cp); if ((cp[0] & CPUID_EXTSTATE_XSAVEOPT) != 0) use_xsaveopt = 1; } /* * Calculate the fpu save area size. */ static void npxinit_bsp2(void) { u_int cp[4]; if (use_xsave) { cpuid_count(0xd, 0x0, cp); cpu_max_ext_state_size = cp[1]; /* * Reload the cpu_feature2, since we enabled OSXSAVE. */ do_cpuid(1, cp); cpu_feature2 = cp[2]; } else cpu_max_ext_state_size = sizeof(union savefpu); } /* * Initialize floating point unit. */ void npxinit(bool bsp) { static union savefpu dummy; register_t saveintr; u_int mxcsr; u_short control; if (bsp) { if (!npx_probe()) return; npxinit_bsp1(); } if (use_xsave) { load_cr4(rcr4() | CR4_XSAVE); load_xcr(XCR0, xsave_mask); } /* * XCR0 shall be set up before CPU can report the save area size. */ if (bsp) npxinit_bsp2(); /* * fninit has the same h/w bugs as fnsave. Use the detoxified * fnsave to throw away any junk in the fpu. fpusave() initializes * the fpu. * * It is too early for critical_enter() to work on AP. */ saveintr = intr_disable(); stop_emulating(); if (cpu_fxsr) fninit(); else fnsave(&dummy); control = __INITIAL_NPXCW__; fldcw(control); if (cpu_fxsr) { mxcsr = __INITIAL_MXCSR__; ldmxcsr(mxcsr); } start_emulating(); intr_restore(saveintr); } /* * On the boot CPU we generate a clean state that is used to * initialize the floating point unit when it is first used by a * process. */ static void npxinitstate(void *arg __unused) { register_t saveintr; int cp[4], i, max_ext_n; if (!hw_float) return; npx_initialstate = malloc(cpu_max_ext_state_size, M_DEVBUF, M_WAITOK | M_ZERO); saveintr = intr_disable(); stop_emulating(); fpusave(npx_initialstate); if (cpu_fxsr) { if (npx_initialstate->sv_xmm.sv_env.en_mxcsr_mask) cpu_mxcsr_mask = npx_initialstate->sv_xmm.sv_env.en_mxcsr_mask; else cpu_mxcsr_mask = 0xFFBF; /* * The fninit instruction does not modify XMM * registers or x87 registers (MM/ST). The fpusave * call dumped the garbage contained in the registers * after reset to the initial state saved. Clear XMM * and x87 registers file image to make the startup * program state and signal handler XMM/x87 register * content predictable. */ bzero(npx_initialstate->sv_xmm.sv_fp, sizeof(npx_initialstate->sv_xmm.sv_fp)); bzero(npx_initialstate->sv_xmm.sv_xmm, sizeof(npx_initialstate->sv_xmm.sv_xmm)); } else bzero(npx_initialstate->sv_87.sv_ac, sizeof(npx_initialstate->sv_87.sv_ac)); /* * Create a table describing the layout of the CPU Extended * Save Area. */ if (use_xsave) { if (xsave_mask >> 32 != 0) max_ext_n = fls(xsave_mask >> 32) + 32; else max_ext_n = fls(xsave_mask); xsave_area_desc = malloc(max_ext_n * sizeof(struct xsave_area_elm_descr), M_DEVBUF, M_WAITOK | M_ZERO); /* x87 state */ xsave_area_desc[0].offset = 0; xsave_area_desc[0].size = 160; /* XMM */ xsave_area_desc[1].offset = 160; xsave_area_desc[1].size = 288 - 160; for (i = 2; i < max_ext_n; i++) { cpuid_count(0xd, i, cp); xsave_area_desc[i].offset = cp[1]; xsave_area_desc[i].size = cp[0]; } } fpu_save_area_zone = uma_zcreate("FPU_save_area", cpu_max_ext_state_size, NULL, NULL, NULL, NULL, XSAVE_AREA_ALIGN - 1, 0); start_emulating(); intr_restore(saveintr); } SYSINIT(npxinitstate, SI_SUB_DRIVERS, SI_ORDER_ANY, npxinitstate, NULL); /* * Free coprocessor (if we have it). */ void npxexit(struct thread *td) { critical_enter(); if (curthread == PCPU_GET(fpcurthread)) { stop_emulating(); fpusave(curpcb->pcb_save); start_emulating(); PCPU_SET(fpcurthread, NULL); } critical_exit(); #ifdef NPX_DEBUG if (hw_float) { u_int masked_exceptions; masked_exceptions = GET_FPU_CW(td) & GET_FPU_SW(td) & 0x7f; /* * Log exceptions that would have trapped with the old * control word (overflow, divide by 0, and invalid operand). */ if (masked_exceptions & 0x0d) log(LOG_ERR, "pid %d (%s) exited with masked floating point exceptions 0x%02x\n", td->td_proc->p_pid, td->td_proc->p_comm, masked_exceptions); } #endif } int npxformat(void) { if (!hw_float) return (_MC_FPFMT_NODEV); if (cpu_fxsr) return (_MC_FPFMT_XMM); return (_MC_FPFMT_387); } /* * The following mechanism is used to ensure that the FPE_... value * that is passed as a trapcode to the signal handler of the user * process does not have more than one bit set. * * Multiple bits may be set if the user process modifies the control * word while a status word bit is already set. While this is a sign * of bad coding, we have no choise than to narrow them down to one * bit, since we must not send a trapcode that is not exactly one of * the FPE_ macros. * * The mechanism has a static table with 127 entries. Each combination * of the 7 FPU status word exception bits directly translates to a * position in this table, where a single FPE_... value is stored. * This FPE_... value stored there is considered the "most important" * of the exception bits and will be sent as the signal code. The * precedence of the bits is based upon Intel Document "Numerical * Applications", Chapter "Special Computational Situations". * * The macro to choose one of these values does these steps: 1) Throw * away status word bits that cannot be masked. 2) Throw away the bits * currently masked in the control word, assuming the user isn't * interested in them anymore. 3) Reinsert status word bit 7 (stack * fault) if it is set, which cannot be masked but must be presered. * 4) Use the remaining bits to point into the trapcode table. * * The 6 maskable bits in order of their preference, as stated in the * above referenced Intel manual: * 1 Invalid operation (FP_X_INV) * 1a Stack underflow * 1b Stack overflow * 1c Operand of unsupported format * 1d SNaN operand. * 2 QNaN operand (not an exception, irrelavant here) * 3 Any other invalid-operation not mentioned above or zero divide * (FP_X_INV, FP_X_DZ) * 4 Denormal operand (FP_X_DNML) * 5 Numeric over/underflow (FP_X_OFL, FP_X_UFL) * 6 Inexact result (FP_X_IMP) */ static char fpetable[128] = { 0, FPE_FLTINV, /* 1 - INV */ FPE_FLTUND, /* 2 - DNML */ FPE_FLTINV, /* 3 - INV | DNML */ FPE_FLTDIV, /* 4 - DZ */ FPE_FLTINV, /* 5 - INV | DZ */ FPE_FLTDIV, /* 6 - DNML | DZ */ FPE_FLTINV, /* 7 - INV | DNML | DZ */ FPE_FLTOVF, /* 8 - OFL */ FPE_FLTINV, /* 9 - INV | OFL */ FPE_FLTUND, /* A - DNML | OFL */ FPE_FLTINV, /* B - INV | DNML | OFL */ FPE_FLTDIV, /* C - DZ | OFL */ FPE_FLTINV, /* D - INV | DZ | OFL */ FPE_FLTDIV, /* E - DNML | DZ | OFL */ FPE_FLTINV, /* F - INV | DNML | DZ | OFL */ FPE_FLTUND, /* 10 - UFL */ FPE_FLTINV, /* 11 - INV | UFL */ FPE_FLTUND, /* 12 - DNML | UFL */ FPE_FLTINV, /* 13 - INV | DNML | UFL */ FPE_FLTDIV, /* 14 - DZ | UFL */ FPE_FLTINV, /* 15 - INV | DZ | UFL */ FPE_FLTDIV, /* 16 - DNML | DZ | UFL */ FPE_FLTINV, /* 17 - INV | DNML | DZ | UFL */ FPE_FLTOVF, /* 18 - OFL | UFL */ FPE_FLTINV, /* 19 - INV | OFL | UFL */ FPE_FLTUND, /* 1A - DNML | OFL | UFL */ FPE_FLTINV, /* 1B - INV | DNML | OFL | UFL */ FPE_FLTDIV, /* 1C - DZ | OFL | UFL */ FPE_FLTINV, /* 1D - INV | DZ | OFL | UFL */ FPE_FLTDIV, /* 1E - DNML | DZ | OFL | UFL */ FPE_FLTINV, /* 1F - INV | DNML | DZ | OFL | UFL */ FPE_FLTRES, /* 20 - IMP */ FPE_FLTINV, /* 21 - INV | IMP */ FPE_FLTUND, /* 22 - DNML | IMP */ FPE_FLTINV, /* 23 - INV | DNML | IMP */ FPE_FLTDIV, /* 24 - DZ | IMP */ FPE_FLTINV, /* 25 - INV | DZ | IMP */ FPE_FLTDIV, /* 26 - DNML | DZ | IMP */ FPE_FLTINV, /* 27 - INV | DNML | DZ | IMP */ FPE_FLTOVF, /* 28 - OFL | IMP */ FPE_FLTINV, /* 29 - INV | OFL | IMP */ FPE_FLTUND, /* 2A - DNML | OFL | IMP */ FPE_FLTINV, /* 2B - INV | DNML | OFL | IMP */ FPE_FLTDIV, /* 2C - DZ | OFL | IMP */ FPE_FLTINV, /* 2D - INV | DZ | OFL | IMP */ FPE_FLTDIV, /* 2E - DNML | DZ | OFL | IMP */ FPE_FLTINV, /* 2F - INV | DNML | DZ | OFL | IMP */ FPE_FLTUND, /* 30 - UFL | IMP */ FPE_FLTINV, /* 31 - INV | UFL | IMP */ FPE_FLTUND, /* 32 - DNML | UFL | IMP */ FPE_FLTINV, /* 33 - INV | DNML | UFL | IMP */ FPE_FLTDIV, /* 34 - DZ | UFL | IMP */ FPE_FLTINV, /* 35 - INV | DZ | UFL | IMP */ FPE_FLTDIV, /* 36 - DNML | DZ | UFL | IMP */ FPE_FLTINV, /* 37 - INV | DNML | DZ | UFL | IMP */ FPE_FLTOVF, /* 38 - OFL | UFL | IMP */ FPE_FLTINV, /* 39 - INV | OFL | UFL | IMP */ FPE_FLTUND, /* 3A - DNML | OFL | UFL | IMP */ FPE_FLTINV, /* 3B - INV | DNML | OFL | UFL | IMP */ FPE_FLTDIV, /* 3C - DZ | OFL | UFL | IMP */ FPE_FLTINV, /* 3D - INV | DZ | OFL | UFL | IMP */ FPE_FLTDIV, /* 3E - DNML | DZ | OFL | UFL | IMP */ FPE_FLTINV, /* 3F - INV | DNML | DZ | OFL | UFL | IMP */ FPE_FLTSUB, /* 40 - STK */ FPE_FLTSUB, /* 41 - INV | STK */ FPE_FLTUND, /* 42 - DNML | STK */ FPE_FLTSUB, /* 43 - INV | DNML | STK */ FPE_FLTDIV, /* 44 - DZ | STK */ FPE_FLTSUB, /* 45 - INV | DZ | STK */ FPE_FLTDIV, /* 46 - DNML | DZ | STK */ FPE_FLTSUB, /* 47 - INV | DNML | DZ | STK */ FPE_FLTOVF, /* 48 - OFL | STK */ FPE_FLTSUB, /* 49 - INV | OFL | STK */ FPE_FLTUND, /* 4A - DNML | OFL | STK */ FPE_FLTSUB, /* 4B - INV | DNML | OFL | STK */ FPE_FLTDIV, /* 4C - DZ | OFL | STK */ FPE_FLTSUB, /* 4D - INV | DZ | OFL | STK */ FPE_FLTDIV, /* 4E - DNML | DZ | OFL | STK */ FPE_FLTSUB, /* 4F - INV | DNML | DZ | OFL | STK */ FPE_FLTUND, /* 50 - UFL | STK */ FPE_FLTSUB, /* 51 - INV | UFL | STK */ FPE_FLTUND, /* 52 - DNML | UFL | STK */ FPE_FLTSUB, /* 53 - INV | DNML | UFL | STK */ FPE_FLTDIV, /* 54 - DZ | UFL | STK */ FPE_FLTSUB, /* 55 - INV | DZ | UFL | STK */ FPE_FLTDIV, /* 56 - DNML | DZ | UFL | STK */ FPE_FLTSUB, /* 57 - INV | DNML | DZ | UFL | STK */ FPE_FLTOVF, /* 58 - OFL | UFL | STK */ FPE_FLTSUB, /* 59 - INV | OFL | UFL | STK */ FPE_FLTUND, /* 5A - DNML | OFL | UFL | STK */ FPE_FLTSUB, /* 5B - INV | DNML | OFL | UFL | STK */ FPE_FLTDIV, /* 5C - DZ | OFL | UFL | STK */ FPE_FLTSUB, /* 5D - INV | DZ | OFL | UFL | STK */ FPE_FLTDIV, /* 5E - DNML | DZ | OFL | UFL | STK */ FPE_FLTSUB, /* 5F - INV | DNML | DZ | OFL | UFL | STK */ FPE_FLTRES, /* 60 - IMP | STK */ FPE_FLTSUB, /* 61 - INV | IMP | STK */ FPE_FLTUND, /* 62 - DNML | IMP | STK */ FPE_FLTSUB, /* 63 - INV | DNML | IMP | STK */ FPE_FLTDIV, /* 64 - DZ | IMP | STK */ FPE_FLTSUB, /* 65 - INV | DZ | IMP | STK */ FPE_FLTDIV, /* 66 - DNML | DZ | IMP | STK */ FPE_FLTSUB, /* 67 - INV | DNML | DZ | IMP | STK */ FPE_FLTOVF, /* 68 - OFL | IMP | STK */ FPE_FLTSUB, /* 69 - INV | OFL | IMP | STK */ FPE_FLTUND, /* 6A - DNML | OFL | IMP | STK */ FPE_FLTSUB, /* 6B - INV | DNML | OFL | IMP | STK */ FPE_FLTDIV, /* 6C - DZ | OFL | IMP | STK */ FPE_FLTSUB, /* 6D - INV | DZ | OFL | IMP | STK */ FPE_FLTDIV, /* 6E - DNML | DZ | OFL | IMP | STK */ FPE_FLTSUB, /* 6F - INV | DNML | DZ | OFL | IMP | STK */ FPE_FLTUND, /* 70 - UFL | IMP | STK */ FPE_FLTSUB, /* 71 - INV | UFL | IMP | STK */ FPE_FLTUND, /* 72 - DNML | UFL | IMP | STK */ FPE_FLTSUB, /* 73 - INV | DNML | UFL | IMP | STK */ FPE_FLTDIV, /* 74 - DZ | UFL | IMP | STK */ FPE_FLTSUB, /* 75 - INV | DZ | UFL | IMP | STK */ FPE_FLTDIV, /* 76 - DNML | DZ | UFL | IMP | STK */ FPE_FLTSUB, /* 77 - INV | DNML | DZ | UFL | IMP | STK */ FPE_FLTOVF, /* 78 - OFL | UFL | IMP | STK */ FPE_FLTSUB, /* 79 - INV | OFL | UFL | IMP | STK */ FPE_FLTUND, /* 7A - DNML | OFL | UFL | IMP | STK */ FPE_FLTSUB, /* 7B - INV | DNML | OFL | UFL | IMP | STK */ FPE_FLTDIV, /* 7C - DZ | OFL | UFL | IMP | STK */ FPE_FLTSUB, /* 7D - INV | DZ | OFL | UFL | IMP | STK */ FPE_FLTDIV, /* 7E - DNML | DZ | OFL | UFL | IMP | STK */ FPE_FLTSUB, /* 7F - INV | DNML | DZ | OFL | UFL | IMP | STK */ }; /* * Read the FP status and control words, then generate si_code value * for SIGFPE. The error code chosen will be one of the * FPE_... macros. It will be sent as the second argument to old * BSD-style signal handlers and as "siginfo_t->si_code" (second * argument) to SA_SIGINFO signal handlers. * * Some time ago, we cleared the x87 exceptions with FNCLEX there. * Clearing exceptions was necessary mainly to avoid IRQ13 bugs. The * usermode code which understands the FPU hardware enough to enable * the exceptions, can also handle clearing the exception state in the * handler. The only consequence of not clearing the exception is the * rethrow of the SIGFPE on return from the signal handler and * reexecution of the corresponding instruction. * * For XMM traps, the exceptions were never cleared. */ int npxtrap_x87(void) { u_short control, status; if (!hw_float) { printf( "npxtrap_x87: fpcurthread = %p, curthread = %p, hw_float = %d\n", PCPU_GET(fpcurthread), curthread, hw_float); panic("npxtrap from nowhere"); } critical_enter(); /* * Interrupt handling (for another interrupt) may have pushed the * state to memory. Fetch the relevant parts of the state from * wherever they are. */ if (PCPU_GET(fpcurthread) != curthread) { control = GET_FPU_CW(curthread); status = GET_FPU_SW(curthread); } else { fnstcw(&control); fnstsw(&status); } critical_exit(); return (fpetable[status & ((~control & 0x3f) | 0x40)]); } int npxtrap_sse(void) { u_int mxcsr; if (!hw_float) { printf( "npxtrap_sse: fpcurthread = %p, curthread = %p, hw_float = %d\n", PCPU_GET(fpcurthread), curthread, hw_float); panic("npxtrap from nowhere"); } critical_enter(); if (PCPU_GET(fpcurthread) != curthread) mxcsr = curthread->td_pcb->pcb_save->sv_xmm.sv_env.en_mxcsr; else stmxcsr(&mxcsr); critical_exit(); return (fpetable[(mxcsr & (~mxcsr >> 7)) & 0x3f]); } -/* - * Implement device not available (DNA) exception - * - * It would be better to switch FP context here (if curthread != fpcurthread) - * and not necessarily for every context switch, but it is too hard to - * access foreign pcb's. - */ - -static int err_count = 0; - -int -npxdna(void) +static void +restore_npx_curthread(struct thread *td, struct pcb *pcb) { - if (!hw_float) - return (0); - critical_enter(); - if (PCPU_GET(fpcurthread) == curthread) { - printf("npxdna: fpcurthread == curthread %d times\n", - ++err_count); - stop_emulating(); - critical_exit(); - return (1); - } - if (PCPU_GET(fpcurthread) != NULL) { - printf("npxdna: fpcurthread = %p (%d), curthread = %p (%d)\n", - PCPU_GET(fpcurthread), - PCPU_GET(fpcurthread)->td_proc->p_pid, - curthread, curthread->td_proc->p_pid); - panic("npxdna"); - } - stop_emulating(); /* * Record new context early in case frstor causes a trap. */ - PCPU_SET(fpcurthread, curthread); + PCPU_SET(fpcurthread, td); + stop_emulating(); if (cpu_fxsr) fpu_clean_state(); - if ((curpcb->pcb_flags & PCB_NPXINITDONE) == 0) { + if ((pcb->pcb_flags & PCB_NPXINITDONE) == 0) { /* * This is the first time this thread has used the FPU or * the PCB doesn't contain a clean FPU state. Explicitly * load an initial state. * * We prefer to restore the state from the actual save * area in PCB instead of directly loading from * npx_initialstate, to ignite the XSAVEOPT * tracking engine. */ - bcopy(npx_initialstate, curpcb->pcb_save, cpu_max_ext_state_size); - fpurstor(curpcb->pcb_save); - if (curpcb->pcb_initial_npxcw != __INITIAL_NPXCW__) - fldcw(curpcb->pcb_initial_npxcw); - curpcb->pcb_flags |= PCB_NPXINITDONE; - if (PCB_USER_FPU(curpcb)) - curpcb->pcb_flags |= PCB_NPXUSERINITDONE; + bcopy(npx_initialstate, pcb->pcb_save, cpu_max_ext_state_size); + fpurstor(pcb->pcb_save); + if (pcb->pcb_initial_npxcw != __INITIAL_NPXCW__) + fldcw(pcb->pcb_initial_npxcw); + pcb->pcb_flags |= PCB_NPXINITDONE; + if (PCB_USER_FPU(pcb)) + pcb->pcb_flags |= PCB_NPXUSERINITDONE; } else { - fpurstor(curpcb->pcb_save); + fpurstor(pcb->pcb_save); } - critical_exit(); +} +/* + * Implement device not available (DNA) exception + * + * It would be better to switch FP context here (if curthread != fpcurthread) + * and not necessarily for every context switch, but it is too hard to + * access foreign pcb's. + */ +int +npxdna(void) +{ + struct thread *td; + + if (!hw_float) + return (0); + td = curthread; + critical_enter(); + if (__predict_false(PCPU_GET(fpcurthread) == td)) { + /* + * Some virtual machines seems to set %cr0.TS at + * arbitrary moments. Silently clear the TS bit + * regardless of the eager/lazy FPU context switch + * mode. + */ + stop_emulating(); + } else { + if (__predict_false(PCPU_GET(fpcurthread) != NULL)) { + printf( + "npxdna: fpcurthread = %p (%d), curthread = %p (%d)\n", + PCPU_GET(fpcurthread), + PCPU_GET(fpcurthread)->td_proc->p_pid, + td, td->td_proc->p_pid); + panic("npxdna"); + } + restore_npx_curthread(td, td->td_pcb); + } + critical_exit(); return (1); } /* * Wrapper for fpusave() called from context switch routines. * * npxsave() must be called with interrupts disabled, so that it clears * fpcurthread atomically with saving the state. We require callers to do the * disabling, since most callers need to disable interrupts anyway to call * npxsave() atomically with checking fpcurthread. */ void npxsave(addr) union savefpu *addr; { stop_emulating(); if (use_xsaveopt) xsaveopt((char *)addr, xsave_mask); else fpusave(addr); - start_emulating(); - PCPU_SET(fpcurthread, NULL); +} + +void npxswitch(struct thread *td, struct pcb *pcb); +void +npxswitch(struct thread *td, struct pcb *pcb) +{ + + if (lazy_fpu_switch || (td->td_pflags & TDP_KTHREAD) != 0 || + !PCB_USER_FPU(pcb)) { + start_emulating(); + PCPU_SET(fpcurthread, NULL); + } else if (PCPU_GET(fpcurthread) != td) { + restore_npx_curthread(td, pcb); + } } /* * Unconditionally save the current co-processor state across suspend and * resume. */ void npxsuspend(union savefpu *addr) { register_t cr0; if (!hw_float) return; if (PCPU_GET(fpcurthread) == NULL) { bcopy(npx_initialstate, addr, cpu_max_ext_state_size); return; } cr0 = rcr0(); stop_emulating(); fpusave(addr); load_cr0(cr0); } void npxresume(union savefpu *addr) { register_t cr0; if (!hw_float) return; cr0 = rcr0(); npxinit(false); stop_emulating(); fpurstor(addr); load_cr0(cr0); } void npxdrop(void) { struct thread *td; /* * Discard pending exceptions in the !cpu_fxsr case so that unmasked * ones don't cause a panic on the next frstor. */ if (!cpu_fxsr) fnclex(); td = PCPU_GET(fpcurthread); KASSERT(td == curthread, ("fpudrop: fpcurthread != curthread")); CRITICAL_ASSERT(td); PCPU_SET(fpcurthread, NULL); td->td_pcb->pcb_flags &= ~PCB_NPXINITDONE; start_emulating(); } /* * Get the user state of the FPU into pcb->pcb_user_save without * dropping ownership (if possible). It returns the FPU ownership * status. */ int npxgetregs(struct thread *td) { struct pcb *pcb; uint64_t *xstate_bv, bit; char *sa; int max_ext_n, i; int owned; if (!hw_float) return (_MC_FPOWNED_NONE); pcb = td->td_pcb; if ((pcb->pcb_flags & PCB_NPXINITDONE) == 0) { bcopy(npx_initialstate, get_pcb_user_save_pcb(pcb), cpu_max_ext_state_size); SET_FPU_CW(get_pcb_user_save_pcb(pcb), pcb->pcb_initial_npxcw); npxuserinited(td); return (_MC_FPOWNED_PCB); } critical_enter(); if (td == PCPU_GET(fpcurthread)) { fpusave(get_pcb_user_save_pcb(pcb)); if (!cpu_fxsr) /* * fnsave initializes the FPU and destroys whatever * context it contains. Make sure the FPU owner * starts with a clean state next time. */ npxdrop(); owned = _MC_FPOWNED_FPU; } else { owned = _MC_FPOWNED_PCB; } critical_exit(); if (use_xsave) { /* * Handle partially saved state. */ sa = (char *)get_pcb_user_save_pcb(pcb); xstate_bv = (uint64_t *)(sa + sizeof(union savefpu) + offsetof(struct xstate_hdr, xstate_bv)); if (xsave_mask >> 32 != 0) max_ext_n = fls(xsave_mask >> 32) + 32; else max_ext_n = fls(xsave_mask); for (i = 0; i < max_ext_n; i++) { bit = 1ULL << i; if ((xsave_mask & bit) == 0 || (*xstate_bv & bit) != 0) continue; bcopy((char *)npx_initialstate + xsave_area_desc[i].offset, sa + xsave_area_desc[i].offset, xsave_area_desc[i].size); *xstate_bv |= bit; } } return (owned); } void npxuserinited(struct thread *td) { struct pcb *pcb; pcb = td->td_pcb; if (PCB_USER_FPU(pcb)) pcb->pcb_flags |= PCB_NPXINITDONE; pcb->pcb_flags |= PCB_NPXUSERINITDONE; } int npxsetxstate(struct thread *td, char *xfpustate, size_t xfpustate_size) { struct xstate_hdr *hdr, *ehdr; size_t len, max_len; uint64_t bv; /* XXXKIB should we clear all extended state in xstate_bv instead ? */ if (xfpustate == NULL) return (0); if (!use_xsave) return (EOPNOTSUPP); len = xfpustate_size; if (len < sizeof(struct xstate_hdr)) return (EINVAL); max_len = cpu_max_ext_state_size - sizeof(union savefpu); if (len > max_len) return (EINVAL); ehdr = (struct xstate_hdr *)xfpustate; bv = ehdr->xstate_bv; /* * Avoid #gp. */ if (bv & ~xsave_mask) return (EINVAL); hdr = (struct xstate_hdr *)(get_pcb_user_save_td(td) + 1); hdr->xstate_bv = bv; bcopy(xfpustate + sizeof(struct xstate_hdr), (char *)(hdr + 1), len - sizeof(struct xstate_hdr)); return (0); } int npxsetregs(struct thread *td, union savefpu *addr, char *xfpustate, size_t xfpustate_size) { struct pcb *pcb; int error; if (!hw_float) return (ENXIO); pcb = td->td_pcb; critical_enter(); if (td == PCPU_GET(fpcurthread) && PCB_USER_FPU(pcb)) { error = npxsetxstate(td, xfpustate, xfpustate_size); if (error != 0) { critical_exit(); return (error); } if (!cpu_fxsr) fnclex(); /* As in npxdrop(). */ bcopy(addr, get_pcb_user_save_td(td), sizeof(*addr)); fpurstor(get_pcb_user_save_td(td)); critical_exit(); pcb->pcb_flags |= PCB_NPXUSERINITDONE | PCB_NPXINITDONE; } else { critical_exit(); error = npxsetxstate(td, xfpustate, xfpustate_size); if (error != 0) return (error); bcopy(addr, get_pcb_user_save_td(td), sizeof(*addr)); npxuserinited(td); } return (0); } static void fpusave(addr) union savefpu *addr; { if (use_xsave) xsave((char *)addr, xsave_mask); else if (cpu_fxsr) fxsave(addr); else fnsave(addr); } static void npx_fill_fpregs_xmm1(struct savexmm *sv_xmm, struct save87 *sv_87) { struct env87 *penv_87; struct envxmm *penv_xmm; int i; penv_87 = &sv_87->sv_env; penv_xmm = &sv_xmm->sv_env; /* FPU control/status */ penv_87->en_cw = penv_xmm->en_cw; penv_87->en_sw = penv_xmm->en_sw; penv_87->en_fip = penv_xmm->en_fip; penv_87->en_fcs = penv_xmm->en_fcs; penv_87->en_opcode = penv_xmm->en_opcode; penv_87->en_foo = penv_xmm->en_foo; penv_87->en_fos = penv_xmm->en_fos; /* FPU registers and tags */ penv_87->en_tw = 0xffff; for (i = 0; i < 8; ++i) { sv_87->sv_ac[i] = sv_xmm->sv_fp[i].fp_acc; if ((penv_xmm->en_tw & (1 << i)) != 0) /* zero and special are set as valid */ penv_87->en_tw &= ~(3 << i * 2); } } void npx_fill_fpregs_xmm(struct savexmm *sv_xmm, struct save87 *sv_87) { bzero(sv_87, sizeof(*sv_87)); npx_fill_fpregs_xmm1(sv_xmm, sv_87); } void npx_set_fpregs_xmm(struct save87 *sv_87, struct savexmm *sv_xmm) { struct env87 *penv_87; struct envxmm *penv_xmm; int i; penv_87 = &sv_87->sv_env; penv_xmm = &sv_xmm->sv_env; /* FPU control/status */ penv_xmm->en_cw = penv_87->en_cw; penv_xmm->en_sw = penv_87->en_sw; penv_xmm->en_fip = penv_87->en_fip; penv_xmm->en_fcs = penv_87->en_fcs; penv_xmm->en_opcode = penv_87->en_opcode; penv_xmm->en_foo = penv_87->en_foo; penv_xmm->en_fos = penv_87->en_fos; /* * FPU registers and tags. * Abridged / Full translation (values in binary), see FXSAVE spec. * 0 11 * 1 00, 01, 10 */ penv_xmm->en_tw = 0; for (i = 0; i < 8; ++i) { sv_xmm->sv_fp[i].fp_acc = sv_87->sv_ac[i]; if ((penv_87->en_tw & (3 << i * 2)) != (3 << i * 2)) penv_xmm->en_tw |= 1 << i; } } void npx_get_fsave(void *addr) { struct thread *td; union savefpu *sv; td = curthread; npxgetregs(td); sv = get_pcb_user_save_td(td); if (cpu_fxsr) npx_fill_fpregs_xmm1(&sv->sv_xmm, addr); else bcopy(sv, addr, sizeof(struct env87) + sizeof(struct fpacc87[8])); } int npx_set_fsave(void *addr) { union savefpu sv; int error; bzero(&sv, sizeof(sv)); if (cpu_fxsr) npx_set_fpregs_xmm(addr, &sv.sv_xmm); else bcopy(addr, &sv, sizeof(struct env87) + sizeof(struct fpacc87[8])); error = npxsetregs(curthread, &sv, NULL, 0); return (error); } /* * On AuthenticAMD processors, the fxrstor instruction does not restore * the x87's stored last instruction pointer, last data pointer, and last * opcode values, except in the rare case in which the exception summary * (ES) bit in the x87 status word is set to 1. * * In order to avoid leaking this information across processes, we clean * these values by performing a dummy load before executing fxrstor(). */ static void fpu_clean_state(void) { static float dummy_variable = 0.0; u_short status; /* * Clear the ES bit in the x87 status word if it is currently * set, in order to avoid causing a fault in the upcoming load. */ fnstsw(&status); if (status & 0x80) fnclex(); /* * Load the dummy variable into the x87 stack. This mangles * the x87 stack, but we don't care since we're about to call * fxrstor() anyway. */ __asm __volatile("ffree %%st(7); flds %0" : : "m" (dummy_variable)); } static void fpurstor(union savefpu *addr) { if (use_xsave) xrstor((char *)addr, xsave_mask); else if (cpu_fxsr) fxrstor(addr); else frstor(addr); } #ifdef DEV_ISA /* * This sucks up the legacy ISA support assignments from PNPBIOS/ACPI. */ static struct isa_pnp_id npxisa_ids[] = { { 0x040cd041, "Legacy ISA coprocessor support" }, /* PNP0C04 */ { 0 } }; static int npxisa_probe(device_t dev) { int result; if ((result = ISA_PNP_PROBE(device_get_parent(dev), dev, npxisa_ids)) <= 0) { device_quiet(dev); } return(result); } static int npxisa_attach(device_t dev) { return (0); } static device_method_t npxisa_methods[] = { /* Device interface */ DEVMETHOD(device_probe, npxisa_probe), DEVMETHOD(device_attach, npxisa_attach), DEVMETHOD(device_detach, bus_generic_detach), DEVMETHOD(device_shutdown, bus_generic_shutdown), DEVMETHOD(device_suspend, bus_generic_suspend), DEVMETHOD(device_resume, bus_generic_resume), { 0, 0 } }; static driver_t npxisa_driver = { "npxisa", npxisa_methods, 1, /* no softc */ }; static devclass_t npxisa_devclass; DRIVER_MODULE(npxisa, isa, npxisa_driver, npxisa_devclass, 0, 0); #ifndef PC98 DRIVER_MODULE(npxisa, acpi, npxisa_driver, npxisa_devclass, 0, 0); #endif #endif /* DEV_ISA */ static MALLOC_DEFINE(M_FPUKERN_CTX, "fpukern_ctx", "Kernel contexts for FPU state"); #define FPU_KERN_CTX_NPXINITDONE 0x01 #define FPU_KERN_CTX_DUMMY 0x02 #define FPU_KERN_CTX_INUSE 0x04 struct fpu_kern_ctx { union savefpu *prev; uint32_t flags; char hwstate1[]; }; struct fpu_kern_ctx * fpu_kern_alloc_ctx(u_int flags) { struct fpu_kern_ctx *res; size_t sz; sz = sizeof(struct fpu_kern_ctx) + XSAVE_AREA_ALIGN + cpu_max_ext_state_size; res = malloc(sz, M_FPUKERN_CTX, ((flags & FPU_KERN_NOWAIT) ? M_NOWAIT : M_WAITOK) | M_ZERO); return (res); } void fpu_kern_free_ctx(struct fpu_kern_ctx *ctx) { KASSERT((ctx->flags & FPU_KERN_CTX_INUSE) == 0, ("free'ing inuse ctx")); /* XXXKIB clear the memory ? */ free(ctx, M_FPUKERN_CTX); } static union savefpu * fpu_kern_ctx_savefpu(struct fpu_kern_ctx *ctx) { vm_offset_t p; p = (vm_offset_t)&ctx->hwstate1; p = roundup2(p, XSAVE_AREA_ALIGN); return ((union savefpu *)p); } int fpu_kern_enter(struct thread *td, struct fpu_kern_ctx *ctx, u_int flags) { struct pcb *pcb; KASSERT((ctx->flags & FPU_KERN_CTX_INUSE) == 0, ("using inuse ctx")); if ((flags & FPU_KERN_KTHR) != 0 && is_fpu_kern_thread(0)) { ctx->flags = FPU_KERN_CTX_DUMMY | FPU_KERN_CTX_INUSE; return (0); } pcb = td->td_pcb; KASSERT(!PCB_USER_FPU(pcb) || pcb->pcb_save == get_pcb_user_save_pcb(pcb), ("mangled pcb_save")); ctx->flags = FPU_KERN_CTX_INUSE; if ((pcb->pcb_flags & PCB_NPXINITDONE) != 0) ctx->flags |= FPU_KERN_CTX_NPXINITDONE; npxexit(td); ctx->prev = pcb->pcb_save; pcb->pcb_save = fpu_kern_ctx_savefpu(ctx); pcb->pcb_flags |= PCB_KERNNPX; pcb->pcb_flags &= ~PCB_NPXINITDONE; return (0); } int fpu_kern_leave(struct thread *td, struct fpu_kern_ctx *ctx) { struct pcb *pcb; KASSERT((ctx->flags & FPU_KERN_CTX_INUSE) != 0, ("leaving not inuse ctx")); ctx->flags &= ~FPU_KERN_CTX_INUSE; if (is_fpu_kern_thread(0) && (ctx->flags & FPU_KERN_CTX_DUMMY) != 0) return (0); pcb = td->td_pcb; critical_enter(); if (curthread == PCPU_GET(fpcurthread)) npxdrop(); critical_exit(); pcb->pcb_save = ctx->prev; if (pcb->pcb_save == get_pcb_user_save_pcb(pcb)) { if ((pcb->pcb_flags & PCB_NPXUSERINITDONE) != 0) pcb->pcb_flags |= PCB_NPXINITDONE; else pcb->pcb_flags &= ~PCB_NPXINITDONE; pcb->pcb_flags &= ~PCB_KERNNPX; } else { if ((ctx->flags & FPU_KERN_CTX_NPXINITDONE) != 0) pcb->pcb_flags |= PCB_NPXINITDONE; else pcb->pcb_flags &= ~PCB_NPXINITDONE; KASSERT(!PCB_USER_FPU(pcb), ("unpaired fpu_kern_leave")); } return (0); } int fpu_kern_thread(u_int flags) { KASSERT((curthread->td_pflags & TDP_KTHREAD) != 0, ("Only kthread may use fpu_kern_thread")); KASSERT(curpcb->pcb_save == get_pcb_user_save_pcb(curpcb), ("mangled pcb_save")); KASSERT(PCB_USER_FPU(curpcb), ("recursive call")); curpcb->pcb_flags |= PCB_KERNNPX; return (0); } int is_fpu_kern_thread(u_int flags) { if ((curthread->td_pflags & TDP_KTHREAD) == 0) return (0); return ((curpcb->pcb_flags & PCB_KERNNPX) != 0); } /* * FPU save area alloc/free/init utility routines */ union savefpu * fpu_save_area_alloc(void) { return (uma_zalloc(fpu_save_area_zone, 0)); } void fpu_save_area_free(union savefpu *fsa) { uma_zfree(fpu_save_area_zone, fsa); } void fpu_save_area_reset(union savefpu *fsa) { bcopy(npx_initialstate, fsa, cpu_max_ext_state_size); }