Index: releng/9.3/UPDATING =================================================================== --- releng/9.3/UPDATING (revision 287146) +++ releng/9.3/UPDATING (revision 287147) @@ -1,1833 +1,1844 @@ Updating Information for FreeBSD current users This file is maintained and copyrighted by M. Warner Losh . See end of file for further details. For commonly done items, please see the COMMON ITEMS: section later in the file. These instructions assume that you basically know what you are doing. If not, then please consult the FreeBSD handbook: http://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/makeworld.html Items affecting the ports and packages system can be found in /usr/ports/UPDATING. Please read that file before running portupgrade. +20150825: p24 FreeBSD-SA-15:21.amd64 + FreeBSD-SA-15:22.openssh + FreeBSD-EN-15:15.pkg + + Fix local privilege escalation in IRET handler. [SA-15:21] + + Fix OpenSSH multiple vulnerabilities. [SA-15:22] + + Fix insufficient check of unsupported pkg(7) signature methods. + [EN-15:15] + 20150818: p23 FreeBSD-SA-15:20.expat Fix multiple integer overflows in expat (libbsdxml) XML parser. [SA-15:20] 20150805: p22 FreeBSD-SA-15:19.routed Fix routed remote denial of service vulnerability. 20150728: p21 FreeBSD-SA-15:15.tcp FreeBSD-SA-15:16.openssh FreeBSD-SA-15:17.bind Fix resource exhaustion in TCP reassembly. [SA-15:15] Fix OpenSSH multiple vulnerabilities. [SA-15:16] Fix BIND remote denial of service vulnerability. [SA-15:17] 20150721: p20 FreeBSD-SA-15:13.tcp Fix resource exhaustion due to sessions stuck in LAST_ACK state. [SA-15:13] 20150707: p19 FreeBSD-SA-15:11.bind Fix BIND resolver remote denial of service when validating. 20150630: p18 FreeBSD-EN-15:08.sendmail [revised] FreeBSD-EN-15:09.xlocale Improvements to sendmail TLS/DH interoperability. [EN-15:08] Fix inconsistency between locale and rune locale states. [EN-15:09] 20150618: p17 FreeBSD-EN-15:08.sendmail Improvements to sendmail TLS/DH interoperability. [EN-15:08] 20150612: p16 FreeBSD-SA-15:10.openssl Fix multiple vulnerabilities in OpenSSL. [SA-15:10] 20150609: p15 FreeBSD-EN-15:06.file Updated base system file(1) to 5.22 to address multiple denial of service issues. 20150513: p14 FreeBSD-EN-15:04.freebsd-update Fix bug with freebsd-update(8) that does not ensure the previous upgrade was completed. [EN-15:04] 20150407: p13 FreeBSD-SA-15:04.igmp [revised] FreeBSD-SA-15:07.ntp FreeBSD-SA-15:09.ipv6 Improved patch for SA-15:04.igmp. Fix multiple vulnerabilities of ntp. [SA-15:07] Fix Denial of Service with IPv6 Router Advertisements. [SA-15:09] 20150320: p12 Fix patch for SA-15:06.openssl. 20150319: p11 FreeBSD-SA-15:06.openssl Fix multiple vulnerabilities in OpenSSL. [SA-15:06] 20150225: p10 FreeBSD-SA-15:04.igmp FreeBSD-SA-15:05.bind FreeBSD-EN-15:01.vt FreeBSD-EN-15:02.openssl FreeBSD-EN-15:03.freebsd-update Fix integer overflow in IGMP protocol. [SA-15:04] Fix BIND remote denial of service vulnerability. [SA-15:05] Fix vt(4) crash with improper ioctl parameters. [EN-15:01] Updated base system OpenSSL to 0.9.8zd. [EN-15:02] Fix freebsd-update libraries update ordering issue. [EN-15:03] 20150127: p9 FreeBSD-SA-15:02.kmem FreeBSD-SA-15:03.sctp Fix SCTP SCTP_SS_VALUE kernel memory corruption and disclosure vulnerability. [SA-15:02] Fix SCTP stream reset vulnerability. [SA-15:03] 20150114: p8 FreeBSD-SA-15:01.openssl Fix multiple vulnerabilities in OpenSSL. [SA-15:01] 20141223: p7 FreeBSD-SA-14:31.ntp FreeBSD-EN-14:13.freebsd-update Fix multiple vulnerabilities in NTP suite. [SA-14:31] Fix directory deletion issue in freebsd-update. [EN-14:13] 20141210: p6 FreeBSD-SA-14:28.file FreeBSD-SA-14:29.bind Fix multiple vulnerabilities in file(1) and libmagic(3). [SA-14:28] Fix BIND remote denial of service vulnerability. [SA-14:29] 20141104: p5 FreeBSD-SA-14:25.setlogin FreeBSD-SA-14:26.ftp FreeBSD-EN-14:12.zfs Fix kernel stack disclosure in setlogin(2) / getlogin(2). [SA-14:25] Fix remote command execution in ftp(1). [SA-14:26] Fix NFSv4 and ZFS cache consistency issue. [EN-14:12] 20141022: p4 FreeBSD-EN-14:10.tzdata FreeBSD-EN-14:11.crypt Time zone data file update. [EN-14:10] Change crypt(3) default hashing algorithm back to DES. [EN-14:11] 20141021: p3 FreeBSD-SA-14:20.rtsold FreeBSD-SA-14:21.routed FreeBSD-SA-14:22.namei FreeBSD-SA-14:23.openssl Fix rtsold(8) remote buffer overflow vulnerability. [SA-14:20] Fix routed(8) remote denial of service vulnerability. [SA-14:21] Fix memory leak in sandboxed namei lookup. [SA-14:22] Fix OpenSSL multiple vulnerabilities. [SA-14:23] 20140916: p2 FreeBSD-SA-14:19.tcp Fix Denial of Service in TCP packet processing. [SA-14:19] 20140909: p1 FreeBSD-SA-14:18.openssl Fix OpenSSL multiple vulnerabilities. [SA-14:18] 20140716: 9.3-RELEASE. 20140608: On i386 and amd64 systems, the onifconsole flag is now set by default in /etc/ttys for ttyu0. This causes ttyu0 to be automatically enabled as a login TTY if it is set in the bootloader as an active kernel console. No changes in behavior should result otherwise. To revert to the previous behavior, set ttyu0 to "off" in /etc/ttys. 20140512: Clang and llvm have been upgraded to 3.4.1 release. 20140321: Clang and llvm have been upgraded to 3.4 release. 20140216: The nve(4) driver for NVIDIA nForce MCP Ethernet adapters has been deprecated and will not be part of FreeBSD 11.0 and later releases. If you use this driver, please consider switching to the nfe(4) driver instead. 20131216: The behavior of gss_pseudo_random() for the krb5 mechanism has changed, for applications requesting a longer random string than produced by the underlying enctype's pseudo-random() function. In particular, the random string produced from a session key of enctype aes256-cts-hmac-sha1-96 or aes256-cts-hmac-sha1-96 will be different at the 17th octet and later, after this change. The counter used in the PRF+ construction is now encoded as a big-endian integer in accordance with RFC 4402. __FreeBSD_version is bumped to 902505. 20130930: 9.2-RELEASE. 20130823: Behavior of devfs rules path matching has been changed. Pattern is now always matched against fully qualified devfs path and slash characters must be explicitly matched by slashes in pattern (FNM_PATHNAME). Rulesets involving devfs subdirectories must be reviewed. 20130705: hastctl(8)'s `status' command output changed to terse one-liner format. Scripts using this should switch to `list' command or be rewritten. 20130618: Fix a bug that allowed a tracing process (e.g. gdb) to write to a memory-mapped file in the traced process's address space even if neither the traced process nor the tracing process had write access to that file. 20130605: Added ZFS TRIM support which is enabled by default. To disable ZFS TRIM support set vfs.zfs.trim.enabled=0 in loader.conf. Creating new ZFS pools and adding new devices to existing pools first performs a full device level TRIM which can take a significant amount of time. The sysctl vfs.zfs.vdev.trim_on_init can be set to 0 to disable this behaviour. ZFS TRIM requires the underlying device support BIO_DELETE which is currently provided by methods such as ATA TRIM and SCSI UNMAP via CAM, which are typically supported by SSD's. Stats for ZFS TRIM can be monitored by looking at the sysctl's under kstat.zfs.misc.zio_trim. 20130524: `list' command has been added to hastctl(8). For now, it is full equivalent of `status' command. WARNING: in the near future the output of hastctl's status command will change to more terse format. If you use `hastctl status' for parsing in your scripts, switch to `hastctl list'. 20130430: The mergemaster command now uses the default MAKEOBJDIRPREFIX rather than creating it's own in the temporary directory in order allow access to bootstrapped versions of tools such as install and mtree. When upgrading from version of FreeBSD where the install command does not support -l, you will need to install a new mergemaster command if mergemaster -p is required. This can be accomplished with the command (cd src/usr.sbin/mergemaster && make install). Due to the use of the new -l option to install(1) during build and install, you must take care not to directly set the INSTALL make variable in your /etc/make.conf, /etc/src.conf, or on the command line. If you wish to use the -C flag for all installs you may be able to add INSTALL+=-C to /etc/make.conf or /etc/src.conf. 20130429: Fix a bug that allows NFS clients to issue READDIR on files. 20130315: The install(1) option -M has changed meaning and now takes an argument that is a file or path to append logs to. In the unlikely event that -M was the last option on the command line and the command line contained at least two files and a target directory the first file will have logs appended to it. The -M option served little practical purpose in the last decade so it's used expected to be extremely rare. 20130225: A new compression method (lz4) has been merged to. Please refer to zpool-features(7) for more information. Please refer to the "ZFS notes" section of this file for information on upgrading boot ZFS pools. 20121224: The VFS KBI was changed with the merge of several nullfs optimizations and fixes. All filesystem modules must be recompiled. 20121218: With the addition of auditdistd(8), a new auditdistd user is now depended on during installworld. "mergemaster -p" can be used to add the user prior to installworld, as documented in the handbook. 20121205: 9.1-RELEASE. 20121129: A new version of ZFS (pool version 5000) has been merged to 9-STABLE. Starting with this version the old system of ZFS pool versioning is superseded by "feature flags". This concept enables forward compatibility against certain future changes in functionality of ZFS pools. The first two read-only compatible "feature flags" for ZFS pools are "com.delphix:async_destroy" and "com.delphix:empty_bpobj". For more information read the new zpool-features(7) manual page. Please refer to the "ZFS notes" section of this file for information on upgrading boot ZFS pools. 20121114: The commit introducing bsd.compiler.mk breaks the traditional building of kernels before this point. Add -m ${SRC}/share/mk (for the right value of SRC) to your command lines to work around; update your useland to a point after this; or use the buildkernel/installkernel top-level targets. See also 20120829. 20121102: The IPFIREWALL_FORWARD kernel option has been removed. Its functionality now turned on by default. 20120913: The random(4) support for the VIA hardware random number generator (`PADLOCK') is no longer enabled unconditionally. Add the PADLOCK_RNG option in the custom kernel config if needed. The GENERIC kernels on i386 and amd64 do include the option, so the change only affects the custom kernel configurations. 20120829: The amd64 kernel now uses xsetbv, xrstor instructions. To compile with the traditional method, you must update your system with an installworld before the kernel will build. The documented make buildkernel/installkernel interfaces (coupled with fresh make kernel-toolchain) continue to work. 20120727: The sparc64 ZFS loader has been changed to no longer try to auto- detect ZFS providers based on diskN aliases but now requires these to be explicitly listed in the OFW boot-device environment variable. 20120422: Now unix domain sockets behave "as expected" on nullfs(5). Previously nullfs(5) did not pass through all behaviours to the underlying layer, as a result if we bound to a socket on the lower layer we could connect only to the lower path; if we bound to the upper layer we could connect only to the upper path. The new behavior is one can connect to both the lower and the upper paths regardless what layer path one binds to. 20120109: The acpi_wmi(4) status device /dev/wmistat has been renamed to /dev/wmistat0. 20120106: A new VOP_ADVISE() was added to support posix_fadvise(2). All filesystem modules must be recompiled. 20120106: The interface of the VOP_VPTOCNP(9) changed, now the returned vnode shall be referenced, previously it was required to be only held. All in-tree filesystems are converted. 20120106: 9.0-RELEASE. 20111101: The broken amd(4) driver has been replaced with esp(4) in the amd64, i386 and pc98 GENERIC kernel configuration files. 20110913: This commit modifies vfs_register() so that it uses a hash calculation to set vfc_typenum, which is enabled by default. The first time a system is booted after this change, the vfc_typenum values will change for all file systems. The main effect of this is a change to the NFS server file handles for file systems that use vfc_typenum in their fsid, such as ZFS. It will, however, prevent vfc_typenum from changing when file systems are loaded in a different order for subsequent reboots. To disable this, you can set vfs.typenumhash=0 in /boot/loader.conf until you are ready to remount all NFS clients after a reboot. 20110828: Bump the shared library version numbers for libraries that do not use symbol versioning, have changed the ABI compared to stable/8 and which shared library version was not bumped. Done as part of 9.0-RELEASE cycle. 20110815: During the merge of Capsicum features, the fget(9) KPI was modified. This may require the rebuilding of out-of-tree device drivers -- issues have been reported specifically with the nVidia device driver. __FreeBSD_version is bumped to 900041. Also, there is a period between 20110811 and 20110814 where the special devices /dev/{stdin,stdout,stderr} did not work correctly. Building world from a kernel during that window may not work. 20110628: The packet filter (pf) code has been updated to OpenBSD 4.5. You need to update userland tools to be in sync with kernel. This update breaks backward compatibility with earlier pfsync(4) versions. Care must be taken when updating redundant firewall setups. 20110608: The following sysctls and tunables are retired on x86 platforms: machdep.hlt_cpus machdep.hlt_logical_cpus The following sysctl is retired: machdep.hyperthreading_allowed The sysctls were supposed to provide a way to dynamically offline and online selected CPUs on x86 platforms, but the implementation has not been reliable especially with SCHED_ULE scheduler. machdep.hyperthreading_allowed tunable is still available to ignore hyperthreading CPUs at OS level. Individual CPUs can be disabled using hint.lapic.X.disabled tunable, where X is an APIC ID of a CPU. Be advised, though, that disabling CPUs in non-uniform fashion will result in non-uniform topology and may lead to sub-optimal system performance with SCHED_ULE, which is a default scheduler. 20110607: cpumask_t type is retired and cpuset_t is used in order to describe a mask of CPUs. 20110531: Changes to ifconfig(8) for dynamic address family detection mandate that you are running a kernel of 20110525 or later. Make sure to follow the update procedure to boot a new kernel before installing world. 20110513: Support for sun4v architecture is officially dropped 20110503: Several KPI breaking changes have been committed to the mii(4) layer, the PHY drivers and consequently some Ethernet drivers using mii(4). This means that miibus.ko and the modules of the affected Ethernet drivers need to be recompiled. Note to kernel developers: Given that the OUI bit reversion problem was fixed as part of these changes all mii(4) commits related to OUIs, i.e. to sys/dev/mii/miidevs, PHY driver probing and vendor specific handling, no longer can be merged verbatim to stable/8 and previous branches. 20110430: Users of the Atheros AR71xx SoC code now need to add 'device ar71xx_pci' into their kernel configurations along with 'device pci'. 20110427: The default NFS client is now the new NFS client, so fstype "newnfs" is now "nfs" and the regular/old NFS client is now fstype "oldnfs". Although mounts via fstype "nfs" will usually work without userland changes, it is recommended that the mount(8) and mount_nfs(8) commands be rebuilt from sources and that a link to mount_nfs called mount_oldnfs be created. The new client is compiled into the kernel with "options NFSCL" and this is needed for diskless root file systems. The GENERIC kernel configs have been changed to use NFSCL and NFSD (the new server) instead of NFSCLIENT and NFSSERVER. To use the regular/old client, you can "mount -t oldnfs ...". For a diskless root file system, you must also include a line like: vfs.root.mountfrom="oldnfs:" in the boot/loader.conf on the root fs on the NFS server to make a diskless root fs use the old client. 20110424: The GENERIC kernels for all architectures now default to the new CAM-based ATA stack. It means that all legacy ATA drivers were removed and replaced by respective CAM drivers. If you are using ATA device names in /etc/fstab or other places, make sure to update them respectively (adX -> adaY, acdX -> cdY, afdX -> daY, astX -> saY, where 'Y's are the sequential numbers starting from zero for each type in order of detection, unless configured otherwise with tunables, see cam(4)). There will be symbolic links created in /dev/ to map old adX devices to the respective adaY. They should provide basic compatibility for file systems mounting in most cases, but they do not support old user-level APIs and do not have respective providers in GEOM. Consider using updated management tools with new device names. It is possible to load devices ahci, ata, siis and mvs as modules, but option ATA_CAM should remain in kernel configuration to make ata module work as CAM driver supporting legacy ATA controllers. Device ata still can be used in modular fashion (atacore + ...). Modules atadisk and atapi* are not used and won't affect operation in ATA_CAM mode. Note that to use CAM-based ATA kernel should include CAM devices scbus, pass, da (or explicitly ada), cd and optionally others. All of them are parts of the cam module. ataraid(4) functionality is now supported by the RAID GEOM class. To use it you can load geom_raid kernel module and use graid(8) tool for management. Instead of /dev/arX device names, use /dev/raid/rX. No kernel config options or code have been removed, so if a problem arises, please report it and optionally revert to the old ATA stack. In order to do it you can remove from the kernel config: options ATA_CAM device ahci device mvs device siis , and instead add back: device atadisk # ATA disk drives device ataraid # ATA RAID drives device atapicd # ATAPI CDROM drives device atapifd # ATAPI floppy drives device atapist # ATAPI tape drives 20110423: The default NFS server has been changed to the new server, which was referred to as the experimental server. If you need to switch back to the old NFS server, you must now put the "-o" option on both the mountd and nfsd commands. This can be done using the mountd_flags and nfs_server_flags rc.conf variables until an update to the rc scripts is committed, which is coming soon. 20110418: The GNU Objective-C runtime library (libobjc), and other Objective-C related components have been removed from the base system. If you require an Objective-C library, please use one of the available ports. 20110331: ath(4) has been split into bus- and device- modules. if_ath contains the HAL, the TX rate control and the network device code. if_ath_pci contains the PCI bus glue. For Atheros MIPS embedded systems, if_ath_ahb contains the AHB glue. Users need to load both if_ath_pci and if_ath in order to use ath on everything else. TO REPEAT: if_ath_ahb is not needed for normal users. Normal users only need to load if_ath and if_ath_pci for ath(4) operation. 20110314: As part of the replacement of sysinstall, the process of building release media has changed significantly. For details, please re-read release(7), which has been updated to reflect the new build process. 20110218: GNU binutils 2.17.50 (as of 2007-07-03) has been merged to -HEAD. This is the last available version under GPLv2. It brings a number of new features, such as support for newer x86 CPU's (with SSE-3, SSSE-3, SSE 4.1 and SSE 4.2), better support for powerpc64, a number of new directives, and lots of other small improvements. See the ChangeLog file in contrib/binutils for the full details. 20110218: IPsec's HMAC_SHA256-512 support has been fixed to be RFC4868 compliant, and will now use half of hash for authentication. This will break interoperability with all stacks (including all actual FreeBSD versions) who implement draft-ietf-ipsec-ciph-sha-256-00 (they use 96 bits of hash for authentication). The only workaround with such peers is to use another HMAC algorithm for IPsec ("phase 2") authentication. 20110207: Remove the uio_yield prototype and symbol. This function has been misnamed since it was introduced and should not be globally exposed with this name. The equivalent functionality is now available using kern_yield(curthread->td_user_pri). The function remains undocumented. 20110112: A SYSCTL_[ADD_]UQUAD was added for unsigned uint64_t pointers, symmetric with the existing SYSCTL_[ADD_]QUAD. Type checking for scalar sysctls is defined but disabled. Code that needs UQUAD to pass the type checking that must compile on older systems where the define is not present can check against __FreeBSD_version >= 900030. The system dialog(1) has been replaced with a new version previously in ports as devel/cdialog. dialog(1) is mostly command-line compatible with the previous version, but the libdialog associated with it has a largely incompatible API. As such, the original version of libdialog will be kept temporarily as libodialog, until its base system consumers are replaced or updated. Bump __FreeBSD_version to 900030. 20110103: If you are trying to run make universe on a -stable system, and you get the following warning: "Makefile", line 356: "Target architecture for i386/conf/GENERIC unknown. config(8) likely too old." or something similar to it, then you must upgrade your -stable system to 8.2-Release or newer (really, any time after r210146 7/15/2010 in stable/8) or build the config from the latest stable/8 branch and install it on your system. Prior to this date, building a current universe on 8-stable system from between 7/15/2010 and 1/2/2011 would result in a weird shell parsing error in the first kernel build phase. A new config on those old systems will fix that problem for older versions of -current. 20101228: The TCP stack has been modified to allow Khelp modules to interact with it via helper hook points and store per-connection data in the TCP control block. Bump __FreeBSD_version to 900029. User space tools that rely on the size of struct tcpcb in tcp_var.h (e.g. sockstat) need to be recompiled. 20101114: Generic IEEE 802.3 annex 31B full duplex flow control support has been added to mii(4) and bge(4), bce(4), msk(4), nfe(4) and stge(4) along with brgphy(4), e1000phy(4) as well as ip1000phy() have been converted to take advantage of it instead of using custom implementations. This means that these drivers now no longer unconditionally advertise support for flow control but only do so if flow control is a selected media option. This was implemented in the generic support that way in order to allow flow control to be switched on and off via ifconfig(8) with the PHY specific default to typically off in order to protect from unwanted effects. Consequently, if you used flow control with one of the above mentioned drivers you now need to explicitly enable it, for example via: ifconfig bge0 media auto mediaopt flowcontrol Along with the above mentioned changes generic support for setting 1000baseT master mode also has been added and brgphy(4), ciphy(4), e1000phy(4) as well as ip1000phy(4) have been converted to take advantage of it. This means that these drivers now no longer take the link0 parameter for selecting master mode but the master media option has to be used instead, for example like in the following: ifconfig bge0 media 1000baseT mediaopt full-duplex,master Selection of master mode now is also available with all other PHY drivers supporting 1000baseT. 20101111: The TCP stack has received a significant update to add support for modularised congestion control and generally improve the clarity of congestion control decisions. Bump __FreeBSD_version to 900025. User space tools that rely on the size of struct tcpcb in tcp_var.h (e.g. sockstat) need to be recompiled. 20101002: The man(1) utility has been replaced by a new version that no longer uses /etc/manpath.config. Please consult man.conf(5) for how to migrate local entries to the new format. 20100928: The copyright strings printed by login(1) and sshd(8) at the time of a new connection have been removed to follow other operating systems and upstream sshd. 20100915: A workaround for a fixed ld bug has been removed in kernel code, so make sure that your system ld is built from sources after revision 210245 from 2010-07-19 (r211583 if building head kernel on stable/8, r211584 for stable/7; both from 2010-08-21). A symptom of incorrect ld version is different addresses for set_pcpu section and __start_set_pcpu symbol in kernel and/or modules. 20100913: The $ipv6_prefer variable in rc.conf(5) has been split into $ip6addrctl_policy and $ipv6_activate_all_interfaces. The $ip6addrctl_policy is a variable to choose a pre-defined address selection policy set by ip6addrctl(8). A value "ipv4_prefer", "ipv6_prefer" or "AUTO" can be specified. The default is "AUTO". The $ipv6_activate_all_interfaces specifies whether IFDISABLED flag (see an entry of 20090926) is set on an interface with no corresponding $ifconfig_IF_ipv6 line. The default is "NO" for security reason. If you want IPv6 link-local address on all interfaces by default, set this to "YES". The old ipv6_prefer="YES" is equivalent to ipv6_activate_all_interfaces="YES" and ip6addrctl_policy="ipv6_prefer". 20100913: DTrace has grown support for userland tracing. Due to this, DTrace is now i386 and amd64 only. dtruss(1) is now installed by default on those systems and a new kernel module is needed for userland tracing: fasttrap. No changes to your kernel config file are necessary to enable userland tracing, but you might consider adding 'STRIP=' and 'CFLAGS+=-fno-omit-frame-pointer' to your make.conf if you want to have informative userland stack traces in DTrace (ustack). 20100725: The acpi_aiboost(4) driver has been removed in favor of the new aibs(4) driver. You should update your kernel configuration file. 20100722: BSD grep has been imported to the base system and it is built by default. It is completely BSD licensed, highly GNU-compatible, uses less memory than its GNU counterpart and has a small codebase. However, it is slower than its GNU counterpart, which is mostly noticeable for larger searches, for smaller ones it is measurable but not significant. The reason is complex, the most important factor is that we lack a modern and efficient regex library and GNU overcomes this by optimizing the searches internally. Future work on improving the regex performance is planned, for the meantime, users that need better performance, can build GNU grep instead by setting the WITH_GNU_GREP knob. 20100713: Due to the import of powerpc64 support, all existing powerpc kernel configuration files must be updated with a machine directive like this: machine powerpc powerpc In addition, an updated config(8) is required to build powerpc kernels after this change. 20100713: A new version of ZFS (version 15) has been merged to -HEAD. This version uses a python library for the following subcommands: zfs allow, zfs unallow, zfs groupspace, zfs userspace. For full functionality of these commands the following port must be installed: sysutils/py-zfs 20100429: 'vm_page's are now hashed by physical address to an array of mutexes. Currently this is only used to serialize access to hold_count. Over time the page queue mutex will be peeled away. This changes the size of pmap on every architecture. And requires all callers of vm_page_hold and vm_page_unhold to be updated. 20100402: WITH_CTF can now be specified in src.conf (not recommended, there are some problems with static executables), make.conf (would also affect ports which do not use GNU make and do not override the compile targets) or in the kernel config (via "makeoptions WITH_CTF=yes"). When WITH_CTF was specified there before this was silently ignored, so make sure that WITH_CTF is not used in places which could lead to unwanted behavior. 20100311: The kernel option COMPAT_IA32 has been replaced with COMPAT_FREEBSD32 to allow 32-bit compatibility on non-x86 platforms. All kernel configurations on amd64 and ia64 platforms using these options must be modified accordingly. 20100113: The utmp user accounting database has been replaced with utmpx, the user accounting interface standardized by POSIX. Unfortunately the semantics of utmp and utmpx don't match, making it practically impossible to support both interfaces. The user accounting database is used by tools like finger(1), last(1), talk(1), w(1) and ac(8). All applications in the base system use utmpx. This means only local binaries (e.g. from the ports tree) may still use these utmp database files. These applications must be rebuilt to make use of utmpx. After the system has been upgraded, it is safe to remove the old log files (/var/run/utmp, /var/log/lastlog and /var/log/wtmp*), assuming their contents is of no importance anymore. Old wtmp databases can only be used by last(1) and ac(8) after they have been converted to the new format using wtmpcvt(1). 20100108: Introduce the kernel thread "deadlock resolver" (which can be enabled via the DEADLKRES option, see NOTES for more details) and the sleepq_type() function for sleepqueues. 20091202: The rc.firewall and rc.firewall6 were unified, and rc.firewall6 and rc.d/ip6fw were removed. According to the removal of rc.d/ip6fw, ipv6_firewall_* rc variables are obsoleted. Instead, the following new rc variables are added to rc.d/ipfw: firewall_client_net_ipv6, firewall_simple_iif_ipv6, firewall_simple_inet_ipv6, firewall_simple_oif_ipv6, firewall_simple_onet_ipv6, firewall_trusted_ipv6 The meanings correspond to the relevant IPv4 variables. 20091125: 8.0-RELEASE. 20091113: The default terminal emulation for syscons(4) has been changed from cons25 to xterm on all platforms except pc98. This means that the /etc/ttys file needs to be updated to ensure correct operation of applications on the console. The terminal emulation style can be toggled per window by using vidcontrol(1)'s -T flag. The TEKEN_CONS25 kernel configuration options can be used to change the compile-time default back to cons25. To prevent graphical artifacts, make sure the TERM environment variable is set to match the terminal emulation that is being performed by syscons(4). 20091109: The layout of the structure ieee80211req_scan_result has changed. Applications that require wireless scan results (e.g. ifconfig(8)) from net80211 need to be recompiled. Applications such as wpa_supplicant(8) may require a full world build without using NO_CLEAN in order to get synchronized with the new structure. 20091025: The iwn(4) driver has been updated to support the 5000 and 5150 series. There's one kernel module for each firmware. Adding "device iwnfw" to the kernel configuration file means including all three firmware images inside the kernel. If you want to include just the one for your wireless card, use the devices iwn4965fw, iwn5000fw or iwn5150fw. 20090926: The rc.d/network_ipv6, IPv6 configuration script has been integrated into rc.d/netif. The changes are the following: 1. To use IPv6, simply define $ifconfig_IF_ipv6 like $ifconfig_IF for IPv4. For aliases, $ifconfig_IF_aliasN should be used. Note that both variables need the "inet6" keyword at the head. Do not set $ipv6_network_interfaces manually if you do not understand what you are doing. It is not needed in most cases. $ipv6_ifconfig_IF and $ipv6_ifconfig_IF_aliasN still work, but they are obsolete. 2. $ipv6_enable is obsolete. Use $ipv6_prefer and "inet6 accept_rtadv" keyword in ifconfig(8) instead. If you define $ipv6_enable=YES, it means $ipv6_prefer=YES and all configured interfaces have "inet6 accept_rtadv" in the $ifconfig_IF_ipv6. These are for backward compatibility. 3. A new variable $ipv6_prefer has been added. If NO, IPv6 functionality of interfaces with no corresponding $ifconfig_IF_ipv6 is disabled by using "inet6 ifdisabled" flag, and the default address selection policy of ip6addrctl(8) is the IPv4-preferred one (see rc.d/ip6addrctl for more details). Note that if you want to configure IPv6 functionality on the disabled interfaces after boot, first you need to clear the flag by using ifconfig(8) like: ifconfig em0 inet6 -ifdisabled If YES, the default address selection policy is set as IPv6-preferred. The default value of $ipv6_prefer is NO. 4. If your system need to receive Router Advertisement messages, define "inet6 accept_rtadv" in $ifconfig_IF_ipv6. The rc(8) scripts automatically invoke rtsol(8) when the interface becomes UP. The Router Advertisement messages are used for SLAAC (State-Less Address AutoConfiguration). 20090922: 802.11s D3.03 support was committed. This is incompatible with the previous code, which was based on D3.0. 20090912: A sysctl variable net.inet6.ip6.accept_rtadv now sets the default value of a per-interface flag ND6_IFF_ACCEPT_RTADV, not a global knob to control whether accepting Router Advertisement messages or not. Also, a per-interface flag ND6_IFF_AUTO_LINKLOCAL has been added and a sysctl variable net.inet6.ip6.auto_linklocal is its default value. The ifconfig(8) utility now supports these flags. 20090910: ZFS snapshots are now mounted with MNT_IGNORE flag. Use -v option for mount(8) and -a option for df(1) to see them. 20090825: The old tunable hw.bus.devctl_disable has been superseded by hw.bus.devctl_queue. hw.bus.devctl_disable=1 in loader.conf should be replaced by hw.bus.devctl_queue=0. The default for this new tunable is 1000. 20090813: Remove the option STOP_NMI. The default action is now to use NMI only for KDB via the newly introduced function stop_cpus_hard() and maintain stop_cpus() to just use a normal IPI_STOP on ia32 and amd64. 20090803: The stable/8 branch created in subversion. This corresponds to the RELENG_8 branch in CVS. 20090719: Bump the shared library version numbers for all libraries that do not use symbol versioning as part of the 8.0-RELEASE cycle. Bump __FreeBSD_version to 800105. 20090714: Due to changes in the implementation of virtual network stack support, all network-related kernel modules must be recompiled. As this change breaks the ABI, bump __FreeBSD_version to 800104. 20090713: The TOE interface to the TCP syncache has been modified to remove struct tcpopt () from the ABI of the network stack. The cxgb driver is the only TOE consumer affected by this change, and needs to be recompiled along with the kernel. As this change breaks the ABI, bump __FreeBSD_version to 800103. 20090712: Padding has been added to struct tcpcb, sackhint and tcpstat in to facilitate future MFCs and bug fixes whilst maintaining the ABI. However, this change breaks the ABI, so bump __FreeBSD_version to 800102. User space tools that rely on the size of any of these structs (e.g. sockstat) need to be recompiled. 20090630: The NFS_LEGACYRPC option has been removed along with the old kernel RPC implementation that this option selected. Kernel configurations may need to be adjusted. 20090629: The network interface device nodes at /dev/net/ have been removed. All ioctl operations can be performed the normal way using routing sockets. The kqueue functionality can generally be replaced with routing sockets. 20090628: The documentation from the FreeBSD Documentation Project (Handbook, FAQ, etc.) is now installed via packages by sysinstall(8) and under the /usr/local/share/doc/freebsd directory instead of /usr/share/doc. 20090624: The ABI of various structures related to the SYSV IPC API have been changed. As a result, the COMPAT_FREEBSD[456] and COMPAT_43 kernel options now all require COMPAT_FREEBSD7. Bump __FreeBSD_version to 800100. 20090622: Layout of struct vnet has changed as routing related variables were moved to their own Vimage module. Modules need to be recompiled. Bump __FreeBSD_version to 800099. 20090619: NGROUPS_MAX and NGROUPS have been increased from 16 to 1023 and 1024 respectively. As long as no more than 16 groups per process are used, no changes should be visible. When more than 16 groups are used, old binaries may fail if they call getgroups() or getgrouplist() with statically sized storage. Recompiling will work around this, but applications should be modified to use dynamically allocated storage for group arrays as POSIX.1-2008 does not cap an implementation's number of supported groups at NGROUPS_MAX+1 as previous versions did. NFS and portalfs mounts may also be affected as the list of groups is truncated to 16. Users of NFS who use more than 16 groups, should take care that negative group permissions are not used on the exported file systems as they will not be reliable unless a GSSAPI based authentication method is used. 20090616: The compiling option ADAPTIVE_LOCKMGRS has been introduced. This option compiles in the support for adaptive spinning for lockmgrs which want to enable it. The lockinit() function now accepts the flag LK_ADAPTIVE in order to make the lock object subject to adaptive spinning when both held in write and read mode. 20090613: The layout of the structure returned by IEEE80211_IOC_STA_INFO has changed. User applications that use this ioctl need to be rebuilt. 20090611: The layout of struct thread has changed. Kernel and modules need to be rebuilt. 20090608: The layout of structs ifnet, domain, protosw and vnet_net has changed. Kernel modules need to be rebuilt. Bump __FreeBSD_version to 800097. 20090602: window(1) has been removed from the base system. It can now be installed from ports. The port is called misc/window. 20090601: The way we are storing and accessing `routing table' entries has changed. Programs reading the FIB, like netstat, need to be re-compiled. 20090601: A new netisr implementation has been added for FreeBSD 8. Network file system modules, such as igmp, ipdivert, and others, should be rebuilt. Bump __FreeBSD_version to 800096. 20090530: Remove the tunable/sysctl debug.mpsafevfs as its initial purpose is no more valid. 20090530: Add VOP_ACCESSX(9). File system modules need to be rebuilt. Bump __FreeBSD_version to 800094. 20090529: Add mnt_xflag field to 'struct mount'. File system modules need to be rebuilt. Bump __FreeBSD_version to 800093. 20090528: The compiling option ADAPTIVE_SX has been retired while it has been introduced the option NO_ADAPTIVE_SX which handles the reversed logic. The KPI for sx_init_flags() changes as accepting flags: SX_ADAPTIVESPIN flag has been retired while the SX_NOADAPTIVE flag has been introduced in order to handle the reversed logic. Bump __FreeBSD_version to 800092. 20090527: Add support for hierarchical jails. Remove global securelevel. Bump __FreeBSD_version to 800091. 20090523: The layout of struct vnet_net has changed, therefore modules need to be rebuilt. Bump __FreeBSD_version to 800090. 20090523: The newly imported zic(8) produces a new format in the output. Please run tzsetup(8) to install the newly created data to /etc/localtime. 20090520: The sysctl tree for the usb stack has renamed from hw.usb2.* to hw.usb.* and is now consistent again with previous releases. 20090520: 802.11 monitor mode support was revised and driver api's were changed. Drivers dependent on net80211 now support DLT_IEEE802_11_RADIO instead of DLT_IEEE802_11. No user-visible data structures were changed but applications that use DLT_IEEE802_11 may require changes. Bump __FreeBSD_version to 800088. 20090430: The layout of the following structs has changed: sysctl_oid, socket, ifnet, inpcbinfo, tcpcb, syncache_head, vnet_inet, vnet_inet6 and vnet_ipfw. Most modules need to be rebuild or panics may be experienced. World rebuild is required for correctly checking networking state from userland. Bump __FreeBSD_version to 800085. 20090429: MLDv2 and Source-Specific Multicast (SSM) have been merged to the IPv6 stack. VIMAGE hooks are in but not yet used. The implementation of SSM within FreeBSD's IPv6 stack closely follows the IPv4 implementation. For kernel developers: * The most important changes are that the ip6_output() and ip6_input() paths no longer take the IN6_MULTI_LOCK, and this lock has been downgraded to a non-recursive mutex. * As with the changes to the IPv4 stack to support SSM, filtering of inbound multicast traffic must now be performed by transport protocols within the IPv6 stack. This does not apply to TCP and SCTP, however, it does apply to UDP in IPv6 and raw IPv6. * The KPIs used by IPv6 multicast are similar to those used by the IPv4 stack, with the following differences: * im6o_mc_filter() is analogous to imo_multicast_filter(). * The legacy KAME entry points in6_joingroup and in6_leavegroup() are shimmed to in6_mc_join() and in6_mc_leave() respectively. * IN6_LOOKUP_MULTI() has been deprecated and removed. * IPv6 relies on MLD for the DAD mechanism. KAME's internal KPIs for MLDv1 have an additional 'timer' argument which is used to jitter the initial membership report for the solicited-node multicast membership on-link. * This is not strictly needed for MLDv2, which already jitters its report transmissions. However, the 'timer' argument is preserved in case MLDv1 is active on the interface. * The KAME linked-list based IPv6 membership implementation has been refactored to use a vector similar to that used by the IPv4 stack. Code which maintains a list of its own multicast memberships internally, e.g. carp, has been updated to reflect the new semantics. * There is a known Lock Order Reversal (LOR) due to in6_setscope() acquiring the IF_AFDATA_LOCK and being called within ip6_output(). Whilst MLDv2 tries to avoid this otherwise benign LOR, it is an implementation constraint which needs to be addressed in HEAD. For application developers: * The changes are broadly similar to those made for the IPv4 stack. * The use of IPv4 and IPv6 multicast socket options on the same socket, using mapped addresses, HAS NOT been tested or supported. * There are a number of issues with the implementation of various IPv6 multicast APIs which need to be resolved in the API surface before the implementation is fully compatible with KAME userland use, and these are mostly to do with interface index treatment. * The literature available discusses the use of either the delta / ASM API with setsockopt(2)/getsockopt(2), or the full-state / ASM API using setsourcefilter(3)/getsourcefilter(3). For more information please refer to RFC 3768, 'Socket Interface Extensions for Multicast Source Filters'. * Applications which use the published RFC 3678 APIs should be fine. For systems administrators: * The mtest(8) utility has been refactored to support IPv6, in addition to IPv4. Interface addresses are no longer accepted as arguments, their names must be used instead. The utility will map the interface name to its first IPv4 address as returned by getifaddrs(3). * The ifmcstat(8) utility has also been updated to print the MLDv2 endpoint state and source filter lists via sysctl(3). * The net.inet6.ip6.mcast.loop sysctl may be tuned to 0 to disable loopback of IPv6 multicast datagrams by default; it defaults to 1 to preserve the existing behaviour. Disabling multicast loopback is recommended for optimal system performance. * The IPv6 MROUTING code has been changed to examine this sysctl instead of attempting to perform a group lookup before looping back forwarded datagrams. Bump __FreeBSD_version to 800084. 20090422: Implement low-level Bluetooth HCI API. Bump __FreeBSD_version to 800083. 20090419: The layout of struct malloc_type, used by modules to register new memory allocation types, has changed. Most modules will need to be rebuilt or panics may be experienced. Bump __FreeBSD_version to 800081. 20090415: Anticipate overflowing inp_flags - add inp_flags2. This changes most offsets in inpcb, so checking v4 connection state will require a world rebuild. Bump __FreeBSD_version to 800080. 20090415: Add an llentry to struct route and struct route_in6. Modules embedding a struct route will need to be recompiled. Bump __FreeBSD_version to 800079. 20090414: The size of rt_metrics_lite and by extension rtentry has changed. Networking administration apps will need to be recompiled. The route command now supports show as an alias for get, weighting of routes, sticky and nostick flags to alter the behavior of stateful load balancing. Bump __FreeBSD_version to 800078. 20090408: Do not use Giant for kbdmux(4) locking. This is wrong and apparently causing more problems than it solves. This will re-open the issue where interrupt handlers may race with kbdmux(4) in polling mode. Typical symptoms include (but not limited to) duplicated and/or missing characters when low level console functions (such as gets) are used while interrupts are enabled (for example geli password prompt, mountroot prompt etc.). Disabling kbdmux(4) may help. 20090407: The size of structs vnet_net, vnet_inet and vnet_ipfw has changed; kernel modules referencing any of the above need to be recompiled. Bump __FreeBSD_version to 800075. 20090320: GEOM_PART has become the default partition slicer for storage devices, replacing GEOM_MBR, GEOM_BSD, GEOM_PC98 and GEOM_GPT slicers. It introduces some changes: MSDOS/EBR: the devices created from MSDOS extended partition entries (EBR) can be named differently than with GEOM_MBR and are now symlinks to devices with offset-based names. fstabs may need to be modified. BSD: the "geometry does not match label" warning is harmless in most cases but it points to problems in file system misalignment with disk geometry. The "c" partition is now implicit, covers the whole top-level drive and cannot be (mis)used by users. General: Kernel dumps are now not allowed to be written to devices whose partition types indicate they are meant to be used for file systems (or, in case of MSDOS partitions, as something else than the "386BSD" type). Most of these changes date approximately from 200812. 20090319: The uscanner(4) driver has been removed from the kernel. This follows Linux removing theirs in 2.6 and making libusb the default interface (supported by sane). 20090319: The multicast forwarding code has been cleaned up. netstat(1) only relies on KVM now for printing bandwidth upcall meters. The IPv4 and IPv6 modules are split into ip_mroute_mod and ip6_mroute_mod respectively. The config(5) options for statically compiling this code remain the same, i.e. 'options MROUTING'. 20090315: Support for the IFF_NEEDSGIANT network interface flag has been removed, which means that non-MPSAFE network device drivers are no longer supported. In particular, if_ar, if_sr, and network device drivers from the old (legacy) USB stack can no longer be built or used. 20090313: POSIX.1 Native Language Support (NLS) has been enabled in libc and a bunch of new language catalog files have also been added. This means that some common libc messages are now localized and they depend on the LC_MESSAGES environmental variable. 20090313: The k8temp(4) driver has been renamed to amdtemp(4) since support for Family 10 and Family 11 CPU families was added. 20090309: IGMPv3 and Source-Specific Multicast (SSM) have been merged to the IPv4 stack. VIMAGE hooks are in but not yet used. For kernel developers, the most important changes are that the ip_output() and ip_input() paths no longer take the IN_MULTI_LOCK(), and this lock has been downgraded to a non-recursive mutex. Transport protocols (UDP, Raw IP) are now responsible for filtering inbound multicast traffic according to group membership and source filters. The imo_multicast_filter() KPI exists for this purpose. Transports which do not use multicast (SCTP, TCP) already reject multicast by default. Forwarding and receive performance may improve as a mutex acquisition is no longer needed in the ip_input() low-level input path. in_addmulti() and in_delmulti() are shimmed to new KPIs which exist to support SSM in-kernel. For application developers, it is recommended that loopback of multicast datagrams be disabled for best performance, as this will still cause the lock to be taken for each looped-back datagram transmission. The net.inet.ip.mcast.loop sysctl may be tuned to 0 to disable loopback by default; it defaults to 1 to preserve the existing behaviour. For systems administrators, to obtain best performance with multicast reception and multiple groups, it is always recommended that a card with a suitably precise hash filter is used. Hash collisions will still result in the lock being taken within the transport protocol input path to check group membership. If deploying FreeBSD in an environment with IGMP snooping switches, it is recommended that the net.inet.igmp.sendlocal sysctl remain enabled; this forces 224.0.0.0/24 group membership to be announced via IGMP. The size of 'struct igmpstat' has changed; netstat needs to be recompiled to reflect this. Bump __FreeBSD_version to 800070. 20090309: libusb20.so.1 is now installed as libusb.so.1 and the ports system updated to use it. This requires a buildworld/installworld in order to update the library and dependencies (usbconfig, etc). Its advisable to rebuild all ports which uses libusb. More specific directions are given in the ports collection UPDATING file. Any /etc/libmap.conf entries for libusb are no longer required and can be removed. 20090302: A workaround is committed to allow the creation of System V shared memory segment of size > 2 GB on the 64-bit architectures. Due to a limitation of the existing ABI, the shm_segsz member of the struct shmid_ds, returned by shmctl(IPC_STAT) call is wrong for large segments. Note that limits must be explicitly raised to allow such segments to be created. 20090301: The layout of struct ifnet has changed, requiring a rebuild of all network device driver modules. 20090227: The /dev handling for the new USB stack has changed, a buildworld/installworld is required for libusb20. 20090223: The new USB2 stack has now been permanently moved in and all kernel and module names reverted to their previous values (eg, usb, ehci, ohci, ums, ...). The old usb stack can be compiled in by prefixing the name with the letter 'o', the old usb modules have been removed. Updating entry 20090216 for xorg and 20090215 for libmap may still apply. 20090217: The rc.conf(5) option if_up_delay has been renamed to defaultroute_delay to better reflect its purpose. If you have customized this setting in /etc/rc.conf you need to update it to use the new name. 20090216: xorg 7.4 wants to configure its input devices via hald which does not yet work with USB2. If the keyboard/mouse does not work in xorg then add Option "AllowEmptyInput" "off" to your ServerLayout section. This will cause X to use the configured kbd and mouse sections from your xorg.conf. 20090215: The GENERIC kernels for all architectures now default to the new USB2 stack. No kernel config options or code have been removed so if a problem arises please report it and optionally revert to the old USB stack. If you are loading USB kernel modules or have a custom kernel that includes GENERIC then ensure that usb names are also changed over, eg uftdi -> usb2_serial_ftdi. Older programs linked against the ports libusb 0.1 need to be redirected to the new stack's libusb20. /etc/libmap.conf can be used for this: # Map old usb library to new one for usb2 stack libusb-0.1.so.8 libusb20.so.1 20090209: All USB ethernet devices now attach as interfaces under the name ueN (eg. ue0). This is to provide a predictable name as vendors often change usb chipsets in a product without notice. 20090203: The ichsmb(4) driver has been changed to require SMBus slave addresses be left-justified (xxxxxxx0b) rather than right-justified. All of the other SMBus controller drivers require left-justified slave addresses, so this change makes all the drivers provide the same interface. 20090201: INET6 statistics (struct ip6stat) was updated. netstat(1) needs to be recompiled. 20090119: NTFS has been removed from GENERIC kernel on amd64 to match GENERIC on i386. Should not cause any issues since mount_ntfs(8) will load ntfs.ko module automatically when NTFS support is actually needed, unless ntfs.ko is not installed or security level prohibits loading kernel modules. If either is the case, "options NTFS" has to be added into kernel config. 20090115: TCP Appropriate Byte Counting (RFC 3465) support added to kernel. New field in struct tcpcb breaks ABI, so bump __FreeBSD_version to 800061. User space tools that rely on the size of struct tcpcb in tcp_var.h (e.g. sockstat) need to be recompiled. 20081225: ng_tty(4) module updated to match the new TTY subsystem. Due to API change, user-level applications must be updated. New API support added to mpd5 CVS and expected to be present in next mpd5.3 release. 20081219: With __FreeBSD_version 800060 the makefs tool is part of the base system (it was a port). 20081216: The afdata and ifnet locks have been changed from mutexes to rwlocks, network modules will need to be re-compiled. 20081214: __FreeBSD_version 800059 incorporates the new arp-v2 rewrite. RTF_CLONING, RTF_LLINFO and RTF_WASCLONED flags are eliminated. The new code reduced struct rtentry{} by 16 bytes on 32-bit architecture and 40 bytes on 64-bit architecture. The userland applications "arp" and "ndp" have been updated accordingly. The output from "netstat -r" shows only routing entries and none of the L2 information. 20081130: __FreeBSD_version 800057 marks the switchover from the binary ath hal to source code. Users must add the line: options AH_SUPPORT_AR5416 to their kernel config files when specifying: device ath_hal The ath_hal module no longer exists; the code is now compiled together with the driver in the ath module. It is now possible to tailor chip support (i.e. reduce the set of chips and thereby the code size); consult ath_hal(4) for details. 20081121: __FreeBSD_version 800054 adds memory barriers to , new interfaces to ifnet to facilitate multiple hardware transmit queues for cards that support them, and a lock-less ring-buffer implementation to enable drivers to more efficiently manage queueing of packets. 20081117: A new version of ZFS (version 13) has been merged to -HEAD. This version has zpool attribute "listsnapshots" off by default, which means "zfs list" does not show snapshots, and is the same as Solaris behavior. 20081028: dummynet(4) ABI has changed. ipfw(8) needs to be recompiled. 20081009: The uhci, ohci, ehci and slhci USB Host controller drivers have been put into separate modules. If you load the usb module separately through loader.conf you will need to load the appropriate *hci module as well. E.g. for a UHCI-based USB 2.0 controller add the following to loader.conf: uhci_load="YES" ehci_load="YES" 20081009: The ABI used by the PMC toolset has changed. Please keep userland (libpmc(3)) and the kernel module (hwpmc(4)) in sync. 20081009: atapci kernel module now includes only generic PCI ATA driver. AHCI driver moved to ataahci kernel module. All vendor-specific code moved into separate kernel modules: ataacard, ataacerlabs, ataadaptec, ataamd, ataati, atacenatek, atacypress, atacyrix, atahighpoint, ataintel, ataite, atajmicron, atamarvell, atamicron, atanational, atanetcell, atanvidia, atapromise, ataserverworks, atasiliconimage, atasis, atavia 20080820: The TTY subsystem of the kernel has been replaced by a new implementation, which provides better scalability and an improved driver model. Most common drivers have been migrated to the new TTY subsystem, while others have not. The following drivers have not yet been ported to the new TTY layer: PCI/ISA: cy, digi, rc, rp, sio USB: ubser, ucycom Line disciplines: ng_h4, ng_tty, ppp, sl, snp Adding these drivers to your kernel configuration file shall cause compilation to fail. 20080818: ntpd has been upgraded to 4.2.4p5. 20080801: OpenSSH has been upgraded to 5.1p1. For many years, FreeBSD's version of OpenSSH preferred DSA over RSA for host and user authentication keys. With this upgrade, we've switched to the vendor's default of RSA over DSA. This may cause upgraded clients to warn about unknown host keys even for previously known hosts. Users should follow the usual procedure for verifying host keys before accepting the RSA key. This can be circumvented by setting the "HostKeyAlgorithms" option to "ssh-dss,ssh-rsa" in ~/.ssh/config or on the ssh command line. Please note that the sequence of keys offered for authentication has been changed as well. You may want to specify IdentityFile in a different order to revert this behavior. 20080713: The sio(4) driver has been removed from the i386 and amd64 kernel configuration files. This means uart(4) is now the default serial port driver on those platforms as well. To prevent collisions with the sio(4) driver, the uart(4) driver uses different names for its device nodes. This means the onboard serial port will now most likely be called "ttyu0" instead of "ttyd0". You may need to reconfigure applications to use the new device names. When using the serial port as a boot console, be sure to update /boot/device.hints and /etc/ttys before booting the new kernel. If you forget to do so, you can still manually specify the hints at the loader prompt: set hint.uart.0.at="isa" set hint.uart.0.port="0x3F8" set hint.uart.0.flags="0x10" set hint.uart.0.irq="4" boot -s 20080609: The gpt(8) utility has been removed. Use gpart(8) to partition disks instead. 20080603: The version that Linuxulator emulates was changed from 2.4.2 to 2.6.16. If you experience any problems with Linux binaries please try to set sysctl compat.linux.osrelease to 2.4.2 and if it fixes the problem contact emulation mailing list. 20080525: ISDN4BSD (I4B) was removed from the src tree. You may need to update a your kernel configuration and remove relevant entries. 20080509: I have checked in code to support multiple routing tables. See the man pages setfib(1) and setfib(2). This is a hopefully backwards compatible version, but to make use of it you need to compile your kernel with options ROUTETABLES=2 (or more up to 16). 20080420: The 802.11 wireless support was redone to enable multi-bss operation on devices that are capable. The underlying device is no longer used directly but instead wlanX devices are cloned with ifconfig. This requires changes to rc.conf files. For example, change: ifconfig_ath0="WPA DHCP" to wlans_ath0=wlan0 ifconfig_wlan0="WPA DHCP" see rc.conf(5) for more details. In addition, mergemaster of /etc/rc.d is highly recommended. Simultaneous update of userland and kernel wouldn't hurt either. As part of the multi-bss changes the wlan_scan_ap and wlan_scan_sta modules were merged into the base wlan module. All references to these modules (e.g. in kernel config files) must be removed. 20080408: psm(4) has gained write(2) support in native operation level. Arbitrary commands can be written to /dev/psm%d and status can be read back from it. Therefore, an application is responsible for status validation and error recovery. It is a no-op in other operation levels. 20080312: Support for KSE threading has been removed from the kernel. To run legacy applications linked against KSE libmap.conf may be used. The following libmap.conf may be used to ensure compatibility with any prior release: libpthread.so.1 libthr.so.1 libpthread.so.2 libthr.so.2 libkse.so.3 libthr.so.3 20080301: The layout of struct vmspace has changed. This affects libkvm and any executables that link against libkvm and use the kvm_getprocs() function. In particular, but not exclusively, it affects ps(1), fstat(1), pkill(1), systat(1), top(1) and w(1). The effects are minimal, but it's advisable to upgrade world nonetheless. 20080229: The latest em driver no longer has support in it for the 82575 adapter, this is now moved to the igb driver. The split was done to make new features that are incompatible with older hardware easier to do. 20080220: The new geom_lvm(4) geom class has been renamed to geom_linux_lvm(4), likewise the kernel option is now GEOM_LINUX_LVM. 20080211: The default NFS mount mode has changed from UDP to TCP for increased reliability. If you rely on (insecurely) NFS mounting across a firewall you may need to update your firewall rules. 20080208: Belatedly note the addition of m_collapse for compacting mbuf chains. 20080126: The fts(3) structures have been changed to use adequate integer types for their members and so to be able to cope with huge file trees. The old fts(3) ABI is preserved through symbol versioning in libc, so third-party binaries using fts(3) should still work, although they will not take advantage of the extended types. At the same time, some third-party software might fail to build after this change due to unportable assumptions made in its source code about fts(3) structure members. Such software should be fixed by its vendor or, in the worst case, in the ports tree. FreeBSD_version 800015 marks this change for the unlikely case that a portable fix is impossible. 20080123: To upgrade to -current after this date, you must be running FreeBSD not older than 6.0-RELEASE. Upgrading to -current from 5.x now requires a stop over at RELENG_6 or RELENG_7 systems. 20071128: The ADAPTIVE_GIANT kernel option has been retired because its functionality is the default now. 20071118: The AT keyboard emulation of sunkbd(4) has been turned on by default. In order to make the special symbols of the Sun keyboards driven by sunkbd(4) work under X these now have to be configured the same way as Sun USB keyboards driven by ukbd(4) (which also does AT keyboard emulation), f.e.: Option "XkbLayout" "us" Option "XkbRules" "xorg" Option "XkbSymbols" "pc(pc105)+sun_vndr/usb(sun_usb)+us" 20071024: It has been decided that it is desirable to provide ABI backwards compatibility to the FreeBSD 4/5/6 versions of the PCIOCGETCONF, PCIOCREAD and PCIOCWRITE IOCTLs, which was broken with the introduction of PCI domain support (see the 20070930 entry). Unfortunately, this required the ABI of PCIOCGETCONF to be broken again in order to be able to provide backwards compatibility to the old version of that IOCTL. Thus consumers of PCIOCGETCONF have to be recompiled again. As for prominent ports this affects neither pciutils nor xorg-server this time, the hal port needs to be rebuilt however. 20071020: The misnamed kthread_create() and friends have been renamed to kproc_create() etc. Many of the callers already used kproc_start().. I will return kthread_create() and friends in a while with implementations that actually create threads, not procs. Renaming corresponds with version 800002. 20071010: RELENG_7 branched. COMMON ITEMS: General Notes ------------- Avoid using make -j when upgrading. While generally safe, there are sometimes problems using -j to upgrade. If your upgrade fails with -j, please try again without -j. From time to time in the past there have been problems using -j with buildworld and/or installworld. This is especially true when upgrading between "distant" versions (eg one that cross a major release boundary or several minor releases, or when several months have passed on the -current branch). Sometimes, obscure build problems are the result of environment poisoning. This can happen because the make utility reads its environment when searching for values for global variables. To run your build attempts in an "environmental clean room", prefix all make commands with 'env -i '. See the env(1) manual page for more details. When upgrading from one major version to another it is generally best to upgrade to the latest code in the currently installed branch first, then do an upgrade to the new branch. This is the best-tested upgrade path, and has the highest probability of being successful. Please try this approach before reporting problems with a major version upgrade. ZFS notes --------- When upgrading the boot ZFS pool to a new version, always follow these two steps: 1.) recompile and reinstall the ZFS boot loader and boot block (this is part of "make buildworld" and "make installworld") 2.) update the ZFS boot block on your boot drive The following example updates the ZFS boot block on the first partition (freebsd-boot) of a GPT partitioned drive ad0: "gpart bootcode -p /boot/gptzfsboot -i 1 ad0" Non-boot pools do not need these updates. To build a kernel ----------------- If you are updating from a prior version of FreeBSD (even one just a few days old), you should follow this procedure. It is the most failsafe as it uses a /usr/obj tree with a fresh mini-buildworld, make kernel-toolchain make -DALWAYS_CHECK_MAKE buildkernel KERNCONF=YOUR_KERNEL_HERE make -DALWAYS_CHECK_MAKE installkernel KERNCONF=YOUR_KERNEL_HERE To test a kernel once --------------------- If you just want to boot a kernel once (because you are not sure if it works, or if you want to boot a known bad kernel to provide debugging information) run make installkernel KERNCONF=YOUR_KERNEL_HERE KODIR=/boot/testkernel nextboot -k testkernel To just build a kernel when you know that it won't mess you up -------------------------------------------------------------- This assumes you are already running a CURRENT system. Replace ${arch} with the architecture of your machine (e.g. "i386", "arm", "amd64", "ia64", "pc98", "sparc64", "powerpc", "mips", etc). cd src/sys/${arch}/conf config KERNEL_NAME_HERE cd ../compile/KERNEL_NAME_HERE make depend make make install If this fails, go to the "To build a kernel" section. To rebuild everything and install it on the current system. ----------------------------------------------------------- # Note: sometimes if you are running current you gotta do more than # is listed here if you are upgrading from a really old current. make buildworld make kernel KERNCONF=YOUR_KERNEL_HERE [1] [3] mergemaster -p [5] make installworld mergemaster -i [4] make delete-old [6] To cross-install current onto a separate partition -------------------------------------------------- # In this approach we use a separate partition to hold # current's root, 'usr', and 'var' directories. A partition # holding "/", "/usr" and "/var" should be about 2GB in # size. make buildworld make buildkernel KERNCONF=YOUR_KERNEL_HERE make installworld DESTDIR=${CURRENT_ROOT} make distribution DESTDIR=${CURRENT_ROOT} # if newfs'd make installkernel KERNCONF=YOUR_KERNEL_HERE DESTDIR=${CURRENT_ROOT} cp /etc/fstab ${CURRENT_ROOT}/etc/fstab # if newfs'd To upgrade in-place from stable to current ---------------------------------------------- make buildworld [9] make kernel KERNCONF=YOUR_KERNEL_HERE [8] [1] [3] mergemaster -p [5] make installworld mergemaster -i [4] make delete-old [6] Make sure that you've read the UPDATING file to understand the tweaks to various things you need. At this point in the life cycle of current, things change often and you are on your own to cope. The defaults can also change, so please read ALL of the UPDATING entries. Also, if you are tracking -current, you must be subscribed to freebsd-current@freebsd.org. Make sure that before you update your sources that you have read and understood all the recent messages there. If in doubt, please track -stable which has much fewer pitfalls. [1] If you have third party modules, such as vmware, you should disable them at this point so they don't crash your system on reboot. [3] From the bootblocks, boot -s, and then do fsck -p mount -u / mount -a cd src adjkerntz -i # if CMOS is wall time Also, when doing a major release upgrade, it is required that you boot into single user mode to do the installworld. [4] Note: This step is non-optional. Failure to do this step can result in a significant reduction in the functionality of the system. Attempting to do it by hand is not recommended and those that pursue this avenue should read this file carefully, as well as the archives of freebsd-current and freebsd-hackers mailing lists for potential gotchas. The -U option is also useful to consider. See mergemaster(8) for more information. [5] Usually this step is a noop. However, from time to time you may need to do this if you get unknown user in the following step. It never hurts to do it all the time. You may need to install a new mergemaster (cd src/usr.sbin/mergemaster && make install) after the buildworld before this step if you last updated from [78]-stable or 9-stable before 20130430. [6] This only deletes old files and directories. Old libraries can be deleted by "make delete-old-libs", but you have to make sure that no program is using those libraries anymore. [8] In order to have a kernel that can run the 4.x binaries needed to do an installworld, you must include the COMPAT_FREEBSD4 option in your kernel. Failure to do so may leave you with a system that is hard to boot to recover. A similar kernel option COMPAT_FREEBSD5 is required to run the 5.x binaries on more recent kernels. And so on for COMPAT_FREEBSD6 and COMPAT_FREEBSD7. Make sure that you merge any new devices from GENERIC since the last time you updated your kernel config file. [9] When checking out sources, you must include the -P flag to have cvs prune empty directories. If CPUTYPE is defined in your /etc/make.conf, make sure to use the "?=" instead of the "=" assignment operator, so that buildworld can override the CPUTYPE if it needs to. MAKEOBJDIRPREFIX must be defined in an environment variable, and not on the command line, or in /etc/make.conf. buildworld will warn if it is improperly defined. FORMAT: This file contains a list, in reverse chronological order, of major breakages in tracking -current. It is not guaranteed to be a complete list of such breakages, and only contains entries since October 10, 2007. If you need to see UPDATING entries from before that date, you will need to fetch an UPDATING file from an older FreeBSD release. Copyright information: Copyright 1998-2009 M. Warner Losh. All Rights Reserved. Redistribution, publication, translation and use, with or without modification, in full or in part, in any form or format of this document are permitted without further permission from the author. THIS DOCUMENT IS PROVIDED BY WARNER LOSH ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL WARNER LOSH BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. Contact Warner Losh if you have any questions about your use of this document. $FreeBSD$ Index: releng/9.3/crypto/openssh/monitor.c =================================================================== --- releng/9.3/crypto/openssh/monitor.c (revision 287146) +++ releng/9.3/crypto/openssh/monitor.c (revision 287147) @@ -1,2139 +1,2139 @@ /* $OpenBSD: monitor.c,v 1.131 2014/02/02 03:44:31 djm Exp $ */ /* * Copyright 2002 Niels Provos * Copyright 2002 Markus Friedl * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. */ #include "includes.h" #include #include #include #include "openbsd-compat/sys-tree.h" #include #include #include #ifdef HAVE_PATHS_H #include #endif #include #include #include #include #include #include #ifdef HAVE_POLL_H #include #else # ifdef HAVE_SYS_POLL_H # include # endif #endif #ifdef SKEY #include #endif #include #include "openbsd-compat/sys-queue.h" #include "atomicio.h" #include "xmalloc.h" #include "ssh.h" #include "key.h" #include "buffer.h" #include "hostfile.h" #include "auth.h" #include "cipher.h" #include "kex.h" #include "dh.h" #ifdef TARGET_OS_MAC /* XXX Broken krb5 headers on Mac */ #undef TARGET_OS_MAC #include "zlib.h" #define TARGET_OS_MAC 1 #else #include "zlib.h" #endif #include "packet.h" #include "auth-options.h" #include "sshpty.h" #include "channels.h" #include "session.h" #include "sshlogin.h" #include "canohost.h" #include "log.h" #include "servconf.h" #include "monitor.h" #include "monitor_mm.h" #ifdef GSSAPI #include "ssh-gss.h" #endif #include "monitor_wrap.h" #include "monitor_fdpass.h" #include "misc.h" #include "compat.h" #include "ssh2.h" #include "roaming.h" #include "authfd.h" #ifdef GSSAPI static Gssctxt *gsscontext = NULL; #endif /* Imports */ extern ServerOptions options; extern u_int utmp_len; extern Newkeys *current_keys[]; extern z_stream incoming_stream; extern z_stream outgoing_stream; extern u_char session_id[]; extern Buffer auth_debug; extern int auth_debug_init; extern Buffer loginmsg; /* State exported from the child */ struct { z_stream incoming; z_stream outgoing; u_char *keyin; u_int keyinlen; u_char *keyout; u_int keyoutlen; u_char *ivin; u_int ivinlen; u_char *ivout; u_int ivoutlen; u_char *ssh1key; u_int ssh1keylen; int ssh1cipher; int ssh1protoflags; u_char *input; u_int ilen; u_char *output; u_int olen; u_int64_t sent_bytes; u_int64_t recv_bytes; } child_state; /* Functions on the monitor that answer unprivileged requests */ int mm_answer_moduli(int, Buffer *); int mm_answer_sign(int, Buffer *); int mm_answer_pwnamallow(int, Buffer *); int mm_answer_auth2_read_banner(int, Buffer *); int mm_answer_authserv(int, Buffer *); int mm_answer_authpassword(int, Buffer *); int mm_answer_bsdauthquery(int, Buffer *); int mm_answer_bsdauthrespond(int, Buffer *); int mm_answer_skeyquery(int, Buffer *); int mm_answer_skeyrespond(int, Buffer *); int mm_answer_keyallowed(int, Buffer *); int mm_answer_keyverify(int, Buffer *); int mm_answer_pty(int, Buffer *); int mm_answer_pty_cleanup(int, Buffer *); int mm_answer_term(int, Buffer *); int mm_answer_rsa_keyallowed(int, Buffer *); int mm_answer_rsa_challenge(int, Buffer *); int mm_answer_rsa_response(int, Buffer *); int mm_answer_sesskey(int, Buffer *); int mm_answer_sessid(int, Buffer *); #ifdef USE_PAM int mm_answer_pam_start(int, Buffer *); int mm_answer_pam_account(int, Buffer *); int mm_answer_pam_init_ctx(int, Buffer *); int mm_answer_pam_query(int, Buffer *); int mm_answer_pam_respond(int, Buffer *); int mm_answer_pam_free_ctx(int, Buffer *); #endif #ifdef GSSAPI int mm_answer_gss_setup_ctx(int, Buffer *); int mm_answer_gss_accept_ctx(int, Buffer *); int mm_answer_gss_userok(int, Buffer *); int mm_answer_gss_checkmic(int, Buffer *); #endif #ifdef SSH_AUDIT_EVENTS int mm_answer_audit_event(int, Buffer *); int mm_answer_audit_command(int, Buffer *); #endif static int monitor_read_log(struct monitor *); static Authctxt *authctxt; static BIGNUM *ssh1_challenge = NULL; /* used for ssh1 rsa auth */ /* local state for key verify */ static u_char *key_blob = NULL; static u_int key_bloblen = 0; static int key_blobtype = MM_NOKEY; static char *hostbased_cuser = NULL; static char *hostbased_chost = NULL; static char *auth_method = "unknown"; static char *auth_submethod = NULL; static u_int session_id2_len = 0; static u_char *session_id2 = NULL; static pid_t monitor_child_pid; struct mon_table { enum monitor_reqtype type; int flags; int (*f)(int, Buffer *); }; #define MON_ISAUTH 0x0004 /* Required for Authentication */ #define MON_AUTHDECIDE 0x0008 /* Decides Authentication */ #define MON_ONCE 0x0010 /* Disable after calling */ #define MON_ALOG 0x0020 /* Log auth attempt without authenticating */ #define MON_AUTH (MON_ISAUTH|MON_AUTHDECIDE) #define MON_PERMIT 0x1000 /* Request is permitted */ struct mon_table mon_dispatch_proto20[] = { {MONITOR_REQ_MODULI, MON_ONCE, mm_answer_moduli}, {MONITOR_REQ_SIGN, MON_ONCE, mm_answer_sign}, {MONITOR_REQ_PWNAM, MON_ONCE, mm_answer_pwnamallow}, {MONITOR_REQ_AUTHSERV, MON_ONCE, mm_answer_authserv}, {MONITOR_REQ_AUTH2_READ_BANNER, MON_ONCE, mm_answer_auth2_read_banner}, {MONITOR_REQ_AUTHPASSWORD, MON_AUTH, mm_answer_authpassword}, #ifdef USE_PAM {MONITOR_REQ_PAM_START, MON_ONCE, mm_answer_pam_start}, {MONITOR_REQ_PAM_ACCOUNT, 0, mm_answer_pam_account}, {MONITOR_REQ_PAM_INIT_CTX, MON_ISAUTH, mm_answer_pam_init_ctx}, {MONITOR_REQ_PAM_QUERY, MON_ISAUTH, mm_answer_pam_query}, {MONITOR_REQ_PAM_RESPOND, MON_ISAUTH, mm_answer_pam_respond}, {MONITOR_REQ_PAM_FREE_CTX, MON_ONCE|MON_AUTHDECIDE, mm_answer_pam_free_ctx}, #endif #ifdef SSH_AUDIT_EVENTS {MONITOR_REQ_AUDIT_EVENT, MON_PERMIT, mm_answer_audit_event}, #endif #ifdef BSD_AUTH {MONITOR_REQ_BSDAUTHQUERY, MON_ISAUTH, mm_answer_bsdauthquery}, {MONITOR_REQ_BSDAUTHRESPOND, MON_AUTH, mm_answer_bsdauthrespond}, #endif #ifdef SKEY {MONITOR_REQ_SKEYQUERY, MON_ISAUTH, mm_answer_skeyquery}, {MONITOR_REQ_SKEYRESPOND, MON_AUTH, mm_answer_skeyrespond}, #endif {MONITOR_REQ_KEYALLOWED, MON_ISAUTH, mm_answer_keyallowed}, {MONITOR_REQ_KEYVERIFY, MON_AUTH, mm_answer_keyverify}, #ifdef GSSAPI {MONITOR_REQ_GSSSETUP, MON_ISAUTH, mm_answer_gss_setup_ctx}, {MONITOR_REQ_GSSSTEP, MON_ISAUTH, mm_answer_gss_accept_ctx}, {MONITOR_REQ_GSSUSEROK, MON_AUTH, mm_answer_gss_userok}, {MONITOR_REQ_GSSCHECKMIC, MON_ISAUTH, mm_answer_gss_checkmic}, #endif {0, 0, NULL} }; struct mon_table mon_dispatch_postauth20[] = { {MONITOR_REQ_MODULI, 0, mm_answer_moduli}, {MONITOR_REQ_SIGN, 0, mm_answer_sign}, {MONITOR_REQ_PTY, 0, mm_answer_pty}, {MONITOR_REQ_PTYCLEANUP, 0, mm_answer_pty_cleanup}, {MONITOR_REQ_TERM, 0, mm_answer_term}, #ifdef SSH_AUDIT_EVENTS {MONITOR_REQ_AUDIT_EVENT, MON_PERMIT, mm_answer_audit_event}, {MONITOR_REQ_AUDIT_COMMAND, MON_PERMIT, mm_answer_audit_command}, #endif {0, 0, NULL} }; struct mon_table mon_dispatch_proto15[] = { {MONITOR_REQ_PWNAM, MON_ONCE, mm_answer_pwnamallow}, {MONITOR_REQ_SESSKEY, MON_ONCE, mm_answer_sesskey}, {MONITOR_REQ_SESSID, MON_ONCE, mm_answer_sessid}, {MONITOR_REQ_AUTHPASSWORD, MON_AUTH, mm_answer_authpassword}, {MONITOR_REQ_RSAKEYALLOWED, MON_ISAUTH|MON_ALOG, mm_answer_rsa_keyallowed}, {MONITOR_REQ_KEYALLOWED, MON_ISAUTH|MON_ALOG, mm_answer_keyallowed}, {MONITOR_REQ_RSACHALLENGE, MON_ONCE, mm_answer_rsa_challenge}, {MONITOR_REQ_RSARESPONSE, MON_ONCE|MON_AUTHDECIDE, mm_answer_rsa_response}, #ifdef BSD_AUTH {MONITOR_REQ_BSDAUTHQUERY, MON_ISAUTH, mm_answer_bsdauthquery}, {MONITOR_REQ_BSDAUTHRESPOND, MON_AUTH, mm_answer_bsdauthrespond}, #endif #ifdef SKEY {MONITOR_REQ_SKEYQUERY, MON_ISAUTH, mm_answer_skeyquery}, {MONITOR_REQ_SKEYRESPOND, MON_AUTH, mm_answer_skeyrespond}, #endif #ifdef USE_PAM {MONITOR_REQ_PAM_START, MON_ONCE, mm_answer_pam_start}, {MONITOR_REQ_PAM_ACCOUNT, 0, mm_answer_pam_account}, {MONITOR_REQ_PAM_INIT_CTX, MON_ISAUTH, mm_answer_pam_init_ctx}, {MONITOR_REQ_PAM_QUERY, MON_ISAUTH, mm_answer_pam_query}, {MONITOR_REQ_PAM_RESPOND, MON_ISAUTH, mm_answer_pam_respond}, {MONITOR_REQ_PAM_FREE_CTX, MON_ONCE|MON_AUTHDECIDE, mm_answer_pam_free_ctx}, #endif #ifdef SSH_AUDIT_EVENTS {MONITOR_REQ_AUDIT_EVENT, MON_PERMIT, mm_answer_audit_event}, #endif {0, 0, NULL} }; struct mon_table mon_dispatch_postauth15[] = { {MONITOR_REQ_PTY, MON_ONCE, mm_answer_pty}, {MONITOR_REQ_PTYCLEANUP, MON_ONCE, mm_answer_pty_cleanup}, {MONITOR_REQ_TERM, 0, mm_answer_term}, #ifdef SSH_AUDIT_EVENTS {MONITOR_REQ_AUDIT_EVENT, MON_PERMIT, mm_answer_audit_event}, {MONITOR_REQ_AUDIT_COMMAND, MON_PERMIT|MON_ONCE, mm_answer_audit_command}, #endif {0, 0, NULL} }; struct mon_table *mon_dispatch; /* Specifies if a certain message is allowed at the moment */ static void monitor_permit(struct mon_table *ent, enum monitor_reqtype type, int permit) { while (ent->f != NULL) { if (ent->type == type) { ent->flags &= ~MON_PERMIT; ent->flags |= permit ? MON_PERMIT : 0; return; } ent++; } } static void monitor_permit_authentications(int permit) { struct mon_table *ent = mon_dispatch; while (ent->f != NULL) { if (ent->flags & MON_AUTH) { ent->flags &= ~MON_PERMIT; ent->flags |= permit ? MON_PERMIT : 0; } ent++; } } void monitor_child_preauth(Authctxt *_authctxt, struct monitor *pmonitor) { struct mon_table *ent; int authenticated = 0, partial = 0; debug3("preauth child monitor started"); close(pmonitor->m_recvfd); close(pmonitor->m_log_sendfd); pmonitor->m_log_sendfd = pmonitor->m_recvfd = -1; authctxt = _authctxt; memset(authctxt, 0, sizeof(*authctxt)); authctxt->loginmsg = &loginmsg; if (compat20) { mon_dispatch = mon_dispatch_proto20; /* Permit requests for moduli and signatures */ monitor_permit(mon_dispatch, MONITOR_REQ_MODULI, 1); monitor_permit(mon_dispatch, MONITOR_REQ_SIGN, 1); } else { mon_dispatch = mon_dispatch_proto15; monitor_permit(mon_dispatch, MONITOR_REQ_SESSKEY, 1); } /* The first few requests do not require asynchronous access */ while (!authenticated) { partial = 0; auth_method = "unknown"; auth_submethod = NULL; authenticated = (monitor_read(pmonitor, mon_dispatch, &ent) == 1); /* Special handling for multiple required authentications */ if (options.num_auth_methods != 0) { if (!compat20) fatal("AuthenticationMethods is not supported" "with SSH protocol 1"); if (authenticated && !auth2_update_methods_lists(authctxt, auth_method, auth_submethod)) { debug3("%s: method %s: partial", __func__, auth_method); authenticated = 0; partial = 1; } } if (authenticated) { if (!(ent->flags & MON_AUTHDECIDE)) fatal("%s: unexpected authentication from %d", __func__, ent->type); if (authctxt->pw->pw_uid == 0 && !auth_root_allowed(auth_method)) authenticated = 0; #ifdef USE_PAM /* PAM needs to perform account checks after auth */ if (options.use_pam && authenticated) { Buffer m; buffer_init(&m); mm_request_receive_expect(pmonitor->m_sendfd, MONITOR_REQ_PAM_ACCOUNT, &m); authenticated = mm_answer_pam_account(pmonitor->m_sendfd, &m); buffer_free(&m); } #endif } if (ent->flags & (MON_AUTHDECIDE|MON_ALOG)) { auth_log(authctxt, authenticated, partial, auth_method, auth_submethod); if (!authenticated) authctxt->failures++; } } if (!authctxt->valid) fatal("%s: authenticated invalid user", __func__); if (strcmp(auth_method, "unknown") == 0) fatal("%s: authentication method name unknown", __func__); debug("%s: %s has been authenticated by privileged process", __func__, authctxt->user); mm_get_keystate(pmonitor); /* Drain any buffered messages from the child */ while (pmonitor->m_log_recvfd != -1 && monitor_read_log(pmonitor) == 0) ; close(pmonitor->m_sendfd); close(pmonitor->m_log_recvfd); pmonitor->m_sendfd = pmonitor->m_log_recvfd = -1; } static void monitor_set_child_handler(pid_t pid) { monitor_child_pid = pid; } static void monitor_child_handler(int sig) { kill(monitor_child_pid, sig); } void monitor_child_postauth(struct monitor *pmonitor) { close(pmonitor->m_recvfd); pmonitor->m_recvfd = -1; monitor_set_child_handler(pmonitor->m_pid); signal(SIGHUP, &monitor_child_handler); signal(SIGTERM, &monitor_child_handler); signal(SIGINT, &monitor_child_handler); if (compat20) { mon_dispatch = mon_dispatch_postauth20; /* Permit requests for moduli and signatures */ monitor_permit(mon_dispatch, MONITOR_REQ_MODULI, 1); monitor_permit(mon_dispatch, MONITOR_REQ_SIGN, 1); monitor_permit(mon_dispatch, MONITOR_REQ_TERM, 1); } else { mon_dispatch = mon_dispatch_postauth15; monitor_permit(mon_dispatch, MONITOR_REQ_TERM, 1); } if (!no_pty_flag) { monitor_permit(mon_dispatch, MONITOR_REQ_PTY, 1); monitor_permit(mon_dispatch, MONITOR_REQ_PTYCLEANUP, 1); } for (;;) monitor_read(pmonitor, mon_dispatch, NULL); } void monitor_sync(struct monitor *pmonitor) { if (options.compression) { /* The member allocation is not visible, so sync it */ mm_share_sync(&pmonitor->m_zlib, &pmonitor->m_zback); } } static int monitor_read_log(struct monitor *pmonitor) { Buffer logmsg; u_int len, level; char *msg; buffer_init(&logmsg); /* Read length */ buffer_append_space(&logmsg, 4); if (atomicio(read, pmonitor->m_log_recvfd, buffer_ptr(&logmsg), buffer_len(&logmsg)) != buffer_len(&logmsg)) { if (errno == EPIPE) { buffer_free(&logmsg); debug("%s: child log fd closed", __func__); close(pmonitor->m_log_recvfd); pmonitor->m_log_recvfd = -1; return -1; } fatal("%s: log fd read: %s", __func__, strerror(errno)); } len = buffer_get_int(&logmsg); if (len <= 4 || len > 8192) fatal("%s: invalid log message length %u", __func__, len); /* Read severity, message */ buffer_clear(&logmsg); buffer_append_space(&logmsg, len); if (atomicio(read, pmonitor->m_log_recvfd, buffer_ptr(&logmsg), buffer_len(&logmsg)) != buffer_len(&logmsg)) fatal("%s: log fd read: %s", __func__, strerror(errno)); /* Log it */ level = buffer_get_int(&logmsg); msg = buffer_get_string(&logmsg, NULL); if (log_level_name(level) == NULL) fatal("%s: invalid log level %u (corrupted message?)", __func__, level); do_log2(level, "%s [preauth]", msg); buffer_free(&logmsg); free(msg); return 0; } int monitor_read(struct monitor *pmonitor, struct mon_table *ent, struct mon_table **pent) { Buffer m; int ret; u_char type; struct pollfd pfd[2]; for (;;) { memset(&pfd, 0, sizeof(pfd)); pfd[0].fd = pmonitor->m_sendfd; pfd[0].events = POLLIN; pfd[1].fd = pmonitor->m_log_recvfd; pfd[1].events = pfd[1].fd == -1 ? 0 : POLLIN; if (poll(pfd, pfd[1].fd == -1 ? 1 : 2, -1) == -1) { if (errno == EINTR || errno == EAGAIN) continue; fatal("%s: poll: %s", __func__, strerror(errno)); } if (pfd[1].revents) { /* * Drain all log messages before processing next * monitor request. */ monitor_read_log(pmonitor); continue; } if (pfd[0].revents) break; /* Continues below */ } buffer_init(&m); mm_request_receive(pmonitor->m_sendfd, &m); type = buffer_get_char(&m); debug3("%s: checking request %d", __func__, type); while (ent->f != NULL) { if (ent->type == type) break; ent++; } if (ent->f != NULL) { if (!(ent->flags & MON_PERMIT)) fatal("%s: unpermitted request %d", __func__, type); ret = (*ent->f)(pmonitor->m_sendfd, &m); buffer_free(&m); /* The child may use this request only once, disable it */ if (ent->flags & MON_ONCE) { debug2("%s: %d used once, disabling now", __func__, type); ent->flags &= ~MON_PERMIT; } if (pent != NULL) *pent = ent; return ret; } fatal("%s: unsupported request: %d", __func__, type); /* NOTREACHED */ return (-1); } /* allowed key state */ static int monitor_allowed_key(u_char *blob, u_int bloblen) { /* make sure key is allowed */ if (key_blob == NULL || key_bloblen != bloblen || timingsafe_bcmp(key_blob, blob, key_bloblen)) return (0); return (1); } static void monitor_reset_key_state(void) { /* reset state */ free(key_blob); free(hostbased_cuser); free(hostbased_chost); key_blob = NULL; key_bloblen = 0; key_blobtype = MM_NOKEY; hostbased_cuser = NULL; hostbased_chost = NULL; } int mm_answer_moduli(int sock, Buffer *m) { DH *dh; int min, want, max; min = buffer_get_int(m); want = buffer_get_int(m); max = buffer_get_int(m); debug3("%s: got parameters: %d %d %d", __func__, min, want, max); /* We need to check here, too, in case the child got corrupted */ if (max < min || want < min || max < want) fatal("%s: bad parameters: %d %d %d", __func__, min, want, max); buffer_clear(m); dh = choose_dh(min, want, max); if (dh == NULL) { buffer_put_char(m, 0); return (0); } else { /* Send first bignum */ buffer_put_char(m, 1); buffer_put_bignum2(m, dh->p); buffer_put_bignum2(m, dh->g); DH_free(dh); } mm_request_send(sock, MONITOR_ANS_MODULI, m); return (0); } extern AuthenticationConnection *auth_conn; int mm_answer_sign(int sock, Buffer *m) { Key *key; u_char *p; u_char *signature; u_int siglen, datlen; int keyid; debug3("%s", __func__); keyid = buffer_get_int(m); p = buffer_get_string(m, &datlen); /* * Supported KEX types use SHA1 (20 bytes), SHA256 (32 bytes), * SHA384 (48 bytes) and SHA512 (64 bytes). */ if (datlen != 20 && datlen != 32 && datlen != 48 && datlen != 64) fatal("%s: data length incorrect: %u", __func__, datlen); /* save session id, it will be passed on the first call */ if (session_id2_len == 0) { session_id2_len = datlen; session_id2 = xmalloc(session_id2_len); memcpy(session_id2, p, session_id2_len); } if ((key = get_hostkey_by_index(keyid)) != NULL) { if (key_sign(key, &signature, &siglen, p, datlen) < 0) fatal("%s: key_sign failed", __func__); } else if ((key = get_hostkey_public_by_index(keyid)) != NULL && auth_conn != NULL) { if (ssh_agent_sign(auth_conn, key, &signature, &siglen, p, datlen) < 0) fatal("%s: ssh_agent_sign failed", __func__); } else fatal("%s: no hostkey from index %d", __func__, keyid); debug3("%s: signature %p(%u)", __func__, signature, siglen); buffer_clear(m); buffer_put_string(m, signature, siglen); free(p); free(signature); mm_request_send(sock, MONITOR_ANS_SIGN, m); /* Turn on permissions for getpwnam */ monitor_permit(mon_dispatch, MONITOR_REQ_PWNAM, 1); return (0); } /* Retrieves the password entry and also checks if the user is permitted */ int mm_answer_pwnamallow(int sock, Buffer *m) { char *username; struct passwd *pwent; int allowed = 0; u_int i; debug3("%s", __func__); if (authctxt->attempt++ != 0) fatal("%s: multiple attempts for getpwnam", __func__); username = buffer_get_string(m, NULL); pwent = getpwnamallow(username); authctxt->user = xstrdup(username); setproctitle("%s [priv]", pwent ? username : "unknown"); free(username); buffer_clear(m); if (pwent == NULL) { buffer_put_char(m, 0); authctxt->pw = fakepw(); goto out; } allowed = 1; authctxt->pw = pwent; authctxt->valid = 1; buffer_put_char(m, 1); buffer_put_string(m, pwent, sizeof(struct passwd)); buffer_put_cstring(m, pwent->pw_name); buffer_put_cstring(m, "*"); #ifdef HAVE_STRUCT_PASSWD_PW_GECOS buffer_put_cstring(m, pwent->pw_gecos); #endif #ifdef HAVE_STRUCT_PASSWD_PW_CLASS buffer_put_cstring(m, pwent->pw_class); #endif buffer_put_cstring(m, pwent->pw_dir); buffer_put_cstring(m, pwent->pw_shell); out: buffer_put_string(m, &options, sizeof(options)); #define M_CP_STROPT(x) do { \ if (options.x != NULL) \ buffer_put_cstring(m, options.x); \ } while (0) #define M_CP_STRARRAYOPT(x, nx) do { \ for (i = 0; i < options.nx; i++) \ buffer_put_cstring(m, options.x[i]); \ } while (0) /* See comment in servconf.h */ COPY_MATCH_STRING_OPTS(); #undef M_CP_STROPT #undef M_CP_STRARRAYOPT /* Create valid auth method lists */ if (compat20 && auth2_setup_methods_lists(authctxt) != 0) { /* * The monitor will continue long enough to let the child * run to it's packet_disconnect(), but it must not allow any * authentication to succeed. */ debug("%s: no valid authentication method lists", __func__); } debug3("%s: sending MONITOR_ANS_PWNAM: %d", __func__, allowed); mm_request_send(sock, MONITOR_ANS_PWNAM, m); /* For SSHv1 allow authentication now */ if (!compat20) monitor_permit_authentications(1); else { /* Allow service/style information on the auth context */ monitor_permit(mon_dispatch, MONITOR_REQ_AUTHSERV, 1); monitor_permit(mon_dispatch, MONITOR_REQ_AUTH2_READ_BANNER, 1); } #ifdef USE_PAM if (options.use_pam) monitor_permit(mon_dispatch, MONITOR_REQ_PAM_START, 1); #endif return (0); } int mm_answer_auth2_read_banner(int sock, Buffer *m) { char *banner; buffer_clear(m); banner = auth2_read_banner(); buffer_put_cstring(m, banner != NULL ? banner : ""); mm_request_send(sock, MONITOR_ANS_AUTH2_READ_BANNER, m); free(banner); return (0); } int mm_answer_authserv(int sock, Buffer *m) { monitor_permit_authentications(1); authctxt->service = buffer_get_string(m, NULL); authctxt->style = buffer_get_string(m, NULL); debug3("%s: service=%s, style=%s", __func__, authctxt->service, authctxt->style); if (strlen(authctxt->style) == 0) { free(authctxt->style); authctxt->style = NULL; } return (0); } int mm_answer_authpassword(int sock, Buffer *m) { static int call_count; char *passwd; int authenticated; u_int plen; passwd = buffer_get_string(m, &plen); /* Only authenticate if the context is valid */ authenticated = options.password_authentication && auth_password(authctxt, passwd); explicit_bzero(passwd, strlen(passwd)); free(passwd); buffer_clear(m); buffer_put_int(m, authenticated); debug3("%s: sending result %d", __func__, authenticated); mm_request_send(sock, MONITOR_ANS_AUTHPASSWORD, m); call_count++; if (plen == 0 && call_count == 1) auth_method = "none"; else auth_method = "password"; /* Causes monitor loop to terminate if authenticated */ return (authenticated); } #ifdef BSD_AUTH int mm_answer_bsdauthquery(int sock, Buffer *m) { char *name, *infotxt; u_int numprompts; u_int *echo_on; char **prompts; u_int success; success = bsdauth_query(authctxt, &name, &infotxt, &numprompts, &prompts, &echo_on) < 0 ? 0 : 1; buffer_clear(m); buffer_put_int(m, success); if (success) buffer_put_cstring(m, prompts[0]); debug3("%s: sending challenge success: %u", __func__, success); mm_request_send(sock, MONITOR_ANS_BSDAUTHQUERY, m); if (success) { free(name); free(infotxt); free(prompts); free(echo_on); } return (0); } int mm_answer_bsdauthrespond(int sock, Buffer *m) { char *response; int authok; if (authctxt->as == 0) fatal("%s: no bsd auth session", __func__); response = buffer_get_string(m, NULL); authok = options.challenge_response_authentication && auth_userresponse(authctxt->as, response, 0); authctxt->as = NULL; debug3("%s: <%s> = <%d>", __func__, response, authok); free(response); buffer_clear(m); buffer_put_int(m, authok); debug3("%s: sending authenticated: %d", __func__, authok); mm_request_send(sock, MONITOR_ANS_BSDAUTHRESPOND, m); if (compat20) { auth_method = "keyboard-interactive"; auth_submethod = "bsdauth"; } else auth_method = "bsdauth"; return (authok != 0); } #endif #ifdef SKEY int mm_answer_skeyquery(int sock, Buffer *m) { struct skey skey; char challenge[1024]; u_int success; success = _compat_skeychallenge(&skey, authctxt->user, challenge, sizeof(challenge)) < 0 ? 0 : 1; buffer_clear(m); buffer_put_int(m, success); if (success) buffer_put_cstring(m, challenge); debug3("%s: sending challenge success: %u", __func__, success); mm_request_send(sock, MONITOR_ANS_SKEYQUERY, m); return (0); } int mm_answer_skeyrespond(int sock, Buffer *m) { char *response; int authok; response = buffer_get_string(m, NULL); authok = (options.challenge_response_authentication && authctxt->valid && skey_haskey(authctxt->pw->pw_name) == 0 && skey_passcheck(authctxt->pw->pw_name, response) != -1); free(response); buffer_clear(m); buffer_put_int(m, authok); debug3("%s: sending authenticated: %d", __func__, authok); mm_request_send(sock, MONITOR_ANS_SKEYRESPOND, m); auth_method = "skey"; return (authok != 0); } #endif #ifdef USE_PAM int mm_answer_pam_start(int sock, Buffer *m) { if (!options.use_pam) fatal("UsePAM not set, but ended up in %s anyway", __func__); start_pam(authctxt); monitor_permit(mon_dispatch, MONITOR_REQ_PAM_ACCOUNT, 1); return (0); } int mm_answer_pam_account(int sock, Buffer *m) { u_int ret; if (!options.use_pam) fatal("UsePAM not set, but ended up in %s anyway", __func__); ret = do_pam_account(); buffer_put_int(m, ret); buffer_put_string(m, buffer_ptr(&loginmsg), buffer_len(&loginmsg)); mm_request_send(sock, MONITOR_ANS_PAM_ACCOUNT, m); return (ret); } static void *sshpam_ctxt, *sshpam_authok; extern KbdintDevice sshpam_device; int mm_answer_pam_init_ctx(int sock, Buffer *m) { - debug3("%s", __func__); - authctxt->user = buffer_get_string(m, NULL); sshpam_ctxt = (sshpam_device.init_ctx)(authctxt); sshpam_authok = NULL; buffer_clear(m); if (sshpam_ctxt != NULL) { monitor_permit(mon_dispatch, MONITOR_REQ_PAM_FREE_CTX, 1); buffer_put_int(m, 1); } else { buffer_put_int(m, 0); } mm_request_send(sock, MONITOR_ANS_PAM_INIT_CTX, m); return (0); } int mm_answer_pam_query(int sock, Buffer *m) { char *name = NULL, *info = NULL, **prompts = NULL; u_int i, num = 0, *echo_on = 0; int ret; debug3("%s", __func__); sshpam_authok = NULL; ret = (sshpam_device.query)(sshpam_ctxt, &name, &info, &num, &prompts, &echo_on); if (ret == 0 && num == 0) sshpam_authok = sshpam_ctxt; if (num > 1 || name == NULL || info == NULL) ret = -1; buffer_clear(m); buffer_put_int(m, ret); buffer_put_cstring(m, name); free(name); buffer_put_cstring(m, info); free(info); buffer_put_int(m, num); for (i = 0; i < num; ++i) { buffer_put_cstring(m, prompts[i]); free(prompts[i]); buffer_put_int(m, echo_on[i]); } free(prompts); free(echo_on); auth_method = "keyboard-interactive"; auth_submethod = "pam"; mm_request_send(sock, MONITOR_ANS_PAM_QUERY, m); return (0); } int mm_answer_pam_respond(int sock, Buffer *m) { char **resp; u_int i, num; int ret; debug3("%s", __func__); sshpam_authok = NULL; num = buffer_get_int(m); if (num > 0) { resp = xcalloc(num, sizeof(char *)); for (i = 0; i < num; ++i) resp[i] = buffer_get_string(m, NULL); ret = (sshpam_device.respond)(sshpam_ctxt, num, resp); for (i = 0; i < num; ++i) free(resp[i]); free(resp); } else { ret = (sshpam_device.respond)(sshpam_ctxt, num, NULL); } buffer_clear(m); buffer_put_int(m, ret); mm_request_send(sock, MONITOR_ANS_PAM_RESPOND, m); auth_method = "keyboard-interactive"; auth_submethod = "pam"; if (ret == 0) sshpam_authok = sshpam_ctxt; return (0); } int mm_answer_pam_free_ctx(int sock, Buffer *m) { + int r = sshpam_authok != NULL && sshpam_authok == sshpam_ctxt; debug3("%s", __func__); (sshpam_device.free_ctx)(sshpam_ctxt); + sshpam_ctxt = sshpam_authok = NULL; buffer_clear(m); mm_request_send(sock, MONITOR_ANS_PAM_FREE_CTX, m); auth_method = "keyboard-interactive"; auth_submethod = "pam"; - return (sshpam_authok == sshpam_ctxt); + return r; } #endif int mm_answer_keyallowed(int sock, Buffer *m) { Key *key; char *cuser, *chost; u_char *blob; u_int bloblen; enum mm_keytype type = 0; int allowed = 0; debug3("%s entering", __func__); type = buffer_get_int(m); cuser = buffer_get_string(m, NULL); chost = buffer_get_string(m, NULL); blob = buffer_get_string(m, &bloblen); key = key_from_blob(blob, bloblen); if ((compat20 && type == MM_RSAHOSTKEY) || (!compat20 && type != MM_RSAHOSTKEY)) fatal("%s: key type and protocol mismatch", __func__); debug3("%s: key_from_blob: %p", __func__, key); if (key != NULL && authctxt->valid) { switch (type) { case MM_USERKEY: allowed = options.pubkey_authentication && user_key_allowed(authctxt->pw, key); pubkey_auth_info(authctxt, key, NULL); auth_method = "publickey"; if (options.pubkey_authentication && allowed != 1) auth_clear_options(); break; case MM_HOSTKEY: allowed = options.hostbased_authentication && hostbased_key_allowed(authctxt->pw, cuser, chost, key); pubkey_auth_info(authctxt, key, "client user \"%.100s\", client host \"%.100s\"", cuser, chost); auth_method = "hostbased"; break; case MM_RSAHOSTKEY: key->type = KEY_RSA1; /* XXX */ allowed = options.rhosts_rsa_authentication && auth_rhosts_rsa_key_allowed(authctxt->pw, cuser, chost, key); if (options.rhosts_rsa_authentication && allowed != 1) auth_clear_options(); auth_method = "rsa"; break; default: fatal("%s: unknown key type %d", __func__, type); break; } } if (key != NULL) key_free(key); /* clear temporarily storage (used by verify) */ monitor_reset_key_state(); if (allowed) { /* Save temporarily for comparison in verify */ key_blob = blob; key_bloblen = bloblen; key_blobtype = type; hostbased_cuser = cuser; hostbased_chost = chost; } else { /* Log failed attempt */ auth_log(authctxt, 0, 0, auth_method, NULL); free(blob); free(cuser); free(chost); } debug3("%s: key %p is %s", __func__, key, allowed ? "allowed" : "not allowed"); buffer_clear(m); buffer_put_int(m, allowed); buffer_put_int(m, forced_command != NULL); mm_request_send(sock, MONITOR_ANS_KEYALLOWED, m); if (type == MM_RSAHOSTKEY) monitor_permit(mon_dispatch, MONITOR_REQ_RSACHALLENGE, allowed); return (0); } static int monitor_valid_userblob(u_char *data, u_int datalen) { Buffer b; char *p, *userstyle; u_int len; int fail = 0; buffer_init(&b); buffer_append(&b, data, datalen); if (datafellows & SSH_OLD_SESSIONID) { p = buffer_ptr(&b); len = buffer_len(&b); if ((session_id2 == NULL) || (len < session_id2_len) || (timingsafe_bcmp(p, session_id2, session_id2_len) != 0)) fail++; buffer_consume(&b, session_id2_len); } else { p = buffer_get_string(&b, &len); if ((session_id2 == NULL) || (len != session_id2_len) || (timingsafe_bcmp(p, session_id2, session_id2_len) != 0)) fail++; free(p); } if (buffer_get_char(&b) != SSH2_MSG_USERAUTH_REQUEST) fail++; p = buffer_get_cstring(&b, NULL); xasprintf(&userstyle, "%s%s%s", authctxt->user, authctxt->style ? ":" : "", authctxt->style ? authctxt->style : ""); if (strcmp(userstyle, p) != 0) { logit("wrong user name passed to monitor: expected %s != %.100s", userstyle, p); fail++; } free(userstyle); free(p); buffer_skip_string(&b); if (datafellows & SSH_BUG_PKAUTH) { if (!buffer_get_char(&b)) fail++; } else { p = buffer_get_cstring(&b, NULL); if (strcmp("publickey", p) != 0) fail++; free(p); if (!buffer_get_char(&b)) fail++; buffer_skip_string(&b); } buffer_skip_string(&b); if (buffer_len(&b) != 0) fail++; buffer_free(&b); return (fail == 0); } static int monitor_valid_hostbasedblob(u_char *data, u_int datalen, char *cuser, char *chost) { Buffer b; char *p, *userstyle; u_int len; int fail = 0; buffer_init(&b); buffer_append(&b, data, datalen); p = buffer_get_string(&b, &len); if ((session_id2 == NULL) || (len != session_id2_len) || (timingsafe_bcmp(p, session_id2, session_id2_len) != 0)) fail++; free(p); if (buffer_get_char(&b) != SSH2_MSG_USERAUTH_REQUEST) fail++; p = buffer_get_cstring(&b, NULL); xasprintf(&userstyle, "%s%s%s", authctxt->user, authctxt->style ? ":" : "", authctxt->style ? authctxt->style : ""); if (strcmp(userstyle, p) != 0) { logit("wrong user name passed to monitor: expected %s != %.100s", userstyle, p); fail++; } free(userstyle); free(p); buffer_skip_string(&b); /* service */ p = buffer_get_cstring(&b, NULL); if (strcmp(p, "hostbased") != 0) fail++; free(p); buffer_skip_string(&b); /* pkalg */ buffer_skip_string(&b); /* pkblob */ /* verify client host, strip trailing dot if necessary */ p = buffer_get_string(&b, NULL); if (((len = strlen(p)) > 0) && p[len - 1] == '.') p[len - 1] = '\0'; if (strcmp(p, chost) != 0) fail++; free(p); /* verify client user */ p = buffer_get_string(&b, NULL); if (strcmp(p, cuser) != 0) fail++; free(p); if (buffer_len(&b) != 0) fail++; buffer_free(&b); return (fail == 0); } int mm_answer_keyverify(int sock, Buffer *m) { Key *key; u_char *signature, *data, *blob; u_int signaturelen, datalen, bloblen; int verified = 0; int valid_data = 0; blob = buffer_get_string(m, &bloblen); signature = buffer_get_string(m, &signaturelen); data = buffer_get_string(m, &datalen); if (hostbased_cuser == NULL || hostbased_chost == NULL || !monitor_allowed_key(blob, bloblen)) fatal("%s: bad key, not previously allowed", __func__); key = key_from_blob(blob, bloblen); if (key == NULL) fatal("%s: bad public key blob", __func__); switch (key_blobtype) { case MM_USERKEY: valid_data = monitor_valid_userblob(data, datalen); break; case MM_HOSTKEY: valid_data = monitor_valid_hostbasedblob(data, datalen, hostbased_cuser, hostbased_chost); break; default: valid_data = 0; break; } if (!valid_data) fatal("%s: bad signature data blob", __func__); verified = key_verify(key, signature, signaturelen, data, datalen); debug3("%s: key %p signature %s", __func__, key, (verified == 1) ? "verified" : "unverified"); key_free(key); free(blob); free(signature); free(data); auth_method = key_blobtype == MM_USERKEY ? "publickey" : "hostbased"; monitor_reset_key_state(); buffer_clear(m); buffer_put_int(m, verified); mm_request_send(sock, MONITOR_ANS_KEYVERIFY, m); return (verified == 1); } static void mm_record_login(Session *s, struct passwd *pw) { socklen_t fromlen; struct sockaddr_storage from; /* * Get IP address of client. If the connection is not a socket, let * the address be 0.0.0.0. */ memset(&from, 0, sizeof(from)); fromlen = sizeof(from); if (packet_connection_is_on_socket()) { if (getpeername(packet_get_connection_in(), (struct sockaddr *)&from, &fromlen) < 0) { debug("getpeername: %.100s", strerror(errno)); cleanup_exit(255); } } /* Record that there was a login on that tty from the remote host. */ record_login(s->pid, s->tty, pw->pw_name, pw->pw_uid, get_remote_name_or_ip(utmp_len, options.use_dns), (struct sockaddr *)&from, fromlen); } static void mm_session_close(Session *s) { debug3("%s: session %d pid %ld", __func__, s->self, (long)s->pid); if (s->ttyfd != -1) { debug3("%s: tty %s ptyfd %d", __func__, s->tty, s->ptyfd); session_pty_cleanup2(s); } session_unused(s->self); } int mm_answer_pty(int sock, Buffer *m) { extern struct monitor *pmonitor; Session *s; int res, fd0; debug3("%s entering", __func__); buffer_clear(m); s = session_new(); if (s == NULL) goto error; s->authctxt = authctxt; s->pw = authctxt->pw; s->pid = pmonitor->m_pid; res = pty_allocate(&s->ptyfd, &s->ttyfd, s->tty, sizeof(s->tty)); if (res == 0) goto error; pty_setowner(authctxt->pw, s->tty); buffer_put_int(m, 1); buffer_put_cstring(m, s->tty); /* We need to trick ttyslot */ if (dup2(s->ttyfd, 0) == -1) fatal("%s: dup2", __func__); mm_record_login(s, authctxt->pw); /* Now we can close the file descriptor again */ close(0); /* send messages generated by record_login */ buffer_put_string(m, buffer_ptr(&loginmsg), buffer_len(&loginmsg)); buffer_clear(&loginmsg); mm_request_send(sock, MONITOR_ANS_PTY, m); if (mm_send_fd(sock, s->ptyfd) == -1 || mm_send_fd(sock, s->ttyfd) == -1) fatal("%s: send fds failed", __func__); /* make sure nothing uses fd 0 */ if ((fd0 = open(_PATH_DEVNULL, O_RDONLY)) < 0) fatal("%s: open(/dev/null): %s", __func__, strerror(errno)); if (fd0 != 0) error("%s: fd0 %d != 0", __func__, fd0); /* slave is not needed */ close(s->ttyfd); s->ttyfd = s->ptyfd; /* no need to dup() because nobody closes ptyfd */ s->ptymaster = s->ptyfd; debug3("%s: tty %s ptyfd %d", __func__, s->tty, s->ttyfd); return (0); error: if (s != NULL) mm_session_close(s); buffer_put_int(m, 0); mm_request_send(sock, MONITOR_ANS_PTY, m); return (0); } int mm_answer_pty_cleanup(int sock, Buffer *m) { Session *s; char *tty; debug3("%s entering", __func__); tty = buffer_get_string(m, NULL); if ((s = session_by_tty(tty)) != NULL) mm_session_close(s); buffer_clear(m); free(tty); return (0); } int mm_answer_sesskey(int sock, Buffer *m) { BIGNUM *p; int rsafail; /* Turn off permissions */ monitor_permit(mon_dispatch, MONITOR_REQ_SESSKEY, 0); if ((p = BN_new()) == NULL) fatal("%s: BN_new", __func__); buffer_get_bignum2(m, p); rsafail = ssh1_session_key(p); buffer_clear(m); buffer_put_int(m, rsafail); buffer_put_bignum2(m, p); BN_clear_free(p); mm_request_send(sock, MONITOR_ANS_SESSKEY, m); /* Turn on permissions for sessid passing */ monitor_permit(mon_dispatch, MONITOR_REQ_SESSID, 1); return (0); } int mm_answer_sessid(int sock, Buffer *m) { int i; debug3("%s entering", __func__); if (buffer_len(m) != 16) fatal("%s: bad ssh1 session id", __func__); for (i = 0; i < 16; i++) session_id[i] = buffer_get_char(m); /* Turn on permissions for getpwnam */ monitor_permit(mon_dispatch, MONITOR_REQ_PWNAM, 1); return (0); } int mm_answer_rsa_keyallowed(int sock, Buffer *m) { BIGNUM *client_n; Key *key = NULL; u_char *blob = NULL; u_int blen = 0; int allowed = 0; debug3("%s entering", __func__); auth_method = "rsa"; if (options.rsa_authentication && authctxt->valid) { if ((client_n = BN_new()) == NULL) fatal("%s: BN_new", __func__); buffer_get_bignum2(m, client_n); allowed = auth_rsa_key_allowed(authctxt->pw, client_n, &key); BN_clear_free(client_n); } buffer_clear(m); buffer_put_int(m, allowed); buffer_put_int(m, forced_command != NULL); /* clear temporarily storage (used by generate challenge) */ monitor_reset_key_state(); if (allowed && key != NULL) { key->type = KEY_RSA; /* cheat for key_to_blob */ if (key_to_blob(key, &blob, &blen) == 0) fatal("%s: key_to_blob failed", __func__); buffer_put_string(m, blob, blen); /* Save temporarily for comparison in verify */ key_blob = blob; key_bloblen = blen; key_blobtype = MM_RSAUSERKEY; } if (key != NULL) key_free(key); mm_request_send(sock, MONITOR_ANS_RSAKEYALLOWED, m); monitor_permit(mon_dispatch, MONITOR_REQ_RSACHALLENGE, allowed); monitor_permit(mon_dispatch, MONITOR_REQ_RSARESPONSE, 0); return (0); } int mm_answer_rsa_challenge(int sock, Buffer *m) { Key *key = NULL; u_char *blob; u_int blen; debug3("%s entering", __func__); if (!authctxt->valid) fatal("%s: authctxt not valid", __func__); blob = buffer_get_string(m, &blen); if (!monitor_allowed_key(blob, blen)) fatal("%s: bad key, not previously allowed", __func__); if (key_blobtype != MM_RSAUSERKEY && key_blobtype != MM_RSAHOSTKEY) fatal("%s: key type mismatch", __func__); if ((key = key_from_blob(blob, blen)) == NULL) fatal("%s: received bad key", __func__); if (key->type != KEY_RSA) fatal("%s: received bad key type %d", __func__, key->type); key->type = KEY_RSA1; if (ssh1_challenge) BN_clear_free(ssh1_challenge); ssh1_challenge = auth_rsa_generate_challenge(key); buffer_clear(m); buffer_put_bignum2(m, ssh1_challenge); debug3("%s sending reply", __func__); mm_request_send(sock, MONITOR_ANS_RSACHALLENGE, m); monitor_permit(mon_dispatch, MONITOR_REQ_RSARESPONSE, 1); free(blob); key_free(key); return (0); } int mm_answer_rsa_response(int sock, Buffer *m) { Key *key = NULL; u_char *blob, *response; u_int blen, len; int success; debug3("%s entering", __func__); if (!authctxt->valid) fatal("%s: authctxt not valid", __func__); if (ssh1_challenge == NULL) fatal("%s: no ssh1_challenge", __func__); blob = buffer_get_string(m, &blen); if (!monitor_allowed_key(blob, blen)) fatal("%s: bad key, not previously allowed", __func__); if (key_blobtype != MM_RSAUSERKEY && key_blobtype != MM_RSAHOSTKEY) fatal("%s: key type mismatch: %d", __func__, key_blobtype); if ((key = key_from_blob(blob, blen)) == NULL) fatal("%s: received bad key", __func__); response = buffer_get_string(m, &len); if (len != 16) fatal("%s: received bad response to challenge", __func__); success = auth_rsa_verify_response(key, ssh1_challenge, response); free(blob); key_free(key); free(response); auth_method = key_blobtype == MM_RSAUSERKEY ? "rsa" : "rhosts-rsa"; /* reset state */ BN_clear_free(ssh1_challenge); ssh1_challenge = NULL; monitor_reset_key_state(); buffer_clear(m); buffer_put_int(m, success); mm_request_send(sock, MONITOR_ANS_RSARESPONSE, m); return (success); } int mm_answer_term(int sock, Buffer *req) { extern struct monitor *pmonitor; int res, status; debug3("%s: tearing down sessions", __func__); /* The child is terminating */ session_destroy_all(&mm_session_close); #ifdef USE_PAM if (options.use_pam) sshpam_cleanup(); #endif while (waitpid(pmonitor->m_pid, &status, 0) == -1) if (errno != EINTR) exit(1); res = WIFEXITED(status) ? WEXITSTATUS(status) : 1; /* Terminate process */ exit(res); } #ifdef SSH_AUDIT_EVENTS /* Report that an audit event occurred */ int mm_answer_audit_event(int socket, Buffer *m) { ssh_audit_event_t event; debug3("%s entering", __func__); event = buffer_get_int(m); switch(event) { case SSH_AUTH_FAIL_PUBKEY: case SSH_AUTH_FAIL_HOSTBASED: case SSH_AUTH_FAIL_GSSAPI: case SSH_LOGIN_EXCEED_MAXTRIES: case SSH_LOGIN_ROOT_DENIED: case SSH_CONNECTION_CLOSE: case SSH_INVALID_USER: audit_event(event); break; default: fatal("Audit event type %d not permitted", event); } return (0); } int mm_answer_audit_command(int socket, Buffer *m) { u_int len; char *cmd; debug3("%s entering", __func__); cmd = buffer_get_string(m, &len); /* sanity check command, if so how? */ audit_run_command(cmd); free(cmd); return (0); } #endif /* SSH_AUDIT_EVENTS */ void monitor_apply_keystate(struct monitor *pmonitor) { if (compat20) { set_newkeys(MODE_IN); set_newkeys(MODE_OUT); } else { packet_set_protocol_flags(child_state.ssh1protoflags); packet_set_encryption_key(child_state.ssh1key, child_state.ssh1keylen, child_state.ssh1cipher); free(child_state.ssh1key); } /* for rc4 and other stateful ciphers */ packet_set_keycontext(MODE_OUT, child_state.keyout); free(child_state.keyout); packet_set_keycontext(MODE_IN, child_state.keyin); free(child_state.keyin); if (!compat20) { packet_set_iv(MODE_OUT, child_state.ivout); free(child_state.ivout); packet_set_iv(MODE_IN, child_state.ivin); free(child_state.ivin); } memcpy(&incoming_stream, &child_state.incoming, sizeof(incoming_stream)); memcpy(&outgoing_stream, &child_state.outgoing, sizeof(outgoing_stream)); /* Update with new address */ if (options.compression) mm_init_compression(pmonitor->m_zlib); if (options.rekey_limit || options.rekey_interval) packet_set_rekey_limits((u_int32_t)options.rekey_limit, (time_t)options.rekey_interval); /* Network I/O buffers */ /* XXX inefficient for large buffers, need: buffer_init_from_string */ buffer_clear(packet_get_input()); buffer_append(packet_get_input(), child_state.input, child_state.ilen); explicit_bzero(child_state.input, child_state.ilen); free(child_state.input); buffer_clear(packet_get_output()); buffer_append(packet_get_output(), child_state.output, child_state.olen); explicit_bzero(child_state.output, child_state.olen); free(child_state.output); /* Roaming */ if (compat20) roam_set_bytes(child_state.sent_bytes, child_state.recv_bytes); } static Kex * mm_get_kex(Buffer *m) { Kex *kex; void *blob; u_int bloblen; kex = xcalloc(1, sizeof(*kex)); kex->session_id = buffer_get_string(m, &kex->session_id_len); if (session_id2 == NULL || kex->session_id_len != session_id2_len || timingsafe_bcmp(kex->session_id, session_id2, session_id2_len) != 0) fatal("mm_get_get: internal error: bad session id"); kex->we_need = buffer_get_int(m); kex->kex[KEX_DH_GRP1_SHA1] = kexdh_server; kex->kex[KEX_DH_GRP14_SHA1] = kexdh_server; kex->kex[KEX_DH_GEX_SHA1] = kexgex_server; kex->kex[KEX_DH_GEX_SHA256] = kexgex_server; kex->kex[KEX_ECDH_SHA2] = kexecdh_server; kex->kex[KEX_C25519_SHA256] = kexc25519_server; kex->server = 1; kex->hostkey_type = buffer_get_int(m); kex->kex_type = buffer_get_int(m); blob = buffer_get_string(m, &bloblen); buffer_init(&kex->my); buffer_append(&kex->my, blob, bloblen); free(blob); blob = buffer_get_string(m, &bloblen); buffer_init(&kex->peer); buffer_append(&kex->peer, blob, bloblen); free(blob); kex->done = 1; kex->flags = buffer_get_int(m); kex->client_version_string = buffer_get_string(m, NULL); kex->server_version_string = buffer_get_string(m, NULL); kex->load_host_public_key=&get_hostkey_public_by_type; kex->load_host_private_key=&get_hostkey_private_by_type; kex->host_key_index=&get_hostkey_index; kex->sign = sshd_hostkey_sign; return (kex); } /* This function requries careful sanity checking */ void mm_get_keystate(struct monitor *pmonitor) { Buffer m; u_char *blob, *p; u_int bloblen, plen; u_int32_t seqnr, packets; u_int64_t blocks, bytes; debug3("%s: Waiting for new keys", __func__); buffer_init(&m); mm_request_receive_expect(pmonitor->m_sendfd, MONITOR_REQ_KEYEXPORT, &m); if (!compat20) { child_state.ssh1protoflags = buffer_get_int(&m); child_state.ssh1cipher = buffer_get_int(&m); child_state.ssh1key = buffer_get_string(&m, &child_state.ssh1keylen); child_state.ivout = buffer_get_string(&m, &child_state.ivoutlen); child_state.ivin = buffer_get_string(&m, &child_state.ivinlen); goto skip; } else { /* Get the Kex for rekeying */ *pmonitor->m_pkex = mm_get_kex(&m); } blob = buffer_get_string(&m, &bloblen); current_keys[MODE_OUT] = mm_newkeys_from_blob(blob, bloblen); free(blob); debug3("%s: Waiting for second key", __func__); blob = buffer_get_string(&m, &bloblen); current_keys[MODE_IN] = mm_newkeys_from_blob(blob, bloblen); free(blob); /* Now get sequence numbers for the packets */ seqnr = buffer_get_int(&m); blocks = buffer_get_int64(&m); packets = buffer_get_int(&m); bytes = buffer_get_int64(&m); packet_set_state(MODE_OUT, seqnr, blocks, packets, bytes); seqnr = buffer_get_int(&m); blocks = buffer_get_int64(&m); packets = buffer_get_int(&m); bytes = buffer_get_int64(&m); packet_set_state(MODE_IN, seqnr, blocks, packets, bytes); skip: /* Get the key context */ child_state.keyout = buffer_get_string(&m, &child_state.keyoutlen); child_state.keyin = buffer_get_string(&m, &child_state.keyinlen); debug3("%s: Getting compression state", __func__); /* Get compression state */ p = buffer_get_string(&m, &plen); if (plen != sizeof(child_state.outgoing)) fatal("%s: bad request size", __func__); memcpy(&child_state.outgoing, p, sizeof(child_state.outgoing)); free(p); p = buffer_get_string(&m, &plen); if (plen != sizeof(child_state.incoming)) fatal("%s: bad request size", __func__); memcpy(&child_state.incoming, p, sizeof(child_state.incoming)); free(p); /* Network I/O buffers */ debug3("%s: Getting Network I/O buffers", __func__); child_state.input = buffer_get_string(&m, &child_state.ilen); child_state.output = buffer_get_string(&m, &child_state.olen); /* Roaming */ if (compat20) { child_state.sent_bytes = buffer_get_int64(&m); child_state.recv_bytes = buffer_get_int64(&m); } buffer_free(&m); } /* Allocation functions for zlib */ void * mm_zalloc(struct mm_master *mm, u_int ncount, u_int size) { size_t len = (size_t) size * ncount; void *address; if (len == 0 || ncount > SIZE_T_MAX / size) fatal("%s: mm_zalloc(%u, %u)", __func__, ncount, size); address = mm_malloc(mm, len); return (address); } void mm_zfree(struct mm_master *mm, void *address) { mm_free(mm, address); } void mm_init_compression(struct mm_master *mm) { outgoing_stream.zalloc = (alloc_func)mm_zalloc; outgoing_stream.zfree = (free_func)mm_zfree; outgoing_stream.opaque = mm; incoming_stream.zalloc = (alloc_func)mm_zalloc; incoming_stream.zfree = (free_func)mm_zfree; incoming_stream.opaque = mm; } /* XXX */ #define FD_CLOSEONEXEC(x) do { \ if (fcntl(x, F_SETFD, FD_CLOEXEC) == -1) \ fatal("fcntl(%d, F_SETFD)", x); \ } while (0) static void monitor_openfds(struct monitor *mon, int do_logfds) { int pair[2]; if (socketpair(AF_UNIX, SOCK_STREAM, 0, pair) == -1) fatal("%s: socketpair: %s", __func__, strerror(errno)); FD_CLOSEONEXEC(pair[0]); FD_CLOSEONEXEC(pair[1]); mon->m_recvfd = pair[0]; mon->m_sendfd = pair[1]; if (do_logfds) { if (pipe(pair) == -1) fatal("%s: pipe: %s", __func__, strerror(errno)); FD_CLOSEONEXEC(pair[0]); FD_CLOSEONEXEC(pair[1]); mon->m_log_recvfd = pair[0]; mon->m_log_sendfd = pair[1]; } else mon->m_log_recvfd = mon->m_log_sendfd = -1; } #define MM_MEMSIZE 65536 struct monitor * monitor_init(void) { struct monitor *mon; mon = xcalloc(1, sizeof(*mon)); monitor_openfds(mon, 1); /* Used to share zlib space across processes */ if (options.compression) { mon->m_zback = mm_create(NULL, MM_MEMSIZE); mon->m_zlib = mm_create(mon->m_zback, 20 * MM_MEMSIZE); /* Compression needs to share state across borders */ mm_init_compression(mon->m_zlib); } return mon; } void monitor_reinit(struct monitor *mon) { monitor_openfds(mon, 0); } #ifdef GSSAPI int mm_answer_gss_setup_ctx(int sock, Buffer *m) { gss_OID_desc goid; OM_uint32 major; u_int len; goid.elements = buffer_get_string(m, &len); goid.length = len; major = ssh_gssapi_server_ctx(&gsscontext, &goid); free(goid.elements); buffer_clear(m); buffer_put_int(m, major); mm_request_send(sock, MONITOR_ANS_GSSSETUP, m); /* Now we have a context, enable the step */ monitor_permit(mon_dispatch, MONITOR_REQ_GSSSTEP, 1); return (0); } int mm_answer_gss_accept_ctx(int sock, Buffer *m) { gss_buffer_desc in; gss_buffer_desc out = GSS_C_EMPTY_BUFFER; OM_uint32 major, minor; OM_uint32 flags = 0; /* GSI needs this */ u_int len; in.value = buffer_get_string(m, &len); in.length = len; major = ssh_gssapi_accept_ctx(gsscontext, &in, &out, &flags); free(in.value); buffer_clear(m); buffer_put_int(m, major); buffer_put_string(m, out.value, out.length); buffer_put_int(m, flags); mm_request_send(sock, MONITOR_ANS_GSSSTEP, m); gss_release_buffer(&minor, &out); if (major == GSS_S_COMPLETE) { monitor_permit(mon_dispatch, MONITOR_REQ_GSSSTEP, 0); monitor_permit(mon_dispatch, MONITOR_REQ_GSSUSEROK, 1); monitor_permit(mon_dispatch, MONITOR_REQ_GSSCHECKMIC, 1); } return (0); } int mm_answer_gss_checkmic(int sock, Buffer *m) { gss_buffer_desc gssbuf, mic; OM_uint32 ret; u_int len; gssbuf.value = buffer_get_string(m, &len); gssbuf.length = len; mic.value = buffer_get_string(m, &len); mic.length = len; ret = ssh_gssapi_checkmic(gsscontext, &gssbuf, &mic); free(gssbuf.value); free(mic.value); buffer_clear(m); buffer_put_int(m, ret); mm_request_send(sock, MONITOR_ANS_GSSCHECKMIC, m); if (!GSS_ERROR(ret)) monitor_permit(mon_dispatch, MONITOR_REQ_GSSUSEROK, 1); return (0); } int mm_answer_gss_userok(int sock, Buffer *m) { int authenticated; authenticated = authctxt->valid && ssh_gssapi_userok(authctxt->user); buffer_clear(m); buffer_put_int(m, authenticated); debug3("%s: sending result %d", __func__, authenticated); mm_request_send(sock, MONITOR_ANS_GSSUSEROK, m); auth_method = "gssapi-with-mic"; /* Monitor loop will terminate if authenticated */ return (authenticated); } #endif /* GSSAPI */ Index: releng/9.3/crypto/openssh/monitor_wrap.c =================================================================== --- releng/9.3/crypto/openssh/monitor_wrap.c (revision 287146) +++ releng/9.3/crypto/openssh/monitor_wrap.c (revision 287147) @@ -1,1292 +1,1291 @@ /* $OpenBSD: monitor_wrap.c,v 1.79 2014/02/02 03:44:31 djm Exp $ */ /* * Copyright 2002 Niels Provos * Copyright 2002 Markus Friedl * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. */ #include "includes.h" #include #include #include #include #include #include #include #include #include #include #include #include #include "openbsd-compat/sys-queue.h" #include "xmalloc.h" #include "ssh.h" #include "dh.h" #include "buffer.h" #include "key.h" #include "cipher.h" #include "kex.h" #include "hostfile.h" #include "auth.h" #include "auth-options.h" #include "packet.h" #include "mac.h" #include "log.h" #ifdef TARGET_OS_MAC /* XXX Broken krb5 headers on Mac */ #undef TARGET_OS_MAC #include "zlib.h" #define TARGET_OS_MAC 1 #else #include "zlib.h" #endif #include "monitor.h" #ifdef GSSAPI #include "ssh-gss.h" #endif #include "monitor_wrap.h" #include "atomicio.h" #include "monitor_fdpass.h" #include "misc.h" #include "uuencode.h" #include "channels.h" #include "session.h" #include "servconf.h" #include "roaming.h" /* Imports */ extern int compat20; extern z_stream incoming_stream; extern z_stream outgoing_stream; extern struct monitor *pmonitor; extern Buffer loginmsg; extern ServerOptions options; void mm_log_handler(LogLevel level, const char *msg, void *ctx) { Buffer log_msg; struct monitor *mon = (struct monitor *)ctx; if (mon->m_log_sendfd == -1) fatal("%s: no log channel", __func__); buffer_init(&log_msg); /* * Placeholder for packet length. Will be filled in with the actual * packet length once the packet has been constucted. This saves * fragile math. */ buffer_put_int(&log_msg, 0); buffer_put_int(&log_msg, level); buffer_put_cstring(&log_msg, msg); put_u32(buffer_ptr(&log_msg), buffer_len(&log_msg) - 4); if (atomicio(vwrite, mon->m_log_sendfd, buffer_ptr(&log_msg), buffer_len(&log_msg)) != buffer_len(&log_msg)) fatal("%s: write: %s", __func__, strerror(errno)); buffer_free(&log_msg); } int mm_is_monitor(void) { /* * m_pid is only set in the privileged part, and * points to the unprivileged child. */ return (pmonitor && pmonitor->m_pid > 0); } void mm_request_send(int sock, enum monitor_reqtype type, Buffer *m) { u_int mlen = buffer_len(m); u_char buf[5]; debug3("%s entering: type %d", __func__, type); put_u32(buf, mlen + 1); buf[4] = (u_char) type; /* 1st byte of payload is mesg-type */ if (atomicio(vwrite, sock, buf, sizeof(buf)) != sizeof(buf)) fatal("%s: write: %s", __func__, strerror(errno)); if (atomicio(vwrite, sock, buffer_ptr(m), mlen) != mlen) fatal("%s: write: %s", __func__, strerror(errno)); } void mm_request_receive(int sock, Buffer *m) { u_char buf[4]; u_int msg_len; debug3("%s entering", __func__); if (atomicio(read, sock, buf, sizeof(buf)) != sizeof(buf)) { if (errno == EPIPE) cleanup_exit(255); fatal("%s: read: %s", __func__, strerror(errno)); } msg_len = get_u32(buf); if (msg_len > 256 * 1024) fatal("%s: read: bad msg_len %d", __func__, msg_len); buffer_clear(m); buffer_append_space(m, msg_len); if (atomicio(read, sock, buffer_ptr(m), msg_len) != msg_len) fatal("%s: read: %s", __func__, strerror(errno)); } void mm_request_receive_expect(int sock, enum monitor_reqtype type, Buffer *m) { u_char rtype; debug3("%s entering: type %d", __func__, type); mm_request_receive(sock, m); rtype = buffer_get_char(m); if (rtype != type) fatal("%s: read: rtype %d != type %d", __func__, rtype, type); } DH * mm_choose_dh(int min, int nbits, int max) { BIGNUM *p, *g; int success = 0; Buffer m; buffer_init(&m); buffer_put_int(&m, min); buffer_put_int(&m, nbits); buffer_put_int(&m, max); mm_request_send(pmonitor->m_recvfd, MONITOR_REQ_MODULI, &m); debug3("%s: waiting for MONITOR_ANS_MODULI", __func__); mm_request_receive_expect(pmonitor->m_recvfd, MONITOR_ANS_MODULI, &m); success = buffer_get_char(&m); if (success == 0) fatal("%s: MONITOR_ANS_MODULI failed", __func__); if ((p = BN_new()) == NULL) fatal("%s: BN_new failed", __func__); if ((g = BN_new()) == NULL) fatal("%s: BN_new failed", __func__); buffer_get_bignum2(&m, p); buffer_get_bignum2(&m, g); debug3("%s: remaining %d", __func__, buffer_len(&m)); buffer_free(&m); return (dh_new_group(g, p)); } int mm_key_sign(Key *key, u_char **sigp, u_int *lenp, u_char *data, u_int datalen) { Kex *kex = *pmonitor->m_pkex; Buffer m; debug3("%s entering", __func__); buffer_init(&m); buffer_put_int(&m, kex->host_key_index(key)); buffer_put_string(&m, data, datalen); mm_request_send(pmonitor->m_recvfd, MONITOR_REQ_SIGN, &m); debug3("%s: waiting for MONITOR_ANS_SIGN", __func__); mm_request_receive_expect(pmonitor->m_recvfd, MONITOR_ANS_SIGN, &m); *sigp = buffer_get_string(&m, lenp); buffer_free(&m); return (0); } struct passwd * mm_getpwnamallow(const char *username) { Buffer m; struct passwd *pw; u_int len, i; ServerOptions *newopts; debug3("%s entering", __func__); buffer_init(&m); buffer_put_cstring(&m, username); mm_request_send(pmonitor->m_recvfd, MONITOR_REQ_PWNAM, &m); debug3("%s: waiting for MONITOR_ANS_PWNAM", __func__); mm_request_receive_expect(pmonitor->m_recvfd, MONITOR_ANS_PWNAM, &m); if (buffer_get_char(&m) == 0) { pw = NULL; goto out; } pw = buffer_get_string(&m, &len); if (len != sizeof(struct passwd)) fatal("%s: struct passwd size mismatch", __func__); pw->pw_name = buffer_get_string(&m, NULL); pw->pw_passwd = buffer_get_string(&m, NULL); #ifdef HAVE_STRUCT_PASSWD_PW_GECOS pw->pw_gecos = buffer_get_string(&m, NULL); #endif #ifdef HAVE_STRUCT_PASSWD_PW_CLASS pw->pw_class = buffer_get_string(&m, NULL); #endif pw->pw_dir = buffer_get_string(&m, NULL); pw->pw_shell = buffer_get_string(&m, NULL); out: /* copy options block as a Match directive may have changed some */ newopts = buffer_get_string(&m, &len); if (len != sizeof(*newopts)) fatal("%s: option block size mismatch", __func__); #define M_CP_STROPT(x) do { \ if (newopts->x != NULL) \ newopts->x = buffer_get_string(&m, NULL); \ } while (0) #define M_CP_STRARRAYOPT(x, nx) do { \ for (i = 0; i < newopts->nx; i++) \ newopts->x[i] = buffer_get_string(&m, NULL); \ } while (0) /* See comment in servconf.h */ COPY_MATCH_STRING_OPTS(); #undef M_CP_STROPT #undef M_CP_STRARRAYOPT copy_set_server_options(&options, newopts, 1); free(newopts); buffer_free(&m); return (pw); } char * mm_auth2_read_banner(void) { Buffer m; char *banner; debug3("%s entering", __func__); buffer_init(&m); mm_request_send(pmonitor->m_recvfd, MONITOR_REQ_AUTH2_READ_BANNER, &m); buffer_clear(&m); mm_request_receive_expect(pmonitor->m_recvfd, MONITOR_ANS_AUTH2_READ_BANNER, &m); banner = buffer_get_string(&m, NULL); buffer_free(&m); /* treat empty banner as missing banner */ if (strlen(banner) == 0) { free(banner); banner = NULL; } return (banner); } /* Inform the privileged process about service and style */ void mm_inform_authserv(char *service, char *style) { Buffer m; debug3("%s entering", __func__); buffer_init(&m); buffer_put_cstring(&m, service); buffer_put_cstring(&m, style ? style : ""); mm_request_send(pmonitor->m_recvfd, MONITOR_REQ_AUTHSERV, &m); buffer_free(&m); } /* Do the password authentication */ int mm_auth_password(Authctxt *authctxt, char *password) { Buffer m; int authenticated = 0; debug3("%s entering", __func__); buffer_init(&m); buffer_put_cstring(&m, password); mm_request_send(pmonitor->m_recvfd, MONITOR_REQ_AUTHPASSWORD, &m); debug3("%s: waiting for MONITOR_ANS_AUTHPASSWORD", __func__); mm_request_receive_expect(pmonitor->m_recvfd, MONITOR_ANS_AUTHPASSWORD, &m); authenticated = buffer_get_int(&m); buffer_free(&m); debug3("%s: user %sauthenticated", __func__, authenticated ? "" : "not "); return (authenticated); } int mm_user_key_allowed(struct passwd *pw, Key *key) { return (mm_key_allowed(MM_USERKEY, NULL, NULL, key)); } int mm_hostbased_key_allowed(struct passwd *pw, char *user, char *host, Key *key) { return (mm_key_allowed(MM_HOSTKEY, user, host, key)); } int mm_auth_rhosts_rsa_key_allowed(struct passwd *pw, char *user, char *host, Key *key) { int ret; key->type = KEY_RSA; /* XXX hack for key_to_blob */ ret = mm_key_allowed(MM_RSAHOSTKEY, user, host, key); key->type = KEY_RSA1; return (ret); } int mm_key_allowed(enum mm_keytype type, char *user, char *host, Key *key) { Buffer m; u_char *blob; u_int len; int allowed = 0, have_forced = 0; debug3("%s entering", __func__); /* Convert the key to a blob and the pass it over */ if (!key_to_blob(key, &blob, &len)) return (0); buffer_init(&m); buffer_put_int(&m, type); buffer_put_cstring(&m, user ? user : ""); buffer_put_cstring(&m, host ? host : ""); buffer_put_string(&m, blob, len); free(blob); mm_request_send(pmonitor->m_recvfd, MONITOR_REQ_KEYALLOWED, &m); debug3("%s: waiting for MONITOR_ANS_KEYALLOWED", __func__); mm_request_receive_expect(pmonitor->m_recvfd, MONITOR_ANS_KEYALLOWED, &m); allowed = buffer_get_int(&m); /* fake forced command */ auth_clear_options(); have_forced = buffer_get_int(&m); forced_command = have_forced ? xstrdup("true") : NULL; buffer_free(&m); return (allowed); } /* * This key verify needs to send the key type along, because the * privileged parent makes the decision if the key is allowed * for authentication. */ int mm_key_verify(Key *key, u_char *sig, u_int siglen, u_char *data, u_int datalen) { Buffer m; u_char *blob; u_int len; int verified = 0; debug3("%s entering", __func__); /* Convert the key to a blob and the pass it over */ if (!key_to_blob(key, &blob, &len)) return (0); buffer_init(&m); buffer_put_string(&m, blob, len); buffer_put_string(&m, sig, siglen); buffer_put_string(&m, data, datalen); free(blob); mm_request_send(pmonitor->m_recvfd, MONITOR_REQ_KEYVERIFY, &m); debug3("%s: waiting for MONITOR_ANS_KEYVERIFY", __func__); mm_request_receive_expect(pmonitor->m_recvfd, MONITOR_ANS_KEYVERIFY, &m); verified = buffer_get_int(&m); buffer_free(&m); return (verified); } /* Export key state after authentication */ Newkeys * mm_newkeys_from_blob(u_char *blob, int blen) { Buffer b; u_int len; Newkeys *newkey = NULL; Enc *enc; Mac *mac; Comp *comp; debug3("%s: %p(%d)", __func__, blob, blen); #ifdef DEBUG_PK dump_base64(stderr, blob, blen); #endif buffer_init(&b); buffer_append(&b, blob, blen); newkey = xcalloc(1, sizeof(*newkey)); enc = &newkey->enc; mac = &newkey->mac; comp = &newkey->comp; /* Enc structure */ enc->name = buffer_get_string(&b, NULL); buffer_get(&b, &enc->cipher, sizeof(enc->cipher)); enc->enabled = buffer_get_int(&b); enc->block_size = buffer_get_int(&b); enc->key = buffer_get_string(&b, &enc->key_len); enc->iv = buffer_get_string(&b, &enc->iv_len); if (enc->name == NULL || cipher_by_name(enc->name) != enc->cipher) fatal("%s: bad cipher name %s or pointer %p", __func__, enc->name, enc->cipher); /* Mac structure */ if (cipher_authlen(enc->cipher) == 0) { mac->name = buffer_get_string(&b, NULL); if (mac->name == NULL || mac_setup(mac, mac->name) == -1) fatal("%s: can not setup mac %s", __func__, mac->name); mac->enabled = buffer_get_int(&b); mac->key = buffer_get_string(&b, &len); if (len > mac->key_len) fatal("%s: bad mac key length: %u > %d", __func__, len, mac->key_len); mac->key_len = len; } /* Comp structure */ comp->type = buffer_get_int(&b); comp->enabled = buffer_get_int(&b); comp->name = buffer_get_string(&b, NULL); len = buffer_len(&b); if (len != 0) error("newkeys_from_blob: remaining bytes in blob %u", len); buffer_free(&b); return (newkey); } int mm_newkeys_to_blob(int mode, u_char **blobp, u_int *lenp) { Buffer b; int len; Enc *enc; Mac *mac; Comp *comp; Newkeys *newkey = (Newkeys *)packet_get_newkeys(mode); debug3("%s: converting %p", __func__, newkey); if (newkey == NULL) { error("%s: newkey == NULL", __func__); return 0; } enc = &newkey->enc; mac = &newkey->mac; comp = &newkey->comp; buffer_init(&b); /* Enc structure */ buffer_put_cstring(&b, enc->name); /* The cipher struct is constant and shared, you export pointer */ buffer_append(&b, &enc->cipher, sizeof(enc->cipher)); buffer_put_int(&b, enc->enabled); buffer_put_int(&b, enc->block_size); buffer_put_string(&b, enc->key, enc->key_len); packet_get_keyiv(mode, enc->iv, enc->iv_len); buffer_put_string(&b, enc->iv, enc->iv_len); /* Mac structure */ if (cipher_authlen(enc->cipher) == 0) { buffer_put_cstring(&b, mac->name); buffer_put_int(&b, mac->enabled); buffer_put_string(&b, mac->key, mac->key_len); } /* Comp structure */ buffer_put_int(&b, comp->type); buffer_put_int(&b, comp->enabled); buffer_put_cstring(&b, comp->name); len = buffer_len(&b); if (lenp != NULL) *lenp = len; if (blobp != NULL) { *blobp = xmalloc(len); memcpy(*blobp, buffer_ptr(&b), len); } explicit_bzero(buffer_ptr(&b), len); buffer_free(&b); return len; } static void mm_send_kex(Buffer *m, Kex *kex) { buffer_put_string(m, kex->session_id, kex->session_id_len); buffer_put_int(m, kex->we_need); buffer_put_int(m, kex->hostkey_type); buffer_put_int(m, kex->kex_type); buffer_put_string(m, buffer_ptr(&kex->my), buffer_len(&kex->my)); buffer_put_string(m, buffer_ptr(&kex->peer), buffer_len(&kex->peer)); buffer_put_int(m, kex->flags); buffer_put_cstring(m, kex->client_version_string); buffer_put_cstring(m, kex->server_version_string); } void mm_send_keystate(struct monitor *monitor) { Buffer m, *input, *output; u_char *blob, *p; u_int bloblen, plen; u_int32_t seqnr, packets; u_int64_t blocks, bytes; buffer_init(&m); if (!compat20) { u_char iv[24]; u_char *key; u_int ivlen, keylen; buffer_put_int(&m, packet_get_protocol_flags()); buffer_put_int(&m, packet_get_ssh1_cipher()); debug3("%s: Sending ssh1 KEY+IV", __func__); keylen = packet_get_encryption_key(NULL); key = xmalloc(keylen+1); /* add 1 if keylen == 0 */ keylen = packet_get_encryption_key(key); buffer_put_string(&m, key, keylen); explicit_bzero(key, keylen); free(key); ivlen = packet_get_keyiv_len(MODE_OUT); packet_get_keyiv(MODE_OUT, iv, ivlen); buffer_put_string(&m, iv, ivlen); ivlen = packet_get_keyiv_len(MODE_IN); packet_get_keyiv(MODE_IN, iv, ivlen); buffer_put_string(&m, iv, ivlen); goto skip; } else { /* Kex for rekeying */ mm_send_kex(&m, *monitor->m_pkex); } debug3("%s: Sending new keys: %p %p", __func__, packet_get_newkeys(MODE_OUT), packet_get_newkeys(MODE_IN)); /* Keys from Kex */ if (!mm_newkeys_to_blob(MODE_OUT, &blob, &bloblen)) fatal("%s: conversion of newkeys failed", __func__); buffer_put_string(&m, blob, bloblen); free(blob); if (!mm_newkeys_to_blob(MODE_IN, &blob, &bloblen)) fatal("%s: conversion of newkeys failed", __func__); buffer_put_string(&m, blob, bloblen); free(blob); packet_get_state(MODE_OUT, &seqnr, &blocks, &packets, &bytes); buffer_put_int(&m, seqnr); buffer_put_int64(&m, blocks); buffer_put_int(&m, packets); buffer_put_int64(&m, bytes); packet_get_state(MODE_IN, &seqnr, &blocks, &packets, &bytes); buffer_put_int(&m, seqnr); buffer_put_int64(&m, blocks); buffer_put_int(&m, packets); buffer_put_int64(&m, bytes); debug3("%s: New keys have been sent", __func__); skip: /* More key context */ plen = packet_get_keycontext(MODE_OUT, NULL); p = xmalloc(plen+1); packet_get_keycontext(MODE_OUT, p); buffer_put_string(&m, p, plen); free(p); plen = packet_get_keycontext(MODE_IN, NULL); p = xmalloc(plen+1); packet_get_keycontext(MODE_IN, p); buffer_put_string(&m, p, plen); free(p); /* Compression state */ debug3("%s: Sending compression state", __func__); buffer_put_string(&m, &outgoing_stream, sizeof(outgoing_stream)); buffer_put_string(&m, &incoming_stream, sizeof(incoming_stream)); /* Network I/O buffers */ input = (Buffer *)packet_get_input(); output = (Buffer *)packet_get_output(); buffer_put_string(&m, buffer_ptr(input), buffer_len(input)); buffer_put_string(&m, buffer_ptr(output), buffer_len(output)); /* Roaming */ if (compat20) { buffer_put_int64(&m, get_sent_bytes()); buffer_put_int64(&m, get_recv_bytes()); } mm_request_send(monitor->m_recvfd, MONITOR_REQ_KEYEXPORT, &m); debug3("%s: Finished sending state", __func__); buffer_free(&m); } int mm_pty_allocate(int *ptyfd, int *ttyfd, char *namebuf, size_t namebuflen) { Buffer m; char *p, *msg; int success = 0, tmp1 = -1, tmp2 = -1; /* Kludge: ensure there are fds free to receive the pty/tty */ if ((tmp1 = dup(pmonitor->m_recvfd)) == -1 || (tmp2 = dup(pmonitor->m_recvfd)) == -1) { error("%s: cannot allocate fds for pty", __func__); if (tmp1 > 0) close(tmp1); if (tmp2 > 0) close(tmp2); return 0; } close(tmp1); close(tmp2); buffer_init(&m); mm_request_send(pmonitor->m_recvfd, MONITOR_REQ_PTY, &m); debug3("%s: waiting for MONITOR_ANS_PTY", __func__); mm_request_receive_expect(pmonitor->m_recvfd, MONITOR_ANS_PTY, &m); success = buffer_get_int(&m); if (success == 0) { debug3("%s: pty alloc failed", __func__); buffer_free(&m); return (0); } p = buffer_get_string(&m, NULL); msg = buffer_get_string(&m, NULL); buffer_free(&m); strlcpy(namebuf, p, namebuflen); /* Possible truncation */ free(p); buffer_append(&loginmsg, msg, strlen(msg)); free(msg); if ((*ptyfd = mm_receive_fd(pmonitor->m_recvfd)) == -1 || (*ttyfd = mm_receive_fd(pmonitor->m_recvfd)) == -1) fatal("%s: receive fds failed", __func__); /* Success */ return (1); } void mm_session_pty_cleanup2(Session *s) { Buffer m; if (s->ttyfd == -1) return; buffer_init(&m); buffer_put_cstring(&m, s->tty); mm_request_send(pmonitor->m_recvfd, MONITOR_REQ_PTYCLEANUP, &m); buffer_free(&m); /* closed dup'ed master */ if (s->ptymaster != -1 && close(s->ptymaster) < 0) error("close(s->ptymaster/%d): %s", s->ptymaster, strerror(errno)); /* unlink pty from session */ s->ttyfd = -1; } #ifdef USE_PAM void mm_start_pam(Authctxt *authctxt) { Buffer m; debug3("%s entering", __func__); if (!options.use_pam) fatal("UsePAM=no, but ended up in %s anyway", __func__); buffer_init(&m); mm_request_send(pmonitor->m_recvfd, MONITOR_REQ_PAM_START, &m); buffer_free(&m); } u_int mm_do_pam_account(void) { Buffer m; u_int ret; char *msg; debug3("%s entering", __func__); if (!options.use_pam) fatal("UsePAM=no, but ended up in %s anyway", __func__); buffer_init(&m); mm_request_send(pmonitor->m_recvfd, MONITOR_REQ_PAM_ACCOUNT, &m); mm_request_receive_expect(pmonitor->m_recvfd, MONITOR_ANS_PAM_ACCOUNT, &m); ret = buffer_get_int(&m); msg = buffer_get_string(&m, NULL); buffer_append(&loginmsg, msg, strlen(msg)); free(msg); buffer_free(&m); debug3("%s returning %d", __func__, ret); return (ret); } void * mm_sshpam_init_ctx(Authctxt *authctxt) { Buffer m; int success; debug3("%s", __func__); buffer_init(&m); - buffer_put_cstring(&m, authctxt->user); mm_request_send(pmonitor->m_recvfd, MONITOR_REQ_PAM_INIT_CTX, &m); debug3("%s: waiting for MONITOR_ANS_PAM_INIT_CTX", __func__); mm_request_receive_expect(pmonitor->m_recvfd, MONITOR_ANS_PAM_INIT_CTX, &m); success = buffer_get_int(&m); if (success == 0) { debug3("%s: pam_init_ctx failed", __func__); buffer_free(&m); return (NULL); } buffer_free(&m); return (authctxt); } int mm_sshpam_query(void *ctx, char **name, char **info, u_int *num, char ***prompts, u_int **echo_on) { Buffer m; u_int i; int ret; debug3("%s", __func__); buffer_init(&m); mm_request_send(pmonitor->m_recvfd, MONITOR_REQ_PAM_QUERY, &m); debug3("%s: waiting for MONITOR_ANS_PAM_QUERY", __func__); mm_request_receive_expect(pmonitor->m_recvfd, MONITOR_ANS_PAM_QUERY, &m); ret = buffer_get_int(&m); debug3("%s: pam_query returned %d", __func__, ret); *name = buffer_get_string(&m, NULL); *info = buffer_get_string(&m, NULL); *num = buffer_get_int(&m); if (*num > PAM_MAX_NUM_MSG) fatal("%s: recieved %u PAM messages, expected <= %u", __func__, *num, PAM_MAX_NUM_MSG); *prompts = xcalloc((*num + 1), sizeof(char *)); *echo_on = xcalloc((*num + 1), sizeof(u_int)); for (i = 0; i < *num; ++i) { (*prompts)[i] = buffer_get_string(&m, NULL); (*echo_on)[i] = buffer_get_int(&m); } buffer_free(&m); return (ret); } int mm_sshpam_respond(void *ctx, u_int num, char **resp) { Buffer m; u_int i; int ret; debug3("%s", __func__); buffer_init(&m); buffer_put_int(&m, num); for (i = 0; i < num; ++i) buffer_put_cstring(&m, resp[i]); mm_request_send(pmonitor->m_recvfd, MONITOR_REQ_PAM_RESPOND, &m); debug3("%s: waiting for MONITOR_ANS_PAM_RESPOND", __func__); mm_request_receive_expect(pmonitor->m_recvfd, MONITOR_ANS_PAM_RESPOND, &m); ret = buffer_get_int(&m); debug3("%s: pam_respond returned %d", __func__, ret); buffer_free(&m); return (ret); } void mm_sshpam_free_ctx(void *ctxtp) { Buffer m; debug3("%s", __func__); buffer_init(&m); mm_request_send(pmonitor->m_recvfd, MONITOR_REQ_PAM_FREE_CTX, &m); debug3("%s: waiting for MONITOR_ANS_PAM_FREE_CTX", __func__); mm_request_receive_expect(pmonitor->m_recvfd, MONITOR_ANS_PAM_FREE_CTX, &m); buffer_free(&m); } #endif /* USE_PAM */ /* Request process termination */ void mm_terminate(void) { Buffer m; buffer_init(&m); mm_request_send(pmonitor->m_recvfd, MONITOR_REQ_TERM, &m); buffer_free(&m); } int mm_ssh1_session_key(BIGNUM *num) { int rsafail; Buffer m; buffer_init(&m); buffer_put_bignum2(&m, num); mm_request_send(pmonitor->m_recvfd, MONITOR_REQ_SESSKEY, &m); mm_request_receive_expect(pmonitor->m_recvfd, MONITOR_ANS_SESSKEY, &m); rsafail = buffer_get_int(&m); buffer_get_bignum2(&m, num); buffer_free(&m); return (rsafail); } static void mm_chall_setup(char **name, char **infotxt, u_int *numprompts, char ***prompts, u_int **echo_on) { *name = xstrdup(""); *infotxt = xstrdup(""); *numprompts = 1; *prompts = xcalloc(*numprompts, sizeof(char *)); *echo_on = xcalloc(*numprompts, sizeof(u_int)); (*echo_on)[0] = 0; } int mm_bsdauth_query(void *ctx, char **name, char **infotxt, u_int *numprompts, char ***prompts, u_int **echo_on) { Buffer m; u_int success; char *challenge; debug3("%s: entering", __func__); buffer_init(&m); mm_request_send(pmonitor->m_recvfd, MONITOR_REQ_BSDAUTHQUERY, &m); mm_request_receive_expect(pmonitor->m_recvfd, MONITOR_ANS_BSDAUTHQUERY, &m); success = buffer_get_int(&m); if (success == 0) { debug3("%s: no challenge", __func__); buffer_free(&m); return (-1); } /* Get the challenge, and format the response */ challenge = buffer_get_string(&m, NULL); buffer_free(&m); mm_chall_setup(name, infotxt, numprompts, prompts, echo_on); (*prompts)[0] = challenge; debug3("%s: received challenge: %s", __func__, challenge); return (0); } int mm_bsdauth_respond(void *ctx, u_int numresponses, char **responses) { Buffer m; int authok; debug3("%s: entering", __func__); if (numresponses != 1) return (-1); buffer_init(&m); buffer_put_cstring(&m, responses[0]); mm_request_send(pmonitor->m_recvfd, MONITOR_REQ_BSDAUTHRESPOND, &m); mm_request_receive_expect(pmonitor->m_recvfd, MONITOR_ANS_BSDAUTHRESPOND, &m); authok = buffer_get_int(&m); buffer_free(&m); return ((authok == 0) ? -1 : 0); } #ifdef SKEY int mm_skey_query(void *ctx, char **name, char **infotxt, u_int *numprompts, char ***prompts, u_int **echo_on) { Buffer m; u_int success; char *challenge; debug3("%s: entering", __func__); buffer_init(&m); mm_request_send(pmonitor->m_recvfd, MONITOR_REQ_SKEYQUERY, &m); mm_request_receive_expect(pmonitor->m_recvfd, MONITOR_ANS_SKEYQUERY, &m); success = buffer_get_int(&m); if (success == 0) { debug3("%s: no challenge", __func__); buffer_free(&m); return (-1); } /* Get the challenge, and format the response */ challenge = buffer_get_string(&m, NULL); buffer_free(&m); debug3("%s: received challenge: %s", __func__, challenge); mm_chall_setup(name, infotxt, numprompts, prompts, echo_on); xasprintf(*prompts, "%s%s", challenge, SKEY_PROMPT); free(challenge); return (0); } int mm_skey_respond(void *ctx, u_int numresponses, char **responses) { Buffer m; int authok; debug3("%s: entering", __func__); if (numresponses != 1) return (-1); buffer_init(&m); buffer_put_cstring(&m, responses[0]); mm_request_send(pmonitor->m_recvfd, MONITOR_REQ_SKEYRESPOND, &m); mm_request_receive_expect(pmonitor->m_recvfd, MONITOR_ANS_SKEYRESPOND, &m); authok = buffer_get_int(&m); buffer_free(&m); return ((authok == 0) ? -1 : 0); } #endif /* SKEY */ void mm_ssh1_session_id(u_char session_id[16]) { Buffer m; int i; debug3("%s entering", __func__); buffer_init(&m); for (i = 0; i < 16; i++) buffer_put_char(&m, session_id[i]); mm_request_send(pmonitor->m_recvfd, MONITOR_REQ_SESSID, &m); buffer_free(&m); } int mm_auth_rsa_key_allowed(struct passwd *pw, BIGNUM *client_n, Key **rkey) { Buffer m; Key *key; u_char *blob; u_int blen; int allowed = 0, have_forced = 0; debug3("%s entering", __func__); buffer_init(&m); buffer_put_bignum2(&m, client_n); mm_request_send(pmonitor->m_recvfd, MONITOR_REQ_RSAKEYALLOWED, &m); mm_request_receive_expect(pmonitor->m_recvfd, MONITOR_ANS_RSAKEYALLOWED, &m); allowed = buffer_get_int(&m); /* fake forced command */ auth_clear_options(); have_forced = buffer_get_int(&m); forced_command = have_forced ? xstrdup("true") : NULL; if (allowed && rkey != NULL) { blob = buffer_get_string(&m, &blen); if ((key = key_from_blob(blob, blen)) == NULL) fatal("%s: key_from_blob failed", __func__); *rkey = key; free(blob); } buffer_free(&m); return (allowed); } BIGNUM * mm_auth_rsa_generate_challenge(Key *key) { Buffer m; BIGNUM *challenge; u_char *blob; u_int blen; debug3("%s entering", __func__); if ((challenge = BN_new()) == NULL) fatal("%s: BN_new failed", __func__); key->type = KEY_RSA; /* XXX cheat for key_to_blob */ if (key_to_blob(key, &blob, &blen) == 0) fatal("%s: key_to_blob failed", __func__); key->type = KEY_RSA1; buffer_init(&m); buffer_put_string(&m, blob, blen); free(blob); mm_request_send(pmonitor->m_recvfd, MONITOR_REQ_RSACHALLENGE, &m); mm_request_receive_expect(pmonitor->m_recvfd, MONITOR_ANS_RSACHALLENGE, &m); buffer_get_bignum2(&m, challenge); buffer_free(&m); return (challenge); } int mm_auth_rsa_verify_response(Key *key, BIGNUM *p, u_char response[16]) { Buffer m; u_char *blob; u_int blen; int success = 0; debug3("%s entering", __func__); key->type = KEY_RSA; /* XXX cheat for key_to_blob */ if (key_to_blob(key, &blob, &blen) == 0) fatal("%s: key_to_blob failed", __func__); key->type = KEY_RSA1; buffer_init(&m); buffer_put_string(&m, blob, blen); buffer_put_string(&m, response, 16); free(blob); mm_request_send(pmonitor->m_recvfd, MONITOR_REQ_RSARESPONSE, &m); mm_request_receive_expect(pmonitor->m_recvfd, MONITOR_ANS_RSARESPONSE, &m); success = buffer_get_int(&m); buffer_free(&m); return (success); } #ifdef SSH_AUDIT_EVENTS void mm_audit_event(ssh_audit_event_t event) { Buffer m; debug3("%s entering", __func__); buffer_init(&m); buffer_put_int(&m, event); mm_request_send(pmonitor->m_recvfd, MONITOR_REQ_AUDIT_EVENT, &m); buffer_free(&m); } void mm_audit_run_command(const char *command) { Buffer m; debug3("%s entering command %s", __func__, command); buffer_init(&m); buffer_put_cstring(&m, command); mm_request_send(pmonitor->m_recvfd, MONITOR_REQ_AUDIT_COMMAND, &m); buffer_free(&m); } #endif /* SSH_AUDIT_EVENTS */ #ifdef GSSAPI OM_uint32 mm_ssh_gssapi_server_ctx(Gssctxt **ctx, gss_OID goid) { Buffer m; OM_uint32 major; /* Client doesn't get to see the context */ *ctx = NULL; buffer_init(&m); buffer_put_string(&m, goid->elements, goid->length); mm_request_send(pmonitor->m_recvfd, MONITOR_REQ_GSSSETUP, &m); mm_request_receive_expect(pmonitor->m_recvfd, MONITOR_ANS_GSSSETUP, &m); major = buffer_get_int(&m); buffer_free(&m); return (major); } OM_uint32 mm_ssh_gssapi_accept_ctx(Gssctxt *ctx, gss_buffer_desc *in, gss_buffer_desc *out, OM_uint32 *flags) { Buffer m; OM_uint32 major; u_int len; buffer_init(&m); buffer_put_string(&m, in->value, in->length); mm_request_send(pmonitor->m_recvfd, MONITOR_REQ_GSSSTEP, &m); mm_request_receive_expect(pmonitor->m_recvfd, MONITOR_ANS_GSSSTEP, &m); major = buffer_get_int(&m); out->value = buffer_get_string(&m, &len); out->length = len; if (flags) *flags = buffer_get_int(&m); buffer_free(&m); return (major); } OM_uint32 mm_ssh_gssapi_checkmic(Gssctxt *ctx, gss_buffer_t gssbuf, gss_buffer_t gssmic) { Buffer m; OM_uint32 major; buffer_init(&m); buffer_put_string(&m, gssbuf->value, gssbuf->length); buffer_put_string(&m, gssmic->value, gssmic->length); mm_request_send(pmonitor->m_recvfd, MONITOR_REQ_GSSCHECKMIC, &m); mm_request_receive_expect(pmonitor->m_recvfd, MONITOR_ANS_GSSCHECKMIC, &m); major = buffer_get_int(&m); buffer_free(&m); return(major); } int mm_ssh_gssapi_userok(char *user) { Buffer m; int authenticated = 0; buffer_init(&m); mm_request_send(pmonitor->m_recvfd, MONITOR_REQ_GSSUSEROK, &m); mm_request_receive_expect(pmonitor->m_recvfd, MONITOR_ANS_GSSUSEROK, &m); authenticated = buffer_get_int(&m); buffer_free(&m); debug3("%s: user %sauthenticated",__func__, authenticated ? "" : "not "); return (authenticated); } #endif /* GSSAPI */ Index: releng/9.3/crypto/openssh/mux.c =================================================================== --- releng/9.3/crypto/openssh/mux.c (revision 287146) +++ releng/9.3/crypto/openssh/mux.c (revision 287147) @@ -1,2104 +1,2106 @@ /* $OpenBSD: mux.c,v 1.44 2013/07/12 00:19:58 djm Exp $ */ /* $FreeBSD$ */ /* * Copyright (c) 2002-2008 Damien Miller * * Permission to use, copy, modify, and distribute this software for any * purpose with or without fee is hereby granted, provided that the above * copyright notice and this permission notice appear in all copies. * * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. */ /* ssh session multiplexing support */ /* * TODO: * - Better signalling from master to slave, especially passing of * error messages * - Better fall-back from mux slave error to new connection. * - ExitOnForwardingFailure * - Maybe extension mechanisms for multi-X11/multi-agent forwarding * - Support ~^Z in mux slaves. * - Inspect or control sessions in master. * - If we ever support the "signal" channel request, send signals on * sessions in master. */ #include "includes.h" __RCSID("$FreeBSD$"); #include #include #include #include #include #include #include #include #include #include #include #include #include #include #ifdef HAVE_PATHS_H #include #endif #ifdef HAVE_POLL_H #include #else # ifdef HAVE_SYS_POLL_H # include # endif #endif #ifdef HAVE_UTIL_H # include #endif #include "openbsd-compat/sys-queue.h" #include "xmalloc.h" #include "log.h" #include "ssh.h" #include "ssh2.h" #include "pathnames.h" #include "misc.h" #include "match.h" #include "buffer.h" #include "channels.h" #include "msg.h" #include "packet.h" #include "monitor_fdpass.h" #include "sshpty.h" #include "key.h" #include "readconf.h" #include "clientloop.h" /* from ssh.c */ extern int tty_flag; extern Options options; extern int stdin_null_flag; extern char *host; extern int subsystem_flag; extern Buffer command; extern volatile sig_atomic_t quit_pending; extern char *stdio_forward_host; extern int stdio_forward_port; /* Context for session open confirmation callback */ struct mux_session_confirm_ctx { u_int want_tty; u_int want_subsys; u_int want_x_fwd; u_int want_agent_fwd; Buffer cmd; char *term; struct termios tio; char **env; u_int rid; }; /* Context for global channel callback */ struct mux_channel_confirm_ctx { u_int cid; /* channel id */ u_int rid; /* request id */ int fid; /* forward id */ }; /* fd to control socket */ int muxserver_sock = -1; /* client request id */ u_int muxclient_request_id = 0; /* Multiplexing control command */ u_int muxclient_command = 0; /* Set when signalled. */ static volatile sig_atomic_t muxclient_terminate = 0; /* PID of multiplex server */ static u_int muxserver_pid = 0; static Channel *mux_listener_channel = NULL; struct mux_master_state { int hello_rcvd; }; /* mux protocol messages */ #define MUX_MSG_HELLO 0x00000001 #define MUX_C_NEW_SESSION 0x10000002 #define MUX_C_ALIVE_CHECK 0x10000004 #define MUX_C_TERMINATE 0x10000005 #define MUX_C_OPEN_FWD 0x10000006 #define MUX_C_CLOSE_FWD 0x10000007 #define MUX_C_NEW_STDIO_FWD 0x10000008 #define MUX_C_STOP_LISTENING 0x10000009 #define MUX_S_OK 0x80000001 #define MUX_S_PERMISSION_DENIED 0x80000002 #define MUX_S_FAILURE 0x80000003 #define MUX_S_EXIT_MESSAGE 0x80000004 #define MUX_S_ALIVE 0x80000005 #define MUX_S_SESSION_OPENED 0x80000006 #define MUX_S_REMOTE_PORT 0x80000007 #define MUX_S_TTY_ALLOC_FAIL 0x80000008 /* type codes for MUX_C_OPEN_FWD and MUX_C_CLOSE_FWD */ #define MUX_FWD_LOCAL 1 #define MUX_FWD_REMOTE 2 #define MUX_FWD_DYNAMIC 3 static void mux_session_confirm(int, int, void *); static int process_mux_master_hello(u_int, Channel *, Buffer *, Buffer *); static int process_mux_new_session(u_int, Channel *, Buffer *, Buffer *); static int process_mux_alive_check(u_int, Channel *, Buffer *, Buffer *); static int process_mux_terminate(u_int, Channel *, Buffer *, Buffer *); static int process_mux_open_fwd(u_int, Channel *, Buffer *, Buffer *); static int process_mux_close_fwd(u_int, Channel *, Buffer *, Buffer *); static int process_mux_stdio_fwd(u_int, Channel *, Buffer *, Buffer *); static int process_mux_stop_listening(u_int, Channel *, Buffer *, Buffer *); static const struct { u_int type; int (*handler)(u_int, Channel *, Buffer *, Buffer *); } mux_master_handlers[] = { { MUX_MSG_HELLO, process_mux_master_hello }, { MUX_C_NEW_SESSION, process_mux_new_session }, { MUX_C_ALIVE_CHECK, process_mux_alive_check }, { MUX_C_TERMINATE, process_mux_terminate }, { MUX_C_OPEN_FWD, process_mux_open_fwd }, { MUX_C_CLOSE_FWD, process_mux_close_fwd }, { MUX_C_NEW_STDIO_FWD, process_mux_stdio_fwd }, { MUX_C_STOP_LISTENING, process_mux_stop_listening }, { 0, NULL } }; /* Cleanup callback fired on closure of mux slave _session_ channel */ /* ARGSUSED */ static void mux_master_session_cleanup_cb(int cid, void *unused) { Channel *cc, *c = channel_by_id(cid); debug3("%s: entering for channel %d", __func__, cid); if (c == NULL) fatal("%s: channel_by_id(%i) == NULL", __func__, cid); if (c->ctl_chan != -1) { if ((cc = channel_by_id(c->ctl_chan)) == NULL) fatal("%s: channel %d missing control channel %d", __func__, c->self, c->ctl_chan); c->ctl_chan = -1; cc->remote_id = -1; chan_rcvd_oclose(cc); } channel_cancel_cleanup(c->self); } /* Cleanup callback fired on closure of mux slave _control_ channel */ /* ARGSUSED */ static void mux_master_control_cleanup_cb(int cid, void *unused) { Channel *sc, *c = channel_by_id(cid); debug3("%s: entering for channel %d", __func__, cid); if (c == NULL) fatal("%s: channel_by_id(%i) == NULL", __func__, cid); if (c->remote_id != -1) { if ((sc = channel_by_id(c->remote_id)) == NULL) fatal("%s: channel %d missing session channel %d", __func__, c->self, c->remote_id); c->remote_id = -1; sc->ctl_chan = -1; if (sc->type != SSH_CHANNEL_OPEN && sc->type != SSH_CHANNEL_OPENING) { debug2("%s: channel %d: not open", __func__, sc->self); chan_mark_dead(sc); } else { if (sc->istate == CHAN_INPUT_OPEN) chan_read_failed(sc); if (sc->ostate == CHAN_OUTPUT_OPEN) chan_write_failed(sc); } } channel_cancel_cleanup(c->self); } /* Check mux client environment variables before passing them to mux master. */ static int env_permitted(char *env) { int i, ret; char name[1024], *cp; if ((cp = strchr(env, '=')) == NULL || cp == env) return 0; ret = snprintf(name, sizeof(name), "%.*s", (int)(cp - env), env); if (ret <= 0 || (size_t)ret >= sizeof(name)) { error("env_permitted: name '%.100s...' too long", env); return 0; } for (i = 0; i < options.num_send_env; i++) if (match_pattern(name, options.send_env[i])) return 1; return 0; } /* Mux master protocol message handlers */ static int process_mux_master_hello(u_int rid, Channel *c, Buffer *m, Buffer *r) { u_int ver; struct mux_master_state *state = (struct mux_master_state *)c->mux_ctx; if (state == NULL) fatal("%s: channel %d: c->mux_ctx == NULL", __func__, c->self); if (state->hello_rcvd) { error("%s: HELLO received twice", __func__); return -1; } if (buffer_get_int_ret(&ver, m) != 0) { malf: error("%s: malformed message", __func__); return -1; } if (ver != SSHMUX_VER) { error("Unsupported multiplexing protocol version %d " "(expected %d)", ver, SSHMUX_VER); return -1; } debug2("%s: channel %d slave version %u", __func__, c->self, ver); /* No extensions are presently defined */ while (buffer_len(m) > 0) { char *name = buffer_get_string_ret(m, NULL); char *value = buffer_get_string_ret(m, NULL); if (name == NULL || value == NULL) { free(name); free(value); goto malf; } debug2("Unrecognised slave extension \"%s\"", name); free(name); free(value); } state->hello_rcvd = 1; return 0; } static int process_mux_new_session(u_int rid, Channel *c, Buffer *m, Buffer *r) { Channel *nc; struct mux_session_confirm_ctx *cctx; char *reserved, *cmd, *cp; u_int i, j, len, env_len, escape_char, window, packetmax; int new_fd[3]; /* Reply for SSHMUX_COMMAND_OPEN */ cctx = xcalloc(1, sizeof(*cctx)); cctx->term = NULL; cctx->rid = rid; cmd = reserved = NULL; cctx->env = NULL; env_len = 0; if ((reserved = buffer_get_string_ret(m, NULL)) == NULL || buffer_get_int_ret(&cctx->want_tty, m) != 0 || buffer_get_int_ret(&cctx->want_x_fwd, m) != 0 || buffer_get_int_ret(&cctx->want_agent_fwd, m) != 0 || buffer_get_int_ret(&cctx->want_subsys, m) != 0 || buffer_get_int_ret(&escape_char, m) != 0 || (cctx->term = buffer_get_string_ret(m, &len)) == NULL || (cmd = buffer_get_string_ret(m, &len)) == NULL) { malf: free(cmd); free(reserved); for (j = 0; j < env_len; j++) free(cctx->env[j]); free(cctx->env); free(cctx->term); free(cctx); error("%s: malformed message", __func__); return -1; } free(reserved); reserved = NULL; while (buffer_len(m) > 0) { #define MUX_MAX_ENV_VARS 4096 if ((cp = buffer_get_string_ret(m, &len)) == NULL) goto malf; if (!env_permitted(cp)) { free(cp); continue; } cctx->env = xrealloc(cctx->env, env_len + 2, sizeof(*cctx->env)); cctx->env[env_len++] = cp; cctx->env[env_len] = NULL; if (env_len > MUX_MAX_ENV_VARS) { error(">%d environment variables received, ignoring " "additional", MUX_MAX_ENV_VARS); break; } } debug2("%s: channel %d: request tty %d, X %d, agent %d, subsys %d, " "term \"%s\", cmd \"%s\", env %u", __func__, c->self, cctx->want_tty, cctx->want_x_fwd, cctx->want_agent_fwd, cctx->want_subsys, cctx->term, cmd, env_len); buffer_init(&cctx->cmd); buffer_append(&cctx->cmd, cmd, strlen(cmd)); free(cmd); cmd = NULL; /* Gather fds from client */ for(i = 0; i < 3; i++) { if ((new_fd[i] = mm_receive_fd(c->sock)) == -1) { error("%s: failed to receive fd %d from slave", __func__, i); for (j = 0; j < i; j++) close(new_fd[j]); for (j = 0; j < env_len; j++) free(cctx->env[j]); free(cctx->env); free(cctx->term); buffer_free(&cctx->cmd); free(cctx); /* prepare reply */ buffer_put_int(r, MUX_S_FAILURE); buffer_put_int(r, rid); buffer_put_cstring(r, "did not receive file descriptors"); return -1; } } debug3("%s: got fds stdin %d, stdout %d, stderr %d", __func__, new_fd[0], new_fd[1], new_fd[2]); /* XXX support multiple child sessions in future */ if (c->remote_id != -1) { debug2("%s: session already open", __func__); /* prepare reply */ buffer_put_int(r, MUX_S_FAILURE); buffer_put_int(r, rid); buffer_put_cstring(r, "Multiple sessions not supported"); cleanup: close(new_fd[0]); close(new_fd[1]); close(new_fd[2]); free(cctx->term); if (env_len != 0) { for (i = 0; i < env_len; i++) free(cctx->env[i]); free(cctx->env); } buffer_free(&cctx->cmd); free(cctx); return 0; } if (options.control_master == SSHCTL_MASTER_ASK || options.control_master == SSHCTL_MASTER_AUTO_ASK) { if (!ask_permission("Allow shared connection to %s? ", host)) { debug2("%s: session refused by user", __func__); /* prepare reply */ buffer_put_int(r, MUX_S_PERMISSION_DENIED); buffer_put_int(r, rid); buffer_put_cstring(r, "Permission denied"); goto cleanup; } } /* Try to pick up ttymodes from client before it goes raw */ if (cctx->want_tty && tcgetattr(new_fd[0], &cctx->tio) == -1) error("%s: tcgetattr: %s", __func__, strerror(errno)); /* enable nonblocking unless tty */ if (!isatty(new_fd[0])) set_nonblock(new_fd[0]); if (!isatty(new_fd[1])) set_nonblock(new_fd[1]); if (!isatty(new_fd[2])) set_nonblock(new_fd[2]); window = CHAN_SES_WINDOW_DEFAULT; packetmax = CHAN_SES_PACKET_DEFAULT; if (cctx->want_tty) { window >>= 1; packetmax >>= 1; } nc = channel_new("session", SSH_CHANNEL_OPENING, new_fd[0], new_fd[1], new_fd[2], window, packetmax, CHAN_EXTENDED_WRITE, "client-session", /*nonblock*/0); nc->ctl_chan = c->self; /* link session -> control channel */ c->remote_id = nc->self; /* link control -> session channel */ if (cctx->want_tty && escape_char != 0xffffffff) { channel_register_filter(nc->self, client_simple_escape_filter, NULL, client_filter_cleanup, client_new_escape_filter_ctx((int)escape_char)); } debug2("%s: channel_new: %d linked to control channel %d", __func__, nc->self, nc->ctl_chan); channel_send_open(nc->self); channel_register_open_confirm(nc->self, mux_session_confirm, cctx); c->mux_pause = 1; /* stop handling messages until open_confirm done */ channel_register_cleanup(nc->self, mux_master_session_cleanup_cb, 1); /* reply is deferred, sent by mux_session_confirm */ return 0; } static int process_mux_alive_check(u_int rid, Channel *c, Buffer *m, Buffer *r) { debug2("%s: channel %d: alive check", __func__, c->self); /* prepare reply */ buffer_put_int(r, MUX_S_ALIVE); buffer_put_int(r, rid); buffer_put_int(r, (u_int)getpid()); return 0; } static int process_mux_terminate(u_int rid, Channel *c, Buffer *m, Buffer *r) { debug2("%s: channel %d: terminate request", __func__, c->self); if (options.control_master == SSHCTL_MASTER_ASK || options.control_master == SSHCTL_MASTER_AUTO_ASK) { if (!ask_permission("Terminate shared connection to %s? ", host)) { debug2("%s: termination refused by user", __func__); buffer_put_int(r, MUX_S_PERMISSION_DENIED); buffer_put_int(r, rid); buffer_put_cstring(r, "Permission denied"); return 0; } } quit_pending = 1; buffer_put_int(r, MUX_S_OK); buffer_put_int(r, rid); /* XXX exit happens too soon - message never makes it to client */ return 0; } static char * format_forward(u_int ftype, Forward *fwd) { char *ret; switch (ftype) { case MUX_FWD_LOCAL: xasprintf(&ret, "local forward %.200s:%d -> %.200s:%d", (fwd->listen_host == NULL) ? (options.gateway_ports ? "*" : "LOCALHOST") : fwd->listen_host, fwd->listen_port, fwd->connect_host, fwd->connect_port); break; case MUX_FWD_DYNAMIC: xasprintf(&ret, "dynamic forward %.200s:%d -> *", (fwd->listen_host == NULL) ? (options.gateway_ports ? "*" : "LOCALHOST") : fwd->listen_host, fwd->listen_port); break; case MUX_FWD_REMOTE: xasprintf(&ret, "remote forward %.200s:%d -> %.200s:%d", (fwd->listen_host == NULL) ? "LOCALHOST" : fwd->listen_host, fwd->listen_port, fwd->connect_host, fwd->connect_port); break; default: fatal("%s: unknown forward type %u", __func__, ftype); } return ret; } static int compare_host(const char *a, const char *b) { if (a == NULL && b == NULL) return 1; if (a == NULL || b == NULL) return 0; return strcmp(a, b) == 0; } static int compare_forward(Forward *a, Forward *b) { if (!compare_host(a->listen_host, b->listen_host)) return 0; if (a->listen_port != b->listen_port) return 0; if (!compare_host(a->connect_host, b->connect_host)) return 0; if (a->connect_port != b->connect_port) return 0; return 1; } static void mux_confirm_remote_forward(int type, u_int32_t seq, void *ctxt) { struct mux_channel_confirm_ctx *fctx = ctxt; char *failmsg = NULL; Forward *rfwd; Channel *c; Buffer out; if ((c = channel_by_id(fctx->cid)) == NULL) { /* no channel for reply */ error("%s: unknown channel", __func__); return; } buffer_init(&out); if (fctx->fid >= options.num_remote_forwards) { xasprintf(&failmsg, "unknown forwarding id %d", fctx->fid); goto fail; } rfwd = &options.remote_forwards[fctx->fid]; debug("%s: %s for: listen %d, connect %s:%d", __func__, type == SSH2_MSG_REQUEST_SUCCESS ? "success" : "failure", rfwd->listen_port, rfwd->connect_host, rfwd->connect_port); if (type == SSH2_MSG_REQUEST_SUCCESS) { if (rfwd->listen_port == 0) { rfwd->allocated_port = packet_get_int(); logit("Allocated port %u for mux remote forward" " to %s:%d", rfwd->allocated_port, rfwd->connect_host, rfwd->connect_port); buffer_put_int(&out, MUX_S_REMOTE_PORT); buffer_put_int(&out, fctx->rid); buffer_put_int(&out, rfwd->allocated_port); channel_update_permitted_opens(rfwd->handle, rfwd->allocated_port); } else { buffer_put_int(&out, MUX_S_OK); buffer_put_int(&out, fctx->rid); } goto out; } else { if (rfwd->listen_port == 0) channel_update_permitted_opens(rfwd->handle, -1); xasprintf(&failmsg, "remote port forwarding failed for " "listen port %d", rfwd->listen_port); } fail: error("%s: %s", __func__, failmsg); buffer_put_int(&out, MUX_S_FAILURE); buffer_put_int(&out, fctx->rid); buffer_put_cstring(&out, failmsg); free(failmsg); out: buffer_put_string(&c->output, buffer_ptr(&out), buffer_len(&out)); buffer_free(&out); if (c->mux_pause <= 0) fatal("%s: mux_pause %d", __func__, c->mux_pause); c->mux_pause = 0; /* start processing messages again */ } static int process_mux_open_fwd(u_int rid, Channel *c, Buffer *m, Buffer *r) { Forward fwd; char *fwd_desc = NULL; u_int ftype; u_int lport, cport; int i, ret = 0, freefwd = 1; - fwd.listen_host = fwd.connect_host = NULL; + memset(&fwd, 0, sizeof(fwd)); + if (buffer_get_int_ret(&ftype, m) != 0 || (fwd.listen_host = buffer_get_string_ret(m, NULL)) == NULL || buffer_get_int_ret(&lport, m) != 0 || (fwd.connect_host = buffer_get_string_ret(m, NULL)) == NULL || buffer_get_int_ret(&cport, m) != 0 || lport > 65535 || cport > 65535) { error("%s: malformed message", __func__); ret = -1; goto out; } fwd.listen_port = lport; fwd.connect_port = cport; if (*fwd.listen_host == '\0') { free(fwd.listen_host); fwd.listen_host = NULL; } if (*fwd.connect_host == '\0') { free(fwd.connect_host); fwd.connect_host = NULL; } debug2("%s: channel %d: request %s", __func__, c->self, (fwd_desc = format_forward(ftype, &fwd))); if (ftype != MUX_FWD_LOCAL && ftype != MUX_FWD_REMOTE && ftype != MUX_FWD_DYNAMIC) { logit("%s: invalid forwarding type %u", __func__, ftype); invalid: free(fwd.listen_host); free(fwd.connect_host); buffer_put_int(r, MUX_S_FAILURE); buffer_put_int(r, rid); buffer_put_cstring(r, "Invalid forwarding request"); return 0; } if (fwd.listen_port >= 65536) { logit("%s: invalid listen port %u", __func__, fwd.listen_port); goto invalid; } if (fwd.connect_port >= 65536 || (ftype != MUX_FWD_DYNAMIC && ftype != MUX_FWD_REMOTE && fwd.connect_port == 0)) { logit("%s: invalid connect port %u", __func__, fwd.connect_port); goto invalid; } if (ftype != MUX_FWD_DYNAMIC && fwd.connect_host == NULL) { logit("%s: missing connect host", __func__); goto invalid; } /* Skip forwards that have already been requested */ switch (ftype) { case MUX_FWD_LOCAL: case MUX_FWD_DYNAMIC: for (i = 0; i < options.num_local_forwards; i++) { if (compare_forward(&fwd, options.local_forwards + i)) { exists: debug2("%s: found existing forwarding", __func__); buffer_put_int(r, MUX_S_OK); buffer_put_int(r, rid); goto out; } } break; case MUX_FWD_REMOTE: for (i = 0; i < options.num_remote_forwards; i++) { if (compare_forward(&fwd, options.remote_forwards + i)) { if (fwd.listen_port != 0) goto exists; debug2("%s: found allocated port", __func__); buffer_put_int(r, MUX_S_REMOTE_PORT); buffer_put_int(r, rid); buffer_put_int(r, options.remote_forwards[i].allocated_port); goto out; } } break; } if (options.control_master == SSHCTL_MASTER_ASK || options.control_master == SSHCTL_MASTER_AUTO_ASK) { if (!ask_permission("Open %s on %s?", fwd_desc, host)) { debug2("%s: forwarding refused by user", __func__); buffer_put_int(r, MUX_S_PERMISSION_DENIED); buffer_put_int(r, rid); buffer_put_cstring(r, "Permission denied"); goto out; } } if (ftype == MUX_FWD_LOCAL || ftype == MUX_FWD_DYNAMIC) { if (!channel_setup_local_fwd_listener(fwd.listen_host, fwd.listen_port, fwd.connect_host, fwd.connect_port, options.gateway_ports)) { fail: logit("slave-requested %s failed", fwd_desc); buffer_put_int(r, MUX_S_FAILURE); buffer_put_int(r, rid); buffer_put_cstring(r, "Port forwarding failed"); goto out; } add_local_forward(&options, &fwd); freefwd = 0; } else { struct mux_channel_confirm_ctx *fctx; fwd.handle = channel_request_remote_forwarding(fwd.listen_host, fwd.listen_port, fwd.connect_host, fwd.connect_port); if (fwd.handle < 0) goto fail; add_remote_forward(&options, &fwd); fctx = xcalloc(1, sizeof(*fctx)); fctx->cid = c->self; fctx->rid = rid; fctx->fid = options.num_remote_forwards - 1; client_register_global_confirm(mux_confirm_remote_forward, fctx); freefwd = 0; c->mux_pause = 1; /* wait for mux_confirm_remote_forward */ /* delayed reply in mux_confirm_remote_forward */ goto out; } buffer_put_int(r, MUX_S_OK); buffer_put_int(r, rid); out: free(fwd_desc); if (freefwd) { free(fwd.listen_host); free(fwd.connect_host); } return ret; } static int process_mux_close_fwd(u_int rid, Channel *c, Buffer *m, Buffer *r) { Forward fwd, *found_fwd; char *fwd_desc = NULL; const char *error_reason = NULL; u_int ftype; int i, listen_port, ret = 0; u_int lport, cport; - fwd.listen_host = fwd.connect_host = NULL; + memset(&fwd, 0, sizeof(fwd)); + if (buffer_get_int_ret(&ftype, m) != 0 || (fwd.listen_host = buffer_get_string_ret(m, NULL)) == NULL || buffer_get_int_ret(&lport, m) != 0 || (fwd.connect_host = buffer_get_string_ret(m, NULL)) == NULL || buffer_get_int_ret(&cport, m) != 0 || lport > 65535 || cport > 65535) { error("%s: malformed message", __func__); ret = -1; goto out; } fwd.listen_port = lport; fwd.connect_port = cport; if (*fwd.listen_host == '\0') { free(fwd.listen_host); fwd.listen_host = NULL; } if (*fwd.connect_host == '\0') { free(fwd.connect_host); fwd.connect_host = NULL; } debug2("%s: channel %d: request cancel %s", __func__, c->self, (fwd_desc = format_forward(ftype, &fwd))); /* make sure this has been requested */ found_fwd = NULL; switch (ftype) { case MUX_FWD_LOCAL: case MUX_FWD_DYNAMIC: for (i = 0; i < options.num_local_forwards; i++) { if (compare_forward(&fwd, options.local_forwards + i)) { found_fwd = options.local_forwards + i; break; } } break; case MUX_FWD_REMOTE: for (i = 0; i < options.num_remote_forwards; i++) { if (compare_forward(&fwd, options.remote_forwards + i)) { found_fwd = options.remote_forwards + i; break; } } break; } if (found_fwd == NULL) error_reason = "port not forwarded"; else if (ftype == MUX_FWD_REMOTE) { /* * This shouldn't fail unless we confused the host/port * between options.remote_forwards and permitted_opens. * However, for dynamic allocated listen ports we need * to lookup the actual listen port. */ listen_port = (fwd.listen_port == 0) ? found_fwd->allocated_port : fwd.listen_port; if (channel_request_rforward_cancel(fwd.listen_host, listen_port) == -1) error_reason = "port not in permitted opens"; } else { /* local and dynamic forwards */ /* Ditto */ if (channel_cancel_lport_listener(fwd.listen_host, fwd.listen_port, fwd.connect_port, options.gateway_ports) == -1) error_reason = "port not found"; } if (error_reason == NULL) { buffer_put_int(r, MUX_S_OK); buffer_put_int(r, rid); free(found_fwd->listen_host); free(found_fwd->connect_host); found_fwd->listen_host = found_fwd->connect_host = NULL; found_fwd->listen_port = found_fwd->connect_port = 0; } else { buffer_put_int(r, MUX_S_FAILURE); buffer_put_int(r, rid); buffer_put_cstring(r, error_reason); } out: free(fwd_desc); free(fwd.listen_host); free(fwd.connect_host); return ret; } static int process_mux_stdio_fwd(u_int rid, Channel *c, Buffer *m, Buffer *r) { Channel *nc; char *reserved, *chost; u_int cport, i, j; int new_fd[2]; chost = reserved = NULL; if ((reserved = buffer_get_string_ret(m, NULL)) == NULL || (chost = buffer_get_string_ret(m, NULL)) == NULL || buffer_get_int_ret(&cport, m) != 0) { free(reserved); free(chost); error("%s: malformed message", __func__); return -1; } free(reserved); debug2("%s: channel %d: request stdio fwd to %s:%u", __func__, c->self, chost, cport); /* Gather fds from client */ for(i = 0; i < 2; i++) { if ((new_fd[i] = mm_receive_fd(c->sock)) == -1) { error("%s: failed to receive fd %d from slave", __func__, i); for (j = 0; j < i; j++) close(new_fd[j]); free(chost); /* prepare reply */ buffer_put_int(r, MUX_S_FAILURE); buffer_put_int(r, rid); buffer_put_cstring(r, "did not receive file descriptors"); return -1; } } debug3("%s: got fds stdin %d, stdout %d", __func__, new_fd[0], new_fd[1]); /* XXX support multiple child sessions in future */ if (c->remote_id != -1) { debug2("%s: session already open", __func__); /* prepare reply */ buffer_put_int(r, MUX_S_FAILURE); buffer_put_int(r, rid); buffer_put_cstring(r, "Multiple sessions not supported"); cleanup: close(new_fd[0]); close(new_fd[1]); free(chost); return 0; } if (options.control_master == SSHCTL_MASTER_ASK || options.control_master == SSHCTL_MASTER_AUTO_ASK) { if (!ask_permission("Allow forward to %s:%u? ", chost, cport)) { debug2("%s: stdio fwd refused by user", __func__); /* prepare reply */ buffer_put_int(r, MUX_S_PERMISSION_DENIED); buffer_put_int(r, rid); buffer_put_cstring(r, "Permission denied"); goto cleanup; } } /* enable nonblocking unless tty */ if (!isatty(new_fd[0])) set_nonblock(new_fd[0]); if (!isatty(new_fd[1])) set_nonblock(new_fd[1]); nc = channel_connect_stdio_fwd(chost, cport, new_fd[0], new_fd[1]); nc->ctl_chan = c->self; /* link session -> control channel */ c->remote_id = nc->self; /* link control -> session channel */ debug2("%s: channel_new: %d linked to control channel %d", __func__, nc->self, nc->ctl_chan); channel_register_cleanup(nc->self, mux_master_session_cleanup_cb, 1); /* prepare reply */ /* XXX defer until channel confirmed */ buffer_put_int(r, MUX_S_SESSION_OPENED); buffer_put_int(r, rid); buffer_put_int(r, nc->self); return 0; } static int process_mux_stop_listening(u_int rid, Channel *c, Buffer *m, Buffer *r) { debug("%s: channel %d: stop listening", __func__, c->self); if (options.control_master == SSHCTL_MASTER_ASK || options.control_master == SSHCTL_MASTER_AUTO_ASK) { if (!ask_permission("Disable further multiplexing on shared " "connection to %s? ", host)) { debug2("%s: stop listen refused by user", __func__); buffer_put_int(r, MUX_S_PERMISSION_DENIED); buffer_put_int(r, rid); buffer_put_cstring(r, "Permission denied"); return 0; } } if (mux_listener_channel != NULL) { channel_free(mux_listener_channel); client_stop_mux(); free(options.control_path); options.control_path = NULL; mux_listener_channel = NULL; muxserver_sock = -1; } /* prepare reply */ buffer_put_int(r, MUX_S_OK); buffer_put_int(r, rid); return 0; } /* Channel callbacks fired on read/write from mux slave fd */ static int mux_master_read_cb(Channel *c) { struct mux_master_state *state = (struct mux_master_state *)c->mux_ctx; Buffer in, out; void *ptr; u_int type, rid, have, i; int ret = -1; /* Setup ctx and */ if (c->mux_ctx == NULL) { state = xcalloc(1, sizeof(*state)); c->mux_ctx = state; channel_register_cleanup(c->self, mux_master_control_cleanup_cb, 0); /* Send hello */ buffer_init(&out); buffer_put_int(&out, MUX_MSG_HELLO); buffer_put_int(&out, SSHMUX_VER); /* no extensions */ buffer_put_string(&c->output, buffer_ptr(&out), buffer_len(&out)); buffer_free(&out); debug3("%s: channel %d: hello sent", __func__, c->self); return 0; } buffer_init(&in); buffer_init(&out); /* Channel code ensures that we receive whole packets */ if ((ptr = buffer_get_string_ptr_ret(&c->input, &have)) == NULL) { malf: error("%s: malformed message", __func__); goto out; } buffer_append(&in, ptr, have); if (buffer_get_int_ret(&type, &in) != 0) goto malf; debug3("%s: channel %d packet type 0x%08x len %u", __func__, c->self, type, buffer_len(&in)); if (type == MUX_MSG_HELLO) rid = 0; else { if (!state->hello_rcvd) { error("%s: expected MUX_MSG_HELLO(0x%08x), " "received 0x%08x", __func__, MUX_MSG_HELLO, type); goto out; } if (buffer_get_int_ret(&rid, &in) != 0) goto malf; } for (i = 0; mux_master_handlers[i].handler != NULL; i++) { if (type == mux_master_handlers[i].type) { ret = mux_master_handlers[i].handler(rid, c, &in, &out); break; } } if (mux_master_handlers[i].handler == NULL) { error("%s: unsupported mux message 0x%08x", __func__, type); buffer_put_int(&out, MUX_S_FAILURE); buffer_put_int(&out, rid); buffer_put_cstring(&out, "unsupported request"); ret = 0; } /* Enqueue reply packet */ if (buffer_len(&out) != 0) { buffer_put_string(&c->output, buffer_ptr(&out), buffer_len(&out)); } out: buffer_free(&in); buffer_free(&out); return ret; } void mux_exit_message(Channel *c, int exitval) { Buffer m; Channel *mux_chan; debug3("%s: channel %d: exit message, exitval %d", __func__, c->self, exitval); if ((mux_chan = channel_by_id(c->ctl_chan)) == NULL) fatal("%s: channel %d missing mux channel %d", __func__, c->self, c->ctl_chan); /* Append exit message packet to control socket output queue */ buffer_init(&m); buffer_put_int(&m, MUX_S_EXIT_MESSAGE); buffer_put_int(&m, c->self); buffer_put_int(&m, exitval); buffer_put_string(&mux_chan->output, buffer_ptr(&m), buffer_len(&m)); buffer_free(&m); } void mux_tty_alloc_failed(Channel *c) { Buffer m; Channel *mux_chan; debug3("%s: channel %d: TTY alloc failed", __func__, c->self); if ((mux_chan = channel_by_id(c->ctl_chan)) == NULL) fatal("%s: channel %d missing mux channel %d", __func__, c->self, c->ctl_chan); /* Append exit message packet to control socket output queue */ buffer_init(&m); buffer_put_int(&m, MUX_S_TTY_ALLOC_FAIL); buffer_put_int(&m, c->self); buffer_put_string(&mux_chan->output, buffer_ptr(&m), buffer_len(&m)); buffer_free(&m); } /* Prepare a mux master to listen on a Unix domain socket. */ void muxserver_listen(void) { struct sockaddr_un addr; socklen_t sun_len; mode_t old_umask; char *orig_control_path = options.control_path; char rbuf[16+1]; u_int i, r; if (options.control_path == NULL || options.control_master == SSHCTL_MASTER_NO) return; debug("setting up multiplex master socket"); /* * Use a temporary path before listen so we can pseudo-atomically * establish the listening socket in its final location to avoid * other processes racing in between bind() and listen() and hitting * an unready socket. */ for (i = 0; i < sizeof(rbuf) - 1; i++) { r = arc4random_uniform(26+26+10); rbuf[i] = (r < 26) ? 'a' + r : (r < 26*2) ? 'A' + r - 26 : '0' + r - 26 - 26; } rbuf[sizeof(rbuf) - 1] = '\0'; options.control_path = NULL; xasprintf(&options.control_path, "%s.%s", orig_control_path, rbuf); debug3("%s: temporary control path %s", __func__, options.control_path); memset(&addr, '\0', sizeof(addr)); addr.sun_family = AF_UNIX; sun_len = offsetof(struct sockaddr_un, sun_path) + strlen(options.control_path) + 1; if (strlcpy(addr.sun_path, options.control_path, sizeof(addr.sun_path)) >= sizeof(addr.sun_path)) { error("ControlPath \"%s\" too long for Unix domain socket", options.control_path); goto disable_mux_master; } if ((muxserver_sock = socket(PF_UNIX, SOCK_STREAM, 0)) < 0) fatal("%s socket(): %s", __func__, strerror(errno)); old_umask = umask(0177); if (bind(muxserver_sock, (struct sockaddr *)&addr, sun_len) == -1) { if (errno == EINVAL || errno == EADDRINUSE) { error("ControlSocket %s already exists, " "disabling multiplexing", options.control_path); disable_mux_master: if (muxserver_sock != -1) { close(muxserver_sock); muxserver_sock = -1; } free(orig_control_path); free(options.control_path); options.control_path = NULL; options.control_master = SSHCTL_MASTER_NO; return; } else fatal("%s bind(): %s", __func__, strerror(errno)); } umask(old_umask); if (listen(muxserver_sock, 64) == -1) fatal("%s listen(): %s", __func__, strerror(errno)); /* Now atomically "move" the mux socket into position */ if (link(options.control_path, orig_control_path) != 0) { if (errno != EEXIST) { fatal("%s: link mux listener %s => %s: %s", __func__, options.control_path, orig_control_path, strerror(errno)); } error("ControlSocket %s already exists, disabling multiplexing", orig_control_path); unlink(options.control_path); goto disable_mux_master; } unlink(options.control_path); free(options.control_path); options.control_path = orig_control_path; set_nonblock(muxserver_sock); mux_listener_channel = channel_new("mux listener", SSH_CHANNEL_MUX_LISTENER, muxserver_sock, muxserver_sock, -1, CHAN_TCP_WINDOW_DEFAULT, CHAN_TCP_PACKET_DEFAULT, 0, options.control_path, 1); mux_listener_channel->mux_rcb = mux_master_read_cb; debug3("%s: mux listener channel %d fd %d", __func__, mux_listener_channel->self, mux_listener_channel->sock); } /* Callback on open confirmation in mux master for a mux client session. */ static void mux_session_confirm(int id, int success, void *arg) { struct mux_session_confirm_ctx *cctx = arg; const char *display; Channel *c, *cc; int i; Buffer reply; if (cctx == NULL) fatal("%s: cctx == NULL", __func__); if ((c = channel_by_id(id)) == NULL) fatal("%s: no channel for id %d", __func__, id); if ((cc = channel_by_id(c->ctl_chan)) == NULL) fatal("%s: channel %d lacks control channel %d", __func__, id, c->ctl_chan); if (!success) { debug3("%s: sending failure reply", __func__); /* prepare reply */ buffer_init(&reply); buffer_put_int(&reply, MUX_S_FAILURE); buffer_put_int(&reply, cctx->rid); buffer_put_cstring(&reply, "Session open refused by peer"); goto done; } display = getenv("DISPLAY"); if (cctx->want_x_fwd && options.forward_x11 && display != NULL) { char *proto, *data; /* Get reasonable local authentication information. */ client_x11_get_proto(display, options.xauth_location, options.forward_x11_trusted, options.forward_x11_timeout, &proto, &data); /* Request forwarding with authentication spoofing. */ debug("Requesting X11 forwarding with authentication " "spoofing."); x11_request_forwarding_with_spoofing(id, display, proto, data, 1); client_expect_confirm(id, "X11 forwarding", CONFIRM_WARN); /* XXX exit_on_forward_failure */ } if (cctx->want_agent_fwd && options.forward_agent) { debug("Requesting authentication agent forwarding."); channel_request_start(id, "auth-agent-req@openssh.com", 0); packet_send(); } client_session2_setup(id, cctx->want_tty, cctx->want_subsys, cctx->term, &cctx->tio, c->rfd, &cctx->cmd, cctx->env); debug3("%s: sending success reply", __func__); /* prepare reply */ buffer_init(&reply); buffer_put_int(&reply, MUX_S_SESSION_OPENED); buffer_put_int(&reply, cctx->rid); buffer_put_int(&reply, c->self); done: /* Send reply */ buffer_put_string(&cc->output, buffer_ptr(&reply), buffer_len(&reply)); buffer_free(&reply); if (cc->mux_pause <= 0) fatal("%s: mux_pause %d", __func__, cc->mux_pause); cc->mux_pause = 0; /* start processing messages again */ c->open_confirm_ctx = NULL; buffer_free(&cctx->cmd); free(cctx->term); if (cctx->env != NULL) { for (i = 0; cctx->env[i] != NULL; i++) free(cctx->env[i]); free(cctx->env); } free(cctx); } /* ** Multiplexing client support */ /* Exit signal handler */ static void control_client_sighandler(int signo) { muxclient_terminate = signo; } /* * Relay signal handler - used to pass some signals from mux client to * mux master. */ static void control_client_sigrelay(int signo) { int save_errno = errno; if (muxserver_pid > 1) kill(muxserver_pid, signo); errno = save_errno; } static int mux_client_read(int fd, Buffer *b, u_int need) { u_int have; ssize_t len; u_char *p; struct pollfd pfd; pfd.fd = fd; pfd.events = POLLIN; p = buffer_append_space(b, need); for (have = 0; have < need; ) { if (muxclient_terminate) { errno = EINTR; return -1; } len = read(fd, p + have, need - have); if (len < 0) { switch (errno) { #if defined(EWOULDBLOCK) && (EWOULDBLOCK != EAGAIN) case EWOULDBLOCK: #endif case EAGAIN: (void)poll(&pfd, 1, -1); /* FALLTHROUGH */ case EINTR: continue; default: return -1; } } if (len == 0) { errno = EPIPE; return -1; } have += (u_int)len; } return 0; } static int mux_client_write_packet(int fd, Buffer *m) { Buffer queue; u_int have, need; int oerrno, len; u_char *ptr; struct pollfd pfd; pfd.fd = fd; pfd.events = POLLOUT; buffer_init(&queue); buffer_put_string(&queue, buffer_ptr(m), buffer_len(m)); need = buffer_len(&queue); ptr = buffer_ptr(&queue); for (have = 0; have < need; ) { if (muxclient_terminate) { buffer_free(&queue); errno = EINTR; return -1; } len = write(fd, ptr + have, need - have); if (len < 0) { switch (errno) { #if defined(EWOULDBLOCK) && (EWOULDBLOCK != EAGAIN) case EWOULDBLOCK: #endif case EAGAIN: (void)poll(&pfd, 1, -1); /* FALLTHROUGH */ case EINTR: continue; default: oerrno = errno; buffer_free(&queue); errno = oerrno; return -1; } } if (len == 0) { buffer_free(&queue); errno = EPIPE; return -1; } have += (u_int)len; } buffer_free(&queue); return 0; } static int mux_client_read_packet(int fd, Buffer *m) { Buffer queue; u_int need, have; void *ptr; int oerrno; buffer_init(&queue); if (mux_client_read(fd, &queue, 4) != 0) { if ((oerrno = errno) == EPIPE) debug3("%s: read header failed: %s", __func__, strerror(errno)); buffer_free(&queue); errno = oerrno; return -1; } need = get_u32(buffer_ptr(&queue)); if (mux_client_read(fd, &queue, need) != 0) { oerrno = errno; debug3("%s: read body failed: %s", __func__, strerror(errno)); buffer_free(&queue); errno = oerrno; return -1; } ptr = buffer_get_string_ptr(&queue, &have); buffer_append(m, ptr, have); buffer_free(&queue); return 0; } static int mux_client_hello_exchange(int fd) { Buffer m; u_int type, ver; buffer_init(&m); buffer_put_int(&m, MUX_MSG_HELLO); buffer_put_int(&m, SSHMUX_VER); /* no extensions */ if (mux_client_write_packet(fd, &m) != 0) fatal("%s: write packet: %s", __func__, strerror(errno)); buffer_clear(&m); /* Read their HELLO */ if (mux_client_read_packet(fd, &m) != 0) { buffer_free(&m); return -1; } type = buffer_get_int(&m); if (type != MUX_MSG_HELLO) fatal("%s: expected HELLO (%u) received %u", __func__, MUX_MSG_HELLO, type); ver = buffer_get_int(&m); if (ver != SSHMUX_VER) fatal("Unsupported multiplexing protocol version %d " "(expected %d)", ver, SSHMUX_VER); debug2("%s: master version %u", __func__, ver); /* No extensions are presently defined */ while (buffer_len(&m) > 0) { char *name = buffer_get_string(&m, NULL); char *value = buffer_get_string(&m, NULL); debug2("Unrecognised master extension \"%s\"", name); free(name); free(value); } buffer_free(&m); return 0; } static u_int mux_client_request_alive(int fd) { Buffer m; char *e; u_int pid, type, rid; debug3("%s: entering", __func__); buffer_init(&m); buffer_put_int(&m, MUX_C_ALIVE_CHECK); buffer_put_int(&m, muxclient_request_id); if (mux_client_write_packet(fd, &m) != 0) fatal("%s: write packet: %s", __func__, strerror(errno)); buffer_clear(&m); /* Read their reply */ if (mux_client_read_packet(fd, &m) != 0) { buffer_free(&m); return 0; } type = buffer_get_int(&m); if (type != MUX_S_ALIVE) { e = buffer_get_string(&m, NULL); fatal("%s: master returned error: %s", __func__, e); } if ((rid = buffer_get_int(&m)) != muxclient_request_id) fatal("%s: out of sequence reply: my id %u theirs %u", __func__, muxclient_request_id, rid); pid = buffer_get_int(&m); buffer_free(&m); debug3("%s: done pid = %u", __func__, pid); muxclient_request_id++; return pid; } static void mux_client_request_terminate(int fd) { Buffer m; char *e; u_int type, rid; debug3("%s: entering", __func__); buffer_init(&m); buffer_put_int(&m, MUX_C_TERMINATE); buffer_put_int(&m, muxclient_request_id); if (mux_client_write_packet(fd, &m) != 0) fatal("%s: write packet: %s", __func__, strerror(errno)); buffer_clear(&m); /* Read their reply */ if (mux_client_read_packet(fd, &m) != 0) { /* Remote end exited already */ if (errno == EPIPE) { buffer_free(&m); return; } fatal("%s: read from master failed: %s", __func__, strerror(errno)); } type = buffer_get_int(&m); if ((rid = buffer_get_int(&m)) != muxclient_request_id) fatal("%s: out of sequence reply: my id %u theirs %u", __func__, muxclient_request_id, rid); switch (type) { case MUX_S_OK: break; case MUX_S_PERMISSION_DENIED: e = buffer_get_string(&m, NULL); fatal("Master refused termination request: %s", e); case MUX_S_FAILURE: e = buffer_get_string(&m, NULL); fatal("%s: termination request failed: %s", __func__, e); default: fatal("%s: unexpected response from master 0x%08x", __func__, type); } buffer_free(&m); muxclient_request_id++; } static int mux_client_forward(int fd, int cancel_flag, u_int ftype, Forward *fwd) { Buffer m; char *e, *fwd_desc; u_int type, rid; fwd_desc = format_forward(ftype, fwd); debug("Requesting %s %s", cancel_flag ? "cancellation of" : "forwarding of", fwd_desc); free(fwd_desc); buffer_init(&m); buffer_put_int(&m, cancel_flag ? MUX_C_CLOSE_FWD : MUX_C_OPEN_FWD); buffer_put_int(&m, muxclient_request_id); buffer_put_int(&m, ftype); buffer_put_cstring(&m, fwd->listen_host == NULL ? "" : fwd->listen_host); buffer_put_int(&m, fwd->listen_port); buffer_put_cstring(&m, fwd->connect_host == NULL ? "" : fwd->connect_host); buffer_put_int(&m, fwd->connect_port); if (mux_client_write_packet(fd, &m) != 0) fatal("%s: write packet: %s", __func__, strerror(errno)); buffer_clear(&m); /* Read their reply */ if (mux_client_read_packet(fd, &m) != 0) { buffer_free(&m); return -1; } type = buffer_get_int(&m); if ((rid = buffer_get_int(&m)) != muxclient_request_id) fatal("%s: out of sequence reply: my id %u theirs %u", __func__, muxclient_request_id, rid); switch (type) { case MUX_S_OK: break; case MUX_S_REMOTE_PORT: if (cancel_flag) fatal("%s: got MUX_S_REMOTE_PORT for cancel", __func__); fwd->allocated_port = buffer_get_int(&m); logit("Allocated port %u for remote forward to %s:%d", fwd->allocated_port, fwd->connect_host ? fwd->connect_host : "", fwd->connect_port); if (muxclient_command == SSHMUX_COMMAND_FORWARD) fprintf(stdout, "%u\n", fwd->allocated_port); break; case MUX_S_PERMISSION_DENIED: e = buffer_get_string(&m, NULL); buffer_free(&m); error("Master refused forwarding request: %s", e); return -1; case MUX_S_FAILURE: e = buffer_get_string(&m, NULL); buffer_free(&m); error("%s: forwarding request failed: %s", __func__, e); return -1; default: fatal("%s: unexpected response from master 0x%08x", __func__, type); } buffer_free(&m); muxclient_request_id++; return 0; } static int mux_client_forwards(int fd, int cancel_flag) { int i, ret = 0; debug3("%s: %s forwardings: %d local, %d remote", __func__, cancel_flag ? "cancel" : "request", options.num_local_forwards, options.num_remote_forwards); /* XXX ExitOnForwardingFailure */ for (i = 0; i < options.num_local_forwards; i++) { if (mux_client_forward(fd, cancel_flag, options.local_forwards[i].connect_port == 0 ? MUX_FWD_DYNAMIC : MUX_FWD_LOCAL, options.local_forwards + i) != 0) ret = -1; } for (i = 0; i < options.num_remote_forwards; i++) { if (mux_client_forward(fd, cancel_flag, MUX_FWD_REMOTE, options.remote_forwards + i) != 0) ret = -1; } return ret; } static int mux_client_request_session(int fd) { Buffer m; char *e, *term; u_int i, rid, sid, esid, exitval, type, exitval_seen; extern char **environ; int devnull, rawmode; debug3("%s: entering", __func__); if ((muxserver_pid = mux_client_request_alive(fd)) == 0) { error("%s: master alive request failed", __func__); return -1; } signal(SIGPIPE, SIG_IGN); if (stdin_null_flag) { if ((devnull = open(_PATH_DEVNULL, O_RDONLY)) == -1) fatal("open(/dev/null): %s", strerror(errno)); if (dup2(devnull, STDIN_FILENO) == -1) fatal("dup2: %s", strerror(errno)); if (devnull > STDERR_FILENO) close(devnull); } term = getenv("TERM"); buffer_init(&m); buffer_put_int(&m, MUX_C_NEW_SESSION); buffer_put_int(&m, muxclient_request_id); buffer_put_cstring(&m, ""); /* reserved */ buffer_put_int(&m, tty_flag); buffer_put_int(&m, options.forward_x11); buffer_put_int(&m, options.forward_agent); buffer_put_int(&m, subsystem_flag); buffer_put_int(&m, options.escape_char == SSH_ESCAPECHAR_NONE ? 0xffffffff : (u_int)options.escape_char); buffer_put_cstring(&m, term == NULL ? "" : term); buffer_put_string(&m, buffer_ptr(&command), buffer_len(&command)); if (options.num_send_env > 0 && environ != NULL) { /* Pass environment */ for (i = 0; environ[i] != NULL; i++) { if (env_permitted(environ[i])) { buffer_put_cstring(&m, environ[i]); } } } if (mux_client_write_packet(fd, &m) != 0) fatal("%s: write packet: %s", __func__, strerror(errno)); /* Send the stdio file descriptors */ if (mm_send_fd(fd, STDIN_FILENO) == -1 || mm_send_fd(fd, STDOUT_FILENO) == -1 || mm_send_fd(fd, STDERR_FILENO) == -1) fatal("%s: send fds failed", __func__); debug3("%s: session request sent", __func__); /* Read their reply */ buffer_clear(&m); if (mux_client_read_packet(fd, &m) != 0) { error("%s: read from master failed: %s", __func__, strerror(errno)); buffer_free(&m); return -1; } type = buffer_get_int(&m); if ((rid = buffer_get_int(&m)) != muxclient_request_id) fatal("%s: out of sequence reply: my id %u theirs %u", __func__, muxclient_request_id, rid); switch (type) { case MUX_S_SESSION_OPENED: sid = buffer_get_int(&m); debug("%s: master session id: %u", __func__, sid); break; case MUX_S_PERMISSION_DENIED: e = buffer_get_string(&m, NULL); buffer_free(&m); error("Master refused session request: %s", e); return -1; case MUX_S_FAILURE: e = buffer_get_string(&m, NULL); buffer_free(&m); error("%s: session request failed: %s", __func__, e); return -1; default: buffer_free(&m); error("%s: unexpected response from master 0x%08x", __func__, type); return -1; } muxclient_request_id++; signal(SIGHUP, control_client_sighandler); signal(SIGINT, control_client_sighandler); signal(SIGTERM, control_client_sighandler); signal(SIGWINCH, control_client_sigrelay); rawmode = tty_flag; if (tty_flag) enter_raw_mode(options.request_tty == REQUEST_TTY_FORCE); /* * Stick around until the controlee closes the client_fd. * Before it does, it is expected to write an exit message. * This process must read the value and wait for the closure of * the client_fd; if this one closes early, the multiplex master will * terminate early too (possibly losing data). */ for (exitval = 255, exitval_seen = 0;;) { buffer_clear(&m); if (mux_client_read_packet(fd, &m) != 0) break; type = buffer_get_int(&m); switch (type) { case MUX_S_TTY_ALLOC_FAIL: if ((esid = buffer_get_int(&m)) != sid) fatal("%s: tty alloc fail on unknown session: " "my id %u theirs %u", __func__, sid, esid); leave_raw_mode(options.request_tty == REQUEST_TTY_FORCE); rawmode = 0; continue; case MUX_S_EXIT_MESSAGE: if ((esid = buffer_get_int(&m)) != sid) fatal("%s: exit on unknown session: " "my id %u theirs %u", __func__, sid, esid); if (exitval_seen) fatal("%s: exitval sent twice", __func__); exitval = buffer_get_int(&m); exitval_seen = 1; continue; default: e = buffer_get_string(&m, NULL); fatal("%s: master returned error: %s", __func__, e); } } close(fd); if (rawmode) leave_raw_mode(options.request_tty == REQUEST_TTY_FORCE); if (muxclient_terminate) { debug2("Exiting on signal %ld", (long)muxclient_terminate); exitval = 255; } else if (!exitval_seen) { debug2("Control master terminated unexpectedly"); exitval = 255; } else debug2("Received exit status from master %d", exitval); if (tty_flag && options.log_level != SYSLOG_LEVEL_QUIET) fprintf(stderr, "Shared connection to %s closed.\r\n", host); exit(exitval); } static int mux_client_request_stdio_fwd(int fd) { Buffer m; char *e; u_int type, rid, sid; int devnull; debug3("%s: entering", __func__); if ((muxserver_pid = mux_client_request_alive(fd)) == 0) { error("%s: master alive request failed", __func__); return -1; } signal(SIGPIPE, SIG_IGN); if (stdin_null_flag) { if ((devnull = open(_PATH_DEVNULL, O_RDONLY)) == -1) fatal("open(/dev/null): %s", strerror(errno)); if (dup2(devnull, STDIN_FILENO) == -1) fatal("dup2: %s", strerror(errno)); if (devnull > STDERR_FILENO) close(devnull); } buffer_init(&m); buffer_put_int(&m, MUX_C_NEW_STDIO_FWD); buffer_put_int(&m, muxclient_request_id); buffer_put_cstring(&m, ""); /* reserved */ buffer_put_cstring(&m, stdio_forward_host); buffer_put_int(&m, stdio_forward_port); if (mux_client_write_packet(fd, &m) != 0) fatal("%s: write packet: %s", __func__, strerror(errno)); /* Send the stdio file descriptors */ if (mm_send_fd(fd, STDIN_FILENO) == -1 || mm_send_fd(fd, STDOUT_FILENO) == -1) fatal("%s: send fds failed", __func__); debug3("%s: stdio forward request sent", __func__); /* Read their reply */ buffer_clear(&m); if (mux_client_read_packet(fd, &m) != 0) { error("%s: read from master failed: %s", __func__, strerror(errno)); buffer_free(&m); return -1; } type = buffer_get_int(&m); if ((rid = buffer_get_int(&m)) != muxclient_request_id) fatal("%s: out of sequence reply: my id %u theirs %u", __func__, muxclient_request_id, rid); switch (type) { case MUX_S_SESSION_OPENED: sid = buffer_get_int(&m); debug("%s: master session id: %u", __func__, sid); break; case MUX_S_PERMISSION_DENIED: e = buffer_get_string(&m, NULL); buffer_free(&m); fatal("Master refused stdio forwarding request: %s", e); case MUX_S_FAILURE: e = buffer_get_string(&m, NULL); buffer_free(&m); fatal("%s: stdio forwarding request failed: %s", __func__, e); default: buffer_free(&m); error("%s: unexpected response from master 0x%08x", __func__, type); return -1; } muxclient_request_id++; signal(SIGHUP, control_client_sighandler); signal(SIGINT, control_client_sighandler); signal(SIGTERM, control_client_sighandler); signal(SIGWINCH, control_client_sigrelay); /* * Stick around until the controlee closes the client_fd. */ buffer_clear(&m); if (mux_client_read_packet(fd, &m) != 0) { if (errno == EPIPE || (errno == EINTR && muxclient_terminate != 0)) return 0; fatal("%s: mux_client_read_packet: %s", __func__, strerror(errno)); } fatal("%s: master returned unexpected message %u", __func__, type); } static void mux_client_request_stop_listening(int fd) { Buffer m; char *e; u_int type, rid; debug3("%s: entering", __func__); buffer_init(&m); buffer_put_int(&m, MUX_C_STOP_LISTENING); buffer_put_int(&m, muxclient_request_id); if (mux_client_write_packet(fd, &m) != 0) fatal("%s: write packet: %s", __func__, strerror(errno)); buffer_clear(&m); /* Read their reply */ if (mux_client_read_packet(fd, &m) != 0) fatal("%s: read from master failed: %s", __func__, strerror(errno)); type = buffer_get_int(&m); if ((rid = buffer_get_int(&m)) != muxclient_request_id) fatal("%s: out of sequence reply: my id %u theirs %u", __func__, muxclient_request_id, rid); switch (type) { case MUX_S_OK: break; case MUX_S_PERMISSION_DENIED: e = buffer_get_string(&m, NULL); fatal("Master refused stop listening request: %s", e); case MUX_S_FAILURE: e = buffer_get_string(&m, NULL); fatal("%s: stop listening request failed: %s", __func__, e); default: fatal("%s: unexpected response from master 0x%08x", __func__, type); } buffer_free(&m); muxclient_request_id++; } /* Multiplex client main loop. */ void muxclient(const char *path) { struct sockaddr_un addr; socklen_t sun_len; int sock; u_int pid; if (muxclient_command == 0) { if (stdio_forward_host != NULL) muxclient_command = SSHMUX_COMMAND_STDIO_FWD; else muxclient_command = SSHMUX_COMMAND_OPEN; } switch (options.control_master) { case SSHCTL_MASTER_AUTO: case SSHCTL_MASTER_AUTO_ASK: debug("auto-mux: Trying existing master"); /* FALLTHROUGH */ case SSHCTL_MASTER_NO: break; default: return; } memset(&addr, '\0', sizeof(addr)); addr.sun_family = AF_UNIX; sun_len = offsetof(struct sockaddr_un, sun_path) + strlen(path) + 1; if (strlcpy(addr.sun_path, path, sizeof(addr.sun_path)) >= sizeof(addr.sun_path)) fatal("ControlPath too long"); if ((sock = socket(PF_UNIX, SOCK_STREAM, 0)) < 0) fatal("%s socket(): %s", __func__, strerror(errno)); if (connect(sock, (struct sockaddr *)&addr, sun_len) == -1) { switch (muxclient_command) { case SSHMUX_COMMAND_OPEN: case SSHMUX_COMMAND_STDIO_FWD: break; default: fatal("Control socket connect(%.100s): %s", path, strerror(errno)); } if (errno == ECONNREFUSED && options.control_master != SSHCTL_MASTER_NO) { debug("Stale control socket %.100s, unlinking", path); unlink(path); } else if (errno == ENOENT) { debug("Control socket \"%.100s\" does not exist", path); } else { error("Control socket connect(%.100s): %s", path, strerror(errno)); } close(sock); return; } set_nonblock(sock); if (mux_client_hello_exchange(sock) != 0) { error("%s: master hello exchange failed", __func__); close(sock); return; } switch (muxclient_command) { case SSHMUX_COMMAND_ALIVE_CHECK: if ((pid = mux_client_request_alive(sock)) == 0) fatal("%s: master alive check failed", __func__); fprintf(stderr, "Master running (pid=%d)\r\n", pid); exit(0); case SSHMUX_COMMAND_TERMINATE: mux_client_request_terminate(sock); fprintf(stderr, "Exit request sent.\r\n"); exit(0); case SSHMUX_COMMAND_FORWARD: if (mux_client_forwards(sock, 0) != 0) fatal("%s: master forward request failed", __func__); exit(0); case SSHMUX_COMMAND_OPEN: if (mux_client_forwards(sock, 0) != 0) { error("%s: master forward request failed", __func__); return; } mux_client_request_session(sock); return; case SSHMUX_COMMAND_STDIO_FWD: mux_client_request_stdio_fwd(sock); exit(0); case SSHMUX_COMMAND_STOP: mux_client_request_stop_listening(sock); fprintf(stderr, "Stop listening request sent.\r\n"); exit(0); case SSHMUX_COMMAND_CANCEL_FWD: if (mux_client_forwards(sock, 1) != 0) error("%s: master cancel forward request failed", __func__); exit(0); default: fatal("unrecognised muxclient_command %d", muxclient_command); } } Index: releng/9.3/sys/amd64/amd64/exception.S =================================================================== --- releng/9.3/sys/amd64/amd64/exception.S (revision 287146) +++ releng/9.3/sys/amd64/amd64/exception.S (revision 287147) @@ -1,923 +1,928 @@ /*- * Copyright (c) 1989, 1990 William F. Jolitz. * Copyright (c) 1990 The Regents of the University of California. * Copyright (c) 2007 The FreeBSD Foundation * All rights reserved. * * Portions of this software were developed by A. Joseph Koshy under * sponsorship from the FreeBSD Foundation and Google, Inc. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 4. Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * $FreeBSD$ */ #include "opt_atpic.h" #include "opt_compat.h" #include "opt_hwpmc_hooks.h" #include "opt_kdtrace.h" #include #include #include #include #include "assym.s" #ifdef KDTRACE_HOOKS .bss .globl dtrace_invop_jump_addr .align 8 .type dtrace_invop_jump_addr,@object .size dtrace_invop_jump_addr,8 dtrace_invop_jump_addr: .zero 8 .globl dtrace_invop_calltrap_addr .align 8 .type dtrace_invop_calltrap_addr,@object .size dtrace_invop_calltrap_addr,8 dtrace_invop_calltrap_addr: .zero 8 #endif .text #ifdef HWPMC_HOOKS ENTRY(start_exceptions) #endif /*****************************************************************************/ /* Trap handling */ /*****************************************************************************/ /* * Trap and fault vector routines. * * All traps are 'interrupt gates', SDT_SYSIGT. An interrupt gate pushes * state on the stack but also disables interrupts. This is important for * us for the use of the swapgs instruction. We cannot be interrupted * until the GS.base value is correct. For most traps, we automatically * then enable interrupts if the interrupted context had them enabled. * This is equivalent to the i386 port's use of SDT_SYS386TGT. * * The cpu will push a certain amount of state onto the kernel stack for * the current process. See amd64/include/frame.h. * This includes the current RFLAGS (status register, which includes * the interrupt disable state prior to the trap), the code segment register, * and the return instruction pointer are pushed by the cpu. The cpu * will also push an 'error' code for certain traps. We push a dummy * error code for those traps where the cpu doesn't in order to maintain * a consistent frame. We also push a contrived 'trap number'. * * The CPU does not push the general registers, so we must do that, and we * must restore them prior to calling 'iret'. The CPU adjusts %cs and %ss * but does not mess with %ds, %es, %gs or %fs. We swap the %gs base for * for the kernel mode operation shortly, without changes to the selector * loaded. Since superuser long mode works with any selectors loaded into * segment registers other then %cs, which makes them mostly unused in long * mode, and kernel does not reference %fs, leave them alone. The segment * registers are reloaded on return to the usermode. */ MCOUNT_LABEL(user) MCOUNT_LABEL(btrap) /* Traps that we leave interrupts disabled for.. */ #define TRAP_NOEN(a) \ subq $TF_RIP,%rsp; \ movl $(a),TF_TRAPNO(%rsp) ; \ movq $0,TF_ADDR(%rsp) ; \ movq $0,TF_ERR(%rsp) ; \ jmp alltraps_noen IDTVEC(dbg) TRAP_NOEN(T_TRCTRAP) IDTVEC(bpt) TRAP_NOEN(T_BPTFLT) #ifdef KDTRACE_HOOKS IDTVEC(dtrace_ret) TRAP_NOEN(T_DTRACE_RET) #endif /* Regular traps; The cpu does not supply tf_err for these. */ #define TRAP(a) \ subq $TF_RIP,%rsp; \ movl $(a),TF_TRAPNO(%rsp) ; \ movq $0,TF_ADDR(%rsp) ; \ movq $0,TF_ERR(%rsp) ; \ jmp alltraps IDTVEC(div) TRAP(T_DIVIDE) IDTVEC(ofl) TRAP(T_OFLOW) IDTVEC(bnd) TRAP(T_BOUND) IDTVEC(ill) TRAP(T_PRIVINFLT) IDTVEC(dna) TRAP(T_DNA) IDTVEC(fpusegm) TRAP(T_FPOPFLT) IDTVEC(mchk) TRAP(T_MCHK) IDTVEC(rsvd) TRAP(T_RESERVED) IDTVEC(fpu) TRAP(T_ARITHTRAP) IDTVEC(xmm) TRAP(T_XMMFLT) /* This group of traps have tf_err already pushed by the cpu */ #define TRAP_ERR(a) \ subq $TF_ERR,%rsp; \ movl $(a),TF_TRAPNO(%rsp) ; \ movq $0,TF_ADDR(%rsp) ; \ jmp alltraps IDTVEC(tss) TRAP_ERR(T_TSSFLT) IDTVEC(missing) - TRAP_ERR(T_SEGNPFLT) + subq $TF_ERR,%rsp + movl $T_SEGNPFLT,TF_TRAPNO(%rsp) + jmp prot_addrf IDTVEC(stk) - TRAP_ERR(T_STKFLT) + subq $TF_ERR,%rsp + movl $T_STKFLT,TF_TRAPNO(%rsp) + jmp prot_addrf IDTVEC(align) TRAP_ERR(T_ALIGNFLT) /* * alltraps entry point. Use swapgs if this is the first time in the * kernel from userland. Reenable interrupts if they were enabled * before the trap. This approximates SDT_SYS386TGT on the i386 port. */ SUPERALIGN_TEXT .globl alltraps .type alltraps,@function alltraps: movq %rdi,TF_RDI(%rsp) testb $SEL_RPL_MASK,TF_CS(%rsp) /* Did we come from kernel? */ jz alltraps_testi /* already running with kernel GS.base */ swapgs movq PCPU(CURPCB),%rdi andl $~PCB_FULL_IRET,PCB_FLAGS(%rdi) movw %fs,TF_FS(%rsp) movw %gs,TF_GS(%rsp) movw %es,TF_ES(%rsp) movw %ds,TF_DS(%rsp) alltraps_testi: testl $PSL_I,TF_RFLAGS(%rsp) jz alltraps_pushregs_no_rdi sti alltraps_pushregs_no_rdi: movq %rsi,TF_RSI(%rsp) movq %rdx,TF_RDX(%rsp) movq %rcx,TF_RCX(%rsp) movq %r8,TF_R8(%rsp) movq %r9,TF_R9(%rsp) movq %rax,TF_RAX(%rsp) movq %rbx,TF_RBX(%rsp) movq %rbp,TF_RBP(%rsp) movq %r10,TF_R10(%rsp) movq %r11,TF_R11(%rsp) movq %r12,TF_R12(%rsp) movq %r13,TF_R13(%rsp) movq %r14,TF_R14(%rsp) movq %r15,TF_R15(%rsp) movl $TF_HASSEGS,TF_FLAGS(%rsp) cld FAKE_MCOUNT(TF_RIP(%rsp)) #ifdef KDTRACE_HOOKS /* * DTrace Function Boundary Trace (fbt) probes are triggered * by int3 (0xcc) which causes the #BP (T_BPTFLT) breakpoint * interrupt. For all other trap types, just handle them in * the usual way. */ cmpl $T_BPTFLT,TF_TRAPNO(%rsp) jne calltrap /* Check if there is no DTrace hook registered. */ cmpq $0,dtrace_invop_jump_addr je calltrap /* * Set our jump address for the jump back in the event that * the breakpoint wasn't caused by DTrace at all. */ movq $calltrap,dtrace_invop_calltrap_addr(%rip) /* Jump to the code hooked in by DTrace. */ movq dtrace_invop_jump_addr,%rax jmpq *dtrace_invop_jump_addr #endif .globl calltrap .type calltrap,@function calltrap: movq %rsp,%rdi call trap MEXITCOUNT jmp doreti /* Handle any pending ASTs */ /* * alltraps_noen entry point. Unlike alltraps above, we want to * leave the interrupts disabled. This corresponds to * SDT_SYS386IGT on the i386 port. */ SUPERALIGN_TEXT .globl alltraps_noen .type alltraps_noen,@function alltraps_noen: movq %rdi,TF_RDI(%rsp) testb $SEL_RPL_MASK,TF_CS(%rsp) /* Did we come from kernel? */ jz 1f /* already running with kernel GS.base */ swapgs movq PCPU(CURPCB),%rdi andl $~PCB_FULL_IRET,PCB_FLAGS(%rdi) 1: movw %fs,TF_FS(%rsp) movw %gs,TF_GS(%rsp) movw %es,TF_ES(%rsp) movw %ds,TF_DS(%rsp) jmp alltraps_pushregs_no_rdi IDTVEC(dblfault) subq $TF_ERR,%rsp movl $T_DOUBLEFLT,TF_TRAPNO(%rsp) movq $0,TF_ADDR(%rsp) movq $0,TF_ERR(%rsp) movq %rdi,TF_RDI(%rsp) movq %rsi,TF_RSI(%rsp) movq %rdx,TF_RDX(%rsp) movq %rcx,TF_RCX(%rsp) movq %r8,TF_R8(%rsp) movq %r9,TF_R9(%rsp) movq %rax,TF_RAX(%rsp) movq %rbx,TF_RBX(%rsp) movq %rbp,TF_RBP(%rsp) movq %r10,TF_R10(%rsp) movq %r11,TF_R11(%rsp) movq %r12,TF_R12(%rsp) movq %r13,TF_R13(%rsp) movq %r14,TF_R14(%rsp) movq %r15,TF_R15(%rsp) movw %fs,TF_FS(%rsp) movw %gs,TF_GS(%rsp) movw %es,TF_ES(%rsp) movw %ds,TF_DS(%rsp) movl $TF_HASSEGS,TF_FLAGS(%rsp) cld testb $SEL_RPL_MASK,TF_CS(%rsp) /* Did we come from kernel? */ jz 1f /* already running with kernel GS.base */ swapgs 1: movq %rsp,%rdi call dblfault_handler 2: hlt jmp 2b IDTVEC(page) subq $TF_ERR,%rsp movl $T_PAGEFLT,TF_TRAPNO(%rsp) movq %rdi,TF_RDI(%rsp) /* free up a GP register */ testb $SEL_RPL_MASK,TF_CS(%rsp) /* Did we come from kernel? */ jz 1f /* already running with kernel GS.base */ swapgs movq PCPU(CURPCB),%rdi andl $~PCB_FULL_IRET,PCB_FLAGS(%rdi) 1: movq %cr2,%rdi /* preserve %cr2 before .. */ movq %rdi,TF_ADDR(%rsp) /* enabling interrupts. */ movw %fs,TF_FS(%rsp) movw %gs,TF_GS(%rsp) movw %es,TF_ES(%rsp) movw %ds,TF_DS(%rsp) testl $PSL_I,TF_RFLAGS(%rsp) jz alltraps_pushregs_no_rdi sti jmp alltraps_pushregs_no_rdi /* * We have to special-case this one. If we get a trap in doreti() at * the iretq stage, we'll reenter with the wrong gs state. We'll have * to do a special the swapgs in this case even coming from the kernel. * XXX linux has a trap handler for their equivalent of load_gs(). */ IDTVEC(prot) subq $TF_ERR,%rsp movl $T_PROTFLT,TF_TRAPNO(%rsp) +prot_addrf: movq $0,TF_ADDR(%rsp) movq %rdi,TF_RDI(%rsp) /* free up a GP register */ leaq doreti_iret(%rip),%rdi cmpq %rdi,TF_RIP(%rsp) je 1f /* kernel but with user gsbase!! */ testb $SEL_RPL_MASK,TF_CS(%rsp) /* Did we come from kernel? */ jz 2f /* already running with kernel GS.base */ 1: swapgs 2: movq PCPU(CURPCB),%rdi orl $PCB_FULL_IRET,PCB_FLAGS(%rdi) /* always full iret from GPF */ movw %fs,TF_FS(%rsp) movw %gs,TF_GS(%rsp) movw %es,TF_ES(%rsp) movw %ds,TF_DS(%rsp) testl $PSL_I,TF_RFLAGS(%rsp) jz alltraps_pushregs_no_rdi sti jmp alltraps_pushregs_no_rdi /* * Fast syscall entry point. We enter here with just our new %cs/%ss set, * and the new privilige level. We are still running on the old user stack * pointer. We have to juggle a few things around to find our stack etc. * swapgs gives us access to our PCPU space only. * * We do not support invoking this from a custom %cs or %ss (e.g. using * entries from an LDT). */ IDTVEC(fast_syscall) swapgs movq %rsp,PCPU(SCRATCH_RSP) movq PCPU(RSP0),%rsp /* Now emulate a trapframe. Make the 8 byte alignment odd for call. */ subq $TF_SIZE,%rsp /* defer TF_RSP till we have a spare register */ movq %r11,TF_RFLAGS(%rsp) movq %rcx,TF_RIP(%rsp) /* %rcx original value is in %r10 */ movq PCPU(SCRATCH_RSP),%r11 /* %r11 already saved */ movq %r11,TF_RSP(%rsp) /* user stack pointer */ movw %fs,TF_FS(%rsp) movw %gs,TF_GS(%rsp) movw %es,TF_ES(%rsp) movw %ds,TF_DS(%rsp) movq PCPU(CURPCB),%r11 andl $~PCB_FULL_IRET,PCB_FLAGS(%r11) sti movq $KUDSEL,TF_SS(%rsp) movq $KUCSEL,TF_CS(%rsp) movq $2,TF_ERR(%rsp) movq %rdi,TF_RDI(%rsp) /* arg 1 */ movq %rsi,TF_RSI(%rsp) /* arg 2 */ movq %rdx,TF_RDX(%rsp) /* arg 3 */ movq %r10,TF_RCX(%rsp) /* arg 4 */ movq %r8,TF_R8(%rsp) /* arg 5 */ movq %r9,TF_R9(%rsp) /* arg 6 */ movq %rax,TF_RAX(%rsp) /* syscall number */ movq %rbx,TF_RBX(%rsp) /* C preserved */ movq %rbp,TF_RBP(%rsp) /* C preserved */ movq %r12,TF_R12(%rsp) /* C preserved */ movq %r13,TF_R13(%rsp) /* C preserved */ movq %r14,TF_R14(%rsp) /* C preserved */ movq %r15,TF_R15(%rsp) /* C preserved */ movl $TF_HASSEGS,TF_FLAGS(%rsp) cld FAKE_MCOUNT(TF_RIP(%rsp)) movq PCPU(CURTHREAD),%rdi movq %rsp,TD_FRAME(%rdi) movl TF_RFLAGS(%rsp),%esi andl $PSL_T,%esi call amd64_syscall 1: movq PCPU(CURPCB),%rax /* Disable interrupts before testing PCB_FULL_IRET. */ cli testl $PCB_FULL_IRET,PCB_FLAGS(%rax) jnz 3f /* Check for and handle AST's on return to userland. */ movq PCPU(CURTHREAD),%rax testl $TDF_ASTPENDING | TDF_NEEDRESCHED,TD_FLAGS(%rax) jne 2f /* Restore preserved registers. */ MEXITCOUNT movq TF_RDI(%rsp),%rdi /* bonus; preserve arg 1 */ movq TF_RSI(%rsp),%rsi /* bonus: preserve arg 2 */ movq TF_RDX(%rsp),%rdx /* return value 2 */ movq TF_RAX(%rsp),%rax /* return value 1 */ movq TF_RFLAGS(%rsp),%r11 /* original %rflags */ movq TF_RIP(%rsp),%rcx /* original %rip */ movq TF_RSP(%rsp),%rsp /* user stack pointer */ swapgs sysretq 2: /* AST scheduled. */ sti movq %rsp,%rdi call ast jmp 1b 3: /* Requested full context restore, use doreti for that. */ MEXITCOUNT jmp doreti /* * Here for CYA insurance, in case a "syscall" instruction gets * issued from 32 bit compatability mode. MSR_CSTAR has to point * to *something* if EFER_SCE is enabled. */ IDTVEC(fast_syscall32) sysret /* * NMI handling is special. * * First, NMIs do not respect the state of the processor's RFLAGS.IF * bit. The NMI handler may be entered at any time, including when * the processor is in a critical section with RFLAGS.IF == 0. * The processor's GS.base value could be invalid on entry to the * handler. * * Second, the processor treats NMIs specially, blocking further NMIs * until an 'iretq' instruction is executed. We thus need to execute * the NMI handler with interrupts disabled, to prevent a nested interrupt * from executing an 'iretq' instruction and inadvertently taking the * processor out of NMI mode. * * Third, the NMI handler runs on its own stack (tss_ist2). The canonical * GS.base value for the processor is stored just above the bottom of its * NMI stack. For NMIs taken from kernel mode, the current value in * the processor's GS.base is saved at entry to C-preserved register %r12, * the canonical value for GS.base is then loaded into the processor, and * the saved value is restored at exit time. For NMIs taken from user mode, * the cheaper 'SWAPGS' instructions are used for swapping GS.base. */ IDTVEC(nmi) subq $TF_RIP,%rsp movl $(T_NMI),TF_TRAPNO(%rsp) movq $0,TF_ADDR(%rsp) movq $0,TF_ERR(%rsp) movq %rdi,TF_RDI(%rsp) movq %rsi,TF_RSI(%rsp) movq %rdx,TF_RDX(%rsp) movq %rcx,TF_RCX(%rsp) movq %r8,TF_R8(%rsp) movq %r9,TF_R9(%rsp) movq %rax,TF_RAX(%rsp) movq %rbx,TF_RBX(%rsp) movq %rbp,TF_RBP(%rsp) movq %r10,TF_R10(%rsp) movq %r11,TF_R11(%rsp) movq %r12,TF_R12(%rsp) movq %r13,TF_R13(%rsp) movq %r14,TF_R14(%rsp) movq %r15,TF_R15(%rsp) movw %fs,TF_FS(%rsp) movw %gs,TF_GS(%rsp) movw %es,TF_ES(%rsp) movw %ds,TF_DS(%rsp) movl $TF_HASSEGS,TF_FLAGS(%rsp) cld xorl %ebx,%ebx testb $SEL_RPL_MASK,TF_CS(%rsp) jnz nmi_fromuserspace /* * We've interrupted the kernel. Preserve GS.base in %r12. */ movl $MSR_GSBASE,%ecx rdmsr movq %rax,%r12 shlq $32,%rdx orq %rdx,%r12 /* Retrieve and load the canonical value for GS.base. */ movq TF_SIZE(%rsp),%rdx movl %edx,%eax shrq $32,%rdx wrmsr jmp nmi_calltrap nmi_fromuserspace: incl %ebx swapgs /* Note: this label is also used by ddb and gdb: */ nmi_calltrap: FAKE_MCOUNT(TF_RIP(%rsp)) movq %rsp,%rdi call trap MEXITCOUNT #ifdef HWPMC_HOOKS /* * Capture a userspace callchain if needed. * * - Check if the current trap was from user mode. * - Check if the current thread is valid. * - Check if the thread requires a user call chain to be * captured. * * We are still in NMI mode at this point. */ testl %ebx,%ebx jz nocallchain /* not from userspace */ movq PCPU(CURTHREAD),%rax orq %rax,%rax /* curthread present? */ jz nocallchain testl $TDP_CALLCHAIN,TD_PFLAGS(%rax) /* flagged for capture? */ jz nocallchain /* * A user callchain is to be captured, so: * - Move execution to the regular kernel stack, to allow for * nested NMI interrupts. * - Take the processor out of "NMI" mode by faking an "iret". * - Enable interrupts, so that copyin() can work. */ movq %rsp,%rsi /* source stack pointer */ movq $TF_SIZE,%rcx movq PCPU(RSP0),%rdx subq %rcx,%rdx movq %rdx,%rdi /* destination stack pointer */ shrq $3,%rcx /* trap frame size in long words */ cld rep movsq /* copy trapframe */ movl %ss,%eax pushq %rax /* tf_ss */ pushq %rdx /* tf_rsp (on kernel stack) */ pushfq /* tf_rflags */ movl %cs,%eax pushq %rax /* tf_cs */ pushq $outofnmi /* tf_rip */ iretq outofnmi: /* * At this point the processor has exited NMI mode and is running * with interrupts turned off on the normal kernel stack. * * If a pending NMI gets recognized at or after this point, it * will cause a kernel callchain to be traced. * * We turn interrupts back on, and call the user callchain capture hook. */ movq pmc_hook,%rax orq %rax,%rax jz nocallchain movq PCPU(CURTHREAD),%rdi /* thread */ movq $PMC_FN_USER_CALLCHAIN,%rsi /* command */ movq %rsp,%rdx /* frame */ sti call *%rax cli nocallchain: #endif testl %ebx,%ebx jnz doreti_exit nmi_kernelexit: /* * Put back the preserved MSR_GSBASE value. */ movl $MSR_GSBASE,%ecx movq %r12,%rdx movl %edx,%eax shrq $32,%rdx wrmsr nmi_restoreregs: movq TF_RDI(%rsp),%rdi movq TF_RSI(%rsp),%rsi movq TF_RDX(%rsp),%rdx movq TF_RCX(%rsp),%rcx movq TF_R8(%rsp),%r8 movq TF_R9(%rsp),%r9 movq TF_RAX(%rsp),%rax movq TF_RBX(%rsp),%rbx movq TF_RBP(%rsp),%rbp movq TF_R10(%rsp),%r10 movq TF_R11(%rsp),%r11 movq TF_R12(%rsp),%r12 movq TF_R13(%rsp),%r13 movq TF_R14(%rsp),%r14 movq TF_R15(%rsp),%r15 addq $TF_RIP,%rsp jmp doreti_iret ENTRY(fork_trampoline) movq %r12,%rdi /* function */ movq %rbx,%rsi /* arg1 */ movq %rsp,%rdx /* trapframe pointer */ call fork_exit MEXITCOUNT jmp doreti /* Handle any ASTs */ /* * To efficiently implement classification of trap and interrupt handlers * for profiling, there must be only trap handlers between the labels btrap * and bintr, and only interrupt handlers between the labels bintr and * eintr. This is implemented (partly) by including files that contain * some of the handlers. Before including the files, set up a normal asm * environment so that the included files doen't need to know that they are * included. */ #ifdef COMPAT_FREEBSD32 .data .p2align 4 .text SUPERALIGN_TEXT #include #endif .data .p2align 4 .text SUPERALIGN_TEXT MCOUNT_LABEL(bintr) #include #ifdef DEV_ATPIC .data .p2align 4 .text SUPERALIGN_TEXT #include #endif .text MCOUNT_LABEL(eintr) /* * void doreti(struct trapframe) * * Handle return from interrupts, traps and syscalls. */ .text SUPERALIGN_TEXT .type doreti,@function doreti: FAKE_MCOUNT($bintr) /* init "from" bintr -> doreti */ /* * Check if ASTs can be handled now. */ testb $SEL_RPL_MASK,TF_CS(%rsp) /* are we returning to user mode? */ jz doreti_exit /* can't handle ASTs now if not */ doreti_ast: /* * Check for ASTs atomically with returning. Disabling CPU * interrupts provides sufficient locking even in the SMP case, * since we will be informed of any new ASTs by an IPI. */ cli movq PCPU(CURTHREAD),%rax testl $TDF_ASTPENDING | TDF_NEEDRESCHED,TD_FLAGS(%rax) je doreti_exit sti movq %rsp,%rdi /* pass a pointer to the trapframe */ call ast jmp doreti_ast /* * doreti_exit: pop registers, iret. * * The segment register pop is a special case, since it may * fault if (for example) a sigreturn specifies bad segment * registers. The fault is handled in trap.c. */ doreti_exit: MEXITCOUNT movq PCPU(CURPCB),%r8 /* * Do not reload segment registers for kernel. * Since we do not reload segments registers with sane * values on kernel entry, descriptors referenced by * segments registers might be not valid. This is fatal * for user mode, but is not a problem for the kernel. */ testb $SEL_RPL_MASK,TF_CS(%rsp) jz ld_regs testl $PCB_FULL_IRET,PCB_FLAGS(%r8) jz ld_regs testl $TF_HASSEGS,TF_FLAGS(%rsp) je set_segs do_segs: /* Restore %fs and fsbase */ movw TF_FS(%rsp),%ax .globl ld_fs ld_fs: movw %ax,%fs cmpw $KUF32SEL,%ax jne 1f movl $MSR_FSBASE,%ecx movl PCB_FSBASE(%r8),%eax movl PCB_FSBASE+4(%r8),%edx .globl ld_fsbase ld_fsbase: wrmsr 1: /* Restore %gs and gsbase */ movw TF_GS(%rsp),%si pushfq cli movl $MSR_GSBASE,%ecx /* Save current kernel %gs base into %r12d:%r13d */ rdmsr movl %eax,%r12d movl %edx,%r13d .globl ld_gs ld_gs: movw %si,%gs /* Save user %gs base into %r14d:%r15d */ rdmsr movl %eax,%r14d movl %edx,%r15d /* Restore kernel %gs base */ movl %r12d,%eax movl %r13d,%edx wrmsr popfq /* * Restore user %gs base, either from PCB if used for TLS, or * from the previously saved msr read. */ movl $MSR_KGSBASE,%ecx cmpw $KUG32SEL,%si jne 1f movl PCB_GSBASE(%r8),%eax movl PCB_GSBASE+4(%r8),%edx jmp ld_gsbase 1: movl %r14d,%eax movl %r15d,%edx .globl ld_gsbase ld_gsbase: wrmsr /* May trap if non-canonical, but only for TLS. */ .globl ld_es ld_es: movw TF_ES(%rsp),%es .globl ld_ds ld_ds: movw TF_DS(%rsp),%ds ld_regs: movq TF_RDI(%rsp),%rdi movq TF_RSI(%rsp),%rsi movq TF_RDX(%rsp),%rdx movq TF_RCX(%rsp),%rcx movq TF_R8(%rsp),%r8 movq TF_R9(%rsp),%r9 movq TF_RAX(%rsp),%rax movq TF_RBX(%rsp),%rbx movq TF_RBP(%rsp),%rbp movq TF_R10(%rsp),%r10 movq TF_R11(%rsp),%r11 movq TF_R12(%rsp),%r12 movq TF_R13(%rsp),%r13 movq TF_R14(%rsp),%r14 movq TF_R15(%rsp),%r15 testb $SEL_RPL_MASK,TF_CS(%rsp) /* Did we come from kernel? */ jz 1f /* keep running with kernel GS.base */ cli swapgs 1: addq $TF_RIP,%rsp /* skip over tf_err, tf_trapno */ .globl doreti_iret doreti_iret: iretq set_segs: movw $KUDSEL,%ax movw %ax,TF_DS(%rsp) movw %ax,TF_ES(%rsp) movw $KUF32SEL,TF_FS(%rsp) movw $KUG32SEL,TF_GS(%rsp) jmp do_segs /* * doreti_iret_fault. Alternative return code for * the case where we get a fault in the doreti_exit code * above. trap() (amd64/amd64/trap.c) catches this specific * case, sends the process a signal and continues in the * corresponding place in the code below. */ ALIGN_TEXT .globl doreti_iret_fault doreti_iret_fault: subq $TF_RIP,%rsp /* space including tf_err, tf_trapno */ testl $PSL_I,TF_RFLAGS(%rsp) jz 1f sti 1: movw %fs,TF_FS(%rsp) movw %gs,TF_GS(%rsp) movw %es,TF_ES(%rsp) movw %ds,TF_DS(%rsp) movl $TF_HASSEGS,TF_FLAGS(%rsp) movq %rdi,TF_RDI(%rsp) movq %rsi,TF_RSI(%rsp) movq %rdx,TF_RDX(%rsp) movq %rcx,TF_RCX(%rsp) movq %r8,TF_R8(%rsp) movq %r9,TF_R9(%rsp) movq %rax,TF_RAX(%rsp) movq %rbx,TF_RBX(%rsp) movq %rbp,TF_RBP(%rsp) movq %r10,TF_R10(%rsp) movq %r11,TF_R11(%rsp) movq %r12,TF_R12(%rsp) movq %r13,TF_R13(%rsp) movq %r14,TF_R14(%rsp) movq %r15,TF_R15(%rsp) movl $T_PROTFLT,TF_TRAPNO(%rsp) movq $0,TF_ERR(%rsp) /* XXX should be the error code */ movq $0,TF_ADDR(%rsp) FAKE_MCOUNT(TF_RIP(%rsp)) jmp calltrap ALIGN_TEXT .globl ds_load_fault ds_load_fault: movl $T_PROTFLT,TF_TRAPNO(%rsp) testl $PSL_I,TF_RFLAGS(%rsp) jz 1f sti 1: movq %rsp,%rdi call trap movw $KUDSEL,TF_DS(%rsp) jmp doreti ALIGN_TEXT .globl es_load_fault es_load_fault: movl $T_PROTFLT,TF_TRAPNO(%rsp) testl $PSL_I,TF_RFLAGS(%rsp) jz 1f sti 1: movq %rsp,%rdi call trap movw $KUDSEL,TF_ES(%rsp) jmp doreti ALIGN_TEXT .globl fs_load_fault fs_load_fault: testl $PSL_I,TF_RFLAGS(%rsp) jz 1f sti 1: movl $T_PROTFLT,TF_TRAPNO(%rsp) movq %rsp,%rdi call trap movw $KUF32SEL,TF_FS(%rsp) jmp doreti ALIGN_TEXT .globl gs_load_fault gs_load_fault: popfq movl $T_PROTFLT,TF_TRAPNO(%rsp) testl $PSL_I,TF_RFLAGS(%rsp) jz 1f sti 1: movq %rsp,%rdi call trap movw $KUG32SEL,TF_GS(%rsp) jmp doreti ALIGN_TEXT .globl fsbase_load_fault fsbase_load_fault: movl $T_PROTFLT,TF_TRAPNO(%rsp) testl $PSL_I,TF_RFLAGS(%rsp) jz 1f sti 1: movq %rsp,%rdi call trap movq PCPU(CURTHREAD),%r8 movq TD_PCB(%r8),%r8 movq $0,PCB_FSBASE(%r8) jmp doreti ALIGN_TEXT .globl gsbase_load_fault gsbase_load_fault: movl $T_PROTFLT,TF_TRAPNO(%rsp) testl $PSL_I,TF_RFLAGS(%rsp) jz 1f sti 1: movq %rsp,%rdi call trap movq PCPU(CURTHREAD),%r8 movq TD_PCB(%r8),%r8 movq $0,PCB_GSBASE(%r8) jmp doreti #ifdef HWPMC_HOOKS ENTRY(end_exceptions) #endif Index: releng/9.3/sys/amd64/amd64/machdep.c =================================================================== --- releng/9.3/sys/amd64/amd64/machdep.c (revision 287146) +++ releng/9.3/sys/amd64/amd64/machdep.c (revision 287147) @@ -1,2550 +1,2551 @@ /*- * Copyright (c) 2003 Peter Wemm. * Copyright (c) 1992 Terrence R. Lambert. * Copyright (c) 1982, 1987, 1990 The Regents of the University of California. * All rights reserved. * * This code is derived from software contributed to Berkeley by * William Jolitz. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. All advertising materials mentioning features or use of this software * must display the following acknowledgement: * This product includes software developed by the University of * California, Berkeley and its contributors. * 4. Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * from: @(#)machdep.c 7.4 (Berkeley) 6/3/91 */ #include __FBSDID("$FreeBSD$"); #include "opt_atalk.h" #include "opt_atpic.h" #include "opt_compat.h" #include "opt_cpu.h" #include "opt_ddb.h" #include "opt_inet.h" #include "opt_ipx.h" #include "opt_isa.h" #include "opt_kstack_pages.h" #include "opt_maxmem.h" #include "opt_mp_watchdog.h" #include "opt_perfmon.h" #include "opt_sched.h" #include "opt_kdtrace.h" #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #ifdef SMP #include #endif #include #include #include #include #include #include #include #include #include #include #include #include #include #include #ifdef DDB #ifndef KDB #error KDB must be enabled in order for DDB to work! #endif #include #include #endif #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #ifdef PERFMON #include #endif #include #ifdef SMP #include #endif #ifdef DEV_ATPIC #include #else #include #endif #include #include /* Sanity check for __curthread() */ CTASSERT(offsetof(struct pcpu, pc_curthread) == 0); extern u_int64_t hammer_time(u_int64_t, u_int64_t); extern void printcpuinfo(void); /* XXX header file */ extern void identify_cpu(void); extern void panicifcpuunsupported(void); #define CS_SECURE(cs) (ISPL(cs) == SEL_UPL) #define EFL_SECURE(ef, oef) ((((ef) ^ (oef)) & ~PSL_USERCHANGE) == 0) static void cpu_startup(void *); static void get_fpcontext(struct thread *td, mcontext_t *mcp, char *xfpusave, size_t xfpusave_len); static int set_fpcontext(struct thread *td, const mcontext_t *mcp, char *xfpustate, size_t xfpustate_len); SYSINIT(cpu, SI_SUB_CPU, SI_ORDER_FIRST, cpu_startup, NULL); /* * The file "conf/ldscript.amd64" defines the symbol "kernphys". Its value is * the physical address at which the kernel is loaded. */ extern char kernphys[]; #ifdef DDB extern vm_offset_t ksym_start, ksym_end; #endif struct msgbuf *msgbufp; /* Intel ICH registers */ #define ICH_PMBASE 0x400 #define ICH_SMI_EN ICH_PMBASE + 0x30 int _udatasel, _ucodesel, _ucode32sel, _ufssel, _ugssel; int cold = 1; long Maxmem = 0; long realmem = 0; /* * The number of PHYSMAP entries must be one less than the number of * PHYSSEG entries because the PHYSMAP entry that spans the largest * physical address that is accessible by ISA DMA is split into two * PHYSSEG entries. */ #define PHYSMAP_SIZE (2 * (VM_PHYSSEG_MAX - 1)) vm_paddr_t phys_avail[PHYSMAP_SIZE + 2]; vm_paddr_t dump_avail[PHYSMAP_SIZE + 2]; /* must be 2 less so 0 0 can signal end of chunks */ #define PHYS_AVAIL_ARRAY_END ((sizeof(phys_avail) / sizeof(phys_avail[0])) - 2) #define DUMP_AVAIL_ARRAY_END ((sizeof(dump_avail) / sizeof(dump_avail[0])) - 2) struct kva_md_info kmi; static struct trapframe proc0_tf; struct region_descriptor r_gdt, r_idt; struct pcpu __pcpu[MAXCPU]; struct mtx icu_lock; struct mem_range_softc mem_range_softc; struct mtx dt_lock; /* lock for GDT and LDT */ static void cpu_startup(dummy) void *dummy; { uintmax_t memsize; char *sysenv; /* * On MacBooks, we need to disallow the legacy USB circuit to * generate an SMI# because this can cause several problems, * namely: incorrect CPU frequency detection and failure to * start the APs. * We do this by disabling a bit in the SMI_EN (SMI Control and * Enable register) of the Intel ICH LPC Interface Bridge. */ sysenv = getenv("smbios.system.product"); if (sysenv != NULL) { if (strncmp(sysenv, "MacBook1,1", 10) == 0 || strncmp(sysenv, "MacBook3,1", 10) == 0 || strncmp(sysenv, "MacBookPro1,1", 13) == 0 || strncmp(sysenv, "MacBookPro1,2", 13) == 0 || strncmp(sysenv, "MacBookPro3,1", 13) == 0 || strncmp(sysenv, "Macmini1,1", 10) == 0) { if (bootverbose) printf("Disabling LEGACY_USB_EN bit on " "Intel ICH.\n"); outl(ICH_SMI_EN, inl(ICH_SMI_EN) & ~0x8); } freeenv(sysenv); } /* * Good {morning,afternoon,evening,night}. */ startrtclock(); printcpuinfo(); panicifcpuunsupported(); #ifdef PERFMON perfmon_init(); #endif realmem = Maxmem; /* * Display physical memory if SMBIOS reports reasonable amount. */ memsize = 0; sysenv = getenv("smbios.memory.enabled"); if (sysenv != NULL) { memsize = (uintmax_t)strtoul(sysenv, (char **)NULL, 10) << 10; freeenv(sysenv); } if (memsize < ptoa((uintmax_t)cnt.v_free_count)) memsize = ptoa((uintmax_t)Maxmem); printf("real memory = %ju (%ju MB)\n", memsize, memsize >> 20); /* * Display any holes after the first chunk of extended memory. */ if (bootverbose) { int indx; printf("Physical memory chunk(s):\n"); for (indx = 0; phys_avail[indx + 1] != 0; indx += 2) { vm_paddr_t size; size = phys_avail[indx + 1] - phys_avail[indx]; printf( "0x%016jx - 0x%016jx, %ju bytes (%ju pages)\n", (uintmax_t)phys_avail[indx], (uintmax_t)phys_avail[indx + 1] - 1, (uintmax_t)size, (uintmax_t)size / PAGE_SIZE); } } vm_ksubmap_init(&kmi); printf("avail memory = %ju (%ju MB)\n", ptoa((uintmax_t)cnt.v_free_count), ptoa((uintmax_t)cnt.v_free_count) / 1048576); /* * Set up buffers, so they can be used to read disk labels. */ bufinit(); vm_pager_bufferinit(); cpu_setregs(); /* * Add BSP as an interrupt target. */ intr_add_cpu(0); } /* * Send an interrupt to process. * * Stack is set up to allow sigcode stored * at top to call routine, followed by call * to sigreturn routine below. After sigreturn * resets the signal mask, the stack, and the * frame pointer, it returns to the user * specified pc, psl. */ void sendsig(sig_t catcher, ksiginfo_t *ksi, sigset_t *mask) { struct sigframe sf, *sfp; struct pcb *pcb; struct proc *p; struct thread *td; struct sigacts *psp; char *sp; struct trapframe *regs; char *xfpusave; size_t xfpusave_len; int sig; int oonstack; td = curthread; pcb = td->td_pcb; p = td->td_proc; PROC_LOCK_ASSERT(p, MA_OWNED); sig = ksi->ksi_signo; psp = p->p_sigacts; mtx_assert(&psp->ps_mtx, MA_OWNED); regs = td->td_frame; oonstack = sigonstack(regs->tf_rsp); if (cpu_max_ext_state_size > sizeof(struct savefpu) && use_xsave) { xfpusave_len = cpu_max_ext_state_size - sizeof(struct savefpu); xfpusave = __builtin_alloca(xfpusave_len); } else { xfpusave_len = 0; xfpusave = NULL; } /* Save user context. */ bzero(&sf, sizeof(sf)); sf.sf_uc.uc_sigmask = *mask; sf.sf_uc.uc_stack = td->td_sigstk; sf.sf_uc.uc_stack.ss_flags = (td->td_pflags & TDP_ALTSTACK) ? ((oonstack) ? SS_ONSTACK : 0) : SS_DISABLE; sf.sf_uc.uc_mcontext.mc_onstack = (oonstack) ? 1 : 0; bcopy(regs, &sf.sf_uc.uc_mcontext.mc_rdi, sizeof(*regs)); sf.sf_uc.uc_mcontext.mc_len = sizeof(sf.sf_uc.uc_mcontext); /* magic */ get_fpcontext(td, &sf.sf_uc.uc_mcontext, xfpusave, xfpusave_len); fpstate_drop(td); sf.sf_uc.uc_mcontext.mc_fsbase = pcb->pcb_fsbase; sf.sf_uc.uc_mcontext.mc_gsbase = pcb->pcb_gsbase; bzero(sf.sf_uc.uc_mcontext.mc_spare, sizeof(sf.sf_uc.uc_mcontext.mc_spare)); bzero(sf.sf_uc.__spare__, sizeof(sf.sf_uc.__spare__)); /* Allocate space for the signal handler context. */ if ((td->td_pflags & TDP_ALTSTACK) != 0 && !oonstack && SIGISMEMBER(psp->ps_sigonstack, sig)) { sp = td->td_sigstk.ss_sp + td->td_sigstk.ss_size; #if defined(COMPAT_43) td->td_sigstk.ss_flags |= SS_ONSTACK; #endif } else sp = (char *)regs->tf_rsp - 128; if (xfpusave != NULL) { sp -= xfpusave_len; sp = (char *)((unsigned long)sp & ~0x3Ful); sf.sf_uc.uc_mcontext.mc_xfpustate = (register_t)sp; } sp -= sizeof(struct sigframe); /* Align to 16 bytes. */ sfp = (struct sigframe *)((unsigned long)sp & ~0xFul); /* Translate the signal if appropriate. */ if (p->p_sysent->sv_sigtbl && sig <= p->p_sysent->sv_sigsize) sig = p->p_sysent->sv_sigtbl[_SIG_IDX(sig)]; /* Build the argument list for the signal handler. */ regs->tf_rdi = sig; /* arg 1 in %rdi */ regs->tf_rdx = (register_t)&sfp->sf_uc; /* arg 3 in %rdx */ bzero(&sf.sf_si, sizeof(sf.sf_si)); if (SIGISMEMBER(psp->ps_siginfo, sig)) { /* Signal handler installed with SA_SIGINFO. */ regs->tf_rsi = (register_t)&sfp->sf_si; /* arg 2 in %rsi */ sf.sf_ahu.sf_action = (__siginfohandler_t *)catcher; /* Fill in POSIX parts */ sf.sf_si = ksi->ksi_info; sf.sf_si.si_signo = sig; /* maybe a translated signal */ regs->tf_rcx = (register_t)ksi->ksi_addr; /* arg 4 in %rcx */ } else { /* Old FreeBSD-style arguments. */ regs->tf_rsi = ksi->ksi_code; /* arg 2 in %rsi */ regs->tf_rcx = (register_t)ksi->ksi_addr; /* arg 4 in %rcx */ sf.sf_ahu.sf_handler = catcher; } mtx_unlock(&psp->ps_mtx); PROC_UNLOCK(p); /* * Copy the sigframe out to the user's stack. */ if (copyout(&sf, sfp, sizeof(*sfp)) != 0 || (xfpusave != NULL && copyout(xfpusave, (void *)sf.sf_uc.uc_mcontext.mc_xfpustate, xfpusave_len) != 0)) { #ifdef DEBUG printf("process %ld has trashed its stack\n", (long)p->p_pid); #endif PROC_LOCK(p); sigexit(td, SIGILL); } regs->tf_rsp = (long)sfp; regs->tf_rip = p->p_sysent->sv_sigcode_base; regs->tf_rflags &= ~(PSL_T | PSL_D); regs->tf_cs = _ucodesel; regs->tf_ds = _udatasel; + regs->tf_ss = _udatasel; regs->tf_es = _udatasel; regs->tf_fs = _ufssel; regs->tf_gs = _ugssel; regs->tf_flags = TF_HASSEGS; set_pcb_flags(pcb, PCB_FULL_IRET); PROC_LOCK(p); mtx_lock(&psp->ps_mtx); } /* * System call to cleanup state after a signal * has been taken. Reset signal mask and * stack state from context left by sendsig (above). * Return to previous pc and psl as specified by * context left by sendsig. Check carefully to * make sure that the user has not modified the * state to gain improper privileges. * * MPSAFE */ int sys_sigreturn(td, uap) struct thread *td; struct sigreturn_args /* { const struct __ucontext *sigcntxp; } */ *uap; { ucontext_t uc; struct pcb *pcb; struct proc *p; struct trapframe *regs; ucontext_t *ucp; char *xfpustate; size_t xfpustate_len; long rflags; int cs, error, ret; ksiginfo_t ksi; pcb = td->td_pcb; p = td->td_proc; error = copyin(uap->sigcntxp, &uc, sizeof(uc)); if (error != 0) { uprintf("pid %d (%s): sigreturn copyin failed\n", p->p_pid, td->td_name); return (error); } ucp = &uc; if ((ucp->uc_mcontext.mc_flags & ~_MC_FLAG_MASK) != 0) { uprintf("pid %d (%s): sigreturn mc_flags %x\n", p->p_pid, td->td_name, ucp->uc_mcontext.mc_flags); return (EINVAL); } regs = td->td_frame; rflags = ucp->uc_mcontext.mc_rflags; /* * Don't allow users to change privileged or reserved flags. */ if (!EFL_SECURE(rflags, regs->tf_rflags)) { uprintf("pid %d (%s): sigreturn rflags = 0x%lx\n", p->p_pid, td->td_name, rflags); return (EINVAL); } /* * Don't allow users to load a valid privileged %cs. Let the * hardware check for invalid selectors, excess privilege in * other selectors, invalid %eip's and invalid %esp's. */ cs = ucp->uc_mcontext.mc_cs; if (!CS_SECURE(cs)) { uprintf("pid %d (%s): sigreturn cs = 0x%x\n", p->p_pid, td->td_name, cs); ksiginfo_init_trap(&ksi); ksi.ksi_signo = SIGBUS; ksi.ksi_code = BUS_OBJERR; ksi.ksi_trapno = T_PROTFLT; ksi.ksi_addr = (void *)regs->tf_rip; trapsignal(td, &ksi); return (EINVAL); } if ((uc.uc_mcontext.mc_flags & _MC_HASFPXSTATE) != 0) { xfpustate_len = uc.uc_mcontext.mc_xfpustate_len; if (xfpustate_len > cpu_max_ext_state_size - sizeof(struct savefpu)) { uprintf("pid %d (%s): sigreturn xfpusave_len = 0x%zx\n", p->p_pid, td->td_name, xfpustate_len); return (EINVAL); } xfpustate = __builtin_alloca(xfpustate_len); error = copyin((const void *)uc.uc_mcontext.mc_xfpustate, xfpustate, xfpustate_len); if (error != 0) { uprintf( "pid %d (%s): sigreturn copying xfpustate failed\n", p->p_pid, td->td_name); return (error); } } else { xfpustate = NULL; xfpustate_len = 0; } ret = set_fpcontext(td, &ucp->uc_mcontext, xfpustate, xfpustate_len); if (ret != 0) { uprintf("pid %d (%s): sigreturn set_fpcontext err %d\n", p->p_pid, td->td_name, ret); return (ret); } bcopy(&ucp->uc_mcontext.mc_rdi, regs, sizeof(*regs)); pcb->pcb_fsbase = ucp->uc_mcontext.mc_fsbase; pcb->pcb_gsbase = ucp->uc_mcontext.mc_gsbase; #if defined(COMPAT_43) if (ucp->uc_mcontext.mc_onstack & 1) td->td_sigstk.ss_flags |= SS_ONSTACK; else td->td_sigstk.ss_flags &= ~SS_ONSTACK; #endif kern_sigprocmask(td, SIG_SETMASK, &ucp->uc_sigmask, NULL, 0); set_pcb_flags(pcb, PCB_FULL_IRET); return (EJUSTRETURN); } #ifdef COMPAT_FREEBSD4 int freebsd4_sigreturn(struct thread *td, struct freebsd4_sigreturn_args *uap) { return sys_sigreturn(td, (struct sigreturn_args *)uap); } #endif /* * Machine dependent boot() routine * * I haven't seen anything to put here yet * Possibly some stuff might be grafted back here from boot() */ void cpu_boot(int howto) { } /* * Flush the D-cache for non-DMA I/O so that the I-cache can * be made coherent later. */ void cpu_flush_dcache(void *ptr, size_t len) { /* Not applicable */ } /* Get current clock frequency for the given cpu id. */ int cpu_est_clockrate(int cpu_id, uint64_t *rate) { uint64_t tsc1, tsc2; uint64_t acnt, mcnt, perf; register_t reg; if (pcpu_find(cpu_id) == NULL || rate == NULL) return (EINVAL); /* * If TSC is P-state invariant and APERF/MPERF MSRs do not exist, * DELAY(9) based logic fails. */ if (tsc_is_invariant && !tsc_perf_stat) return (EOPNOTSUPP); #ifdef SMP if (smp_cpus > 1) { /* Schedule ourselves on the indicated cpu. */ thread_lock(curthread); sched_bind(curthread, cpu_id); thread_unlock(curthread); } #endif /* Calibrate by measuring a short delay. */ reg = intr_disable(); if (tsc_is_invariant) { wrmsr(MSR_MPERF, 0); wrmsr(MSR_APERF, 0); tsc1 = rdtsc(); DELAY(1000); mcnt = rdmsr(MSR_MPERF); acnt = rdmsr(MSR_APERF); tsc2 = rdtsc(); intr_restore(reg); perf = 1000 * acnt / mcnt; *rate = (tsc2 - tsc1) * perf; } else { tsc1 = rdtsc(); DELAY(1000); tsc2 = rdtsc(); intr_restore(reg); *rate = (tsc2 - tsc1) * 1000; } #ifdef SMP if (smp_cpus > 1) { thread_lock(curthread); sched_unbind(curthread); thread_unlock(curthread); } #endif return (0); } /* * Shutdown the CPU as much as possible */ void cpu_halt(void) { for (;;) __asm__ ("hlt"); } void (*cpu_idle_hook)(void) = NULL; /* ACPI idle hook. */ static int cpu_ident_amdc1e = 0; /* AMD C1E supported. */ static int idle_mwait = 1; /* Use MONITOR/MWAIT for short idle. */ TUNABLE_INT("machdep.idle_mwait", &idle_mwait); SYSCTL_INT(_machdep, OID_AUTO, idle_mwait, CTLFLAG_RW, &idle_mwait, 0, "Use MONITOR/MWAIT for short idle"); #define STATE_RUNNING 0x0 #define STATE_MWAIT 0x1 #define STATE_SLEEPING 0x2 static void cpu_idle_acpi(int busy) { int *state; state = (int *)PCPU_PTR(monitorbuf); *state = STATE_SLEEPING; disable_intr(); if (sched_runnable()) enable_intr(); else if (cpu_idle_hook) cpu_idle_hook(); else __asm __volatile("sti; hlt"); *state = STATE_RUNNING; } static void cpu_idle_hlt(int busy) { int *state; state = (int *)PCPU_PTR(monitorbuf); *state = STATE_SLEEPING; /* * We must absolutely guarentee that hlt is the next instruction * after sti or we introduce a timing window. */ disable_intr(); if (sched_runnable()) enable_intr(); else __asm __volatile("sti; hlt"); *state = STATE_RUNNING; } /* * MWAIT cpu power states. Lower 4 bits are sub-states. */ #define MWAIT_C0 0xf0 #define MWAIT_C1 0x00 #define MWAIT_C2 0x10 #define MWAIT_C3 0x20 #define MWAIT_C4 0x30 static void cpu_idle_mwait(int busy) { int *state; state = (int *)PCPU_PTR(monitorbuf); *state = STATE_MWAIT; if (!sched_runnable()) { cpu_monitor(state, 0, 0); if (*state == STATE_MWAIT) cpu_mwait(0, MWAIT_C1); } *state = STATE_RUNNING; } static void cpu_idle_spin(int busy) { int *state; int i; state = (int *)PCPU_PTR(monitorbuf); *state = STATE_RUNNING; for (i = 0; i < 1000; i++) { if (sched_runnable()) return; cpu_spinwait(); } } /* * C1E renders the local APIC timer dead, so we disable it by * reading the Interrupt Pending Message register and clearing * both C1eOnCmpHalt (bit 28) and SmiOnCmpHalt (bit 27). * * Reference: * "BIOS and Kernel Developer's Guide for AMD NPT Family 0Fh Processors" * #32559 revision 3.00+ */ #define MSR_AMDK8_IPM 0xc0010055 #define AMDK8_SMIONCMPHALT (1ULL << 27) #define AMDK8_C1EONCMPHALT (1ULL << 28) #define AMDK8_CMPHALT (AMDK8_SMIONCMPHALT | AMDK8_C1EONCMPHALT) static void cpu_probe_amdc1e(void) { /* * Detect the presence of C1E capability mostly on latest * dual-cores (or future) k8 family. */ if (cpu_vendor_id == CPU_VENDOR_AMD && (cpu_id & 0x00000f00) == 0x00000f00 && (cpu_id & 0x0fff0000) >= 0x00040000) { cpu_ident_amdc1e = 1; } } void (*cpu_idle_fn)(int) = cpu_idle_acpi; void cpu_idle(int busy) { uint64_t msr; CTR2(KTR_SPARE2, "cpu_idle(%d) at %d", busy, curcpu); #ifdef MP_WATCHDOG ap_watchdog(PCPU_GET(cpuid)); #endif /* If we are busy - try to use fast methods. */ if (busy) { if ((cpu_feature2 & CPUID2_MON) && idle_mwait) { cpu_idle_mwait(busy); goto out; } } /* If we have time - switch timers into idle mode. */ if (!busy) { critical_enter(); cpu_idleclock(); } /* Apply AMD APIC timer C1E workaround. */ if (cpu_ident_amdc1e && cpu_disable_deep_sleep) { msr = rdmsr(MSR_AMDK8_IPM); if (msr & AMDK8_CMPHALT) wrmsr(MSR_AMDK8_IPM, msr & ~AMDK8_CMPHALT); } /* Call main idle method. */ cpu_idle_fn(busy); /* Switch timers mack into active mode. */ if (!busy) { cpu_activeclock(); critical_exit(); } out: CTR2(KTR_SPARE2, "cpu_idle(%d) at %d done", busy, curcpu); } int cpu_idle_wakeup(int cpu) { struct pcpu *pcpu; int *state; pcpu = pcpu_find(cpu); state = (int *)pcpu->pc_monitorbuf; /* * This doesn't need to be atomic since missing the race will * simply result in unnecessary IPIs. */ if (*state == STATE_SLEEPING) return (0); if (*state == STATE_MWAIT) *state = STATE_RUNNING; return (1); } /* * Ordered by speed/power consumption. */ struct { void *id_fn; char *id_name; } idle_tbl[] = { { cpu_idle_spin, "spin" }, { cpu_idle_mwait, "mwait" }, { cpu_idle_hlt, "hlt" }, { cpu_idle_acpi, "acpi" }, { NULL, NULL } }; static int idle_sysctl_available(SYSCTL_HANDLER_ARGS) { char *avail, *p; int error; int i; avail = malloc(256, M_TEMP, M_WAITOK); p = avail; for (i = 0; idle_tbl[i].id_name != NULL; i++) { if (strstr(idle_tbl[i].id_name, "mwait") && (cpu_feature2 & CPUID2_MON) == 0) continue; if (strcmp(idle_tbl[i].id_name, "acpi") == 0 && cpu_idle_hook == NULL) continue; p += sprintf(p, "%s%s", p != avail ? ", " : "", idle_tbl[i].id_name); } error = sysctl_handle_string(oidp, avail, 0, req); free(avail, M_TEMP); return (error); } SYSCTL_PROC(_machdep, OID_AUTO, idle_available, CTLTYPE_STRING | CTLFLAG_RD, 0, 0, idle_sysctl_available, "A", "list of available idle functions"); static int idle_sysctl(SYSCTL_HANDLER_ARGS) { char buf[16]; int error; char *p; int i; p = "unknown"; for (i = 0; idle_tbl[i].id_name != NULL; i++) { if (idle_tbl[i].id_fn == cpu_idle_fn) { p = idle_tbl[i].id_name; break; } } strncpy(buf, p, sizeof(buf)); error = sysctl_handle_string(oidp, buf, sizeof(buf), req); if (error != 0 || req->newptr == NULL) return (error); for (i = 0; idle_tbl[i].id_name != NULL; i++) { if (strstr(idle_tbl[i].id_name, "mwait") && (cpu_feature2 & CPUID2_MON) == 0) continue; if (strcmp(idle_tbl[i].id_name, "acpi") == 0 && cpu_idle_hook == NULL) continue; if (strcmp(idle_tbl[i].id_name, buf)) continue; cpu_idle_fn = idle_tbl[i].id_fn; return (0); } return (EINVAL); } SYSCTL_PROC(_machdep, OID_AUTO, idle, CTLTYPE_STRING | CTLFLAG_RW, 0, 0, idle_sysctl, "A", "currently selected idle function"); /* * Reset registers to default values on exec. */ void exec_setregs(struct thread *td, struct image_params *imgp, u_long stack) { struct trapframe *regs = td->td_frame; struct pcb *pcb = td->td_pcb; mtx_lock(&dt_lock); if (td->td_proc->p_md.md_ldt != NULL) user_ldt_free(td); else mtx_unlock(&dt_lock); pcb->pcb_fsbase = 0; pcb->pcb_gsbase = 0; clear_pcb_flags(pcb, PCB_32BIT); pcb->pcb_initial_fpucw = __INITIAL_FPUCW__; set_pcb_flags(pcb, PCB_FULL_IRET); bzero((char *)regs, sizeof(struct trapframe)); regs->tf_rip = imgp->entry_addr; regs->tf_rsp = ((stack - 8) & ~0xFul) + 8; regs->tf_rdi = stack; /* argv */ regs->tf_rflags = PSL_USER | (regs->tf_rflags & PSL_T); regs->tf_ss = _udatasel; regs->tf_cs = _ucodesel; regs->tf_ds = _udatasel; regs->tf_es = _udatasel; regs->tf_fs = _ufssel; regs->tf_gs = _ugssel; regs->tf_flags = TF_HASSEGS; td->td_retval[1] = 0; /* * Reset the hardware debug registers if they were in use. * They won't have any meaning for the newly exec'd process. */ if (pcb->pcb_flags & PCB_DBREGS) { pcb->pcb_dr0 = 0; pcb->pcb_dr1 = 0; pcb->pcb_dr2 = 0; pcb->pcb_dr3 = 0; pcb->pcb_dr6 = 0; pcb->pcb_dr7 = 0; if (pcb == curpcb) { /* * Clear the debug registers on the running * CPU, otherwise they will end up affecting * the next process we switch to. */ reset_dbregs(); } clear_pcb_flags(pcb, PCB_DBREGS); } /* * Drop the FP state if we hold it, so that the process gets a * clean FP state if it uses the FPU again. */ fpstate_drop(td); } void cpu_setregs(void) { register_t cr0; cr0 = rcr0(); /* * CR0_MP, CR0_NE and CR0_TS are also set by npx_probe() for the * BSP. See the comments there about why we set them. */ cr0 |= CR0_MP | CR0_NE | CR0_TS | CR0_WP | CR0_AM; load_cr0(cr0); } /* * Initialize amd64 and configure to run kernel */ /* * Initialize segments & interrupt table */ struct user_segment_descriptor gdt[NGDT * MAXCPU];/* global descriptor tables */ static struct gate_descriptor idt0[NIDT]; struct gate_descriptor *idt = &idt0[0]; /* interrupt descriptor table */ static char dblfault_stack[PAGE_SIZE] __aligned(16); static char nmi0_stack[PAGE_SIZE] __aligned(16); CTASSERT(sizeof(struct nmi_pcpu) == 16); struct amd64tss common_tss[MAXCPU]; /* * Software prototypes -- in more palatable form. * * Keep GUFS32, GUGS32, GUCODE32 and GUDATA at the same * slots as corresponding segments for i386 kernel. */ struct soft_segment_descriptor gdt_segs[] = { /* GNULL_SEL 0 Null Descriptor */ { .ssd_base = 0x0, .ssd_limit = 0x0, .ssd_type = 0, .ssd_dpl = 0, .ssd_p = 0, .ssd_long = 0, .ssd_def32 = 0, .ssd_gran = 0 }, /* GNULL2_SEL 1 Null Descriptor */ { .ssd_base = 0x0, .ssd_limit = 0x0, .ssd_type = 0, .ssd_dpl = 0, .ssd_p = 0, .ssd_long = 0, .ssd_def32 = 0, .ssd_gran = 0 }, /* GUFS32_SEL 2 32 bit %gs Descriptor for user */ { .ssd_base = 0x0, .ssd_limit = 0xfffff, .ssd_type = SDT_MEMRWA, .ssd_dpl = SEL_UPL, .ssd_p = 1, .ssd_long = 0, .ssd_def32 = 1, .ssd_gran = 1 }, /* GUGS32_SEL 3 32 bit %fs Descriptor for user */ { .ssd_base = 0x0, .ssd_limit = 0xfffff, .ssd_type = SDT_MEMRWA, .ssd_dpl = SEL_UPL, .ssd_p = 1, .ssd_long = 0, .ssd_def32 = 1, .ssd_gran = 1 }, /* GCODE_SEL 4 Code Descriptor for kernel */ { .ssd_base = 0x0, .ssd_limit = 0xfffff, .ssd_type = SDT_MEMERA, .ssd_dpl = SEL_KPL, .ssd_p = 1, .ssd_long = 1, .ssd_def32 = 0, .ssd_gran = 1 }, /* GDATA_SEL 5 Data Descriptor for kernel */ { .ssd_base = 0x0, .ssd_limit = 0xfffff, .ssd_type = SDT_MEMRWA, .ssd_dpl = SEL_KPL, .ssd_p = 1, .ssd_long = 1, .ssd_def32 = 0, .ssd_gran = 1 }, /* GUCODE32_SEL 6 32 bit Code Descriptor for user */ { .ssd_base = 0x0, .ssd_limit = 0xfffff, .ssd_type = SDT_MEMERA, .ssd_dpl = SEL_UPL, .ssd_p = 1, .ssd_long = 0, .ssd_def32 = 1, .ssd_gran = 1 }, /* GUDATA_SEL 7 32/64 bit Data Descriptor for user */ { .ssd_base = 0x0, .ssd_limit = 0xfffff, .ssd_type = SDT_MEMRWA, .ssd_dpl = SEL_UPL, .ssd_p = 1, .ssd_long = 0, .ssd_def32 = 1, .ssd_gran = 1 }, /* GUCODE_SEL 8 64 bit Code Descriptor for user */ { .ssd_base = 0x0, .ssd_limit = 0xfffff, .ssd_type = SDT_MEMERA, .ssd_dpl = SEL_UPL, .ssd_p = 1, .ssd_long = 1, .ssd_def32 = 0, .ssd_gran = 1 }, /* GPROC0_SEL 9 Proc 0 Tss Descriptor */ { .ssd_base = 0x0, .ssd_limit = sizeof(struct amd64tss) + IOPAGES * PAGE_SIZE - 1, .ssd_type = SDT_SYSTSS, .ssd_dpl = SEL_KPL, .ssd_p = 1, .ssd_long = 0, .ssd_def32 = 0, .ssd_gran = 0 }, /* Actually, the TSS is a system descriptor which is double size */ { .ssd_base = 0x0, .ssd_limit = 0x0, .ssd_type = 0, .ssd_dpl = 0, .ssd_p = 0, .ssd_long = 0, .ssd_def32 = 0, .ssd_gran = 0 }, /* GUSERLDT_SEL 11 LDT Descriptor */ { .ssd_base = 0x0, .ssd_limit = 0x0, .ssd_type = 0, .ssd_dpl = 0, .ssd_p = 0, .ssd_long = 0, .ssd_def32 = 0, .ssd_gran = 0 }, /* GUSERLDT_SEL 12 LDT Descriptor, double size */ { .ssd_base = 0x0, .ssd_limit = 0x0, .ssd_type = 0, .ssd_dpl = 0, .ssd_p = 0, .ssd_long = 0, .ssd_def32 = 0, .ssd_gran = 0 }, }; void setidt(idx, func, typ, dpl, ist) int idx; inthand_t *func; int typ; int dpl; int ist; { struct gate_descriptor *ip; ip = idt + idx; ip->gd_looffset = (uintptr_t)func; ip->gd_selector = GSEL(GCODE_SEL, SEL_KPL); ip->gd_ist = ist; ip->gd_xx = 0; ip->gd_type = typ; ip->gd_dpl = dpl; ip->gd_p = 1; ip->gd_hioffset = ((uintptr_t)func)>>16 ; } extern inthand_t IDTVEC(div), IDTVEC(dbg), IDTVEC(nmi), IDTVEC(bpt), IDTVEC(ofl), IDTVEC(bnd), IDTVEC(ill), IDTVEC(dna), IDTVEC(fpusegm), IDTVEC(tss), IDTVEC(missing), IDTVEC(stk), IDTVEC(prot), IDTVEC(page), IDTVEC(mchk), IDTVEC(rsvd), IDTVEC(fpu), IDTVEC(align), IDTVEC(xmm), IDTVEC(dblfault), #ifdef KDTRACE_HOOKS IDTVEC(dtrace_ret), #endif IDTVEC(fast_syscall), IDTVEC(fast_syscall32); #ifdef DDB /* * Display the index and function name of any IDT entries that don't use * the default 'rsvd' entry point. */ DB_SHOW_COMMAND(idt, db_show_idt) { struct gate_descriptor *ip; int idx; uintptr_t func; ip = idt; for (idx = 0; idx < NIDT && !db_pager_quit; idx++) { func = ((long)ip->gd_hioffset << 16 | ip->gd_looffset); if (func != (uintptr_t)&IDTVEC(rsvd)) { db_printf("%3d\t", idx); db_printsym(func, DB_STGY_PROC); db_printf("\n"); } ip++; } } /* Show privileged registers. */ DB_SHOW_COMMAND(sysregs, db_show_sysregs) { struct { uint16_t limit; uint64_t base; } __packed idtr, gdtr; uint16_t ldt, tr; __asm __volatile("sidt %0" : "=m" (idtr)); db_printf("idtr\t0x%016lx/%04x\n", (u_long)idtr.base, (u_int)idtr.limit); __asm __volatile("sgdt %0" : "=m" (gdtr)); db_printf("gdtr\t0x%016lx/%04x\n", (u_long)gdtr.base, (u_int)gdtr.limit); __asm __volatile("sldt %0" : "=r" (ldt)); db_printf("ldtr\t0x%04x\n", ldt); __asm __volatile("str %0" : "=r" (tr)); db_printf("tr\t0x%04x\n", tr); db_printf("cr0\t0x%016lx\n", rcr0()); db_printf("cr2\t0x%016lx\n", rcr2()); db_printf("cr3\t0x%016lx\n", rcr3()); db_printf("cr4\t0x%016lx\n", rcr4()); db_printf("EFER\t%016lx\n", rdmsr(MSR_EFER)); db_printf("FEATURES_CTL\t%016lx\n", rdmsr(MSR_IA32_FEATURE_CONTROL)); db_printf("DEBUG_CTL\t%016lx\n", rdmsr(MSR_DEBUGCTLMSR)); db_printf("PAT\t%016lx\n", rdmsr(MSR_PAT)); db_printf("GSBASE\t%016lx\n", rdmsr(MSR_GSBASE)); } #endif void sdtossd(sd, ssd) struct user_segment_descriptor *sd; struct soft_segment_descriptor *ssd; { ssd->ssd_base = (sd->sd_hibase << 24) | sd->sd_lobase; ssd->ssd_limit = (sd->sd_hilimit << 16) | sd->sd_lolimit; ssd->ssd_type = sd->sd_type; ssd->ssd_dpl = sd->sd_dpl; ssd->ssd_p = sd->sd_p; ssd->ssd_long = sd->sd_long; ssd->ssd_def32 = sd->sd_def32; ssd->ssd_gran = sd->sd_gran; } void ssdtosd(ssd, sd) struct soft_segment_descriptor *ssd; struct user_segment_descriptor *sd; { sd->sd_lobase = (ssd->ssd_base) & 0xffffff; sd->sd_hibase = (ssd->ssd_base >> 24) & 0xff; sd->sd_lolimit = (ssd->ssd_limit) & 0xffff; sd->sd_hilimit = (ssd->ssd_limit >> 16) & 0xf; sd->sd_type = ssd->ssd_type; sd->sd_dpl = ssd->ssd_dpl; sd->sd_p = ssd->ssd_p; sd->sd_long = ssd->ssd_long; sd->sd_def32 = ssd->ssd_def32; sd->sd_gran = ssd->ssd_gran; } void ssdtosyssd(ssd, sd) struct soft_segment_descriptor *ssd; struct system_segment_descriptor *sd; { sd->sd_lobase = (ssd->ssd_base) & 0xffffff; sd->sd_hibase = (ssd->ssd_base >> 24) & 0xfffffffffful; sd->sd_lolimit = (ssd->ssd_limit) & 0xffff; sd->sd_hilimit = (ssd->ssd_limit >> 16) & 0xf; sd->sd_type = ssd->ssd_type; sd->sd_dpl = ssd->ssd_dpl; sd->sd_p = ssd->ssd_p; sd->sd_gran = ssd->ssd_gran; } #if !defined(DEV_ATPIC) && defined(DEV_ISA) #include #include /* * Return a bitmap of the current interrupt requests. This is 8259-specific * and is only suitable for use at probe time. * This is only here to pacify sio. It is NOT FATAL if this doesn't work. * It shouldn't be here. There should probably be an APIC centric * implementation in the apic driver code, if at all. */ intrmask_t isa_irq_pending(void) { u_char irr1; u_char irr2; irr1 = inb(IO_ICU1); irr2 = inb(IO_ICU2); return ((irr2 << 8) | irr1); } #endif u_int basemem; static int add_smap_entry(struct bios_smap *smap, vm_paddr_t *physmap, int *physmap_idxp) { int i, insert_idx, physmap_idx; physmap_idx = *physmap_idxp; if (boothowto & RB_VERBOSE) printf("SMAP type=%02x base=%016lx len=%016lx\n", smap->type, smap->base, smap->length); if (smap->type != SMAP_TYPE_MEMORY) return (1); if (smap->length == 0) return (0); /* * Find insertion point while checking for overlap. Start off by * assuming the new entry will be added to the end. */ insert_idx = physmap_idx + 2; for (i = 0; i <= physmap_idx; i += 2) { if (smap->base < physmap[i + 1]) { if (smap->base + smap->length <= physmap[i]) { insert_idx = i; break; } if (boothowto & RB_VERBOSE) printf( "Overlapping memory regions, ignoring second region\n"); return (1); } } /* See if we can prepend to the next entry. */ if (insert_idx <= physmap_idx && smap->base + smap->length == physmap[insert_idx]) { physmap[insert_idx] = smap->base; return (1); } /* See if we can append to the previous entry. */ if (insert_idx > 0 && smap->base == physmap[insert_idx - 1]) { physmap[insert_idx - 1] += smap->length; return (1); } physmap_idx += 2; *physmap_idxp = physmap_idx; if (physmap_idx == PHYSMAP_SIZE) { printf( "Too many segments in the physical address map, giving up\n"); return (0); } /* * Move the last 'N' entries down to make room for the new * entry if needed. */ for (i = physmap_idx; i > insert_idx; i -= 2) { physmap[i] = physmap[i - 2]; physmap[i + 1] = physmap[i - 1]; } /* Insert the new entry. */ physmap[insert_idx] = smap->base; physmap[insert_idx + 1] = smap->base + smap->length; return (1); } /* * Populate the (physmap) array with base/bound pairs describing the * available physical memory in the system, then test this memory and * build the phys_avail array describing the actually-available memory. * * Total memory size may be set by the kernel environment variable * hw.physmem or the compile-time define MAXMEM. * * XXX first should be vm_paddr_t. */ static void getmemsize(caddr_t kmdp, u_int64_t first) { int i, physmap_idx, pa_indx, da_indx; vm_paddr_t pa, physmap[PHYSMAP_SIZE]; u_long physmem_start, physmem_tunable, memtest; pt_entry_t *pte; struct bios_smap *smapbase, *smap, *smapend; u_int32_t smapsize; quad_t dcons_addr, dcons_size; bzero(physmap, sizeof(physmap)); basemem = 0; physmap_idx = 0; /* * get memory map from INT 15:E820, kindly supplied by the loader. * * subr_module.c says: * "Consumer may safely assume that size value precedes data." * ie: an int32_t immediately precedes smap. */ smapbase = (struct bios_smap *)preload_search_info(kmdp, MODINFO_METADATA | MODINFOMD_SMAP); if (smapbase == NULL) panic("No BIOS smap info from loader!"); smapsize = *((u_int32_t *)smapbase - 1); smapend = (struct bios_smap *)((uintptr_t)smapbase + smapsize); for (smap = smapbase; smap < smapend; smap++) if (!add_smap_entry(smap, physmap, &physmap_idx)) break; /* * Find the 'base memory' segment for SMP */ basemem = 0; for (i = 0; i <= physmap_idx; i += 2) { if (physmap[i] == 0x00000000) { basemem = physmap[i + 1] / 1024; break; } } if (basemem == 0) panic("BIOS smap did not include a basemem segment!"); #ifdef SMP /* make hole for AP bootstrap code */ physmap[1] = mp_bootaddress(physmap[1] / 1024); #endif /* * Maxmem isn't the "maximum memory", it's one larger than the * highest page of the physical address space. It should be * called something like "Maxphyspage". We may adjust this * based on ``hw.physmem'' and the results of the memory test. */ Maxmem = atop(physmap[physmap_idx + 1]); #ifdef MAXMEM Maxmem = MAXMEM / 4; #endif if (TUNABLE_ULONG_FETCH("hw.physmem", &physmem_tunable)) Maxmem = atop(physmem_tunable); /* * By default enable the memory test on real hardware, and disable * it if we appear to be running in a VM. This avoids touching all * pages unnecessarily, which doesn't matter on real hardware but is * bad for shared VM hosts. Use a general name so that * one could eventually do more with the code than just disable it. */ memtest = (vm_guest > VM_GUEST_NO) ? 0 : 1; TUNABLE_ULONG_FETCH("hw.memtest.tests", &memtest); /* * Don't allow MAXMEM or hw.physmem to extend the amount of memory * in the system. */ if (Maxmem > atop(physmap[physmap_idx + 1])) Maxmem = atop(physmap[physmap_idx + 1]); if (atop(physmap[physmap_idx + 1]) != Maxmem && (boothowto & RB_VERBOSE)) printf("Physical memory use set to %ldK\n", Maxmem * 4); /* call pmap initialization to make new kernel address space */ pmap_bootstrap(&first); /* * Size up each available chunk of physical memory. * * XXX Some BIOSes corrupt low 64KB between suspend and resume. * By default, mask off the first 16 pages unless we appear to be * running in a VM. */ physmem_start = (vm_guest > VM_GUEST_NO ? 1 : 16) << PAGE_SHIFT; TUNABLE_ULONG_FETCH("hw.physmem.start", &physmem_start); if (physmem_start < PAGE_SIZE) physmap[0] = PAGE_SIZE; else if (physmem_start >= physmap[1]) physmap[0] = round_page(physmap[1] - PAGE_SIZE); else physmap[0] = round_page(physmem_start); pa_indx = 0; da_indx = 1; phys_avail[pa_indx++] = physmap[0]; phys_avail[pa_indx] = physmap[0]; dump_avail[da_indx] = physmap[0]; pte = CMAP1; /* * Get dcons buffer address */ if (getenv_quad("dcons.addr", &dcons_addr) == 0 || getenv_quad("dcons.size", &dcons_size) == 0) dcons_addr = 0; /* * physmap is in bytes, so when converting to page boundaries, * round up the start address and round down the end address. */ for (i = 0; i <= physmap_idx; i += 2) { vm_paddr_t end; end = ptoa((vm_paddr_t)Maxmem); if (physmap[i + 1] < end) end = trunc_page(physmap[i + 1]); for (pa = round_page(physmap[i]); pa < end; pa += PAGE_SIZE) { int tmp, page_bad, full; int *ptr = (int *)CADDR1; full = FALSE; /* * block out kernel memory as not available. */ if (pa >= (vm_paddr_t)kernphys && pa < first) goto do_dump_avail; /* * block out dcons buffer */ if (dcons_addr > 0 && pa >= trunc_page(dcons_addr) && pa < dcons_addr + dcons_size) goto do_dump_avail; page_bad = FALSE; if (memtest == 0) goto skip_memtest; /* * map page into kernel: valid, read/write,non-cacheable */ *pte = pa | PG_V | PG_RW | PG_N; invltlb(); tmp = *(int *)ptr; /* * Test for alternating 1's and 0's */ *(volatile int *)ptr = 0xaaaaaaaa; if (*(volatile int *)ptr != 0xaaaaaaaa) page_bad = TRUE; /* * Test for alternating 0's and 1's */ *(volatile int *)ptr = 0x55555555; if (*(volatile int *)ptr != 0x55555555) page_bad = TRUE; /* * Test for all 1's */ *(volatile int *)ptr = 0xffffffff; if (*(volatile int *)ptr != 0xffffffff) page_bad = TRUE; /* * Test for all 0's */ *(volatile int *)ptr = 0x0; if (*(volatile int *)ptr != 0x0) page_bad = TRUE; /* * Restore original value. */ *(int *)ptr = tmp; skip_memtest: /* * Adjust array of valid/good pages. */ if (page_bad == TRUE) continue; /* * If this good page is a continuation of the * previous set of good pages, then just increase * the end pointer. Otherwise start a new chunk. * Note that "end" points one higher than end, * making the range >= start and < end. * If we're also doing a speculative memory * test and we at or past the end, bump up Maxmem * so that we keep going. The first bad page * will terminate the loop. */ if (phys_avail[pa_indx] == pa) { phys_avail[pa_indx] += PAGE_SIZE; } else { pa_indx++; if (pa_indx == PHYS_AVAIL_ARRAY_END) { printf( "Too many holes in the physical address space, giving up\n"); pa_indx--; full = TRUE; goto do_dump_avail; } phys_avail[pa_indx++] = pa; /* start */ phys_avail[pa_indx] = pa + PAGE_SIZE; /* end */ } physmem++; do_dump_avail: if (dump_avail[da_indx] == pa) { dump_avail[da_indx] += PAGE_SIZE; } else { da_indx++; if (da_indx == DUMP_AVAIL_ARRAY_END) { da_indx--; goto do_next; } dump_avail[da_indx++] = pa; /* start */ dump_avail[da_indx] = pa + PAGE_SIZE; /* end */ } do_next: if (full) break; } } *pte = 0; invltlb(); /* * XXX * The last chunk must contain at least one page plus the message * buffer to avoid complicating other code (message buffer address * calculation, etc.). */ while (phys_avail[pa_indx - 1] + PAGE_SIZE + round_page(msgbufsize) >= phys_avail[pa_indx]) { physmem -= atop(phys_avail[pa_indx] - phys_avail[pa_indx - 1]); phys_avail[pa_indx--] = 0; phys_avail[pa_indx--] = 0; } Maxmem = atop(phys_avail[pa_indx]); /* Trim off space for the message buffer. */ phys_avail[pa_indx] -= round_page(msgbufsize); /* Map the message buffer. */ msgbufp = (struct msgbuf *)PHYS_TO_DMAP(phys_avail[pa_indx]); } u_int64_t hammer_time(u_int64_t modulep, u_int64_t physfree) { caddr_t kmdp; int gsel_tss, x; struct pcpu *pc; struct nmi_pcpu *np; struct xstate_hdr *xhdr; u_int64_t msr; char *env; size_t kstack0_sz; thread0.td_kstack = physfree + KERNBASE; thread0.td_kstack_pages = KSTACK_PAGES; kstack0_sz = thread0.td_kstack_pages * PAGE_SIZE; bzero((void *)thread0.td_kstack, kstack0_sz); physfree += kstack0_sz; /* * This may be done better later if it gets more high level * components in it. If so just link td->td_proc here. */ proc_linkup0(&proc0, &thread0); preload_metadata = (caddr_t)(uintptr_t)(modulep + KERNBASE); preload_bootstrap_relocate(KERNBASE); kmdp = preload_search_by_type("elf kernel"); if (kmdp == NULL) kmdp = preload_search_by_type("elf64 kernel"); boothowto = MD_FETCH(kmdp, MODINFOMD_HOWTO, int); kern_envp = MD_FETCH(kmdp, MODINFOMD_ENVP, char *) + KERNBASE; #ifdef DDB ksym_start = MD_FETCH(kmdp, MODINFOMD_SSYM, uintptr_t); ksym_end = MD_FETCH(kmdp, MODINFOMD_ESYM, uintptr_t); #endif /* Init basic tunables, hz etc */ init_param1(); /* * make gdt memory segments */ for (x = 0; x < NGDT; x++) { if (x != GPROC0_SEL && x != (GPROC0_SEL + 1) && x != GUSERLDT_SEL && x != (GUSERLDT_SEL) + 1) ssdtosd(&gdt_segs[x], &gdt[x]); } gdt_segs[GPROC0_SEL].ssd_base = (uintptr_t)&common_tss[0]; ssdtosyssd(&gdt_segs[GPROC0_SEL], (struct system_segment_descriptor *)&gdt[GPROC0_SEL]); r_gdt.rd_limit = NGDT * sizeof(gdt[0]) - 1; r_gdt.rd_base = (long) gdt; lgdt(&r_gdt); pc = &__pcpu[0]; wrmsr(MSR_FSBASE, 0); /* User value */ wrmsr(MSR_GSBASE, (u_int64_t)pc); wrmsr(MSR_KGSBASE, 0); /* User value while in the kernel */ pcpu_init(pc, 0, sizeof(struct pcpu)); dpcpu_init((void *)(physfree + KERNBASE), 0); physfree += DPCPU_SIZE; PCPU_SET(prvspace, pc); PCPU_SET(curthread, &thread0); PCPU_SET(tssp, &common_tss[0]); PCPU_SET(commontssp, &common_tss[0]); PCPU_SET(tss, (struct system_segment_descriptor *)&gdt[GPROC0_SEL]); PCPU_SET(ldt, (struct system_segment_descriptor *)&gdt[GUSERLDT_SEL]); PCPU_SET(fs32p, &gdt[GUFS32_SEL]); PCPU_SET(gs32p, &gdt[GUGS32_SEL]); /* * Initialize mutexes. * * icu_lock: in order to allow an interrupt to occur in a critical * section, to set pcpu->ipending (etc...) properly, we * must be able to get the icu lock, so it can't be * under witness. */ mutex_init(); mtx_init(&icu_lock, "icu", NULL, MTX_SPIN | MTX_NOWITNESS); mtx_init(&dt_lock, "descriptor tables", NULL, MTX_DEF); /* exceptions */ for (x = 0; x < NIDT; x++) setidt(x, &IDTVEC(rsvd), SDT_SYSIGT, SEL_KPL, 0); setidt(IDT_DE, &IDTVEC(div), SDT_SYSIGT, SEL_KPL, 0); setidt(IDT_DB, &IDTVEC(dbg), SDT_SYSIGT, SEL_KPL, 0); setidt(IDT_NMI, &IDTVEC(nmi), SDT_SYSIGT, SEL_KPL, 2); setidt(IDT_BP, &IDTVEC(bpt), SDT_SYSIGT, SEL_UPL, 0); setidt(IDT_OF, &IDTVEC(ofl), SDT_SYSIGT, SEL_KPL, 0); setidt(IDT_BR, &IDTVEC(bnd), SDT_SYSIGT, SEL_KPL, 0); setidt(IDT_UD, &IDTVEC(ill), SDT_SYSIGT, SEL_KPL, 0); setidt(IDT_NM, &IDTVEC(dna), SDT_SYSIGT, SEL_KPL, 0); setidt(IDT_DF, &IDTVEC(dblfault), SDT_SYSIGT, SEL_KPL, 1); setidt(IDT_FPUGP, &IDTVEC(fpusegm), SDT_SYSIGT, SEL_KPL, 0); setidt(IDT_TS, &IDTVEC(tss), SDT_SYSIGT, SEL_KPL, 0); setidt(IDT_NP, &IDTVEC(missing), SDT_SYSIGT, SEL_KPL, 0); setidt(IDT_SS, &IDTVEC(stk), SDT_SYSIGT, SEL_KPL, 0); setidt(IDT_GP, &IDTVEC(prot), SDT_SYSIGT, SEL_KPL, 0); setidt(IDT_PF, &IDTVEC(page), SDT_SYSIGT, SEL_KPL, 0); setidt(IDT_MF, &IDTVEC(fpu), SDT_SYSIGT, SEL_KPL, 0); setidt(IDT_AC, &IDTVEC(align), SDT_SYSIGT, SEL_KPL, 0); setidt(IDT_MC, &IDTVEC(mchk), SDT_SYSIGT, SEL_KPL, 0); setidt(IDT_XF, &IDTVEC(xmm), SDT_SYSIGT, SEL_KPL, 0); #ifdef KDTRACE_HOOKS setidt(IDT_DTRACE_RET, &IDTVEC(dtrace_ret), SDT_SYSIGT, SEL_UPL, 0); #endif r_idt.rd_limit = sizeof(idt0) - 1; r_idt.rd_base = (long) idt; lidt(&r_idt); /* * Initialize the i8254 before the console so that console * initialization can use DELAY(). */ i8254_init(); /* * Initialize the console before we print anything out. */ cninit(); #ifdef DEV_ISA #ifdef DEV_ATPIC elcr_probe(); atpic_startup(); #else /* Reset and mask the atpics and leave them shut down. */ atpic_reset(); /* * Point the ICU spurious interrupt vectors at the APIC spurious * interrupt handler. */ setidt(IDT_IO_INTS + 7, IDTVEC(spuriousint), SDT_SYSIGT, SEL_KPL, 0); setidt(IDT_IO_INTS + 15, IDTVEC(spuriousint), SDT_SYSIGT, SEL_KPL, 0); #endif #else #error "have you forgotten the isa device?"; #endif kdb_init(); #ifdef KDB if (boothowto & RB_KDB) kdb_enter(KDB_WHY_BOOTFLAGS, "Boot flags requested debugger"); #endif identify_cpu(); /* Final stage of CPU initialization */ initializecpu(); /* Initialize CPU registers */ initializecpucache(); /* doublefault stack space, runs on ist1 */ common_tss[0].tss_ist1 = (long)&dblfault_stack[sizeof(dblfault_stack)]; /* * NMI stack, runs on ist2. The pcpu pointer is stored just * above the start of the ist2 stack. */ np = ((struct nmi_pcpu *) &nmi0_stack[sizeof(nmi0_stack)]) - 1; np->np_pcpu = (register_t) pc; common_tss[0].tss_ist2 = (long) np; /* Set the IO permission bitmap (empty due to tss seg limit) */ common_tss[0].tss_iobase = sizeof(struct amd64tss) + IOPAGES * PAGE_SIZE; gsel_tss = GSEL(GPROC0_SEL, SEL_KPL); ltr(gsel_tss); /* Set up the fast syscall stuff */ msr = rdmsr(MSR_EFER) | EFER_SCE; wrmsr(MSR_EFER, msr); wrmsr(MSR_LSTAR, (u_int64_t)IDTVEC(fast_syscall)); wrmsr(MSR_CSTAR, (u_int64_t)IDTVEC(fast_syscall32)); msr = ((u_int64_t)GSEL(GCODE_SEL, SEL_KPL) << 32) | ((u_int64_t)GSEL(GUCODE32_SEL, SEL_UPL) << 48); wrmsr(MSR_STAR, msr); wrmsr(MSR_SF_MASK, PSL_NT|PSL_T|PSL_I|PSL_C|PSL_D); getmemsize(kmdp, physfree); init_param2(physmem); /* now running on new page tables, configured,and u/iom is accessible */ msgbufinit(msgbufp, msgbufsize); fpuinit(); /* * Set up thread0 pcb after fpuinit calculated pcb + fpu save * area size. Zero out the extended state header in fpu save * area. */ thread0.td_pcb = get_pcb_td(&thread0); bzero(get_pcb_user_save_td(&thread0), cpu_max_ext_state_size); if (use_xsave) { xhdr = (struct xstate_hdr *)(get_pcb_user_save_td(&thread0) + 1); xhdr->xstate_bv = xsave_mask; } /* make an initial tss so cpu can get interrupt stack on syscall! */ common_tss[0].tss_rsp0 = (vm_offset_t)thread0.td_pcb; /* Ensure the stack is aligned to 16 bytes */ common_tss[0].tss_rsp0 &= ~0xFul; PCPU_SET(rsp0, common_tss[0].tss_rsp0); PCPU_SET(curpcb, thread0.td_pcb); /* transfer to user mode */ _ucodesel = GSEL(GUCODE_SEL, SEL_UPL); _udatasel = GSEL(GUDATA_SEL, SEL_UPL); _ucode32sel = GSEL(GUCODE32_SEL, SEL_UPL); _ufssel = GSEL(GUFS32_SEL, SEL_UPL); _ugssel = GSEL(GUGS32_SEL, SEL_UPL); load_ds(_udatasel); load_es(_udatasel); load_fs(_ufssel); /* setup proc 0's pcb */ thread0.td_pcb->pcb_flags = 0; thread0.td_pcb->pcb_cr3 = KPML4phys; thread0.td_frame = &proc0_tf; env = getenv("kernelname"); if (env != NULL) strlcpy(kernelname, env, sizeof(kernelname)); #ifdef XENHVM if (inw(0x10) == 0x49d2) { if (bootverbose) printf("Xen detected: disabling emulated block and network devices\n"); outw(0x10, 3); } #endif cpu_probe_amdc1e(); /* Location of kernel stack for locore */ return ((u_int64_t)thread0.td_pcb); } void cpu_pcpu_init(struct pcpu *pcpu, int cpuid, size_t size) { pcpu->pc_acpi_id = 0xffffffff; } void spinlock_enter(void) { struct thread *td; register_t flags; td = curthread; if (td->td_md.md_spinlock_count == 0) { flags = intr_disable(); td->td_md.md_spinlock_count = 1; td->td_md.md_saved_flags = flags; } else td->td_md.md_spinlock_count++; critical_enter(); } void spinlock_exit(void) { struct thread *td; register_t flags; td = curthread; critical_exit(); flags = td->td_md.md_saved_flags; td->td_md.md_spinlock_count--; if (td->td_md.md_spinlock_count == 0) intr_restore(flags); } /* * Construct a PCB from a trapframe. This is called from kdb_trap() where * we want to start a backtrace from the function that caused us to enter * the debugger. We have the context in the trapframe, but base the trace * on the PCB. The PCB doesn't have to be perfect, as long as it contains * enough for a backtrace. */ void makectx(struct trapframe *tf, struct pcb *pcb) { pcb->pcb_r12 = tf->tf_r12; pcb->pcb_r13 = tf->tf_r13; pcb->pcb_r14 = tf->tf_r14; pcb->pcb_r15 = tf->tf_r15; pcb->pcb_rbp = tf->tf_rbp; pcb->pcb_rbx = tf->tf_rbx; pcb->pcb_rip = tf->tf_rip; pcb->pcb_rsp = tf->tf_rsp; } int ptrace_set_pc(struct thread *td, unsigned long addr) { td->td_frame->tf_rip = addr; return (0); } int ptrace_single_step(struct thread *td) { td->td_frame->tf_rflags |= PSL_T; return (0); } int ptrace_clear_single_step(struct thread *td) { td->td_frame->tf_rflags &= ~PSL_T; return (0); } int fill_regs(struct thread *td, struct reg *regs) { struct trapframe *tp; tp = td->td_frame; return (fill_frame_regs(tp, regs)); } int fill_frame_regs(struct trapframe *tp, struct reg *regs) { regs->r_r15 = tp->tf_r15; regs->r_r14 = tp->tf_r14; regs->r_r13 = tp->tf_r13; regs->r_r12 = tp->tf_r12; regs->r_r11 = tp->tf_r11; regs->r_r10 = tp->tf_r10; regs->r_r9 = tp->tf_r9; regs->r_r8 = tp->tf_r8; regs->r_rdi = tp->tf_rdi; regs->r_rsi = tp->tf_rsi; regs->r_rbp = tp->tf_rbp; regs->r_rbx = tp->tf_rbx; regs->r_rdx = tp->tf_rdx; regs->r_rcx = tp->tf_rcx; regs->r_rax = tp->tf_rax; regs->r_rip = tp->tf_rip; regs->r_cs = tp->tf_cs; regs->r_rflags = tp->tf_rflags; regs->r_rsp = tp->tf_rsp; regs->r_ss = tp->tf_ss; if (tp->tf_flags & TF_HASSEGS) { regs->r_ds = tp->tf_ds; regs->r_es = tp->tf_es; regs->r_fs = tp->tf_fs; regs->r_gs = tp->tf_gs; } else { regs->r_ds = 0; regs->r_es = 0; regs->r_fs = 0; regs->r_gs = 0; } return (0); } int set_regs(struct thread *td, struct reg *regs) { struct trapframe *tp; register_t rflags; tp = td->td_frame; rflags = regs->r_rflags & 0xffffffff; if (!EFL_SECURE(rflags, tp->tf_rflags) || !CS_SECURE(regs->r_cs)) return (EINVAL); tp->tf_r15 = regs->r_r15; tp->tf_r14 = regs->r_r14; tp->tf_r13 = regs->r_r13; tp->tf_r12 = regs->r_r12; tp->tf_r11 = regs->r_r11; tp->tf_r10 = regs->r_r10; tp->tf_r9 = regs->r_r9; tp->tf_r8 = regs->r_r8; tp->tf_rdi = regs->r_rdi; tp->tf_rsi = regs->r_rsi; tp->tf_rbp = regs->r_rbp; tp->tf_rbx = regs->r_rbx; tp->tf_rdx = regs->r_rdx; tp->tf_rcx = regs->r_rcx; tp->tf_rax = regs->r_rax; tp->tf_rip = regs->r_rip; tp->tf_cs = regs->r_cs; tp->tf_rflags = rflags; tp->tf_rsp = regs->r_rsp; tp->tf_ss = regs->r_ss; if (0) { /* XXXKIB */ tp->tf_ds = regs->r_ds; tp->tf_es = regs->r_es; tp->tf_fs = regs->r_fs; tp->tf_gs = regs->r_gs; tp->tf_flags = TF_HASSEGS; set_pcb_flags(td->td_pcb, PCB_FULL_IRET); } return (0); } /* XXX check all this stuff! */ /* externalize from sv_xmm */ static void fill_fpregs_xmm(struct savefpu *sv_xmm, struct fpreg *fpregs) { struct envxmm *penv_fpreg = (struct envxmm *)&fpregs->fpr_env; struct envxmm *penv_xmm = &sv_xmm->sv_env; int i; /* pcb -> fpregs */ bzero(fpregs, sizeof(*fpregs)); /* FPU control/status */ penv_fpreg->en_cw = penv_xmm->en_cw; penv_fpreg->en_sw = penv_xmm->en_sw; penv_fpreg->en_tw = penv_xmm->en_tw; penv_fpreg->en_opcode = penv_xmm->en_opcode; penv_fpreg->en_rip = penv_xmm->en_rip; penv_fpreg->en_rdp = penv_xmm->en_rdp; penv_fpreg->en_mxcsr = penv_xmm->en_mxcsr; penv_fpreg->en_mxcsr_mask = penv_xmm->en_mxcsr_mask; /* FPU registers */ for (i = 0; i < 8; ++i) bcopy(sv_xmm->sv_fp[i].fp_acc.fp_bytes, fpregs->fpr_acc[i], 10); /* SSE registers */ for (i = 0; i < 16; ++i) bcopy(sv_xmm->sv_xmm[i].xmm_bytes, fpregs->fpr_xacc[i], 16); } /* internalize from fpregs into sv_xmm */ static void set_fpregs_xmm(struct fpreg *fpregs, struct savefpu *sv_xmm) { struct envxmm *penv_xmm = &sv_xmm->sv_env; struct envxmm *penv_fpreg = (struct envxmm *)&fpregs->fpr_env; int i; /* fpregs -> pcb */ /* FPU control/status */ penv_xmm->en_cw = penv_fpreg->en_cw; penv_xmm->en_sw = penv_fpreg->en_sw; penv_xmm->en_tw = penv_fpreg->en_tw; penv_xmm->en_opcode = penv_fpreg->en_opcode; penv_xmm->en_rip = penv_fpreg->en_rip; penv_xmm->en_rdp = penv_fpreg->en_rdp; penv_xmm->en_mxcsr = penv_fpreg->en_mxcsr; penv_xmm->en_mxcsr_mask = penv_fpreg->en_mxcsr_mask & cpu_mxcsr_mask; /* FPU registers */ for (i = 0; i < 8; ++i) bcopy(fpregs->fpr_acc[i], sv_xmm->sv_fp[i].fp_acc.fp_bytes, 10); /* SSE registers */ for (i = 0; i < 16; ++i) bcopy(fpregs->fpr_xacc[i], sv_xmm->sv_xmm[i].xmm_bytes, 16); } /* externalize from td->pcb */ int fill_fpregs(struct thread *td, struct fpreg *fpregs) { KASSERT(td == curthread || TD_IS_SUSPENDED(td) || P_SHOULDSTOP(td->td_proc), ("not suspended thread %p", td)); fpugetregs(td); fill_fpregs_xmm(get_pcb_user_save_td(td), fpregs); return (0); } /* internalize to td->pcb */ int set_fpregs(struct thread *td, struct fpreg *fpregs) { set_fpregs_xmm(fpregs, get_pcb_user_save_td(td)); fpuuserinited(td); return (0); } /* * Get machine context. */ int get_mcontext(struct thread *td, mcontext_t *mcp, int flags) { struct pcb *pcb; struct trapframe *tp; pcb = td->td_pcb; tp = td->td_frame; PROC_LOCK(curthread->td_proc); mcp->mc_onstack = sigonstack(tp->tf_rsp); PROC_UNLOCK(curthread->td_proc); mcp->mc_r15 = tp->tf_r15; mcp->mc_r14 = tp->tf_r14; mcp->mc_r13 = tp->tf_r13; mcp->mc_r12 = tp->tf_r12; mcp->mc_r11 = tp->tf_r11; mcp->mc_r10 = tp->tf_r10; mcp->mc_r9 = tp->tf_r9; mcp->mc_r8 = tp->tf_r8; mcp->mc_rdi = tp->tf_rdi; mcp->mc_rsi = tp->tf_rsi; mcp->mc_rbp = tp->tf_rbp; mcp->mc_rbx = tp->tf_rbx; mcp->mc_rcx = tp->tf_rcx; mcp->mc_rflags = tp->tf_rflags; if (flags & GET_MC_CLEAR_RET) { mcp->mc_rax = 0; mcp->mc_rdx = 0; mcp->mc_rflags &= ~PSL_C; } else { mcp->mc_rax = tp->tf_rax; mcp->mc_rdx = tp->tf_rdx; } mcp->mc_rip = tp->tf_rip; mcp->mc_cs = tp->tf_cs; mcp->mc_rsp = tp->tf_rsp; mcp->mc_ss = tp->tf_ss; mcp->mc_ds = tp->tf_ds; mcp->mc_es = tp->tf_es; mcp->mc_fs = tp->tf_fs; mcp->mc_gs = tp->tf_gs; mcp->mc_flags = tp->tf_flags; mcp->mc_len = sizeof(*mcp); get_fpcontext(td, mcp, NULL, 0); mcp->mc_fsbase = pcb->pcb_fsbase; mcp->mc_gsbase = pcb->pcb_gsbase; mcp->mc_xfpustate = 0; mcp->mc_xfpustate_len = 0; bzero(mcp->mc_spare, sizeof(mcp->mc_spare)); return (0); } /* * Set machine context. * * However, we don't set any but the user modifiable flags, and we won't * touch the cs selector. */ int set_mcontext(struct thread *td, const mcontext_t *mcp) { struct pcb *pcb; struct trapframe *tp; char *xfpustate; long rflags; int ret; pcb = td->td_pcb; tp = td->td_frame; if (mcp->mc_len != sizeof(*mcp) || (mcp->mc_flags & ~_MC_FLAG_MASK) != 0) return (EINVAL); rflags = (mcp->mc_rflags & PSL_USERCHANGE) | (tp->tf_rflags & ~PSL_USERCHANGE); if (mcp->mc_flags & _MC_HASFPXSTATE) { if (mcp->mc_xfpustate_len > cpu_max_ext_state_size - sizeof(struct savefpu)) return (EINVAL); xfpustate = __builtin_alloca(mcp->mc_xfpustate_len); ret = copyin((void *)mcp->mc_xfpustate, xfpustate, mcp->mc_xfpustate_len); if (ret != 0) return (ret); } else xfpustate = NULL; ret = set_fpcontext(td, mcp, xfpustate, mcp->mc_xfpustate_len); if (ret != 0) return (ret); tp->tf_r15 = mcp->mc_r15; tp->tf_r14 = mcp->mc_r14; tp->tf_r13 = mcp->mc_r13; tp->tf_r12 = mcp->mc_r12; tp->tf_r11 = mcp->mc_r11; tp->tf_r10 = mcp->mc_r10; tp->tf_r9 = mcp->mc_r9; tp->tf_r8 = mcp->mc_r8; tp->tf_rdi = mcp->mc_rdi; tp->tf_rsi = mcp->mc_rsi; tp->tf_rbp = mcp->mc_rbp; tp->tf_rbx = mcp->mc_rbx; tp->tf_rdx = mcp->mc_rdx; tp->tf_rcx = mcp->mc_rcx; tp->tf_rax = mcp->mc_rax; tp->tf_rip = mcp->mc_rip; tp->tf_rflags = rflags; tp->tf_rsp = mcp->mc_rsp; tp->tf_ss = mcp->mc_ss; tp->tf_flags = mcp->mc_flags; if (tp->tf_flags & TF_HASSEGS) { tp->tf_ds = mcp->mc_ds; tp->tf_es = mcp->mc_es; tp->tf_fs = mcp->mc_fs; tp->tf_gs = mcp->mc_gs; } if (mcp->mc_flags & _MC_HASBASES) { pcb->pcb_fsbase = mcp->mc_fsbase; pcb->pcb_gsbase = mcp->mc_gsbase; } set_pcb_flags(pcb, PCB_FULL_IRET); return (0); } static void get_fpcontext(struct thread *td, mcontext_t *mcp, char *xfpusave, size_t xfpusave_len) { size_t max_len, len; mcp->mc_ownedfp = fpugetregs(td); bcopy(get_pcb_user_save_td(td), &mcp->mc_fpstate[0], sizeof(mcp->mc_fpstate)); mcp->mc_fpformat = fpuformat(); if (!use_xsave || xfpusave_len == 0) return; max_len = cpu_max_ext_state_size - sizeof(struct savefpu); len = xfpusave_len; if (len > max_len) { len = max_len; bzero(xfpusave + max_len, len - max_len); } mcp->mc_flags |= _MC_HASFPXSTATE; mcp->mc_xfpustate_len = len; bcopy(get_pcb_user_save_td(td) + 1, xfpusave, len); } static int set_fpcontext(struct thread *td, const mcontext_t *mcp, char *xfpustate, size_t xfpustate_len) { struct savefpu *fpstate; int error; if (mcp->mc_fpformat == _MC_FPFMT_NODEV) return (0); else if (mcp->mc_fpformat != _MC_FPFMT_XMM) return (EINVAL); else if (mcp->mc_ownedfp == _MC_FPOWNED_NONE) { /* We don't care what state is left in the FPU or PCB. */ fpstate_drop(td); error = 0; } else if (mcp->mc_ownedfp == _MC_FPOWNED_FPU || mcp->mc_ownedfp == _MC_FPOWNED_PCB) { fpstate = (struct savefpu *)&mcp->mc_fpstate; fpstate->sv_env.en_mxcsr &= cpu_mxcsr_mask; error = fpusetregs(td, fpstate, xfpustate, xfpustate_len); } else return (EINVAL); return (error); } void fpstate_drop(struct thread *td) { KASSERT(PCB_USER_FPU(td->td_pcb), ("fpstate_drop: kernel-owned fpu")); critical_enter(); if (PCPU_GET(fpcurthread) == td) fpudrop(); /* * XXX force a full drop of the fpu. The above only drops it if we * owned it. * * XXX I don't much like fpugetuserregs()'s semantics of doing a full * drop. Dropping only to the pcb matches fnsave's behaviour. * We only need to drop to !PCB_INITDONE in sendsig(). But * sendsig() is the only caller of fpugetuserregs()... perhaps we just * have too many layers. */ clear_pcb_flags(curthread->td_pcb, PCB_FPUINITDONE | PCB_USERFPUINITDONE); critical_exit(); } int fill_dbregs(struct thread *td, struct dbreg *dbregs) { struct pcb *pcb; if (td == NULL) { dbregs->dr[0] = rdr0(); dbregs->dr[1] = rdr1(); dbregs->dr[2] = rdr2(); dbregs->dr[3] = rdr3(); dbregs->dr[6] = rdr6(); dbregs->dr[7] = rdr7(); } else { pcb = td->td_pcb; dbregs->dr[0] = pcb->pcb_dr0; dbregs->dr[1] = pcb->pcb_dr1; dbregs->dr[2] = pcb->pcb_dr2; dbregs->dr[3] = pcb->pcb_dr3; dbregs->dr[6] = pcb->pcb_dr6; dbregs->dr[7] = pcb->pcb_dr7; } dbregs->dr[4] = 0; dbregs->dr[5] = 0; dbregs->dr[8] = 0; dbregs->dr[9] = 0; dbregs->dr[10] = 0; dbregs->dr[11] = 0; dbregs->dr[12] = 0; dbregs->dr[13] = 0; dbregs->dr[14] = 0; dbregs->dr[15] = 0; return (0); } int set_dbregs(struct thread *td, struct dbreg *dbregs) { struct pcb *pcb; int i; if (td == NULL) { load_dr0(dbregs->dr[0]); load_dr1(dbregs->dr[1]); load_dr2(dbregs->dr[2]); load_dr3(dbregs->dr[3]); load_dr6(dbregs->dr[6]); load_dr7(dbregs->dr[7]); } else { /* * Don't let an illegal value for dr7 get set. Specifically, * check for undefined settings. Setting these bit patterns * result in undefined behaviour and can lead to an unexpected * TRCTRAP or a general protection fault right here. * Upper bits of dr6 and dr7 must not be set */ for (i = 0; i < 4; i++) { if (DBREG_DR7_ACCESS(dbregs->dr[7], i) == 0x02) return (EINVAL); if (td->td_frame->tf_cs == _ucode32sel && DBREG_DR7_LEN(dbregs->dr[7], i) == DBREG_DR7_LEN_8) return (EINVAL); } if ((dbregs->dr[6] & 0xffffffff00000000ul) != 0 || (dbregs->dr[7] & 0xffffffff00000000ul) != 0) return (EINVAL); pcb = td->td_pcb; /* * Don't let a process set a breakpoint that is not within the * process's address space. If a process could do this, it * could halt the system by setting a breakpoint in the kernel * (if ddb was enabled). Thus, we need to check to make sure * that no breakpoints are being enabled for addresses outside * process's address space. * * XXX - what about when the watched area of the user's * address space is written into from within the kernel * ... wouldn't that still cause a breakpoint to be generated * from within kernel mode? */ if (DBREG_DR7_ENABLED(dbregs->dr[7], 0)) { /* dr0 is enabled */ if (dbregs->dr[0] >= VM_MAXUSER_ADDRESS) return (EINVAL); } if (DBREG_DR7_ENABLED(dbregs->dr[7], 1)) { /* dr1 is enabled */ if (dbregs->dr[1] >= VM_MAXUSER_ADDRESS) return (EINVAL); } if (DBREG_DR7_ENABLED(dbregs->dr[7], 2)) { /* dr2 is enabled */ if (dbregs->dr[2] >= VM_MAXUSER_ADDRESS) return (EINVAL); } if (DBREG_DR7_ENABLED(dbregs->dr[7], 3)) { /* dr3 is enabled */ if (dbregs->dr[3] >= VM_MAXUSER_ADDRESS) return (EINVAL); } pcb->pcb_dr0 = dbregs->dr[0]; pcb->pcb_dr1 = dbregs->dr[1]; pcb->pcb_dr2 = dbregs->dr[2]; pcb->pcb_dr3 = dbregs->dr[3]; pcb->pcb_dr6 = dbregs->dr[6]; pcb->pcb_dr7 = dbregs->dr[7]; set_pcb_flags(pcb, PCB_DBREGS); } return (0); } void reset_dbregs(void) { load_dr7(0); /* Turn off the control bits first */ load_dr0(0); load_dr1(0); load_dr2(0); load_dr3(0); load_dr6(0); } /* * Return > 0 if a hardware breakpoint has been hit, and the * breakpoint was in user space. Return 0, otherwise. */ int user_dbreg_trap(void) { u_int64_t dr7, dr6; /* debug registers dr6 and dr7 */ u_int64_t bp; /* breakpoint bits extracted from dr6 */ int nbp; /* number of breakpoints that triggered */ caddr_t addr[4]; /* breakpoint addresses */ int i; dr7 = rdr7(); if ((dr7 & 0x000000ff) == 0) { /* * all GE and LE bits in the dr7 register are zero, * thus the trap couldn't have been caused by the * hardware debug registers */ return 0; } nbp = 0; dr6 = rdr6(); bp = dr6 & 0x0000000f; if (!bp) { /* * None of the breakpoint bits are set meaning this * trap was not caused by any of the debug registers */ return 0; } /* * at least one of the breakpoints were hit, check to see * which ones and if any of them are user space addresses */ if (bp & 0x01) { addr[nbp++] = (caddr_t)rdr0(); } if (bp & 0x02) { addr[nbp++] = (caddr_t)rdr1(); } if (bp & 0x04) { addr[nbp++] = (caddr_t)rdr2(); } if (bp & 0x08) { addr[nbp++] = (caddr_t)rdr3(); } for (i = 0; i < nbp; i++) { if (addr[i] < (caddr_t)VM_MAXUSER_ADDRESS) { /* * addr[i] is in user space */ return nbp; } } /* * None of the breakpoints are in user space. */ return 0; } #ifdef KDB /* * Provide inb() and outb() as functions. They are normally only available as * inline functions, thus cannot be called from the debugger. */ /* silence compiler warnings */ u_char inb_(u_short); void outb_(u_short, u_char); u_char inb_(u_short port) { return inb(port); } void outb_(u_short port, u_char data) { outb(port, data); } #endif /* KDB */ Index: releng/9.3/sys/amd64/amd64/trap.c =================================================================== --- releng/9.3/sys/amd64/amd64/trap.c (revision 287146) +++ releng/9.3/sys/amd64/amd64/trap.c (revision 287147) @@ -1,1012 +1,1010 @@ /*- * Copyright (C) 1994, David Greenman * Copyright (c) 1990, 1993 * The Regents of the University of California. All rights reserved. * * This code is derived from software contributed to Berkeley by * the University of Utah, and William Jolitz. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. All advertising materials mentioning features or use of this software * must display the following acknowledgement: * This product includes software developed by the University of * California, Berkeley and its contributors. * 4. Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * from: @(#)trap.c 7.4 (Berkeley) 5/13/91 */ #include __FBSDID("$FreeBSD$"); /* * AMD64 Trap and System call handling */ #include "opt_clock.h" #include "opt_cpu.h" #include "opt_hwpmc_hooks.h" #include "opt_isa.h" #include "opt_kdb.h" #include "opt_kdtrace.h" #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #ifdef HWPMC_HOOKS #include PMC_SOFT_DEFINE( , , page_fault, all); PMC_SOFT_DEFINE( , , page_fault, read); PMC_SOFT_DEFINE( , , page_fault, write); #endif #include #include #include #include #include #include #include #include #include #include #include #include #ifdef SMP #include #endif #include #ifdef KDTRACE_HOOKS #include /* * This is a hook which is initialised by the dtrace module * to handle traps which might occur during DTrace probe * execution. */ dtrace_trap_func_t dtrace_trap_func; dtrace_doubletrap_func_t dtrace_doubletrap_func; /* * This is a hook which is initialised by the systrace module * when it is loaded. This keeps the DTrace syscall provider * implementation opaque. */ systrace_probe_func_t systrace_probe_func; /* * These hooks are necessary for the pid and usdt providers. */ dtrace_pid_probe_ptr_t dtrace_pid_probe_ptr; dtrace_return_probe_ptr_t dtrace_return_probe_ptr; #endif extern void trap(struct trapframe *frame); extern void syscall(struct trapframe *frame); void dblfault_handler(struct trapframe *frame); static int trap_pfault(struct trapframe *, int); static void trap_fatal(struct trapframe *, vm_offset_t); #define MAX_TRAP_MSG 32 static char *trap_msg[] = { "", /* 0 unused */ "privileged instruction fault", /* 1 T_PRIVINFLT */ "", /* 2 unused */ "breakpoint instruction fault", /* 3 T_BPTFLT */ "", /* 4 unused */ "", /* 5 unused */ "arithmetic trap", /* 6 T_ARITHTRAP */ "", /* 7 unused */ "", /* 8 unused */ "general protection fault", /* 9 T_PROTFLT */ "trace trap", /* 10 T_TRCTRAP */ "", /* 11 unused */ "page fault", /* 12 T_PAGEFLT */ "", /* 13 unused */ "alignment fault", /* 14 T_ALIGNFLT */ "", /* 15 unused */ "", /* 16 unused */ "", /* 17 unused */ "integer divide fault", /* 18 T_DIVIDE */ "non-maskable interrupt trap", /* 19 T_NMI */ "overflow trap", /* 20 T_OFLOW */ "FPU bounds check fault", /* 21 T_BOUND */ "FPU device not available", /* 22 T_DNA */ "double fault", /* 23 T_DOUBLEFLT */ "FPU operand fetch fault", /* 24 T_FPOPFLT */ "invalid TSS fault", /* 25 T_TSSFLT */ "segment not present fault", /* 26 T_SEGNPFLT */ "stack fault", /* 27 T_STKFLT */ "machine check trap", /* 28 T_MCHK */ "SIMD floating-point exception", /* 29 T_XMMFLT */ "reserved (unknown) fault", /* 30 T_RESERVED */ "", /* 31 unused (reserved) */ "DTrace pid return trap", /* 32 T_DTRACE_RET */ }; #ifdef KDB static int kdb_on_nmi = 1; SYSCTL_INT(_machdep, OID_AUTO, kdb_on_nmi, CTLFLAG_RW, &kdb_on_nmi, 0, "Go to KDB on NMI"); TUNABLE_INT("machdep.kdb_on_nmi", &kdb_on_nmi); #endif static int panic_on_nmi = 1; SYSCTL_INT(_machdep, OID_AUTO, panic_on_nmi, CTLFLAG_RW, &panic_on_nmi, 0, "Panic on NMI"); TUNABLE_INT("machdep.panic_on_nmi", &panic_on_nmi); static int prot_fault_translation; SYSCTL_INT(_machdep, OID_AUTO, prot_fault_translation, CTLFLAG_RW, &prot_fault_translation, 0, "Select signal to deliver on protection fault"); static int uprintf_signal; SYSCTL_INT(_machdep, OID_AUTO, uprintf_signal, CTLFLAG_RW, &uprintf_signal, 0, "Print debugging information on trap signal to ctty"); /* * Exception, fault, and trap interface to the FreeBSD kernel. * This common code is called from assembly language IDT gate entry * routines that prepare a suitable stack frame, and restore this * frame after the exception has been processed. */ void trap(struct trapframe *frame) { struct thread *td = curthread; struct proc *p = td->td_proc; int i = 0, ucode = 0, code; u_int type; register_t addr = 0; ksiginfo_t ksi; PCPU_INC(cnt.v_trap); type = frame->tf_trapno; #ifdef SMP /* Handler for NMI IPIs used for stopping CPUs. */ if (type == T_NMI) { if (ipi_nmi_handler() == 0) goto out; } #endif /* SMP */ #ifdef KDB if (kdb_active) { kdb_reenter(); goto out; } #endif if (type == T_RESERVED) { trap_fatal(frame, 0); goto out; } #ifdef HWPMC_HOOKS /* * CPU PMCs interrupt using an NMI. If the PMC module is * active, pass the 'rip' value to the PMC module's interrupt * handler. A return value of '1' from the handler means that * the NMI was handled by it and we can return immediately. */ if (type == T_NMI && pmc_intr && (*pmc_intr)(PCPU_GET(cpuid), frame)) goto out; #endif if (type == T_MCHK) { mca_intr(); goto out; } #ifdef KDTRACE_HOOKS /* * A trap can occur while DTrace executes a probe. Before * executing the probe, DTrace blocks re-scheduling and sets * a flag in it's per-cpu flags to indicate that it doesn't * want to fault. On returning from the probe, the no-fault * flag is cleared and finally re-scheduling is enabled. * * If the DTrace kernel module has registered a trap handler, * call it and if it returns non-zero, assume that it has * handled the trap and modified the trap frame so that this * function can return normally. */ if (type == T_DTRACE_RET || type == T_BPTFLT) { struct reg regs; fill_frame_regs(frame, ®s); if (type == T_BPTFLT && dtrace_pid_probe_ptr != NULL && dtrace_pid_probe_ptr(®s) == 0) goto out; else if (type == T_DTRACE_RET && dtrace_return_probe_ptr != NULL && dtrace_return_probe_ptr(®s) == 0) goto out; } if (dtrace_trap_func != NULL && (*dtrace_trap_func)(frame, type)) goto out; #endif if ((frame->tf_rflags & PSL_I) == 0) { /* * Buggy application or kernel code has disabled * interrupts and then trapped. Enabling interrupts * now is wrong, but it is better than running with * interrupts disabled until they are accidentally * enabled later. */ if (ISPL(frame->tf_cs) == SEL_UPL) uprintf( "pid %ld (%s): trap %d with interrupts disabled\n", (long)curproc->p_pid, curthread->td_name, type); else if (type != T_NMI && type != T_BPTFLT && type != T_TRCTRAP) { /* * XXX not quite right, since this may be for a * multiple fault in user mode. */ printf("kernel trap %d with interrupts disabled\n", type); /* * We shouldn't enable interrupts while holding a * spin lock. */ if (td->td_md.md_spinlock_count == 0) enable_intr(); } } code = frame->tf_err; if (ISPL(frame->tf_cs) == SEL_UPL) { /* user trap */ td->td_pticks = 0; td->td_frame = frame; addr = frame->tf_rip; if (td->td_ucred != p->p_ucred) cred_update_thread(td); switch (type) { case T_PRIVINFLT: /* privileged instruction fault */ i = SIGILL; ucode = ILL_PRVOPC; break; case T_BPTFLT: /* bpt instruction fault */ case T_TRCTRAP: /* trace trap */ enable_intr(); frame->tf_rflags &= ~PSL_T; i = SIGTRAP; ucode = (type == T_TRCTRAP ? TRAP_TRACE : TRAP_BRKPT); break; case T_ARITHTRAP: /* arithmetic trap */ ucode = fputrap_x87(); if (ucode == -1) goto userout; i = SIGFPE; break; case T_PROTFLT: /* general protection fault */ i = SIGBUS; ucode = BUS_OBJERR; break; case T_STKFLT: /* stack fault */ case T_SEGNPFLT: /* segment not present fault */ i = SIGBUS; ucode = BUS_ADRERR; break; case T_TSSFLT: /* invalid TSS fault */ i = SIGBUS; ucode = BUS_OBJERR; break; case T_DOUBLEFLT: /* double fault */ default: i = SIGBUS; ucode = BUS_OBJERR; break; case T_PAGEFLT: /* page fault */ addr = frame->tf_addr; i = trap_pfault(frame, TRUE); if (i == -1) goto userout; if (i == 0) goto user; if (i == SIGSEGV) ucode = SEGV_MAPERR; else { if (prot_fault_translation == 0) { /* * Autodetect. * This check also covers the images * without the ABI-tag ELF note. */ if (SV_CURPROC_ABI() == SV_ABI_FREEBSD && p->p_osrel >= P_OSREL_SIGSEGV) { i = SIGSEGV; ucode = SEGV_ACCERR; } else { i = SIGBUS; ucode = BUS_PAGE_FAULT; } } else if (prot_fault_translation == 1) { /* * Always compat mode. */ i = SIGBUS; ucode = BUS_PAGE_FAULT; } else { /* * Always SIGSEGV mode. */ i = SIGSEGV; ucode = SEGV_ACCERR; } } break; case T_DIVIDE: /* integer divide fault */ ucode = FPE_INTDIV; i = SIGFPE; break; #ifdef DEV_ISA case T_NMI: /* machine/parity/power fail/"kitchen sink" faults */ if (isa_nmi(code) == 0) { #ifdef KDB /* * NMI can be hooked up to a pushbutton * for debugging. */ if (kdb_on_nmi) { printf ("NMI ... going to debugger\n"); kdb_trap(type, 0, frame); } #endif /* KDB */ goto userout; } else if (panic_on_nmi) panic("NMI indicates hardware failure"); break; #endif /* DEV_ISA */ case T_OFLOW: /* integer overflow fault */ ucode = FPE_INTOVF; i = SIGFPE; break; case T_BOUND: /* bounds check fault */ ucode = FPE_FLTSUB; i = SIGFPE; break; case T_DNA: /* transparent fault (due to context switch "late") */ KASSERT(PCB_USER_FPU(td->td_pcb), ("kernel FPU ctx has leaked")); fpudna(); goto userout; case T_FPOPFLT: /* FPU operand fetch fault */ ucode = ILL_COPROC; i = SIGILL; break; case T_XMMFLT: /* SIMD floating-point exception */ ucode = fputrap_sse(); if (ucode == -1) goto userout; i = SIGFPE; break; } } else { /* kernel trap */ KASSERT(cold || td->td_ucred != NULL, ("kernel trap doesn't have ucred")); switch (type) { case T_PAGEFLT: /* page fault */ (void) trap_pfault(frame, FALSE); goto out; case T_DNA: KASSERT(!PCB_USER_FPU(td->td_pcb), ("Unregistered use of FPU in kernel")); fpudna(); goto out; case T_ARITHTRAP: /* arithmetic trap */ case T_XMMFLT: /* SIMD floating-point exception */ case T_FPOPFLT: /* FPU operand fetch fault */ /* * XXXKIB for now disable any FPU traps in kernel * handler registration seems to be overkill */ trap_fatal(frame, 0); goto out; case T_STKFLT: /* stack fault */ - break; - case T_PROTFLT: /* general protection fault */ case T_SEGNPFLT: /* segment not present fault */ if (td->td_intr_nesting_level != 0) break; /* * Invalid segment selectors and out of bounds * %rip's and %rsp's can be set up in user mode. * This causes a fault in kernel mode when the * kernel tries to return to user mode. We want * to get this fault so that we can fix the * problem here and not have to check all the * selectors and pointers when the user changes * them. */ if (frame->tf_rip == (long)doreti_iret) { frame->tf_rip = (long)doreti_iret_fault; goto out; } if (frame->tf_rip == (long)ld_ds) { frame->tf_rip = (long)ds_load_fault; goto out; } if (frame->tf_rip == (long)ld_es) { frame->tf_rip = (long)es_load_fault; goto out; } if (frame->tf_rip == (long)ld_fs) { frame->tf_rip = (long)fs_load_fault; goto out; } if (frame->tf_rip == (long)ld_gs) { frame->tf_rip = (long)gs_load_fault; goto out; } if (frame->tf_rip == (long)ld_gsbase) { frame->tf_rip = (long)gsbase_load_fault; goto out; } if (frame->tf_rip == (long)ld_fsbase) { frame->tf_rip = (long)fsbase_load_fault; goto out; } if (curpcb->pcb_onfault != NULL) { frame->tf_rip = (long)curpcb->pcb_onfault; goto out; } break; case T_TSSFLT: /* * PSL_NT can be set in user mode and isn't cleared * automatically when the kernel is entered. This * causes a TSS fault when the kernel attempts to * `iret' because the TSS link is uninitialized. We * want to get this fault so that we can fix the * problem here and not every time the kernel is * entered. */ if (frame->tf_rflags & PSL_NT) { frame->tf_rflags &= ~PSL_NT; goto out; } break; case T_TRCTRAP: /* trace trap */ /* * Ignore debug register trace traps due to * accesses in the user's address space, which * can happen under several conditions such as * if a user sets a watchpoint on a buffer and * then passes that buffer to a system call. * We still want to get TRCTRAPS for addresses * in kernel space because that is useful when * debugging the kernel. */ if (user_dbreg_trap()) { /* * Reset breakpoint bits because the * processor doesn't */ /* XXX check upper bits here */ load_dr6(rdr6() & 0xfffffff0); goto out; } /* * FALLTHROUGH (TRCTRAP kernel mode, kernel address) */ case T_BPTFLT: /* * If KDB is enabled, let it handle the debugger trap. * Otherwise, debugger traps "can't happen". */ #ifdef KDB if (kdb_trap(type, 0, frame)) goto out; #endif break; #ifdef DEV_ISA case T_NMI: /* machine/parity/power fail/"kitchen sink" faults */ if (isa_nmi(code) == 0) { #ifdef KDB /* * NMI can be hooked up to a pushbutton * for debugging. */ if (kdb_on_nmi) { printf ("NMI ... going to debugger\n"); kdb_trap(type, 0, frame); } #endif /* KDB */ goto out; } else if (panic_on_nmi == 0) goto out; /* FALLTHROUGH */ #endif /* DEV_ISA */ } trap_fatal(frame, 0); goto out; } /* Translate fault for emulators (e.g. Linux) */ if (*p->p_sysent->sv_transtrap) i = (*p->p_sysent->sv_transtrap)(i, type); ksiginfo_init_trap(&ksi); ksi.ksi_signo = i; ksi.ksi_code = ucode; ksi.ksi_trapno = type; ksi.ksi_addr = (void *)addr; if (uprintf_signal) { uprintf("pid %d comm %s: signal %d err %lx code %d type %d " "addr 0x%lx rsp 0x%lx rip 0x%lx " "<%02x %02x %02x %02x %02x %02x %02x %02x>\n", p->p_pid, p->p_comm, i, frame->tf_err, ucode, type, addr, frame->tf_rsp, frame->tf_rip, fubyte((void *)(frame->tf_rip + 0)), fubyte((void *)(frame->tf_rip + 1)), fubyte((void *)(frame->tf_rip + 2)), fubyte((void *)(frame->tf_rip + 3)), fubyte((void *)(frame->tf_rip + 4)), fubyte((void *)(frame->tf_rip + 5)), fubyte((void *)(frame->tf_rip + 6)), fubyte((void *)(frame->tf_rip + 7))); } KASSERT((read_rflags() & PSL_I) != 0, ("interrupts disabled")); trapsignal(td, &ksi); user: userret(td, frame); mtx_assert(&Giant, MA_NOTOWNED); KASSERT(PCB_USER_FPU(td->td_pcb), ("Return from trap with kernel FPU ctx leaked")); userout: out: return; } static int trap_pfault(frame, usermode) struct trapframe *frame; int usermode; { vm_offset_t va; struct vmspace *vm = NULL; vm_map_t map; int rv = 0; vm_prot_t ftype; struct thread *td = curthread; struct proc *p = td->td_proc; vm_offset_t eva = frame->tf_addr; if (__predict_false((td->td_pflags & TDP_NOFAULTING) != 0)) { /* * Due to both processor errata and lazy TLB invalidation when * access restrictions are removed from virtual pages, memory * accesses that are allowed by the physical mapping layer may * nonetheless cause one spurious page fault per virtual page. * When the thread is executing a "no faulting" section that * is bracketed by vm_fault_{disable,enable}_pagefaults(), * every page fault is treated as a spurious page fault, * unless it accesses the same virtual address as the most * recent page fault within the same "no faulting" section. */ if (td->td_md.md_spurflt_addr != eva || (td->td_pflags & TDP_RESETSPUR) != 0) { /* * Do nothing to the TLB. A stale TLB entry is * flushed automatically by a page fault. */ td->td_md.md_spurflt_addr = eva; td->td_pflags &= ~TDP_RESETSPUR; return (0); } } else { /* * If we get a page fault while in a critical section, then * it is most likely a fatal kernel page fault. The kernel * is already going to panic trying to get a sleep lock to * do the VM lookup, so just consider it a fatal trap so the * kernel can print out a useful trap message and even get * to the debugger. * * If we get a page fault while holding a non-sleepable * lock, then it is most likely a fatal kernel page fault. * If WITNESS is enabled, then it's going to whine about * bogus LORs with various VM locks, so just skip to the * fatal trap handling directly. */ if (td->td_critnest != 0 || WITNESS_CHECK(WARN_SLEEPOK | WARN_GIANTOK, NULL, "Kernel page fault") != 0) { trap_fatal(frame, eva); return (-1); } } va = trunc_page(eva); if (va >= VM_MIN_KERNEL_ADDRESS) { /* * Don't allow user-mode faults in kernel address space. */ if (usermode) goto nogo; map = kernel_map; } else { /* * This is a fault on non-kernel virtual memory. * vm is initialized above to NULL. If curproc is NULL * or curproc->p_vmspace is NULL the fault is fatal. */ if (p != NULL) vm = p->p_vmspace; if (vm == NULL) goto nogo; map = &vm->vm_map; /* * When accessing a usermode address, kernel must be * ready to accept the page fault, and provide a * handling routine. Since accessing the address * without the handler is a bug, do not try to handle * it normally, and panic immediately. */ if (!usermode && (td->td_intr_nesting_level != 0 || curpcb->pcb_onfault == NULL)) { trap_fatal(frame, eva); return (-1); } } /* * PGEX_I is defined only if the execute disable bit capability is * supported and enabled. */ if (frame->tf_err & PGEX_W) ftype = VM_PROT_WRITE; else if ((frame->tf_err & PGEX_I) && pg_nx != 0) ftype = VM_PROT_EXECUTE; else ftype = VM_PROT_READ; if (map != kernel_map) { /* * Keep swapout from messing with us during this * critical time. */ PROC_LOCK(p); ++p->p_lock; PROC_UNLOCK(p); /* Fault in the user page: */ rv = vm_fault(map, va, ftype, VM_FAULT_NORMAL); PROC_LOCK(p); --p->p_lock; PROC_UNLOCK(p); } else { /* * Don't have to worry about process locking or stacks in the * kernel. */ rv = vm_fault(map, va, ftype, VM_FAULT_NORMAL); } if (rv == KERN_SUCCESS) { #ifdef HWPMC_HOOKS if (ftype == VM_PROT_READ || ftype == VM_PROT_WRITE) { PMC_SOFT_CALL_TF( , , page_fault, all, frame); if (ftype == VM_PROT_READ) PMC_SOFT_CALL_TF( , , page_fault, read, frame); else PMC_SOFT_CALL_TF( , , page_fault, write, frame); } #endif return (0); } nogo: if (!usermode) { if (td->td_intr_nesting_level == 0 && curpcb->pcb_onfault != NULL) { frame->tf_rip = (long)curpcb->pcb_onfault; return (0); } if ((td->td_pflags & TDP_DEVMEMIO) != 0) { KASSERT(curpcb->pcb_onfault != NULL, ("/dev/mem without pcb_onfault")); frame->tf_rip = (long)curpcb->pcb_onfault; return (0); } trap_fatal(frame, eva); return (-1); } return((rv == KERN_PROTECTION_FAILURE) ? SIGBUS : SIGSEGV); } static void trap_fatal(frame, eva) struct trapframe *frame; vm_offset_t eva; { int code, ss; u_int type; long esp; struct soft_segment_descriptor softseg; char *msg; code = frame->tf_err; type = frame->tf_trapno; sdtossd(&gdt[NGDT * PCPU_GET(cpuid) + IDXSEL(frame->tf_cs & 0xffff)], &softseg); if (type <= MAX_TRAP_MSG) msg = trap_msg[type]; else msg = "UNKNOWN"; printf("\n\nFatal trap %d: %s while in %s mode\n", type, msg, ISPL(frame->tf_cs) == SEL_UPL ? "user" : "kernel"); #ifdef SMP /* two separate prints in case of a trap on an unmapped page */ printf("cpuid = %d; ", PCPU_GET(cpuid)); printf("apic id = %02x\n", PCPU_GET(apic_id)); #endif if (type == T_PAGEFLT) { printf("fault virtual address = 0x%lx\n", eva); printf("fault code = %s %s %s, %s\n", code & PGEX_U ? "user" : "supervisor", code & PGEX_W ? "write" : "read", code & PGEX_I ? "instruction" : "data", code & PGEX_P ? "protection violation" : "page not present"); } printf("instruction pointer = 0x%lx:0x%lx\n", frame->tf_cs & 0xffff, frame->tf_rip); if (ISPL(frame->tf_cs) == SEL_UPL) { ss = frame->tf_ss & 0xffff; esp = frame->tf_rsp; } else { ss = GSEL(GDATA_SEL, SEL_KPL); esp = (long)&frame->tf_rsp; } printf("stack pointer = 0x%x:0x%lx\n", ss, esp); printf("frame pointer = 0x%x:0x%lx\n", ss, frame->tf_rbp); printf("code segment = base 0x%lx, limit 0x%lx, type 0x%x\n", softseg.ssd_base, softseg.ssd_limit, softseg.ssd_type); printf(" = DPL %d, pres %d, long %d, def32 %d, gran %d\n", softseg.ssd_dpl, softseg.ssd_p, softseg.ssd_long, softseg.ssd_def32, softseg.ssd_gran); printf("processor eflags = "); if (frame->tf_rflags & PSL_T) printf("trace trap, "); if (frame->tf_rflags & PSL_I) printf("interrupt enabled, "); if (frame->tf_rflags & PSL_NT) printf("nested task, "); if (frame->tf_rflags & PSL_RF) printf("resume, "); printf("IOPL = %ld\n", (frame->tf_rflags & PSL_IOPL) >> 12); printf("current process = "); if (curproc) { printf("%lu (%s)\n", (u_long)curproc->p_pid, curthread->td_name ? curthread->td_name : ""); } else { printf("Idle\n"); } #ifdef KDB if (debugger_on_panic || kdb_active) if (kdb_trap(type, 0, frame)) return; #endif printf("trap number = %d\n", type); if (type <= MAX_TRAP_MSG) panic("%s", trap_msg[type]); else panic("unknown/reserved trap"); } /* * Double fault handler. Called when a fault occurs while writing * a frame for a trap/exception onto the stack. This usually occurs * when the stack overflows (such is the case with infinite recursion, * for example). */ void dblfault_handler(struct trapframe *frame) { #ifdef KDTRACE_HOOKS if (dtrace_doubletrap_func != NULL) (*dtrace_doubletrap_func)(); #endif printf("\nFatal double fault\n"); printf("rip = 0x%lx\n", frame->tf_rip); printf("rsp = 0x%lx\n", frame->tf_rsp); printf("rbp = 0x%lx\n", frame->tf_rbp); #ifdef SMP /* two separate prints in case of a trap on an unmapped page */ printf("cpuid = %d; ", PCPU_GET(cpuid)); printf("apic id = %02x\n", PCPU_GET(apic_id)); #endif panic("double fault"); } int cpu_fetch_syscall_args(struct thread *td, struct syscall_args *sa) { struct proc *p; struct trapframe *frame; register_t *argp; caddr_t params; int reg, regcnt, error; p = td->td_proc; frame = td->td_frame; reg = 0; regcnt = 6; params = (caddr_t)frame->tf_rsp + sizeof(register_t); sa->code = frame->tf_rax; if (sa->code == SYS_syscall || sa->code == SYS___syscall) { sa->code = frame->tf_rdi; reg++; regcnt--; } if (p->p_sysent->sv_mask) sa->code &= p->p_sysent->sv_mask; if (sa->code >= p->p_sysent->sv_size) sa->callp = &p->p_sysent->sv_table[0]; else sa->callp = &p->p_sysent->sv_table[sa->code]; sa->narg = sa->callp->sy_narg; KASSERT(sa->narg <= sizeof(sa->args) / sizeof(sa->args[0]), ("Too many syscall arguments!")); error = 0; argp = &frame->tf_rdi; argp += reg; bcopy(argp, sa->args, sizeof(sa->args[0]) * regcnt); if (sa->narg > regcnt) { KASSERT(params != NULL, ("copyin args with no params!")); error = copyin(params, &sa->args[regcnt], (sa->narg - regcnt) * sizeof(sa->args[0])); } if (error == 0) { td->td_retval[0] = 0; td->td_retval[1] = frame->tf_rdx; } return (error); } #include "../../kern/subr_syscall.c" /* * System call handler for native binaries. The trap frame is already * set up by the assembler trampoline and a pointer to it is saved in * td_frame. */ void amd64_syscall(struct thread *td, int traced) { struct syscall_args sa; int error; ksiginfo_t ksi; #ifdef DIAGNOSTIC if (ISPL(td->td_frame->tf_cs) != SEL_UPL) { panic("syscall"); /* NOT REACHED */ } #endif error = syscallenter(td, &sa); /* * Traced syscall. */ if (__predict_false(traced)) { td->td_frame->tf_rflags &= ~PSL_T; ksiginfo_init_trap(&ksi); ksi.ksi_signo = SIGTRAP; ksi.ksi_code = TRAP_TRACE; ksi.ksi_addr = (void *)td->td_frame->tf_rip; trapsignal(td, &ksi); } KASSERT(PCB_USER_FPU(td->td_pcb), ("System call %s returing with kernel FPU ctx leaked", syscallname(td->td_proc, sa.code))); KASSERT(td->td_pcb->pcb_save == get_pcb_user_save_td(td), ("System call %s returning with mangled pcb_save", syscallname(td->td_proc, sa.code))); syscallret(td, error, &sa); /* * If the user-supplied value of %rip is not a canonical * address, then some CPUs will trigger a ring 0 #GP during * the sysret instruction. However, the fault handler would * execute in ring 0 with the user's %gs and %rsp which would * not be safe. Instead, use the full return path which * catches the problem safely. */ if (td->td_frame->tf_rip >= VM_MAXUSER_ADDRESS) set_pcb_flags(td->td_pcb, PCB_FULL_IRET); } Index: releng/9.3/sys/conf/newvers.sh =================================================================== --- releng/9.3/sys/conf/newvers.sh (revision 287146) +++ releng/9.3/sys/conf/newvers.sh (revision 287147) @@ -1,160 +1,160 @@ #!/bin/sh - # # Copyright (c) 1984, 1986, 1990, 1993 # The Regents of the University of California. All rights reserved. # # Redistribution and use in source and binary forms, with or without # modification, are permitted provided that the following conditions # are met: # 1. Redistributions of source code must retain the above copyright # notice, this list of conditions and the following disclaimer. # 2. Redistributions in binary form must reproduce the above copyright # notice, this list of conditions and the following disclaimer in the # documentation and/or other materials provided with the distribution. # 4. Neither the name of the University nor the names of its contributors # may be used to endorse or promote products derived from this software # without specific prior written permission. # # THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND # ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE # IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE # ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE # FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL # DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS # OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) # HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT # LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY # OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF # SUCH DAMAGE. # # @(#)newvers.sh 8.1 (Berkeley) 4/20/94 # $FreeBSD$ TYPE="FreeBSD" REVISION="9.3" -BRANCH="RELEASE-p23" +BRANCH="RELEASE-p24" if [ "X${BRANCH_OVERRIDE}" != "X" ]; then BRANCH=${BRANCH_OVERRIDE} fi RELEASE="${REVISION}-${BRANCH}" VERSION="${TYPE} ${RELEASE}" if [ "X${SYSDIR}" = "X" ]; then SYSDIR=$(dirname $0)/.. fi if [ "X${PARAMFILE}" != "X" ]; then RELDATE=$(awk '/__FreeBSD_version.*propagated to newvers/ {print $3}' \ ${PARAMFILE}) else RELDATE=$(awk '/__FreeBSD_version.*propagated to newvers/ {print $3}' \ ${SYSDIR}/sys/param.h) fi b=share/examples/etc/bsd-style-copyright year=`date '+%Y'` # look for copyright template for bsd_copyright in ../$b ../../$b ../../../$b /usr/src/$b /usr/$b do if [ -r "$bsd_copyright" ]; then COPYRIGHT=`sed \ -e "s/\[year\]/1992-$year/" \ -e 's/\[your name here\]\.* /The FreeBSD Project./' \ -e 's/\[your name\]\.*/The FreeBSD Project./' \ -e '/\[id for your version control system, if any\]/d' \ $bsd_copyright` break fi done # no copyright found, use a dummy if [ X"$COPYRIGHT" = X ]; then COPYRIGHT="/*- * Copyright (c) 1992-$year The FreeBSD Project. * All rights reserved. * */" fi # add newline COPYRIGHT="$COPYRIGHT " LC_ALL=C; export LC_ALL if [ ! -r version ] then echo 0 > version fi touch version v=`cat version` u=${USER:-root} d=`pwd` h=${HOSTNAME:-`hostname`} t=`date` i=`${MAKE:-make} -V KERN_IDENT` compiler_v=$($(${MAKE:-make} -V CC) -v 2>&1 | grep 'version') for dir in /bin /usr/bin /usr/local/bin; do if [ -x "${dir}/svnversion" ] ; then svnversion=${dir}/svnversion break fi done if [ -d "${SYSDIR}/../.git" ] ; then for dir in /bin /usr/bin /usr/local/bin; do if [ -x "${dir}/git" ] ; then git_cmd="${dir}/git --git-dir=${SYSDIR}/../.git" break fi done fi if [ -n "$svnversion" ] ; then echo "$svnversion" svn=`cd ${SYSDIR} && $svnversion` case "$svn" in [0-9]*) svn=" r${svn}" ;; *) unset svn ;; esac fi if [ -n "$git_cmd" ] ; then git=`$git_cmd rev-parse --verify --short HEAD 2>/dev/null` svn=`$git_cmd svn find-rev $git 2>/dev/null` if [ -n "$svn" ] ; then svn=" r${svn}" git="=${git}" else svn=`$git_cmd log | fgrep 'git-svn-id:' | head -1 | \ sed -n 's/^.*@\([0-9][0-9]*\).*$/\1/p'` if [ -z "$svn" ] ; then svn=`$git_cmd log --format='format:%N' | \ grep '^svn ' | head -1 | \ sed -n 's/^.*revision=\([0-9][0-9]*\).*$/\1/p'` fi if [ -n "$svn" ] ; then svn=" r${svn}" git="+${git}" else git=" ${git}" fi fi if $git_cmd --work-tree=${SYSDIR}/.. diff-index \ --name-only HEAD | read dummy; then git="${git}-dirty" fi fi cat << EOF > vers.c $COPYRIGHT #define SCCSSTR "@(#)${VERSION} #${v}${svn}${git}: ${t}" #define VERSTR "${VERSION} #${v}${svn}${git}: ${t}\\n ${u}@${h}:${d}\\n" #define RELSTR "${RELEASE}" char sccs[sizeof(SCCSSTR) > 128 ? sizeof(SCCSSTR) : 128] = SCCSSTR; char version[sizeof(VERSTR) > 256 ? sizeof(VERSTR) : 256] = VERSTR; char compiler_version[] = "${compiler_v}"; char ostype[] = "${TYPE}"; char osrelease[sizeof(RELSTR) > 32 ? sizeof(RELSTR) : 32] = RELSTR; int osreldate = ${RELDATE}; char kern_ident[] = "${i}"; EOF echo $((v + 1)) > version Index: releng/9.3/usr.sbin/pkg/pkg.c =================================================================== --- releng/9.3/usr.sbin/pkg/pkg.c (revision 287146) +++ releng/9.3/usr.sbin/pkg/pkg.c (revision 287147) @@ -1,941 +1,953 @@ /*- * Copyright (c) 2012-2014 Baptiste Daroussin * Copyright (c) 2013 Bryan Drewery * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ #include __FBSDID("$FreeBSD$"); #include #include #include #include #include #define _WITH_GETLINE #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include "dns_utils.h" #include "config.h" struct sig_cert { char *name; unsigned char *sig; int siglen; unsigned char *cert; int certlen; bool trusted; }; typedef enum { HASH_UNKNOWN, HASH_SHA256, } hash_t; struct fingerprint { hash_t type; char *name; char hash[BUFSIZ]; STAILQ_ENTRY(fingerprint) next; }; STAILQ_HEAD(fingerprint_list, fingerprint); static int extract_pkg_static(int fd, char *p, int sz) { struct archive *a; struct archive_entry *ae; char *end; int ret, r; ret = -1; a = archive_read_new(); if (a == NULL) { warn("archive_read_new"); return (ret); } archive_read_support_compression_all(a); archive_read_support_format_tar(a); if (lseek(fd, 0, 0) == -1) { warn("lseek"); goto cleanup; } if (archive_read_open_fd(a, fd, 4096) != ARCHIVE_OK) { warnx("archive_read_open_fd: %s", archive_error_string(a)); goto cleanup; } ae = NULL; while ((r = archive_read_next_header(a, &ae)) == ARCHIVE_OK) { end = strrchr(archive_entry_pathname(ae), '/'); if (end == NULL) continue; if (strcmp(end, "/pkg-static") == 0) { r = archive_read_extract(a, ae, ARCHIVE_EXTRACT_OWNER | ARCHIVE_EXTRACT_PERM | ARCHIVE_EXTRACT_TIME | ARCHIVE_EXTRACT_ACL | ARCHIVE_EXTRACT_FFLAGS | ARCHIVE_EXTRACT_XATTR); strlcpy(p, archive_entry_pathname(ae), sz); break; } } if (r == ARCHIVE_OK) ret = 0; else warnx("fail to extract pkg-static"); cleanup: archive_read_free(a); return (ret); } static int install_pkg_static(const char *path, const char *pkgpath, bool force) { int pstat; pid_t pid; switch ((pid = fork())) { case -1: return (-1); case 0: if (force) execl(path, "pkg-static", "add", "-f", pkgpath, (char *)NULL); else execl(path, "pkg-static", "add", pkgpath, (char *)NULL); _exit(1); default: break; } while (waitpid(pid, &pstat, 0) == -1) if (errno != EINTR) return (-1); if (WEXITSTATUS(pstat)) return (WEXITSTATUS(pstat)); else if (WIFSIGNALED(pstat)) return (128 & (WTERMSIG(pstat))); return (pstat); } static int fetch_to_fd(const char *url, char *path) { struct url *u; struct dns_srvinfo *mirrors, *current; struct url_stat st; FILE *remote; /* To store _https._tcp. + hostname + \0 */ int fd; int retry, max_retry; off_t done, r; time_t now, last; char buf[10240]; char zone[MAXHOSTNAMELEN + 13]; static const char *mirror_type = NULL; done = 0; last = 0; max_retry = 3; current = mirrors = NULL; remote = NULL; if (mirror_type == NULL && config_string(MIRROR_TYPE, &mirror_type) != 0) { warnx("No MIRROR_TYPE defined"); return (-1); } if ((fd = mkstemp(path)) == -1) { warn("mkstemp()"); return (-1); } retry = max_retry; u = fetchParseURL(url); while (remote == NULL) { if (retry == max_retry) { if (strcmp(u->scheme, "file") != 0 && strcasecmp(mirror_type, "srv") == 0) { snprintf(zone, sizeof(zone), "_%s._tcp.%s", u->scheme, u->host); mirrors = dns_getsrvinfo(zone); current = mirrors; } } if (mirrors != NULL) { strlcpy(u->host, current->host, sizeof(u->host)); u->port = current->port; } remote = fetchXGet(u, &st, ""); if (remote == NULL) { --retry; if (retry <= 0) goto fetchfail; if (mirrors == NULL) { sleep(1); } else { current = current->next; if (current == NULL) current = mirrors; } } } while (done < st.size) { if ((r = fread(buf, 1, sizeof(buf), remote)) < 1) break; if (write(fd, buf, r) != r) { warn("write()"); goto fetchfail; } done += r; now = time(NULL); if (now > last || done == st.size) last = now; } if (ferror(remote)) goto fetchfail; goto cleanup; fetchfail: if (fd != -1) { close(fd); fd = -1; unlink(path); } cleanup: if (remote != NULL) fclose(remote); return fd; } static struct fingerprint * parse_fingerprint(ucl_object_t *obj) { ucl_object_t *cur; ucl_object_iter_t it = NULL; const char *function, *fp, *key; struct fingerprint *f; hash_t fct = HASH_UNKNOWN; function = fp = NULL; while ((cur = ucl_iterate_object(obj, &it, true))) { key = ucl_object_key(cur); if (cur->type != UCL_STRING) continue; if (strcasecmp(key, "function") == 0) { function = ucl_object_tostring(cur); continue; } if (strcasecmp(key, "fingerprint") == 0) { fp = ucl_object_tostring(cur); continue; } } if (fp == NULL || function == NULL) return (NULL); if (strcasecmp(function, "sha256") == 0) fct = HASH_SHA256; if (fct == HASH_UNKNOWN) { warnx("Unsupported hashing function: %s", function); return (NULL); } f = calloc(1, sizeof(struct fingerprint)); f->type = fct; strlcpy(f->hash, fp, sizeof(f->hash)); return (f); } static void free_fingerprint_list(struct fingerprint_list* list) { struct fingerprint *fingerprint, *tmp; STAILQ_FOREACH_SAFE(fingerprint, list, next, tmp) { free(fingerprint->name); free(fingerprint); } free(list); } static struct fingerprint * load_fingerprint(const char *dir, const char *filename) { ucl_object_t *obj = NULL; struct ucl_parser *p = NULL; struct fingerprint *f; char path[MAXPATHLEN]; f = NULL; snprintf(path, MAXPATHLEN, "%s/%s", dir, filename); p = ucl_parser_new(0); if (!ucl_parser_add_file(p, path)) { warnx("%s: %s", path, ucl_parser_get_error(p)); ucl_parser_free(p); return (NULL); } obj = ucl_parser_get_object(p); if (obj->type == UCL_OBJECT) f = parse_fingerprint(obj); if (f != NULL) f->name = strdup(filename); ucl_object_free(obj); ucl_parser_free(p); return (f); } static struct fingerprint_list * load_fingerprints(const char *path, int *count) { DIR *d; struct dirent *ent; struct fingerprint *finger; struct fingerprint_list *fingerprints; *count = 0; fingerprints = calloc(1, sizeof(struct fingerprint_list)); if (fingerprints == NULL) return (NULL); STAILQ_INIT(fingerprints); if ((d = opendir(path)) == NULL) return (NULL); while ((ent = readdir(d))) { if (strcmp(ent->d_name, ".") == 0 || strcmp(ent->d_name, "..") == 0) continue; finger = load_fingerprint(path, ent->d_name); if (finger != NULL) { STAILQ_INSERT_TAIL(fingerprints, finger, next); ++(*count); } } closedir(d); return (fingerprints); } static void sha256_hash(unsigned char hash[SHA256_DIGEST_LENGTH], char out[SHA256_DIGEST_LENGTH * 2 + 1]) { int i; for (i = 0; i < SHA256_DIGEST_LENGTH; i++) sprintf(out + (i * 2), "%02x", hash[i]); out[SHA256_DIGEST_LENGTH * 2] = '\0'; } static void sha256_buf_bin(char *buf, size_t len, char hash[SHA256_DIGEST_LENGTH]) { SHA256_CTX sha256; SHA256_Init(&sha256); SHA256_Update(&sha256, buf, len); SHA256_Final(hash, &sha256); } static void sha256_buf(char *buf, size_t len, char out[SHA256_DIGEST_LENGTH * 2 + 1]) { unsigned char hash[SHA256_DIGEST_LENGTH]; SHA256_CTX sha256; out[0] = '\0'; SHA256_Init(&sha256); SHA256_Update(&sha256, buf, len); SHA256_Final(hash, &sha256); sha256_hash(hash, out); } static int sha256_fd(int fd, char out[SHA256_DIGEST_LENGTH * 2 + 1]) { int my_fd; FILE *fp; char buffer[BUFSIZ]; unsigned char hash[SHA256_DIGEST_LENGTH]; size_t r; int ret; SHA256_CTX sha256; my_fd = -1; fp = NULL; r = 0; ret = 1; out[0] = '\0'; /* Duplicate the fd so that fclose(3) does not close it. */ if ((my_fd = dup(fd)) == -1) { warnx("dup"); goto cleanup; } if ((fp = fdopen(my_fd, "rb")) == NULL) { warnx("fdopen"); goto cleanup; } SHA256_Init(&sha256); while ((r = fread(buffer, 1, BUFSIZ, fp)) > 0) SHA256_Update(&sha256, buffer, r); if (ferror(fp) != 0) { warnx("fread"); goto cleanup; } SHA256_Final(hash, &sha256); sha256_hash(hash, out); ret = 0; cleanup: if (fp != NULL) fclose(fp); else if (my_fd != -1) close(my_fd); (void)lseek(fd, 0, SEEK_SET); return (ret); } static RSA * load_rsa_public_key_buf(unsigned char *cert, int certlen) { RSA *rsa = NULL; BIO *bp; char errbuf[1024]; bp = BIO_new_mem_buf((void *)cert, certlen); if (!PEM_read_bio_RSA_PUBKEY(bp, &rsa, NULL, NULL)) { warn("error reading public key: %s", ERR_error_string(ERR_get_error(), errbuf)); BIO_free(bp); return (NULL); } BIO_free(bp); return (rsa); } static bool rsa_verify_cert(int fd, unsigned char *key, int keylen, unsigned char *sig, int siglen) { char sha256[SHA256_DIGEST_LENGTH *2 +1]; char hash[SHA256_DIGEST_LENGTH]; char errbuf[1024]; RSA *rsa = NULL; int ret; lseek(fd, 0, SEEK_SET); sha256_fd(fd, sha256); SSL_load_error_strings(); OpenSSL_add_all_algorithms(); OpenSSL_add_all_ciphers(); sha256_buf_bin(sha256, strlen(sha256), hash); rsa = load_rsa_public_key_buf(key, keylen); if (rsa == NULL) return (false); ret = RSA_verify(NID_sha256, hash, sizeof(hash), sig, siglen, rsa); if (ret == 0) { warnx("%s: %s", key, ERR_error_string(ERR_get_error(), errbuf)); return (false); } RSA_free(rsa); ERR_free_strings(); return (true); } static struct sig_cert * parse_cert(int fd) { int my_fd; struct sig_cert *sc; FILE *fp; struct sbuf *buf, *sig, *cert; char *line; size_t linecap; ssize_t linelen; buf = NULL; my_fd = -1; sc = NULL; line = NULL; linecap = 0; if (lseek(fd, 0, 0) == -1) { warn("lseek"); return (NULL); } /* Duplicate the fd so that fclose(3) does not close it. */ if ((my_fd = dup(fd)) == -1) { warnx("dup"); return (NULL); } if ((fp = fdopen(my_fd, "rb")) == NULL) { warn("fdopen"); close(my_fd); return (NULL); } sig = sbuf_new_auto(); cert = sbuf_new_auto(); while ((linelen = getline(&line, &linecap, fp)) > 0) { if (strcmp(line, "SIGNATURE\n") == 0) { buf = sig; continue; } else if (strcmp(line, "CERT\n") == 0) { buf = cert; continue; } else if (strcmp(line, "END\n") == 0) { break; } if (buf != NULL) sbuf_bcat(buf, line, linelen); } fclose(fp); /* Trim out unrelated trailing newline */ sbuf_setpos(sig, sbuf_len(sig) - 1); sbuf_finish(sig); sbuf_finish(cert); sc = calloc(1, sizeof(struct sig_cert)); sc->siglen = sbuf_len(sig); sc->sig = calloc(1, sc->siglen); memcpy(sc->sig, sbuf_data(sig), sc->siglen); sc->certlen = sbuf_len(cert); sc->cert = strdup(sbuf_data(cert)); sbuf_delete(sig); sbuf_delete(cert); return (sc); } static bool verify_signature(int fd_pkg, int fd_sig) { struct fingerprint_list *trusted, *revoked; struct fingerprint *fingerprint; struct sig_cert *sc; bool ret; int trusted_count, revoked_count; const char *fingerprints; char path[MAXPATHLEN]; char hash[SHA256_DIGEST_LENGTH * 2 + 1]; sc = NULL; trusted = revoked = NULL; ret = false; /* Read and parse fingerprints. */ if (config_string(FINGERPRINTS, &fingerprints) != 0) { warnx("No CONFIG_FINGERPRINTS defined"); goto cleanup; } snprintf(path, MAXPATHLEN, "%s/trusted", fingerprints); if ((trusted = load_fingerprints(path, &trusted_count)) == NULL) { warnx("Error loading trusted certificates"); goto cleanup; } if (trusted_count == 0 || trusted == NULL) { fprintf(stderr, "No trusted certificates found.\n"); goto cleanup; } snprintf(path, MAXPATHLEN, "%s/revoked", fingerprints); if ((revoked = load_fingerprints(path, &revoked_count)) == NULL) { warnx("Error loading revoked certificates"); goto cleanup; } /* Read certificate and signature in. */ if ((sc = parse_cert(fd_sig)) == NULL) { warnx("Error parsing certificate"); goto cleanup; } /* Explicitly mark as non-trusted until proven otherwise. */ sc->trusted = false; /* Parse signature and pubkey out of the certificate */ sha256_buf(sc->cert, sc->certlen, hash); /* Check if this hash is revoked */ if (revoked != NULL) { STAILQ_FOREACH(fingerprint, revoked, next) { if (strcasecmp(fingerprint->hash, hash) == 0) { fprintf(stderr, "The package was signed with " "revoked certificate %s\n", fingerprint->name); goto cleanup; } } } STAILQ_FOREACH(fingerprint, trusted, next) { if (strcasecmp(fingerprint->hash, hash) == 0) { sc->trusted = true; sc->name = strdup(fingerprint->name); break; } } if (sc->trusted == false) { fprintf(stderr, "No trusted fingerprint found matching " "package's certificate\n"); goto cleanup; } /* Verify the signature. */ printf("Verifying signature with trusted certificate %s... ", sc->name); if (rsa_verify_cert(fd_pkg, sc->cert, sc->certlen, sc->sig, sc->siglen) == false) { printf("failed\n"); fprintf(stderr, "Signature is not valid\n"); goto cleanup; } printf("done\n"); ret = true; cleanup: if (trusted) free_fingerprint_list(trusted); if (revoked) free_fingerprint_list(revoked); if (sc) { free(sc->cert); free(sc->sig); free(sc->name); free(sc); } return (ret); } static int bootstrap_pkg(bool force) { int fd_pkg, fd_sig; int ret; char url[MAXPATHLEN]; char tmppkg[MAXPATHLEN]; char tmpsig[MAXPATHLEN]; const char *packagesite; const char *signature_type; char pkgstatic[MAXPATHLEN]; fd_sig = -1; ret = -1; if (config_string(PACKAGESITE, &packagesite) != 0) { warnx("No PACKAGESITE defined"); return (-1); } if (config_string(SIGNATURE_TYPE, &signature_type) != 0) { warnx("Error looking up SIGNATURE_TYPE"); return (-1); } printf("Bootstrapping pkg from %s, please wait...\n", packagesite); /* Support pkg+http:// for PACKAGESITE which is the new format in 1.2 to avoid confusion on why http://pkg.FreeBSD.org has no A record. */ if (strncmp(URL_SCHEME_PREFIX, packagesite, strlen(URL_SCHEME_PREFIX)) == 0) packagesite += strlen(URL_SCHEME_PREFIX); snprintf(url, MAXPATHLEN, "%s/Latest/pkg.txz", packagesite); snprintf(tmppkg, MAXPATHLEN, "%s/pkg.txz.XXXXXX", getenv("TMPDIR") ? getenv("TMPDIR") : _PATH_TMP); if ((fd_pkg = fetch_to_fd(url, tmppkg)) == -1) goto fetchfail; if (signature_type != NULL && - strcasecmp(signature_type, "FINGERPRINTS") == 0) { + strcasecmp(signature_type, "NONE") != 0) { + if (strcasecmp(signature_type, "FINGERPRINTS") != 0) { + warnx("Signature type %s is not supported for " + "bootstrapping.", signature_type); + goto cleanup; + } + snprintf(tmpsig, MAXPATHLEN, "%s/pkg.txz.sig.XXXXXX", getenv("TMPDIR") ? getenv("TMPDIR") : _PATH_TMP); snprintf(url, MAXPATHLEN, "%s/Latest/pkg.txz.sig", packagesite); if ((fd_sig = fetch_to_fd(url, tmpsig)) == -1) { fprintf(stderr, "Signature for pkg not available.\n"); goto fetchfail; } if (verify_signature(fd_pkg, fd_sig) == false) goto cleanup; } if ((ret = extract_pkg_static(fd_pkg, pkgstatic, MAXPATHLEN)) == 0) ret = install_pkg_static(pkgstatic, tmppkg, force); goto cleanup; fetchfail: warnx("Error fetching %s: %s", url, fetchLastErrString); fprintf(stderr, "A pre-built version of pkg could not be found for " "your system.\n"); fprintf(stderr, "Consider changing PACKAGESITE or installing it from " "ports: 'ports-mgmt/pkg'.\n"); cleanup: if (fd_sig != -1) { close(fd_sig); unlink(tmpsig); } close(fd_pkg); unlink(tmppkg); return (ret); } static const char confirmation_message[] = "The package management tool is not yet installed on your system.\n" "Do you want to fetch and install it now? [y/N]: "; static const char non_interactive_message[] = "The package management tool is not yet installed on your system.\n" "Please set ASSUME_ALWAYS_YES=yes environment variable to be able to bootstrap " "in non-interactive (stdin not being a tty)\n"; static int pkg_query_yes_no(void) { int ret, c; c = getchar(); if (c == 'y' || c == 'Y') ret = 1; else ret = 0; while (c != '\n' && c != EOF) c = getchar(); return (ret); } static int bootstrap_pkg_local(const char *pkgpath, bool force) { char path[MAXPATHLEN]; char pkgstatic[MAXPATHLEN]; const char *signature_type; int fd_pkg, fd_sig, ret; fd_sig = -1; ret = -1; fd_pkg = open(pkgpath, O_RDONLY); if (fd_pkg == -1) err(EXIT_FAILURE, "Unable to open %s", pkgpath); if (config_string(SIGNATURE_TYPE, &signature_type) != 0) { warnx("Error looking up SIGNATURE_TYPE"); return (-1); } if (signature_type != NULL && - strcasecmp(signature_type, "FINGERPRINTS") == 0) { + strcasecmp(signature_type, "NONE") != 0) { + if (strcasecmp(signature_type, "FINGERPRINTS") != 0) { + warnx("Signature type %s is not supported for " + "bootstrapping.", signature_type); + goto cleanup; + } + snprintf(path, sizeof(path), "%s.sig", pkgpath); if ((fd_sig = open(path, O_RDONLY)) == -1) { fprintf(stderr, "Signature for pkg not available.\n"); goto cleanup; } if (verify_signature(fd_pkg, fd_sig) == false) goto cleanup; } if ((ret = extract_pkg_static(fd_pkg, pkgstatic, MAXPATHLEN)) == 0) ret = install_pkg_static(pkgstatic, pkgpath, force); cleanup: close(fd_pkg); if (fd_sig != -1) close(fd_sig); return (ret); } int main(int argc, char *argv[]) { char pkgpath[MAXPATHLEN]; const char *pkgarg; bool bootstrap_only, force, yes; bootstrap_only = false; force = false; pkgarg = NULL; yes = false; snprintf(pkgpath, MAXPATHLEN, "%s/sbin/pkg", getenv("LOCALBASE") ? getenv("LOCALBASE") : _LOCALBASE); if (argc > 1 && strcmp(argv[1], "bootstrap") == 0) { bootstrap_only = true; if (argc == 3 && strcmp(argv[2], "-f") == 0) force = true; } if ((bootstrap_only && force) || access(pkgpath, X_OK) == -1) { /* * To allow 'pkg -N' to be used as a reliable test for whether * a system is configured to use pkg, don't bootstrap pkg * when that argument is given as argv[1]. */ if (argv[1] != NULL && strcmp(argv[1], "-N") == 0) errx(EXIT_FAILURE, "pkg is not installed"); config_init(); if (argc > 1 && strcmp(argv[1], "add") == 0) { if (argc > 2 && strcmp(argv[2], "-f") == 0) { force = true; pkgarg = argv[3]; } else pkgarg = argv[2]; if (pkgarg == NULL) { fprintf(stderr, "Path to pkg.txz required\n"); exit(EXIT_FAILURE); } if (access(pkgarg, R_OK) == -1) { fprintf(stderr, "No such file: %s\n", pkgarg); exit(EXIT_FAILURE); } if (bootstrap_pkg_local(pkgarg, force) != 0) exit(EXIT_FAILURE); exit(EXIT_SUCCESS); } /* * Do not ask for confirmation if either of stdin or stdout is * not tty. Check the environment to see if user has answer * tucked in there already. */ config_bool(ASSUME_ALWAYS_YES, &yes); if (!yes) { if (!isatty(fileno(stdin))) { fprintf(stderr, non_interactive_message); exit(EXIT_FAILURE); } printf("%s", confirmation_message); if (pkg_query_yes_no() == 0) exit(EXIT_FAILURE); } if (bootstrap_pkg(force) != 0) exit(EXIT_FAILURE); config_finish(); if (bootstrap_only) exit(EXIT_SUCCESS); } else if (bootstrap_only) { printf("pkg already bootstrapped at %s\n", pkgpath); exit(EXIT_SUCCESS); } execv(pkgpath, argv); /* NOT REACHED */ return (EXIT_FAILURE); }