Index: head/UPDATING =================================================================== --- head/UPDATING (revision 279749) +++ head/UPDATING (revision 279750) @@ -1,1156 +1,1162 @@ Updating Information for FreeBSD current users. This file is maintained and copyrighted by M. Warner Losh . See end of file for further details. For commonly done items, please see the COMMON ITEMS: section later in the file. These instructions assume that you basically know what you are doing. If not, then please consult the FreeBSD handbook: http://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/makeworld.html Items affecting the ports and packages system can be found in /usr/ports/UPDATING. Please read that file before running portupgrade. NOTE: FreeBSD has switched from gcc to clang. If you have trouble bootstrapping from older versions of FreeBSD, try WITHOUT_CLANG and WITH_GCC to bootstrap to the tip of head, and then rebuild without this option. The bootstrap process from older version of current across the gcc/clang cutover is a bit fragile. NOTE TO PEOPLE WHO THINK THAT FreeBSD 11.x IS SLOW: FreeBSD 11.x has many debugging features turned on, in both the kernel and userland. These features attempt to detect incorrect use of system primitives, and encourage loud failure through extra sanity checking and fail stop semantics. They also substantially impact system performance. If you want to do performance measurement, benchmarking, and optimization, you'll want to turn them off. This includes various WITNESS- related kernel options, INVARIANTS, malloc debugging flags in userland, and various verbose features in the kernel. Many developers choose to disable these features on build machines to maximize performance. (To completely disable malloc debugging, define MALLOC_PRODUCTION in /etc/make.conf, or to merely disable the most expensive debugging functionality run "ln -s 'abort:false,junk:false' /etc/malloc.conf".) +20150307: + The 32-bit PowerPC kernel has been changed to a position-independent + executable. This can only be booted with a version of loader(8) + newer than January 31, 2015, so make sure to update both world and + kernel before rebooting. + 20150217: If you are running a -CURRENT kernel since r273872 (Oct 30th, 2014), but before r278950, the RNG was not seeded properly. Immediately upgrade the kernel to r278950 or later and regenerate any keys (e.g. ssh keys or openssl keys) that were generated w/ a kernel from that range. This does not affect programs that directly used /dev/random or /dev/urandom. All userland uses of arc4random(3) are affected. 20150210: The autofs(4) ABI was changed in order to restore binary compatibility with 10.1-RELEASE. The automountd(8) daemon needs to be rebuilt to work with the new kernel. 20150131: The powerpc64 kernel has been changed to a position-independent executable. This can only be booted with a new version of loader(8), so make sure to update both world and kernel before rebooting. 20150118: Clang and llvm have been upgraded to 3.5.1 release. This is a bugfix only release, no new features have been added. Please see the 20141231 entry below for information about prerequisites and upgrading, if you are not already using 3.5.0. 20150107: ELF tools addr2line, elfcopy (strip), nm, size, and strings are now taken from the ELF Tool Chain project rather than GNU binutils. They should be drop-in replacements, with the addition of arm64 support. The WITHOUT_ELFTOOLCHAIN_TOOLS= knob may be used to obtain the binutils tools, if necessary. 20150105: The default Unbound configuration now enables remote control using a local socket. Users who have already enabled the local_unbound service should regenerate their configuration by running "service local_unbound setup" as root. 20150102: The GNU texinfo and GNU info pages have been removed. To be able to view GNU info pages please install texinfo from ports. 20141231: Clang, llvm and lldb have been upgraded to 3.5.0 release. As of this release, a prerequisite for building clang, llvm and lldb is a C++11 capable compiler and C++11 standard library. This means that to be able to successfully build the cross-tools stage of buildworld, with clang as the bootstrap compiler, your system compiler or cross compiler should either be clang 3.3 or later, or gcc 4.8 or later, and your system C++ library should be libc++, or libdstdc++ from gcc 4.8 or later. On any standard FreeBSD 10.x or 11.x installation, where clang and libc++ are on by default (that is, on x86 or arm), this should work out of the box. On 9.x installations where clang is enabled by default, e.g. on x86 and powerpc, libc++ will not be enabled by default, so libc++ should be built (with clang) and installed first. If both clang and libc++ are missing, build clang first, then use it to build libc++. On 8.x and earlier installations, upgrade to 9.x first, and then follow the instructions for 9.x above. Sparc64 and mips users are unaffected, as they still use gcc 4.2.1 by default, and do not build clang. Many embedded systems are resource constrained, and will not be able to build clang in a reasonable time, or in some cases at all. In those cases, cross building bootable systems on amd64 is a workaround. This new version of clang introduces a number of new warnings, of which the following are most likely to appear: -Wabsolute-value This warns in two cases, for both C and C++: * When the code is trying to take the absolute value of an unsigned quantity, which is effectively a no-op, and almost never what was intended. The code should be fixed, if at all possible. If you are sure that the unsigned quantity can be safely cast to signed, without loss of information or undefined behavior, you can add an explicit cast, or disable the warning. * When the code is trying to take an absolute value, but the called abs() variant is for the wrong type, which can lead to truncation. If you want to disable the warning instead of fixing the code, please make sure that truncation will not occur, or it might lead to unwanted side-effects. -Wtautological-undefined-compare and -Wundefined-bool-conversion These warn when C++ code is trying to compare 'this' against NULL, while 'this' should never be NULL in well-defined C++ code. However, there is some legacy (pre C++11) code out there, which actively abuses this feature, which was less strictly defined in previous C++ versions. Squid and openjdk do this, for example. The warning can be turned off for C++98 and earlier, but compiling the code in C++11 mode might result in unexpected behavior; for example, the parts of the program that are unreachable could be optimized away. 20141222: The old NFS client and server (kernel options NFSCLIENT, NFSSERVER) kernel sources have been removed. The .h files remain, since some utilities include them. This will need to be fixed later. If "mount -t oldnfs ..." is attempted, it will fail. If the "-o" option on mountd(8), nfsd(8) or nfsstat(1) is used, the utilities will report errors. 20141121: The handling of LOCAL_LIB_DIRS has been altered to skip addition of directories to top level SUBDIR variable when their parent directory is included in LOCAL_DIRS. Users with build systems with such hierarchies and without SUBDIR entries in the parent directory Makefiles should add them or add the directories to LOCAL_DIRS. 20141109: faith(4) and faithd(8) have been removed from the base system. Faith has been obsolete for a very long time. 20141104: vt(4), the new console driver, is enabled by default. It brings support for Unicode and double-width characters, as well as support for UEFI and integration with the KMS kernel video drivers. You may need to update your console settings in /etc/rc.conf, most probably the keymap. During boot, /etc/rc.d/syscons will indicate what you need to do. vt(4) still has issues and lacks some features compared to syscons(4). See the wiki for up-to-date information: https://wiki.freebsd.org/Newcons If you want to keep using syscons(4), you can do so by adding the following line to /boot/loader.conf: kern.vty=sc 20141102: pjdfstest has been integrated into kyua as an opt-in test suite. Please see share/doc/pjdfstest/README for more details on how to execute it. 20141009: gperf has been removed from the base system for architectures that use clang. Ports that require gperf will obtain it from the devel/gperf port. 20140923: pjdfstest has been moved from tools/regression/pjdfstest to contrib/pjdfstest . 20140922: At svn r271982, The default linux compat kernel ABI has been adjusted to 2.6.18 in support of the linux-c6 compat ports infrastructure update. If you wish to continue using the linux-f10 compat ports, add compat.linux.osrelease=2.6.16 to your local sysctl.conf. Users are encouraged to update their linux-compat packages to linux-c6 during their next update cycle. 20140729: The ofwfb driver, used to provide a graphics console on PowerPC when using vt(4), no longer allows mmap() of all physical memory. This will prevent Xorg on PowerPC with some ATI graphics cards from initializing properly unless x11-servers/xorg-server is updated to 1.12.4_8 or newer. 20140723: The xdev targets have been converted to using TARGET and TARGET_ARCH instead of XDEV and XDEV_ARCH. 20140719: The default unbound configuration has been modified to address issues with reverse lookups on networks that use private address ranges. If you use the local_unbound service, run "service local_unbound setup" as root to regenerate your configuration, then "service local_unbound reload" to load the new configuration. 20140709: The GNU texinfo and GNU info pages are not built and installed anymore, WITH_INFO knob has been added to allow to built and install them again. UPDATE: see 20150102 entry on texinfo's removal 20140708: The GNU readline library is now an INTERNALLIB - that is, it is statically linked into consumers (GDB and variants) in the base system, and the shared library is no longer installed. The devel/readline port is available for third party software that requires readline. 20140702: The Itanium architecture (ia64) has been removed from the list of known architectures. This is the first step in the removal of the architecture. 20140701: Commit r268115 has added NFSv4.1 server support, merged from projects/nfsv4.1-server. Since this includes changes to the internal interfaces between the NFS related modules, a full build of the kernel and modules will be necessary. __FreeBSD_version has been bumped. 20140629: The WITHOUT_VT_SUPPORT kernel config knob has been renamed WITHOUT_VT. (The other _SUPPORT knobs have a consistent meaning which differs from the behaviour controlled by this knob.) 20140619: Maximal length of the serial number in CTL was increased from 16 to 64 chars, that breaks ABI. All CTL-related tools, such as ctladm and ctld, need to be rebuilt to work with a new kernel. 20140606: The libatf-c and libatf-c++ major versions were downgraded to 0 and 1 respectively to match the upstream numbers. They were out of sync because, when they were originally added to FreeBSD, the upstream versions were not respected. These libraries are private and not yet built by default, so renumbering them should be a non-issue. However, unclean source trees will yield broken test programs once the operator executes "make delete-old-libs" after a "make installworld". Additionally, the atf-sh binary was made private by moving it into /usr/libexec/. Already-built shell test programs will keep the path to the old binary so they will break after "make delete-old" is run. If you are using WITH_TESTS=yes (not the default), wipe the object tree and rebuild from scratch to prevent spurious test failures. This is only needed once: the misnumbered libraries and misplaced binaries have been added to OptionalObsoleteFiles.inc so they will be removed during a clean upgrade. 20140512: Clang and llvm have been upgraded to 3.4.1 release. 20140508: We bogusly installed src.opts.mk in /usr/share/mk. This file should be removed to avoid issues in the future (and has been added to ObsoleteFiles.inc). 20140505: /etc/src.conf now affects only builds of the FreeBSD src tree. In the past, it affected all builds that used the bsd.*.mk files. The old behavior was a bug, but people may have relied upon it. To get this behavior back, you can .include /etc/src.conf from /etc/make.conf (which is still global and isn't changed). This also changes the behavior of incremental builds inside the tree of individual directories. Set MAKESYSPATH to ".../share/mk" to do that. Although this has survived make universe and some upgrade scenarios, other upgrade scenarios may have broken. At least one form of temporary breakage was fixed with MAKESYSPATH settings for buildworld as well... In cases where MAKESYSPATH isn't working with this setting, you'll need to set it to the full path to your tree. One side effect of all this cleaning up is that bsd.compiler.mk is no longer implicitly included by bsd.own.mk. If you wish to use COMPILER_TYPE, you must now explicitly include bsd.compiler.mk as well. 20140430: The lindev device has been removed since /dev/full has been made a standard device. __FreeBSD_version has been bumped. 20140424: The knob WITHOUT_VI was added to the base system, which controls building ex(1), vi(1), etc. Older releases of FreeBSD required ex(1) in order to reorder files share/termcap and didn't build ex(1) as a build tool, so building/installing with WITH_VI is highly advised for build hosts for older releases. This issue has been fixed in stable/9 and stable/10 in r277022 and r276991, respectively. 20140418: The YES_HESIOD knob has been removed. It has been obsolete for a decade. Please move to using WITH_HESIOD instead or your builds will silently lack HESIOD. 20140405: The uart(4) driver has been changed with respect to its handling of the low-level console. Previously the uart(4) driver prevented any process from changing the baudrate or the CLOCAL and HUPCL control flags. By removing the restrictions, operators can make changes to the serial console port without having to reboot. However, when getty(8) is started on the serial device that is associated with the low-level console, a misconfigured terminal line in /etc/ttys will now have a real impact. Before upgrading the kernel, make sure that /etc/ttys has the serial console device configured as 3wire without baudrate to preserve the previous behaviour. E.g: ttyu0 "/usr/libexec/getty 3wire" vt100 on secure 20140306: Support for libwrap (TCP wrappers) in rpcbind was disabled by default to improve performance. To re-enable it, if needed, run rpcbind with command line option -W. 20140226: Switched back to the GPL dtc compiler due to updates in the upstream dts files not being supported by the BSDL dtc compiler. You will need to rebuild your kernel toolchain to pick up the new compiler. Core dumps may result while building dtb files during a kernel build if you fail to do so. Set WITHOUT_GPL_DTC if you require the BSDL compiler. 20140216: Clang and llvm have been upgraded to 3.4 release. 20140216: The nve(4) driver has been removed. Please use the nfe(4) driver for NVIDIA nForce MCP Ethernet adapters instead. 20140212: An ABI incompatibility crept into the libc++ 3.4 import in r261283. This could cause certain C++ applications using shared libraries built against the previous version of libc++ to crash. The incompatibility has now been fixed, but any C++ applications or shared libraries built between r261283 and r261801 should be recompiled. 20140204: OpenSSH will now ignore errors caused by kernel lacking of Capsicum capability mode support. Please note that enabling the feature in kernel is still highly recommended. 20140131: OpenSSH is now built with sandbox support, and will use sandbox as the default privilege separation method. This requires Capsicum capability mode support in kernel. 20140128: The libelf and libdwarf libraries have been updated to newer versions from upstream. Shared library version numbers for these two libraries were bumped. Any ports or binaries requiring these two libraries should be recompiled. __FreeBSD_version is bumped to 1100006. 20140110: If a Makefile in a tests/ directory was auto-generating a Kyuafile instead of providing an explicit one, this would prevent such Makefile from providing its own Kyuafile in the future during NO_CLEAN builds. This has been fixed in the Makefiles but manual intervention is needed to clean an objdir if you use NO_CLEAN: # find /usr/obj -name Kyuafile | xargs rm -f 20131213: The behavior of gss_pseudo_random() for the krb5 mechanism has changed, for applications requesting a longer random string than produced by the underlying enctype's pseudo-random() function. In particular, the random string produced from a session key of enctype aes256-cts-hmac-sha1-96 or aes256-cts-hmac-sha1-96 will be different at the 17th octet and later, after this change. The counter used in the PRF+ construction is now encoded as a big-endian integer in accordance with RFC 4402. __FreeBSD_version is bumped to 1100004. 20131108: The WITHOUT_ATF build knob has been removed and its functionality has been subsumed into the more generic WITHOUT_TESTS. If you were using the former to disable the build of the ATF libraries, you should change your settings to use the latter. 20131025: The default version of mtree is nmtree which is obtained from NetBSD. The output is generally the same, but may vary slightly. If you found you need identical output adding "-F freebsd9" to the command line should do the trick. For the time being, the old mtree is available as fmtree. 20131014: libbsdyml has been renamed to libyaml and moved to /usr/lib/private. This will break ports-mgmt/pkg. Rebuild the port, or upgrade to pkg 1.1.4_8 and verify bsdyml not linked in, before running "make delete-old-libs": # make -C /usr/ports/ports-mgmt/pkg build deinstall install clean or # pkg install pkg; ldd /usr/local/sbin/pkg | grep bsdyml 20131010: The rc.d/jail script has been updated to support jail(8) configuration file. The "jail__*" rc.conf(5) variables for per-jail configuration are automatically converted to /var/run/jail..conf before the jail(8) utility is invoked. This is transparently backward compatible. See below about some incompatibilities and rc.conf(5) manual page for more details. These variables are now deprecated in favor of jail(8) configuration file. One can use "rc.d/jail config " command to generate a jail(8) configuration file in /var/run/jail..conf without running the jail(8) utility. The default pathname of the configuration file is /etc/jail.conf and can be specified by using $jail_conf or $jail__conf variables. Please note that jail_devfs_ruleset accepts an integer at this moment. Please consider to rewrite the ruleset name with an integer. 20130930: BIND has been removed from the base system. If all you need is a local resolver, simply enable and start the local_unbound service instead. Otherwise, several versions of BIND are available in the ports tree. The dns/bind99 port is one example. With this change, nslookup(1) and dig(1) are no longer in the base system. Users should instead use host(1) and drill(1) which are in the base system. Alternatively, nslookup and dig can be obtained by installing the dns/bind-tools port. 20130916: With the addition of unbound(8), a new unbound user is now required during installworld. "mergemaster -p" can be used to add the user prior to installworld, as documented in the handbook. 20130911: OpenSSH is now built with DNSSEC support, and will by default silently trust signed SSHFP records. This can be controlled with the VerifyHostKeyDNS client configuration setting. DNSSEC support can be disabled entirely with the WITHOUT_LDNS option in src.conf. 20130906: The GNU Compiler Collection and C++ standard library (libstdc++) are no longer built by default on platforms where clang is the system compiler. You can enable them with the WITH_GCC and WITH_GNUCXX options in src.conf. 20130905: The PROCDESC kernel option is now part of the GENERIC kernel configuration and is required for the rwhod(8) to work. If you are using custom kernel configuration, you should include 'options PROCDESC'. 20130905: The API and ABI related to the Capsicum framework was modified in backward incompatible way. The userland libraries and programs have to be recompiled to work with the new kernel. This includes the following libraries and programs, but the whole buildworld is advised: libc, libprocstat, dhclient, tcpdump, hastd, hastctl, kdump, procstat, rwho, rwhod, uniq. 20130903: AES-NI intrinsic support has been added to gcc. The AES-NI module has been updated to use this support. A new gcc is required to build the aesni module on both i386 and amd64. 20130821: The PADLOCK_RNG and RDRAND_RNG kernel options are now devices. Thus "device padlock_rng" and "device rdrand_rng" should be used instead of "options PADLOCK_RNG" & "options RDRAND_RNG". 20130813: WITH_ICONV has been split into two feature sets. WITH_ICONV now enables just the iconv* functionality and is now on by default. WITH_LIBICONV_COMPAT enables the libiconv api and link time compatability. Set WITHOUT_ICONV to build the old way. If you have been using WITH_ICONV before, you will very likely need to turn on WITH_LIBICONV_COMPAT. 20130806: INVARIANTS option now enables DEBUG for code with OpenSolaris and Illumos origin, including ZFS. If you have INVARIANTS in your kernel configuration, then there is no need to set DEBUG or ZFS_DEBUG explicitly. DEBUG used to enable witness(9) tracking of OpenSolaris (mostly ZFS) locks if WITNESS option was set. Because that generated a lot of witness(9) reports and all of them were believed to be false positives, this is no longer done. New option OPENSOLARIS_WITNESS can be used to achieve the previous behavior. 20130806: Timer values in IPv6 data structures now use time_uptime instead of time_second. Although this is not a user-visible functional change, userland utilities which directly use them---ndp(8), rtadvd(8), and rtsold(8) in the base system---need to be updated to r253970 or later. 20130802: find -delete can now delete the pathnames given as arguments, instead of only files found below them or if the pathname did not contain any slashes. Formerly, the following error message would result: find: -delete: : relative path potentially not safe Deleting the pathnames given as arguments can be prevented without error messages using -mindepth 1 or by changing directory and passing "." as argument to find. This works in the old as well as the new version of find. 20130726: Behavior of devfs rules path matching has been changed. Pattern is now always matched against fully qualified devfs path and slash characters must be explicitly matched by slashes in pattern (FNM_PATHNAME). Rulesets involving devfs subdirectories must be reviewed. 20130716: The default ARM ABI has changed to the ARM EABI. The old ABI is incompatible with the ARM EABI and all programs and modules will need to be rebuilt to work with a new kernel. To keep using the old ABI ensure the WITHOUT_ARM_EABI knob is set. NOTE: Support for the old ABI will be removed in the future and users are advised to upgrade. 20130709: pkg_install has been disconnected from the build if you really need it you should add WITH_PKGTOOLS in your src.conf(5). 20130709: Most of network statistics structures were changed to be able keep 64-bits counters. Thus all tools, that work with networking statistics, must be rebuilt (netstat(1), bsnmpd(1), etc.) 20130629: Fix targets that run multiple make's to use && rather than ; so that subsequent steps depend on success of previous. NOTE: if building 'universe' with -j* on stable/8 or stable/9 it would be better to start the build using bmake, to avoid overloading the machine. 20130618: Fix a bug that allowed a tracing process (e.g. gdb) to write to a memory-mapped file in the traced process's address space even if neither the traced process nor the tracing process had write access to that file. 20130615: CVS has been removed from the base system. An exact copy of the code is available from the devel/cvs port. 20130613: Some people report the following error after the switch to bmake: make: illegal option -- J usage: make [-BPSXeiknpqrstv] [-C directory] [-D variable] ... *** [buildworld] Error code 2 this likely due to an old instance of make in ${MAKEPATH} (${MAKEOBJDIRPREFIX}${.CURDIR}/make.${MACHINE}) which src/Makefile will use that blindly, if it exists, so if you see the above error: rm -rf `make -V MAKEPATH` should resolve it. 20130516: Use bmake by default. Whereas before one could choose to build with bmake via -DWITH_BMAKE one must now use -DWITHOUT_BMAKE to use the old make. The goal is to remove these knobs for 10-RELEASE. It is worth noting that bmake (like gmake) treats the command line as the unit of failure, rather than statements within the command line. Thus '(cd some/where && dosomething)' is safer than 'cd some/where; dosomething'. The '()' allows consistent behavior in parallel build. 20130429: Fix a bug that allows NFS clients to issue READDIR on files. 20130426: The WITHOUT_IDEA option has been removed because the IDEA patent expired. 20130426: The sysctl which controls TRIM support under ZFS has been renamed from vfs.zfs.trim_disable -> vfs.zfs.trim.enabled and has been enabled by default. 20130425: The mergemaster command now uses the default MAKEOBJDIRPREFIX rather than creating it's own in the temporary directory in order allow access to bootstrapped versions of tools such as install and mtree. When upgrading from version of FreeBSD where the install command does not support -l, you will need to install a new mergemaster command if mergemaster -p is required. This can be accomplished with the command (cd src/usr.sbin/mergemaster && make install). 20130404: Legacy ATA stack, disabled and replaced by new CAM-based one since FreeBSD 9.0, completely removed from the sources. Kernel modules atadisk and atapi*, user-level tools atacontrol and burncd are removed. Kernel option `options ATA_CAM` is now permanently enabled and removed. 20130319: SOCK_CLOEXEC and SOCK_NONBLOCK flags have been added to socket(2) and socketpair(2). Software, in particular Kerberos, may automatically detect and use these during building. The resulting binaries will not work on older kernels. 20130308: CTL_DISABLE has also been added to the sparc64 GENERIC (for further information, see the respective 20130304 entry). 20130304: Recent commits to callout(9) changed the size of struct callout, so the KBI is probably heavily disturbed. Also, some functions in callout(9)/sleep(9)/sleepqueue(9)/condvar(9) KPIs were replaced by macros. Every kernel module using it won't load, so rebuild is requested. The ctl device has been re-enabled in GENERIC for i386 and amd64, but does not initialize by default (because of the new CTL_DISABLE option) to save memory. To re-enable it, remove the CTL_DISABLE option from the kernel config file or set kern.cam.ctl.disable=0 in /boot/loader.conf. 20130301: The ctl device has been disabled in GENERIC for i386 and amd64. This was done due to the extra memory being allocated at system initialisation time by the ctl driver which was only used if a CAM target device was created. This makes a FreeBSD system unusable on 128MB or less of RAM. 20130208: A new compression method (lz4) has been merged to -HEAD. Please refer to zpool-features(7) for more information. Please refer to the "ZFS notes" section of this file for information on upgrading boot ZFS pools. 20130129: A BSD-licensed patch(1) variant has been added and is installed as bsdpatch, being the GNU version the default patch. To inverse the logic and use the BSD-licensed one as default, while having the GNU version installed as gnupatch, rebuild and install world with the WITH_BSD_PATCH knob set. 20130121: Due to the use of the new -l option to install(1) during build and install, you must take care not to directly set the INSTALL make variable in your /etc/make.conf, /etc/src.conf, or on the command line. If you wish to use the -C flag for all installs you may be able to add INSTALL+=-C to /etc/make.conf or /etc/src.conf. 20130118: The install(1) option -M has changed meaning and now takes an argument that is a file or path to append logs to. In the unlikely event that -M was the last option on the command line and the command line contained at least two files and a target directory the first file will have logs appended to it. The -M option served little practical purpose in the last decade so its use is expected to be extremely rare. 20121223: After switching to Clang as the default compiler some users of ZFS on i386 systems started to experience stack overflow kernel panics. Please consider using 'options KSTACK_PAGES=4' in such configurations. 20121222: GEOM_LABEL now mangles label names read from file system metadata. Mangling affect labels containing spaces, non-printable characters, '%' or '"'. Device names in /etc/fstab and other places may need to be updated. 20121217: By default, only the 10 most recent kernel dumps will be saved. To restore the previous behaviour (no limit on the number of kernel dumps stored in the dump directory) add the following line to /etc/rc.conf: savecore_flags="" 20121201: With the addition of auditdistd(8), a new auditdistd user is now required during installworld. "mergemaster -p" can be used to add the user prior to installworld, as documented in the handbook. 20121117: The sin6_scope_id member variable in struct sockaddr_in6 is now filled by the kernel before passing the structure to the userland via sysctl or routing socket. This means the KAME-specific embedded scope id in sin6_addr.s6_addr[2] is always cleared in userland application. This behavior can be controlled by net.inet6.ip6.deembed_scopeid. __FreeBSD_version is bumped to 1000025. 20121105: On i386 and amd64 systems WITH_CLANG_IS_CC is now the default. This means that the world and kernel will be compiled with clang and that clang will be installed as /usr/bin/cc, /usr/bin/c++, and /usr/bin/cpp. To disable this behavior and revert to building with gcc, compile with WITHOUT_CLANG_IS_CC. Really old versions of current may need to bootstrap WITHOUT_CLANG first if the clang build fails (its compatibility window doesn't extend to the 9 stable branch point). 20121102: The IPFIREWALL_FORWARD kernel option has been removed. Its functionality now turned on by default. 20121023: The ZERO_COPY_SOCKET kernel option has been removed and split into SOCKET_SEND_COW and SOCKET_RECV_PFLIP. NB: SOCKET_SEND_COW uses the VM page based copy-on-write mechanism which is not safe and may result in kernel crashes. NB: The SOCKET_RECV_PFLIP mechanism is useless as no current driver supports disposeable external page sized mbuf storage. Proper replacements for both zero-copy mechanisms are under consideration and will eventually lead to complete removal of the two kernel options. 20121023: The IPv4 network stack has been converted to network byte order. The following modules need to be recompiled together with kernel: carp(4), divert(4), gif(4), siftr(4), gre(4), pf(4), ipfw(4), ng_ipfw(4), stf(4). 20121022: Support for non-MPSAFE filesystems was removed from VFS. The VFS_VERSION was bumped, all filesystem modules shall be recompiled. 20121018: All the non-MPSAFE filesystems have been disconnected from the build. The full list includes: codafs, hpfs, ntfs, nwfs, portalfs, smbfs, xfs. 20121016: The interface cloning API and ABI has changed. The following modules need to be recompiled together with kernel: ipfw(4), pfsync(4), pflog(4), usb(4), wlan(4), stf(4), vlan(4), disc(4), edsc(4), if_bridge(4), gif(4), tap(4), faith(4), epair(4), enc(4), tun(4), if_lagg(4), gre(4). 20121015: The sdhci driver was split in two parts: sdhci (generic SD Host Controller logic) and sdhci_pci (actual hardware driver). No kernel config modifications are required, but if you load sdhc as a module you must switch to sdhci_pci instead. 20121014: Import the FUSE kernel and userland support into base system. 20121013: The GNU sort(1) program has been removed since the BSD-licensed sort(1) has been the default for quite some time and no serious problems have been reported. The corresponding WITH_GNU_SORT knob has also gone. 20121006: The pfil(9) API/ABI for AF_INET family has been changed. Packet filtering modules: pf(4), ipfw(4), ipfilter(4) need to be recompiled with new kernel. 20121001: The net80211(4) ABI has been changed to allow for improved driver PS-POLL and power-save support. All wireless drivers need to be recompiled to work with the new kernel. 20120913: The random(4) support for the VIA hardware random number generator (`PADLOCK') is no longer enabled unconditionally. Add the padlock_rng device in the custom kernel config if needed. The GENERIC kernels on i386 and amd64 do include the device, so the change only affects the custom kernel configurations. 20120908: The pf(4) packet filter ABI has been changed. pfctl(8) and snmp_pf module need to be recompiled to work with new kernel. 20120828: A new ZFS feature flag "com.delphix:empty_bpobj" has been merged to -HEAD. Pools that have empty_bpobj in active state can not be imported read-write with ZFS implementations that do not support this feature. For more information read the zpool-features(5) manual page. 20120727: The sparc64 ZFS loader has been changed to no longer try to auto- detect ZFS providers based on diskN aliases but now requires these to be explicitly listed in the OFW boot-device environment variable. 20120712: The OpenSSL has been upgraded to 1.0.1c. Any binaries requiring libcrypto.so.6 or libssl.so.6 must be recompiled. Also, there are configuration changes. Make sure to merge /etc/ssl/openssl.cnf. 20120712: The following sysctls and tunables have been renamed for consistency with other variables: kern.cam.da.da_send_ordered -> kern.cam.da.send_ordered kern.cam.ada.ada_send_ordered -> kern.cam.ada.send_ordered 20120628: The sort utility has been replaced with BSD sort. For now, GNU sort is also available as "gnusort" or the default can be set back to GNU sort by setting WITH_GNU_SORT. In this case, BSD sort will be installed as "bsdsort". 20120611: A new version of ZFS (pool version 5000) has been merged to -HEAD. Starting with this version the old system of ZFS pool versioning is superseded by "feature flags". This concept enables forward compatibility against certain future changes in functionality of ZFS pools. The first read-only compatible "feature flag" for ZFS pools is named "com.delphix:async_destroy". For more information read the new zpool-features(5) manual page. Please refer to the "ZFS notes" section of this file for information on upgrading boot ZFS pools. 20120417: The malloc(3) implementation embedded in libc now uses sources imported as contrib/jemalloc. The most disruptive API change is to /etc/malloc.conf. If your system has an old-style /etc/malloc.conf, delete it prior to installworld, and optionally re-create it using the new format after rebooting. See malloc.conf(5) for details (specifically the TUNING section and the "opt.*" entries in the MALLCTL NAMESPACE section). 20120328: Big-endian MIPS TARGET_ARCH values no longer end in "eb". mips64eb is now spelled mips64. mipsn32eb is now spelled mipsn32. mipseb is now spelled mips. This is to aid compatibility with third-party software that expects this naming scheme in uname(3). Little-endian settings are unchanged. If you are updating a big-endian mips64 machine from before this change, you may need to set MACHINE_ARCH=mips64 in your environment before the new build system will recognize your machine. 20120306: Disable by default the option VFS_ALLOW_NONMPSAFE for all supported platforms. 20120229: Now unix domain sockets behave "as expected" on nullfs(5). Previously nullfs(5) did not pass through all behaviours to the underlying layer, as a result if we bound to a socket on the lower layer we could connect only to the lower path; if we bound to the upper layer we could connect only to the upper path. The new behavior is one can connect to both the lower and the upper paths regardless what layer path one binds to. 20120211: The getifaddrs upgrade path broken with 20111215 has been restored. If you have upgraded in between 20111215 and 20120209 you need to recompile libc again with your kernel. You still need to recompile world to be able to configure CARP but this restriction already comes from 20111215. 20120114: The set_rcvar() function has been removed from /etc/rc.subr. All base and ports rc.d scripts have been updated, so if you have a port installed with a script in /usr/local/etc/rc.d you can either hand-edit the rcvar= line, or reinstall the port. An easy way to handle the mass-update of /etc/rc.d: rm /etc/rc.d/* && mergemaster -i 20120109: panic(9) now stops other CPUs in the SMP systems, disables interrupts on the current CPU and prevents other threads from running. This behavior can be reverted using the kern.stop_scheduler_on_panic tunable/sysctl. The new behavior can be incompatible with kern.sync_on_panic. 20111215: The carp(4) facility has been changed significantly. Configuration of the CARP protocol via ifconfig(8) has changed, as well as format of CARP events submitted to devd(8) has changed. See manual pages for more information. The arpbalance feature of carp(4) is currently not supported anymore. Size of struct in_aliasreq, struct in6_aliasreq has changed. User utilities using SIOCAIFADDR, SIOCAIFADDR_IN6, e.g. ifconfig(8), need to be recompiled. 20111122: The acpi_wmi(4) status device /dev/wmistat has been renamed to /dev/wmistat0. 20111108: The option VFS_ALLOW_NONMPSAFE option has been added in order to explicitely support non-MPSAFE filesystems. It is on by default for all supported platform at this present time. 20111101: The broken amd(4) driver has been replaced with esp(4) in the amd64, i386 and pc98 GENERIC kernel configuration files. 20110930: sysinstall has been removed 20110923: The stable/9 branch created in subversion. This corresponds to the RELENG_9 branch in CVS. COMMON ITEMS: General Notes ------------- Avoid using make -j when upgrading. While generally safe, there are sometimes problems using -j to upgrade. If your upgrade fails with -j, please try again without -j. From time to time in the past there have been problems using -j with buildworld and/or installworld. This is especially true when upgrading between "distant" versions (eg one that cross a major release boundary or several minor releases, or when several months have passed on the -current branch). Sometimes, obscure build problems are the result of environment poisoning. This can happen because the make utility reads its environment when searching for values for global variables. To run your build attempts in an "environmental clean room", prefix all make commands with 'env -i '. See the env(1) manual page for more details. When upgrading from one major version to another it is generally best to upgrade to the latest code in the currently installed branch first, then do an upgrade to the new branch. This is the best-tested upgrade path, and has the highest probability of being successful. Please try this approach before reporting problems with a major version upgrade. When upgrading a live system, having a root shell around before installing anything can help undo problems. Not having a root shell around can lead to problems if pam has changed too much from your starting point to allow continued authentication after the upgrade. ZFS notes --------- When upgrading the boot ZFS pool to a new version, always follow these two steps: 1.) recompile and reinstall the ZFS boot loader and boot block (this is part of "make buildworld" and "make installworld") 2.) update the ZFS boot block on your boot drive The following example updates the ZFS boot block on the first partition (freebsd-boot) of a GPT partitioned drive ada0: "gpart bootcode -p /boot/gptzfsboot -i 1 ada0" Non-boot pools do not need these updates. To build a kernel ----------------- If you are updating from a prior version of FreeBSD (even one just a few days old), you should follow this procedure. It is the most failsafe as it uses a /usr/obj tree with a fresh mini-buildworld, make kernel-toolchain make -DALWAYS_CHECK_MAKE buildkernel KERNCONF=YOUR_KERNEL_HERE make -DALWAYS_CHECK_MAKE installkernel KERNCONF=YOUR_KERNEL_HERE To test a kernel once --------------------- If you just want to boot a kernel once (because you are not sure if it works, or if you want to boot a known bad kernel to provide debugging information) run make installkernel KERNCONF=YOUR_KERNEL_HERE KODIR=/boot/testkernel nextboot -k testkernel To just build a kernel when you know that it won't mess you up -------------------------------------------------------------- This assumes you are already running a CURRENT system. Replace ${arch} with the architecture of your machine (e.g. "i386", "arm", "amd64", "ia64", "pc98", "sparc64", "powerpc", "mips", etc). cd src/sys/${arch}/conf config KERNEL_NAME_HERE cd ../compile/KERNEL_NAME_HERE make depend make make install If this fails, go to the "To build a kernel" section. To rebuild everything and install it on the current system. ----------------------------------------------------------- # Note: sometimes if you are running current you gotta do more than # is listed here if you are upgrading from a really old current. make buildworld make kernel KERNCONF=YOUR_KERNEL_HERE [1] [3] mergemaster -Fp [5] make installworld mergemaster -Fi [4] make delete-old [6] To cross-install current onto a separate partition -------------------------------------------------- # In this approach we use a separate partition to hold # current's root, 'usr', and 'var' directories. A partition # holding "/", "/usr" and "/var" should be about 2GB in # size. make buildworld make buildkernel KERNCONF=YOUR_KERNEL_HERE make installworld DESTDIR=${CURRENT_ROOT} -DDB_FROM_SRC make distribution DESTDIR=${CURRENT_ROOT} # if newfs'd make installkernel KERNCONF=YOUR_KERNEL_HERE DESTDIR=${CURRENT_ROOT} cp /etc/fstab ${CURRENT_ROOT}/etc/fstab # if newfs'd To upgrade in-place from stable to current ---------------------------------------------- make buildworld [9] make kernel KERNCONF=YOUR_KERNEL_HERE [8] [1] [3] mergemaster -Fp [5] make installworld mergemaster -Fi [4] make delete-old [6] Make sure that you've read the UPDATING file to understand the tweaks to various things you need. At this point in the life cycle of current, things change often and you are on your own to cope. The defaults can also change, so please read ALL of the UPDATING entries. Also, if you are tracking -current, you must be subscribed to freebsd-current@freebsd.org. Make sure that before you update your sources that you have read and understood all the recent messages there. If in doubt, please track -stable which has much fewer pitfalls. [1] If you have third party modules, such as vmware, you should disable them at this point so they don't crash your system on reboot. [3] From the bootblocks, boot -s, and then do fsck -p mount -u / mount -a cd src adjkerntz -i # if CMOS is wall time Also, when doing a major release upgrade, it is required that you boot into single user mode to do the installworld. [4] Note: This step is non-optional. Failure to do this step can result in a significant reduction in the functionality of the system. Attempting to do it by hand is not recommended and those that pursue this avenue should read this file carefully, as well as the archives of freebsd-current and freebsd-hackers mailing lists for potential gotchas. The -U option is also useful to consider. See mergemaster(8) for more information. [5] Usually this step is a noop. However, from time to time you may need to do this if you get unknown user in the following step. It never hurts to do it all the time. You may need to install a new mergemaster (cd src/usr.sbin/mergemaster && make install) after the buildworld before this step if you last updated from current before 20130425 or from -stable before 20130430. [6] This only deletes old files and directories. Old libraries can be deleted by "make delete-old-libs", but you have to make sure that no program is using those libraries anymore. [8] In order to have a kernel that can run the 4.x binaries needed to do an installworld, you must include the COMPAT_FREEBSD4 option in your kernel. Failure to do so may leave you with a system that is hard to boot to recover. A similar kernel option COMPAT_FREEBSD5 is required to run the 5.x binaries on more recent kernels. And so on for COMPAT_FREEBSD6 and COMPAT_FREEBSD7. Make sure that you merge any new devices from GENERIC since the last time you updated your kernel config file. [9] When checking out sources, you must include the -P flag to have cvs prune empty directories. If CPUTYPE is defined in your /etc/make.conf, make sure to use the "?=" instead of the "=" assignment operator, so that buildworld can override the CPUTYPE if it needs to. MAKEOBJDIRPREFIX must be defined in an environment variable, and not on the command line, or in /etc/make.conf. buildworld will warn if it is improperly defined. FORMAT: This file contains a list, in reverse chronological order, of major breakages in tracking -current. It is not guaranteed to be a complete list of such breakages, and only contains entries since October 10, 2007. If you need to see UPDATING entries from before that date, you will need to fetch an UPDATING file from an older FreeBSD release. Copyright information: Copyright 1998-2009 M. Warner Losh. All Rights Reserved. Redistribution, publication, translation and use, with or without modification, in full or in part, in any form or format of this document are permitted without further permission from the author. THIS DOCUMENT IS PROVIDED BY WARNER LOSH ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL WARNER LOSH BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. Contact Warner Losh if you have any questions about your use of this document. $FreeBSD$ Index: head/sys/conf/Makefile.powerpc =================================================================== --- head/sys/conf/Makefile.powerpc (revision 279749) +++ head/sys/conf/Makefile.powerpc (revision 279750) @@ -1,63 +1,62 @@ # Makefile.powerpc -- with config changes. # Copyright 1990 W. Jolitz # from: @(#)Makefile.i386 7.1 5/10/91 # $FreeBSD$ # # Makefile for FreeBSD # # This makefile is constructed from a machine description: # config machineid # Most changes should be made in the machine description # /sys/powerpc/conf/``machineid'' # after which you should do # config machineid # Generic makefile changes should be made in # /sys/conf/Makefile.powerpc # after which config should be rerun for all machines. # # Which version of config(8) is required. %VERSREQ= 600012 STD8X16FONT?= iso .if !defined(S) .if exists(./@/.) S= ./@ .else S= ../../.. .endif .endif LDSCRIPT_NAME?= ldscript.${MACHINE_ARCH} .include "$S/conf/kern.pre.mk" INCLUDES+= -I$S/contrib/libfdt CFLAGS+= -msoft-float -Wa,-many -.if ${MACHINE_ARCH} == "powerpc64" +# Build position-independent kernel CFLAGS+= -fPIC LDFLAGS+= -pie -.endif .if !empty(DDB_ENABLED) CFLAGS+= -fno-omit-frame-pointer .endif %BEFORE_DEPEND %OBJS %FILES.c %FILES.s %FILES.m %CLEAN %RULES .include "$S/conf/kern.post.mk" Index: head/sys/kern/link_elf.c =================================================================== --- head/sys/kern/link_elf.c (revision 279749) +++ head/sys/kern/link_elf.c (revision 279750) @@ -1,1656 +1,1656 @@ /*- * Copyright (c) 1998-2000 Doug Rabson * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ #include __FBSDID("$FreeBSD$"); #include "opt_ddb.h" #include "opt_gdb.h" #include #include #ifdef GPROF #include #endif #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #ifdef SPARSE_MAPPING #include #include #include #endif #include #include #include #ifdef DDB_CTF #include #endif #include "linker_if.h" #define MAXSEGS 4 typedef struct elf_file { struct linker_file lf; /* Common fields */ int preloaded; /* Was file pre-loaded */ caddr_t address; /* Relocation address */ #ifdef SPARSE_MAPPING vm_object_t object; /* VM object to hold file pages */ #endif Elf_Dyn *dynamic; /* Symbol table etc. */ Elf_Hashelt nbuckets; /* DT_HASH info */ Elf_Hashelt nchains; const Elf_Hashelt *buckets; const Elf_Hashelt *chains; caddr_t hash; caddr_t strtab; /* DT_STRTAB */ int strsz; /* DT_STRSZ */ const Elf_Sym *symtab; /* DT_SYMTAB */ Elf_Addr *got; /* DT_PLTGOT */ const Elf_Rel *pltrel; /* DT_JMPREL */ int pltrelsize; /* DT_PLTRELSZ */ const Elf_Rela *pltrela; /* DT_JMPREL */ int pltrelasize; /* DT_PLTRELSZ */ const Elf_Rel *rel; /* DT_REL */ int relsize; /* DT_RELSZ */ const Elf_Rela *rela; /* DT_RELA */ int relasize; /* DT_RELASZ */ caddr_t modptr; const Elf_Sym *ddbsymtab; /* The symbol table we are using */ long ddbsymcnt; /* Number of symbols */ caddr_t ddbstrtab; /* String table */ long ddbstrcnt; /* number of bytes in string table */ caddr_t symbase; /* malloc'ed symbold base */ caddr_t strbase; /* malloc'ed string base */ caddr_t ctftab; /* CTF table */ long ctfcnt; /* number of bytes in CTF table */ caddr_t ctfoff; /* CTF offset table */ caddr_t typoff; /* Type offset table */ long typlen; /* Number of type entries. */ Elf_Addr pcpu_start; /* Pre-relocation pcpu set start. */ Elf_Addr pcpu_stop; /* Pre-relocation pcpu set stop. */ Elf_Addr pcpu_base; /* Relocated pcpu set address. */ #ifdef VIMAGE Elf_Addr vnet_start; /* Pre-relocation vnet set start. */ Elf_Addr vnet_stop; /* Pre-relocation vnet set stop. */ Elf_Addr vnet_base; /* Relocated vnet set address. */ #endif #ifdef GDB struct link_map gdb; /* hooks for gdb */ #endif } *elf_file_t; struct elf_set { Elf_Addr es_start; Elf_Addr es_stop; Elf_Addr es_base; TAILQ_ENTRY(elf_set) es_link; }; TAILQ_HEAD(elf_set_head, elf_set); #include static int link_elf_link_common_finish(linker_file_t); static int link_elf_link_preload(linker_class_t cls, const char *, linker_file_t *); static int link_elf_link_preload_finish(linker_file_t); static int link_elf_load_file(linker_class_t, const char *, linker_file_t *); static int link_elf_lookup_symbol(linker_file_t, const char *, c_linker_sym_t *); static int link_elf_symbol_values(linker_file_t, c_linker_sym_t, linker_symval_t *); static int link_elf_search_symbol(linker_file_t, caddr_t, c_linker_sym_t *, long *); static void link_elf_unload_file(linker_file_t); static void link_elf_unload_preload(linker_file_t); static int link_elf_lookup_set(linker_file_t, const char *, void ***, void ***, int *); static int link_elf_each_function_name(linker_file_t, int (*)(const char *, void *), void *); static int link_elf_each_function_nameval(linker_file_t, linker_function_nameval_callback_t, void *); static void link_elf_reloc_local(linker_file_t); static long link_elf_symtab_get(linker_file_t, const Elf_Sym **); static long link_elf_strtab_get(linker_file_t, caddr_t *); static Elf_Addr elf_lookup(linker_file_t, Elf_Size, int); static kobj_method_t link_elf_methods[] = { KOBJMETHOD(linker_lookup_symbol, link_elf_lookup_symbol), KOBJMETHOD(linker_symbol_values, link_elf_symbol_values), KOBJMETHOD(linker_search_symbol, link_elf_search_symbol), KOBJMETHOD(linker_unload, link_elf_unload_file), KOBJMETHOD(linker_load_file, link_elf_load_file), KOBJMETHOD(linker_link_preload, link_elf_link_preload), KOBJMETHOD(linker_link_preload_finish, link_elf_link_preload_finish), KOBJMETHOD(linker_lookup_set, link_elf_lookup_set), KOBJMETHOD(linker_each_function_name, link_elf_each_function_name), KOBJMETHOD(linker_each_function_nameval, link_elf_each_function_nameval), KOBJMETHOD(linker_ctf_get, link_elf_ctf_get), KOBJMETHOD(linker_symtab_get, link_elf_symtab_get), KOBJMETHOD(linker_strtab_get, link_elf_strtab_get), { 0, 0 } }; static struct linker_class link_elf_class = { #if ELF_TARG_CLASS == ELFCLASS32 "elf32", #else "elf64", #endif link_elf_methods, sizeof(struct elf_file) }; static int parse_dynamic(elf_file_t); static int relocate_file(elf_file_t); static int link_elf_preload_parse_symbols(elf_file_t); static struct elf_set_head set_pcpu_list; #ifdef VIMAGE static struct elf_set_head set_vnet_list; #endif static void elf_set_add(struct elf_set_head *list, Elf_Addr start, Elf_Addr stop, Elf_Addr base) { struct elf_set *set, *iter; set = malloc(sizeof(*set), M_LINKER, M_WAITOK); set->es_start = start; set->es_stop = stop; set->es_base = base; TAILQ_FOREACH(iter, list, es_link) { KASSERT((set->es_start < iter->es_start && set->es_stop < iter->es_stop) || (set->es_start > iter->es_start && set->es_stop > iter->es_stop), ("linker sets intersection: to insert: 0x%jx-0x%jx; inserted: 0x%jx-0x%jx", (uintmax_t)set->es_start, (uintmax_t)set->es_stop, (uintmax_t)iter->es_start, (uintmax_t)iter->es_stop)); if (iter->es_start > set->es_start) { TAILQ_INSERT_BEFORE(iter, set, es_link); break; } } if (iter == NULL) TAILQ_INSERT_TAIL(list, set, es_link); } static int elf_set_find(struct elf_set_head *list, Elf_Addr addr, Elf_Addr *start, Elf_Addr *base) { struct elf_set *set; TAILQ_FOREACH(set, list, es_link) { if (addr < set->es_start) return (0); if (addr < set->es_stop) { *start = set->es_start; *base = set->es_base; return (1); } } return (0); } static void elf_set_delete(struct elf_set_head *list, Elf_Addr start) { struct elf_set *set; TAILQ_FOREACH(set, list, es_link) { if (start < set->es_start) break; if (start == set->es_start) { TAILQ_REMOVE(list, set, es_link); free(set, M_LINKER); return; } } KASSERT(0, ("deleting unknown linker set (start = 0x%jx)", (uintmax_t)start)); } #ifdef GDB static void r_debug_state(struct r_debug *, struct link_map *); /* * A list of loaded modules for GDB to use for loading symbols. */ struct r_debug r_debug; #define GDB_STATE(s) do { \ r_debug.r_state = s; r_debug_state(NULL, NULL); \ } while (0) /* * Function for the debugger to set a breakpoint on to gain control. */ static void r_debug_state(struct r_debug *dummy_one __unused, struct link_map *dummy_two __unused) { } static void link_elf_add_gdb(struct link_map *l) { struct link_map *prev; l->l_next = NULL; if (r_debug.r_map == NULL) { /* Add first. */ l->l_prev = NULL; r_debug.r_map = l; } else { /* Append to list. */ for (prev = r_debug.r_map; prev->l_next != NULL; prev = prev->l_next) ; l->l_prev = prev; prev->l_next = l; } } static void link_elf_delete_gdb(struct link_map *l) { if (l->l_prev == NULL) { /* Remove first. */ if ((r_debug.r_map = l->l_next) != NULL) l->l_next->l_prev = NULL; } else { /* Remove any but first. */ if ((l->l_prev->l_next = l->l_next) != NULL) l->l_next->l_prev = l->l_prev; } } #endif /* GDB */ /* * The kernel symbol table starts here. */ extern struct _dynamic _DYNAMIC; static void link_elf_error(const char *filename, const char *s) { if (filename == NULL) printf("kldload: %s\n", s); else printf("kldload: %s: %s\n", filename, s); } static void link_elf_invoke_ctors(caddr_t addr, size_t size) { void (**ctor)(void); size_t i, cnt; if (addr == NULL || size == 0) return; cnt = size / sizeof(*ctor); ctor = (void *)addr; for (i = 0; i < cnt; i++) { if (ctor[i] != NULL) (*ctor[i])(); } } /* * Actions performed after linking/loading both the preloaded kernel and any * modules; whether preloaded or dynamicly loaded. */ static int link_elf_link_common_finish(linker_file_t lf) { #ifdef GDB elf_file_t ef = (elf_file_t)lf; char *newfilename; #endif int error; /* Notify MD code that a module is being loaded. */ error = elf_cpu_load_file(lf); if (error != 0) return (error); #ifdef GDB GDB_STATE(RT_ADD); ef->gdb.l_addr = lf->address; newfilename = malloc(strlen(lf->filename) + 1, M_LINKER, M_WAITOK); strcpy(newfilename, lf->filename); ef->gdb.l_name = newfilename; ef->gdb.l_ld = ef->dynamic; link_elf_add_gdb(&ef->gdb); GDB_STATE(RT_CONSISTENT); #endif /* Invoke .ctors */ link_elf_invoke_ctors(lf->ctors_addr, lf->ctors_size); return (0); } extern vm_offset_t __startkernel; static void link_elf_init(void* arg) { Elf_Dyn *dp; Elf_Addr *ctors_addrp; Elf_Size *ctors_sizep; caddr_t modptr, baseptr, sizeptr; elf_file_t ef; char *modname; linker_add_class(&link_elf_class); dp = (Elf_Dyn *)&_DYNAMIC; modname = NULL; modptr = preload_search_by_type("elf" __XSTRING(__ELF_WORD_SIZE) " kernel"); if (modptr == NULL) modptr = preload_search_by_type("elf kernel"); if (modptr != NULL) modname = (char *)preload_search_info(modptr, MODINFO_NAME); if (modname == NULL) modname = "kernel"; linker_kernel_file = linker_make_file(modname, &link_elf_class); if (linker_kernel_file == NULL) panic("%s: Can't create linker structures for kernel", __func__); ef = (elf_file_t) linker_kernel_file; ef->preloaded = 1; -#ifdef __powerpc64__ +#ifdef __powerpc__ ef->address = (caddr_t) (__startkernel - KERNBASE); #else ef->address = 0; #endif #ifdef SPARSE_MAPPING ef->object = 0; #endif ef->dynamic = dp; if (dp != NULL) parse_dynamic(ef); linker_kernel_file->address += KERNBASE; linker_kernel_file->size = -(intptr_t)linker_kernel_file->address; if (modptr != NULL) { ef->modptr = modptr; baseptr = preload_search_info(modptr, MODINFO_ADDR); if (baseptr != NULL) linker_kernel_file->address = *(caddr_t *)baseptr; sizeptr = preload_search_info(modptr, MODINFO_SIZE); if (sizeptr != NULL) linker_kernel_file->size = *(size_t *)sizeptr; ctors_addrp = (Elf_Addr *)preload_search_info(modptr, MODINFO_METADATA | MODINFOMD_CTORS_ADDR); ctors_sizep = (Elf_Size *)preload_search_info(modptr, MODINFO_METADATA | MODINFOMD_CTORS_SIZE); if (ctors_addrp != NULL && ctors_sizep != NULL) { linker_kernel_file->ctors_addr = ef->address + *ctors_addrp; linker_kernel_file->ctors_size = *ctors_sizep; } } (void)link_elf_preload_parse_symbols(ef); #ifdef GDB r_debug.r_map = NULL; r_debug.r_brk = r_debug_state; r_debug.r_state = RT_CONSISTENT; #endif (void)link_elf_link_common_finish(linker_kernel_file); linker_kernel_file->flags |= LINKER_FILE_LINKED; TAILQ_INIT(&set_pcpu_list); #ifdef VIMAGE TAILQ_INIT(&set_vnet_list); #endif } SYSINIT(link_elf, SI_SUB_KLD, SI_ORDER_THIRD, link_elf_init, 0); static int link_elf_preload_parse_symbols(elf_file_t ef) { caddr_t pointer; caddr_t ssym, esym, base; caddr_t strtab; int strcnt; Elf_Sym *symtab; int symcnt; if (ef->modptr == NULL) return (0); pointer = preload_search_info(ef->modptr, MODINFO_METADATA | MODINFOMD_SSYM); if (pointer == NULL) return (0); ssym = *(caddr_t *)pointer; pointer = preload_search_info(ef->modptr, MODINFO_METADATA | MODINFOMD_ESYM); if (pointer == NULL) return (0); esym = *(caddr_t *)pointer; base = ssym; symcnt = *(long *)base; base += sizeof(long); symtab = (Elf_Sym *)base; base += roundup(symcnt, sizeof(long)); if (base > esym || base < ssym) { printf("Symbols are corrupt!\n"); return (EINVAL); } strcnt = *(long *)base; base += sizeof(long); strtab = base; base += roundup(strcnt, sizeof(long)); if (base > esym || base < ssym) { printf("Symbols are corrupt!\n"); return (EINVAL); } ef->ddbsymtab = symtab; ef->ddbsymcnt = symcnt / sizeof(Elf_Sym); ef->ddbstrtab = strtab; ef->ddbstrcnt = strcnt; return (0); } static int parse_dynamic(elf_file_t ef) { Elf_Dyn *dp; int plttype = DT_REL; for (dp = ef->dynamic; dp->d_tag != DT_NULL; dp++) { switch (dp->d_tag) { case DT_HASH: { /* From src/libexec/rtld-elf/rtld.c */ const Elf_Hashelt *hashtab = (const Elf_Hashelt *) (ef->address + dp->d_un.d_ptr); ef->nbuckets = hashtab[0]; ef->nchains = hashtab[1]; ef->buckets = hashtab + 2; ef->chains = ef->buckets + ef->nbuckets; break; } case DT_STRTAB: ef->strtab = (caddr_t) (ef->address + dp->d_un.d_ptr); break; case DT_STRSZ: ef->strsz = dp->d_un.d_val; break; case DT_SYMTAB: ef->symtab = (Elf_Sym*) (ef->address + dp->d_un.d_ptr); break; case DT_SYMENT: if (dp->d_un.d_val != sizeof(Elf_Sym)) return (ENOEXEC); break; case DT_PLTGOT: ef->got = (Elf_Addr *) (ef->address + dp->d_un.d_ptr); break; case DT_REL: ef->rel = (const Elf_Rel *) (ef->address + dp->d_un.d_ptr); break; case DT_RELSZ: ef->relsize = dp->d_un.d_val; break; case DT_RELENT: if (dp->d_un.d_val != sizeof(Elf_Rel)) return (ENOEXEC); break; case DT_JMPREL: ef->pltrel = (const Elf_Rel *) (ef->address + dp->d_un.d_ptr); break; case DT_PLTRELSZ: ef->pltrelsize = dp->d_un.d_val; break; case DT_RELA: ef->rela = (const Elf_Rela *) (ef->address + dp->d_un.d_ptr); break; case DT_RELASZ: ef->relasize = dp->d_un.d_val; break; case DT_RELAENT: if (dp->d_un.d_val != sizeof(Elf_Rela)) return (ENOEXEC); break; case DT_PLTREL: plttype = dp->d_un.d_val; if (plttype != DT_REL && plttype != DT_RELA) return (ENOEXEC); break; #ifdef GDB case DT_DEBUG: dp->d_un.d_ptr = (Elf_Addr)&r_debug; break; #endif } } if (plttype == DT_RELA) { ef->pltrela = (const Elf_Rela *)ef->pltrel; ef->pltrel = NULL; ef->pltrelasize = ef->pltrelsize; ef->pltrelsize = 0; } ef->ddbsymtab = ef->symtab; ef->ddbsymcnt = ef->nchains; ef->ddbstrtab = ef->strtab; ef->ddbstrcnt = ef->strsz; return (0); } static int parse_dpcpu(elf_file_t ef) { int count; int error; ef->pcpu_start = 0; ef->pcpu_stop = 0; error = link_elf_lookup_set(&ef->lf, "pcpu", (void ***)&ef->pcpu_start, (void ***)&ef->pcpu_stop, &count); /* Error just means there is no pcpu set to relocate. */ if (error != 0) return (0); count *= sizeof(void *); /* * Allocate space in the primary pcpu area. Copy in our * initialization from the data section and then initialize * all per-cpu storage from that. */ ef->pcpu_base = (Elf_Addr)(uintptr_t)dpcpu_alloc(count); if (ef->pcpu_base == 0) return (ENOSPC); memcpy((void *)ef->pcpu_base, (void *)ef->pcpu_start, count); dpcpu_copy((void *)ef->pcpu_base, count); elf_set_add(&set_pcpu_list, ef->pcpu_start, ef->pcpu_stop, ef->pcpu_base); return (0); } #ifdef VIMAGE static int parse_vnet(elf_file_t ef) { int count; int error; ef->vnet_start = 0; ef->vnet_stop = 0; error = link_elf_lookup_set(&ef->lf, "vnet", (void ***)&ef->vnet_start, (void ***)&ef->vnet_stop, &count); /* Error just means there is no vnet data set to relocate. */ if (error != 0) return (0); count *= sizeof(void *); /* * Allocate space in the primary vnet area. Copy in our * initialization from the data section and then initialize * all per-vnet storage from that. */ ef->vnet_base = (Elf_Addr)(uintptr_t)vnet_data_alloc(count); if (ef->vnet_base == 0) return (ENOSPC); memcpy((void *)ef->vnet_base, (void *)ef->vnet_start, count); vnet_data_copy((void *)ef->vnet_base, count); elf_set_add(&set_vnet_list, ef->vnet_start, ef->vnet_stop, ef->vnet_base); return (0); } #endif static int link_elf_link_preload(linker_class_t cls, const char* filename, linker_file_t *result) { Elf_Addr *ctors_addrp; Elf_Size *ctors_sizep; caddr_t modptr, baseptr, sizeptr, dynptr; char *type; elf_file_t ef; linker_file_t lf; int error; vm_offset_t dp; /* Look to see if we have the file preloaded */ modptr = preload_search_by_name(filename); if (modptr == NULL) return (ENOENT); type = (char *)preload_search_info(modptr, MODINFO_TYPE); baseptr = preload_search_info(modptr, MODINFO_ADDR); sizeptr = preload_search_info(modptr, MODINFO_SIZE); dynptr = preload_search_info(modptr, MODINFO_METADATA | MODINFOMD_DYNAMIC); if (type == NULL || (strcmp(type, "elf" __XSTRING(__ELF_WORD_SIZE) " module") != 0 && strcmp(type, "elf module") != 0)) return (EFTYPE); if (baseptr == NULL || sizeptr == NULL || dynptr == NULL) return (EINVAL); lf = linker_make_file(filename, &link_elf_class); if (lf == NULL) return (ENOMEM); ef = (elf_file_t) lf; ef->preloaded = 1; ef->modptr = modptr; ef->address = *(caddr_t *)baseptr; #ifdef SPARSE_MAPPING ef->object = 0; #endif dp = (vm_offset_t)ef->address + *(vm_offset_t *)dynptr; ef->dynamic = (Elf_Dyn *)dp; lf->address = ef->address; lf->size = *(size_t *)sizeptr; ctors_addrp = (Elf_Addr *)preload_search_info(modptr, MODINFO_METADATA | MODINFOMD_CTORS_ADDR); ctors_sizep = (Elf_Size *)preload_search_info(modptr, MODINFO_METADATA | MODINFOMD_CTORS_SIZE); if (ctors_addrp != NULL && ctors_sizep != NULL) { lf->ctors_addr = ef->address + *ctors_addrp; lf->ctors_size = *ctors_sizep; } error = parse_dynamic(ef); if (error == 0) error = parse_dpcpu(ef); #ifdef VIMAGE if (error == 0) error = parse_vnet(ef); #endif if (error != 0) { linker_file_unload(lf, LINKER_UNLOAD_FORCE); return (error); } link_elf_reloc_local(lf); *result = lf; return (0); } static int link_elf_link_preload_finish(linker_file_t lf) { elf_file_t ef; int error; ef = (elf_file_t) lf; error = relocate_file(ef); if (error != 0) return (error); (void)link_elf_preload_parse_symbols(ef); return (link_elf_link_common_finish(lf)); } static int link_elf_load_file(linker_class_t cls, const char* filename, linker_file_t* result) { struct nameidata nd; struct thread* td = curthread; /* XXX */ Elf_Ehdr *hdr; caddr_t firstpage; int nbytes, i; Elf_Phdr *phdr; Elf_Phdr *phlimit; Elf_Phdr *segs[MAXSEGS]; int nsegs; Elf_Phdr *phdyn; Elf_Phdr *phphdr; caddr_t mapbase; size_t mapsize; Elf_Off base_offset; Elf_Addr base_vaddr; Elf_Addr base_vlimit; int error = 0; ssize_t resid; int flags; elf_file_t ef; linker_file_t lf; Elf_Shdr *shdr; int symtabindex; int symstrindex; int shstrindex; int symcnt; int strcnt; char *shstrs; shdr = NULL; lf = NULL; shstrs = NULL; NDINIT(&nd, LOOKUP, FOLLOW, UIO_SYSSPACE, filename, td); flags = FREAD; error = vn_open(&nd, &flags, 0, NULL); if (error != 0) return (error); NDFREE(&nd, NDF_ONLY_PNBUF); if (nd.ni_vp->v_type != VREG) { error = ENOEXEC; firstpage = NULL; goto out; } #ifdef MAC error = mac_kld_check_load(curthread->td_ucred, nd.ni_vp); if (error != 0) { firstpage = NULL; goto out; } #endif /* * Read the elf header from the file. */ firstpage = malloc(PAGE_SIZE, M_LINKER, M_WAITOK); hdr = (Elf_Ehdr *)firstpage; error = vn_rdwr(UIO_READ, nd.ni_vp, firstpage, PAGE_SIZE, 0, UIO_SYSSPACE, IO_NODELOCKED, td->td_ucred, NOCRED, &resid, td); nbytes = PAGE_SIZE - resid; if (error != 0) goto out; if (!IS_ELF(*hdr)) { error = ENOEXEC; goto out; } if (hdr->e_ident[EI_CLASS] != ELF_TARG_CLASS || hdr->e_ident[EI_DATA] != ELF_TARG_DATA) { link_elf_error(filename, "Unsupported file layout"); error = ENOEXEC; goto out; } if (hdr->e_ident[EI_VERSION] != EV_CURRENT || hdr->e_version != EV_CURRENT) { link_elf_error(filename, "Unsupported file version"); error = ENOEXEC; goto out; } if (hdr->e_type != ET_EXEC && hdr->e_type != ET_DYN) { error = ENOSYS; goto out; } if (hdr->e_machine != ELF_TARG_MACH) { link_elf_error(filename, "Unsupported machine"); error = ENOEXEC; goto out; } /* * We rely on the program header being in the first page. * This is not strictly required by the ABI specification, but * it seems to always true in practice. And, it simplifies * things considerably. */ if (!((hdr->e_phentsize == sizeof(Elf_Phdr)) && (hdr->e_phoff + hdr->e_phnum*sizeof(Elf_Phdr) <= PAGE_SIZE) && (hdr->e_phoff + hdr->e_phnum*sizeof(Elf_Phdr) <= nbytes))) link_elf_error(filename, "Unreadable program headers"); /* * Scan the program header entries, and save key information. * * We rely on there being exactly two load segments, text and data, * in that order. */ phdr = (Elf_Phdr *) (firstpage + hdr->e_phoff); phlimit = phdr + hdr->e_phnum; nsegs = 0; phdyn = NULL; phphdr = NULL; while (phdr < phlimit) { switch (phdr->p_type) { case PT_LOAD: if (nsegs == MAXSEGS) { link_elf_error(filename, "Too many sections"); error = ENOEXEC; goto out; } /* * XXX: We just trust they come in right order ?? */ segs[nsegs] = phdr; ++nsegs; break; case PT_PHDR: phphdr = phdr; break; case PT_DYNAMIC: phdyn = phdr; break; case PT_INTERP: error = ENOSYS; goto out; } ++phdr; } if (phdyn == NULL) { link_elf_error(filename, "Object is not dynamically-linked"); error = ENOEXEC; goto out; } if (nsegs == 0) { link_elf_error(filename, "No sections"); error = ENOEXEC; goto out; } /* * Allocate the entire address space of the object, to stake * out our contiguous region, and to establish the base * address for relocation. */ base_offset = trunc_page(segs[0]->p_offset); base_vaddr = trunc_page(segs[0]->p_vaddr); base_vlimit = round_page(segs[nsegs - 1]->p_vaddr + segs[nsegs - 1]->p_memsz); mapsize = base_vlimit - base_vaddr; lf = linker_make_file(filename, &link_elf_class); if (lf == NULL) { error = ENOMEM; goto out; } ef = (elf_file_t) lf; #ifdef SPARSE_MAPPING ef->object = vm_object_allocate(OBJT_DEFAULT, mapsize >> PAGE_SHIFT); if (ef->object == NULL) { error = ENOMEM; goto out; } ef->address = (caddr_t) vm_map_min(kernel_map); error = vm_map_find(kernel_map, ef->object, 0, (vm_offset_t *) &ef->address, mapsize, 0, VMFS_OPTIMAL_SPACE, VM_PROT_ALL, VM_PROT_ALL, 0); if (error != 0) { vm_object_deallocate(ef->object); ef->object = 0; goto out; } #else ef->address = malloc(mapsize, M_LINKER, M_WAITOK); #endif mapbase = ef->address; /* * Read the text and data sections and zero the bss. */ for (i = 0; i < nsegs; i++) { caddr_t segbase = mapbase + segs[i]->p_vaddr - base_vaddr; error = vn_rdwr(UIO_READ, nd.ni_vp, segbase, segs[i]->p_filesz, segs[i]->p_offset, UIO_SYSSPACE, IO_NODELOCKED, td->td_ucred, NOCRED, &resid, td); if (error != 0) goto out; bzero(segbase + segs[i]->p_filesz, segs[i]->p_memsz - segs[i]->p_filesz); #ifdef SPARSE_MAPPING /* * Wire down the pages */ error = vm_map_wire(kernel_map, (vm_offset_t) segbase, (vm_offset_t) segbase + segs[i]->p_memsz, VM_MAP_WIRE_SYSTEM|VM_MAP_WIRE_NOHOLES); if (error != KERN_SUCCESS) { error = ENOMEM; goto out; } #endif } #ifdef GPROF /* Update profiling information with the new text segment. */ mtx_lock(&Giant); kmupetext((uintfptr_t)(mapbase + segs[0]->p_vaddr - base_vaddr + segs[0]->p_memsz)); mtx_unlock(&Giant); #endif ef->dynamic = (Elf_Dyn *) (mapbase + phdyn->p_vaddr - base_vaddr); lf->address = ef->address; lf->size = mapsize; error = parse_dynamic(ef); if (error != 0) goto out; error = parse_dpcpu(ef); if (error != 0) goto out; #ifdef VIMAGE error = parse_vnet(ef); if (error != 0) goto out; #endif link_elf_reloc_local(lf); VOP_UNLOCK(nd.ni_vp, 0); error = linker_load_dependencies(lf); vn_lock(nd.ni_vp, LK_EXCLUSIVE | LK_RETRY); if (error != 0) goto out; error = relocate_file(ef); if (error != 0) goto out; /* * Try and load the symbol table if it's present. (you can * strip it!) */ nbytes = hdr->e_shnum * hdr->e_shentsize; if (nbytes == 0 || hdr->e_shoff == 0) goto nosyms; shdr = malloc(nbytes, M_LINKER, M_WAITOK | M_ZERO); error = vn_rdwr(UIO_READ, nd.ni_vp, (caddr_t)shdr, nbytes, hdr->e_shoff, UIO_SYSSPACE, IO_NODELOCKED, td->td_ucred, NOCRED, &resid, td); if (error != 0) goto out; /* Read section string table */ shstrindex = hdr->e_shstrndx; if (shstrindex != 0 && shdr[shstrindex].sh_type == SHT_STRTAB && shdr[shstrindex].sh_size != 0) { nbytes = shdr[shstrindex].sh_size; shstrs = malloc(nbytes, M_LINKER, M_WAITOK | M_ZERO); error = vn_rdwr(UIO_READ, nd.ni_vp, (caddr_t)shstrs, nbytes, shdr[shstrindex].sh_offset, UIO_SYSSPACE, IO_NODELOCKED, td->td_ucred, NOCRED, &resid, td); if (error) goto out; } symtabindex = -1; symstrindex = -1; for (i = 0; i < hdr->e_shnum; i++) { if (shdr[i].sh_type == SHT_SYMTAB) { symtabindex = i; symstrindex = shdr[i].sh_link; } else if (shstrs != NULL && shdr[i].sh_name != 0 && strcmp(shstrs + shdr[i].sh_name, ".ctors") == 0) { /* Record relocated address and size of .ctors. */ lf->ctors_addr = mapbase + shdr[i].sh_addr - base_vaddr; lf->ctors_size = shdr[i].sh_size; } } if (symtabindex < 0 || symstrindex < 0) goto nosyms; symcnt = shdr[symtabindex].sh_size; ef->symbase = malloc(symcnt, M_LINKER, M_WAITOK); strcnt = shdr[symstrindex].sh_size; ef->strbase = malloc(strcnt, M_LINKER, M_WAITOK); error = vn_rdwr(UIO_READ, nd.ni_vp, ef->symbase, symcnt, shdr[symtabindex].sh_offset, UIO_SYSSPACE, IO_NODELOCKED, td->td_ucred, NOCRED, &resid, td); if (error != 0) goto out; error = vn_rdwr(UIO_READ, nd.ni_vp, ef->strbase, strcnt, shdr[symstrindex].sh_offset, UIO_SYSSPACE, IO_NODELOCKED, td->td_ucred, NOCRED, &resid, td); if (error != 0) goto out; ef->ddbsymcnt = symcnt / sizeof(Elf_Sym); ef->ddbsymtab = (const Elf_Sym *)ef->symbase; ef->ddbstrcnt = strcnt; ef->ddbstrtab = ef->strbase; nosyms: error = link_elf_link_common_finish(lf); if (error != 0) goto out; *result = lf; out: VOP_UNLOCK(nd.ni_vp, 0); vn_close(nd.ni_vp, FREAD, td->td_ucred, td); if (error != 0 && lf != NULL) linker_file_unload(lf, LINKER_UNLOAD_FORCE); if (shdr != NULL) free(shdr, M_LINKER); if (firstpage != NULL) free(firstpage, M_LINKER); if (shstrs != NULL) free(shstrs, M_LINKER); return (error); } Elf_Addr elf_relocaddr(linker_file_t lf, Elf_Addr x) { elf_file_t ef; ef = (elf_file_t)lf; if (x >= ef->pcpu_start && x < ef->pcpu_stop) return ((x - ef->pcpu_start) + ef->pcpu_base); #ifdef VIMAGE if (x >= ef->vnet_start && x < ef->vnet_stop) return ((x - ef->vnet_start) + ef->vnet_base); #endif return (x); } static void link_elf_unload_file(linker_file_t file) { elf_file_t ef = (elf_file_t) file; if (ef->pcpu_base != 0) { dpcpu_free((void *)ef->pcpu_base, ef->pcpu_stop - ef->pcpu_start); elf_set_delete(&set_pcpu_list, ef->pcpu_start); } #ifdef VIMAGE if (ef->vnet_base != 0) { vnet_data_free((void *)ef->vnet_base, ef->vnet_stop - ef->vnet_start); elf_set_delete(&set_vnet_list, ef->vnet_start); } #endif #ifdef GDB if (ef->gdb.l_ld != NULL) { GDB_STATE(RT_DELETE); free((void *)(uintptr_t)ef->gdb.l_name, M_LINKER); link_elf_delete_gdb(&ef->gdb); GDB_STATE(RT_CONSISTENT); } #endif /* Notify MD code that a module is being unloaded. */ elf_cpu_unload_file(file); if (ef->preloaded) { link_elf_unload_preload(file); return; } #ifdef SPARSE_MAPPING if (ef->object != NULL) { vm_map_remove(kernel_map, (vm_offset_t) ef->address, (vm_offset_t) ef->address + (ef->object->size << PAGE_SHIFT)); } #else if (ef->address != NULL) free(ef->address, M_LINKER); #endif if (ef->symbase != NULL) free(ef->symbase, M_LINKER); if (ef->strbase != NULL) free(ef->strbase, M_LINKER); if (ef->ctftab != NULL) free(ef->ctftab, M_LINKER); if (ef->ctfoff != NULL) free(ef->ctfoff, M_LINKER); if (ef->typoff != NULL) free(ef->typoff, M_LINKER); } static void link_elf_unload_preload(linker_file_t file) { if (file->filename != NULL) preload_delete_name(file->filename); } static const char * symbol_name(elf_file_t ef, Elf_Size r_info) { const Elf_Sym *ref; if (ELF_R_SYM(r_info)) { ref = ef->symtab + ELF_R_SYM(r_info); return (ef->strtab + ref->st_name); } return (NULL); } static int relocate_file(elf_file_t ef) { const Elf_Rel *rellim; const Elf_Rel *rel; const Elf_Rela *relalim; const Elf_Rela *rela; const char *symname; /* Perform relocations without addend if there are any: */ rel = ef->rel; if (rel != NULL) { rellim = (const Elf_Rel *) ((const char *)ef->rel + ef->relsize); while (rel < rellim) { if (elf_reloc(&ef->lf, (Elf_Addr)ef->address, rel, ELF_RELOC_REL, elf_lookup)) { symname = symbol_name(ef, rel->r_info); printf("link_elf: symbol %s undefined\n", symname); return (ENOENT); } rel++; } } /* Perform relocations with addend if there are any: */ rela = ef->rela; if (rela != NULL) { relalim = (const Elf_Rela *) ((const char *)ef->rela + ef->relasize); while (rela < relalim) { if (elf_reloc(&ef->lf, (Elf_Addr)ef->address, rela, ELF_RELOC_RELA, elf_lookup)) { symname = symbol_name(ef, rela->r_info); printf("link_elf: symbol %s undefined\n", symname); return (ENOENT); } rela++; } } /* Perform PLT relocations without addend if there are any: */ rel = ef->pltrel; if (rel != NULL) { rellim = (const Elf_Rel *) ((const char *)ef->pltrel + ef->pltrelsize); while (rel < rellim) { if (elf_reloc(&ef->lf, (Elf_Addr)ef->address, rel, ELF_RELOC_REL, elf_lookup)) { symname = symbol_name(ef, rel->r_info); printf("link_elf: symbol %s undefined\n", symname); return (ENOENT); } rel++; } } /* Perform relocations with addend if there are any: */ rela = ef->pltrela; if (rela != NULL) { relalim = (const Elf_Rela *) ((const char *)ef->pltrela + ef->pltrelasize); while (rela < relalim) { if (elf_reloc(&ef->lf, (Elf_Addr)ef->address, rela, ELF_RELOC_RELA, elf_lookup)) { symname = symbol_name(ef, rela->r_info); printf("link_elf: symbol %s undefined\n", symname); return (ENOENT); } rela++; } } return (0); } /* * Hash function for symbol table lookup. Don't even think about changing * this. It is specified by the System V ABI. */ static unsigned long elf_hash(const char *name) { const unsigned char *p = (const unsigned char *) name; unsigned long h = 0; unsigned long g; while (*p != '\0') { h = (h << 4) + *p++; if ((g = h & 0xf0000000) != 0) h ^= g >> 24; h &= ~g; } return (h); } static int link_elf_lookup_symbol(linker_file_t lf, const char* name, c_linker_sym_t* sym) { elf_file_t ef = (elf_file_t) lf; unsigned long symnum; const Elf_Sym* symp; const char *strp; unsigned long hash; int i; /* If we don't have a hash, bail. */ if (ef->buckets == NULL || ef->nbuckets == 0) { printf("link_elf_lookup_symbol: missing symbol hash table\n"); return (ENOENT); } /* First, search hashed global symbols */ hash = elf_hash(name); symnum = ef->buckets[hash % ef->nbuckets]; while (symnum != STN_UNDEF) { if (symnum >= ef->nchains) { printf("%s: corrupt symbol table\n", __func__); return (ENOENT); } symp = ef->symtab + symnum; if (symp->st_name == 0) { printf("%s: corrupt symbol table\n", __func__); return (ENOENT); } strp = ef->strtab + symp->st_name; if (strcmp(name, strp) == 0) { if (symp->st_shndx != SHN_UNDEF || (symp->st_value != 0 && ELF_ST_TYPE(symp->st_info) == STT_FUNC)) { *sym = (c_linker_sym_t) symp; return (0); } return (ENOENT); } symnum = ef->chains[symnum]; } /* If we have not found it, look at the full table (if loaded) */ if (ef->symtab == ef->ddbsymtab) return (ENOENT); /* Exhaustive search */ for (i = 0, symp = ef->ddbsymtab; i < ef->ddbsymcnt; i++, symp++) { strp = ef->ddbstrtab + symp->st_name; if (strcmp(name, strp) == 0) { if (symp->st_shndx != SHN_UNDEF || (symp->st_value != 0 && ELF_ST_TYPE(symp->st_info) == STT_FUNC)) { *sym = (c_linker_sym_t) symp; return (0); } return (ENOENT); } } return (ENOENT); } static int link_elf_symbol_values(linker_file_t lf, c_linker_sym_t sym, linker_symval_t *symval) { elf_file_t ef = (elf_file_t) lf; const Elf_Sym* es = (const Elf_Sym*) sym; if (es >= ef->symtab && es < (ef->symtab + ef->nchains)) { symval->name = ef->strtab + es->st_name; symval->value = (caddr_t) ef->address + es->st_value; symval->size = es->st_size; return (0); } if (ef->symtab == ef->ddbsymtab) return (ENOENT); if (es >= ef->ddbsymtab && es < (ef->ddbsymtab + ef->ddbsymcnt)) { symval->name = ef->ddbstrtab + es->st_name; symval->value = (caddr_t) ef->address + es->st_value; symval->size = es->st_size; return (0); } return (ENOENT); } static int link_elf_search_symbol(linker_file_t lf, caddr_t value, c_linker_sym_t *sym, long *diffp) { elf_file_t ef = (elf_file_t) lf; u_long off = (uintptr_t) (void *) value; u_long diff = off; u_long st_value; const Elf_Sym* es; const Elf_Sym* best = 0; int i; for (i = 0, es = ef->ddbsymtab; i < ef->ddbsymcnt; i++, es++) { if (es->st_name == 0) continue; st_value = es->st_value + (uintptr_t) (void *) ef->address; if (off >= st_value) { if (off - st_value < diff) { diff = off - st_value; best = es; if (diff == 0) break; } else if (off - st_value == diff) { best = es; } } } if (best == 0) *diffp = off; else *diffp = diff; *sym = (c_linker_sym_t) best; return (0); } /* * Look up a linker set on an ELF system. */ static int link_elf_lookup_set(linker_file_t lf, const char *name, void ***startp, void ***stopp, int *countp) { c_linker_sym_t sym; linker_symval_t symval; char *setsym; void **start, **stop; int len, error = 0, count; len = strlen(name) + sizeof("__start_set_"); /* sizeof includes \0 */ setsym = malloc(len, M_LINKER, M_WAITOK); /* get address of first entry */ snprintf(setsym, len, "%s%s", "__start_set_", name); error = link_elf_lookup_symbol(lf, setsym, &sym); if (error != 0) goto out; link_elf_symbol_values(lf, sym, &symval); if (symval.value == 0) { error = ESRCH; goto out; } start = (void **)symval.value; /* get address of last entry */ snprintf(setsym, len, "%s%s", "__stop_set_", name); error = link_elf_lookup_symbol(lf, setsym, &sym); if (error != 0) goto out; link_elf_symbol_values(lf, sym, &symval); if (symval.value == 0) { error = ESRCH; goto out; } stop = (void **)symval.value; /* and the number of entries */ count = stop - start; /* and copy out */ if (startp != NULL) *startp = start; if (stopp != NULL) *stopp = stop; if (countp != NULL) *countp = count; out: free(setsym, M_LINKER); return (error); } static int link_elf_each_function_name(linker_file_t file, int (*callback)(const char *, void *), void *opaque) { elf_file_t ef = (elf_file_t)file; const Elf_Sym *symp; int i, error; /* Exhaustive search */ for (i = 0, symp = ef->ddbsymtab; i < ef->ddbsymcnt; i++, symp++) { if (symp->st_value != 0 && ELF_ST_TYPE(symp->st_info) == STT_FUNC) { error = callback(ef->ddbstrtab + symp->st_name, opaque); if (error != 0) return (error); } } return (0); } static int link_elf_each_function_nameval(linker_file_t file, linker_function_nameval_callback_t callback, void *opaque) { linker_symval_t symval; elf_file_t ef = (elf_file_t)file; const Elf_Sym* symp; int i, error; /* Exhaustive search */ for (i = 0, symp = ef->ddbsymtab; i < ef->ddbsymcnt; i++, symp++) { if (symp->st_value != 0 && ELF_ST_TYPE(symp->st_info) == STT_FUNC) { error = link_elf_symbol_values(file, (c_linker_sym_t) symp, &symval); if (error != 0) return (error); error = callback(file, i, &symval, opaque); if (error != 0) return (error); } } return (0); } const Elf_Sym * elf_get_sym(linker_file_t lf, Elf_Size symidx) { elf_file_t ef = (elf_file_t)lf; if (symidx >= ef->nchains) return (NULL); return (ef->symtab + symidx); } const char * elf_get_symname(linker_file_t lf, Elf_Size symidx) { elf_file_t ef = (elf_file_t)lf; const Elf_Sym *sym; if (symidx >= ef->nchains) return (NULL); sym = ef->symtab + symidx; return (ef->strtab + sym->st_name); } /* * Symbol lookup function that can be used when the symbol index is known (ie * in relocations). It uses the symbol index instead of doing a fully fledged * hash table based lookup when such is valid. For example for local symbols. * This is not only more efficient, it's also more correct. It's not always * the case that the symbol can be found through the hash table. */ static Elf_Addr elf_lookup(linker_file_t lf, Elf_Size symidx, int deps) { elf_file_t ef = (elf_file_t)lf; const Elf_Sym *sym; const char *symbol; Elf_Addr addr, start, base; /* Don't even try to lookup the symbol if the index is bogus. */ if (symidx >= ef->nchains) return (0); sym = ef->symtab + symidx; /* * Don't do a full lookup when the symbol is local. It may even * fail because it may not be found through the hash table. */ if (ELF_ST_BIND(sym->st_info) == STB_LOCAL) { /* Force lookup failure when we have an insanity. */ if (sym->st_shndx == SHN_UNDEF || sym->st_value == 0) return (0); return ((Elf_Addr)ef->address + sym->st_value); } /* * XXX we can avoid doing a hash table based lookup for global * symbols as well. This however is not always valid, so we'll * just do it the hard way for now. Performance tweaks can * always be added. */ symbol = ef->strtab + sym->st_name; /* Force a lookup failure if the symbol name is bogus. */ if (*symbol == 0) return (0); addr = ((Elf_Addr)linker_file_lookup_symbol(lf, symbol, deps)); if (elf_set_find(&set_pcpu_list, addr, &start, &base)) addr = addr - start + base; #ifdef VIMAGE else if (elf_set_find(&set_vnet_list, addr, &start, &base)) addr = addr - start + base; #endif return addr; } static void link_elf_reloc_local(linker_file_t lf) { const Elf_Rel *rellim; const Elf_Rel *rel; const Elf_Rela *relalim; const Elf_Rela *rela; elf_file_t ef = (elf_file_t)lf; /* Perform relocations without addend if there are any: */ if ((rel = ef->rel) != NULL) { rellim = (const Elf_Rel *)((const char *)ef->rel + ef->relsize); while (rel < rellim) { elf_reloc_local(lf, (Elf_Addr)ef->address, rel, ELF_RELOC_REL, elf_lookup); rel++; } } /* Perform relocations with addend if there are any: */ if ((rela = ef->rela) != NULL) { relalim = (const Elf_Rela *) ((const char *)ef->rela + ef->relasize); while (rela < relalim) { elf_reloc_local(lf, (Elf_Addr)ef->address, rela, ELF_RELOC_RELA, elf_lookup); rela++; } } } static long link_elf_symtab_get(linker_file_t lf, const Elf_Sym **symtab) { elf_file_t ef = (elf_file_t)lf; *symtab = ef->ddbsymtab; if (*symtab == NULL) return (0); return (ef->ddbsymcnt); } static long link_elf_strtab_get(linker_file_t lf, caddr_t *strtab) { elf_file_t ef = (elf_file_t)lf; *strtab = ef->ddbstrtab; if (*strtab == NULL) return (0); return (ef->ddbstrcnt); } Index: head/sys/powerpc/aim/locore32.S =================================================================== --- head/sys/powerpc/aim/locore32.S (revision 279749) +++ head/sys/powerpc/aim/locore32.S (revision 279750) @@ -1,165 +1,176 @@ /* $FreeBSD$ */ /* $NetBSD: locore.S,v 1.24 2000/05/31 05:09:17 thorpej Exp $ */ /*- * Copyright (C) 2001 Benno Rice * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY Benno Rice ``AS IS'' AND ANY EXPRESS OR * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. * IN NO EVENT SHALL TOOLS GMBH BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, * PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; * OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, * WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR * OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF * ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. */ /*- * Copyright (C) 1995, 1996 Wolfgang Solfrank. * Copyright (C) 1995, 1996 TooLs GmbH. * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. All advertising materials mentioning features or use of this software * must display the following acknowledgement: * This product includes software developed by TooLs GmbH. * 4. The name of TooLs GmbH may not be used to endorse or promote products * derived from this software without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY TOOLS GMBH ``AS IS'' AND ANY EXPRESS OR * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. * IN NO EVENT SHALL TOOLS GMBH BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, * PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; * OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, * WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR * OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF * ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. */ #include "assym.s" #include #include #include #include #include #include "opt_platform.h" /* Locate the per-CPU data structure */ #define GET_CPUINFO(r) \ mfsprg0 r /* * Compiled KERNBASE location and the kernel load address */ .globl kernbase .set kernbase, KERNBASE /* * Globals */ .data .align 3 GLOBAL(__startkernel) .long begin GLOBAL(__endkernel) .long end .align 4 #define TMPSTKSZ 8192 /* 8K temporary stack */ GLOBAL(tmpstk) .space TMPSTKSZ .text .globl btext btext: /* * This symbol is here for the benefit of kvm_mkdb, and is supposed to * mark the start of kernel text. */ .globl kernel_text kernel_text: /* * Startup entry. Note, this must be the first thing in the text * segment! */ .text .globl __start __start: - li 8,0 - li 9,0x100 - mtctr 9 -1: - dcbf 0,8 - icbi 0,8 - addi 8,8,0x20 - bdnz 1b - sync - isync + /* Figure out where we are */ + bl 1f + .long _DYNAMIC-. + .long _GLOBAL_OFFSET_TABLE_-. + .long tmpstk-. +1: mflr %r30 - /* Zero bss, in case we were started by something unhelpful */ - li 0,0 - lis 8,_edata@ha - addi 8,8,_edata@l - lis 9,_end@ha - addi 9,9,_end@l -2: stw 0,0(8) - addi 8,8,4 - cmplw 8,9 - blt 2b + /* Set up temporary stack pointer */ + lwz %r1,8(%r30) + add %r1,%r1,%r30 + addi %r1,%r1,(8+TMPSTKSZ-32) + + /* Relocate self */ + stw %r3,16(%r1) + stw %r4,20(%r1) + stw %r5,24(%r1) + stw %r6,28(%r1) + + lwz %r3,0(%r30) /* _DYNAMIC in %r3 */ + add %r3,%r3,%r30 + lwz %r4,4(%r30) /* GOT pointer */ + add %r4,%r4,%r30 + lwz %r4,4(%r4) /* got[0] is _DYNAMIC link addr */ + subf %r4,%r4,%r3 /* subtract to calculate relocbase */ + bl elf_reloc_self - lis 1,(tmpstk+TMPSTKSZ-16)@ha - addi 1,1,(tmpstk+TMPSTKSZ-16)@l + lwz %r3,16(%r1) + lwz %r4,20(%r1) + lwz %r5,24(%r1) + lwz %r6,28(%r1) + /* MD setup */ bl powerpc_init + + /* Set stack pointer to new value and branch to mi_startup */ mr %r1, %r3 li %r3, 0 stw %r3, 0(%r1) bl mi_startup + + /* If mi_startup somehow returns, exit. This would be bad. */ b OF_exit /* * int setfault() * * Similar to setjmp to setup for handling faults on accesses to user memory. * Any routine using this may only call bcopy, either the form below, * or the (currently used) C code optimized, so it doesn't use any non-volatile * registers. */ .globl setfault setfault: mflr 0 mfcr 12 mfsprg 4,0 lwz 4,TD_PCB(2) /* curthread = r2 */ stw 3,PCB_ONFAULT(4) stw 0,0(3) stw 1,4(3) stw 2,8(3) stmw 12,12(3) xor 3,3,3 blr #include Index: head/sys/powerpc/aim/machdep.c =================================================================== --- head/sys/powerpc/aim/machdep.c (revision 279749) +++ head/sys/powerpc/aim/machdep.c (revision 279750) @@ -1,974 +1,972 @@ /*- * Copyright (C) 1995, 1996 Wolfgang Solfrank. * Copyright (C) 1995, 1996 TooLs GmbH. * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. All advertising materials mentioning features or use of this software * must display the following acknowledgement: * This product includes software developed by TooLs GmbH. * 4. The name of TooLs GmbH may not be used to endorse or promote products * derived from this software without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY TOOLS GMBH ``AS IS'' AND ANY EXPRESS OR * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. * IN NO EVENT SHALL TOOLS GMBH BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, * PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; * OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, * WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR * OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF * ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. */ /*- * Copyright (C) 2001 Benno Rice * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY Benno Rice ``AS IS'' AND ANY EXPRESS OR * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. * IN NO EVENT SHALL TOOLS GMBH BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, * PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; * OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, * WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR * OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF * ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. * $NetBSD: machdep.c,v 1.74.2.1 2000/11/01 16:13:48 tv Exp $ */ #include __FBSDID("$FreeBSD$"); #include "opt_compat.h" #include "opt_ddb.h" #include "opt_kstack_pages.h" #include "opt_platform.h" #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #ifndef __powerpc64__ #include #endif #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include int cold = 1; #ifdef __powerpc64__ extern int n_slbs; int cacheline_size = 128; #else int cacheline_size = 32; #endif int hw_direct_map = 1; extern void *ap_pcpu; struct pcpu __pcpu[MAXCPU]; static struct trapframe frame0; char machine[] = "powerpc"; SYSCTL_STRING(_hw, HW_MACHINE, machine, CTLFLAG_RD, machine, 0, ""); static void cpu_startup(void *); SYSINIT(cpu, SI_SUB_CPU, SI_ORDER_FIRST, cpu_startup, NULL); SYSCTL_INT(_machdep, CPU_CACHELINE, cacheline_size, CTLFLAG_RD, &cacheline_size, 0, ""); uintptr_t powerpc_init(vm_offset_t, vm_offset_t, vm_offset_t, void *); long Maxmem = 0; long realmem = 0; #ifndef __powerpc64__ struct bat battable[16]; #endif struct kva_md_info kmi; static void cpu_startup(void *dummy) { /* * Initialise the decrementer-based clock. */ decr_init(); /* * Good {morning,afternoon,evening,night}. */ cpu_setup(PCPU_GET(cpuid)); #ifdef PERFMON perfmon_init(); #endif printf("real memory = %ld (%ld MB)\n", ptoa(physmem), ptoa(physmem) / 1048576); realmem = physmem; if (bootverbose) printf("available KVA = %zd (%zd MB)\n", virtual_end - virtual_avail, (virtual_end - virtual_avail) / 1048576); /* * Display any holes after the first chunk of extended memory. */ if (bootverbose) { int indx; printf("Physical memory chunk(s):\n"); for (indx = 0; phys_avail[indx + 1] != 0; indx += 2) { vm_offset_t size1 = phys_avail[indx + 1] - phys_avail[indx]; #ifdef __powerpc64__ printf("0x%016lx - 0x%016lx, %ld bytes (%ld pages)\n", #else printf("0x%08x - 0x%08x, %d bytes (%ld pages)\n", #endif phys_avail[indx], phys_avail[indx + 1] - 1, size1, size1 / PAGE_SIZE); } } vm_ksubmap_init(&kmi); printf("avail memory = %ld (%ld MB)\n", ptoa(vm_cnt.v_free_count), ptoa(vm_cnt.v_free_count) / 1048576); /* * Set up buffers, so they can be used to read disk labels. */ bufinit(); vm_pager_bufferinit(); } extern vm_offset_t __startkernel, __endkernel; extern unsigned char __bss_start[]; extern unsigned char __sbss_start[]; extern unsigned char __sbss_end[]; extern unsigned char _end[]; #ifndef __powerpc64__ /* Bits for running on 64-bit systems in 32-bit mode. */ extern void *testppc64, *testppc64size; extern void *restorebridge, *restorebridgesize; extern void *rfid_patch, *rfi_patch1, *rfi_patch2; extern void *trapcode64; + +extern Elf_Addr _GLOBAL_OFFSET_TABLE_[]; #endif extern void *rstcode, *rstcodeend; -extern void *trapcode, *trapcodeend, *trapcode2; +extern void *trapcode, *trapcodeend; +extern void *generictrap, *generictrap64; extern void *slbtrap, *slbtrapend; extern void *alitrap, *aliend; extern void *dsitrap, *dsiend; extern void *decrint, *decrsize; extern void *extint, *extsize; extern void *dblow, *dbend; extern void *imisstrap, *imisssize; extern void *dlmisstrap, *dlmisssize; extern void *dsmisstrap, *dsmisssize; uintptr_t powerpc_init(vm_offset_t fdt, vm_offset_t toc, vm_offset_t ofentry, void *mdp) { struct pcpu *pc; vm_offset_t startkernel, endkernel; - void *generictrap; size_t trap_offset, trapsize; vm_offset_t trap; void *kmdp; char *env; register_t msr, scratch; uint8_t *cache_check; int cacheline_warn; #ifndef __powerpc64__ int ppc64; #endif #ifdef DDB vm_offset_t ksym_start; vm_offset_t ksym_end; #endif kmdp = NULL; trap_offset = 0; cacheline_warn = 0; /* First guess at start/end kernel positions */ startkernel = __startkernel; endkernel = __endkernel; /* Check for ePAPR loader, which puts a magic value into r6 */ if (mdp == (void *)0x65504150) mdp = NULL; /* * Parse metadata if present and fetch parameters. Must be done * before console is inited so cninit gets the right value of * boothowto. */ if (mdp != NULL) { preload_metadata = mdp; kmdp = preload_search_by_type("elf kernel"); if (kmdp != NULL) { boothowto = MD_FETCH(kmdp, MODINFOMD_HOWTO, int); kern_envp = MD_FETCH(kmdp, MODINFOMD_ENVP, char *); endkernel = ulmax(endkernel, MD_FETCH(kmdp, MODINFOMD_KERNEND, vm_offset_t)); #ifdef DDB ksym_start = MD_FETCH(kmdp, MODINFOMD_SSYM, uintptr_t); ksym_end = MD_FETCH(kmdp, MODINFOMD_ESYM, uintptr_t); db_fetch_ksymtab(ksym_start, ksym_end); #endif } } else { bzero(__sbss_start, __sbss_end - __sbss_start); bzero(__bss_start, _end - __bss_start); } /* Store boot environment state */ OF_initial_setup((void *)fdt, NULL, (int (*)(void *))ofentry); /* * Init params/tunables that can be overridden by the loader */ init_param1(); /* * Start initializing proc0 and thread0. */ proc_linkup0(&proc0, &thread0); thread0.td_frame = &frame0; /* * Set up per-cpu data. */ pc = __pcpu; pcpu_init(pc, 0, sizeof(struct pcpu)); pc->pc_curthread = &thread0; #ifdef __powerpc64__ __asm __volatile("mr 13,%0" :: "r"(pc->pc_curthread)); #else __asm __volatile("mr 2,%0" :: "r"(pc->pc_curthread)); #endif pc->pc_cpuid = 0; __asm __volatile("mtsprg 0, %0" :: "r"(pc)); /* * Init mutexes, which we use heavily in PMAP */ mutex_init(); /* * Install the OF client interface */ OF_bootstrap(); /* * Initialize the console before printing anything. */ cninit(); /* * Complain if there is no metadata. */ if (mdp == NULL || kmdp == NULL) { printf("powerpc_init: no loader metadata.\n"); } /* * Init KDB */ kdb_init(); /* Various very early CPU fix ups */ switch (mfpvr() >> 16) { /* * PowerPC 970 CPUs have a misfeature requested by Apple that * makes them pretend they have a 32-byte cacheline. Turn this * off before we measure the cacheline size. */ case IBM970: case IBM970FX: case IBM970MP: case IBM970GX: scratch = mfspr(SPR_HID5); scratch &= ~HID5_970_DCBZ_SIZE_HI; mtspr(SPR_HID5, scratch); break; #ifdef __powerpc64__ case IBMPOWER7: case IBMPOWER7PLUS: case IBMPOWER8: case IBMPOWER8E: /* XXX: get from ibm,slb-size in device tree */ n_slbs = 32; break; #endif } /* * Initialize the interrupt tables and figure out our cache line * size and whether or not we need the 64-bit bridge code. */ /* * Disable translation in case the vector area hasn't been * mapped (G5). Note that no OFW calls can be made until * translation is re-enabled. */ msr = mfmsr(); mtmsr((msr & ~(PSL_IR | PSL_DR)) | PSL_RI); /* * Measure the cacheline size using dcbz * * Use EXC_PGM as a playground. We are about to overwrite it * anyway, we know it exists, and we know it is cache-aligned. */ cache_check = (void *)EXC_PGM; for (cacheline_size = 0; cacheline_size < 0x100; cacheline_size++) cache_check[cacheline_size] = 0xff; __asm __volatile("dcbz 0,%0":: "r" (cache_check) : "memory"); /* Find the first byte dcbz did not zero to get the cache line size */ for (cacheline_size = 0; cacheline_size < 0x100 && cache_check[cacheline_size] == 0; cacheline_size++); /* Work around psim bug */ if (cacheline_size == 0) { cacheline_warn = 1; cacheline_size = 32; } /* Make sure the kernel icache is valid before we go too much further */ __syncicache((caddr_t)startkernel, endkernel - startkernel); #ifndef __powerpc64__ /* * Figure out whether we need to use the 64 bit PMAP. This works by * executing an instruction that is only legal on 64-bit PPC (mtmsrd), * and setting ppc64 = 0 if that causes a trap. */ ppc64 = 1; bcopy(&testppc64, (void *)EXC_PGM, (size_t)&testppc64size); __syncicache((void *)EXC_PGM, (size_t)&testppc64size); __asm __volatile("\ mfmsr %0; \ mtsprg2 %1; \ \ mtmsrd %0; \ mfsprg2 %1;" : "=r"(scratch), "=r"(ppc64)); if (ppc64) cpu_features |= PPC_FEATURE_64; /* * Now copy restorebridge into all the handlers, if necessary, * and set up the trap tables. */ if (cpu_features & PPC_FEATURE_64) { /* Patch the two instances of rfi -> rfid */ bcopy(&rfid_patch,&rfi_patch1,4); #ifdef KDB /* rfi_patch2 is at the end of dbleave */ bcopy(&rfid_patch,&rfi_patch2,4); #endif - - /* - * Set the common trap entry point to the one that - * knows to restore 32-bit operation on execution. - */ - - generictrap = &trapcode64; - } else { - generictrap = &trapcode; } - #else /* powerpc64 */ cpu_features |= PPC_FEATURE_64; - generictrap = &trapcode; #endif trapsize = (size_t)&trapcodeend - (size_t)&trapcode; /* * Copy generic handler into every possible trap. Special cases will get * different ones in a minute. */ for (trap = EXC_RST; trap < EXC_LAST; trap += 0x20) - bcopy(generictrap, (void *)trap, trapsize); + bcopy(&trapcode, (void *)trap, trapsize); #ifndef __powerpc64__ if (cpu_features & PPC_FEATURE_64) { /* * Copy a code snippet to restore 32-bit bridge mode * to the top of every non-generic trap handler */ trap_offset += (size_t)&restorebridgesize; bcopy(&restorebridge, (void *)EXC_RST, trap_offset); bcopy(&restorebridge, (void *)EXC_DSI, trap_offset); bcopy(&restorebridge, (void *)EXC_ALI, trap_offset); bcopy(&restorebridge, (void *)EXC_PGM, trap_offset); bcopy(&restorebridge, (void *)EXC_MCHK, trap_offset); bcopy(&restorebridge, (void *)EXC_TRC, trap_offset); bcopy(&restorebridge, (void *)EXC_BPT, trap_offset); } #endif bcopy(&rstcode, (void *)(EXC_RST + trap_offset), (size_t)&rstcodeend - (size_t)&rstcode); #ifdef KDB bcopy(&dblow, (void *)(EXC_MCHK + trap_offset), (size_t)&dbend - (size_t)&dblow); bcopy(&dblow, (void *)(EXC_PGM + trap_offset), (size_t)&dbend - (size_t)&dblow); bcopy(&dblow, (void *)(EXC_TRC + trap_offset), (size_t)&dbend - (size_t)&dblow); bcopy(&dblow, (void *)(EXC_BPT + trap_offset), (size_t)&dbend - (size_t)&dblow); #endif bcopy(&alitrap, (void *)(EXC_ALI + trap_offset), (size_t)&aliend - (size_t)&alitrap); bcopy(&dsitrap, (void *)(EXC_DSI + trap_offset), (size_t)&dsiend - (size_t)&dsitrap); #ifdef __powerpc64__ /* Set TOC base so that the interrupt code can get at it */ - *((void **)TRAP_GENTRAP) = &trapcode2; + *((void **)TRAP_GENTRAP) = &generictrap; *((register_t *)TRAP_TOCBASE) = toc; bcopy(&slbtrap, (void *)EXC_DSE,(size_t)&slbtrapend - (size_t)&slbtrap); bcopy(&slbtrap, (void *)EXC_ISE,(size_t)&slbtrapend - (size_t)&slbtrap); #else + /* Set branch address for trap code */ + if (cpu_features & PPC_FEATURE_64) + *((void **)TRAP_GENTRAP) = &generictrap64; + else + *((void **)TRAP_GENTRAP) = &generictrap; + *((void **)TRAP_TOCBASE) = _GLOBAL_OFFSET_TABLE_; + /* G2-specific TLB miss helper handlers */ bcopy(&imisstrap, (void *)EXC_IMISS, (size_t)&imisssize); bcopy(&dlmisstrap, (void *)EXC_DLMISS, (size_t)&dlmisssize); bcopy(&dsmisstrap, (void *)EXC_DSMISS, (size_t)&dsmisssize); #endif __syncicache(EXC_RSVD, EXC_LAST - EXC_RSVD); /* * Restore MSR */ mtmsr(msr); /* Warn if cachline size was not determined */ if (cacheline_warn == 1) { printf("WARNING: cacheline size undetermined, setting to 32\n"); } /* * Choose a platform module so we can get the physical memory map. */ platform_probe_and_attach(); /* * Initialise virtual memory. Use BUS_PROBE_GENERIC priority * in case the platform module had a better idea of what we * should do. */ if (cpu_features & PPC_FEATURE_64) pmap_mmu_install(MMU_TYPE_G5, BUS_PROBE_GENERIC); else pmap_mmu_install(MMU_TYPE_OEA, BUS_PROBE_GENERIC); pmap_bootstrap(startkernel, endkernel); mtmsr(PSL_KERNSET & ~PSL_EE); /* * Initialize params/tunables that are derived from memsize */ init_param2(physmem); /* * Grab booted kernel's name */ env = kern_getenv("kernelname"); if (env != NULL) { strlcpy(kernelname, env, sizeof(kernelname)); freeenv(env); } /* * Finish setting up thread0. */ thread0.td_pcb = (struct pcb *) ((thread0.td_kstack + thread0.td_kstack_pages * PAGE_SIZE - sizeof(struct pcb)) & ~15UL); bzero((void *)thread0.td_pcb, sizeof(struct pcb)); pc->pc_curpcb = thread0.td_pcb; /* Initialise the message buffer. */ msgbufinit(msgbufp, msgbufsize); #ifdef KDB if (boothowto & RB_KDB) kdb_enter(KDB_WHY_BOOTFLAGS, "Boot flags requested debugger"); #endif return (((uintptr_t)thread0.td_pcb - (sizeof(struct callframe) - 3*sizeof(register_t))) & ~15UL); } void bzero(void *buf, size_t len) { caddr_t p; p = buf; while (((vm_offset_t) p & (sizeof(u_long) - 1)) && len) { *p++ = 0; len--; } while (len >= sizeof(u_long) * 8) { *(u_long*) p = 0; *((u_long*) p + 1) = 0; *((u_long*) p + 2) = 0; *((u_long*) p + 3) = 0; len -= sizeof(u_long) * 8; *((u_long*) p + 4) = 0; *((u_long*) p + 5) = 0; *((u_long*) p + 6) = 0; *((u_long*) p + 7) = 0; p += sizeof(u_long) * 8; } while (len >= sizeof(u_long)) { *(u_long*) p = 0; len -= sizeof(u_long); p += sizeof(u_long); } while (len) { *p++ = 0; len--; } } void cpu_boot(int howto) { } /* * Flush the D-cache for non-DMA I/O so that the I-cache can * be made coherent later. */ void cpu_flush_dcache(void *ptr, size_t len) { /* TBD */ } /* * Shutdown the CPU as much as possible. */ void cpu_halt(void) { OF_exit(); } int ptrace_set_pc(struct thread *td, unsigned long addr) { struct trapframe *tf; tf = td->td_frame; tf->srr0 = (register_t)addr; return (0); } int ptrace_single_step(struct thread *td) { struct trapframe *tf; tf = td->td_frame; tf->srr1 |= PSL_SE; return (0); } int ptrace_clear_single_step(struct thread *td) { struct trapframe *tf; tf = td->td_frame; tf->srr1 &= ~PSL_SE; return (0); } void kdb_cpu_clear_singlestep(void) { kdb_frame->srr1 &= ~PSL_SE; } void kdb_cpu_set_singlestep(void) { kdb_frame->srr1 |= PSL_SE; } /* * Initialise a struct pcpu. */ void cpu_pcpu_init(struct pcpu *pcpu, int cpuid, size_t sz) { #ifdef __powerpc64__ /* Copy the SLB contents from the current CPU */ memcpy(pcpu->pc_slb, PCPU_GET(slb), sizeof(pcpu->pc_slb)); #endif } void spinlock_enter(void) { struct thread *td; register_t msr; td = curthread; if (td->td_md.md_spinlock_count == 0) { __asm __volatile("or 2,2,2"); /* Set high thread priority */ msr = intr_disable(); td->td_md.md_spinlock_count = 1; td->td_md.md_saved_msr = msr; } else td->td_md.md_spinlock_count++; critical_enter(); } void spinlock_exit(void) { struct thread *td; register_t msr; td = curthread; critical_exit(); msr = td->td_md.md_saved_msr; td->td_md.md_spinlock_count--; if (td->td_md.md_spinlock_count == 0) { intr_restore(msr); __asm __volatile("or 6,6,6"); /* Set normal thread priority */ } } int db_trap_glue(struct trapframe *); /* Called from trap_subr.S */ int db_trap_glue(struct trapframe *frame) { if (!(frame->srr1 & PSL_PR) && (frame->exc == EXC_TRC || frame->exc == EXC_RUNMODETRC || (frame->exc == EXC_PGM && (frame->srr1 & 0x20000)) || frame->exc == EXC_BPT || frame->exc == EXC_DSI)) { int type = frame->exc; /* Ignore DTrace traps. */ if (*(uint32_t *)frame->srr0 == EXC_DTRACE) return (0); if (type == EXC_PGM && (frame->srr1 & 0x20000)) { type = T_BREAKPOINT; } return (kdb_trap(type, 0, frame)); } return (0); } #ifndef __powerpc64__ uint64_t va_to_vsid(pmap_t pm, vm_offset_t va) { return ((pm->pm_sr[(uintptr_t)va >> ADDR_SR_SHFT]) & SR_VSID_MASK); } #endif vm_offset_t pmap_early_io_map(vm_paddr_t pa, vm_size_t size) { return (pa); } /* From p3-53 of the MPC7450 RISC Microprocessor Family Reference Manual */ void flush_disable_caches(void) { register_t msr; register_t msscr0; register_t cache_reg; volatile uint32_t *memp; uint32_t temp; int i; int x; msr = mfmsr(); powerpc_sync(); mtmsr(msr & ~(PSL_EE | PSL_DR)); msscr0 = mfspr(SPR_MSSCR0); msscr0 &= ~MSSCR0_L2PFE; mtspr(SPR_MSSCR0, msscr0); powerpc_sync(); isync(); __asm__ __volatile__("dssall; sync"); powerpc_sync(); isync(); __asm__ __volatile__("dcbf 0,%0" :: "r"(0)); __asm__ __volatile__("dcbf 0,%0" :: "r"(0)); __asm__ __volatile__("dcbf 0,%0" :: "r"(0)); /* Lock the L1 Data cache. */ mtspr(SPR_LDSTCR, mfspr(SPR_LDSTCR) | 0xFF); powerpc_sync(); isync(); mtspr(SPR_LDSTCR, 0); /* * Perform this in two stages: Flush the cache starting in RAM, then do it * from ROM. */ memp = (volatile uint32_t *)0x00000000; for (i = 0; i < 128 * 1024; i++) { temp = *memp; __asm__ __volatile__("dcbf 0,%0" :: "r"(memp)); memp += 32/sizeof(*memp); } memp = (volatile uint32_t *)0xfff00000; x = 0xfe; for (; x != 0xff;) { mtspr(SPR_LDSTCR, x); for (i = 0; i < 128; i++) { temp = *memp; __asm__ __volatile__("dcbf 0,%0" :: "r"(memp)); memp += 32/sizeof(*memp); } x = ((x << 1) | 1) & 0xff; } mtspr(SPR_LDSTCR, 0); cache_reg = mfspr(SPR_L2CR); if (cache_reg & L2CR_L2E) { cache_reg &= ~(L2CR_L2IO_7450 | L2CR_L2DO_7450); mtspr(SPR_L2CR, cache_reg); powerpc_sync(); mtspr(SPR_L2CR, cache_reg | L2CR_L2HWF); while (mfspr(SPR_L2CR) & L2CR_L2HWF) ; /* Busy wait for cache to flush */ powerpc_sync(); cache_reg &= ~L2CR_L2E; mtspr(SPR_L2CR, cache_reg); powerpc_sync(); mtspr(SPR_L2CR, cache_reg | L2CR_L2I); powerpc_sync(); while (mfspr(SPR_L2CR) & L2CR_L2I) ; /* Busy wait for L2 cache invalidate */ powerpc_sync(); } cache_reg = mfspr(SPR_L3CR); if (cache_reg & L3CR_L3E) { cache_reg &= ~(L3CR_L3IO | L3CR_L3DO); mtspr(SPR_L3CR, cache_reg); powerpc_sync(); mtspr(SPR_L3CR, cache_reg | L3CR_L3HWF); while (mfspr(SPR_L3CR) & L3CR_L3HWF) ; /* Busy wait for cache to flush */ powerpc_sync(); cache_reg &= ~L3CR_L3E; mtspr(SPR_L3CR, cache_reg); powerpc_sync(); mtspr(SPR_L3CR, cache_reg | L3CR_L3I); powerpc_sync(); while (mfspr(SPR_L3CR) & L3CR_L3I) ; /* Busy wait for L3 cache invalidate */ powerpc_sync(); } mtspr(SPR_HID0, mfspr(SPR_HID0) & ~HID0_DCE); powerpc_sync(); isync(); mtmsr(msr); } void cpu_sleep() { static u_quad_t timebase = 0; static register_t sprgs[4]; static register_t srrs[2]; jmp_buf resetjb; struct thread *fputd; struct thread *vectd; register_t hid0; register_t msr; register_t saved_msr; ap_pcpu = pcpup; PCPU_SET(restore, &resetjb); saved_msr = mfmsr(); fputd = PCPU_GET(fputhread); vectd = PCPU_GET(vecthread); if (fputd != NULL) save_fpu(fputd); if (vectd != NULL) save_vec(vectd); if (setjmp(resetjb) == 0) { sprgs[0] = mfspr(SPR_SPRG0); sprgs[1] = mfspr(SPR_SPRG1); sprgs[2] = mfspr(SPR_SPRG2); sprgs[3] = mfspr(SPR_SPRG3); srrs[0] = mfspr(SPR_SRR0); srrs[1] = mfspr(SPR_SRR1); timebase = mftb(); powerpc_sync(); flush_disable_caches(); hid0 = mfspr(SPR_HID0); hid0 = (hid0 & ~(HID0_DOZE | HID0_NAP)) | HID0_SLEEP; powerpc_sync(); isync(); msr = mfmsr() | PSL_POW; mtspr(SPR_HID0, hid0); powerpc_sync(); while (1) mtmsr(msr); } mttb(timebase); PCPU_SET(curthread, curthread); PCPU_SET(curpcb, curthread->td_pcb); pmap_activate(curthread); powerpc_sync(); mtspr(SPR_SPRG0, sprgs[0]); mtspr(SPR_SPRG1, sprgs[1]); mtspr(SPR_SPRG2, sprgs[2]); mtspr(SPR_SPRG3, sprgs[3]); mtspr(SPR_SRR0, srrs[0]); mtspr(SPR_SRR1, srrs[1]); mtmsr(saved_msr); if (fputd == curthread) enable_fpu(curthread); if (vectd == curthread) enable_vec(curthread); powerpc_sync(); } Index: head/sys/powerpc/aim/trap_subr32.S =================================================================== --- head/sys/powerpc/aim/trap_subr32.S (revision 279749) +++ head/sys/powerpc/aim/trap_subr32.S (revision 279750) @@ -1,913 +1,938 @@ /* $FreeBSD$ */ /* $NetBSD: trap_subr.S,v 1.20 2002/04/22 23:20:08 kleink Exp $ */ /*- * Copyright (C) 1995, 1996 Wolfgang Solfrank. * Copyright (C) 1995, 1996 TooLs GmbH. * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. All advertising materials mentioning features or use of this software * must display the following acknowledgement: * This product includes software developed by TooLs GmbH. * 4. The name of TooLs GmbH may not be used to endorse or promote products * derived from this software without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY TOOLS GMBH ``AS IS'' AND ANY EXPRESS OR * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. * IN NO EVENT SHALL TOOLS GMBH BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, * PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; * OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, * WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR * OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF * ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. */ /* * NOTICE: This is not a standalone file. to use it, #include it in * your port's locore.S, like so: * * #include */ /* * Save/restore segment registers */ #define RESTORE_SRS(pmap,sr) mtsr 0,sr; \ lwz sr,1*4(pmap); mtsr 1,sr; \ lwz sr,2*4(pmap); mtsr 2,sr; \ lwz sr,3*4(pmap); mtsr 3,sr; \ lwz sr,4*4(pmap); mtsr 4,sr; \ lwz sr,5*4(pmap); mtsr 5,sr; \ lwz sr,6*4(pmap); mtsr 6,sr; \ lwz sr,7*4(pmap); mtsr 7,sr; \ lwz sr,8*4(pmap); mtsr 8,sr; \ lwz sr,9*4(pmap); mtsr 9,sr; \ lwz sr,10*4(pmap); mtsr 10,sr; \ lwz sr,11*4(pmap); mtsr 11,sr; \ /* Skip segment 12 (USER_SR), which is restored differently */ \ lwz sr,13*4(pmap); mtsr 13,sr; \ lwz sr,14*4(pmap); mtsr 14,sr; \ lwz sr,15*4(pmap); mtsr 15,sr; isync; /* * User SRs are loaded through a pointer to the current pmap. */ #define RESTORE_USER_SRS(pmap,sr) \ GET_CPUINFO(pmap); \ lwz pmap,PC_CURPMAP(pmap); \ lwzu sr,PM_SR(pmap); \ RESTORE_SRS(pmap,sr) \ /* Restore SR 12 */ \ lwz sr,12*4(pmap); mtsr 12,sr /* * Kernel SRs are loaded directly from kernel_pmap_ */ #define RESTORE_KERN_SRS(pmap,sr) \ - lis pmap,CNAME(kernel_pmap_store)@ha; \ - lwzu sr,CNAME(kernel_pmap_store)+PM_SR@l(pmap); \ + lwz pmap,TRAP_TOCBASE(0); \ + lwz pmap,CNAME(kernel_pmap_store)@got(pmap); \ + lwzu sr,PM_SR(pmap); \ RESTORE_SRS(pmap,sr) /* * FRAME_SETUP assumes: * SPRG1 SP (1) * SPRG3 trap type * savearea r28-r31,DAR,DSISR (DAR & DSISR only for DSI traps) * r28 LR * r29 CR * r30 scratch * r31 scratch * r1 kernel stack * SRR0/1 as at start of trap */ #define FRAME_SETUP(savearea) \ /* Have to enable translation to allow access of kernel stack: */ \ GET_CPUINFO(%r31); \ mfsrr0 %r30; \ stw %r30,(savearea+CPUSAVE_SRR0)(%r31); /* save SRR0 */ \ mfsrr1 %r30; \ stw %r30,(savearea+CPUSAVE_SRR1)(%r31); /* save SRR1 */ \ mfmsr %r30; \ ori %r30,%r30,(PSL_DR|PSL_IR|PSL_RI)@l; /* relocation on */ \ mtmsr %r30; /* stack can now be accessed */ \ isync; \ mfsprg1 %r31; /* get saved SP */ \ stwu %r31,-FRAMELEN(%r1); /* save it in the callframe */ \ stw %r0, FRAME_0+8(%r1); /* save r0 in the trapframe */ \ stw %r31,FRAME_1+8(%r1); /* save SP " " */ \ stw %r2, FRAME_2+8(%r1); /* save r2 " " */ \ stw %r28,FRAME_LR+8(%r1); /* save LR " " */ \ stw %r29,FRAME_CR+8(%r1); /* save CR " " */ \ GET_CPUINFO(%r2); \ lwz %r28,(savearea+CPUSAVE_R28)(%r2); /* get saved r28 */ \ lwz %r29,(savearea+CPUSAVE_R29)(%r2); /* get saved r29 */ \ lwz %r30,(savearea+CPUSAVE_R30)(%r2); /* get saved r30 */ \ lwz %r31,(savearea+CPUSAVE_R31)(%r2); /* get saved r31 */ \ stw %r3, FRAME_3+8(%r1); /* save r3-r31 */ \ stw %r4, FRAME_4+8(%r1); \ stw %r5, FRAME_5+8(%r1); \ stw %r6, FRAME_6+8(%r1); \ stw %r7, FRAME_7+8(%r1); \ stw %r8, FRAME_8+8(%r1); \ stw %r9, FRAME_9+8(%r1); \ stw %r10, FRAME_10+8(%r1); \ stw %r11, FRAME_11+8(%r1); \ stw %r12, FRAME_12+8(%r1); \ stw %r13, FRAME_13+8(%r1); \ stw %r14, FRAME_14+8(%r1); \ stw %r15, FRAME_15+8(%r1); \ stw %r16, FRAME_16+8(%r1); \ stw %r17, FRAME_17+8(%r1); \ stw %r18, FRAME_18+8(%r1); \ stw %r19, FRAME_19+8(%r1); \ stw %r20, FRAME_20+8(%r1); \ stw %r21, FRAME_21+8(%r1); \ stw %r22, FRAME_22+8(%r1); \ stw %r23, FRAME_23+8(%r1); \ stw %r24, FRAME_24+8(%r1); \ stw %r25, FRAME_25+8(%r1); \ stw %r26, FRAME_26+8(%r1); \ stw %r27, FRAME_27+8(%r1); \ stw %r28, FRAME_28+8(%r1); \ stw %r29, FRAME_29+8(%r1); \ stw %r30, FRAME_30+8(%r1); \ stw %r31, FRAME_31+8(%r1); \ lwz %r28,(savearea+CPUSAVE_AIM_DAR)(%r2); /* saved DAR */ \ lwz %r29,(savearea+CPUSAVE_AIM_DSISR)(%r2);/* saved DSISR */\ lwz %r30,(savearea+CPUSAVE_SRR0)(%r2); /* saved SRR0 */ \ lwz %r31,(savearea+CPUSAVE_SRR1)(%r2); /* saved SRR1 */ \ mfxer %r3; \ mfctr %r4; \ mfsprg3 %r5; \ stw %r3, FRAME_XER+8(1); /* save xer/ctr/exc */ \ stw %r4, FRAME_CTR+8(1); \ stw %r5, FRAME_EXC+8(1); \ stw %r28,FRAME_AIM_DAR+8(1); \ stw %r29,FRAME_AIM_DSISR+8(1); /* save dsisr/srr0/srr1 */ \ stw %r30,FRAME_SRR0+8(1); \ stw %r31,FRAME_SRR1+8(1); \ lwz %r2,PC_CURTHREAD(%r2) /* set curthread pointer */ #define FRAME_LEAVE(savearea) \ /* Disable exceptions: */ \ mfmsr %r2; \ andi. %r2,%r2,~PSL_EE@l; \ mtmsr %r2; \ isync; \ /* Now restore regs: */ \ lwz %r2,FRAME_SRR0+8(%r1); \ lwz %r3,FRAME_SRR1+8(%r1); \ lwz %r4,FRAME_CTR+8(%r1); \ lwz %r5,FRAME_XER+8(%r1); \ lwz %r6,FRAME_LR+8(%r1); \ GET_CPUINFO(%r7); \ stw %r2,(savearea+CPUSAVE_SRR0)(%r7); /* save SRR0 */ \ stw %r3,(savearea+CPUSAVE_SRR1)(%r7); /* save SRR1 */ \ lwz %r7,FRAME_CR+8(%r1); \ mtctr %r4; \ mtxer %r5; \ mtlr %r6; \ mtsprg1 %r7; /* save cr */ \ lwz %r31,FRAME_31+8(%r1); /* restore r0-31 */ \ lwz %r30,FRAME_30+8(%r1); \ lwz %r29,FRAME_29+8(%r1); \ lwz %r28,FRAME_28+8(%r1); \ lwz %r27,FRAME_27+8(%r1); \ lwz %r26,FRAME_26+8(%r1); \ lwz %r25,FRAME_25+8(%r1); \ lwz %r24,FRAME_24+8(%r1); \ lwz %r23,FRAME_23+8(%r1); \ lwz %r22,FRAME_22+8(%r1); \ lwz %r21,FRAME_21+8(%r1); \ lwz %r20,FRAME_20+8(%r1); \ lwz %r19,FRAME_19+8(%r1); \ lwz %r18,FRAME_18+8(%r1); \ lwz %r17,FRAME_17+8(%r1); \ lwz %r16,FRAME_16+8(%r1); \ lwz %r15,FRAME_15+8(%r1); \ lwz %r14,FRAME_14+8(%r1); \ lwz %r13,FRAME_13+8(%r1); \ lwz %r12,FRAME_12+8(%r1); \ lwz %r11,FRAME_11+8(%r1); \ lwz %r10,FRAME_10+8(%r1); \ lwz %r9, FRAME_9+8(%r1); \ lwz %r8, FRAME_8+8(%r1); \ lwz %r7, FRAME_7+8(%r1); \ lwz %r6, FRAME_6+8(%r1); \ lwz %r5, FRAME_5+8(%r1); \ lwz %r4, FRAME_4+8(%r1); \ lwz %r3, FRAME_3+8(%r1); \ lwz %r2, FRAME_2+8(%r1); \ lwz %r0, FRAME_0+8(%r1); \ lwz %r1, FRAME_1+8(%r1); \ /* Can't touch %r1 from here on */ \ mtsprg2 %r2; /* save r2 & r3 */ \ mtsprg3 %r3; \ /* Disable translation, machine check and recoverability: */ \ mfmsr %r2; \ andi. %r2,%r2,~(PSL_DR|PSL_IR|PSL_ME|PSL_RI)@l; \ mtmsr %r2; \ isync; \ /* Decide whether we return to user mode: */ \ GET_CPUINFO(%r2); \ lwz %r3,(savearea+CPUSAVE_SRR1)(%r2); \ mtcr %r3; \ bf 17,1f; /* branch if PSL_PR is false */ \ /* Restore user SRs */ \ RESTORE_USER_SRS(%r2,%r3); \ 1: mfsprg1 %r2; /* restore cr */ \ mtcr %r2; \ GET_CPUINFO(%r2); \ lwz %r3,(savearea+CPUSAVE_SRR0)(%r2); /* restore srr0 */ \ mtsrr0 %r3; \ lwz %r3,(savearea+CPUSAVE_SRR1)(%r2); /* restore srr1 */ \ \ /* Make sure HV bit of MSR propagated to SRR1 */ \ mfmsr %r2; \ or %r3,%r2,%r3; \ \ mtsrr1 %r3; \ mfsprg2 %r2; /* restore r2 & r3 */ \ mfsprg3 %r3 #ifdef KDTRACE_HOOKS .data .globl dtrace_invop_calltrap_addr .align 4 .type dtrace_invop_calltrap_addr, @object .size dtrace_invop_calltrap_addr, 4 dtrace_invop_calltrap_addr: .word 0 .word 0 .text #endif /* * The next two routines are 64-bit glue code. The first is used to test if * we are on a 64-bit system. By copying it to the illegal instruction * handler, we can test for 64-bit mode by trying to execute a 64-bit * instruction and seeing what happens. The second gets copied in front * of all the other handlers to restore 32-bit bridge mode when traps * are taken. */ /* 64-bit test code. Sets SPRG2 to 0 if an illegal instruction is executed */ .globl CNAME(testppc64),CNAME(testppc64size) CNAME(testppc64): mtsprg1 %r31 mfsrr0 %r31 addi %r31, %r31, 4 mtsrr0 %r31 li %r31, 0 mtsprg2 %r31 mfsprg1 %r31 rfi CNAME(testppc64size) = .-CNAME(testppc64) /* 64-bit bridge mode restore snippet. Gets copied in front of everything else * on 64-bit systems. */ .globl CNAME(restorebridge),CNAME(restorebridgesize) CNAME(restorebridge): mtsprg1 %r31 mfmsr %r31 clrldi %r31,%r31,1 mtmsrd %r31 mfsprg1 %r31 isync CNAME(restorebridgesize) = .-CNAME(restorebridge) /* * Processor reset exception handler. These are typically * the first instructions the processor executes after a * software reset. We do this in two bits so that we are * not still hanging around in the trap handling region * once the MMU is turned on. */ .globl CNAME(rstcode), CNAME(rstcodeend) CNAME(rstcode): - ba cpu_reset + bl 1f + .long cpu_reset +1: mflr %r31 + lwz %r31,0(%r31) + mtlr %r31 + blrl CNAME(rstcodeend): cpu_reset: bl 1f .space 124 1: mflr %r1 addi %r1,%r1,(124-16)@l - bla CNAME(cpudep_ap_early_bootstrap) + bl CNAME(cpudep_ap_early_bootstrap) lis %r3,1@l - bla CNAME(pmap_cpu_bootstrap) - bla CNAME(cpudep_ap_bootstrap) + bl CNAME(pmap_cpu_bootstrap) + bl CNAME(cpudep_ap_bootstrap) mr %r1,%r3 - bla CNAME(cpudep_ap_setup) + bl CNAME(cpudep_ap_setup) GET_CPUINFO(%r5) lwz %r3,(PC_RESTORE)(%r5) cmplwi %cr0,%r3,0 beq %cr0,2f li %r4, 1 b CNAME(longjmp) 2: #ifdef SMP - bla CNAME(machdep_ap_bootstrap) + bl CNAME(machdep_ap_bootstrap) #endif /* Should not be reached */ 9: b 9b /* * This code gets copied to all the trap vectors * (except ISI/DSI, ALI, and the interrupts) */ .globl CNAME(trapcode),CNAME(trapcodeend) CNAME(trapcode): mtsprg1 %r1 /* save SP */ mflr %r1 /* Save the old LR in r1 */ mtsprg2 %r1 /* And then in SPRG2 */ - li %r1, 0x20 /* How to get the vector from LR */ - bla generictrap /* LR & SPRG3 is exception # */ + lwz %r1, TRAP_GENTRAP(0) /* Get branch address */ + mtlr %r1 + li %r1, 0xe0 /* How to get the vector from LR */ + blrl /* LR & (0xff00 | r1) is exception # */ CNAME(trapcodeend): /* - * 64-bit version of trapcode. Identical, except it calls generictrap64. - */ - .globl CNAME(trapcode64) -CNAME(trapcode64): - mtsprg1 %r1 /* save SP */ - mflr %r1 /* Save the old LR in r1 */ - mtsprg2 %r1 /* And then in SPRG2 */ - li %r1, 0x20 /* How to get the vector from LR */ - bla generictrap64 /* LR & SPRG3 is exception # */ - -/* * For ALI: has to save DSISR and DAR */ .globl CNAME(alitrap),CNAME(aliend) CNAME(alitrap): mtsprg1 %r1 /* save SP */ GET_CPUINFO(%r1) stw %r28,(PC_TEMPSAVE+CPUSAVE_R28)(%r1) /* free r28-r31 */ stw %r29,(PC_TEMPSAVE+CPUSAVE_R29)(%r1) stw %r30,(PC_TEMPSAVE+CPUSAVE_R30)(%r1) stw %r31,(PC_TEMPSAVE+CPUSAVE_R31)(%r1) mfdar %r30 mfdsisr %r31 stw %r30,(PC_TEMPSAVE+CPUSAVE_AIM_DAR)(%r1) stw %r31,(PC_TEMPSAVE+CPUSAVE_AIM_DSISR)(%r1) mfsprg1 %r1 /* restore SP, in case of branch */ mflr %r28 /* save LR */ mfcr %r29 /* save CR */ /* Put our exception vector in SPRG3 */ li %r31, EXC_ALI mtsprg3 %r31 /* Test whether we already had PR set */ mfsrr1 %r31 mtcr %r31 - bla s_trap + + /* Jump to s_trap */ + bl 1f + .long s_trap +1: mflr %r31 + lwz %r31,0(%r31) + mtlr %r31 + blrl CNAME(aliend): /* * G2 specific: instuction TLB miss. */ .globl CNAME(imisstrap),CNAME(imisssize) CNAME(imisstrap): mfspr %r2, SPR_HASH1 /* get first pointer */ addi %r1, 0, 8 /* load 8 for counter */ mfctr %r0 /* save counter */ mfspr %r3, SPR_ICMP /* get first compare value */ addi %r2, %r2, -8 /* pre dec the pointer */ im0: mtctr %r1 /* load counter */ im1: lwzu %r1, 8(%r2) /* get next pte */ cmp 0, %r1, %r3 /* see if found pte */ bdnzf 2, im1 /* dec count br if cmp ne and if * count not zero */ bne instr_sec_hash /* if not found set up second hash * or exit */ lwz %r1, +4(%r2) /* load tlb entry lower-word */ andi. %r3, %r1, 8 /* check G bit */ bne do_isi_prot /* if guarded, take an ISI */ mtctr %r0 /* restore counter */ mfspr %r0, SPR_IMISS /* get the miss address for the tlbli */ mfspr %r3, SPR_SRR1 /* get the saved cr0 bits */ mtcrf 0x80, %r3 /* restore CR0 */ mtspr SPR_RPA, %r1 /* set the pte */ ori %r1, %r1, 0x100 /* set reference bit */ srwi %r1, %r1, 8 /* get byte 7 of pte */ tlbli %r0 /* load the itlb */ stb %r1, +6(%r2) /* update page table */ rfi /* return to executing program */ instr_sec_hash: andi. %r1, %r3, 0x0040 /* see if we have done second hash */ bne do_isi /* if so, go to ISI interrupt */ mfspr %r2, SPR_HASH2 /* get the second pointer */ ori %r3, %r3, 0x0040 /* change the compare value */ addi %r1, %r0, 8 /* load 8 for counter */ addi %r2, %r2, -8 /* pre dec for update on load */ b im0 /* try second hash */ /* Create a faked ISI interrupt as the address was not found */ do_isi_prot: mfspr %r3, SPR_SRR1 /* get srr1 */ andi. %r2, %r3, 0xffff /* clean upper srr1 */ addis %r2, %r2, 0x0800 /* or in srr<4> = 1 to flag prot * violation */ b isi1 do_isi: mfspr %r3, SPR_SRR1 /* get srr1 */ andi. %r2, %r3, 0xffff /* clean srr1 */ addis %r2, %r2, 0x4000 /* or in srr1<1> = 1 to flag pte * not found */ isi1: mtctr %r0 /* restore counter */ mtspr SPR_SRR1, %r2 /* set srr1 */ mfmsr %r0 /* get msr */ xoris %r0, %r0, 0x2 /* flip the msr bit */ mtcrf 0x80, %r3 /* restore CR0 */ mtmsr %r0 /* flip back to the native gprs */ - ba EXC_ISI /* go to instr. access interrupt */ + ba EXC_ISI /* go to instr. access interrupt */ CNAME(imisssize) = .-CNAME(imisstrap) /* * G2 specific: data load TLB miss. */ .globl CNAME(dlmisstrap),CNAME(dlmisssize) CNAME(dlmisstrap): mfspr %r2, SPR_HASH1 /* get first pointer */ addi %r1, 0, 8 /* load 8 for counter */ mfctr %r0 /* save counter */ mfspr %r3, SPR_DCMP /* get first compare value */ addi %r2, %r2, -8 /* pre dec the pointer */ dm0: mtctr %r1 /* load counter */ dm1: lwzu %r1, 8(%r2) /* get next pte */ cmp 0, 0, %r1, %r3 /* see if found pte */ bdnzf 2, dm1 /* dec count br if cmp ne and if * count not zero */ bne data_sec_hash /* if not found set up second hash * or exit */ lwz %r1, +4(%r2) /* load tlb entry lower-word */ mtctr %r0 /* restore counter */ mfspr %r0, SPR_DMISS /* get the miss address for the tlbld */ mfspr %r3, SPR_SRR1 /* get the saved cr0 bits */ mtcrf 0x80, %r3 /* restore CR0 */ mtspr SPR_RPA, %r1 /* set the pte */ ori %r1, %r1, 0x100 /* set reference bit */ srwi %r1, %r1, 8 /* get byte 7 of pte */ tlbld %r0 /* load the dtlb */ stb %r1, +6(%r2) /* update page table */ rfi /* return to executing program */ data_sec_hash: andi. %r1, %r3, 0x0040 /* see if we have done second hash */ bne do_dsi /* if so, go to DSI interrupt */ mfspr %r2, SPR_HASH2 /* get the second pointer */ ori %r3, %r3, 0x0040 /* change the compare value */ addi %r1, 0, 8 /* load 8 for counter */ addi %r2, %r2, -8 /* pre dec for update on load */ b dm0 /* try second hash */ CNAME(dlmisssize) = .-CNAME(dlmisstrap) /* * G2 specific: data store TLB miss. */ .globl CNAME(dsmisstrap),CNAME(dsmisssize) CNAME(dsmisstrap): mfspr %r2, SPR_HASH1 /* get first pointer */ addi %r1, 0, 8 /* load 8 for counter */ mfctr %r0 /* save counter */ mfspr %r3, SPR_DCMP /* get first compare value */ addi %r2, %r2, -8 /* pre dec the pointer */ ds0: mtctr %r1 /* load counter */ ds1: lwzu %r1, 8(%r2) /* get next pte */ cmp 0, 0, %r1, %r3 /* see if found pte */ bdnzf 2, ds1 /* dec count br if cmp ne and if * count not zero */ bne data_store_sec_hash /* if not found set up second hash * or exit */ lwz %r1, +4(%r2) /* load tlb entry lower-word */ andi. %r3, %r1, 0x80 /* check the C-bit */ beq data_store_chk_prot /* if (C==0) * go check protection modes */ ds2: mtctr %r0 /* restore counter */ mfspr %r0, SPR_DMISS /* get the miss address for the tlbld */ mfspr %r3, SPR_SRR1 /* get the saved cr0 bits */ mtcrf 0x80, %r3 /* restore CR0 */ mtspr SPR_RPA, %r1 /* set the pte */ tlbld %r0 /* load the dtlb */ rfi /* return to executing program */ data_store_sec_hash: andi. %r1, %r3, 0x0040 /* see if we have done second hash */ bne do_dsi /* if so, go to DSI interrupt */ mfspr %r2, SPR_HASH2 /* get the second pointer */ ori %r3, %r3, 0x0040 /* change the compare value */ addi %r1, 0, 8 /* load 8 for counter */ addi %r2, %r2, -8 /* pre dec for update on load */ b ds0 /* try second hash */ /* Check the protection before setting PTE(c-bit) */ data_store_chk_prot: rlwinm. %r3,%r1,30,0,1 /* test PP */ bge- chk0 /* if (PP == 00 or PP == 01) * goto chk0: */ andi. %r3, %r1, 1 /* test PP[0] */ beq+ chk2 /* return if PP[0] == 0 */ b do_dsi_prot /* else DSIp */ chk0: mfspr %r3,SPR_SRR1 /* get old msr */ andis. %r3,%r3,0x0008 /* test the KEY bit (SRR1-bit 12) */ beq chk2 /* if (KEY==0) goto chk2: */ b do_dsi_prot /* else do_dsi_prot */ chk2: ori %r1, %r1, 0x180 /* set reference and change bit */ sth %r1, 6(%r2) /* update page table */ b ds2 /* and back we go */ /* Create a faked DSI interrupt as the address was not found */ do_dsi: mfspr %r3, SPR_SRR1 /* get srr1 */ rlwinm %r1,%r3,9,6,6 /* get srr1 to bit 6 for * load/store, zero rest */ addis %r1, %r1, 0x4000 /* or in dsisr<1> = 1 to flag pte * not found */ b dsi1 do_dsi_prot: mfspr %r3, SPR_SRR1 /* get srr1 */ rlwinm %r1,%r3,9,6,6 /* get srr1 to bit 6 for *load/store, zero rest */ addis %r1, %r1, 0x0800 /* or in dsisr<4> = 1 to flag prot * violation */ dsi1: mtctr %r0 /* restore counter */ andi. %r2, %r3, 0xffff /* clear upper bits of srr1 */ mtspr SPR_SRR1, %r2 /* set srr1 */ mtspr SPR_DSISR, %r1 /* load the dsisr */ mfspr %r1, SPR_DMISS /* get miss address */ rlwinm. %r2,%r2,0,31,31 /* test LE bit */ beq dsi2 /* if little endian then: */ xor %r1, %r1, 0x07 /* de-mung the data address */ dsi2: mtspr SPR_DAR, %r1 /* put in dar */ mfmsr %r0 /* get msr */ xoris %r0, %r0, 0x2 /* flip the msr bit */ mtcrf 0x80, %r3 /* restore CR0 */ mtmsr %r0 /* flip back to the native gprs */ ba EXC_DSI /* branch to DSI interrupt */ CNAME(dsmisssize) = .-CNAME(dsmisstrap) /* * Similar to the above for DSI * Has to handle BAT spills * and standard pagetable spills */ .globl CNAME(dsitrap),CNAME(dsiend) CNAME(dsitrap): mtsprg1 %r1 /* save SP */ GET_CPUINFO(%r1) stw %r28,(PC_DISISAVE+CPUSAVE_R28)(%r1) /* free r28-r31 */ stw %r29,(PC_DISISAVE+CPUSAVE_R29)(%r1) stw %r30,(PC_DISISAVE+CPUSAVE_R30)(%r1) stw %r31,(PC_DISISAVE+CPUSAVE_R31)(%r1) mfsprg1 %r1 /* restore SP */ mfcr %r29 /* save CR */ mfxer %r30 /* save XER */ mtsprg2 %r30 /* in SPRG2 */ mfsrr1 %r31 /* test kernel mode */ mtcr %r31 bt 17,1f /* branch if PSL_PR is set */ mfdar %r31 /* get fault address */ rlwinm %r31,%r31,7,25,28 /* get segment * 8 */ /* get batu */ - addis %r31,%r31,CNAME(battable)@ha - lwz %r30,CNAME(battable)@l(31) + lwz %r30,TRAP_TOCBASE(0) + lwz %r30,CNAME(battable)@got(%r30) + add %r31,%r30,%r31 + lwz %r30,0(%r31) mtcr %r30 bf 30,1f /* branch if supervisor valid is false */ /* get batl */ - lwz %r31,CNAME(battable)+4@l(31) + lwz %r31,4(%r31) /* We randomly use the highest two bat registers here */ mftb %r28 andi. %r28,%r28,1 bne 2f mtdbatu 2,%r30 mtdbatl 2,%r31 b 3f 2: mtdbatu 3,%r30 mtdbatl 3,%r31 3: mfsprg2 %r30 /* restore XER */ mtxer %r30 mtcr %r29 /* restore CR */ mtsprg1 %r1 GET_CPUINFO(%r1) lwz %r28,(PC_DISISAVE+CPUSAVE_R28)(%r1) /* restore r28-r31 */ lwz %r29,(PC_DISISAVE+CPUSAVE_R29)(%r1) lwz %r30,(PC_DISISAVE+CPUSAVE_R30)(%r1) lwz %r31,(PC_DISISAVE+CPUSAVE_R31)(%r1) mfsprg1 %r1 rfi /* return to trapped code */ 1: mflr %r28 /* save LR (SP already saved) */ - bla disitrap + + /* Jump to disitrap */ + bl 4f + .long disitrap +4: mflr %r1 + lwz %r1,0(%r1) + mtlr %r1 + blrl CNAME(dsiend): /* * Preamble code for DSI/ISI traps */ disitrap: /* Write the trap vector to SPRG3 by computing LR & 0xff00 */ mflr %r1 andi. %r1,%r1,0xff00 mtsprg3 %r1 GET_CPUINFO(%r1) lwz %r30,(PC_DISISAVE+CPUSAVE_R28)(%r1) stw %r30,(PC_TEMPSAVE+CPUSAVE_R28)(%r1) lwz %r31,(PC_DISISAVE+CPUSAVE_R29)(%r1) stw %r31,(PC_TEMPSAVE+CPUSAVE_R29)(%r1) lwz %r30,(PC_DISISAVE+CPUSAVE_R30)(%r1) stw %r30,(PC_TEMPSAVE+CPUSAVE_R30)(%r1) lwz %r31,(PC_DISISAVE+CPUSAVE_R31)(%r1) stw %r31,(PC_TEMPSAVE+CPUSAVE_R31)(%r1) mfdar %r30 mfdsisr %r31 stw %r30,(PC_TEMPSAVE+CPUSAVE_AIM_DAR)(%r1) stw %r31,(PC_TEMPSAVE+CPUSAVE_AIM_DSISR)(%r1) #ifdef KDB /* Try to detect a kernel stack overflow */ mfsrr1 %r31 mtcr %r31 bt 17,realtrap /* branch is user mode */ mfsprg1 %r31 /* get old SP */ clrrwi %r31,%r31,12 /* Round SP down to nearest page */ sub. %r30,%r31,%r30 /* SP - DAR */ bge 1f neg %r30,%r30 /* modulo value */ 1: cmplwi %cr0,%r30,4096 /* is DAR within a page of SP? */ bge %cr0,realtrap /* no, too far away. */ /* Now convert this DSI into a DDB trap. */ GET_CPUINFO(%r1) lwz %r30,(PC_TEMPSAVE+CPUSAVE_AIM_DAR)(%r1) /* get DAR */ stw %r30,(PC_DBSAVE +CPUSAVE_AIM_DAR)(%r1) /* save DAR */ lwz %r31,(PC_TEMPSAVE+CPUSAVE_AIM_DSISR)(%r1) /* get DSISR */ stw %r31,(PC_DBSAVE +CPUSAVE_AIM_DSISR)(%r1) /* save DSISR */ lwz %r30,(PC_DISISAVE+CPUSAVE_R28)(%r1) /* get r28 */ stw %r30,(PC_DBSAVE +CPUSAVE_R28)(%r1) /* save r28 */ lwz %r31,(PC_DISISAVE+CPUSAVE_R29)(%r1) /* get r29 */ stw %r31,(PC_DBSAVE +CPUSAVE_R29)(%r1) /* save r29 */ lwz %r30,(PC_DISISAVE+CPUSAVE_R30)(%r1) /* get r30 */ stw %r30,(PC_DBSAVE +CPUSAVE_R30)(%r1) /* save r30 */ lwz %r31,(PC_DISISAVE+CPUSAVE_R31)(%r1) /* get r31 */ stw %r31,(PC_DBSAVE +CPUSAVE_R31)(%r1) /* save r31 */ b dbtrap #endif /* XXX need stack probe here */ realtrap: /* Test whether we already had PR set */ mfsrr1 %r1 mtcr %r1 mfsprg1 %r1 /* restore SP (might have been overwritten) */ bf 17,k_trap /* branch if PSL_PR is false */ GET_CPUINFO(%r1) lwz %r1,PC_CURPCB(%r1) RESTORE_KERN_SRS(%r30,%r31) /* enable kernel mapping */ - ba s_trap + b s_trap /* * generictrap does some standard setup for trap handling to minimize * the code that need be installed in the actual vectors. It expects * the following conditions. * * R1 - Trap vector = LR & (0xff00 | R1) * SPRG1 - Original R1 contents * SPRG2 - Original LR */ + .globl CNAME(generictrap64) generictrap64: mtsprg3 %r31 mfmsr %r31 clrldi %r31,%r31,1 mtmsrd %r31 mfsprg3 %r31 isync + .globl CNAME(generictrap) generictrap: /* Save R1 for computing the exception vector */ mtsprg3 %r1 /* Save interesting registers */ GET_CPUINFO(%r1) stw %r28,(PC_TEMPSAVE+CPUSAVE_R28)(%r1) /* free r28-r31 */ stw %r29,(PC_TEMPSAVE+CPUSAVE_R29)(%r1) stw %r30,(PC_TEMPSAVE+CPUSAVE_R30)(%r1) stw %r31,(PC_TEMPSAVE+CPUSAVE_R31)(%r1) mfsprg1 %r1 /* restore SP, in case of branch */ mfsprg2 %r28 /* save LR */ mfcr %r29 /* save CR */ /* Compute the exception vector from the link register */ mfsprg3 %r31 ori %r31,%r31,0xff00 mflr %r30 and %r30,%r30,%r31 mtsprg3 %r30 /* Test whether we already had PR set */ mfsrr1 %r31 mtcr %r31 s_trap: bf 17,k_trap /* branch if PSL_PR is false */ GET_CPUINFO(%r1) u_trap: lwz %r1,PC_CURPCB(%r1) RESTORE_KERN_SRS(%r30,%r31) /* enable kernel mapping */ /* * Now the common trap catching code. */ k_trap: FRAME_SETUP(PC_TEMPSAVE) /* Restore USER_SR */ GET_CPUINFO(%r30) lwz %r30,PC_CURPCB(%r30) lwz %r30,PCB_AIM_USR_VSID(%r30) mtsr USER_SR,%r30; sync; isync /* Call C interrupt dispatcher: */ trapagain: addi %r3,%r1,8 bl CNAME(powerpc_interrupt) .globl CNAME(trapexit) /* backtrace code sentinel */ CNAME(trapexit): /* Disable interrupts: */ mfmsr %r3 andi. %r3,%r3,~PSL_EE@l mtmsr %r3 /* Test AST pending: */ lwz %r5,FRAME_SRR1+8(%r1) mtcr %r5 bf 17,1f /* branch if PSL_PR is false */ GET_CPUINFO(%r3) /* get per-CPU pointer */ lwz %r4, TD_FLAGS(%r2) /* get thread flags value * (r2 is curthread) */ lis %r5, (TDF_ASTPENDING|TDF_NEEDRESCHED)@h ori %r5,%r5, (TDF_ASTPENDING|TDF_NEEDRESCHED)@l and. %r4,%r4,%r5 beq 1f mfmsr %r3 /* re-enable interrupts */ ori %r3,%r3,PSL_EE@l mtmsr %r3 isync addi %r3,%r1,8 bl CNAME(ast) .globl CNAME(asttrapexit) /* backtrace code sentinel #2 */ CNAME(asttrapexit): b trapexit /* test ast ret value ? */ 1: FRAME_LEAVE(PC_TEMPSAVE) .globl CNAME(rfi_patch1) /* replace rfi with rfid on ppc64 */ CNAME(rfi_patch1): rfi .globl CNAME(rfid_patch) CNAME(rfid_patch): rfid #if defined(KDB) /* * Deliberate entry to dbtrap */ .globl CNAME(breakpoint) CNAME(breakpoint): mtsprg1 %r1 mfmsr %r3 mtsrr1 %r3 andi. %r3,%r3,~(PSL_EE|PSL_ME)@l mtmsr %r3 /* disable interrupts */ isync GET_CPUINFO(%r3) stw %r28,(PC_DBSAVE+CPUSAVE_R28)(%r3) stw %r29,(PC_DBSAVE+CPUSAVE_R29)(%r3) stw %r30,(PC_DBSAVE+CPUSAVE_R30)(%r3) stw %r31,(PC_DBSAVE+CPUSAVE_R31)(%r3) mflr %r28 li %r29,EXC_BPT mtlr %r29 mfcr %r29 mtsrr0 %r28 /* * Now the kdb trap catching code. */ dbtrap: /* Write the trap vector to SPRG3 by computing LR & 0xff00 */ mflr %r1 andi. %r1,%r1,0xff00 mtsprg3 %r1 - lis %r1,(tmpstk+TMPSTKSZ-16)@ha /* get new SP */ - addi %r1,%r1,(tmpstk+TMPSTKSZ-16)@l + lwz %r1,TRAP_TOCBASE(0) /* get new SP */ + lwz %r1,tmpstk@got(%r1) + addi %r1,%r1,TMPSTKSZ-16 FRAME_SETUP(PC_DBSAVE) /* Call C trap code: */ addi %r3,%r1,8 bl CNAME(db_trap_glue) or. %r3,%r3,%r3 bne dbleave /* This wasn't for KDB, so switch to real trap: */ lwz %r3,FRAME_EXC+8(%r1) /* save exception */ GET_CPUINFO(%r4) stw %r3,(PC_DBSAVE+CPUSAVE_R31)(%r4) FRAME_LEAVE(PC_DBSAVE) mtsprg1 %r1 /* prepare for entrance to realtrap */ GET_CPUINFO(%r1) stw %r28,(PC_TEMPSAVE+CPUSAVE_R28)(%r1) stw %r29,(PC_TEMPSAVE+CPUSAVE_R29)(%r1) stw %r30,(PC_TEMPSAVE+CPUSAVE_R30)(%r1) stw %r31,(PC_TEMPSAVE+CPUSAVE_R31)(%r1) mflr %r28 mfcr %r29 lwz %r31,(PC_DBSAVE+CPUSAVE_R31)(%r1) mtsprg3 %r31 /* SPRG3 was clobbered by FRAME_LEAVE */ mfsprg1 %r1 b realtrap dbleave: FRAME_LEAVE(PC_DBSAVE) .globl CNAME(rfi_patch2) /* replace rfi with rfid on ppc64 */ CNAME(rfi_patch2): rfi /* * In case of KDB we want a separate trap catcher for it */ .globl CNAME(dblow),CNAME(dbend) CNAME(dblow): mtsprg1 %r1 /* save SP */ mtsprg2 %r29 /* save r29 */ mfcr %r29 /* save CR in r29 */ mfsrr1 %r1 mtcr %r1 bf 17,1f /* branch if privileged */ /* Unprivileged case */ mtcr %r29 /* put the condition register back */ mfsprg2 %r29 /* ... and r29 */ mflr %r1 /* save LR */ mtsprg2 %r1 /* And then in SPRG2 */ - li %r1, 0 /* How to get the vector from LR */ - bla generictrap /* and we look like a generic trap */ + lwz %r1, TRAP_GENTRAP(0) /* Get branch address */ + mtlr %r1 + li %r1, 0 /* How to get the vector from LR */ + blrl /* LR & (0xff00 | r1) is exception # */ 1: /* Privileged, so drop to KDB */ GET_CPUINFO(%r1) stw %r28,(PC_DBSAVE+CPUSAVE_R28)(%r1) /* free r28 */ mfsprg2 %r28 /* r29 holds cr... */ stw %r28,(PC_DBSAVE+CPUSAVE_R29)(%r1) /* free r29 */ stw %r30,(PC_DBSAVE+CPUSAVE_R30)(%r1) /* free r30 */ stw %r31,(PC_DBSAVE+CPUSAVE_R31)(%r1) /* free r31 */ mflr %r28 /* save LR */ - bla dbtrap + + /* Jump to dbtrap */ + bl 2f + .long dbtrap +2: mflr %r1 + lwz %r1,0(%r1) + mtlr %r1 + blrl CNAME(dbend): #endif /* KDB */ Index: head/sys/powerpc/aim/trap_subr64.S =================================================================== --- head/sys/powerpc/aim/trap_subr64.S (revision 279749) +++ head/sys/powerpc/aim/trap_subr64.S (revision 279750) @@ -1,873 +1,872 @@ /* $FreeBSD$ */ /* $NetBSD: trap_subr.S,v 1.20 2002/04/22 23:20:08 kleink Exp $ */ /*- * Copyright (C) 1995, 1996 Wolfgang Solfrank. * Copyright (C) 1995, 1996 TooLs GmbH. * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. All advertising materials mentioning features or use of this software * must display the following acknowledgement: * This product includes software developed by TooLs GmbH. * 4. The name of TooLs GmbH may not be used to endorse or promote products * derived from this software without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY TOOLS GMBH ``AS IS'' AND ANY EXPRESS OR * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. * IN NO EVENT SHALL TOOLS GMBH BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, * PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; * OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, * WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR * OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF * ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. */ /* * NOTICE: This is not a standalone file. to use it, #include it in * your port's locore.S, like so: * * #include */ /* * Save/restore segment registers */ /* * Restore SRs for a pmap * * Requires that r28-r31 be scratch, with r28 initialized to the SLB cache */ /* * User SRs are loaded through a pointer to the current pmap. */ restore_usersrs: GET_CPUINFO(%r28) ld %r28,PC_USERSLB(%r28) li %r29, 0 /* Set the counter to zero */ slbia slbmfee %r31,%r29 clrrdi %r31,%r31,28 slbie %r31 1: ld %r31, 0(%r28) /* Load SLB entry pointer */ cmpli 0, %r31, 0 /* If NULL, stop */ beqlr ld %r30, 0(%r31) /* Load SLBV */ ld %r31, 8(%r31) /* Load SLBE */ or %r31, %r31, %r29 /* Set SLBE slot */ slbmte %r30, %r31 /* Install SLB entry */ addi %r28, %r28, 8 /* Advance pointer */ addi %r29, %r29, 1 b 1b /* Repeat */ /* * Kernel SRs are loaded directly from the PCPU fields */ restore_kernsrs: GET_CPUINFO(%r28) addi %r28,%r28,PC_KERNSLB li %r29, 0 /* Set the counter to zero */ slbia slbmfee %r31,%r29 clrrdi %r31,%r31,28 slbie %r31 1: cmpli 0, %r29, USER_SLB_SLOT /* Skip the user slot */ beq- 2f ld %r31, 8(%r28) /* Load SLBE */ cmpli 0, %r31, 0 /* If SLBE is not valid, stop */ beqlr ld %r30, 0(%r28) /* Load SLBV */ slbmte %r30, %r31 /* Install SLB entry */ 2: addi %r28, %r28, 16 /* Advance pointer */ addi %r29, %r29, 1 cmpli 0, %r29, 64 /* Repeat if we are not at the end */ blt 1b blr /* * FRAME_SETUP assumes: * SPRG1 SP (1) * SPRG3 trap type * savearea r27-r31,DAR,DSISR (DAR & DSISR only for DSI traps) * r28 LR * r29 CR * r30 scratch * r31 scratch * r1 kernel stack * SRR0/1 as at start of trap * * NOTE: SPRG1 is never used while the MMU is on, making it safe to reuse * in any real-mode fault handler, including those handling double faults. */ #define FRAME_SETUP(savearea) \ /* Have to enable translation to allow access of kernel stack: */ \ GET_CPUINFO(%r31); \ mfsrr0 %r30; \ std %r30,(savearea+CPUSAVE_SRR0)(%r31); /* save SRR0 */ \ mfsrr1 %r30; \ std %r30,(savearea+CPUSAVE_SRR1)(%r31); /* save SRR1 */ \ mfsprg1 %r31; /* get saved SP (clears SPRG1) */ \ mfmsr %r30; \ ori %r30,%r30,(PSL_DR|PSL_IR|PSL_RI)@l; /* relocation on */ \ mtmsr %r30; /* stack can now be accessed */ \ isync; \ stdu %r31,-(FRAMELEN+288)(%r1); /* save it in the callframe */ \ std %r0, FRAME_0+48(%r1); /* save r0 in the trapframe */ \ std %r31,FRAME_1+48(%r1); /* save SP " " */ \ std %r2, FRAME_2+48(%r1); /* save r2 " " */ \ std %r28,FRAME_LR+48(%r1); /* save LR " " */ \ std %r29,FRAME_CR+48(%r1); /* save CR " " */ \ GET_CPUINFO(%r2); \ ld %r27,(savearea+CPUSAVE_R27)(%r2); /* get saved r27 */ \ ld %r28,(savearea+CPUSAVE_R28)(%r2); /* get saved r28 */ \ ld %r29,(savearea+CPUSAVE_R29)(%r2); /* get saved r29 */ \ ld %r30,(savearea+CPUSAVE_R30)(%r2); /* get saved r30 */ \ ld %r31,(savearea+CPUSAVE_R31)(%r2); /* get saved r31 */ \ std %r3, FRAME_3+48(%r1); /* save r3-r31 */ \ std %r4, FRAME_4+48(%r1); \ std %r5, FRAME_5+48(%r1); \ std %r6, FRAME_6+48(%r1); \ std %r7, FRAME_7+48(%r1); \ std %r8, FRAME_8+48(%r1); \ std %r9, FRAME_9+48(%r1); \ std %r10, FRAME_10+48(%r1); \ std %r11, FRAME_11+48(%r1); \ std %r12, FRAME_12+48(%r1); \ std %r13, FRAME_13+48(%r1); \ std %r14, FRAME_14+48(%r1); \ std %r15, FRAME_15+48(%r1); \ std %r16, FRAME_16+48(%r1); \ std %r17, FRAME_17+48(%r1); \ std %r18, FRAME_18+48(%r1); \ std %r19, FRAME_19+48(%r1); \ std %r20, FRAME_20+48(%r1); \ std %r21, FRAME_21+48(%r1); \ std %r22, FRAME_22+48(%r1); \ std %r23, FRAME_23+48(%r1); \ std %r24, FRAME_24+48(%r1); \ std %r25, FRAME_25+48(%r1); \ std %r26, FRAME_26+48(%r1); \ std %r27, FRAME_27+48(%r1); \ std %r28, FRAME_28+48(%r1); \ std %r29, FRAME_29+48(%r1); \ std %r30, FRAME_30+48(%r1); \ std %r31, FRAME_31+48(%r1); \ ld %r28,(savearea+CPUSAVE_AIM_DAR)(%r2); /* saved DAR */ \ ld %r29,(savearea+CPUSAVE_AIM_DSISR)(%r2);/* saved DSISR */\ ld %r30,(savearea+CPUSAVE_SRR0)(%r2); /* saved SRR0 */ \ ld %r31,(savearea+CPUSAVE_SRR1)(%r2); /* saved SRR1 */ \ mfxer %r3; \ mfctr %r4; \ mfsprg3 %r5; \ std %r3, FRAME_XER+48(1); /* save xer/ctr/exc */ \ std %r4, FRAME_CTR+48(1); \ std %r5, FRAME_EXC+48(1); \ std %r28,FRAME_AIM_DAR+48(1); \ std %r29,FRAME_AIM_DSISR+48(1); /* save dsisr/srr0/srr1 */ \ std %r30,FRAME_SRR0+48(1); \ std %r31,FRAME_SRR1+48(1); \ ld %r13,PC_CURTHREAD(%r2) /* set kernel curthread */ #define FRAME_LEAVE(savearea) \ /* Disable exceptions: */ \ mfmsr %r2; \ andi. %r2,%r2,~PSL_EE@l; \ mtmsr %r2; \ isync; \ /* Now restore regs: */ \ ld %r2,FRAME_SRR0+48(%r1); \ ld %r3,FRAME_SRR1+48(%r1); \ ld %r4,FRAME_CTR+48(%r1); \ ld %r5,FRAME_XER+48(%r1); \ ld %r6,FRAME_LR+48(%r1); \ GET_CPUINFO(%r7); \ std %r2,(savearea+CPUSAVE_SRR0)(%r7); /* save SRR0 */ \ std %r3,(savearea+CPUSAVE_SRR1)(%r7); /* save SRR1 */ \ ld %r7,FRAME_CR+48(%r1); \ mtctr %r4; \ mtxer %r5; \ mtlr %r6; \ mtsprg2 %r7; /* save cr */ \ ld %r31,FRAME_31+48(%r1); /* restore r0-31 */ \ ld %r30,FRAME_30+48(%r1); \ ld %r29,FRAME_29+48(%r1); \ ld %r28,FRAME_28+48(%r1); \ ld %r27,FRAME_27+48(%r1); \ ld %r26,FRAME_26+48(%r1); \ ld %r25,FRAME_25+48(%r1); \ ld %r24,FRAME_24+48(%r1); \ ld %r23,FRAME_23+48(%r1); \ ld %r22,FRAME_22+48(%r1); \ ld %r21,FRAME_21+48(%r1); \ ld %r20,FRAME_20+48(%r1); \ ld %r19,FRAME_19+48(%r1); \ ld %r18,FRAME_18+48(%r1); \ ld %r17,FRAME_17+48(%r1); \ ld %r16,FRAME_16+48(%r1); \ ld %r15,FRAME_15+48(%r1); \ ld %r14,FRAME_14+48(%r1); \ ld %r13,FRAME_13+48(%r1); \ ld %r12,FRAME_12+48(%r1); \ ld %r11,FRAME_11+48(%r1); \ ld %r10,FRAME_10+48(%r1); \ ld %r9, FRAME_9+48(%r1); \ ld %r8, FRAME_8+48(%r1); \ ld %r7, FRAME_7+48(%r1); \ ld %r6, FRAME_6+48(%r1); \ ld %r5, FRAME_5+48(%r1); \ ld %r4, FRAME_4+48(%r1); \ ld %r3, FRAME_3+48(%r1); \ ld %r2, FRAME_2+48(%r1); \ ld %r0, FRAME_0+48(%r1); \ ld %r1, FRAME_1+48(%r1); \ /* Can't touch %r1 from here on */ \ mtsprg3 %r3; /* save r3 */ \ /* Disable translation, machine check and recoverability: */ \ mfmsr %r3; \ andi. %r3,%r3,~(PSL_DR|PSL_IR|PSL_ME|PSL_RI)@l; \ mtmsr %r3; \ isync; \ /* Decide whether we return to user mode: */ \ GET_CPUINFO(%r3); \ ld %r3,(savearea+CPUSAVE_SRR1)(%r3); \ mtcr %r3; \ bf 17,1f; /* branch if PSL_PR is false */ \ /* Restore user SRs */ \ GET_CPUINFO(%r3); \ std %r27,(savearea+CPUSAVE_R27)(%r3); \ std %r28,(savearea+CPUSAVE_R28)(%r3); \ std %r29,(savearea+CPUSAVE_R29)(%r3); \ std %r30,(savearea+CPUSAVE_R30)(%r3); \ std %r31,(savearea+CPUSAVE_R31)(%r3); \ mflr %r27; /* preserve LR */ \ bl restore_usersrs; /* uses r28-r31 */ \ mtlr %r27; \ ld %r31,(savearea+CPUSAVE_R31)(%r3); \ ld %r30,(savearea+CPUSAVE_R30)(%r3); \ ld %r29,(savearea+CPUSAVE_R29)(%r3); \ ld %r28,(savearea+CPUSAVE_R28)(%r3); \ ld %r27,(savearea+CPUSAVE_R27)(%r3); \ 1: mfsprg2 %r3; /* restore cr */ \ mtcr %r3; \ GET_CPUINFO(%r3); \ ld %r3,(savearea+CPUSAVE_SRR0)(%r3); /* restore srr0 */ \ mtsrr0 %r3; \ GET_CPUINFO(%r3); \ ld %r3,(savearea+CPUSAVE_SRR1)(%r3); /* restore srr1 */ \ mtsrr1 %r3; \ mfsprg3 %r3 /* restore r3 */ #ifdef KDTRACE_HOOKS .data .globl dtrace_invop_calltrap_addr .align 8 .type dtrace_invop_calltrap_addr, @object .size dtrace_invop_calltrap_addr, 8 dtrace_invop_calltrap_addr: .word 0 .word 0 .text #endif /* * Processor reset exception handler. These are typically * the first instructions the processor executes after a * software reset. We do this in two bits so that we are * not still hanging around in the trap handling region * once the MMU is turned on. */ .globl CNAME(rstcode), CNAME(rstcodeend) CNAME(rstcode): /* Explicitly set MSR[SF] */ mfmsr %r9 li %r8,1 insrdi %r9,%r8,1,0 mtmsrd %r9 isync bl 1f .llong cpu_reset 1: mflr %r9 ld %r9,0(%r9) mtlr %r9 blr CNAME(rstcodeend): cpu_reset: GET_TOCBASE(%r2) ld %r1,TOC_REF(tmpstk)(%r2) /* get new SP */ addi %r1,%r1,(TMPSTKSZ-48) bl CNAME(cpudep_ap_early_bootstrap) /* Set PCPU */ nop lis %r3,1@l bl CNAME(pmap_cpu_bootstrap) /* Turn on virtual memory */ nop bl CNAME(cpudep_ap_bootstrap) /* Set up PCPU and stack */ nop mr %r1,%r3 /* Use new stack */ bl CNAME(cpudep_ap_setup) nop GET_CPUINFO(%r5) ld %r3,(PC_RESTORE)(%r5) cmpldi %cr0,%r3,0 beq %cr0,2f nop li %r4,1 b CNAME(longjmp) nop 2: #ifdef SMP bl CNAME(machdep_ap_bootstrap) /* And away! */ nop #endif /* Should not be reached */ 9: b 9b /* * This code gets copied to all the trap vectors * (except ISI/DSI, ALI, and the interrupts). Has to fit in 8 instructions! */ .globl CNAME(trapcode),CNAME(trapcodeend) .p2align 3 CNAME(trapcode): mtsprg1 %r1 /* save SP */ mflr %r1 /* Save the old LR in r1 */ mtsprg2 %r1 /* And then in SPRG2 */ li %r1,TRAP_GENTRAP ld %r1,0(%r1) mtlr %r1 li %r1, 0xe0 /* How to get the vector from LR */ blrl /* Branch to generictrap */ CNAME(trapcodeend): /* * For SLB misses: do special things for the kernel * * Note: SPRG1 is always safe to overwrite any time the MMU is on, which is * the only time this can be called. */ .globl CNAME(slbtrap),CNAME(slbtrapend) .p2align 3 CNAME(slbtrap): mtsprg1 %r1 /* save SP */ GET_CPUINFO(%r1) std %r2,(PC_SLBSAVE+16)(%r1) mfcr %r2 /* save CR */ std %r2,(PC_SLBSAVE+104)(%r1) mfsrr1 %r2 /* test kernel mode */ mtcr %r2 bf 17,2f /* branch if PSL_PR is false */ /* User mode */ ld %r2,(PC_SLBSAVE+104)(%r1) /* Restore CR */ mtcr %r2 ld %r2,(PC_SLBSAVE+16)(%r1) /* Restore R2 */ mflr %r1 /* Save the old LR in r1 */ mtsprg2 %r1 /* And then in SPRG2 */ /* 52 bytes so far */ bl 1f .llong generictrap 1: mflr %r1 ld %r1,0(%r1) mtlr %r1 li %r1, 0x80 /* How to get the vector from LR */ blrl /* Branch to generictrap */ /* 84 bytes */ 2: mflr %r2 /* Save the old LR in r2 */ nop bl 3f /* Begin dance to jump to kern_slbtrap*/ .llong kern_slbtrap 3: mflr %r1 ld %r1,0(%r1) mtlr %r1 GET_CPUINFO(%r1) blrl /* 124 bytes -- 4 to spare */ CNAME(slbtrapend): kern_slbtrap: std %r2,(PC_SLBSAVE+136)(%r1) /* old LR */ std %r3,(PC_SLBSAVE+24)(%r1) /* save R3 */ /* Check if this needs to be handled as a regular trap (userseg miss) */ mflr %r2 andi. %r2,%r2,0xff80 cmpwi %r2,0x380 bne 1f mfdar %r2 b 2f 1: mfsrr0 %r2 2: /* r2 now contains the fault address */ lis %r3,SEGMENT_MASK@highesta ori %r3,%r3,SEGMENT_MASK@highera sldi %r3,%r3,32 oris %r3,%r3,SEGMENT_MASK@ha ori %r3,%r3,SEGMENT_MASK@l and %r2,%r2,%r3 /* R2 = segment base address */ lis %r3,USER_ADDR@highesta ori %r3,%r3,USER_ADDR@highera sldi %r3,%r3,32 oris %r3,%r3,USER_ADDR@ha ori %r3,%r3,USER_ADDR@l cmpd %r2,%r3 /* Compare fault base to USER_ADDR */ bne 3f /* User seg miss, handle as a regular trap */ ld %r2,(PC_SLBSAVE+104)(%r1) /* Restore CR */ mtcr %r2 ld %r2,(PC_SLBSAVE+16)(%r1) /* Restore R2,R3 */ ld %r3,(PC_SLBSAVE+24)(%r1) ld %r1,(PC_SLBSAVE+136)(%r1) /* Save the old LR in r1 */ mtsprg2 %r1 /* And then in SPRG2 */ li %r1, 0x80 /* How to get the vector from LR */ b generictrap /* Retain old LR using b */ 3: /* Real kernel SLB miss */ std %r0,(PC_SLBSAVE+0)(%r1) /* free all volatile regs */ mfsprg1 %r2 /* Old R1 */ std %r2,(PC_SLBSAVE+8)(%r1) /* R2,R3 already saved */ std %r4,(PC_SLBSAVE+32)(%r1) std %r5,(PC_SLBSAVE+40)(%r1) std %r6,(PC_SLBSAVE+48)(%r1) std %r7,(PC_SLBSAVE+56)(%r1) std %r8,(PC_SLBSAVE+64)(%r1) std %r9,(PC_SLBSAVE+72)(%r1) std %r10,(PC_SLBSAVE+80)(%r1) std %r11,(PC_SLBSAVE+88)(%r1) std %r12,(PC_SLBSAVE+96)(%r1) /* CR already saved */ mfxer %r2 /* save XER */ std %r2,(PC_SLBSAVE+112)(%r1) mflr %r2 /* save LR (SP already saved) */ std %r2,(PC_SLBSAVE+120)(%r1) mfctr %r2 /* save CTR */ std %r2,(PC_SLBSAVE+128)(%r1) /* Call handler */ addi %r1,%r1,PC_SLBSTACK-48+1024 li %r2,~15 and %r1,%r1,%r2 GET_TOCBASE(%r2) mflr %r3 andi. %r3,%r3,0xff80 mfdar %r4 mfsrr0 %r5 bl handle_kernel_slb_spill nop /* Save r28-31, restore r4-r12 */ GET_CPUINFO(%r1) ld %r4,(PC_SLBSAVE+32)(%r1) ld %r5,(PC_SLBSAVE+40)(%r1) ld %r6,(PC_SLBSAVE+48)(%r1) ld %r7,(PC_SLBSAVE+56)(%r1) ld %r8,(PC_SLBSAVE+64)(%r1) ld %r9,(PC_SLBSAVE+72)(%r1) ld %r10,(PC_SLBSAVE+80)(%r1) ld %r11,(PC_SLBSAVE+88)(%r1) ld %r12,(PC_SLBSAVE+96)(%r1) std %r28,(PC_SLBSAVE+64)(%r1) std %r29,(PC_SLBSAVE+72)(%r1) std %r30,(PC_SLBSAVE+80)(%r1) std %r31,(PC_SLBSAVE+88)(%r1) /* Restore kernel mapping */ bl restore_kernsrs /* Restore remaining registers */ ld %r28,(PC_SLBSAVE+64)(%r1) ld %r29,(PC_SLBSAVE+72)(%r1) ld %r30,(PC_SLBSAVE+80)(%r1) ld %r31,(PC_SLBSAVE+88)(%r1) ld %r2,(PC_SLBSAVE+104)(%r1) mtcr %r2 ld %r2,(PC_SLBSAVE+112)(%r1) mtxer %r2 ld %r2,(PC_SLBSAVE+120)(%r1) mtlr %r2 ld %r2,(PC_SLBSAVE+128)(%r1) mtctr %r2 ld %r2,(PC_SLBSAVE+136)(%r1) mtlr %r2 /* Restore r0-r3 */ ld %r0,(PC_SLBSAVE+0)(%r1) ld %r2,(PC_SLBSAVE+16)(%r1) ld %r3,(PC_SLBSAVE+24)(%r1) mfsprg1 %r1 /* Back to whatever we were doing */ rfid /* * For ALI: has to save DSISR and DAR */ .globl CNAME(alitrap),CNAME(aliend) CNAME(alitrap): mtsprg1 %r1 /* save SP */ GET_CPUINFO(%r1) std %r27,(PC_TEMPSAVE+CPUSAVE_R27)(%r1) /* free r27-r31 */ std %r28,(PC_TEMPSAVE+CPUSAVE_R28)(%r1) std %r29,(PC_TEMPSAVE+CPUSAVE_R29)(%r1) std %r30,(PC_TEMPSAVE+CPUSAVE_R30)(%r1) std %r31,(PC_TEMPSAVE+CPUSAVE_R31)(%r1) mfdar %r30 mfdsisr %r31 std %r30,(PC_TEMPSAVE+CPUSAVE_AIM_DAR)(%r1) std %r31,(PC_TEMPSAVE+CPUSAVE_AIM_DSISR)(%r1) mfsprg1 %r1 /* restore SP, in case of branch */ mflr %r28 /* save LR */ mfcr %r29 /* save CR */ /* Begin dance to branch to s_trap in a bit */ b 1f .p2align 3 1: nop bl 1f .llong s_trap 1: mflr %r31 ld %r31,0(%r31) mtlr %r31 /* Put our exception vector in SPRG3 */ li %r31, EXC_ALI mtsprg3 %r31 /* Test whether we already had PR set */ mfsrr1 %r31 mtcr %r31 blrl CNAME(aliend): /* * Similar to the above for DSI * Has to handle standard pagetable spills */ .globl CNAME(dsitrap),CNAME(dsiend) CNAME(dsitrap): mtsprg1 %r1 /* save SP */ GET_CPUINFO(%r1) std %r27,(PC_DISISAVE+CPUSAVE_R27)(%r1) /* free r27-r31 */ std %r28,(PC_DISISAVE+CPUSAVE_R28)(%r1) std %r29,(PC_DISISAVE+CPUSAVE_R29)(%r1) std %r30,(PC_DISISAVE+CPUSAVE_R30)(%r1) std %r31,(PC_DISISAVE+CPUSAVE_R31)(%r1) mfcr %r29 /* save CR */ mfxer %r30 /* save XER */ mtsprg2 %r30 /* in SPRG2 */ mfsrr1 %r31 /* test kernel mode */ mtcr %r31 mflr %r28 /* save LR (SP already saved) */ bl 1f /* Begin branching to disitrap */ .llong disitrap 1: mflr %r1 ld %r1,0(%r1) mtlr %r1 blrl /* Branch to generictrap */ CNAME(dsiend): /* * Preamble code for DSI/ISI traps */ disitrap: /* Write the trap vector to SPRG3 by computing LR & 0xff00 */ mflr %r1 andi. %r1,%r1,0xff00 mtsprg3 %r1 GET_CPUINFO(%r1) ld %r31,(PC_DISISAVE+CPUSAVE_R27)(%r1) std %r31,(PC_TEMPSAVE+CPUSAVE_R27)(%r1) ld %r30,(PC_DISISAVE+CPUSAVE_R28)(%r1) std %r30,(PC_TEMPSAVE+CPUSAVE_R28)(%r1) ld %r31,(PC_DISISAVE+CPUSAVE_R29)(%r1) std %r31,(PC_TEMPSAVE+CPUSAVE_R29)(%r1) ld %r30,(PC_DISISAVE+CPUSAVE_R30)(%r1) std %r30,(PC_TEMPSAVE+CPUSAVE_R30)(%r1) ld %r31,(PC_DISISAVE+CPUSAVE_R31)(%r1) std %r31,(PC_TEMPSAVE+CPUSAVE_R31)(%r1) mfdar %r30 mfdsisr %r31 std %r30,(PC_TEMPSAVE+CPUSAVE_AIM_DAR)(%r1) std %r31,(PC_TEMPSAVE+CPUSAVE_AIM_DSISR)(%r1) #ifdef KDB /* Try to detect a kernel stack overflow */ mfsrr1 %r31 mtcr %r31 bt 17,realtrap /* branch is user mode */ mfsprg1 %r31 /* get old SP */ clrrdi %r31,%r31,12 /* Round SP down to nearest page */ sub. %r30,%r31,%r30 /* SP - DAR */ bge 1f neg %r30,%r30 /* modulo value */ 1: cmpldi %cr0,%r30,4096 /* is DAR within a page of SP? */ bge %cr0,realtrap /* no, too far away. */ /* Now convert this DSI into a DDB trap. */ GET_CPUINFO(%r1) ld %r30,(PC_TEMPSAVE+CPUSAVE_AIM_DAR)(%r1) /* get DAR */ std %r30,(PC_DBSAVE +CPUSAVE_AIM_DAR)(%r1) /* save DAR */ ld %r30,(PC_TEMPSAVE+CPUSAVE_AIM_DSISR)(%r1) /* get DSISR */ std %r30,(PC_DBSAVE +CPUSAVE_AIM_DSISR)(%r1) /* save DSISR */ ld %r31,(PC_DISISAVE+CPUSAVE_R27)(%r1) /* get r27 */ std %r31,(PC_DBSAVE +CPUSAVE_R27)(%r1) /* save r27 */ ld %r30,(PC_DISISAVE+CPUSAVE_R28)(%r1) /* get r28 */ std %r30,(PC_DBSAVE +CPUSAVE_R28)(%r1) /* save r28 */ ld %r31,(PC_DISISAVE+CPUSAVE_R29)(%r1) /* get r29 */ std %r31,(PC_DBSAVE +CPUSAVE_R29)(%r1) /* save r29 */ ld %r30,(PC_DISISAVE+CPUSAVE_R30)(%r1) /* get r30 */ std %r30,(PC_DBSAVE +CPUSAVE_R30)(%r1) /* save r30 */ ld %r31,(PC_DISISAVE+CPUSAVE_R31)(%r1) /* get r31 */ std %r31,(PC_DBSAVE +CPUSAVE_R31)(%r1) /* save r31 */ b dbtrap #endif /* XXX need stack probe here */ realtrap: /* Test whether we already had PR set */ mfsrr1 %r1 mtcr %r1 mfsprg1 %r1 /* restore SP (might have been overwritten) */ bf 17,k_trap /* branch if PSL_PR is false */ GET_CPUINFO(%r1) ld %r1,PC_CURPCB(%r1) mr %r27,%r28 /* Save LR, r29 */ mtsprg2 %r29 bl restore_kernsrs /* enable kernel mapping */ mfsprg2 %r29 mr %r28,%r27 b s_trap /* * generictrap does some standard setup for trap handling to minimize * the code that need be installed in the actual vectors. It expects * the following conditions. * * R1 - Trap vector = LR & (0xff00 | R1) * SPRG1 - Original R1 contents * SPRG2 - Original LR */ - .globl CNAME(trapcode2) -trapcode2: + .globl CNAME(generictrap) generictrap: /* Save R1 for computing the exception vector */ mtsprg3 %r1 /* Save interesting registers */ GET_CPUINFO(%r1) std %r27,(PC_TEMPSAVE+CPUSAVE_R27)(%r1) /* free r27-r31 */ std %r28,(PC_TEMPSAVE+CPUSAVE_R28)(%r1) std %r29,(PC_TEMPSAVE+CPUSAVE_R29)(%r1) std %r30,(PC_TEMPSAVE+CPUSAVE_R30)(%r1) std %r31,(PC_TEMPSAVE+CPUSAVE_R31)(%r1) mfdar %r30 std %r30,(PC_TEMPSAVE+CPUSAVE_AIM_DAR)(%r1) mfsprg1 %r1 /* restore SP, in case of branch */ mfsprg2 %r28 /* save LR */ mfcr %r29 /* save CR */ /* Compute the exception vector from the link register */ mfsprg3 %r31 ori %r31,%r31,0xff00 mflr %r30 addi %r30,%r30,-4 /* The branch instruction, not the next */ and %r30,%r30,%r31 mtsprg3 %r30 /* Test whether we already had PR set */ mfsrr1 %r31 mtcr %r31 s_trap: bf 17,k_trap /* branch if PSL_PR is false */ GET_CPUINFO(%r1) u_trap: ld %r1,PC_CURPCB(%r1) mr %r27,%r28 /* Save LR, r29 */ mtsprg2 %r29 bl restore_kernsrs /* enable kernel mapping */ mfsprg2 %r29 mr %r28,%r27 /* * Now the common trap catching code. */ k_trap: FRAME_SETUP(PC_TEMPSAVE) /* Call C interrupt dispatcher: */ trapagain: GET_TOCBASE(%r2) addi %r3,%r1,48 bl CNAME(powerpc_interrupt) nop .globl CNAME(trapexit) /* backtrace code sentinel */ CNAME(trapexit): /* Disable interrupts: */ mfmsr %r3 andi. %r3,%r3,~PSL_EE@l mtmsr %r3 isync /* Test AST pending: */ ld %r5,FRAME_SRR1+48(%r1) mtcr %r5 bf 17,1f /* branch if PSL_PR is false */ GET_CPUINFO(%r3) /* get per-CPU pointer */ lwz %r4, TD_FLAGS(%r13) /* get thread flags value */ lis %r5, (TDF_ASTPENDING|TDF_NEEDRESCHED)@h ori %r5,%r5, (TDF_ASTPENDING|TDF_NEEDRESCHED)@l and. %r4,%r4,%r5 beq 1f mfmsr %r3 /* re-enable interrupts */ ori %r3,%r3,PSL_EE@l mtmsr %r3 isync GET_TOCBASE(%r2) addi %r3,%r1,48 bl CNAME(ast) nop .globl CNAME(asttrapexit) /* backtrace code sentinel #2 */ CNAME(asttrapexit): b trapexit /* test ast ret value ? */ 1: FRAME_LEAVE(PC_TEMPSAVE) rfid #if defined(KDB) /* * Deliberate entry to dbtrap */ ASENTRY_NOPROF(breakpoint) mtsprg1 %r1 mfmsr %r3 mtsrr1 %r3 andi. %r3,%r3,~(PSL_EE|PSL_ME)@l mtmsr %r3 /* disable interrupts */ isync GET_CPUINFO(%r3) std %r27,(PC_DBSAVE+CPUSAVE_R27)(%r3) std %r28,(PC_DBSAVE+CPUSAVE_R28)(%r3) std %r29,(PC_DBSAVE+CPUSAVE_R29)(%r3) std %r30,(PC_DBSAVE+CPUSAVE_R30)(%r3) std %r31,(PC_DBSAVE+CPUSAVE_R31)(%r3) mflr %r28 li %r29,EXC_BPT mtlr %r29 mfcr %r29 mtsrr0 %r28 /* * Now the kdb trap catching code. */ dbtrap: /* Write the trap vector to SPRG3 by computing LR & 0xff00 */ mflr %r1 andi. %r1,%r1,0xff00 mtsprg3 %r1 li %r1,TRAP_TOCBASE /* get new SP */ ld %r1,0(%r1) ld %r1,TOC_REF(tmpstk)(%r1) addi %r1,%r1,(TMPSTKSZ-48) FRAME_SETUP(PC_DBSAVE) /* Call C trap code: */ GET_TOCBASE(%r2) addi %r3,%r1,48 bl CNAME(db_trap_glue) nop or. %r3,%r3,%r3 bne dbleave /* This wasn't for KDB, so switch to real trap: */ ld %r3,FRAME_EXC+48(%r1) /* save exception */ GET_CPUINFO(%r4) std %r3,(PC_DBSAVE+CPUSAVE_R31)(%r4) FRAME_LEAVE(PC_DBSAVE) mtsprg1 %r1 /* prepare for entrance to realtrap */ GET_CPUINFO(%r1) std %r27,(PC_TEMPSAVE+CPUSAVE_R27)(%r1) std %r28,(PC_TEMPSAVE+CPUSAVE_R28)(%r1) std %r29,(PC_TEMPSAVE+CPUSAVE_R29)(%r1) std %r30,(PC_TEMPSAVE+CPUSAVE_R30)(%r1) std %r31,(PC_TEMPSAVE+CPUSAVE_R31)(%r1) mflr %r28 mfcr %r29 ld %r31,(PC_DBSAVE+CPUSAVE_R31)(%r1) mtsprg3 %r31 /* SPRG3 was clobbered by FRAME_LEAVE */ mfsprg1 %r1 b realtrap dbleave: FRAME_LEAVE(PC_DBSAVE) rfid /* * In case of KDB we want a separate trap catcher for it */ .globl CNAME(dblow),CNAME(dbend) CNAME(dblow): mtsprg1 %r1 /* save SP */ mtsprg2 %r29 /* save r29 */ mfcr %r29 /* save CR in r29 */ mfsrr1 %r1 mtcr %r1 bf 17,1f /* branch if privileged */ /* Unprivileged case */ mtcr %r29 /* put the condition register back */ mfsprg2 %r29 /* ... and r29 */ mflr %r1 /* save LR */ mtsprg2 %r1 /* And then in SPRG2 */ nop /* Begin branching to generictrap */ bl 9f .llong generictrap 9: mflr %r1 ld %r1,0(%r1) mtlr %r1 li %r1, 0 /* How to get the vector from LR */ blrl /* Branch to generictrap */ 1: GET_CPUINFO(%r1) std %r27,(PC_DBSAVE+CPUSAVE_R27)(%r1) /* free r27 */ std %r28,(PC_DBSAVE+CPUSAVE_R28)(%r1) /* free r28 */ mfsprg2 %r28 /* r29 holds cr... */ std %r28,(PC_DBSAVE+CPUSAVE_R29)(%r1) /* free r29 */ std %r30,(PC_DBSAVE+CPUSAVE_R30)(%r1) /* free r30 */ std %r31,(PC_DBSAVE+CPUSAVE_R31)(%r1) /* free r31 */ mflr %r28 /* save LR */ bl 9f /* Begin branch */ .llong dbtrap 9: mflr %r1 ld %r1,0(%r1) mtlr %r1 blrl /* Branch to generictrap */ CNAME(dbend): #endif /* KDB */ Index: head/sys/powerpc/booke/locore.S =================================================================== --- head/sys/powerpc/booke/locore.S (revision 279749) +++ head/sys/powerpc/booke/locore.S (revision 279750) @@ -1,733 +1,755 @@ /*- * Copyright (C) 2007-2009 Semihalf, Rafal Jaworowski * Copyright (C) 2006 Semihalf, Marian Balakowicz * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN * NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED * TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR * PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF * LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING * NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS * SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. * * $FreeBSD$ */ #include "assym.s" #include #include #include #include #include #include #include #include #define TMPSTACKSZ 16384 .text .globl btext btext: /* * This symbol is here for the benefit of kvm_mkdb, and is supposed to * mark the start of kernel text. */ .globl kernel_text kernel_text: /* * Startup entry. Note, this must be the first thing in the text segment! */ .text .globl __start __start: /* * Assumptions on the boot loader: * - system memory starts from physical address 0 * - it's mapped by a single TBL1 entry * - TLB1 mapping is 1:1 pa to va * - kernel is loaded at 16MB boundary * - all PID registers are set to the same value * - CPU is running in AS=0 * * Registers contents provided by the loader(8): * r1 : stack pointer * r3 : metadata pointer * * We rearrange the TLB1 layout as follows: * - find TLB1 entry we started in * - make sure it's protected, ivalidate other entries * - create temp entry in the second AS (make sure it's not TLB[1]) * - switch to temp mapping * - map 16MB of RAM in TLB1[1] * - use AS=1, set EPN to KERNBASE and RPN to kernel load address * - switch to to TLB1[1] mapping * - invalidate temp mapping * * locore registers use: * r1 : stack pointer * r2 : trace pointer (AP only, for early diagnostics) * r3-r27 : scratch registers * r28 : temp TLB1 entry * r29 : initial TLB1 entry we started in * r30-r31 : arguments (metadata pointer) */ /* * Keep arguments in r30 & r31 for later use. */ mr %r30, %r3 mr %r31, %r4 /* * Initial cleanup */ li %r3, PSL_DE /* Keep debug exceptions for CodeWarrior. */ mtmsr %r3 isync lis %r3, HID0_E500_DEFAULT_SET@h ori %r3, %r3, HID0_E500_DEFAULT_SET@l mtspr SPR_HID0, %r3 isync lis %r3, HID1_E500_DEFAULT_SET@h ori %r3, %r3, HID1_E500_DEFAULT_SET@l mtspr SPR_HID1, %r3 isync /* Invalidate all entries in TLB0 */ li %r3, 0 bl tlb_inval_all cmpwi %r30, 0 beq done_mapping /* * Locate the TLB1 entry that maps this code */ bl 1f 1: mflr %r3 bl tlb1_find_current /* the entry found is returned in r29 */ bl tlb1_inval_all_but_current /* * Create temporary mapping in AS=1 and switch to it */ addi %r3, %r29, 1 bl tlb1_temp_mapping_as1 mfmsr %r3 ori %r3, %r3, (PSL_IS | PSL_DS) bl 2f 2: mflr %r4 addi %r4, %r4, 20 mtspr SPR_SRR0, %r4 mtspr SPR_SRR1, %r3 rfi /* Switch context */ /* * Invalidate initial entry */ mr %r3, %r29 bl tlb1_inval_entry /* * Setup final mapping in TLB1[1] and switch to it */ /* Final kernel mapping, map in 16 MB of RAM */ lis %r3, MAS0_TLBSEL1@h /* Select TLB1 */ li %r4, 0 /* Entry 0 */ rlwimi %r3, %r4, 16, 12, 15 mtspr SPR_MAS0, %r3 isync li %r3, (TLB_SIZE_64M << MAS1_TSIZE_SHIFT)@l oris %r3, %r3, (MAS1_VALID | MAS1_IPROT)@h mtspr SPR_MAS1, %r3 /* note TS was not filled, so it's TS=0 */ isync lis %r3, KERNBASE@h ori %r3, %r3, KERNBASE@l /* EPN = KERNBASE */ #ifdef SMP ori %r3, %r3, MAS2_M@l /* WIMGE = 0b00100 */ #endif mtspr SPR_MAS2, %r3 isync /* Discover phys load address */ bl 3f 3: mflr %r4 /* Use current address */ rlwinm %r4, %r4, 0, 0, 7 /* 16MB alignment mask */ ori %r4, %r4, (MAS3_SX | MAS3_SW | MAS3_SR)@l mtspr SPR_MAS3, %r4 /* Set RPN and protection */ isync tlbwe isync msync /* Switch to the above TLB1[1] mapping */ bl 4f 4: mflr %r4 rlwinm %r4, %r4, 0, 8, 31 /* Current offset from kernel load address */ rlwinm %r3, %r3, 0, 0, 19 add %r4, %r4, %r3 /* Convert to kernel virtual address */ addi %r4, %r4, 36 li %r3, PSL_DE /* Note AS=0 */ mtspr SPR_SRR0, %r4 mtspr SPR_SRR1, %r3 rfi /* * Invalidate temp mapping */ mr %r3, %r28 bl tlb1_inval_entry done_mapping: /* * Setup a temporary stack */ - lis %r1, tmpstack@ha - addi %r1, %r1, tmpstack@l + bl 1f + .long tmpstack-. +1: mflr %r1 + lwz %r2,0(%r1) + add %r1,%r1,%r2 addi %r1, %r1, (TMPSTACKSZ - 16) /* + * Relocate kernel + */ + bl 1f + .long _DYNAMIC-. + .long _GLOBAL_OFFSET_TABLE_-. +1: mflr %r5 + lwz %r3,0(%r5) /* _DYNAMIC in %r3 */ + add %r3,%r3,%r5 + lwz %r4,4(%r5) /* GOT pointer */ + add %r4,%r4,%r5 + lwz %r4,4(%r4) /* got[0] is _DYNAMIC link addr */ + subf %r4,%r4,%r3 /* subtract to calculate relocbase */ + bl elf_reloc_self + +/* * Initialise exception vector offsets */ bl ivor_setup /* * Set up arguments and jump to system initialization code */ mr %r3, %r30 mr %r4, %r31 /* Prepare core */ bl booke_init /* Switch to thread0.td_kstack now */ mr %r1, %r3 li %r3, 0 stw %r3, 0(%r1) /* Machine independet part, does not return */ bl mi_startup /* NOT REACHED */ 5: b 5b #ifdef SMP /************************************************************************/ /* AP Boot page */ /************************************************************************/ .text .globl __boot_page .align 12 __boot_page: bl 1f .globl bp_ntlb1s bp_ntlb1s: .long 0 .globl bp_tlb1 bp_tlb1: .space 4 * 3 * 16 .globl bp_tlb1_end bp_tlb1_end: /* * Initial configuration */ 1: mflr %r31 /* r31 hold the address of bp_ntlb1s */ /* Set HIDs */ lis %r3, HID0_E500_DEFAULT_SET@h ori %r3, %r3, HID0_E500_DEFAULT_SET@l mtspr SPR_HID0, %r3 isync lis %r3, HID1_E500_DEFAULT_SET@h ori %r3, %r3, HID1_E500_DEFAULT_SET@l mtspr SPR_HID1, %r3 isync /* Enable branch prediction */ li %r3, BUCSR_BPEN mtspr SPR_BUCSR, %r3 isync /* Invalidate all entries in TLB0 */ li %r3, 0 bl tlb_inval_all /* * Find TLB1 entry which is translating us now */ bl 2f 2: mflr %r3 bl tlb1_find_current /* the entry number found is in r29 */ bl tlb1_inval_all_but_current /* * Create temporary translation in AS=1 and switch to it */ lwz %r3, 0(%r31) bl tlb1_temp_mapping_as1 mfmsr %r3 ori %r3, %r3, (PSL_IS | PSL_DS) bl 3f 3: mflr %r4 addi %r4, %r4, 20 mtspr SPR_SRR0, %r4 mtspr SPR_SRR1, %r3 rfi /* Switch context */ /* * Invalidate initial entry */ mr %r3, %r29 bl tlb1_inval_entry /* * Setup final mapping in TLB1[1] and switch to it */ lwz %r6, 0(%r31) addi %r5, %r31, 4 li %r4, 0 4: lis %r3, MAS0_TLBSEL1@h rlwimi %r3, %r4, 16, 12, 15 mtspr SPR_MAS0, %r3 isync lwz %r3, 0(%r5) mtspr SPR_MAS1, %r3 isync lwz %r3, 4(%r5) mtspr SPR_MAS2, %r3 isync lwz %r3, 8(%r5) mtspr SPR_MAS3, %r3 isync tlbwe isync msync addi %r5, %r5, 12 addi %r4, %r4, 1 cmpw %r4, %r6 blt 4b /* Switch to the final mapping */ - lis %r5, __boot_page@ha - ori %r5, %r5, __boot_page@l bl 5f -5: mflr %r3 + .long __boot_page-. +5: mflr %r5 + lwz %r3,0(%r3) + add %r5,%r5,%r3 /* __boot_page in r5 */ + bl 6f +6: mflr %r3 rlwinm %r3, %r3, 0, 0xfff /* Offset from boot page start */ add %r3, %r3, %r5 /* Make this virtual address */ addi %r3, %r3, 32 li %r4, 0 /* Note AS=0 */ mtspr SPR_SRR0, %r3 mtspr SPR_SRR1, %r4 rfi /* * At this point we're running at virtual addresses KERNBASE and beyond so * it's allowed to directly access all locations the kernel was linked * against. */ /* * Invalidate temp mapping */ mr %r3, %r28 bl tlb1_inval_entry /* * Setup a temporary stack */ - lis %r1, tmpstack@ha - addi %r1, %r1, tmpstack@l + bl 1f + .long tmpstack-. +1: mflr %r1 + lwz %r2,0(%r1) + add %r1,%r1,%r2 addi %r1, %r1, (TMPSTACKSZ - 16) /* * Initialise exception vector offsets */ bl ivor_setup /* * Assign our pcpu instance */ - lis %r3, ap_pcpu@h - ori %r3, %r3, ap_pcpu@l + bl 1f + .long ap_pcpu-. +1: mflr %r4 + lwz %r3, 0(%r4) + add %r3, %r3, %r4 lwz %r3, 0(%r3) mtsprg0 %r3 bl pmap_bootstrap_ap bl cpudep_ap_bootstrap /* Switch to the idle thread's kstack */ mr %r1, %r3 bl machdep_ap_bootstrap /* NOT REACHED */ 6: b 6b #endif /* SMP */ /* * Invalidate all entries in the given TLB. * * r3 TLBSEL */ tlb_inval_all: rlwinm %r3, %r3, 3, 0x18 /* TLBSEL */ ori %r3, %r3, 0x4 /* INVALL */ tlbivax 0, %r3 isync msync tlbsync msync blr /* * expects address to look up in r3, returns entry number in r29 * * FIXME: the hidden assumption is we are now running in AS=0, but we should * retrieve actual AS from MSR[IS|DS] and put it in MAS6[SAS] */ tlb1_find_current: mfspr %r17, SPR_PID0 slwi %r17, %r17, MAS6_SPID0_SHIFT mtspr SPR_MAS6, %r17 isync tlbsx 0, %r3 mfspr %r17, SPR_MAS0 rlwinm %r29, %r17, 16, 20, 31 /* MAS0[ESEL] -> r29 */ /* Make sure we have IPROT set on the entry */ mfspr %r17, SPR_MAS1 oris %r17, %r17, MAS1_IPROT@h mtspr SPR_MAS1, %r17 isync tlbwe isync msync blr /* * Invalidates a single entry in TLB1. * * r3 ESEL * r4-r5 scratched */ tlb1_inval_entry: lis %r4, MAS0_TLBSEL1@h /* Select TLB1 */ rlwimi %r4, %r3, 16, 12, 15 /* Select our entry */ mtspr SPR_MAS0, %r4 isync tlbre li %r5, 0 /* MAS1[V] = 0 */ mtspr SPR_MAS1, %r5 isync tlbwe isync msync blr /* * r3 entry of temp translation * r29 entry of current translation * r28 returns temp entry passed in r3 * r4-r5 scratched */ tlb1_temp_mapping_as1: mr %r28, %r3 /* Read our current translation */ lis %r3, MAS0_TLBSEL1@h /* Select TLB1 */ rlwimi %r3, %r29, 16, 12, 15 /* Select our current entry */ mtspr SPR_MAS0, %r3 isync tlbre /* Prepare and write temp entry */ lis %r3, MAS0_TLBSEL1@h /* Select TLB1 */ rlwimi %r3, %r28, 16, 12, 15 /* Select temp entry */ mtspr SPR_MAS0, %r3 isync mfspr %r5, SPR_MAS1 li %r4, 1 /* AS=1 */ rlwimi %r5, %r4, 12, 19, 19 li %r4, 0 /* Global mapping, TID=0 */ rlwimi %r5, %r4, 16, 8, 15 oris %r5, %r5, (MAS1_VALID | MAS1_IPROT)@h mtspr SPR_MAS1, %r5 isync tlbwe isync msync blr /* * Loops over TLB1, invalidates all entries skipping the one which currently * maps this code. * * r29 current entry * r3-r5 scratched */ tlb1_inval_all_but_current: mr %r6, %r3 mfspr %r3, SPR_TLB1CFG /* Get number of entries */ andi. %r3, %r3, TLBCFG_NENTRY_MASK@l li %r4, 0 /* Start from Entry 0 */ 1: lis %r5, MAS0_TLBSEL1@h rlwimi %r5, %r4, 16, 12, 15 mtspr SPR_MAS0, %r5 isync tlbre mfspr %r5, SPR_MAS1 cmpw %r4, %r29 /* our current entry? */ beq 2f rlwinm %r5, %r5, 0, 2, 31 /* clear VALID and IPROT bits */ mtspr SPR_MAS1, %r5 isync tlbwe isync msync 2: addi %r4, %r4, 1 cmpw %r4, %r3 /* Check if this is the last entry */ bne 1b blr #ifdef SMP __boot_page_padding: /* * Boot page needs to be exactly 4K, with the last word of this page * acting as the reset vector, so we need to stuff the remainder. * Upon release from holdoff CPU fetches the last word of the boot * page. */ .space 4092 - (__boot_page_padding - __boot_page) b __boot_page #endif /* SMP */ /************************************************************************/ /* locore subroutines */ /************************************************************************/ /* * void tid_flush(tlbtid_t tid); * * Invalidate all TLB0 entries which match the given TID. Note this is * dedicated for cases when invalidation(s) should NOT be propagated to other * CPUs. * - * Global vars tlb0_ways, tlb0_entries_per_way are assumed to have been set up - * correctly (by tlb0_get_tlbconf()). + * void tid_flush(tlbtid_t tid, int tlb0_ways, int tlb0_entries_per_way); * + * XXX: why isn't this in C? */ ENTRY(tid_flush) cmpwi %r3, TID_KERNEL beq tid_flush_end /* don't evict kernel translations */ - /* Number of TLB0 ways */ - lis %r4, tlb0_ways@h - ori %r4, %r4, tlb0_ways@l - lwz %r4, 0(%r4) - - /* Number of entries / way */ - lis %r5, tlb0_entries_per_way@h - ori %r5, %r5, tlb0_entries_per_way@l - lwz %r5, 0(%r5) - /* Disable interrupts */ mfmsr %r10 wrteei 0 li %r6, 0 /* ways counter */ loop_ways: li %r7, 0 /* entries [per way] counter */ loop_entries: /* Select TLB0 and ESEL (way) */ lis %r8, MAS0_TLBSEL0@h rlwimi %r8, %r6, 16, 14, 15 mtspr SPR_MAS0, %r8 isync /* Select EPN (entry within the way) */ rlwinm %r8, %r7, 12, 13, 19 mtspr SPR_MAS2, %r8 isync tlbre /* Check if valid entry */ mfspr %r8, SPR_MAS1 andis. %r9, %r8, MAS1_VALID@h beq next_entry /* invalid entry */ /* Check if this is our TID */ rlwinm %r9, %r8, 16, 24, 31 cmplw %r9, %r3 bne next_entry /* not our TID */ /* Clear VALID bit */ rlwinm %r8, %r8, 0, 1, 31 mtspr SPR_MAS1, %r8 isync tlbwe isync msync next_entry: addi %r7, %r7, 1 cmpw %r7, %r5 bne loop_entries /* Next way */ addi %r6, %r6, 1 cmpw %r6, %r4 bne loop_ways /* Restore MSR (possibly re-enable interrupts) */ mtmsr %r10 isync tid_flush_end: blr /* * Cache disable/enable/inval sequences according * to section 2.16 of E500CORE RM. */ ENTRY(dcache_inval) /* Invalidate d-cache */ mfspr %r3, SPR_L1CSR0 ori %r3, %r3, (L1CSR0_DCFI | L1CSR0_DCLFR)@l msync isync mtspr SPR_L1CSR0, %r3 isync 1: mfspr %r3, SPR_L1CSR0 andi. %r3, %r3, L1CSR0_DCFI bne 1b blr ENTRY(dcache_disable) /* Disable d-cache */ mfspr %r3, SPR_L1CSR0 li %r4, L1CSR0_DCE@l not %r4, %r4 and %r3, %r3, %r4 msync isync mtspr SPR_L1CSR0, %r3 isync blr ENTRY(dcache_enable) /* Enable d-cache */ mfspr %r3, SPR_L1CSR0 oris %r3, %r3, (L1CSR0_DCPE | L1CSR0_DCE)@h ori %r3, %r3, (L1CSR0_DCPE | L1CSR0_DCE)@l msync isync mtspr SPR_L1CSR0, %r3 isync blr ENTRY(icache_inval) /* Invalidate i-cache */ mfspr %r3, SPR_L1CSR1 ori %r3, %r3, (L1CSR1_ICFI | L1CSR1_ICLFR)@l isync mtspr SPR_L1CSR1, %r3 isync 1: mfspr %r3, SPR_L1CSR1 andi. %r3, %r3, L1CSR1_ICFI bne 1b blr ENTRY(icache_disable) /* Disable i-cache */ mfspr %r3, SPR_L1CSR1 li %r4, L1CSR1_ICE@l not %r4, %r4 and %r3, %r3, %r4 isync mtspr SPR_L1CSR1, %r3 isync blr ENTRY(icache_enable) /* Enable i-cache */ mfspr %r3, SPR_L1CSR1 oris %r3, %r3, (L1CSR1_ICPE | L1CSR1_ICE)@h ori %r3, %r3, (L1CSR1_ICPE | L1CSR1_ICE)@l isync mtspr SPR_L1CSR1, %r3 isync blr /* * int setfault() * * Similar to setjmp to setup for handling faults on accesses to user memory. * Any routine using this may only call bcopy, either the form below, * or the (currently used) C code optimized, so it doesn't use any non-volatile * registers. */ .globl setfault setfault: mflr %r0 mfsprg0 %r4 lwz %r4, TD_PCB(%r2) stw %r3, PCB_ONFAULT(%r4) mfcr %r10 mfctr %r11 mfxer %r12 stw %r0, 0(%r3) stw %r1, 4(%r3) stw %r2, 8(%r3) stmw %r10, 12(%r3) /* store CR, CTR, XER, [r13 .. r31] */ li %r3, 0 /* return FALSE */ blr /************************************************************************/ /* Data section */ /************************************************************************/ .data + .align 3 +GLOBAL(__startkernel) + .long begin +GLOBAL(__endkernel) + .long end .align 4 tmpstack: .space TMPSTACKSZ tmpstackbound: .space 10240 /* XXX: this really should not be necessary */ /* * Compiled KERNBASE locations */ .globl kernbase .set kernbase, KERNBASE #include Index: head/sys/powerpc/booke/pmap.c =================================================================== --- head/sys/powerpc/booke/pmap.c (revision 279749) +++ head/sys/powerpc/booke/pmap.c (revision 279750) @@ -1,3323 +1,3323 @@ /*- * Copyright (C) 2007-2009 Semihalf, Rafal Jaworowski * Copyright (C) 2006 Semihalf, Marian Balakowicz * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN * NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED * TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR * PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF * LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING * NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS * SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. * * Some hw specific parts of this pmap were derived or influenced * by NetBSD's ibm4xx pmap module. More generic code is shared with * a few other pmap modules from the FreeBSD tree. */ /* * VM layout notes: * * Kernel and user threads run within one common virtual address space * defined by AS=0. * * Virtual address space layout: * ----------------------------- * 0x0000_0000 - 0xafff_ffff : user process * 0xb000_0000 - 0xbfff_ffff : pmap_mapdev()-ed area (PCI/PCIE etc.) * 0xc000_0000 - 0xc0ff_ffff : kernel reserved * 0xc000_0000 - data_end : kernel code+data, env, metadata etc. * 0xc100_0000 - 0xfeef_ffff : KVA * 0xc100_0000 - 0xc100_3fff : reserved for page zero/copy * 0xc100_4000 - 0xc200_3fff : reserved for ptbl bufs * 0xc200_4000 - 0xc200_8fff : guard page + kstack0 * 0xc200_9000 - 0xfeef_ffff : actual free KVA space * 0xfef0_0000 - 0xffff_ffff : I/O devices region */ #include __FBSDID("$FreeBSD$"); #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include "mmu_if.h" #ifdef DEBUG #define debugf(fmt, args...) printf(fmt, ##args) #else #define debugf(fmt, args...) #endif #define TODO panic("%s: not implemented", __func__); extern unsigned char _etext[]; extern unsigned char _end[]; extern uint32_t *bootinfo; #ifdef SMP extern uint32_t bp_ntlb1s; #endif vm_paddr_t kernload; vm_offset_t kernstart; vm_size_t kernsize; /* Message buffer and tables. */ static vm_offset_t data_start; static vm_size_t data_end; /* Phys/avail memory regions. */ static struct mem_region *availmem_regions; static int availmem_regions_sz; static struct mem_region *physmem_regions; static int physmem_regions_sz; /* Reserved KVA space and mutex for mmu_booke_zero_page. */ static vm_offset_t zero_page_va; static struct mtx zero_page_mutex; static struct mtx tlbivax_mutex; /* * Reserved KVA space for mmu_booke_zero_page_idle. This is used * by idle thred only, no lock required. */ static vm_offset_t zero_page_idle_va; /* Reserved KVA space and mutex for mmu_booke_copy_page. */ static vm_offset_t copy_page_src_va; static vm_offset_t copy_page_dst_va; static struct mtx copy_page_mutex; /**************************************************************************/ /* PMAP */ /**************************************************************************/ static int mmu_booke_enter_locked(mmu_t, pmap_t, vm_offset_t, vm_page_t, vm_prot_t, u_int flags, int8_t psind); unsigned int kptbl_min; /* Index of the first kernel ptbl. */ unsigned int kernel_ptbls; /* Number of KVA ptbls. */ /* * If user pmap is processed with mmu_booke_remove and the resident count * drops to 0, there are no more pages to remove, so we need not continue. */ #define PMAP_REMOVE_DONE(pmap) \ ((pmap) != kernel_pmap && (pmap)->pm_stats.resident_count == 0) -extern void tid_flush(tlbtid_t); +extern void tid_flush(tlbtid_t tid, int tlb0_ways, int tlb0_entries_per_way); extern int elf32_nxstack; /**************************************************************************/ /* TLB and TID handling */ /**************************************************************************/ /* Translation ID busy table */ static volatile pmap_t tidbusy[MAXCPU][TID_MAX + 1]; /* * TLB0 capabilities (entry, way numbers etc.). These can vary between e500 * core revisions and should be read from h/w registers during early config. */ uint32_t tlb0_entries; uint32_t tlb0_ways; uint32_t tlb0_entries_per_way; #define TLB0_ENTRIES (tlb0_entries) #define TLB0_WAYS (tlb0_ways) #define TLB0_ENTRIES_PER_WAY (tlb0_entries_per_way) #define TLB1_ENTRIES 16 /* In-ram copy of the TLB1 */ static tlb_entry_t tlb1[TLB1_ENTRIES]; /* Next free entry in the TLB1 */ static unsigned int tlb1_idx; static vm_offset_t tlb1_map_base = VM_MAX_KERNEL_ADDRESS; static tlbtid_t tid_alloc(struct pmap *); static void tlb_print_entry(int, uint32_t, uint32_t, uint32_t, uint32_t); static int tlb1_set_entry(vm_offset_t, vm_offset_t, vm_size_t, uint32_t); static void tlb1_write_entry(unsigned int); static int tlb1_iomapped(int, vm_paddr_t, vm_size_t, vm_offset_t *); static vm_size_t tlb1_mapin_region(vm_offset_t, vm_paddr_t, vm_size_t); static vm_size_t tsize2size(unsigned int); static unsigned int size2tsize(vm_size_t); static unsigned int ilog2(unsigned int); static void set_mas4_defaults(void); static inline void tlb0_flush_entry(vm_offset_t); static inline unsigned int tlb0_tableidx(vm_offset_t, unsigned int); /**************************************************************************/ /* Page table management */ /**************************************************************************/ static struct rwlock_padalign pvh_global_lock; /* Data for the pv entry allocation mechanism */ static uma_zone_t pvzone; static int pv_entry_count = 0, pv_entry_max = 0, pv_entry_high_water = 0; #define PV_ENTRY_ZONE_MIN 2048 /* min pv entries in uma zone */ #ifndef PMAP_SHPGPERPROC #define PMAP_SHPGPERPROC 200 #endif static void ptbl_init(void); static struct ptbl_buf *ptbl_buf_alloc(void); static void ptbl_buf_free(struct ptbl_buf *); static void ptbl_free_pmap_ptbl(pmap_t, pte_t *); static pte_t *ptbl_alloc(mmu_t, pmap_t, unsigned int, boolean_t); static void ptbl_free(mmu_t, pmap_t, unsigned int); static void ptbl_hold(mmu_t, pmap_t, unsigned int); static int ptbl_unhold(mmu_t, pmap_t, unsigned int); static vm_paddr_t pte_vatopa(mmu_t, pmap_t, vm_offset_t); static pte_t *pte_find(mmu_t, pmap_t, vm_offset_t); static int pte_enter(mmu_t, pmap_t, vm_page_t, vm_offset_t, uint32_t, boolean_t); static int pte_remove(mmu_t, pmap_t, vm_offset_t, uint8_t); static pv_entry_t pv_alloc(void); static void pv_free(pv_entry_t); static void pv_insert(pmap_t, vm_offset_t, vm_page_t); static void pv_remove(pmap_t, vm_offset_t, vm_page_t); /* Number of kva ptbl buffers, each covering one ptbl (PTBL_PAGES). */ #define PTBL_BUFS (128 * 16) struct ptbl_buf { TAILQ_ENTRY(ptbl_buf) link; /* list link */ vm_offset_t kva; /* va of mapping */ }; /* ptbl free list and a lock used for access synchronization. */ static TAILQ_HEAD(, ptbl_buf) ptbl_buf_freelist; static struct mtx ptbl_buf_freelist_lock; /* Base address of kva space allocated fot ptbl bufs. */ static vm_offset_t ptbl_buf_pool_vabase; /* Pointer to ptbl_buf structures. */ static struct ptbl_buf *ptbl_bufs; void pmap_bootstrap_ap(volatile uint32_t *); /* * Kernel MMU interface */ static void mmu_booke_clear_modify(mmu_t, vm_page_t); static void mmu_booke_copy(mmu_t, pmap_t, pmap_t, vm_offset_t, vm_size_t, vm_offset_t); static void mmu_booke_copy_page(mmu_t, vm_page_t, vm_page_t); static void mmu_booke_copy_pages(mmu_t, vm_page_t *, vm_offset_t, vm_page_t *, vm_offset_t, int); static int mmu_booke_enter(mmu_t, pmap_t, vm_offset_t, vm_page_t, vm_prot_t, u_int flags, int8_t psind); static void mmu_booke_enter_object(mmu_t, pmap_t, vm_offset_t, vm_offset_t, vm_page_t, vm_prot_t); static void mmu_booke_enter_quick(mmu_t, pmap_t, vm_offset_t, vm_page_t, vm_prot_t); static vm_paddr_t mmu_booke_extract(mmu_t, pmap_t, vm_offset_t); static vm_page_t mmu_booke_extract_and_hold(mmu_t, pmap_t, vm_offset_t, vm_prot_t); static void mmu_booke_init(mmu_t); static boolean_t mmu_booke_is_modified(mmu_t, vm_page_t); static boolean_t mmu_booke_is_prefaultable(mmu_t, pmap_t, vm_offset_t); static boolean_t mmu_booke_is_referenced(mmu_t, vm_page_t); static int mmu_booke_ts_referenced(mmu_t, vm_page_t); static vm_offset_t mmu_booke_map(mmu_t, vm_offset_t *, vm_paddr_t, vm_paddr_t, int); static int mmu_booke_mincore(mmu_t, pmap_t, vm_offset_t, vm_paddr_t *); static void mmu_booke_object_init_pt(mmu_t, pmap_t, vm_offset_t, vm_object_t, vm_pindex_t, vm_size_t); static boolean_t mmu_booke_page_exists_quick(mmu_t, pmap_t, vm_page_t); static void mmu_booke_page_init(mmu_t, vm_page_t); static int mmu_booke_page_wired_mappings(mmu_t, vm_page_t); static void mmu_booke_pinit(mmu_t, pmap_t); static void mmu_booke_pinit0(mmu_t, pmap_t); static void mmu_booke_protect(mmu_t, pmap_t, vm_offset_t, vm_offset_t, vm_prot_t); static void mmu_booke_qenter(mmu_t, vm_offset_t, vm_page_t *, int); static void mmu_booke_qremove(mmu_t, vm_offset_t, int); static void mmu_booke_release(mmu_t, pmap_t); static void mmu_booke_remove(mmu_t, pmap_t, vm_offset_t, vm_offset_t); static void mmu_booke_remove_all(mmu_t, vm_page_t); static void mmu_booke_remove_write(mmu_t, vm_page_t); static void mmu_booke_unwire(mmu_t, pmap_t, vm_offset_t, vm_offset_t); static void mmu_booke_zero_page(mmu_t, vm_page_t); static void mmu_booke_zero_page_area(mmu_t, vm_page_t, int, int); static void mmu_booke_zero_page_idle(mmu_t, vm_page_t); static void mmu_booke_activate(mmu_t, struct thread *); static void mmu_booke_deactivate(mmu_t, struct thread *); static void mmu_booke_bootstrap(mmu_t, vm_offset_t, vm_offset_t); static void *mmu_booke_mapdev(mmu_t, vm_paddr_t, vm_size_t); static void *mmu_booke_mapdev_attr(mmu_t, vm_paddr_t, vm_size_t, vm_memattr_t); static void mmu_booke_unmapdev(mmu_t, vm_offset_t, vm_size_t); static vm_paddr_t mmu_booke_kextract(mmu_t, vm_offset_t); static void mmu_booke_kenter(mmu_t, vm_offset_t, vm_paddr_t); static void mmu_booke_kenter_attr(mmu_t, vm_offset_t, vm_paddr_t, vm_memattr_t); static void mmu_booke_kremove(mmu_t, vm_offset_t); static boolean_t mmu_booke_dev_direct_mapped(mmu_t, vm_paddr_t, vm_size_t); static void mmu_booke_sync_icache(mmu_t, pmap_t, vm_offset_t, vm_size_t); static void mmu_booke_dumpsys_map(mmu_t, vm_paddr_t pa, size_t, void **); static void mmu_booke_dumpsys_unmap(mmu_t, vm_paddr_t pa, size_t, void *); static void mmu_booke_scan_init(mmu_t); static mmu_method_t mmu_booke_methods[] = { /* pmap dispatcher interface */ MMUMETHOD(mmu_clear_modify, mmu_booke_clear_modify), MMUMETHOD(mmu_copy, mmu_booke_copy), MMUMETHOD(mmu_copy_page, mmu_booke_copy_page), MMUMETHOD(mmu_copy_pages, mmu_booke_copy_pages), MMUMETHOD(mmu_enter, mmu_booke_enter), MMUMETHOD(mmu_enter_object, mmu_booke_enter_object), MMUMETHOD(mmu_enter_quick, mmu_booke_enter_quick), MMUMETHOD(mmu_extract, mmu_booke_extract), MMUMETHOD(mmu_extract_and_hold, mmu_booke_extract_and_hold), MMUMETHOD(mmu_init, mmu_booke_init), MMUMETHOD(mmu_is_modified, mmu_booke_is_modified), MMUMETHOD(mmu_is_prefaultable, mmu_booke_is_prefaultable), MMUMETHOD(mmu_is_referenced, mmu_booke_is_referenced), MMUMETHOD(mmu_ts_referenced, mmu_booke_ts_referenced), MMUMETHOD(mmu_map, mmu_booke_map), MMUMETHOD(mmu_mincore, mmu_booke_mincore), MMUMETHOD(mmu_object_init_pt, mmu_booke_object_init_pt), MMUMETHOD(mmu_page_exists_quick,mmu_booke_page_exists_quick), MMUMETHOD(mmu_page_init, mmu_booke_page_init), MMUMETHOD(mmu_page_wired_mappings, mmu_booke_page_wired_mappings), MMUMETHOD(mmu_pinit, mmu_booke_pinit), MMUMETHOD(mmu_pinit0, mmu_booke_pinit0), MMUMETHOD(mmu_protect, mmu_booke_protect), MMUMETHOD(mmu_qenter, mmu_booke_qenter), MMUMETHOD(mmu_qremove, mmu_booke_qremove), MMUMETHOD(mmu_release, mmu_booke_release), MMUMETHOD(mmu_remove, mmu_booke_remove), MMUMETHOD(mmu_remove_all, mmu_booke_remove_all), MMUMETHOD(mmu_remove_write, mmu_booke_remove_write), MMUMETHOD(mmu_sync_icache, mmu_booke_sync_icache), MMUMETHOD(mmu_unwire, mmu_booke_unwire), MMUMETHOD(mmu_zero_page, mmu_booke_zero_page), MMUMETHOD(mmu_zero_page_area, mmu_booke_zero_page_area), MMUMETHOD(mmu_zero_page_idle, mmu_booke_zero_page_idle), MMUMETHOD(mmu_activate, mmu_booke_activate), MMUMETHOD(mmu_deactivate, mmu_booke_deactivate), /* Internal interfaces */ MMUMETHOD(mmu_bootstrap, mmu_booke_bootstrap), MMUMETHOD(mmu_dev_direct_mapped,mmu_booke_dev_direct_mapped), MMUMETHOD(mmu_mapdev, mmu_booke_mapdev), MMUMETHOD(mmu_mapdev_attr, mmu_booke_mapdev_attr), MMUMETHOD(mmu_kenter, mmu_booke_kenter), MMUMETHOD(mmu_kenter_attr, mmu_booke_kenter_attr), MMUMETHOD(mmu_kextract, mmu_booke_kextract), /* MMUMETHOD(mmu_kremove, mmu_booke_kremove), */ MMUMETHOD(mmu_unmapdev, mmu_booke_unmapdev), /* dumpsys() support */ MMUMETHOD(mmu_dumpsys_map, mmu_booke_dumpsys_map), MMUMETHOD(mmu_dumpsys_unmap, mmu_booke_dumpsys_unmap), MMUMETHOD(mmu_scan_init, mmu_booke_scan_init), { 0, 0 } }; MMU_DEF(booke_mmu, MMU_TYPE_BOOKE, mmu_booke_methods, 0); static __inline uint32_t tlb_calc_wimg(vm_offset_t pa, vm_memattr_t ma) { uint32_t attrib; int i; if (ma != VM_MEMATTR_DEFAULT) { switch (ma) { case VM_MEMATTR_UNCACHEABLE: return (PTE_I | PTE_G); case VM_MEMATTR_WRITE_COMBINING: case VM_MEMATTR_WRITE_BACK: case VM_MEMATTR_PREFETCHABLE: return (PTE_I); case VM_MEMATTR_WRITE_THROUGH: return (PTE_W | PTE_M); } } /* * Assume the page is cache inhibited and access is guarded unless * it's in our available memory array. */ attrib = _TLB_ENTRY_IO; for (i = 0; i < physmem_regions_sz; i++) { if ((pa >= physmem_regions[i].mr_start) && (pa < (physmem_regions[i].mr_start + physmem_regions[i].mr_size))) { attrib = _TLB_ENTRY_MEM; break; } } return (attrib); } static inline void tlb_miss_lock(void) { #ifdef SMP struct pcpu *pc; if (!smp_started) return; STAILQ_FOREACH(pc, &cpuhead, pc_allcpu) { if (pc != pcpup) { CTR3(KTR_PMAP, "%s: tlb miss LOCK of CPU=%d, " "tlb_lock=%p", __func__, pc->pc_cpuid, pc->pc_booke_tlb_lock); KASSERT((pc->pc_cpuid != PCPU_GET(cpuid)), ("tlb_miss_lock: tried to lock self")); tlb_lock(pc->pc_booke_tlb_lock); CTR1(KTR_PMAP, "%s: locked", __func__); } } #endif } static inline void tlb_miss_unlock(void) { #ifdef SMP struct pcpu *pc; if (!smp_started) return; STAILQ_FOREACH(pc, &cpuhead, pc_allcpu) { if (pc != pcpup) { CTR2(KTR_PMAP, "%s: tlb miss UNLOCK of CPU=%d", __func__, pc->pc_cpuid); tlb_unlock(pc->pc_booke_tlb_lock); CTR1(KTR_PMAP, "%s: unlocked", __func__); } } #endif } /* Return number of entries in TLB0. */ static __inline void tlb0_get_tlbconf(void) { uint32_t tlb0_cfg; tlb0_cfg = mfspr(SPR_TLB0CFG); tlb0_entries = tlb0_cfg & TLBCFG_NENTRY_MASK; tlb0_ways = (tlb0_cfg & TLBCFG_ASSOC_MASK) >> TLBCFG_ASSOC_SHIFT; tlb0_entries_per_way = tlb0_entries / tlb0_ways; } /* Initialize pool of kva ptbl buffers. */ static void ptbl_init(void) { int i; CTR3(KTR_PMAP, "%s: s (ptbl_bufs = 0x%08x size 0x%08x)", __func__, (uint32_t)ptbl_bufs, sizeof(struct ptbl_buf) * PTBL_BUFS); CTR3(KTR_PMAP, "%s: s (ptbl_buf_pool_vabase = 0x%08x size = 0x%08x)", __func__, ptbl_buf_pool_vabase, PTBL_BUFS * PTBL_PAGES * PAGE_SIZE); mtx_init(&ptbl_buf_freelist_lock, "ptbl bufs lock", NULL, MTX_DEF); TAILQ_INIT(&ptbl_buf_freelist); for (i = 0; i < PTBL_BUFS; i++) { ptbl_bufs[i].kva = ptbl_buf_pool_vabase + i * PTBL_PAGES * PAGE_SIZE; TAILQ_INSERT_TAIL(&ptbl_buf_freelist, &ptbl_bufs[i], link); } } /* Get a ptbl_buf from the freelist. */ static struct ptbl_buf * ptbl_buf_alloc(void) { struct ptbl_buf *buf; mtx_lock(&ptbl_buf_freelist_lock); buf = TAILQ_FIRST(&ptbl_buf_freelist); if (buf != NULL) TAILQ_REMOVE(&ptbl_buf_freelist, buf, link); mtx_unlock(&ptbl_buf_freelist_lock); CTR2(KTR_PMAP, "%s: buf = %p", __func__, buf); return (buf); } /* Return ptbl buff to free pool. */ static void ptbl_buf_free(struct ptbl_buf *buf) { CTR2(KTR_PMAP, "%s: buf = %p", __func__, buf); mtx_lock(&ptbl_buf_freelist_lock); TAILQ_INSERT_TAIL(&ptbl_buf_freelist, buf, link); mtx_unlock(&ptbl_buf_freelist_lock); } /* * Search the list of allocated ptbl bufs and find on list of allocated ptbls */ static void ptbl_free_pmap_ptbl(pmap_t pmap, pte_t *ptbl) { struct ptbl_buf *pbuf; CTR2(KTR_PMAP, "%s: ptbl = %p", __func__, ptbl); PMAP_LOCK_ASSERT(pmap, MA_OWNED); TAILQ_FOREACH(pbuf, &pmap->pm_ptbl_list, link) if (pbuf->kva == (vm_offset_t)ptbl) { /* Remove from pmap ptbl buf list. */ TAILQ_REMOVE(&pmap->pm_ptbl_list, pbuf, link); /* Free corresponding ptbl buf. */ ptbl_buf_free(pbuf); break; } } /* Allocate page table. */ static pte_t * ptbl_alloc(mmu_t mmu, pmap_t pmap, unsigned int pdir_idx, boolean_t nosleep) { vm_page_t mtbl[PTBL_PAGES]; vm_page_t m; struct ptbl_buf *pbuf; unsigned int pidx; pte_t *ptbl; int i, j; CTR4(KTR_PMAP, "%s: pmap = %p su = %d pdir_idx = %d", __func__, pmap, (pmap == kernel_pmap), pdir_idx); KASSERT((pdir_idx <= (VM_MAXUSER_ADDRESS / PDIR_SIZE)), ("ptbl_alloc: invalid pdir_idx")); KASSERT((pmap->pm_pdir[pdir_idx] == NULL), ("pte_alloc: valid ptbl entry exists!")); pbuf = ptbl_buf_alloc(); if (pbuf == NULL) panic("pte_alloc: couldn't alloc kernel virtual memory"); ptbl = (pte_t *)pbuf->kva; CTR2(KTR_PMAP, "%s: ptbl kva = %p", __func__, ptbl); /* Allocate ptbl pages, this will sleep! */ for (i = 0; i < PTBL_PAGES; i++) { pidx = (PTBL_PAGES * pdir_idx) + i; while ((m = vm_page_alloc(NULL, pidx, VM_ALLOC_NOOBJ | VM_ALLOC_WIRED)) == NULL) { PMAP_UNLOCK(pmap); rw_wunlock(&pvh_global_lock); if (nosleep) { ptbl_free_pmap_ptbl(pmap, ptbl); for (j = 0; j < i; j++) vm_page_free(mtbl[j]); atomic_subtract_int(&vm_cnt.v_wire_count, i); return (NULL); } VM_WAIT; rw_wlock(&pvh_global_lock); PMAP_LOCK(pmap); } mtbl[i] = m; } /* Map allocated pages into kernel_pmap. */ mmu_booke_qenter(mmu, (vm_offset_t)ptbl, mtbl, PTBL_PAGES); /* Zero whole ptbl. */ bzero((caddr_t)ptbl, PTBL_PAGES * PAGE_SIZE); /* Add pbuf to the pmap ptbl bufs list. */ TAILQ_INSERT_TAIL(&pmap->pm_ptbl_list, pbuf, link); return (ptbl); } /* Free ptbl pages and invalidate pdir entry. */ static void ptbl_free(mmu_t mmu, pmap_t pmap, unsigned int pdir_idx) { pte_t *ptbl; vm_paddr_t pa; vm_offset_t va; vm_page_t m; int i; CTR4(KTR_PMAP, "%s: pmap = %p su = %d pdir_idx = %d", __func__, pmap, (pmap == kernel_pmap), pdir_idx); KASSERT((pdir_idx <= (VM_MAXUSER_ADDRESS / PDIR_SIZE)), ("ptbl_free: invalid pdir_idx")); ptbl = pmap->pm_pdir[pdir_idx]; CTR2(KTR_PMAP, "%s: ptbl = %p", __func__, ptbl); KASSERT((ptbl != NULL), ("ptbl_free: null ptbl")); /* * Invalidate the pdir entry as soon as possible, so that other CPUs * don't attempt to look up the page tables we are releasing. */ mtx_lock_spin(&tlbivax_mutex); tlb_miss_lock(); pmap->pm_pdir[pdir_idx] = NULL; tlb_miss_unlock(); mtx_unlock_spin(&tlbivax_mutex); for (i = 0; i < PTBL_PAGES; i++) { va = ((vm_offset_t)ptbl + (i * PAGE_SIZE)); pa = pte_vatopa(mmu, kernel_pmap, va); m = PHYS_TO_VM_PAGE(pa); vm_page_free_zero(m); atomic_subtract_int(&vm_cnt.v_wire_count, 1); mmu_booke_kremove(mmu, va); } ptbl_free_pmap_ptbl(pmap, ptbl); } /* * Decrement ptbl pages hold count and attempt to free ptbl pages. * Called when removing pte entry from ptbl. * * Return 1 if ptbl pages were freed. */ static int ptbl_unhold(mmu_t mmu, pmap_t pmap, unsigned int pdir_idx) { pte_t *ptbl; vm_paddr_t pa; vm_page_t m; int i; CTR4(KTR_PMAP, "%s: pmap = %p su = %d pdir_idx = %d", __func__, pmap, (pmap == kernel_pmap), pdir_idx); KASSERT((pdir_idx <= (VM_MAXUSER_ADDRESS / PDIR_SIZE)), ("ptbl_unhold: invalid pdir_idx")); KASSERT((pmap != kernel_pmap), ("ptbl_unhold: unholding kernel ptbl!")); ptbl = pmap->pm_pdir[pdir_idx]; //debugf("ptbl_unhold: ptbl = 0x%08x\n", (u_int32_t)ptbl); KASSERT(((vm_offset_t)ptbl >= VM_MIN_KERNEL_ADDRESS), ("ptbl_unhold: non kva ptbl")); /* decrement hold count */ for (i = 0; i < PTBL_PAGES; i++) { pa = pte_vatopa(mmu, kernel_pmap, (vm_offset_t)ptbl + (i * PAGE_SIZE)); m = PHYS_TO_VM_PAGE(pa); m->wire_count--; } /* * Free ptbl pages if there are no pte etries in this ptbl. * wire_count has the same value for all ptbl pages, so check the last * page. */ if (m->wire_count == 0) { ptbl_free(mmu, pmap, pdir_idx); //debugf("ptbl_unhold: e (freed ptbl)\n"); return (1); } return (0); } /* * Increment hold count for ptbl pages. This routine is used when a new pte * entry is being inserted into the ptbl. */ static void ptbl_hold(mmu_t mmu, pmap_t pmap, unsigned int pdir_idx) { vm_paddr_t pa; pte_t *ptbl; vm_page_t m; int i; CTR3(KTR_PMAP, "%s: pmap = %p pdir_idx = %d", __func__, pmap, pdir_idx); KASSERT((pdir_idx <= (VM_MAXUSER_ADDRESS / PDIR_SIZE)), ("ptbl_hold: invalid pdir_idx")); KASSERT((pmap != kernel_pmap), ("ptbl_hold: holding kernel ptbl!")); ptbl = pmap->pm_pdir[pdir_idx]; KASSERT((ptbl != NULL), ("ptbl_hold: null ptbl")); for (i = 0; i < PTBL_PAGES; i++) { pa = pte_vatopa(mmu, kernel_pmap, (vm_offset_t)ptbl + (i * PAGE_SIZE)); m = PHYS_TO_VM_PAGE(pa); m->wire_count++; } } /* Allocate pv_entry structure. */ pv_entry_t pv_alloc(void) { pv_entry_t pv; pv_entry_count++; if (pv_entry_count > pv_entry_high_water) pagedaemon_wakeup(); pv = uma_zalloc(pvzone, M_NOWAIT); return (pv); } /* Free pv_entry structure. */ static __inline void pv_free(pv_entry_t pve) { pv_entry_count--; uma_zfree(pvzone, pve); } /* Allocate and initialize pv_entry structure. */ static void pv_insert(pmap_t pmap, vm_offset_t va, vm_page_t m) { pv_entry_t pve; //int su = (pmap == kernel_pmap); //debugf("pv_insert: s (su = %d pmap = 0x%08x va = 0x%08x m = 0x%08x)\n", su, // (u_int32_t)pmap, va, (u_int32_t)m); pve = pv_alloc(); if (pve == NULL) panic("pv_insert: no pv entries!"); pve->pv_pmap = pmap; pve->pv_va = va; /* add to pv_list */ PMAP_LOCK_ASSERT(pmap, MA_OWNED); rw_assert(&pvh_global_lock, RA_WLOCKED); TAILQ_INSERT_TAIL(&m->md.pv_list, pve, pv_link); //debugf("pv_insert: e\n"); } /* Destroy pv entry. */ static void pv_remove(pmap_t pmap, vm_offset_t va, vm_page_t m) { pv_entry_t pve; //int su = (pmap == kernel_pmap); //debugf("pv_remove: s (su = %d pmap = 0x%08x va = 0x%08x)\n", su, (u_int32_t)pmap, va); PMAP_LOCK_ASSERT(pmap, MA_OWNED); rw_assert(&pvh_global_lock, RA_WLOCKED); /* find pv entry */ TAILQ_FOREACH(pve, &m->md.pv_list, pv_link) { if ((pmap == pve->pv_pmap) && (va == pve->pv_va)) { /* remove from pv_list */ TAILQ_REMOVE(&m->md.pv_list, pve, pv_link); if (TAILQ_EMPTY(&m->md.pv_list)) vm_page_aflag_clear(m, PGA_WRITEABLE); /* free pv entry struct */ pv_free(pve); break; } } //debugf("pv_remove: e\n"); } /* * Clean pte entry, try to free page table page if requested. * * Return 1 if ptbl pages were freed, otherwise return 0. */ static int pte_remove(mmu_t mmu, pmap_t pmap, vm_offset_t va, uint8_t flags) { unsigned int pdir_idx = PDIR_IDX(va); unsigned int ptbl_idx = PTBL_IDX(va); vm_page_t m; pte_t *ptbl; pte_t *pte; //int su = (pmap == kernel_pmap); //debugf("pte_remove: s (su = %d pmap = 0x%08x va = 0x%08x flags = %d)\n", // su, (u_int32_t)pmap, va, flags); ptbl = pmap->pm_pdir[pdir_idx]; KASSERT(ptbl, ("pte_remove: null ptbl")); pte = &ptbl[ptbl_idx]; if (pte == NULL || !PTE_ISVALID(pte)) return (0); if (PTE_ISWIRED(pte)) pmap->pm_stats.wired_count--; /* Handle managed entry. */ if (PTE_ISMANAGED(pte)) { /* Get vm_page_t for mapped pte. */ m = PHYS_TO_VM_PAGE(PTE_PA(pte)); if (PTE_ISMODIFIED(pte)) vm_page_dirty(m); if (PTE_ISREFERENCED(pte)) vm_page_aflag_set(m, PGA_REFERENCED); pv_remove(pmap, va, m); } mtx_lock_spin(&tlbivax_mutex); tlb_miss_lock(); tlb0_flush_entry(va); pte->flags = 0; pte->rpn = 0; tlb_miss_unlock(); mtx_unlock_spin(&tlbivax_mutex); pmap->pm_stats.resident_count--; if (flags & PTBL_UNHOLD) { //debugf("pte_remove: e (unhold)\n"); return (ptbl_unhold(mmu, pmap, pdir_idx)); } //debugf("pte_remove: e\n"); return (0); } /* * Insert PTE for a given page and virtual address. */ static int pte_enter(mmu_t mmu, pmap_t pmap, vm_page_t m, vm_offset_t va, uint32_t flags, boolean_t nosleep) { unsigned int pdir_idx = PDIR_IDX(va); unsigned int ptbl_idx = PTBL_IDX(va); pte_t *ptbl, *pte; CTR4(KTR_PMAP, "%s: su = %d pmap = %p va = %p", __func__, pmap == kernel_pmap, pmap, va); /* Get the page table pointer. */ ptbl = pmap->pm_pdir[pdir_idx]; if (ptbl == NULL) { /* Allocate page table pages. */ ptbl = ptbl_alloc(mmu, pmap, pdir_idx, nosleep); if (ptbl == NULL) { KASSERT(nosleep, ("nosleep and NULL ptbl")); return (ENOMEM); } } else { /* * Check if there is valid mapping for requested * va, if there is, remove it. */ pte = &pmap->pm_pdir[pdir_idx][ptbl_idx]; if (PTE_ISVALID(pte)) { pte_remove(mmu, pmap, va, PTBL_HOLD); } else { /* * pte is not used, increment hold count * for ptbl pages. */ if (pmap != kernel_pmap) ptbl_hold(mmu, pmap, pdir_idx); } } /* * Insert pv_entry into pv_list for mapped page if part of managed * memory. */ if ((m->oflags & VPO_UNMANAGED) == 0) { flags |= PTE_MANAGED; /* Create and insert pv entry. */ pv_insert(pmap, va, m); } pmap->pm_stats.resident_count++; mtx_lock_spin(&tlbivax_mutex); tlb_miss_lock(); tlb0_flush_entry(va); if (pmap->pm_pdir[pdir_idx] == NULL) { /* * If we just allocated a new page table, hook it in * the pdir. */ pmap->pm_pdir[pdir_idx] = ptbl; } pte = &(pmap->pm_pdir[pdir_idx][ptbl_idx]); pte->rpn = VM_PAGE_TO_PHYS(m) & ~PTE_PA_MASK; pte->flags |= (PTE_VALID | flags); tlb_miss_unlock(); mtx_unlock_spin(&tlbivax_mutex); return (0); } /* Return the pa for the given pmap/va. */ static vm_paddr_t pte_vatopa(mmu_t mmu, pmap_t pmap, vm_offset_t va) { vm_paddr_t pa = 0; pte_t *pte; pte = pte_find(mmu, pmap, va); if ((pte != NULL) && PTE_ISVALID(pte)) pa = (PTE_PA(pte) | (va & PTE_PA_MASK)); return (pa); } /* Get a pointer to a PTE in a page table. */ static pte_t * pte_find(mmu_t mmu, pmap_t pmap, vm_offset_t va) { unsigned int pdir_idx = PDIR_IDX(va); unsigned int ptbl_idx = PTBL_IDX(va); KASSERT((pmap != NULL), ("pte_find: invalid pmap")); if (pmap->pm_pdir[pdir_idx]) return (&(pmap->pm_pdir[pdir_idx][ptbl_idx])); return (NULL); } /**************************************************************************/ /* PMAP related */ /**************************************************************************/ /* * This is called during booke_init, before the system is really initialized. */ static void mmu_booke_bootstrap(mmu_t mmu, vm_offset_t start, vm_offset_t kernelend) { vm_offset_t phys_kernelend; struct mem_region *mp, *mp1; int cnt, i, j; u_int s, e, sz; u_int phys_avail_count; vm_size_t physsz, hwphyssz, kstack0_sz; vm_offset_t kernel_pdir, kstack0, va; vm_paddr_t kstack0_phys; void *dpcpu; pte_t *pte; debugf("mmu_booke_bootstrap: entered\n"); /* Set interesting system properties */ hw_direct_map = 0; elf32_nxstack = 1; /* Initialize invalidation mutex */ mtx_init(&tlbivax_mutex, "tlbivax", NULL, MTX_SPIN); /* Read TLB0 size and associativity. */ tlb0_get_tlbconf(); /* * Align kernel start and end address (kernel image). * Note that kernel end does not necessarily relate to kernsize. * kernsize is the size of the kernel that is actually mapped. * Also note that "start - 1" is deliberate. With SMP, the * entry point is exactly a page from the actual load address. * As such, trunc_page() has no effect and we're off by a page. * Since we always have the ELF header between the load address * and the entry point, we can safely subtract 1 to compensate. */ kernstart = trunc_page(start - 1); data_start = round_page(kernelend); data_end = data_start; /* * Addresses of preloaded modules (like file systems) use * physical addresses. Make sure we relocate those into * virtual addresses. */ preload_addr_relocate = kernstart - kernload; /* Allocate the dynamic per-cpu area. */ dpcpu = (void *)data_end; data_end += DPCPU_SIZE; /* Allocate space for the message buffer. */ msgbufp = (struct msgbuf *)data_end; data_end += msgbufsize; debugf(" msgbufp at 0x%08x end = 0x%08x\n", (uint32_t)msgbufp, data_end); data_end = round_page(data_end); /* Allocate space for ptbl_bufs. */ ptbl_bufs = (struct ptbl_buf *)data_end; data_end += sizeof(struct ptbl_buf) * PTBL_BUFS; debugf(" ptbl_bufs at 0x%08x end = 0x%08x\n", (uint32_t)ptbl_bufs, data_end); data_end = round_page(data_end); /* Allocate PTE tables for kernel KVA. */ kernel_pdir = data_end; kernel_ptbls = (VM_MAX_KERNEL_ADDRESS - VM_MIN_KERNEL_ADDRESS + PDIR_SIZE - 1) / PDIR_SIZE; data_end += kernel_ptbls * PTBL_PAGES * PAGE_SIZE; debugf(" kernel ptbls: %d\n", kernel_ptbls); debugf(" kernel pdir at 0x%08x end = 0x%08x\n", kernel_pdir, data_end); debugf(" data_end: 0x%08x\n", data_end); if (data_end - kernstart > kernsize) { kernsize += tlb1_mapin_region(kernstart + kernsize, kernload + kernsize, (data_end - kernstart) - kernsize); } data_end = kernstart + kernsize; debugf(" updated data_end: 0x%08x\n", data_end); /* * Clear the structures - note we can only do it safely after the * possible additional TLB1 translations are in place (above) so that * all range up to the currently calculated 'data_end' is covered. */ dpcpu_init(dpcpu, 0); memset((void *)ptbl_bufs, 0, sizeof(struct ptbl_buf) * PTBL_SIZE); memset((void *)kernel_pdir, 0, kernel_ptbls * PTBL_PAGES * PAGE_SIZE); /*******************************************************/ /* Set the start and end of kva. */ /*******************************************************/ virtual_avail = round_page(data_end); virtual_end = VM_MAX_KERNEL_ADDRESS; /* Allocate KVA space for page zero/copy operations. */ zero_page_va = virtual_avail; virtual_avail += PAGE_SIZE; zero_page_idle_va = virtual_avail; virtual_avail += PAGE_SIZE; copy_page_src_va = virtual_avail; virtual_avail += PAGE_SIZE; copy_page_dst_va = virtual_avail; virtual_avail += PAGE_SIZE; debugf("zero_page_va = 0x%08x\n", zero_page_va); debugf("zero_page_idle_va = 0x%08x\n", zero_page_idle_va); debugf("copy_page_src_va = 0x%08x\n", copy_page_src_va); debugf("copy_page_dst_va = 0x%08x\n", copy_page_dst_va); /* Initialize page zero/copy mutexes. */ mtx_init(&zero_page_mutex, "mmu_booke_zero_page", NULL, MTX_DEF); mtx_init(©_page_mutex, "mmu_booke_copy_page", NULL, MTX_DEF); /* Allocate KVA space for ptbl bufs. */ ptbl_buf_pool_vabase = virtual_avail; virtual_avail += PTBL_BUFS * PTBL_PAGES * PAGE_SIZE; debugf("ptbl_buf_pool_vabase = 0x%08x end = 0x%08x\n", ptbl_buf_pool_vabase, virtual_avail); /* Calculate corresponding physical addresses for the kernel region. */ phys_kernelend = kernload + kernsize; debugf("kernel image and allocated data:\n"); debugf(" kernload = 0x%08x\n", kernload); debugf(" kernstart = 0x%08x\n", kernstart); debugf(" kernsize = 0x%08x\n", kernsize); if (sizeof(phys_avail) / sizeof(phys_avail[0]) < availmem_regions_sz) panic("mmu_booke_bootstrap: phys_avail too small"); /* * Remove kernel physical address range from avail regions list. Page * align all regions. Non-page aligned memory isn't very interesting * to us. Also, sort the entries for ascending addresses. */ /* Retrieve phys/avail mem regions */ mem_regions(&physmem_regions, &physmem_regions_sz, &availmem_regions, &availmem_regions_sz); sz = 0; cnt = availmem_regions_sz; debugf("processing avail regions:\n"); for (mp = availmem_regions; mp->mr_size; mp++) { s = mp->mr_start; e = mp->mr_start + mp->mr_size; debugf(" %08x-%08x -> ", s, e); /* Check whether this region holds all of the kernel. */ if (s < kernload && e > phys_kernelend) { availmem_regions[cnt].mr_start = phys_kernelend; availmem_regions[cnt++].mr_size = e - phys_kernelend; e = kernload; } /* Look whether this regions starts within the kernel. */ if (s >= kernload && s < phys_kernelend) { if (e <= phys_kernelend) goto empty; s = phys_kernelend; } /* Now look whether this region ends within the kernel. */ if (e > kernload && e <= phys_kernelend) { if (s >= kernload) goto empty; e = kernload; } /* Now page align the start and size of the region. */ s = round_page(s); e = trunc_page(e); if (e < s) e = s; sz = e - s; debugf("%08x-%08x = %x\n", s, e, sz); /* Check whether some memory is left here. */ if (sz == 0) { empty: memmove(mp, mp + 1, (cnt - (mp - availmem_regions)) * sizeof(*mp)); cnt--; mp--; continue; } /* Do an insertion sort. */ for (mp1 = availmem_regions; mp1 < mp; mp1++) if (s < mp1->mr_start) break; if (mp1 < mp) { memmove(mp1 + 1, mp1, (char *)mp - (char *)mp1); mp1->mr_start = s; mp1->mr_size = sz; } else { mp->mr_start = s; mp->mr_size = sz; } } availmem_regions_sz = cnt; /*******************************************************/ /* Steal physical memory for kernel stack from the end */ /* of the first avail region */ /*******************************************************/ kstack0_sz = KSTACK_PAGES * PAGE_SIZE; kstack0_phys = availmem_regions[0].mr_start + availmem_regions[0].mr_size; kstack0_phys -= kstack0_sz; availmem_regions[0].mr_size -= kstack0_sz; /*******************************************************/ /* Fill in phys_avail table, based on availmem_regions */ /*******************************************************/ phys_avail_count = 0; physsz = 0; hwphyssz = 0; TUNABLE_ULONG_FETCH("hw.physmem", (u_long *) &hwphyssz); debugf("fill in phys_avail:\n"); for (i = 0, j = 0; i < availmem_regions_sz; i++, j += 2) { debugf(" region: 0x%08x - 0x%08x (0x%08x)\n", availmem_regions[i].mr_start, availmem_regions[i].mr_start + availmem_regions[i].mr_size, availmem_regions[i].mr_size); if (hwphyssz != 0 && (physsz + availmem_regions[i].mr_size) >= hwphyssz) { debugf(" hw.physmem adjust\n"); if (physsz < hwphyssz) { phys_avail[j] = availmem_regions[i].mr_start; phys_avail[j + 1] = availmem_regions[i].mr_start + hwphyssz - physsz; physsz = hwphyssz; phys_avail_count++; } break; } phys_avail[j] = availmem_regions[i].mr_start; phys_avail[j + 1] = availmem_regions[i].mr_start + availmem_regions[i].mr_size; phys_avail_count++; physsz += availmem_regions[i].mr_size; } physmem = btoc(physsz); /* Calculate the last available physical address. */ for (i = 0; phys_avail[i + 2] != 0; i += 2) ; Maxmem = powerpc_btop(phys_avail[i + 1]); debugf("Maxmem = 0x%08lx\n", Maxmem); debugf("phys_avail_count = %d\n", phys_avail_count); debugf("physsz = 0x%08x physmem = %ld (0x%08lx)\n", physsz, physmem, physmem); /*******************************************************/ /* Initialize (statically allocated) kernel pmap. */ /*******************************************************/ PMAP_LOCK_INIT(kernel_pmap); kptbl_min = VM_MIN_KERNEL_ADDRESS / PDIR_SIZE; debugf("kernel_pmap = 0x%08x\n", (uint32_t)kernel_pmap); debugf("kptbl_min = %d, kernel_ptbls = %d\n", kptbl_min, kernel_ptbls); debugf("kernel pdir range: 0x%08x - 0x%08x\n", kptbl_min * PDIR_SIZE, (kptbl_min + kernel_ptbls) * PDIR_SIZE - 1); /* Initialize kernel pdir */ for (i = 0; i < kernel_ptbls; i++) kernel_pmap->pm_pdir[kptbl_min + i] = (pte_t *)(kernel_pdir + (i * PAGE_SIZE * PTBL_PAGES)); for (i = 0; i < MAXCPU; i++) { kernel_pmap->pm_tid[i] = TID_KERNEL; /* Initialize each CPU's tidbusy entry 0 with kernel_pmap */ tidbusy[i][0] = kernel_pmap; } /* * Fill in PTEs covering kernel code and data. They are not required * for address translation, as this area is covered by static TLB1 * entries, but for pte_vatopa() to work correctly with kernel area * addresses. */ for (va = kernstart; va < data_end; va += PAGE_SIZE) { pte = &(kernel_pmap->pm_pdir[PDIR_IDX(va)][PTBL_IDX(va)]); pte->rpn = kernload + (va - kernstart); pte->flags = PTE_M | PTE_SR | PTE_SW | PTE_SX | PTE_WIRED | PTE_VALID; } /* Mark kernel_pmap active on all CPUs */ CPU_FILL(&kernel_pmap->pm_active); /* * Initialize the global pv list lock. */ rw_init(&pvh_global_lock, "pmap pv global"); /*******************************************************/ /* Final setup */ /*******************************************************/ /* Enter kstack0 into kernel map, provide guard page */ kstack0 = virtual_avail + KSTACK_GUARD_PAGES * PAGE_SIZE; thread0.td_kstack = kstack0; thread0.td_kstack_pages = KSTACK_PAGES; debugf("kstack_sz = 0x%08x\n", kstack0_sz); debugf("kstack0_phys at 0x%08x - 0x%08x\n", kstack0_phys, kstack0_phys + kstack0_sz); debugf("kstack0 at 0x%08x - 0x%08x\n", kstack0, kstack0 + kstack0_sz); virtual_avail += KSTACK_GUARD_PAGES * PAGE_SIZE + kstack0_sz; for (i = 0; i < KSTACK_PAGES; i++) { mmu_booke_kenter(mmu, kstack0, kstack0_phys); kstack0 += PAGE_SIZE; kstack0_phys += PAGE_SIZE; } pmap_bootstrapped = 1; debugf("virtual_avail = %08x\n", virtual_avail); debugf("virtual_end = %08x\n", virtual_end); debugf("mmu_booke_bootstrap: exit\n"); } void pmap_bootstrap_ap(volatile uint32_t *trcp __unused) { int i; /* * Finish TLB1 configuration: the BSP already set up its TLB1 and we * have the snapshot of its contents in the s/w tlb1[] table, so use * these values directly to (re)program AP's TLB1 hardware. */ for (i = bp_ntlb1s; i < tlb1_idx; i++) { /* Skip invalid entries */ if (!(tlb1[i].mas1 & MAS1_VALID)) continue; tlb1_write_entry(i); } set_mas4_defaults(); } /* * Get the physical page address for the given pmap/virtual address. */ static vm_paddr_t mmu_booke_extract(mmu_t mmu, pmap_t pmap, vm_offset_t va) { vm_paddr_t pa; PMAP_LOCK(pmap); pa = pte_vatopa(mmu, pmap, va); PMAP_UNLOCK(pmap); return (pa); } /* * Extract the physical page address associated with the given * kernel virtual address. */ static vm_paddr_t mmu_booke_kextract(mmu_t mmu, vm_offset_t va) { int i; /* Check TLB1 mappings */ for (i = 0; i < tlb1_idx; i++) { if (!(tlb1[i].mas1 & MAS1_VALID)) continue; if (va >= tlb1[i].virt && va < tlb1[i].virt + tlb1[i].size) return (tlb1[i].phys + (va - tlb1[i].virt)); } return (pte_vatopa(mmu, kernel_pmap, va)); } /* * Initialize the pmap module. * Called by vm_init, to initialize any structures that the pmap * system needs to map virtual memory. */ static void mmu_booke_init(mmu_t mmu) { int shpgperproc = PMAP_SHPGPERPROC; /* * Initialize the address space (zone) for the pv entries. Set a * high water mark so that the system can recover from excessive * numbers of pv entries. */ pvzone = uma_zcreate("PV ENTRY", sizeof(struct pv_entry), NULL, NULL, NULL, NULL, UMA_ALIGN_PTR, UMA_ZONE_VM | UMA_ZONE_NOFREE); TUNABLE_INT_FETCH("vm.pmap.shpgperproc", &shpgperproc); pv_entry_max = shpgperproc * maxproc + vm_cnt.v_page_count; TUNABLE_INT_FETCH("vm.pmap.pv_entries", &pv_entry_max); pv_entry_high_water = 9 * (pv_entry_max / 10); uma_zone_reserve_kva(pvzone, pv_entry_max); /* Pre-fill pvzone with initial number of pv entries. */ uma_prealloc(pvzone, PV_ENTRY_ZONE_MIN); /* Initialize ptbl allocation. */ ptbl_init(); } /* * Map a list of wired pages into kernel virtual address space. This is * intended for temporary mappings which do not need page modification or * references recorded. Existing mappings in the region are overwritten. */ static void mmu_booke_qenter(mmu_t mmu, vm_offset_t sva, vm_page_t *m, int count) { vm_offset_t va; va = sva; while (count-- > 0) { mmu_booke_kenter(mmu, va, VM_PAGE_TO_PHYS(*m)); va += PAGE_SIZE; m++; } } /* * Remove page mappings from kernel virtual address space. Intended for * temporary mappings entered by mmu_booke_qenter. */ static void mmu_booke_qremove(mmu_t mmu, vm_offset_t sva, int count) { vm_offset_t va; va = sva; while (count-- > 0) { mmu_booke_kremove(mmu, va); va += PAGE_SIZE; } } /* * Map a wired page into kernel virtual address space. */ static void mmu_booke_kenter(mmu_t mmu, vm_offset_t va, vm_paddr_t pa) { mmu_booke_kenter_attr(mmu, va, pa, VM_MEMATTR_DEFAULT); } static void mmu_booke_kenter_attr(mmu_t mmu, vm_offset_t va, vm_paddr_t pa, vm_memattr_t ma) { unsigned int pdir_idx = PDIR_IDX(va); unsigned int ptbl_idx = PTBL_IDX(va); uint32_t flags; pte_t *pte; KASSERT(((va >= VM_MIN_KERNEL_ADDRESS) && (va <= VM_MAX_KERNEL_ADDRESS)), ("mmu_booke_kenter: invalid va")); flags = PTE_SR | PTE_SW | PTE_SX | PTE_WIRED | PTE_VALID; flags |= tlb_calc_wimg(pa, ma); pte = &(kernel_pmap->pm_pdir[pdir_idx][ptbl_idx]); mtx_lock_spin(&tlbivax_mutex); tlb_miss_lock(); if (PTE_ISVALID(pte)) { CTR1(KTR_PMAP, "%s: replacing entry!", __func__); /* Flush entry from TLB0 */ tlb0_flush_entry(va); } pte->rpn = pa & ~PTE_PA_MASK; pte->flags = flags; //debugf("mmu_booke_kenter: pdir_idx = %d ptbl_idx = %d va=0x%08x " // "pa=0x%08x rpn=0x%08x flags=0x%08x\n", // pdir_idx, ptbl_idx, va, pa, pte->rpn, pte->flags); /* Flush the real memory from the instruction cache. */ if ((flags & (PTE_I | PTE_G)) == 0) { __syncicache((void *)va, PAGE_SIZE); } tlb_miss_unlock(); mtx_unlock_spin(&tlbivax_mutex); } /* * Remove a page from kernel page table. */ static void mmu_booke_kremove(mmu_t mmu, vm_offset_t va) { unsigned int pdir_idx = PDIR_IDX(va); unsigned int ptbl_idx = PTBL_IDX(va); pte_t *pte; // CTR2(KTR_PMAP,("%s: s (va = 0x%08x)\n", __func__, va)); KASSERT(((va >= VM_MIN_KERNEL_ADDRESS) && (va <= VM_MAX_KERNEL_ADDRESS)), ("mmu_booke_kremove: invalid va")); pte = &(kernel_pmap->pm_pdir[pdir_idx][ptbl_idx]); if (!PTE_ISVALID(pte)) { CTR1(KTR_PMAP, "%s: invalid pte", __func__); return; } mtx_lock_spin(&tlbivax_mutex); tlb_miss_lock(); /* Invalidate entry in TLB0, update PTE. */ tlb0_flush_entry(va); pte->flags = 0; pte->rpn = 0; tlb_miss_unlock(); mtx_unlock_spin(&tlbivax_mutex); } /* * Initialize pmap associated with process 0. */ static void mmu_booke_pinit0(mmu_t mmu, pmap_t pmap) { PMAP_LOCK_INIT(pmap); mmu_booke_pinit(mmu, pmap); PCPU_SET(curpmap, pmap); } /* * Initialize a preallocated and zeroed pmap structure, * such as one in a vmspace structure. */ static void mmu_booke_pinit(mmu_t mmu, pmap_t pmap) { int i; CTR4(KTR_PMAP, "%s: pmap = %p, proc %d '%s'", __func__, pmap, curthread->td_proc->p_pid, curthread->td_proc->p_comm); KASSERT((pmap != kernel_pmap), ("pmap_pinit: initializing kernel_pmap")); for (i = 0; i < MAXCPU; i++) pmap->pm_tid[i] = TID_NONE; CPU_ZERO(&kernel_pmap->pm_active); bzero(&pmap->pm_stats, sizeof(pmap->pm_stats)); bzero(&pmap->pm_pdir, sizeof(pte_t *) * PDIR_NENTRIES); TAILQ_INIT(&pmap->pm_ptbl_list); } /* * Release any resources held by the given physical map. * Called when a pmap initialized by mmu_booke_pinit is being released. * Should only be called if the map contains no valid mappings. */ static void mmu_booke_release(mmu_t mmu, pmap_t pmap) { KASSERT(pmap->pm_stats.resident_count == 0, ("pmap_release: pmap resident count %ld != 0", pmap->pm_stats.resident_count)); } /* * Insert the given physical page at the specified virtual address in the * target physical map with the protection requested. If specified the page * will be wired down. */ static int mmu_booke_enter(mmu_t mmu, pmap_t pmap, vm_offset_t va, vm_page_t m, vm_prot_t prot, u_int flags, int8_t psind) { int error; rw_wlock(&pvh_global_lock); PMAP_LOCK(pmap); error = mmu_booke_enter_locked(mmu, pmap, va, m, prot, flags, psind); rw_wunlock(&pvh_global_lock); PMAP_UNLOCK(pmap); return (error); } static int mmu_booke_enter_locked(mmu_t mmu, pmap_t pmap, vm_offset_t va, vm_page_t m, vm_prot_t prot, u_int pmap_flags, int8_t psind __unused) { pte_t *pte; vm_paddr_t pa; uint32_t flags; int error, su, sync; pa = VM_PAGE_TO_PHYS(m); su = (pmap == kernel_pmap); sync = 0; //debugf("mmu_booke_enter_locked: s (pmap=0x%08x su=%d tid=%d m=0x%08x va=0x%08x " // "pa=0x%08x prot=0x%08x flags=%#x)\n", // (u_int32_t)pmap, su, pmap->pm_tid, // (u_int32_t)m, va, pa, prot, flags); if (su) { KASSERT(((va >= virtual_avail) && (va <= VM_MAX_KERNEL_ADDRESS)), ("mmu_booke_enter_locked: kernel pmap, non kernel va")); } else { KASSERT((va <= VM_MAXUSER_ADDRESS), ("mmu_booke_enter_locked: user pmap, non user va")); } if ((m->oflags & VPO_UNMANAGED) == 0 && !vm_page_xbusied(m)) VM_OBJECT_ASSERT_LOCKED(m->object); PMAP_LOCK_ASSERT(pmap, MA_OWNED); /* * If there is an existing mapping, and the physical address has not * changed, must be protection or wiring change. */ if (((pte = pte_find(mmu, pmap, va)) != NULL) && (PTE_ISVALID(pte)) && (PTE_PA(pte) == pa)) { /* * Before actually updating pte->flags we calculate and * prepare its new value in a helper var. */ flags = pte->flags; flags &= ~(PTE_UW | PTE_UX | PTE_SW | PTE_SX | PTE_MODIFIED); /* Wiring change, just update stats. */ if ((pmap_flags & PMAP_ENTER_WIRED) != 0) { if (!PTE_ISWIRED(pte)) { flags |= PTE_WIRED; pmap->pm_stats.wired_count++; } } else { if (PTE_ISWIRED(pte)) { flags &= ~PTE_WIRED; pmap->pm_stats.wired_count--; } } if (prot & VM_PROT_WRITE) { /* Add write permissions. */ flags |= PTE_SW; if (!su) flags |= PTE_UW; if ((flags & PTE_MANAGED) != 0) vm_page_aflag_set(m, PGA_WRITEABLE); } else { /* Handle modified pages, sense modify status. */ /* * The PTE_MODIFIED flag could be set by underlying * TLB misses since we last read it (above), possibly * other CPUs could update it so we check in the PTE * directly rather than rely on that saved local flags * copy. */ if (PTE_ISMODIFIED(pte)) vm_page_dirty(m); } if (prot & VM_PROT_EXECUTE) { flags |= PTE_SX; if (!su) flags |= PTE_UX; /* * Check existing flags for execute permissions: if we * are turning execute permissions on, icache should * be flushed. */ if ((pte->flags & (PTE_UX | PTE_SX)) == 0) sync++; } flags &= ~PTE_REFERENCED; /* * The new flags value is all calculated -- only now actually * update the PTE. */ mtx_lock_spin(&tlbivax_mutex); tlb_miss_lock(); tlb0_flush_entry(va); pte->flags = flags; tlb_miss_unlock(); mtx_unlock_spin(&tlbivax_mutex); } else { /* * If there is an existing mapping, but it's for a different * physical address, pte_enter() will delete the old mapping. */ //if ((pte != NULL) && PTE_ISVALID(pte)) // debugf("mmu_booke_enter_locked: replace\n"); //else // debugf("mmu_booke_enter_locked: new\n"); /* Now set up the flags and install the new mapping. */ flags = (PTE_SR | PTE_VALID); flags |= PTE_M; if (!su) flags |= PTE_UR; if (prot & VM_PROT_WRITE) { flags |= PTE_SW; if (!su) flags |= PTE_UW; if ((m->oflags & VPO_UNMANAGED) == 0) vm_page_aflag_set(m, PGA_WRITEABLE); } if (prot & VM_PROT_EXECUTE) { flags |= PTE_SX; if (!su) flags |= PTE_UX; } /* If its wired update stats. */ if ((pmap_flags & PMAP_ENTER_WIRED) != 0) flags |= PTE_WIRED; error = pte_enter(mmu, pmap, m, va, flags, (pmap_flags & PMAP_ENTER_NOSLEEP) != 0); if (error != 0) return (KERN_RESOURCE_SHORTAGE); if ((flags & PMAP_ENTER_WIRED) != 0) pmap->pm_stats.wired_count++; /* Flush the real memory from the instruction cache. */ if (prot & VM_PROT_EXECUTE) sync++; } if (sync && (su || pmap == PCPU_GET(curpmap))) { __syncicache((void *)va, PAGE_SIZE); sync = 0; } return (KERN_SUCCESS); } /* * Maps a sequence of resident pages belonging to the same object. * The sequence begins with the given page m_start. This page is * mapped at the given virtual address start. Each subsequent page is * mapped at a virtual address that is offset from start by the same * amount as the page is offset from m_start within the object. The * last page in the sequence is the page with the largest offset from * m_start that can be mapped at a virtual address less than the given * virtual address end. Not every virtual page between start and end * is mapped; only those for which a resident page exists with the * corresponding offset from m_start are mapped. */ static void mmu_booke_enter_object(mmu_t mmu, pmap_t pmap, vm_offset_t start, vm_offset_t end, vm_page_t m_start, vm_prot_t prot) { vm_page_t m; vm_pindex_t diff, psize; VM_OBJECT_ASSERT_LOCKED(m_start->object); psize = atop(end - start); m = m_start; rw_wlock(&pvh_global_lock); PMAP_LOCK(pmap); while (m != NULL && (diff = m->pindex - m_start->pindex) < psize) { mmu_booke_enter_locked(mmu, pmap, start + ptoa(diff), m, prot & (VM_PROT_READ | VM_PROT_EXECUTE), PMAP_ENTER_NOSLEEP, 0); m = TAILQ_NEXT(m, listq); } rw_wunlock(&pvh_global_lock); PMAP_UNLOCK(pmap); } static void mmu_booke_enter_quick(mmu_t mmu, pmap_t pmap, vm_offset_t va, vm_page_t m, vm_prot_t prot) { rw_wlock(&pvh_global_lock); PMAP_LOCK(pmap); mmu_booke_enter_locked(mmu, pmap, va, m, prot & (VM_PROT_READ | VM_PROT_EXECUTE), PMAP_ENTER_NOSLEEP, 0); rw_wunlock(&pvh_global_lock); PMAP_UNLOCK(pmap); } /* * Remove the given range of addresses from the specified map. * * It is assumed that the start and end are properly rounded to the page size. */ static void mmu_booke_remove(mmu_t mmu, pmap_t pmap, vm_offset_t va, vm_offset_t endva) { pte_t *pte; uint8_t hold_flag; int su = (pmap == kernel_pmap); //debugf("mmu_booke_remove: s (su = %d pmap=0x%08x tid=%d va=0x%08x endva=0x%08x)\n", // su, (u_int32_t)pmap, pmap->pm_tid, va, endva); if (su) { KASSERT(((va >= virtual_avail) && (va <= VM_MAX_KERNEL_ADDRESS)), ("mmu_booke_remove: kernel pmap, non kernel va")); } else { KASSERT((va <= VM_MAXUSER_ADDRESS), ("mmu_booke_remove: user pmap, non user va")); } if (PMAP_REMOVE_DONE(pmap)) { //debugf("mmu_booke_remove: e (empty)\n"); return; } hold_flag = PTBL_HOLD_FLAG(pmap); //debugf("mmu_booke_remove: hold_flag = %d\n", hold_flag); rw_wlock(&pvh_global_lock); PMAP_LOCK(pmap); for (; va < endva; va += PAGE_SIZE) { pte = pte_find(mmu, pmap, va); if ((pte != NULL) && PTE_ISVALID(pte)) pte_remove(mmu, pmap, va, hold_flag); } PMAP_UNLOCK(pmap); rw_wunlock(&pvh_global_lock); //debugf("mmu_booke_remove: e\n"); } /* * Remove physical page from all pmaps in which it resides. */ static void mmu_booke_remove_all(mmu_t mmu, vm_page_t m) { pv_entry_t pv, pvn; uint8_t hold_flag; rw_wlock(&pvh_global_lock); for (pv = TAILQ_FIRST(&m->md.pv_list); pv != NULL; pv = pvn) { pvn = TAILQ_NEXT(pv, pv_link); PMAP_LOCK(pv->pv_pmap); hold_flag = PTBL_HOLD_FLAG(pv->pv_pmap); pte_remove(mmu, pv->pv_pmap, pv->pv_va, hold_flag); PMAP_UNLOCK(pv->pv_pmap); } vm_page_aflag_clear(m, PGA_WRITEABLE); rw_wunlock(&pvh_global_lock); } /* * Map a range of physical addresses into kernel virtual address space. */ static vm_offset_t mmu_booke_map(mmu_t mmu, vm_offset_t *virt, vm_paddr_t pa_start, vm_paddr_t pa_end, int prot) { vm_offset_t sva = *virt; vm_offset_t va = sva; //debugf("mmu_booke_map: s (sva = 0x%08x pa_start = 0x%08x pa_end = 0x%08x)\n", // sva, pa_start, pa_end); while (pa_start < pa_end) { mmu_booke_kenter(mmu, va, pa_start); va += PAGE_SIZE; pa_start += PAGE_SIZE; } *virt = va; //debugf("mmu_booke_map: e (va = 0x%08x)\n", va); return (sva); } /* * The pmap must be activated before it's address space can be accessed in any * way. */ static void mmu_booke_activate(mmu_t mmu, struct thread *td) { pmap_t pmap; u_int cpuid; pmap = &td->td_proc->p_vmspace->vm_pmap; CTR5(KTR_PMAP, "%s: s (td = %p, proc = '%s', id = %d, pmap = 0x%08x)", __func__, td, td->td_proc->p_comm, td->td_proc->p_pid, pmap); KASSERT((pmap != kernel_pmap), ("mmu_booke_activate: kernel_pmap!")); sched_pin(); cpuid = PCPU_GET(cpuid); CPU_SET_ATOMIC(cpuid, &pmap->pm_active); PCPU_SET(curpmap, pmap); if (pmap->pm_tid[cpuid] == TID_NONE) tid_alloc(pmap); /* Load PID0 register with pmap tid value. */ mtspr(SPR_PID0, pmap->pm_tid[cpuid]); __asm __volatile("isync"); mtspr(SPR_DBCR0, td->td_pcb->pcb_cpu.booke.dbcr0); sched_unpin(); CTR3(KTR_PMAP, "%s: e (tid = %d for '%s')", __func__, pmap->pm_tid[PCPU_GET(cpuid)], td->td_proc->p_comm); } /* * Deactivate the specified process's address space. */ static void mmu_booke_deactivate(mmu_t mmu, struct thread *td) { pmap_t pmap; pmap = &td->td_proc->p_vmspace->vm_pmap; CTR5(KTR_PMAP, "%s: td=%p, proc = '%s', id = %d, pmap = 0x%08x", __func__, td, td->td_proc->p_comm, td->td_proc->p_pid, pmap); td->td_pcb->pcb_cpu.booke.dbcr0 = mfspr(SPR_DBCR0); CPU_CLR_ATOMIC(PCPU_GET(cpuid), &pmap->pm_active); PCPU_SET(curpmap, NULL); } /* * Copy the range specified by src_addr/len * from the source map to the range dst_addr/len * in the destination map. * * This routine is only advisory and need not do anything. */ static void mmu_booke_copy(mmu_t mmu, pmap_t dst_pmap, pmap_t src_pmap, vm_offset_t dst_addr, vm_size_t len, vm_offset_t src_addr) { } /* * Set the physical protection on the specified range of this map as requested. */ static void mmu_booke_protect(mmu_t mmu, pmap_t pmap, vm_offset_t sva, vm_offset_t eva, vm_prot_t prot) { vm_offset_t va; vm_page_t m; pte_t *pte; if ((prot & VM_PROT_READ) == VM_PROT_NONE) { mmu_booke_remove(mmu, pmap, sva, eva); return; } if (prot & VM_PROT_WRITE) return; PMAP_LOCK(pmap); for (va = sva; va < eva; va += PAGE_SIZE) { if ((pte = pte_find(mmu, pmap, va)) != NULL) { if (PTE_ISVALID(pte)) { m = PHYS_TO_VM_PAGE(PTE_PA(pte)); mtx_lock_spin(&tlbivax_mutex); tlb_miss_lock(); /* Handle modified pages. */ if (PTE_ISMODIFIED(pte) && PTE_ISMANAGED(pte)) vm_page_dirty(m); tlb0_flush_entry(va); pte->flags &= ~(PTE_UW | PTE_SW | PTE_MODIFIED); tlb_miss_unlock(); mtx_unlock_spin(&tlbivax_mutex); } } } PMAP_UNLOCK(pmap); } /* * Clear the write and modified bits in each of the given page's mappings. */ static void mmu_booke_remove_write(mmu_t mmu, vm_page_t m) { pv_entry_t pv; pte_t *pte; KASSERT((m->oflags & VPO_UNMANAGED) == 0, ("mmu_booke_remove_write: page %p is not managed", m)); /* * If the page is not exclusive busied, then PGA_WRITEABLE cannot be * set by another thread while the object is locked. Thus, * if PGA_WRITEABLE is clear, no page table entries need updating. */ VM_OBJECT_ASSERT_WLOCKED(m->object); if (!vm_page_xbusied(m) && (m->aflags & PGA_WRITEABLE) == 0) return; rw_wlock(&pvh_global_lock); TAILQ_FOREACH(pv, &m->md.pv_list, pv_link) { PMAP_LOCK(pv->pv_pmap); if ((pte = pte_find(mmu, pv->pv_pmap, pv->pv_va)) != NULL) { if (PTE_ISVALID(pte)) { m = PHYS_TO_VM_PAGE(PTE_PA(pte)); mtx_lock_spin(&tlbivax_mutex); tlb_miss_lock(); /* Handle modified pages. */ if (PTE_ISMODIFIED(pte)) vm_page_dirty(m); /* Flush mapping from TLB0. */ pte->flags &= ~(PTE_UW | PTE_SW | PTE_MODIFIED); tlb_miss_unlock(); mtx_unlock_spin(&tlbivax_mutex); } } PMAP_UNLOCK(pv->pv_pmap); } vm_page_aflag_clear(m, PGA_WRITEABLE); rw_wunlock(&pvh_global_lock); } static void mmu_booke_sync_icache(mmu_t mmu, pmap_t pm, vm_offset_t va, vm_size_t sz) { pte_t *pte; pmap_t pmap; vm_page_t m; vm_offset_t addr; vm_paddr_t pa = 0; int active, valid; va = trunc_page(va); sz = round_page(sz); rw_wlock(&pvh_global_lock); pmap = PCPU_GET(curpmap); active = (pm == kernel_pmap || pm == pmap) ? 1 : 0; while (sz > 0) { PMAP_LOCK(pm); pte = pte_find(mmu, pm, va); valid = (pte != NULL && PTE_ISVALID(pte)) ? 1 : 0; if (valid) pa = PTE_PA(pte); PMAP_UNLOCK(pm); if (valid) { if (!active) { /* Create a mapping in the active pmap. */ addr = 0; m = PHYS_TO_VM_PAGE(pa); PMAP_LOCK(pmap); pte_enter(mmu, pmap, m, addr, PTE_SR | PTE_VALID | PTE_UR, FALSE); __syncicache((void *)addr, PAGE_SIZE); pte_remove(mmu, pmap, addr, PTBL_UNHOLD); PMAP_UNLOCK(pmap); } else __syncicache((void *)va, PAGE_SIZE); } va += PAGE_SIZE; sz -= PAGE_SIZE; } rw_wunlock(&pvh_global_lock); } /* * Atomically extract and hold the physical page with the given * pmap and virtual address pair if that mapping permits the given * protection. */ static vm_page_t mmu_booke_extract_and_hold(mmu_t mmu, pmap_t pmap, vm_offset_t va, vm_prot_t prot) { pte_t *pte; vm_page_t m; uint32_t pte_wbit; vm_paddr_t pa; m = NULL; pa = 0; PMAP_LOCK(pmap); retry: pte = pte_find(mmu, pmap, va); if ((pte != NULL) && PTE_ISVALID(pte)) { if (pmap == kernel_pmap) pte_wbit = PTE_SW; else pte_wbit = PTE_UW; if ((pte->flags & pte_wbit) || ((prot & VM_PROT_WRITE) == 0)) { if (vm_page_pa_tryrelock(pmap, PTE_PA(pte), &pa)) goto retry; m = PHYS_TO_VM_PAGE(PTE_PA(pte)); vm_page_hold(m); } } PA_UNLOCK_COND(pa); PMAP_UNLOCK(pmap); return (m); } /* * Initialize a vm_page's machine-dependent fields. */ static void mmu_booke_page_init(mmu_t mmu, vm_page_t m) { TAILQ_INIT(&m->md.pv_list); } /* * mmu_booke_zero_page_area zeros the specified hardware page by * mapping it into virtual memory and using bzero to clear * its contents. * * off and size must reside within a single page. */ static void mmu_booke_zero_page_area(mmu_t mmu, vm_page_t m, int off, int size) { vm_offset_t va; /* XXX KASSERT off and size are within a single page? */ mtx_lock(&zero_page_mutex); va = zero_page_va; mmu_booke_kenter(mmu, va, VM_PAGE_TO_PHYS(m)); bzero((caddr_t)va + off, size); mmu_booke_kremove(mmu, va); mtx_unlock(&zero_page_mutex); } /* * mmu_booke_zero_page zeros the specified hardware page. */ static void mmu_booke_zero_page(mmu_t mmu, vm_page_t m) { mmu_booke_zero_page_area(mmu, m, 0, PAGE_SIZE); } /* * mmu_booke_copy_page copies the specified (machine independent) page by * mapping the page into virtual memory and using memcopy to copy the page, * one machine dependent page at a time. */ static void mmu_booke_copy_page(mmu_t mmu, vm_page_t sm, vm_page_t dm) { vm_offset_t sva, dva; sva = copy_page_src_va; dva = copy_page_dst_va; mtx_lock(©_page_mutex); mmu_booke_kenter(mmu, sva, VM_PAGE_TO_PHYS(sm)); mmu_booke_kenter(mmu, dva, VM_PAGE_TO_PHYS(dm)); memcpy((caddr_t)dva, (caddr_t)sva, PAGE_SIZE); mmu_booke_kremove(mmu, dva); mmu_booke_kremove(mmu, sva); mtx_unlock(©_page_mutex); } static inline void mmu_booke_copy_pages(mmu_t mmu, vm_page_t *ma, vm_offset_t a_offset, vm_page_t *mb, vm_offset_t b_offset, int xfersize) { void *a_cp, *b_cp; vm_offset_t a_pg_offset, b_pg_offset; int cnt; mtx_lock(©_page_mutex); while (xfersize > 0) { a_pg_offset = a_offset & PAGE_MASK; cnt = min(xfersize, PAGE_SIZE - a_pg_offset); mmu_booke_kenter(mmu, copy_page_src_va, VM_PAGE_TO_PHYS(ma[a_offset >> PAGE_SHIFT])); a_cp = (char *)copy_page_src_va + a_pg_offset; b_pg_offset = b_offset & PAGE_MASK; cnt = min(cnt, PAGE_SIZE - b_pg_offset); mmu_booke_kenter(mmu, copy_page_dst_va, VM_PAGE_TO_PHYS(mb[b_offset >> PAGE_SHIFT])); b_cp = (char *)copy_page_dst_va + b_pg_offset; bcopy(a_cp, b_cp, cnt); mmu_booke_kremove(mmu, copy_page_dst_va); mmu_booke_kremove(mmu, copy_page_src_va); a_offset += cnt; b_offset += cnt; xfersize -= cnt; } mtx_unlock(©_page_mutex); } /* * mmu_booke_zero_page_idle zeros the specified hardware page by mapping it * into virtual memory and using bzero to clear its contents. This is intended * to be called from the vm_pagezero process only and outside of Giant. No * lock is required. */ static void mmu_booke_zero_page_idle(mmu_t mmu, vm_page_t m) { vm_offset_t va; va = zero_page_idle_va; mmu_booke_kenter(mmu, va, VM_PAGE_TO_PHYS(m)); bzero((caddr_t)va, PAGE_SIZE); mmu_booke_kremove(mmu, va); } /* * Return whether or not the specified physical page was modified * in any of physical maps. */ static boolean_t mmu_booke_is_modified(mmu_t mmu, vm_page_t m) { pte_t *pte; pv_entry_t pv; boolean_t rv; KASSERT((m->oflags & VPO_UNMANAGED) == 0, ("mmu_booke_is_modified: page %p is not managed", m)); rv = FALSE; /* * If the page is not exclusive busied, then PGA_WRITEABLE cannot be * concurrently set while the object is locked. Thus, if PGA_WRITEABLE * is clear, no PTEs can be modified. */ VM_OBJECT_ASSERT_WLOCKED(m->object); if (!vm_page_xbusied(m) && (m->aflags & PGA_WRITEABLE) == 0) return (rv); rw_wlock(&pvh_global_lock); TAILQ_FOREACH(pv, &m->md.pv_list, pv_link) { PMAP_LOCK(pv->pv_pmap); if ((pte = pte_find(mmu, pv->pv_pmap, pv->pv_va)) != NULL && PTE_ISVALID(pte)) { if (PTE_ISMODIFIED(pte)) rv = TRUE; } PMAP_UNLOCK(pv->pv_pmap); if (rv) break; } rw_wunlock(&pvh_global_lock); return (rv); } /* * Return whether or not the specified virtual address is eligible * for prefault. */ static boolean_t mmu_booke_is_prefaultable(mmu_t mmu, pmap_t pmap, vm_offset_t addr) { return (FALSE); } /* * Return whether or not the specified physical page was referenced * in any physical maps. */ static boolean_t mmu_booke_is_referenced(mmu_t mmu, vm_page_t m) { pte_t *pte; pv_entry_t pv; boolean_t rv; KASSERT((m->oflags & VPO_UNMANAGED) == 0, ("mmu_booke_is_referenced: page %p is not managed", m)); rv = FALSE; rw_wlock(&pvh_global_lock); TAILQ_FOREACH(pv, &m->md.pv_list, pv_link) { PMAP_LOCK(pv->pv_pmap); if ((pte = pte_find(mmu, pv->pv_pmap, pv->pv_va)) != NULL && PTE_ISVALID(pte)) { if (PTE_ISREFERENCED(pte)) rv = TRUE; } PMAP_UNLOCK(pv->pv_pmap); if (rv) break; } rw_wunlock(&pvh_global_lock); return (rv); } /* * Clear the modify bits on the specified physical page. */ static void mmu_booke_clear_modify(mmu_t mmu, vm_page_t m) { pte_t *pte; pv_entry_t pv; KASSERT((m->oflags & VPO_UNMANAGED) == 0, ("mmu_booke_clear_modify: page %p is not managed", m)); VM_OBJECT_ASSERT_WLOCKED(m->object); KASSERT(!vm_page_xbusied(m), ("mmu_booke_clear_modify: page %p is exclusive busied", m)); /* * If the page is not PG_AWRITEABLE, then no PTEs can be modified. * If the object containing the page is locked and the page is not * exclusive busied, then PG_AWRITEABLE cannot be concurrently set. */ if ((m->aflags & PGA_WRITEABLE) == 0) return; rw_wlock(&pvh_global_lock); TAILQ_FOREACH(pv, &m->md.pv_list, pv_link) { PMAP_LOCK(pv->pv_pmap); if ((pte = pte_find(mmu, pv->pv_pmap, pv->pv_va)) != NULL && PTE_ISVALID(pte)) { mtx_lock_spin(&tlbivax_mutex); tlb_miss_lock(); if (pte->flags & (PTE_SW | PTE_UW | PTE_MODIFIED)) { tlb0_flush_entry(pv->pv_va); pte->flags &= ~(PTE_SW | PTE_UW | PTE_MODIFIED | PTE_REFERENCED); } tlb_miss_unlock(); mtx_unlock_spin(&tlbivax_mutex); } PMAP_UNLOCK(pv->pv_pmap); } rw_wunlock(&pvh_global_lock); } /* * Return a count of reference bits for a page, clearing those bits. * It is not necessary for every reference bit to be cleared, but it * is necessary that 0 only be returned when there are truly no * reference bits set. * * XXX: The exact number of bits to check and clear is a matter that * should be tested and standardized at some point in the future for * optimal aging of shared pages. */ static int mmu_booke_ts_referenced(mmu_t mmu, vm_page_t m) { pte_t *pte; pv_entry_t pv; int count; KASSERT((m->oflags & VPO_UNMANAGED) == 0, ("mmu_booke_ts_referenced: page %p is not managed", m)); count = 0; rw_wlock(&pvh_global_lock); TAILQ_FOREACH(pv, &m->md.pv_list, pv_link) { PMAP_LOCK(pv->pv_pmap); if ((pte = pte_find(mmu, pv->pv_pmap, pv->pv_va)) != NULL && PTE_ISVALID(pte)) { if (PTE_ISREFERENCED(pte)) { mtx_lock_spin(&tlbivax_mutex); tlb_miss_lock(); tlb0_flush_entry(pv->pv_va); pte->flags &= ~PTE_REFERENCED; tlb_miss_unlock(); mtx_unlock_spin(&tlbivax_mutex); if (++count > 4) { PMAP_UNLOCK(pv->pv_pmap); break; } } } PMAP_UNLOCK(pv->pv_pmap); } rw_wunlock(&pvh_global_lock); return (count); } /* * Clear the wired attribute from the mappings for the specified range of * addresses in the given pmap. Every valid mapping within that range must * have the wired attribute set. In contrast, invalid mappings cannot have * the wired attribute set, so they are ignored. * * The wired attribute of the page table entry is not a hardware feature, so * there is no need to invalidate any TLB entries. */ static void mmu_booke_unwire(mmu_t mmu, pmap_t pmap, vm_offset_t sva, vm_offset_t eva) { vm_offset_t va; pte_t *pte; PMAP_LOCK(pmap); for (va = sva; va < eva; va += PAGE_SIZE) { if ((pte = pte_find(mmu, pmap, va)) != NULL && PTE_ISVALID(pte)) { if (!PTE_ISWIRED(pte)) panic("mmu_booke_unwire: pte %p isn't wired", pte); pte->flags &= ~PTE_WIRED; pmap->pm_stats.wired_count--; } } PMAP_UNLOCK(pmap); } /* * Return true if the pmap's pv is one of the first 16 pvs linked to from this * page. This count may be changed upwards or downwards in the future; it is * only necessary that true be returned for a small subset of pmaps for proper * page aging. */ static boolean_t mmu_booke_page_exists_quick(mmu_t mmu, pmap_t pmap, vm_page_t m) { pv_entry_t pv; int loops; boolean_t rv; KASSERT((m->oflags & VPO_UNMANAGED) == 0, ("mmu_booke_page_exists_quick: page %p is not managed", m)); loops = 0; rv = FALSE; rw_wlock(&pvh_global_lock); TAILQ_FOREACH(pv, &m->md.pv_list, pv_link) { if (pv->pv_pmap == pmap) { rv = TRUE; break; } if (++loops >= 16) break; } rw_wunlock(&pvh_global_lock); return (rv); } /* * Return the number of managed mappings to the given physical page that are * wired. */ static int mmu_booke_page_wired_mappings(mmu_t mmu, vm_page_t m) { pv_entry_t pv; pte_t *pte; int count = 0; if ((m->oflags & VPO_UNMANAGED) != 0) return (count); rw_wlock(&pvh_global_lock); TAILQ_FOREACH(pv, &m->md.pv_list, pv_link) { PMAP_LOCK(pv->pv_pmap); if ((pte = pte_find(mmu, pv->pv_pmap, pv->pv_va)) != NULL) if (PTE_ISVALID(pte) && PTE_ISWIRED(pte)) count++; PMAP_UNLOCK(pv->pv_pmap); } rw_wunlock(&pvh_global_lock); return (count); } static int mmu_booke_dev_direct_mapped(mmu_t mmu, vm_paddr_t pa, vm_size_t size) { int i; vm_offset_t va; /* * This currently does not work for entries that * overlap TLB1 entries. */ for (i = 0; i < tlb1_idx; i ++) { if (tlb1_iomapped(i, pa, size, &va) == 0) return (0); } return (EFAULT); } void mmu_booke_dumpsys_map(mmu_t mmu, vm_paddr_t pa, size_t sz, void **va) { vm_paddr_t ppa; vm_offset_t ofs; vm_size_t gran; /* Minidumps are based on virtual memory addresses. */ if (do_minidump) { *va = (void *)pa; return; } /* Raw physical memory dumps don't have a virtual address. */ /* We always map a 256MB page at 256M. */ gran = 256 * 1024 * 1024; ppa = pa & ~(gran - 1); ofs = pa - ppa; *va = (void *)gran; tlb1_set_entry((vm_offset_t)va, ppa, gran, _TLB_ENTRY_IO); if (sz > (gran - ofs)) tlb1_set_entry((vm_offset_t)(va + gran), ppa + gran, gran, _TLB_ENTRY_IO); } void mmu_booke_dumpsys_unmap(mmu_t mmu, vm_paddr_t pa, size_t sz, void *va) { vm_paddr_t ppa; vm_offset_t ofs; vm_size_t gran; /* Minidumps are based on virtual memory addresses. */ /* Nothing to do... */ if (do_minidump) return; /* Raw physical memory dumps don't have a virtual address. */ tlb1_idx--; tlb1[tlb1_idx].mas1 = 0; tlb1[tlb1_idx].mas2 = 0; tlb1[tlb1_idx].mas3 = 0; tlb1_write_entry(tlb1_idx); gran = 256 * 1024 * 1024; ppa = pa & ~(gran - 1); ofs = pa - ppa; if (sz > (gran - ofs)) { tlb1_idx--; tlb1[tlb1_idx].mas1 = 0; tlb1[tlb1_idx].mas2 = 0; tlb1[tlb1_idx].mas3 = 0; tlb1_write_entry(tlb1_idx); } } extern struct dump_pa dump_map[PHYS_AVAIL_SZ + 1]; void mmu_booke_scan_init(mmu_t mmu) { vm_offset_t va; pte_t *pte; int i; if (!do_minidump) { /* Initialize phys. segments for dumpsys(). */ memset(&dump_map, 0, sizeof(dump_map)); mem_regions(&physmem_regions, &physmem_regions_sz, &availmem_regions, &availmem_regions_sz); for (i = 0; i < physmem_regions_sz; i++) { dump_map[i].pa_start = physmem_regions[i].mr_start; dump_map[i].pa_size = physmem_regions[i].mr_size; } return; } /* Virtual segments for minidumps: */ memset(&dump_map, 0, sizeof(dump_map)); /* 1st: kernel .data and .bss. */ dump_map[0].pa_start = trunc_page((uintptr_t)_etext); dump_map[0].pa_size = round_page((uintptr_t)_end) - dump_map[0].pa_start; /* 2nd: msgbuf and tables (see pmap_bootstrap()). */ dump_map[1].pa_start = data_start; dump_map[1].pa_size = data_end - data_start; /* 3rd: kernel VM. */ va = dump_map[1].pa_start + dump_map[1].pa_size; /* Find start of next chunk (from va). */ while (va < virtual_end) { /* Don't dump the buffer cache. */ if (va >= kmi.buffer_sva && va < kmi.buffer_eva) { va = kmi.buffer_eva; continue; } pte = pte_find(mmu, kernel_pmap, va); if (pte != NULL && PTE_ISVALID(pte)) break; va += PAGE_SIZE; } if (va < virtual_end) { dump_map[2].pa_start = va; va += PAGE_SIZE; /* Find last page in chunk. */ while (va < virtual_end) { /* Don't run into the buffer cache. */ if (va == kmi.buffer_sva) break; pte = pte_find(mmu, kernel_pmap, va); if (pte == NULL || !PTE_ISVALID(pte)) break; va += PAGE_SIZE; } dump_map[2].pa_size = va - dump_map[2].pa_start; } } /* * Map a set of physical memory pages into the kernel virtual address space. * Return a pointer to where it is mapped. This routine is intended to be used * for mapping device memory, NOT real memory. */ static void * mmu_booke_mapdev(mmu_t mmu, vm_paddr_t pa, vm_size_t size) { return (mmu_booke_mapdev_attr(mmu, pa, size, VM_MEMATTR_DEFAULT)); } static void * mmu_booke_mapdev_attr(mmu_t mmu, vm_paddr_t pa, vm_size_t size, vm_memattr_t ma) { void *res; uintptr_t va; vm_size_t sz; int i; /* * Check if this is premapped in TLB1. Note: this should probably also * check whether a sequence of TLB1 entries exist that match the * requirement, but now only checks the easy case. */ if (ma == VM_MEMATTR_DEFAULT) { for (i = 0; i < tlb1_idx; i++) { if (!(tlb1[i].mas1 & MAS1_VALID)) continue; if (pa >= tlb1[i].phys && (pa + size) <= (tlb1[i].phys + tlb1[i].size)) return (void *)(tlb1[i].virt + (pa - tlb1[i].phys)); } } size = roundup(size, PAGE_SIZE); /* * We leave a hole for device direct mapping between the maximum user * address (0x8000000) and the minimum KVA address (0xc0000000). If * devices are in there, just map them 1:1. If not, map them to the * device mapping area about VM_MAX_KERNEL_ADDRESS. These mapped * addresses should be pulled from an allocator, but since we do not * ever free TLB1 entries, it is safe just to increment a counter. * Note that there isn't a lot of address space here (128 MB) and it * is not at all difficult to imagine running out, since that is a 4:1 * compression from the 0xc0000000 - 0xf0000000 address space that gets * mapped there. */ if (pa >= (VM_MAXUSER_ADDRESS + PAGE_SIZE) && (pa + size - 1) < VM_MIN_KERNEL_ADDRESS) va = pa; else va = atomic_fetchadd_int(&tlb1_map_base, size); res = (void *)va; do { sz = 1 << (ilog2(size) & ~1); if (bootverbose) printf("Wiring VA=%x to PA=%x (size=%x), " "using TLB1[%d]\n", va, pa, sz, tlb1_idx); tlb1_set_entry(va, pa, sz, tlb_calc_wimg(pa, ma)); size -= sz; pa += sz; va += sz; } while (size > 0); return (res); } /* * 'Unmap' a range mapped by mmu_booke_mapdev(). */ static void mmu_booke_unmapdev(mmu_t mmu, vm_offset_t va, vm_size_t size) { #ifdef SUPPORTS_SHRINKING_TLB1 vm_offset_t base, offset; /* * Unmap only if this is inside kernel virtual space. */ if ((va >= VM_MIN_KERNEL_ADDRESS) && (va <= VM_MAX_KERNEL_ADDRESS)) { base = trunc_page(va); offset = va & PAGE_MASK; size = roundup(offset + size, PAGE_SIZE); kva_free(base, size); } #endif } /* * mmu_booke_object_init_pt preloads the ptes for a given object into the * specified pmap. This eliminates the blast of soft faults on process startup * and immediately after an mmap. */ static void mmu_booke_object_init_pt(mmu_t mmu, pmap_t pmap, vm_offset_t addr, vm_object_t object, vm_pindex_t pindex, vm_size_t size) { VM_OBJECT_ASSERT_WLOCKED(object); KASSERT(object->type == OBJT_DEVICE || object->type == OBJT_SG, ("mmu_booke_object_init_pt: non-device object")); } /* * Perform the pmap work for mincore. */ static int mmu_booke_mincore(mmu_t mmu, pmap_t pmap, vm_offset_t addr, vm_paddr_t *locked_pa) { /* XXX: this should be implemented at some point */ return (0); } /**************************************************************************/ /* TID handling */ /**************************************************************************/ /* * Allocate a TID. If necessary, steal one from someone else. * The new TID is flushed from the TLB before returning. */ static tlbtid_t tid_alloc(pmap_t pmap) { tlbtid_t tid; int thiscpu; KASSERT((pmap != kernel_pmap), ("tid_alloc: kernel pmap")); CTR2(KTR_PMAP, "%s: s (pmap = %p)", __func__, pmap); thiscpu = PCPU_GET(cpuid); tid = PCPU_GET(tid_next); if (tid > TID_MAX) tid = TID_MIN; PCPU_SET(tid_next, tid + 1); /* If we are stealing TID then clear the relevant pmap's field */ if (tidbusy[thiscpu][tid] != NULL) { CTR2(KTR_PMAP, "%s: warning: stealing tid %d", __func__, tid); tidbusy[thiscpu][tid]->pm_tid[thiscpu] = TID_NONE; /* Flush all entries from TLB0 matching this TID. */ - tid_flush(tid); + tid_flush(tid, tlb0_ways, tlb0_entries_per_way); } tidbusy[thiscpu][tid] = pmap; pmap->pm_tid[thiscpu] = tid; __asm __volatile("msync; isync"); CTR3(KTR_PMAP, "%s: e (%02d next = %02d)", __func__, tid, PCPU_GET(tid_next)); return (tid); } /**************************************************************************/ /* TLB0 handling */ /**************************************************************************/ static void tlb_print_entry(int i, uint32_t mas1, uint32_t mas2, uint32_t mas3, uint32_t mas7) { int as; char desc[3]; tlbtid_t tid; vm_size_t size; unsigned int tsize; desc[2] = '\0'; if (mas1 & MAS1_VALID) desc[0] = 'V'; else desc[0] = ' '; if (mas1 & MAS1_IPROT) desc[1] = 'P'; else desc[1] = ' '; as = (mas1 & MAS1_TS_MASK) ? 1 : 0; tid = MAS1_GETTID(mas1); tsize = (mas1 & MAS1_TSIZE_MASK) >> MAS1_TSIZE_SHIFT; size = 0; if (tsize) size = tsize2size(tsize); debugf("%3d: (%s) [AS=%d] " "sz = 0x%08x tsz = %d tid = %d mas1 = 0x%08x " "mas2(va) = 0x%08x mas3(pa) = 0x%08x mas7 = 0x%08x\n", i, desc, as, size, tsize, tid, mas1, mas2, mas3, mas7); } /* Convert TLB0 va and way number to tlb0[] table index. */ static inline unsigned int tlb0_tableidx(vm_offset_t va, unsigned int way) { unsigned int idx; idx = (way * TLB0_ENTRIES_PER_WAY); idx += (va & MAS2_TLB0_ENTRY_IDX_MASK) >> MAS2_TLB0_ENTRY_IDX_SHIFT; return (idx); } /* * Invalidate TLB0 entry. */ static inline void tlb0_flush_entry(vm_offset_t va) { CTR2(KTR_PMAP, "%s: s va=0x%08x", __func__, va); mtx_assert(&tlbivax_mutex, MA_OWNED); __asm __volatile("tlbivax 0, %0" :: "r"(va & MAS2_EPN_MASK)); __asm __volatile("isync; msync"); __asm __volatile("tlbsync; msync"); CTR1(KTR_PMAP, "%s: e", __func__); } /* Print out contents of the MAS registers for each TLB0 entry */ void tlb0_print_tlbentries(void) { uint32_t mas0, mas1, mas2, mas3, mas7; int entryidx, way, idx; debugf("TLB0 entries:\n"); for (way = 0; way < TLB0_WAYS; way ++) for (entryidx = 0; entryidx < TLB0_ENTRIES_PER_WAY; entryidx++) { mas0 = MAS0_TLBSEL(0) | MAS0_ESEL(way); mtspr(SPR_MAS0, mas0); __asm __volatile("isync"); mas2 = entryidx << MAS2_TLB0_ENTRY_IDX_SHIFT; mtspr(SPR_MAS2, mas2); __asm __volatile("isync; tlbre"); mas1 = mfspr(SPR_MAS1); mas2 = mfspr(SPR_MAS2); mas3 = mfspr(SPR_MAS3); mas7 = mfspr(SPR_MAS7); idx = tlb0_tableidx(mas2, way); tlb_print_entry(idx, mas1, mas2, mas3, mas7); } } /**************************************************************************/ /* TLB1 handling */ /**************************************************************************/ /* * TLB1 mapping notes: * * TLB1[0] Kernel text and data. * TLB1[1-15] Additional kernel text and data mappings (if required), PCI * windows, other devices mappings. */ /* * Write given entry to TLB1 hardware. * Use 32 bit pa, clear 4 high-order bits of RPN (mas7). */ static void tlb1_write_entry(unsigned int idx) { uint32_t mas0, mas7; //debugf("tlb1_write_entry: s\n"); /* Clear high order RPN bits */ mas7 = 0; /* Select entry */ mas0 = MAS0_TLBSEL(1) | MAS0_ESEL(idx); //debugf("tlb1_write_entry: mas0 = 0x%08x\n", mas0); mtspr(SPR_MAS0, mas0); __asm __volatile("isync"); mtspr(SPR_MAS1, tlb1[idx].mas1); __asm __volatile("isync"); mtspr(SPR_MAS2, tlb1[idx].mas2); __asm __volatile("isync"); mtspr(SPR_MAS3, tlb1[idx].mas3); __asm __volatile("isync"); mtspr(SPR_MAS7, mas7); __asm __volatile("isync; tlbwe; isync; msync"); //debugf("tlb1_write_entry: e\n"); } /* * Return the largest uint value log such that 2^log <= num. */ static unsigned int ilog2(unsigned int num) { int lz; __asm ("cntlzw %0, %1" : "=r" (lz) : "r" (num)); return (31 - lz); } /* * Convert TLB TSIZE value to mapped region size. */ static vm_size_t tsize2size(unsigned int tsize) { /* * size = 4^tsize KB * size = 4^tsize * 2^10 = 2^(2 * tsize - 10) */ return ((1 << (2 * tsize)) * 1024); } /* * Convert region size (must be power of 4) to TLB TSIZE value. */ static unsigned int size2tsize(vm_size_t size) { return (ilog2(size) / 2 - 5); } /* * Register permanent kernel mapping in TLB1. * * Entries are created starting from index 0 (current free entry is * kept in tlb1_idx) and are not supposed to be invalidated. */ static int tlb1_set_entry(vm_offset_t va, vm_offset_t pa, vm_size_t size, uint32_t flags) { uint32_t ts, tid; int tsize, index; index = atomic_fetchadd_int(&tlb1_idx, 1); if (index >= TLB1_ENTRIES) { printf("tlb1_set_entry: TLB1 full!\n"); return (-1); } /* Convert size to TSIZE */ tsize = size2tsize(size); tid = (TID_KERNEL << MAS1_TID_SHIFT) & MAS1_TID_MASK; /* XXX TS is hard coded to 0 for now as we only use single address space */ ts = (0 << MAS1_TS_SHIFT) & MAS1_TS_MASK; /* * Atomicity is preserved by the atomic increment above since nothing * is ever removed from tlb1. */ tlb1[index].phys = pa; tlb1[index].virt = va; tlb1[index].size = size; tlb1[index].mas1 = MAS1_VALID | MAS1_IPROT | ts | tid; tlb1[index].mas1 |= ((tsize << MAS1_TSIZE_SHIFT) & MAS1_TSIZE_MASK); tlb1[index].mas2 = (va & MAS2_EPN_MASK) | flags; /* Set supervisor RWX permission bits */ tlb1[index].mas3 = (pa & MAS3_RPN) | MAS3_SR | MAS3_SW | MAS3_SX; tlb1_write_entry(index); /* * XXX in general TLB1 updates should be propagated between CPUs, * since current design assumes to have the same TLB1 set-up on all * cores. */ return (0); } /* * Map in contiguous RAM region into the TLB1 using maximum of * KERNEL_REGION_MAX_TLB_ENTRIES entries. * * If necessary round up last entry size and return total size * used by all allocated entries. */ vm_size_t tlb1_mapin_region(vm_offset_t va, vm_paddr_t pa, vm_size_t size) { vm_size_t pgs[KERNEL_REGION_MAX_TLB_ENTRIES]; vm_size_t mapped, pgsz, base, mask; int idx, nents; /* Round up to the next 1M */ size = (size + (1 << 20) - 1) & ~((1 << 20) - 1); mapped = 0; idx = 0; base = va; pgsz = 64*1024*1024; while (mapped < size) { while (mapped < size && idx < KERNEL_REGION_MAX_TLB_ENTRIES) { while (pgsz > (size - mapped)) pgsz >>= 2; pgs[idx++] = pgsz; mapped += pgsz; } /* We under-map. Correct for this. */ if (mapped < size) { while (pgs[idx - 1] == pgsz) { idx--; mapped -= pgsz; } /* XXX We may increase beyond out starting point. */ pgsz <<= 2; pgs[idx++] = pgsz; mapped += pgsz; } } nents = idx; mask = pgs[0] - 1; /* Align address to the boundary */ if (va & mask) { va = (va + mask) & ~mask; pa = (pa + mask) & ~mask; } for (idx = 0; idx < nents; idx++) { pgsz = pgs[idx]; debugf("%u: %x -> %x, size=%x\n", idx, pa, va, pgsz); tlb1_set_entry(va, pa, pgsz, _TLB_ENTRY_MEM); pa += pgsz; va += pgsz; } mapped = (va - base); printf("mapped size 0x%08x (wasted space 0x%08x)\n", mapped, mapped - size); return (mapped); } /* * TLB1 initialization routine, to be called after the very first * assembler level setup done in locore.S. */ void tlb1_init() { uint32_t mas0, mas1, mas2, mas3; uint32_t tsz; u_int i; if (bootinfo != NULL && bootinfo[0] != 1) { tlb1_idx = *((uint16_t *)(bootinfo + 8)); } else tlb1_idx = 1; /* The first entry/entries are used to map the kernel. */ for (i = 0; i < tlb1_idx; i++) { mas0 = MAS0_TLBSEL(1) | MAS0_ESEL(i); mtspr(SPR_MAS0, mas0); __asm __volatile("isync; tlbre"); mas1 = mfspr(SPR_MAS1); if ((mas1 & MAS1_VALID) == 0) continue; mas2 = mfspr(SPR_MAS2); mas3 = mfspr(SPR_MAS3); tlb1[i].mas1 = mas1; tlb1[i].mas2 = mfspr(SPR_MAS2); tlb1[i].mas3 = mas3; tlb1[i].virt = mas2 & MAS2_EPN_MASK; tlb1[i].phys = mas3 & MAS3_RPN; if (i == 0) kernload = mas3 & MAS3_RPN; tsz = (mas1 & MAS1_TSIZE_MASK) >> MAS1_TSIZE_SHIFT; tlb1[i].size = (tsz > 0) ? tsize2size(tsz) : 0; kernsize += tlb1[i].size; } #ifdef SMP bp_ntlb1s = tlb1_idx; #endif /* Purge the remaining entries */ for (i = tlb1_idx; i < TLB1_ENTRIES; i++) tlb1_write_entry(i); /* Setup TLB miss defaults */ set_mas4_defaults(); } vm_offset_t pmap_early_io_map(vm_paddr_t pa, vm_size_t size) { vm_paddr_t pa_base; vm_offset_t va, sz; int i; KASSERT(!pmap_bootstrapped, ("Do not use after PMAP is up!")); for (i = 0; i < tlb1_idx; i++) { if (!(tlb1[i].mas1 & MAS1_VALID)) continue; if (pa >= tlb1[i].phys && (pa + size) <= (tlb1[i].phys + tlb1[i].size)) return (tlb1[i].virt + (pa - tlb1[i].phys)); } pa_base = trunc_page(pa); size = roundup(size + (pa - pa_base), PAGE_SIZE); tlb1_map_base = roundup2(tlb1_map_base, 1 << (ilog2(size) & ~1)); va = tlb1_map_base + (pa - pa_base); do { sz = 1 << (ilog2(size) & ~1); tlb1_set_entry(tlb1_map_base, pa_base, sz, _TLB_ENTRY_IO); size -= sz; pa_base += sz; tlb1_map_base += sz; } while (size > 0); #ifdef SMP bp_ntlb1s = tlb1_idx; #endif return (va); } /* * Setup MAS4 defaults. * These values are loaded to MAS0-2 on a TLB miss. */ static void set_mas4_defaults(void) { uint32_t mas4; /* Defaults: TLB0, PID0, TSIZED=4K */ mas4 = MAS4_TLBSELD0; mas4 |= (TLB_SIZE_4K << MAS4_TSIZED_SHIFT) & MAS4_TSIZED_MASK; #ifdef SMP mas4 |= MAS4_MD; #endif mtspr(SPR_MAS4, mas4); __asm __volatile("isync"); } /* * Print out contents of the MAS registers for each TLB1 entry */ void tlb1_print_tlbentries(void) { uint32_t mas0, mas1, mas2, mas3, mas7; int i; debugf("TLB1 entries:\n"); for (i = 0; i < TLB1_ENTRIES; i++) { mas0 = MAS0_TLBSEL(1) | MAS0_ESEL(i); mtspr(SPR_MAS0, mas0); __asm __volatile("isync; tlbre"); mas1 = mfspr(SPR_MAS1); mas2 = mfspr(SPR_MAS2); mas3 = mfspr(SPR_MAS3); mas7 = mfspr(SPR_MAS7); tlb_print_entry(i, mas1, mas2, mas3, mas7); } } /* * Print out contents of the in-ram tlb1 table. */ void tlb1_print_entries(void) { int i; debugf("tlb1[] table entries:\n"); for (i = 0; i < TLB1_ENTRIES; i++) tlb_print_entry(i, tlb1[i].mas1, tlb1[i].mas2, tlb1[i].mas3, 0); } /* * Return 0 if the physical IO range is encompassed by one of the * the TLB1 entries, otherwise return related error code. */ static int tlb1_iomapped(int i, vm_paddr_t pa, vm_size_t size, vm_offset_t *va) { uint32_t prot; vm_paddr_t pa_start; vm_paddr_t pa_end; unsigned int entry_tsize; vm_size_t entry_size; *va = (vm_offset_t)NULL; /* Skip invalid entries */ if (!(tlb1[i].mas1 & MAS1_VALID)) return (EINVAL); /* * The entry must be cache-inhibited, guarded, and r/w * so it can function as an i/o page */ prot = tlb1[i].mas2 & (MAS2_I | MAS2_G); if (prot != (MAS2_I | MAS2_G)) return (EPERM); prot = tlb1[i].mas3 & (MAS3_SR | MAS3_SW); if (prot != (MAS3_SR | MAS3_SW)) return (EPERM); /* The address should be within the entry range. */ entry_tsize = (tlb1[i].mas1 & MAS1_TSIZE_MASK) >> MAS1_TSIZE_SHIFT; KASSERT((entry_tsize), ("tlb1_iomapped: invalid entry tsize")); entry_size = tsize2size(entry_tsize); pa_start = tlb1[i].mas3 & MAS3_RPN; pa_end = pa_start + entry_size - 1; if ((pa < pa_start) || ((pa + size) > pa_end)) return (ERANGE); /* Return virtual address of this mapping. */ *va = (tlb1[i].mas2 & MAS2_EPN_MASK) + (pa - pa_start); return (0); } Index: head/sys/powerpc/booke/trap_subr.S =================================================================== --- head/sys/powerpc/booke/trap_subr.S (revision 279749) +++ head/sys/powerpc/booke/trap_subr.S (revision 279750) @@ -1,893 +1,901 @@ /*- * Copyright (C) 2006-2009 Semihalf, Rafal Jaworowski * Copyright (C) 2006 Semihalf, Marian Balakowicz * Copyright (C) 2006 Juniper Networks, Inc. * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. The name of the author may not be used to endorse or promote products * derived from this software without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN * NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED * TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR * PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF * LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING * NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS * SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. * * $FreeBSD$ */ /*- * Copyright (C) 1995, 1996 Wolfgang Solfrank. * Copyright (C) 1995, 1996 TooLs GmbH. * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. All advertising materials mentioning features or use of this software * must display the following acknowledgement: * This product includes software developed by TooLs GmbH. * 4. The name of TooLs GmbH may not be used to endorse or promote products * derived from this software without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY TOOLS GMBH ``AS IS'' AND ANY EXPRESS OR * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. * IN NO EVENT SHALL TOOLS GMBH BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, * PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; * OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, * WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR * OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF * ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. * * from: $NetBSD: trap_subr.S,v 1.20 2002/04/22 23:20:08 kleink Exp $ */ /* * NOTICE: This is not a standalone file. to use it, #include it in * your port's locore.S, like so: * * #include */ /* * SPRG usage notes * * SPRG0 - pcpu pointer * SPRG1 - all interrupts except TLB miss, critical, machine check * SPRG2 - critical * SPRG3 - machine check * SPRG4-6 - scratch * */ /* Get the per-CPU data structure */ #define GET_CPUINFO(r) mfsprg0 r #define RES_GRANULE 32 #define RES_LOCK 0 /* offset to the 'lock' word */ #define RES_RECURSE 4 /* offset to the 'recurse' word */ /* * Standard interrupt prolog * * sprg_sp - SPRG{1-3} reg used to temporarily store the SP * savearea - temp save area (pc_{tempsave, disisave, critsave, mchksave}) * isrr0-1 - save restore registers with CPU state at interrupt time (may be * SRR0-1, CSRR0-1, MCSRR0-1 * * 1. saves in the given savearea: * - R30-31 * - DEAR, ESR * - xSRR0-1 * * 2. saves CR -> R30 * * 3. switches to kstack if needed * * 4. notes: * - R31 can be used as scratch register until a new frame is layed on * the stack with FRAME_SETUP * * - potential TLB miss: NO. Saveareas are always acessible via TLB1 * permanent entries, and within this prolog we do not dereference any * locations potentially not in the TLB */ #define STANDARD_PROLOG(sprg_sp, savearea, isrr0, isrr1) \ mtspr sprg_sp, %r1; /* Save SP */ \ GET_CPUINFO(%r1); /* Per-cpu structure */ \ stw %r30, (savearea+CPUSAVE_R30)(%r1); \ stw %r31, (savearea+CPUSAVE_R31)(%r1); \ mfdear %r30; \ mfesr %r31; \ stw %r30, (savearea+CPUSAVE_BOOKE_DEAR)(%r1); \ stw %r31, (savearea+CPUSAVE_BOOKE_ESR)(%r1); \ mfspr %r30, isrr0; \ mfspr %r31, isrr1; /* MSR at interrupt time */ \ stw %r30, (savearea+CPUSAVE_SRR0)(%r1); \ stw %r31, (savearea+CPUSAVE_SRR1)(%r1); \ isync; \ mfspr %r1, sprg_sp; /* Restore SP */ \ mfcr %r30; /* Save CR */ \ /* switch to per-thread kstack if intr taken in user mode */ \ mtcr %r31; /* MSR at interrupt time */ \ bf 17, 1f; \ GET_CPUINFO(%r1); /* Per-cpu structure */ \ lwz %r1, PC_CURPCB(%r1); /* Per-thread kernel stack */ \ 1: #define STANDARD_CRIT_PROLOG(sprg_sp, savearea, isrr0, isrr1) \ mtspr sprg_sp, %r1; /* Save SP */ \ GET_CPUINFO(%r1); /* Per-cpu structure */ \ stw %r30, (savearea+CPUSAVE_R30)(%r1); \ stw %r31, (savearea+CPUSAVE_R31)(%r1); \ mfdear %r30; \ mfesr %r31; \ stw %r30, (savearea+CPUSAVE_BOOKE_DEAR)(%r1); \ stw %r31, (savearea+CPUSAVE_BOOKE_ESR)(%r1); \ mfspr %r30, isrr0; \ mfspr %r31, isrr1; /* MSR at interrupt time */ \ stw %r30, (savearea+CPUSAVE_SRR0)(%r1); \ stw %r31, (savearea+CPUSAVE_SRR1)(%r1); \ mfspr %r30, SPR_SRR0; \ mfspr %r31, SPR_SRR1; /* MSR at interrupt time */ \ stw %r30, (savearea+CPUSAVE_SRR0+8)(%r1); \ stw %r31, (savearea+CPUSAVE_SRR1+8)(%r1); \ isync; \ mfspr %r1, sprg_sp; /* Restore SP */ \ mfcr %r30; /* Save CR */ \ /* switch to per-thread kstack if intr taken in user mode */ \ mtcr %r31; /* MSR at interrupt time */ \ bf 17, 1f; \ GET_CPUINFO(%r1); /* Per-cpu structure */ \ lwz %r1, PC_CURPCB(%r1); /* Per-thread kernel stack */ \ 1: /* * FRAME_SETUP assumes: * SPRG{1-3} SP at the time interrupt occured * savearea r30-r31, DEAR, ESR, xSRR0-1 * r30 CR * r31 scratch * r1 kernel stack * * sprg_sp - SPRG reg containing SP at the time interrupt occured * savearea - temp save * exc - exception number (EXC_xxx) * * 1. sets a new frame * 2. saves in the frame: * - R0, R1 (SP at the time of interrupt), R2, LR, CR * - R3-31 (R30-31 first restored from savearea) * - XER, CTR, DEAR, ESR (from savearea), xSRR0-1 * * Notes: * - potential TLB miss: YES, since we make dereferences to kstack, which * can happen not covered (we can have up to two DTLB misses if fortunate * enough i.e. when kstack crosses page boundary and both pages are * untranslated) */ #define FRAME_SETUP(sprg_sp, savearea, exc) \ mfspr %r31, sprg_sp; /* get saved SP */ \ /* establish a new stack frame and put everything on it */ \ stwu %r31, -FRAMELEN(%r1); \ stw %r0, FRAME_0+8(%r1); /* save r0 in the trapframe */ \ stw %r31, FRAME_1+8(%r1); /* save SP " " */ \ stw %r2, FRAME_2+8(%r1); /* save r2 " " */ \ mflr %r31; \ stw %r31, FRAME_LR+8(%r1); /* save LR " " */ \ stw %r30, FRAME_CR+8(%r1); /* save CR " " */ \ GET_CPUINFO(%r2); \ lwz %r30, (savearea+CPUSAVE_R30)(%r2); /* get saved r30 */ \ lwz %r31, (savearea+CPUSAVE_R31)(%r2); /* get saved r31 */ \ /* save R3-31 */ \ stmw %r3, FRAME_3+8(%r1) ; \ /* save DEAR, ESR */ \ lwz %r28, (savearea+CPUSAVE_BOOKE_DEAR)(%r2); \ lwz %r29, (savearea+CPUSAVE_BOOKE_ESR)(%r2); \ stw %r28, FRAME_BOOKE_DEAR+8(%r1); \ stw %r29, FRAME_BOOKE_ESR+8(%r1); \ /* save XER, CTR, exc number */ \ mfxer %r3; \ mfctr %r4; \ stw %r3, FRAME_XER+8(%r1); \ stw %r4, FRAME_CTR+8(%r1); \ li %r5, exc; \ stw %r5, FRAME_EXC+8(%r1); \ /* save DBCR0 */ \ mfspr %r3, SPR_DBCR0; \ stw %r3, FRAME_BOOKE_DBCR0+8(%r1); \ /* save xSSR0-1 */ \ lwz %r30, (savearea+CPUSAVE_SRR0)(%r2); \ lwz %r31, (savearea+CPUSAVE_SRR1)(%r2); \ stw %r30, FRAME_SRR0+8(%r1); \ stw %r31, FRAME_SRR1+8(%r1); \ lwz %r2,PC_CURTHREAD(%r2) /* set curthread pointer */ /* * * isrr0-1 - save restore registers to restore CPU state to (may be * SRR0-1, CSRR0-1, MCSRR0-1 * * Notes: * - potential TLB miss: YES. The deref'd kstack may be not covered */ #define FRAME_LEAVE(isrr0, isrr1) \ /* restore CTR, XER, LR, CR */ \ lwz %r4, FRAME_CTR+8(%r1); \ lwz %r5, FRAME_XER+8(%r1); \ lwz %r6, FRAME_LR+8(%r1); \ lwz %r7, FRAME_CR+8(%r1); \ mtctr %r4; \ mtxer %r5; \ mtlr %r6; \ mtcr %r7; \ /* restore DBCR0 */ \ lwz %r4, FRAME_BOOKE_DBCR0+8(%r1); \ mtspr SPR_DBCR0, %r4; \ /* restore xSRR0-1 */ \ lwz %r30, FRAME_SRR0+8(%r1); \ lwz %r31, FRAME_SRR1+8(%r1); \ mtspr isrr0, %r30; \ mtspr isrr1, %r31; \ /* restore R2-31, SP */ \ lmw %r2, FRAME_2+8(%r1) ; \ lwz %r0, FRAME_0+8(%r1); \ lwz %r1, FRAME_1+8(%r1); \ isync /* * TLB miss prolog * * saves LR, CR, SRR0-1, R20-31 in the TLBSAVE area * * Notes: * - potential TLB miss: NO. It is crucial that we do not generate a TLB * miss within the TLB prolog itself! * - TLBSAVE is always translated */ #define TLB_PROLOG \ mtsprg4 %r1; /* Save SP */ \ mtsprg5 %r28; \ mtsprg6 %r29; \ /* calculate TLB nesting level and TLBSAVE instance address */ \ GET_CPUINFO(%r1); /* Per-cpu structure */ \ lwz %r28, PC_BOOKE_TLB_LEVEL(%r1); \ rlwinm %r29, %r28, 6, 23, 25; /* 4 x TLBSAVE_LEN */ \ addi %r28, %r28, 1; \ stw %r28, PC_BOOKE_TLB_LEVEL(%r1); \ addi %r29, %r29, PC_BOOKE_TLBSAVE@l; \ add %r1, %r1, %r29; /* current TLBSAVE ptr */ \ \ /* save R20-31 */ \ mfsprg5 %r28; \ mfsprg6 %r29; \ stmw %r20, (TLBSAVE_BOOKE_R20)(%r1); \ /* save LR, CR */ \ mflr %r30; \ mfcr %r31; \ stw %r30, (TLBSAVE_BOOKE_LR)(%r1); \ stw %r31, (TLBSAVE_BOOKE_CR)(%r1); \ /* save SRR0-1 */ \ mfsrr0 %r30; /* execution addr at interrupt time */ \ mfsrr1 %r31; /* MSR at interrupt time*/ \ stw %r30, (TLBSAVE_BOOKE_SRR0)(%r1); /* save SRR0 */ \ stw %r31, (TLBSAVE_BOOKE_SRR1)(%r1); /* save SRR1 */ \ isync; \ mfsprg4 %r1 /* * restores LR, CR, SRR0-1, R20-31 from the TLBSAVE area * * same notes as for the TLB_PROLOG */ #define TLB_RESTORE \ mtsprg4 %r1; /* Save SP */ \ GET_CPUINFO(%r1); /* Per-cpu structure */ \ /* calculate TLB nesting level and TLBSAVE instance addr */ \ lwz %r28, PC_BOOKE_TLB_LEVEL(%r1); \ subi %r28, %r28, 1; \ stw %r28, PC_BOOKE_TLB_LEVEL(%r1); \ rlwinm %r29, %r28, 6, 23, 25; /* 4 x TLBSAVE_LEN */ \ addi %r29, %r29, PC_BOOKE_TLBSAVE@l; \ add %r1, %r1, %r29; \ \ /* restore LR, CR */ \ lwz %r30, (TLBSAVE_BOOKE_LR)(%r1); \ lwz %r31, (TLBSAVE_BOOKE_CR)(%r1); \ mtlr %r30; \ mtcr %r31; \ /* restore SRR0-1 */ \ lwz %r30, (TLBSAVE_BOOKE_SRR0)(%r1); \ lwz %r31, (TLBSAVE_BOOKE_SRR1)(%r1); \ mtsrr0 %r30; \ mtsrr1 %r31; \ /* restore R20-31 */ \ lmw %r20, (TLBSAVE_BOOKE_R20)(%r1); \ mfsprg4 %r1 #ifdef SMP #define TLB_LOCK \ GET_CPUINFO(%r20); \ lwz %r21, PC_CURTHREAD(%r20); \ lwz %r22, PC_BOOKE_TLB_LOCK(%r20); \ \ 1: lwarx %r23, 0, %r22; \ cmpwi %r23, TLB_UNLOCKED; \ beq 2f; \ \ /* check if this is recursion */ \ cmplw cr0, %r21, %r23; \ bne- 1b; \ \ 2: /* try to acquire lock */ \ stwcx. %r21, 0, %r22; \ bne- 1b; \ \ /* got it, update recursion counter */ \ lwz %r21, RES_RECURSE(%r22); \ addi %r21, %r21, 1; \ stw %r21, RES_RECURSE(%r22); \ isync; \ msync #define TLB_UNLOCK \ GET_CPUINFO(%r20); \ lwz %r21, PC_CURTHREAD(%r20); \ lwz %r22, PC_BOOKE_TLB_LOCK(%r20); \ \ /* update recursion counter */ \ lwz %r23, RES_RECURSE(%r22); \ subi %r23, %r23, 1; \ stw %r23, RES_RECURSE(%r22); \ \ cmpwi %r23, 0; \ bne 1f; \ isync; \ msync; \ \ /* release the lock */ \ li %r23, TLB_UNLOCKED; \ stw %r23, 0(%r22); \ 1: isync; \ msync #else #define TLB_LOCK #define TLB_UNLOCK #endif /* SMP */ #define INTERRUPT(label) \ .globl label; \ .align 5; \ CNAME(label): /* * Interrupt handling routines in BookE can be flexibly placed and do not have * to live in pre-defined vectors location. Note they need to be TLB-mapped at * all times in order to be able to handle exceptions. We thus arrange for * them to be part of kernel text which is always TLB-accessible. * * The interrupt handling routines have to be 16 bytes aligned: we align them * to 32 bytes (cache line length) which supposedly performs better. * */ .text .globl CNAME(interrupt_vector_base) .align 5 interrupt_vector_base: /***************************************************************************** * Critical input interrupt ****************************************************************************/ INTERRUPT(int_critical_input) STANDARD_PROLOG(SPR_SPRG2, PC_BOOKE_CRITSAVE, SPR_CSRR0, SPR_CSRR1) FRAME_SETUP(SPR_SPRG2, PC_BOOKE_CRITSAVE, EXC_CRIT) addi %r3, %r1, 8 bl CNAME(powerpc_crit_interrupt) FRAME_LEAVE(SPR_CSRR0, SPR_CSRR1) rfci /***************************************************************************** * Machine check interrupt ****************************************************************************/ INTERRUPT(int_machine_check) STANDARD_PROLOG(SPR_SPRG3, PC_BOOKE_MCHKSAVE, SPR_MCSRR0, SPR_MCSRR1) FRAME_SETUP(SPR_SPRG3, PC_BOOKE_MCHKSAVE, EXC_MCHK) addi %r3, %r1, 8 bl CNAME(powerpc_mchk_interrupt) FRAME_LEAVE(SPR_MCSRR0, SPR_MCSRR1) rfmci /***************************************************************************** * Data storage interrupt ****************************************************************************/ INTERRUPT(int_data_storage) STANDARD_PROLOG(SPR_SPRG1, PC_DISISAVE, SPR_SRR0, SPR_SRR1) FRAME_SETUP(SPR_SPRG1, PC_DISISAVE, EXC_DSI) b trap_common /***************************************************************************** * Instruction storage interrupt ****************************************************************************/ INTERRUPT(int_instr_storage) STANDARD_PROLOG(SPR_SPRG1, PC_TEMPSAVE, SPR_SRR0, SPR_SRR1) FRAME_SETUP(SPR_SPRG1, PC_TEMPSAVE, EXC_ISI) b trap_common /***************************************************************************** * External input interrupt ****************************************************************************/ INTERRUPT(int_external_input) STANDARD_PROLOG(SPR_SPRG1, PC_TEMPSAVE, SPR_SRR0, SPR_SRR1) FRAME_SETUP(SPR_SPRG1, PC_TEMPSAVE, EXC_EXI) addi %r3, %r1, 8 bl CNAME(powerpc_extr_interrupt) b trapexit INTERRUPT(int_alignment) STANDARD_PROLOG(SPR_SPRG1, PC_TEMPSAVE, SPR_SRR0, SPR_SRR1) FRAME_SETUP(SPR_SPRG1, PC_TEMPSAVE, EXC_ALI) b trap_common INTERRUPT(int_program) STANDARD_PROLOG(SPR_SPRG1, PC_TEMPSAVE, SPR_SRR0, SPR_SRR1) FRAME_SETUP(SPR_SPRG1, PC_TEMPSAVE, EXC_PGM) b trap_common /***************************************************************************** * System call ****************************************************************************/ INTERRUPT(int_syscall) STANDARD_PROLOG(SPR_SPRG1, PC_TEMPSAVE, SPR_SRR0, SPR_SRR1) FRAME_SETUP(SPR_SPRG1, PC_TEMPSAVE, EXC_SC) b trap_common /***************************************************************************** * Decrementer interrupt ****************************************************************************/ INTERRUPT(int_decrementer) STANDARD_PROLOG(SPR_SPRG1, PC_TEMPSAVE, SPR_SRR0, SPR_SRR1) FRAME_SETUP(SPR_SPRG1, PC_TEMPSAVE, EXC_DECR) addi %r3, %r1, 8 bl CNAME(powerpc_decr_interrupt) b trapexit /***************************************************************************** * Fixed interval timer ****************************************************************************/ INTERRUPT(int_fixed_interval_timer) STANDARD_PROLOG(SPR_SPRG1, PC_TEMPSAVE, SPR_SRR0, SPR_SRR1) FRAME_SETUP(SPR_SPRG1, PC_TEMPSAVE, EXC_FIT) b trap_common /***************************************************************************** * Watchdog interrupt ****************************************************************************/ INTERRUPT(int_watchdog) STANDARD_PROLOG(SPR_SPRG1, PC_TEMPSAVE, SPR_SRR0, SPR_SRR1) FRAME_SETUP(SPR_SPRG1, PC_TEMPSAVE, EXC_WDOG) b trap_common /***************************************************************************** * Data TLB miss interrupt * * There can be nested TLB misses - while handling a TLB miss we reference * data structures that may be not covered by translations. We support up to * TLB_NESTED_MAX-1 nested misses. * * Registers use: * r31 - dear * r30 - unused * r29 - saved mas0 * r28 - saved mas1 * r27 - saved mas2 * r26 - pmap address * r25 - pte address * * r20:r23 - scratch registers ****************************************************************************/ INTERRUPT(int_data_tlb_error) TLB_PROLOG TLB_LOCK mfdear %r31 /* * Save MAS0-MAS2 registers. There might be another tlb miss during * pte lookup overwriting current contents (which was hw filled). */ mfspr %r29, SPR_MAS0 mfspr %r28, SPR_MAS1 mfspr %r27, SPR_MAS2 /* Check faulting address. */ lis %r21, VM_MAXUSER_ADDRESS@h ori %r21, %r21, VM_MAXUSER_ADDRESS@l cmplw cr0, %r31, %r21 blt search_user_pmap /* If it's kernel address, allow only supervisor mode misses. */ mfsrr1 %r21 mtcr %r21 bt 17, search_failed /* check MSR[PR] */ search_kernel_pmap: /* Load r26 with kernel_pmap address */ - lis %r26, kernel_pmap_store@h - ori %r26, %r26, kernel_pmap_store@l + bl 1f + .long kernel_pmap_store-. +1: mflr %r21 + lwz %r26, 0(%r21) + add %r26, %r21, %r26 /* kernel_pmap_store in r26 */ /* Force kernel tid, set TID to 0 in MAS1. */ li %r21, 0 rlwimi %r28, %r21, 0, 8, 15 /* clear TID bits */ tlb_miss_handle: /* This may result in nested tlb miss. */ bl pte_lookup /* returns PTE address in R25 */ cmpwi %r25, 0 /* pte found? */ beq search_failed /* Finish up, write TLB entry. */ bl tlb_fill_entry tlb_miss_return: TLB_UNLOCK TLB_RESTORE rfi search_user_pmap: /* Load r26 with current user space process pmap */ GET_CPUINFO(%r26) lwz %r26, PC_CURPMAP(%r26) b tlb_miss_handle search_failed: /* * Whenever we don't find a TLB mapping in PT, set a TLB0 entry with * the faulting virtual address anyway, but put a fake RPN and no * access rights. This should cause a following {D,I}SI exception. */ lis %r23, 0xffff0000@h /* revoke all permissions */ /* Load MAS registers. */ mtspr SPR_MAS0, %r29 isync mtspr SPR_MAS1, %r28 isync mtspr SPR_MAS2, %r27 isync mtspr SPR_MAS3, %r23 isync tlbwe msync isync b tlb_miss_return /***************************************************************************** * * Return pte address that corresponds to given pmap/va. If there is no valid * entry return 0. * * input: r26 - pmap * input: r31 - dear * output: r25 - pte address * * scratch regs used: r21 * ****************************************************************************/ pte_lookup: cmpwi %r26, 0 beq 1f /* fail quickly if pmap is invalid */ srwi %r21, %r31, PDIR_SHIFT /* pdir offset */ slwi %r21, %r21, PDIR_ENTRY_SHIFT /* multiply by pdir entry size */ addi %r25, %r26, PM_PDIR /* pmap pm_dir[] address */ add %r25, %r25, %r21 /* offset within pm_pdir[] table */ /* * Get ptbl address, i.e. pmap->pm_pdir[pdir_idx] * This load may cause a Data TLB miss for non-kernel pmap! */ lwz %r25, 0(%r25) cmpwi %r25, 0 beq 2f lis %r21, PTBL_MASK@h ori %r21, %r21, PTBL_MASK@l and %r21, %r21, %r31 /* ptbl offset, multiply by ptbl entry size */ srwi %r21, %r21, (PTBL_SHIFT - PTBL_ENTRY_SHIFT) add %r25, %r25, %r21 /* address of pte entry */ /* * Get pte->flags * This load may cause a Data TLB miss for non-kernel pmap! */ lwz %r21, PTE_FLAGS(%r25) andis. %r21, %r21, PTE_VALID@h bne 2f 1: li %r25, 0 2: blr /***************************************************************************** * * Load MAS1-MAS3 registers with data, write TLB entry * * input: * r29 - mas0 * r28 - mas1 * r27 - mas2 * r25 - pte * * output: none * * scratch regs: r21-r23 * ****************************************************************************/ tlb_fill_entry: /* * Update PTE flags: we have to do it atomically, as pmap_protect() * running on other CPUs could attempt to update the flags at the same * time. */ li %r23, PTE_FLAGS 1: lwarx %r21, %r23, %r25 /* get pte->flags */ oris %r21, %r21, PTE_REFERENCED@h /* set referenced bit */ andi. %r22, %r21, (PTE_SW | PTE_UW)@l /* check if writable */ beq 2f oris %r21, %r21, PTE_MODIFIED@h /* set modified bit */ 2: stwcx. %r21, %r23, %r25 /* write it back */ bne- 1b /* Update MAS2. */ rlwimi %r27, %r21, 0, 27, 30 /* insert WIMG bits from pte */ /* Setup MAS3 value in r23. */ lwz %r23, PTE_RPN(%r25) /* get pte->rpn */ rlwimi %r23, %r21, 24, 26, 31 /* insert protection bits from pte */ /* Load MAS registers. */ mtspr SPR_MAS0, %r29 isync mtspr SPR_MAS1, %r28 isync mtspr SPR_MAS2, %r27 isync mtspr SPR_MAS3, %r23 isync tlbwe isync msync blr /***************************************************************************** * Instruction TLB miss interrupt * * Same notes as for the Data TLB miss ****************************************************************************/ INTERRUPT(int_inst_tlb_error) TLB_PROLOG TLB_LOCK mfsrr0 %r31 /* faulting address */ /* * Save MAS0-MAS2 registers. There might be another tlb miss during pte * lookup overwriting current contents (which was hw filled). */ mfspr %r29, SPR_MAS0 mfspr %r28, SPR_MAS1 mfspr %r27, SPR_MAS2 mfsrr1 %r21 mtcr %r21 /* check MSR[PR] */ bt 17, search_user_pmap b search_kernel_pmap .globl interrupt_vector_top interrupt_vector_top: /***************************************************************************** * Debug interrupt ****************************************************************************/ INTERRUPT(int_debug) STANDARD_CRIT_PROLOG(SPR_SPRG2, PC_BOOKE_CRITSAVE, SPR_CSRR0, SPR_CSRR1) FRAME_SETUP(SPR_SPRG2, PC_BOOKE_CRITSAVE, EXC_DEBUG) GET_CPUINFO(%r3) lwz %r3, (PC_BOOKE_CRITSAVE+CPUSAVE_SRR0)(%r3) - lis %r4, interrupt_vector_base@ha - addi %r4, %r4, interrupt_vector_base@l + bl 0f + .long interrupt_vector_base-. + .long interrupt_vector_top-. +0: mflr %r5 + lwz %r4,0(%r5) /* interrupt_vector_base in r4 */ + add %r4,%r4,%r5 cmplw cr0, %r3, %r4 blt 1f - lis %r4, interrupt_vector_top@ha - addi %r4, %r4, interrupt_vector_top@l + lwz %r4,4(%r5) /* interrupt_vector_top in r4 */ + add %r4,%r4,%r5 + addi %r4,%r4,4 cmplw cr0, %r3, %r4 bge 1f /* Disable single-stepping for the interrupt handlers. */ lwz %r3, FRAME_SRR1+8(%r1); rlwinm %r3, %r3, 0, 23, 21 stw %r3, FRAME_SRR1+8(%r1); /* Restore srr0 and srr1 as they could have been clobbered. */ GET_CPUINFO(%r4) lwz %r3, (PC_BOOKE_CRITSAVE+CPUSAVE_SRR0+8)(%r4); mtspr SPR_SRR0, %r3 lwz %r4, (PC_BOOKE_CRITSAVE+CPUSAVE_SRR1+8)(%r4); mtspr SPR_SRR1, %r4 b 9f 1: addi %r3, %r1, 8 bl CNAME(trap) /* * Handle ASTs, needed for proper support of single-stepping. * We actually need to return to the process with an rfi. */ b trapexit 9: FRAME_LEAVE(SPR_CSRR0, SPR_CSRR1) rfci /***************************************************************************** * Common trap code ****************************************************************************/ trap_common: /* Call C trap dispatcher */ addi %r3, %r1, 8 bl CNAME(trap) .globl CNAME(trapexit) /* exported for db_backtrace use */ CNAME(trapexit): /* disable interrupts */ wrteei 0 /* Test AST pending - makes sense for user process only */ lwz %r5, FRAME_SRR1+8(%r1) mtcr %r5 bf 17, 1f GET_CPUINFO(%r3) lwz %r4, PC_CURTHREAD(%r3) lwz %r4, TD_FLAGS(%r4) lis %r5, (TDF_ASTPENDING | TDF_NEEDRESCHED)@h ori %r5, %r5, (TDF_ASTPENDING | TDF_NEEDRESCHED)@l and. %r4, %r4, %r5 beq 1f /* re-enable interrupts before calling ast() */ wrteei 1 addi %r3, %r1, 8 bl CNAME(ast) .globl CNAME(asttrapexit) /* db_backtrace code sentinel #2 */ CNAME(asttrapexit): b trapexit /* test ast ret value ? */ 1: FRAME_LEAVE(SPR_SRR0, SPR_SRR1) rfi #if defined(KDB) /* * Deliberate entry to dbtrap */ .globl CNAME(breakpoint) CNAME(breakpoint): mtsprg1 %r1 mfmsr %r3 mtsrr1 %r3 andi. %r3, %r3, ~(PSL_EE | PSL_ME)@l mtmsr %r3 /* disable interrupts */ isync GET_CPUINFO(%r3) stw %r30, (PC_DBSAVE+CPUSAVE_R30)(%r3) stw %r31, (PC_DBSAVE+CPUSAVE_R31)(%r3) mflr %r31 mtsrr0 %r31 mfdear %r30 mfesr %r31 stw %r30, (PC_DBSAVE+CPUSAVE_BOOKE_DEAR)(%r3) stw %r31, (PC_DBSAVE+CPUSAVE_BOOKE_ESR)(%r3) mfsrr0 %r30 mfsrr1 %r31 stw %r30, (PC_DBSAVE+CPUSAVE_SRR0)(%r3) stw %r31, (PC_DBSAVE+CPUSAVE_SRR1)(%r3) isync mfcr %r30 /* * Now the kdb trap catching code. */ dbtrap: FRAME_SETUP(SPR_SPRG1, PC_DBSAVE, EXC_DEBUG) /* Call C trap code: */ addi %r3, %r1, 8 bl CNAME(db_trap_glue) or. %r3, %r3, %r3 bne dbleave /* This wasn't for KDB, so switch to real trap: */ b trap_common dbleave: FRAME_LEAVE(SPR_SRR0, SPR_SRR1) rfi #endif /* KDB */ #ifdef SMP ENTRY(tlb_lock) GET_CPUINFO(%r5) lwz %r5, PC_CURTHREAD(%r5) 1: lwarx %r4, 0, %r3 cmpwi %r4, TLB_UNLOCKED bne 1b stwcx. %r5, 0, %r3 bne- 1b isync msync blr ENTRY(tlb_unlock) isync msync li %r4, TLB_UNLOCKED stw %r4, 0(%r3) isync msync blr /* * TLB miss spin locks. For each CPU we have a reservation granule (32 bytes); * only a single word from this granule will actually be used as a spin lock * for mutual exclusion between TLB miss handler and pmap layer that * manipulates page table contents. */ .data .align 5 GLOBAL(tlb0_miss_locks) .space RES_GRANULE * MAXCPU #endif Index: head/sys/powerpc/ofw/ofwcall32.S =================================================================== --- head/sys/powerpc/ofw/ofwcall32.S (revision 279749) +++ head/sys/powerpc/ofw/ofwcall32.S (revision 279750) @@ -1,156 +1,168 @@ /*- * Copyright (C) 2009-2011 Nathan Whitehorn * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. * IN NO EVENT SHALL TOOLS GMBH BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, * PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; * OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, * WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR * OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF * ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. * * $FreeBSD$ */ #include #include #include #include #include #define OFWSTKSZ 4096 /* 4K Open Firmware stack */ /* * Globals */ .data GLOBAL(ofmsr) .long 0, 0, 0, 0, 0 /* msr/sprg0-3 used in Open Firmware */ GLOBAL(rtasmsr) .long 0 GLOBAL(openfirmware_entry) .long 0 /* Open Firmware entry point */ GLOBAL(rtas_entry) .long 0 /* RTAS entry point */ .align 4 ofwstk: .space OFWSTKSZ rtas_regsave: .space 4 /* * Open Firmware Entry Point. May need to enter real mode. * * C prototype: int ofwcall(void *callbuffer); */ ASENTRY(ofwcall) mflr %r0 stw %r0,4(%r1) /* Record the old MSR */ mfmsr %r6 + /* GOT pointer in r7 */ + bl _GLOBAL_OFFSET_TABLE_@local-4 + mflr %r7 + /* read client interface handler */ - lis %r4,openfirmware_entry@ha - lwz %r4,openfirmware_entry@l(%r4) + lwz %r4,openfirmware_entry@got(%r7) + lwz %r4,0(%r4) /* * Set the MSR to the OF value. This has the side effect of disabling * exceptions, which prevents preemption later. */ - lis %r5,ofmsr@ha - lwz %r5,ofmsr@l(%r5) + lwz %r5,ofmsr@got(%r7) + lwz %r5,0(%r5) mtmsr %r5 isync /* * Set up OF stack. This needs to be potentially accessible in real mode * The pointer to the current kernel stack is placed at the very * top of the stack along with the old MSR so we can get them back * later. */ mr %r5,%r1 - lis %r1,(ofwstk+OFWSTKSZ-32)@ha - addi %r1,%r1,(ofwstk+OFWSTKSZ-32)@l + lwz %r1,ofwstk@got(%r7) + addi %r1,%r1,(OFWSTKSZ-32) stw %r5,20(%r1) /* Save real stack pointer */ stw %r2,24(%r1) /* Save curthread */ stw %r6,28(%r1) /* Save old MSR */ li %r5,0 stw %r5,4(%r1) stw %r5,0(%r1) /* Finally, branch to OF */ mtctr %r4 bctrl /* Reload stack pointer and MSR from the OFW stack */ lwz %r6,28(%r1) lwz %r2,24(%r1) lwz %r1,20(%r1) /* Now set the real MSR */ mtmsr %r6 isync /* Return */ lwz %r0,4(%r1) mtlr %r0 blr /* * RTAS Entry Point. Similar to the OF one, but simpler (no separate stack) * * C prototype: int rtascall(void *callbuffer, void *rtas_privdat); */ ASENTRY(rtascall) mflr %r0 stw %r0,4(%r1) + /* GOT pointer in r7 */ + bl _GLOBAL_OFFSET_TABLE_@local-4 + mflr %r7 + /* Record the old MSR to real-mode-accessible area */ mfmsr %r0 - lis %r5,rtas_regsave@ha - stw %r0,rtas_regsave@l(%r5) + lwz %r5,rtas_regsave@got(%r7) + stw %r0,0(%r5) /* read client interface handler */ - lis %r5,rtas_entry@ha - lwz %r5,rtas_entry@l(%r5) + lwz %r5,rtas_entry@got(%r7) + lwz %r5,0(%r5) /* Set the MSR to the RTAS value */ - lis %r6,rtasmsr@ha - lwz %r6,rtasmsr@l(%r6) + lwz %r6,rtasmsr@got(%r7) + lwz %r6,0(%r6) mtmsr %r6 isync /* Branch to RTAS */ mtctr %r5 bctrl + /* GOT pointer in r7 */ + bl _GLOBAL_OFFSET_TABLE_@local-4 + mflr %r7 + /* Now set the MSR back */ - lis %r6,rtas_regsave@ha - lwz %r6,rtas_regsave@l(%r6) + lwz %r6,rtas_regsave@got(%r7) + lwz %r6,0(%r6) mtmsr %r6 isync /* And return */ lwz %r0,4(%r1) mtlr %r0 blr Index: head/sys/powerpc/powerpc/elf32_machdep.c =================================================================== --- head/sys/powerpc/powerpc/elf32_machdep.c (revision 279749) +++ head/sys/powerpc/powerpc/elf32_machdep.c (revision 279750) @@ -1,286 +1,321 @@ /*- * Copyright 1996-1998 John D. Polstra. * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. * * $FreeBSD$ */ #include #include #include #define __ELF_WORD_SIZE 32 #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #ifdef __powerpc64__ #include #include extern const char *freebsd32_syscallnames[]; #endif struct sysentvec elf32_freebsd_sysvec = { .sv_size = SYS_MAXSYSCALL, #ifdef __powerpc64__ .sv_table = freebsd32_sysent, #else .sv_table = sysent, #endif .sv_mask = 0, .sv_sigsize = 0, .sv_sigtbl = NULL, .sv_errsize = 0, .sv_errtbl = NULL, .sv_transtrap = NULL, .sv_fixup = __elfN(freebsd_fixup), .sv_sendsig = sendsig, .sv_sigcode = sigcode32, .sv_szsigcode = &szsigcode32, .sv_prepsyscall = NULL, .sv_name = "FreeBSD ELF32", .sv_coredump = __elfN(coredump), .sv_imgact_try = NULL, .sv_minsigstksz = MINSIGSTKSZ, .sv_pagesize = PAGE_SIZE, .sv_minuser = VM_MIN_ADDRESS, .sv_stackprot = VM_PROT_ALL, #ifdef __powerpc64__ .sv_maxuser = VM_MAXUSER_ADDRESS, .sv_usrstack = FREEBSD32_USRSTACK, .sv_psstrings = FREEBSD32_PS_STRINGS, .sv_copyout_strings = freebsd32_copyout_strings, .sv_setregs = ppc32_setregs, .sv_syscallnames = freebsd32_syscallnames, #else .sv_maxuser = VM_MAXUSER_ADDRESS, .sv_usrstack = USRSTACK, .sv_psstrings = PS_STRINGS, .sv_copyout_strings = exec_copyout_strings, .sv_setregs = exec_setregs, .sv_syscallnames = syscallnames, #endif .sv_fixlimit = NULL, .sv_maxssiz = NULL, .sv_flags = SV_ABI_FREEBSD | SV_ILP32 | SV_SHP, .sv_set_syscall_retval = cpu_set_syscall_retval, .sv_fetch_syscall_args = cpu_fetch_syscall_args, .sv_shared_page_base = FREEBSD32_SHAREDPAGE, .sv_shared_page_len = PAGE_SIZE, .sv_schedtail = NULL, }; INIT_SYSENTVEC(elf32_sysvec, &elf32_freebsd_sysvec); static Elf32_Brandinfo freebsd_brand_info = { .brand = ELFOSABI_FREEBSD, .machine = EM_PPC, .compat_3_brand = "FreeBSD", .emul_path = NULL, .interp_path = "/libexec/ld-elf.so.1", .sysvec = &elf32_freebsd_sysvec, #ifdef __powerpc64__ .interp_newpath = "/libexec/ld-elf32.so.1", #else .interp_newpath = NULL, #endif .brand_note = &elf32_freebsd_brandnote, .flags = BI_CAN_EXEC_DYN | BI_BRAND_NOTE }; SYSINIT(elf32, SI_SUB_EXEC, SI_ORDER_FIRST, (sysinit_cfunc_t) elf32_insert_brand_entry, &freebsd_brand_info); static Elf32_Brandinfo freebsd_brand_oinfo = { .brand = ELFOSABI_FREEBSD, .machine = EM_PPC, .compat_3_brand = "FreeBSD", .emul_path = NULL, .interp_path = "/usr/libexec/ld-elf.so.1", .sysvec = &elf32_freebsd_sysvec, .interp_newpath = NULL, .brand_note = &elf32_freebsd_brandnote, .flags = BI_CAN_EXEC_DYN | BI_BRAND_NOTE }; SYSINIT(oelf32, SI_SUB_EXEC, SI_ORDER_ANY, (sysinit_cfunc_t) elf32_insert_brand_entry, &freebsd_brand_oinfo); +void elf_reloc_self(Elf_Dyn *dynp, Elf_Addr relocbase); + void elf32_dump_thread(struct thread *td, void *dst, size_t *off) { size_t len; struct pcb *pcb; len = 0; pcb = td->td_pcb; if (pcb->pcb_flags & PCB_VEC) { save_vec_nodrop(td); if (dst != NULL) { len += elf32_populate_note(NT_PPC_VMX, &pcb->pcb_vec, dst, sizeof(pcb->pcb_vec), NULL); } else len += elf32_populate_note(NT_PPC_VMX, NULL, NULL, sizeof(pcb->pcb_vec), NULL); } *off = len; } #ifndef __powerpc64__ /* Process one elf relocation with addend. */ static int elf_reloc_internal(linker_file_t lf, Elf_Addr relocbase, const void *data, int type, int local, elf_lookup_fn lookup) { Elf_Addr *where; Elf_Half *hwhere; Elf_Addr addr; Elf_Addr addend; Elf_Word rtype, symidx; const Elf_Rela *rela; switch (type) { case ELF_RELOC_REL: panic("PPC only supports RELA relocations"); break; case ELF_RELOC_RELA: rela = (const Elf_Rela *)data; where = (Elf_Addr *) ((uintptr_t)relocbase + rela->r_offset); hwhere = (Elf_Half *) ((uintptr_t)relocbase + rela->r_offset); addend = rela->r_addend; rtype = ELF_R_TYPE(rela->r_info); symidx = ELF_R_SYM(rela->r_info); break; default: panic("elf_reloc: unknown relocation mode %d\n", type); } switch (rtype) { case R_PPC_NONE: break; case R_PPC_ADDR32: /* word32 S + A */ addr = lookup(lf, symidx, 1); if (addr == 0) return -1; *where = elf_relocaddr(lf, addr + addend); break; case R_PPC_ADDR16_LO: /* #lo(S) */ addr = lookup(lf, symidx, 1); if (addr == 0) return -1; /* * addend values are sometimes relative to sections * (i.e. .rodata) in rela, where in reality they * are relative to relocbase. Detect this condition. */ if (addr > relocbase && addr <= (relocbase + addend)) addr = relocbase; addr = elf_relocaddr(lf, addr + addend); *hwhere = addr & 0xffff; break; case R_PPC_ADDR16_HA: /* #ha(S) */ addr = lookup(lf, symidx, 1); if (addr == 0) return -1; /* * addend values are sometimes relative to sections * (i.e. .rodata) in rela, where in reality they * are relative to relocbase. Detect this condition. */ if (addr > relocbase && addr <= (relocbase + addend)) addr = relocbase; addr = elf_relocaddr(lf, addr + addend); *hwhere = ((addr >> 16) + ((addr & 0x8000) ? 1 : 0)) & 0xffff; break; case R_PPC_RELATIVE: /* word32 B + A */ *where = elf_relocaddr(lf, relocbase + addend); break; default: printf("kldload: unexpected relocation type %d\n", (int) rtype); return -1; } return(0); +} + +void +elf_reloc_self(Elf_Dyn *dynp, Elf_Addr relocbase) +{ + Elf_Rela *rela = 0, *relalim; + Elf_Addr relasz = 0; + Elf_Addr *where; + + /* + * Extract the rela/relasz values from the dynamic section + */ + for (; dynp->d_tag != DT_NULL; dynp++) { + switch (dynp->d_tag) { + case DT_RELA: + rela = (Elf_Rela *)(relocbase+dynp->d_un.d_ptr); + break; + case DT_RELASZ: + relasz = dynp->d_un.d_val; + break; + } + } + + /* + * Relocate these values + */ + relalim = (Elf_Rela *)((caddr_t)rela + relasz); + for (; rela < relalim; rela++) { + if (ELF_R_TYPE(rela->r_info) != R_PPC_RELATIVE) + continue; + where = (Elf_Addr *)(relocbase + rela->r_offset); + *where = (Elf_Addr)(relocbase + rela->r_addend); + } } int elf_reloc(linker_file_t lf, Elf_Addr relocbase, const void *data, int type, elf_lookup_fn lookup) { return (elf_reloc_internal(lf, relocbase, data, type, 0, lookup)); } int elf_reloc_local(linker_file_t lf, Elf_Addr relocbase, const void *data, int type, elf_lookup_fn lookup) { return (elf_reloc_internal(lf, relocbase, data, type, 1, lookup)); } int elf_cpu_load_file(linker_file_t lf) { /* Only sync the cache for non-kernel modules */ if (lf->id != 1) __syncicache(lf->address, lf->size); return (0); } int elf_cpu_unload_file(linker_file_t lf __unused) { return (0); } #endif Index: head/sys/powerpc/powerpc/swtch32.S =================================================================== --- head/sys/powerpc/powerpc/swtch32.S (revision 279749) +++ head/sys/powerpc/powerpc/swtch32.S (revision 279750) @@ -1,204 +1,205 @@ /* $FreeBSD$ */ /* $NetBSD: locore.S,v 1.24 2000/05/31 05:09:17 thorpej Exp $ */ /*- * Copyright (C) 2001 Benno Rice * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY Benno Rice ``AS IS'' AND ANY EXPRESS OR * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. * IN NO EVENT SHALL TOOLS GMBH BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, * PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; * OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, * WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR * OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF * ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. */ /*- * Copyright (C) 1995, 1996 Wolfgang Solfrank. * Copyright (C) 1995, 1996 TooLs GmbH. * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. All advertising materials mentioning features or use of this software * must display the following acknowledgement: * This product includes software developed by TooLs GmbH. * 4. The name of TooLs GmbH may not be used to endorse or promote products * derived from this software without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY TOOLS GMBH ``AS IS'' AND ANY EXPRESS OR * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. * IN NO EVENT SHALL TOOLS GMBH BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, * PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; * OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, * WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR * OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF * ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. */ #include "assym.s" #include "opt_sched.h" #include #include #include #include #include /* * void cpu_throw(struct thread *old, struct thread *new) */ ENTRY(cpu_throw) mr %r2, %r4 li %r14,0 /* Tell cpu_switchin not to release a thread */ b cpu_switchin /* * void cpu_switch(struct thread *old, * struct thread *new, * struct mutex *mtx); * * Switch to a new thread saving the current state in the old thread. */ ENTRY(cpu_switch) lwz %r6,TD_PCB(%r3) /* Get the old thread's PCB ptr */ stmw %r12,PCB_CONTEXT(%r6) /* Save the non-volatile GP regs. These can now be used for scratch */ mfcr %r16 /* Save the condition register */ stw %r16,PCB_CR(%r6) mflr %r16 /* Save the link register */ stw %r16,PCB_LR(%r6) stw %r1,PCB_SP(%r6) /* Save the stack pointer */ mr %r14,%r3 /* Copy the old thread ptr... */ mr %r2,%r4 /* and the new thread ptr in curthread */ mr %r16,%r5 /* and the new lock */ mr %r17,%r6 /* and the PCB */ lwz %r7,PCB_FLAGS(%r17) /* Save FPU context if needed */ andi. %r7, %r7, PCB_FPU beq .L1 bl save_fpu .L1: mr %r3,%r14 /* restore old thread ptr */ lwz %r7,PCB_FLAGS(%r17) /* Save Altivec context if needed */ andi. %r7, %r7, PCB_VEC beq .L2 bl save_vec .L2: mr %r3,%r14 /* restore old thread ptr */ bl pmap_deactivate /* Deactivate the current pmap */ sync /* Make sure all of that finished */ cpu_switchin: #if defined(SMP) && defined(SCHED_ULE) /* Wait for the new thread to become unblocked */ - lis %r6,blocked_lock@ha - addi %r6,%r6,blocked_lock@l + bl _GLOBAL_OFFSET_TABLE_@local-4 + mflr %r6 + lwz %r6,blocked_lock@got(%r6) blocked_loop: lwz %r7,TD_LOCK(%r2) cmpw %r6,%r7 beq- blocked_loop isync #endif lwz %r17,TD_PCB(%r2) /* Get new current PCB */ lwz %r1,PCB_SP(%r17) /* Load new stack pointer */ /* Release old thread now that we have a stack pointer set up */ cmpwi %r14,0 beq- 1f stw %r16,TD_LOCK(%r14) /* ULE: update old thread's lock */ 1: mfsprg %r7,0 /* Get the pcpu pointer */ stw %r2,PC_CURTHREAD(%r7) /* Store new current thread */ lwz %r17,TD_PCB(%r2) /* Store new current PCB */ stw %r17,PC_CURPCB(%r7) mr %r3,%r2 /* Get new thread ptr */ bl pmap_activate /* Activate the new address space */ lwz %r6, PCB_FLAGS(%r17) /* Restore FPU context if needed */ andi. %r6, %r6, PCB_FPU beq .L3 mr %r3,%r2 /* Pass curthread to enable_fpu */ bl enable_fpu .L3: lwz %r6, PCB_FLAGS(%r17) /* Restore Altivec context if needed */ andi. %r6, %r6, PCB_VEC beq .L4 mr %r3,%r2 /* Pass curthread to enable_vec */ bl enable_vec .L4: /* thread to restore is in r3 */ mr %r3,%r17 /* Recover PCB ptr */ lmw %r12,PCB_CONTEXT(%r3) /* Load the non-volatile GP regs */ lwz %r5,PCB_CR(%r3) /* Load the condition register */ mtcr %r5 lwz %r5,PCB_LR(%r3) /* Load the link register */ mtlr %r5 lwz %r1,PCB_SP(%r3) /* Load the stack pointer */ /* * Perform a dummy stwcx. to clear any reservations we may have * inherited from the previous thread. It doesn't matter if the * stwcx succeeds or not. pcb_context[0] can be clobbered. */ stwcx. %r1, 0, %r3 blr /* * savectx(pcb) * Update pcb, saving current processor state */ ENTRY(savectx) stmw %r12,PCB_CONTEXT(%r3) /* Save the non-volatile GP regs */ mfcr %r4 /* Save the condition register */ stw %r4,PCB_CR(%r3) blr /* * fork_trampoline() * Set up the return from cpu_fork() */ ENTRY(fork_trampoline) lwz %r3,CF_FUNC(%r1) lwz %r4,CF_ARG0(%r1) lwz %r5,CF_ARG1(%r1) bl fork_exit addi %r1,%r1,CF_SIZE-FSP /* Allow 8 bytes in front of trapframe to simulate FRAME_SETUP does when allocating space for a frame pointer/saved LR */ b trapexit